
Conversation

@nlgwcy nlgwcy commented Nov 10, 2024

What type of PR is this?
/kind enhancement

What this PR does / why we need it:
As discussed at the community meeting, the current xDS bpf map has the following problems:

  1. Map-in-map updates are slow. To support a large number of map records, a batch of map records has to be prepared in advance, which wastes memory and complicates the implementation logic.
  2. The inner-map value length is 1300 bytes, which is wasteful in most scenarios.

Solution:

  1. Map-in-map is no longer used to manage xDS data storage.
  2. Define a group of bpf hash maps (value sizes 64/192/296/1600), select the proper map for storing each piece of xDS data based on its length, set the map flag to BPF_F_NO_PREALLOC, and allocate memory on demand.
  3. For string-type data, the size cannot exceed MAP_VAL_STR_SIZE (192). For repeated-type data, the size cannot exceed MAP_VAL_REPEAT_SIZE (1600); that is, a repeated member of the xDS data cannot have more than 200 instances.
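The size-class selection in step 2 could be sketched as follows (the enum, array, and helper names here are illustrative, not the PR's actual identifiers):

```c
#include <stddef.h>

/* Value-size classes matching the map group described above.
 * Names are illustrative, not the PR's identifiers. */
enum map_type { MAP_TYPE_64, MAP_TYPE_192, MAP_TYPE_296, MAP_TYPE_1600, MAP_TYPE_MAX };

static const size_t map_val_size[MAP_TYPE_MAX] = {64, 192, 296, 1600};

/* Pick the smallest size class that can hold `len` bytes; -1 if none fits. */
static int select_map_type(size_t len)
{
    for (int i = 0; i < MAP_TYPE_MAX; i++) {
        if (len <= map_val_size[i])
            return i;
    }
    return -1;
}
```

With BPF_F_NO_PREALLOC set on each map, memory is only consumed for entries actually inserted, which is what makes the per-size-class split pay off.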

After the modification, the content of the xDS pointer is as follows:
ptr(outer_key):
Reserved(4 bytes) + map_type(1 byte) + inner_idx(3 bytes)
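Encoding and parsing of that layout could look like the following sketch (the helper names are hypothetical; the diff's kmesh_parse_outer_key plays a similar role):

```c
#include <stdint.h>

/* The low 32 bits of the xDS pointer carry map_type (high byte) and
 * inner_idx (low 3 bytes); the upper 4 bytes are reserved.
 * Helper names here are illustrative. */
static inline uint32_t make_outer_key(uint8_t map_type, uint32_t inner_idx)
{
    return ((uint32_t)map_type << 24) | (inner_idx & 0x00FFFFFFu);
}

static inline void parse_outer_key(uint32_t outer_key, uint8_t *map_type, uint32_t *inner_idx)
{
    *map_type = (uint8_t)(outer_key >> 24);
    *inner_idx = outer_key & 0x00FFFFFFu;
}
```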

Known issues:

  1. In extreme scenarios, the repeat member has a large number of instances (more than 200), and a bpf map cannot store all the instances. Therefore, multiple records need to be designed for storage.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Signed-off-by: wuchangye <wuchangye@huawei.com>
codecov bot commented Nov 10, 2024

Codecov Report

Attention: Patch coverage is 48.27586% with 15 lines in your changes missing coverage. Please review.

Project coverage is 47.85%. Comparing base (cd1579d) to head (e57194b).
Report is 228 commits behind head on main.

Files with missing lines | Patch % | Lines
pkg/bpf/ads/loader.go | 33.33% | 2 Missing and 6 partials ⚠️
pkg/bpf/workload/loader.go | 36.36% | 2 Missing and 5 partials ⚠️

Files with missing lines | Coverage | Δ
pkg/bpf/ads/sock_connection.go | 56.52% <ø> | (-0.38%) ⬇️
pkg/bpf/bpf.go | 40.93% <100.00%> | (ø)
pkg/bpf/utils/bpf_helper.go | 100.00% <100.00%> | (ø)
pkg/bpf/workload/sock_connection.go | 53.67% <ø> | (-0.34%) ⬇️
pkg/bpf/workload/sock_ops.go | 55.44% <ø> | (-0.44%) ⬇️
pkg/bpf/workload/xdp.go | 47.82% <ø> | (-0.75%) ⬇️
pkg/bpf/workload/loader.go | 24.48% <36.36%> | (-5.91%) ⬇️
pkg/bpf/ads/loader.go | 36.48% <33.33%> | (-9.20%) ⬇️

... and 30 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 52d8f9e...e57194b.

@nlgwcy (Contributor, Author) commented Nov 10, 2024

@hzxuzhonghu @lec-bit

@hzxuzhonghu (Member) left a comment

Overall LGTM, but this is not compatible with arm, I think. I would approve after a test report though. cc @lec-bit

"Map64",
"Map192",
"Map296",
"Map1600",
Member

nit: I am not sure why 296 and 1600 were selected. Is there a byte-alignment requirement, and if so, do we need to adjust them to 2^N?

__uint(value_size, MAP_VAL_SIZE_64);
__uint(max_entries, MAP_MAX_ENTRIES);
__uint(map_flags, BPF_F_NO_PREALLOC);
} map64 SEC(".maps");
Member

Add a kmesh_ prefix to all the maps

__u32 outer_key = (__u32)(uintptr_t)ptr;

kmesh_parse_outer_key(outer_key, &type, &inner_idx);
if (type != map_type)
Contributor

I think an error log is needed here.

else if (__builtin_types_compatible_p(type, void *)) \
val_tmp = get_ptr_val_from_map(&map1600, MAP_TYPE_1600, ptr); \
else if (__builtin_types_compatible_p(type, void **)) \
val_tmp = get_ptr_val_from_map(&map1600, MAP_TYPE_1600, ptr); \
Contributor

Why is this the same as line 204?

if (__builtin_types_compatible_p(type, char *)) \
val_tmp = get_ptr_val_from_map(&map192, MAP_TYPE_192, ptr); \
else if (__builtin_types_compatible_p(type, void *)) \
val_tmp = get_ptr_val_from_map(&map1600, MAP_TYPE_1600, ptr); \
Contributor

Suggested change:
-    val_tmp = get_ptr_val_from_map(&map1600, MAP_TYPE_1600, ptr); \
+    val_tmp = get_ptr_val_from_map(&map1600, MAP_TYPE_296, ptr); \

Signed-off-by: wuchangye <wuchangye@huawei.com>
Signed-off-by: wuchangye <wuchangye@huawei.com>
static inline int selected_oneof_field(void *value, const ProtobufCFieldDescriptor *field)
{
-    uint32_t n = *(uint32_t *)((char *)value + field->quantifier_offset);
+    unsigned int n = *(unsigned int *)((char *)value + field->quantifier_offset);
Member

Why not use uint32_t anymore? Aren't they the same?

static void free_elem(void *ptr)
{
-    if ((uintptr_t)ptr < MAX_OUTTER_MAP_ENTRIES)
+    if ((uintptr_t)ptr < UINT32_MAX)
Member

What does this mean?

__uint(max_entries, MAP_MAX_ENTRIES);
__uint(map_flags, BPF_F_NO_PREALLOC);
} kmesh_map296 SEC(".maps");
struct {
Member

Add a blank line before this.

__uint(max_entries, MAP_MAX_ENTRIES);
__uint(map_flags, BPF_F_NO_PREALLOC);
} kmesh_map192 SEC(".maps");
struct {
Member

Add a blank line.

struct bpf_map_info info = {};
__u32 info_len = sizeof(info);
unsigned char type = MAP_GET_TYPE(*outer_key);
unsigned int inner_idx = MAP_GET_INDEX(*outer_key);
Member

Elsewhere you use the parse function; this should be kept consistent.

@hzxuzhonghu (Member) left a comment

You should also update the enhanced bpf2go package.

struct map_mng {
int inner_fds[MAP_TYPE_MAX];
struct bpf_map_info inner_infos[MAP_TYPE_MAX];
unsigned char mim_used_bitmap[MAP_TYPE_MAX][MIM_BITMAP_SIZE];
Member

What is mim an abbreviation for?

}

-static int free_outter_map_entry(struct op_context *ctx, void *outter_key)
+static int bitmap_find_first_clear(unsigned char *bitmap, int len)
Member

This is a little too slow. I am thinking we could use an optimistic algorithm: record the last allocated bit position, and next time allocate starting from that position. This would largely reduce the looping and comparing.

Contributor Author

Good idea~ It will be optimized.
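The suggested optimization (a cursor remembering the last allocated bit position) might look like this sketch; the struct, sizes, and function names are illustrative, not the PR's final code:

```c
#define MIM_BITMAP_SIZE 32 /* bytes, i.e. 256 bits; illustrative size */

/* Allocator state: the bitmap plus a cursor at the last allocation point. */
struct bitmap_alloc {
    unsigned char bits[MIM_BITMAP_SIZE];
    int last; /* bit position to start scanning from */
};

/* Find and set the first clear bit, scanning from `last` and wrapping
 * around once; returns the bit index, or -1 if the bitmap is full. */
static int bitmap_alloc_bit(struct bitmap_alloc *a)
{
    int nbits = MIM_BITMAP_SIZE * 8;
    for (int n = 0; n < nbits; n++) {
        int i = (a->last + n) % nbits;
        if (!(a->bits[i / 8] & (1u << (i % 8)))) {
            a->bits[i / 8] |= (1u << (i % 8));
            a->last = (i + 1) % nbits;
            return i;
        }
    }
    return -1;
}

static void bitmap_free_bit(struct bitmap_alloc *a, int i)
{
    a->bits[i / 8] &= ~(1u << (i % 8));
}
```

Since allocations tend to be sequential, the common case finds a clear bit immediately after the cursor instead of rescanning the whole bitmap from the start.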

return fmt.Errorf("failed to set api env")
}

ret := C.deserial_init(restart.GetStartType() == restart.Restart)
Member

So this does not support restart now?

Signed-off-by: wuchangye <wuchangye@huawei.com>
}
#endif

#define KMESH_GET_PTR_VAL(ptr, type) \
Contributor

Was this defined as a macro (rather than a function) to reduce the number of instructions?
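For context, the macro dispatches on __builtin_types_compatible_p, which GCC/Clang evaluate as a compile-time constant, so the unused branches fold away; a minimal standalone illustration (not the PR's actual macro, and the size values are only examples):

```c
/* __builtin_types_compatible_p(a, b) is an integer constant expression in
 * GCC/Clang, so this whole conditional collapses at compile time and only
 * the matching branch survives in the generated code. */
#define SIZE_CLASS_FOR(type)                                      \
    (__builtin_types_compatible_p(type, char *)   ? 192           \
     : __builtin_types_compatible_p(type, void *) ? 1600          \
                                                  : 64)
```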

Signed-off-by: wuchangye <wuchangye@huawei.com>
@hzxuzhonghu (Member) left a comment

Much better.

/lgtm

@kmesh-bot (Collaborator)
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot kmesh-bot merged commit 47b3d1a into kmesh-net:main Nov 18, 2024
9 checks passed

5 participants