-
Notifications
You must be signed in to change notification settings - Fork 3.8k
-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BCC cannot handle alignment in struct used in HashMap #606
Comments
@drzaeus77 I think attribute ((aligned (8))) is not recognized by python side, so it's a bug. |
Yep, it works with the __builtin_memcpy without extra alignment directives. I was actually going to fix this, but I could not figure out where I would start looking, is there a document that describes the structure and the internal components of bcc? |
@4ast: It's not actually on the Python side. I checked the JSON type description we're getting in |
@4ast @drzaeus77: Seems to me that we would have to call |
Getting the offset out clang is pretty easy, but I can't find the way in ctypes to enforce the alignment of a particular field or the struct as a whole. The pack attribute is for the opposite case.
|
Maybe you could insert artificial padding fields? |
Bringing the discussion from #1131: I recently hit a related bug when trying to access key/values from a complex C structure. This would cause random segfaults or absurd interpreter corruption (i.e. objects would 'lose' some of their attributes over time ...). My take on it was that as the C structs are 8bytes aligned, while ctypes assumes only 4, the buffers it was allocating to call the various lib_bpf functions were too small (i.e. the lib_bpf code was overwriting cpython memory ...). I looked into how re-using the field offset info that is available in the clang frontend in order to align the ctypes structures properly. It does not seem feasible as: (i) the pack directive for ctypes Structure cannot grow larger than 4 on my system (it is ultimately defined by libffi in cpython) (ii) The offset variable of fields in a ctype Structure is read-only for the python VM (i.e. I cannot alter a class definition to change the auto-computed offsets). I guess that all of this could be solved by generating struct format strings instead of ctypes objects, but that would needed some extensive work to make it backward compatible with current code. Adding artificial padding fields would work as long as the ctypes structure pack parameter is set to 1 (which will make padding computation straightforward as this will disable ctypes own padding ...). At the very least, the ctypes-generated layout could be checked and a warning/error could be reported if offset values are inconsistent, i.e. exporting field offset alongside their type/names and checking that the offset value computed by ctypes is consistent (as given by My current workaround ensures that all 'parsed' structs (i.e. contained in maps accessed by python) are packed, and have their fields 4bytes aligned (possibly over-sizing some of them or adding unused padding). This seems sufficient for now (ctypes over-allocates in the worst case), but is certainly optimal wrt. performance. |
This patch intends to fix issue #606. Currently, the key/value type information is passed from C++ to Python through a JSON interface. The JSON is constructed by traversing the struct/field's through clang AST interface. Once Python gets the JSON, it will reconstruct the C structure through ctype module. There are two known issues where Python reconstructed C structure may not be the same as the original C structure: . if user explicitly use "__attribute__ ((align))" to alter field alignment and such information is not passed to Python. . the "__int128" type is a u64[2] array in python. So in C, __int128 needs to align on 16 bytes boundary, and in Python, it aligns with 8 bytes boundary. To solve this issue, this patch provided the structure with added padding fields to Python. For example, struct { char a; __int128 b; }; Python will receive struct { char a; char __pad_1[15]; __int128 b; }; Signed-off-by: Yonghong Song <yhs@fb.com>
This patch intends to fix issue #606. Currently, the key/value type information is passed from C++ to Python through a JSON interface. The JSON is constructed by traversing the struct/field's through clang AST interface. Once Python gets the JSON, it will reconstruct the C structure through ctype module. There are two known issues where Python reconstructed C structure may not be the same as the original C structure: . if user explicitly use "__attribute__ ((align))" to alter field alignment and such information is not passed to Python. . the "__int128" type is a u64[2] array in python. So in C, __int128 needs to align on 16 bytes boundary, and in Python, it aligns with 8 bytes boundary. To solve this issue, this patch provided the structure with added padding fields to Python. For example, struct { char a; __int128 b; }; Python will receive struct { char a; char __pad_1[15]; __int128 b; }; Signed-off-by: Yonghong Song <yhs@fb.com>
This patch intends to fix issue #606. Currently, the key/value type information is passed from C++ to Python through a JSON interface. The JSON is constructed by traversing the struct/field's through clang AST interface. Once Python gets the JSON, it will reconstruct the C structure through ctype module. There are two known issues where Python reconstructed C structure may not be the same as the original C structure: . if user explicitly use "__attribute__ ((align))" to alter field alignment and such information is not passed to Python. . the "__int128" type is a u64[2] array in python. So in C, __int128 needs to align on 16 bytes boundary, and in Python, it aligns with 8 bytes boundary. To solve this issue, this patch provided the structure with added padding fields to Python. For example, struct { char a; __int128 b; }; Python will receive struct { char a; char __pad_1[15]; __int128 b; }; Signed-off-by: Yonghong Song <yhs@fb.com>
This patch intends to fix issue iovisor#606. Currently, the key/value type information is passed from C++ to Python through a JSON interface. The JSON is constructed by traversing the struct/field's through clang AST interface. Once Python gets the JSON, it will reconstruct the C structure through ctype module. There are two known issues where Python reconstructed C structure may not be the same as the original C structure: . if user explicitly use "__attribute__ ((align))" to alter field alignment and such information is not passed to Python. . the "__int128" type is a u64[2] array in python. So in C, __int128 needs to align on 16 bytes boundary, and in Python, it aligns with 8 bytes boundary. To solve this issue, this patch provided the structure with added padding fields to Python. For example, struct { char a; __int128 b; }; Python will receive struct { char a; char __pad_1[15]; __int128 b; }; Signed-off-by: Yonghong Song <yhs@fb.com>
This patch intends to fix issue iovisor#606. Currently, the key/value type information is passed from C++ to Python through a JSON interface. The JSON is constructed by traversing the struct/field's through clang AST interface. Once Python gets the JSON, it will reconstruct the C structure through ctype module. There are two known issues where Python reconstructed C structure may not be the same as the original C structure: . if user explicitly use "__attribute__ ((align))" to alter field alignment and such information is not passed to Python. . the "__int128" type is a u64[2] array in python. So in C, __int128 needs to align on 16 bytes boundary, and in Python, it aligns with 8 bytes boundary. To solve this issue, this patch provided the structure with added padding fields to Python. For example, struct { char a; __int128 b; }; Python will receive struct { char a; char __pad_1[15]; __int128 b; }; Signed-off-by: Yonghong Song <yhs@fb.com>
Hi,
I had the following struct as an item in a hash map:
I needed to copy the contents of p to the stack and I though uint64_t* copies would be faster and less instructions so I did:
However, this caused the compiler to restore a register like this
r1 = (uint64_t*)(r5 + 2)
(I don't remember the actual error line) which of cause caused a misalignment error. I came to the conclusion ( even though I think the compiler should not have done that uint64_t* restoring) that this does not work as the struct is misaligned... So I changed the struct to:which works as expected... however, now when I fill the map from userspace... the data that I put into p is off by 6 byte, it seems that the userspace is unable to detect the actual alignment of the members in the struct and starts to put data for p right after len. The alignment should cause the compiler to add 6 bytes of padding between len and p.
Of course I can change everything in a way that it works however I find this rather difficult to debug...
Using
__builtin_memcpy
does not produce this behavior, there I can leave the alignment out or I can also switch p and len in the struct to make everything work...I was just wondering is this a conceptual limitation or just a bug in BCC?
The text was updated successfully, but these errors were encountered: