-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[BPF] Support for DW_TAG_variant_part
in BTF generation
#155783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior can be reproduced only with Rust. However, this trick cleans up the noisy IR produced by Rust from the panic handler and all unnecessary core types. I hope it's good enough for the test.
✅ With the latest revision this PR passed the undef deprecator. |
d634545
to
8fad84a
Compare
This |
8fad84a
to
f295fba
Compare
It goes away after removing the single-value variant: diff --git a/data-carrying-ebpf/src/main.rs b/data-carrying-ebpf/src/main.rs
index d472859..a4f5f33 100644
--- a/data-carrying-ebpf/src/main.rs
+++ b/data-carrying-ebpf/src/main.rs
@@ -4,15 +4,12 @@
pub enum DataCarryingEnum {
First { a: u32, b: i32 },
Second(u32, i32),
- Third(u32),
}
#[unsafe(no_mangle)]
pub static X: DataCarryingEnum = DataCarryingEnum::First { a: 54, b: -23 };
#[unsafe(no_mangle)]
pub static Y: DataCarryingEnum = DataCarryingEnum::Second(54, -23);
-#[unsafe(no_mangle)]
-pub static Z: DataCarryingEnum = DataCarryingEnum::Third(36);
#[cfg(not(test))]
#[panic_handler] Definitely something to report and fix in Rust. For now, to unblock this fix, I'm sticking to the first two variants. |
f295fba
to
1de07f4
Compare
Let's change the name to: |
1de07f4
to
7a09868
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
7a09868
to
413a98a
Compare
DW_TAG_variant_part
in BTF generation
OK, it really looks like I need to fix the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except for the note about HasBitField from Yonghong. Do you plan to wrap this up?
I do, but I'm blocked by Rust which is emitting |
But we know how rust represents the enum, you don't need rust to wrap up the test case. E.g. the IR below is all you need. It can even be cleaned up a bit.
|
Oh, so in your case rustc didn't emit vadorovsky@413a98a#diff-f411e8cbe94fed63916ca9eebac7652eddbb2db4770934eb94c26632216cbc58R42 The difference in our codes that I added one more constant for the other variant and used I'm AFK ATM. Tomorrow I will try to produce a similar IR without (The issue with Rust emitting |
It's a backend test. When I write such tests I start from something generated by frontend but cut away any unrelated stuff. In this particular case the only relevant thing is the structure of debug information generated for variant/variant_part. You can drop initialization from those globals and it would still be fine for the backend test.
rustc version 1.88.0 (6b00bc388 2025-06-23) (Red Hat 1.88.0-1.el9) |
413a98a
to
a22acb8
Compare
@eddyz87 After playing a bit with your and my code, I realized that Rust emits All comments should be addressed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is fine, but just realized an unfortunate quirk regarding discriminator handling. Consider the following example:
pub enum Adt {
First { a: u32, b: i32 },
Second(u32, i32),
}
pub static X: Adt = Adt::First{a:0, b:0};
With corresponding IR:
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, ..., discriminator: !22)
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", ..., elements: !11, ...)
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
...
!22 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
And corresponding DWARF:
0x0000003d: DW_TAG_structure_type
DW_AT_name ("Adt")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
0x00000045: DW_TAG_variant_part
DW_AT_discr (0x0000004a)
0x0000004a: DW_TAG_member
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
DW_AT_artificial (true)
0x00000051: DW_TAG_variant
DW_AT_discr_value (0x00)
0x00000053: DW_TAG_member
DW_AT_name ("First")
DW_AT_type (0x0000006e "test_debug_info::Adt::First")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
0x0000005e: NULL
0x0000005f: DW_TAG_variant
DW_AT_discr_value (0x01)
0x00000061: DW_TAG_member
DW_AT_name ("Second")
DW_AT_type (0x0000008f "test_debug_info::Adt::Second")
DW_AT_alignment (4)
DW_AT_data_member_location (0x00)
0x0000006c: NULL
0x0000006d: NULL
0x0000006e: DW_TAG_structure_type
DW_AT_name ("First")
DW_AT_byte_size (0x0c)
DW_AT_accessibility (DW_ACCESS_public)
DW_AT_alignment (4)
0x00000076: DW_TAG_member
DW_AT_name ("a")
DW_AT_type (0x000000b2 "u32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x04)
DW_AT_accessibility (DW_ACCESS_public)
0x00000082: DW_TAG_member
DW_AT_name ("b")
DW_AT_type (0x000000b9 "i32")
DW_AT_alignment (4)
DW_AT_data_member_location (0x08)
DW_AT_accessibility (DW_ACCESS_public)
0x0000008e: NULL
Note how offsets for a
and b
are shifted by 4 bytes and variant part itself has three members, beside First and Second there is also an anonymous member 0x0000004a
, representing a descriminator at offset 0.
Meaning that in BTF union representing the variant part has to have three members, not two. Which is a bit inconvenient, as it is not a part of "elements", but instead a separate "discriminator" reference.
Wdyt?
Good catch!
Given that:
Wouldn't it be more correct to represent the variant part as a struct with two members - discriminant and union (which then contains the variants as elements)? Another option would be extending BTF to actually represent the variant part in a similar way to either LLVM DI or DWARF, showing the discriminant and variants explicitly. |
This makes sense, but will require adjusting member offsets, compared to what LLVM describes in DI. I'd avoid such complication at the moment, but you can give it a try if you want. Note that it will have to be done in a generic way, meaning that we can't hard code that discriminant is always 4 bytes and at offset 0, this info would have to be extracted from DI. Also note, that similar reconstruction will heed to happen in pahole, eventually, when BTF generated for rust kernel code would become important. Pahole generates BTF from DWARF.
Given that there are alternative options: a union with additional member for discriminator, or a struct with discriminator and a union, I don't think kernel upstream would be happy to extend BTF. |
For BPF, I think we can do @eddyz87 suggested below: For non-BTF, we probably cannot do much in llvm and pahole needs to do above conversion. Is it possible for rust frontend to generate easier debuginfo which can be easily mapped to BTF? |
Alright, I will go forward with the union and offsets then. Hopefully I can get it working by tomorrow.
Yes, and I'm willing to try implementing that myself in pahole, once this PR is merged, unless there is someone else already planning such work. |
a22acb8
to
aed0232
Compare
@eddyz87 @yonghong-song I got it working and went with the additional member solution. |
aed0232
to
be94a5f
Compare
struct BTF::BTFMember Discriminator; | ||
const auto *DDTy = STy->getDiscriminator(); | ||
|
||
InitialOffset += DDTy->getOffsetInBits() + DDTy->getSizeInBits(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no chance for getOffsetInBits()
being anything else than 0 here, but I'm still including it to make the implementation generic instead of just assuming things. But let me know if you have other thoughts.
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
|
||
Discriminator.NameOff = BDebug.addString(DDTy->getName()); | ||
Discriminator.Offset = DDTy->getOffsetInBits(); | ||
const auto *BaseTy = tryRemoveAtomicType(DDTy->getBaseType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we need tryRemoveAtomicType(). Did you find a use case that tryRemoveAtomicType() could be necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't. You're right, I removed it now.
llvm/lib/Target/BPF/BTFDebug.cpp
Outdated
BTFMember.NameOff = BDebug.addString(DCTy->getName()); | ||
BTFMember.Offset = InitialOffset + DCTy->getOffsetInBits(); | ||
const auto *DTy = cast<DIType>(DCTy); | ||
BTFMember.Type = BDebug.getTypeId(DTy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DTy is not needed. Directly using BDebug.getTypeId(DCTy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Variant part, represented by `DW_TAG_variant_part` is a structure with a discriminant and different variants, from which only one can be active and valid at the same time. The discriminant is the main difference between variant parts and unions represented by `DW_TAG_union` type. Variant parts are used by Rust enums, which look like: ```rust pub enum MyEnum { First { a: u32, b: i32 }, Second(u32), } ``` This type's debug info is the following `DICompositeType` with `DW_TAG_structure_type` tag: ```llvm !4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum", scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !6, templateParams: !16, identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb") ``` With one element being also a `DICompositeType`, but with `DW_TAG_variant_part` tag: ```llvm !6 = !{!7} !7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, templateParams: !16, identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21) ``` Which has a discriminator: ```llvm !21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial) ``` Which then holds different variants as `DIDerivedType` elements with `DW_TAG_member` tag: ```llvm !8 = !{!9, !17} !9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0) !10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !11, templateParams: !16, identifier: "cc7748c842e275452db4205b190c8ff7") !11 = !{!12, !14} !12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic) !13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned) !14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10, file: !5, baseType: !15, size: 32, align: 32, offset: 64, flags: DIFlagPublic) !15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed) !16 = !{} !17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7, file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1) !18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !19, templateParams: !16, identifier: "a2094b1381f3082d504fbd0903aa7c06") !19 = !{!20} !20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic) ``` BPF backend was assuming that all the elements of any `DICompositeType` have tag `DW_TAG_member` and are instances of `DIDerivedType`. However, the single element of the outer composite type `!4` has tag `DW_TAG_variant_part` and is an instance of `DICompositeType`. The unconditional call of `cast<DIDerivedType>` on all elements was causing an assertion failure when any Rust code with enums was compiled to the BPF target. Fix that by: * Handling `DW_TAG_variant_part` in `visitStructType`. * Replacing unconditional call of `cast<DIDerivedType>` over `DICompositeType` elements with a `switch` statement, handling both `DW_TAG_member` and `DW_TAG_variant_part` and casting the element to an appropriate type (`DIDerivedType` or `DICompositeType`). To keep BTF simple and make BTF relocations correct, represent the discriminator as the first element and apply an offset to all elements. Fixes: llvm#155778
be94a5f
to
c3d8eb4
Compare
; CHECK-BTF-NEXT: [2] UNION '(anon)' size=12 vlen=3 | ||
; CHECK-BTF-NEXT: '(anon)' type_id=4 bits_offset=0 | ||
; CHECK-BTF-NEXT: 'First' type_id=3 bits_offset=32 | ||
; CHECK-BTF-NEXT: 'Second' type_id=6 bits_offset=32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we kindly agree that we do not like have different bit_offset for union members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you agree then with representing it as a struct with a discriminator and an union? Like:
[2] STRUCT '(anon)' size=12 vlen=2
'(anon)' type_id=4 bits_offset=0 // discriminator
'(anon)' type_id=3 bits_offset=32 // union with variants
[3] UNION '(anon)' size=8 vlen=2
'First' type_id=4 bits_offset=0
'Second' type_id=7 bits_offset=0
That's what I proposed initially #155783 (comment) and I'm quite confident that the code won't be too complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This should work. Thanks!
Variant part, represented by
DW_TAG_variant_part
is a structure with a discriminant and different variants, from which only one can be active and valid at the same time. The discriminant is the main difference between variant parts and unions represented byDW_TAG_union
type.Variant parts are used by Rust enums, which look like:
This type's debug info is the following
DICompositeType
withDW_TAG_structure_type
tag:With one element being also a
DICompositeType
, but withDW_TAG_variant_part
tag:Which has a discriminator:
Which then holds different variants as
DIDerivedType
elements withDW_TAG_member
tag:BPF backend was assuming that all the elements of any
DICompositeType
have tagDW_TAG_member
and are instances ofDIDerivedType
. However, the single element of the outer composite type!4
has tagDW_TAG_variant_part
and is an instance ofDICompositeType
. The unconditional call ofcast<DIDerivedType>
on all elements was causing an assertion failure when any Rust code with enums was compiled to the BPF target.Fix that by:
DW_TAG_variant_part
invisitStructType
.cast<DIDerivedType>
overDICompositeType
elements with aswitch
statement, handling bothDW_TAG_member
andDW_TAG_variant_part
and casting the element to an appropriate type (DIDerivedType
orDICompositeType
).Fixes: #155778