@vadorovsky vadorovsky commented Aug 28, 2025

A variant part, represented by DW_TAG_variant_part, is a structure with a discriminant and a set of variants, of which only one can be active and valid at a time. The discriminant is the main difference between variant parts and unions, which are represented by DW_TAG_union_type.

Variant parts are used by Rust enums, which look like:

pub enum MyEnum {
    First { a: u32, b: i32 },
    Second(u32),
}

This type's debug info is the following DICompositeType with DW_TAG_structure_type tag:

!4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum",
     scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic,
     elements: !6, templateParams: !16,
     identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb")

Its single element is also a DICompositeType, but with the DW_TAG_variant_part tag:

!6 = !{!7}
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5,
     size: 96, align: 32, elements: !8, templateParams: !16,
     identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21)

The variant part has a discriminator:

!21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5,
      baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)

The variant part then holds the variants as DIDerivedType elements with the DW_TAG_member tag:

!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7,
     file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !11, templateParams: !16,
      identifier: "cc7748c842e275452db4205b190c8ff7")
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)
!13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10,
      file: !5, baseType: !15, size: 32, align: 32, offset: 64,
      flags: DIFlagPublic)
!15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!16 = !{}
!17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7,
      file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !19, templateParams: !16,
      identifier: "a2094b1381f3082d504fbd0903aa7c06")
!19 = !{!20}
!20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)

The BPF backend assumed that all elements of any DICompositeType have the tag DW_TAG_member and are instances of DIDerivedType. However, the single element of the outer composite type !4 has the tag DW_TAG_variant_part and is an instance of DICompositeType. The unconditional cast<DIDerivedType> on all elements caused an assertion failure whenever Rust code with such enums was compiled for the BPF target.

Fix that by:

  • Handling DW_TAG_variant_part in visitStructType.
  • Replacing unconditional call of cast<DIDerivedType> over DICompositeType elements with a switch statement, handling both DW_TAG_member and DW_TAG_variant_part and casting the element to an appropriate type (DIDerivedType or DICompositeType).
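The failure mode and the fix described above can be illustrated with a small Rust model. This is only a sketch: `Element`, `as_member_unchecked`, `describe` and `my_enum_elements` are hypothetical names and do not correspond to LLVM's API. The `None` result stands in for the assertion failure that `cast<DIDerivedType>` triggers when it is handed a `DICompositeType`.

```rust
// Hypothetical stand-ins for the two kinds of elements; these are not LLVM types.
#[derive(Debug)]
pub enum Element {
    /// Models a DIDerivedType with the DW_TAG_member tag.
    Member { name: &'static str },
    /// Models a DICompositeType with the DW_TAG_variant_part tag.
    VariantPart { variants: Vec<Element> },
}

/// Old assumption: every element is a member. `None` plays the role of the
/// assertion failure caused by the unconditional cast.
pub fn as_member_unchecked(e: &Element) -> Option<&'static str> {
    match e {
        Element::Member { name } => Some(*name),
        Element::VariantPart { .. } => None,
    }
}

/// Fixed approach: dispatch on the element kind and handle both cases.
pub fn describe(e: &Element) -> String {
    match e {
        Element::Member { name } => format!("member {name}"),
        Element::VariantPart { variants } => {
            format!("variant part with {} variants", variants.len())
        }
    }
}

/// Elements of the outer `MyEnum` composite type: a single variant part
/// that in turn holds the `First` and `Second` members.
pub fn my_enum_elements() -> Vec<Element> {
    vec![Element::VariantPart {
        variants: vec![
            Element::Member { name: "First" },
            Element::Member { name: "Second" },
        ],
    }]
}
```

Calling `as_member_unchecked` on the only element of `my_enum_elements()` yields `None`, while `describe` handles it without failing.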

Fixes: #155778

@vadorovsky
Contributor Author

@eddyz87 @tamird

Contributor Author

@vadorovsky vadorovsky Aug 28, 2025

This behavior can be reproduced only with Rust. However, this trick cleans up the noisy IR produced by Rust from the panic handler and all unnecessary core types. I hope it's good enough for the test.

github-actions bot commented Aug 28, 2025

✅ With the latest revision this PR passed the undef deprecator.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from d634545 to 8fad84a Compare August 28, 2025 08:21
@vadorovsky
Contributor Author

vadorovsky commented Aug 28, 2025

This undef issue seems like something that needs to be fixed in Rust. I will take a look.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from 8fad84a to f295fba Compare August 28, 2025 08:38
@vadorovsky
Contributor Author

> This undef issue seems like something that needs to be fixed in Rust. I will take a look.

It goes away after removing the single-value variant:

diff --git a/data-carrying-ebpf/src/main.rs b/data-carrying-ebpf/src/main.rs
index d472859..a4f5f33 100644
--- a/data-carrying-ebpf/src/main.rs
+++ b/data-carrying-ebpf/src/main.rs
@@ -4,15 +4,12 @@
 pub enum DataCarryingEnum {
     First { a: u32, b: i32 },
     Second(u32, i32),
-    Third(u32),
 }

 #[unsafe(no_mangle)]
 pub static X: DataCarryingEnum = DataCarryingEnum::First { a: 54, b: -23 };
 #[unsafe(no_mangle)]
 pub static Y: DataCarryingEnum = DataCarryingEnum::Second(54, -23);
-#[unsafe(no_mangle)]
-pub static Z: DataCarryingEnum = DataCarryingEnum::Third(36);

 #[cfg(not(test))]
 #[panic_handler]

Definitely something to report and fix in Rust. For now, to unblock this fix, I'm sticking to the first two variants.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from f295fba to 1de07f4 Compare August 28, 2025 08:45
@eddyz87
Contributor

eddyz87 commented Aug 28, 2025

Let's change the name to: Support for DW_TAG_variant_part in BTF generation. Data carrying enum is not an official term anyway (is this how algebraic data types are called these days?).

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from 1de07f4 to 7a09868 Compare September 1, 2025 09:34
github-actions bot commented Sep 1, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from 7a09868 to 413a98a Compare September 1, 2025 09:42
@vadorovsky vadorovsky changed the title [BPF] Handle data-carrying enums [BPF] Support for DW_TAG_variant_part in BTF generation Sep 1, 2025
@vadorovsky
Contributor Author

OK, it really looks like I need to fix the undef usage in static variables in Rust first. 😞

Contributor

@eddyz87 eddyz87 left a comment

@vadorovsky ,

Looks good except for the note about HasBitField from Yonghong. Do you plan to wrap this up?

@vadorovsky
Contributor Author

vadorovsky commented Sep 9, 2025

> Do you plan to wrap this up?

I do, but I'm blocked by Rust which is emitting undef for all static variable definitions. Because of that, there is no way to make the CI happy with the IR test I added. Still trying to figure out a proper fix on Rust's side. If you prefer, I can close or mark this PR as draft until I sort it out.

@eddyz87
Contributor

eddyz87 commented Sep 9, 2025

> > Do you plan to wrap this up?
>
> I do, but I'm blocked by Rust which is emitting undef for all static variable definitions. Because of that, there is no way to make the CI happy with the IR test I added. Still trying to figure out a proper fix on Rust's side. If you prefer, I can close or mark this PR as draft until I sort it out.

But we know how Rust represents the enum, so you don't need Rust to wrap up the test case. For example, the IR below is all you need, and it can even be cleaned up a bit.

$ cat test-debug-info.rs 
pub enum Adt {
    First { a: u32, b: i32 },
    Second(u32, i32),
}

pub static X: Adt = Adt::First{a:0, b:0};

$ rustc --emit=llvm-ir -C debuginfo=full --crate-type=lib -o - test-debug-info.rs 
; ModuleID = 'test_debug_info.bfdfecb05037b21e-cgu.0'
source_filename = "test_debug_info.bfdfecb05037b21e-cgu.0"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@_ZN15test_debug_info1X17h1cd60cbc8df4384cE = constant [12 x i8] zeroinitializer, align 4, !dbg !0

!llvm.module.flags = !{!23, !24, !25, !26}
!llvm.ident = !{!27}
!llvm.dbg.cu = !{!28}

!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
!1 = distinct !DIGlobalVariable(name: "X", linkageName: "_ZN15test_debug_info1X17h1cd60cbc8df4384cE", scope: !2, file: !3, line: 6, type: !4, isLocal: false, isDefinition: true, align: 32)
!2 = !DINamespace(name: "test_debug_info", scope: null)
!3 = !DIFile(filename: "test-debug-info.rs", directory: "/home/ezingerman/tmp", checksumkind: CSK_MD5, checksum: "d031470f0fcafae0cde20cea6e49f258")
!4 = !DICompositeType(tag: DW_TAG_structure_type, name: "Adt", scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !6, templateParams: !16, identifier: "b82d4441ddd9815f702f04ccb4300dfa")
!5 = !DIFile(filename: "<unknown>", directory: "")
!6 = !{!7}
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, templateParams: !16, identifier: "7cc4d1a13945d2033b3024c6d22bc7f3", discriminator: !22)
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !11, templateParams: !16, identifier: "3be91d17592010323b31ce88a6234f5c")
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
!13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10, file: !5, baseType: !15, size: 32, align: 32, offset: 64, flags: DIFlagPublic)
!15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!16 = !{}
!17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7, file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second", scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic, elements: !19, templateParams: !16, identifier: "734c122892f88e54fc2cbc5b90961e52")
!19 = !{!20, !21}
!20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
!21 = !DIDerivedType(tag: DW_TAG_member, name: "__1", scope: !18, file: !5, baseType: !15, size: 32, align: 32, offset: 64, flags: DIFlagPublic)
!22 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
!23 = !{i32 8, !"PIC Level", i32 2}
!24 = !{i32 2, !"RtLibUseGOT", i32 1}
!25 = !{i32 7, !"Dwarf Version", i32 4}
!26 = !{i32 2, !"Debug Info Version", i32 3}
!27 = !{!"rustc version 1.88.0 (6b00bc388 2025-06-23) (Red Hat 1.88.0-1.el9)"}
!28 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !29, producer: "clang LLVM (rustc version 1.88.0 (6b00bc388 2025-06-23) (Red Hat 1.88.0-1.el9))", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !30, splitDebugInlining: false, nameTableKind: None)
!29 = !DIFile(filename: "test-debug-info.rs/@/test_debug_info.bfdfecb05037b21e-cgu.0", directory: "/home/ezingerman/tmp")
!30 = !{!0}

@vadorovsky
Contributor Author

vadorovsky commented Sep 9, 2025

Oh, so in your case rustc didn't emit undef in the constant, interesting. In my case, the IR fragments defining constants look like:

vadorovsky@413a98a#diff-f411e8cbe94fed63916ca9eebac7652eddbb2db4770934eb94c26632216cbc58R42

The difference between our code is that I added one more constant for the other variant and used the no_mangle attribute.

I'm AFK ATM. Tomorrow I will try to produce a similar IR without undef, or just use your IR in the test. What Rust version did you use? I used the latest nightly from a week ago.

(In any case, the issue with Rust emitting undef will still need to be fixed eventually, but it's good to know it doesn't necessarily block my PR 🙂 )

@vadorovsky vadorovsky closed this Sep 9, 2025
@vadorovsky vadorovsky reopened this Sep 9, 2025
@eddyz87
Contributor

eddyz87 commented Sep 9, 2025

> Oh, so in your case rustc didn't emit undef in the constant, interesting. In my case, the IR fragments defining constants look like:
>
> vadorovsky@413a98a#diff-f411e8cbe94fed63916ca9eebac7652eddbb2db4770934eb94c26632216cbc58R42
>
> The difference between our code is that I added one more constant for the other variant and used the no_mangle attribute.

It's a backend test. When I write such tests I start from something generated by frontend but cut away any unrelated stuff. In this particular case the only relevant thing is the structure of debug information generated for variant/variant_part. You can drop initialization from those globals and it would still be fine for the backend test.

> I'm AFK ATM. Tomorrow I will try to produce a similar IR without undef, or just use your IR in the test. What Rust version did you use? I used the latest nightly from a week ago.

rustc version 1.88.0 (6b00bc388 2025-06-23) (Red Hat 1.88.0-1.el9)

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from 413a98a to a22acb8 Compare September 11, 2025 06:20
@vadorovsky
Contributor Author

@eddyz87 After playing a bit with your and my code, I realized that Rust emits undef if the given struct/variant has more than two fields. Trimming down my struct to your layout (variant with 2 fields, variant with 1 field) fixed that. I still managed to trim the IR with llvm-extract instead of doing it manually, so it's easier to grasp for other people if we ever need to regenerate the IR.

All comments should be addressed now.

Contributor

@eddyz87 eddyz87 left a comment

I think this change is fine, but I just realized there is an unfortunate quirk regarding discriminator handling. Consider the following example:

pub enum Adt {
    First { a: u32, b: i32 },
    Second(u32, i32),
}

pub static X: Adt = Adt::First{a:0, b:0};

With corresponding IR:

!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5, size: 96, align: 32, elements: !8, ..., discriminator: !22)
  !8 = !{!9, !17}
  !9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7, file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
  !10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First", ..., elements: !11, ...)
    !11 = !{!12, !14}
    !12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10, file: !5, baseType: !13, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
      ...

  !22 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5, baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)

And corresponding DWARF:

0x0000003d:     DW_TAG_structure_type
                  DW_AT_name    ("Adt")
                  DW_AT_byte_size       (0x0c)
                  DW_AT_accessibility   (DW_ACCESS_public)
                  DW_AT_alignment       (4)

0x00000045:       DW_TAG_variant_part
                    DW_AT_discr (0x0000004a)

0x0000004a:         DW_TAG_member
                      DW_AT_type        (0x000000b2 "u32")
                      DW_AT_alignment   (4)
                      DW_AT_data_member_location        (0x00)
                      DW_AT_artificial  (true)

0x00000051:         DW_TAG_variant
                      DW_AT_discr_value (0x00)

0x00000053:           DW_TAG_member
                        DW_AT_name      ("First")
                        DW_AT_type      (0x0000006e "test_debug_info::Adt::First")
                        DW_AT_alignment (4)
                        DW_AT_data_member_location      (0x00)

0x0000005e:           NULL

0x0000005f:         DW_TAG_variant
                      DW_AT_discr_value (0x01)

0x00000061:           DW_TAG_member
                        DW_AT_name      ("Second")
                        DW_AT_type      (0x0000008f "test_debug_info::Adt::Second")
                        DW_AT_alignment (4)
                        DW_AT_data_member_location      (0x00)

0x0000006c:           NULL

0x0000006d:         NULL

0x0000006e:       DW_TAG_structure_type
                    DW_AT_name  ("First")
                    DW_AT_byte_size     (0x0c)
                    DW_AT_accessibility (DW_ACCESS_public)
                    DW_AT_alignment     (4)

0x00000076:         DW_TAG_member
                      DW_AT_name        ("a")
                      DW_AT_type        (0x000000b2 "u32")
                      DW_AT_alignment   (4)
                      DW_AT_data_member_location        (0x04)
                      DW_AT_accessibility       (DW_ACCESS_public)

0x00000082:         DW_TAG_member
                      DW_AT_name        ("b")
                      DW_AT_type        (0x000000b9 "i32")
                      DW_AT_alignment   (4)
                      DW_AT_data_member_location        (0x08)
                      DW_AT_accessibility       (DW_ACCESS_public)

0x0000008e:         NULL

Note how the offsets for a and b are shifted by 4 bytes, and how the variant part itself has three members: besides First and Second, there is also an anonymous member at 0x0000004a, representing the discriminator at offset 0.

This means that in BTF, the union representing the variant part has to have three members, not two. That is a bit inconvenient, because the discriminator is not part of "elements" but a separate "discriminator" reference.
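The layout shown in the DWARF dump can be reproduced in a small Rust sketch. It rests on an assumption that is not part of the thread: `#[repr(u32)]` is added to the enum, because the default representation has no guaranteed layout. With a primitive representation, Rust defines the enum as a union of `#[repr(C)]` structs that each start with the tag, which coincides with what the DWARF above describes. `FirstRepr`, `adt_size` and `first_offsets` are hypothetical helper names.

```rust
use std::mem::{offset_of, size_of};

// Assumption: `#[repr(u32)]` is added to obtain a defined layout; the enum
// in the thread uses the default (unspecified) representation.
#[allow(dead_code)]
#[repr(u32)]
pub enum Adt {
    First { a: u32, b: i32 },
    Second(u32, i32),
}

// Hand-written equivalent of the `First` variant under `#[repr(u32)]`:
// the tag comes first, followed by the variant's fields.
#[allow(dead_code)]
#[repr(C)]
pub struct FirstRepr {
    tag: u32,
    a: u32,
    b: i32,
}

/// Size of the whole enum in bytes.
pub fn adt_size() -> usize {
    size_of::<Adt>()
}

/// Byte offsets of the tag, `a` and `b` inside the `First` variant.
pub fn first_offsets() -> (usize, usize, usize) {
    (
        offset_of!(FirstRepr, tag),
        offset_of!(FirstRepr, a),
        offset_of!(FirstRepr, b),
    )
}
```

Here `adt_size()` is 12 and `first_offsets()` is `(0, 4, 8)`, matching `DW_AT_byte_size (0x0c)` and the `DW_AT_data_member_location` values `0x00`, `0x04` and `0x08` in the dump.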

Wdyt?

@vadorovsky
Contributor Author

vadorovsky commented Sep 12, 2025

Good catch!

> This means that in BTF, the union representing the variant part has to have three members, not two.

Given that:

  • The discriminant takes the first 4 bytes.
  • The variants start at a 4-byte offset.
  • The discriminant and the variants have different memory locations; only the variants share the same location.

Wouldn't it be more correct to represent the variant part as a struct with two members - discriminant and union (which then contains the variants as elements)?

Another option would be extending BTF to actually represent the variant part in a similar way to either LLVM DI or DWARF, showing the discriminant and variants explicitly.

@eddyz87
Contributor

eddyz87 commented Sep 12, 2025

> Wouldn't it be more correct to represent the variant part as a struct with two members - discriminant and union (which then contains the variants as elements)?

This makes sense, but will require adjusting member offsets, compared to what LLVM describes in DI. I'd avoid such complication at the moment, but you can give it a try if you want. Note that it will have to be done in a generic way, meaning that we can't hard code that discriminant is always 4 bytes and at offset 0, this info would have to be extracted from DI.

Also note that a similar reconstruction will need to happen in pahole eventually, when BTF generated for Rust kernel code becomes important. Pahole generates BTF from DWARF.

> Another option would be extending BTF to actually represent the variant part in a similar way to either LLVM DI or DWARF, showing the discriminant and variants explicitly.

Given that there are alternative options: a union with additional member for discriminator, or a struct with discriminator and a union, I don't think kernel upstream would be happy to extend BTF.

@yonghong-song
Contributor

For BPF, I think we can do what @eddyz87 suggested:

> Given that there are alternative options: a union with additional member for discriminator, or a struct with discriminator and a union, I don't think kernel upstream would be happy to extend BTF.

This will make the BTF format consistent with the existing practice.

For non-BTF, we probably cannot do much in LLVM, and pahole needs to do the above conversion.

Is it possible for the Rust frontend to generate simpler debug info that can be easily mapped to BTF?

@vadorovsky
Contributor Author

> a union with additional member for discriminator

> This will make the BTF format consistent with the existing practice.

Alright, I will go forward with the union and offsets then. Hopefully I can get it working by tomorrow.

> For non-BTF, we probably cannot do much in LLVM, and pahole needs to do the above conversion.

Yes, and I'm willing to try implementing that myself in pahole, once this PR is merged, unless there is someone else already planning such work.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from a22acb8 to aed0232 Compare September 28, 2025 07:18
@vadorovsky
Contributor Author

@eddyz87 @yonghong-song I got it working and went with the additional member solution.

@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from aed0232 to be94a5f Compare September 28, 2025 07:21
struct BTF::BTFMember Discriminator;
const auto *DDTy = STy->getDiscriminator();

InitialOffset += DDTy->getOffsetInBits() + DDTy->getSizeInBits();
Contributor Author

I think there is no chance of getOffsetInBits() being anything other than 0 here, but I'm still including it to keep the implementation generic instead of just assuming things. Let me know if you have other thoughts.

@vadorovsky vadorovsky requested a review from eddyz87 September 28, 2025 07:26

Discriminator.NameOff = BDebug.addString(DDTy->getName());
Discriminator.Offset = DDTy->getOffsetInBits();
const auto *BaseTy = tryRemoveAtomicType(DDTy->getBaseType());
Contributor

I do not think we need tryRemoveAtomicType(). Did you find a use case where tryRemoveAtomicType() would be necessary?

Contributor Author

@vadorovsky vadorovsky Oct 1, 2025

I didn't. You're right, I removed it now.

BTFMember.NameOff = BDebug.addString(DCTy->getName());
BTFMember.Offset = InitialOffset + DCTy->getOffsetInBits();
const auto *DTy = cast<DIType>(DCTy);
BTFMember.Type = BDebug.getTypeId(DTy);
Contributor

DTy is not needed. Use BDebug.getTypeId(DCTy) directly.

Contributor Author

Done

A variant part, represented by `DW_TAG_variant_part`, is a structure
with a discriminant and a set of variants, of which only one can be
active and valid at a time. The discriminant is the main difference
between variant parts and unions, which are represented by
`DW_TAG_union_type`.

Variant parts are used by Rust enums, which look like:

```rust
pub enum MyEnum {
    First { a: u32, b: i32 },
    Second(u32),
}
```

This type's debug info is the following `DICompositeType` with
`DW_TAG_structure_type` tag:

```llvm
!4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum",
     scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic,
     elements: !6, templateParams: !16,
     identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb")
```

Its single element is also a `DICompositeType`, but with the
`DW_TAG_variant_part` tag:

```llvm
!6 = !{!7}
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5,
     size: 96, align: 32, elements: !8, templateParams: !16,
     identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21)
```

The variant part has a discriminator:

```llvm
!21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5,
      baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
```

The variant part then holds the variants as `DIDerivedType` elements
with the `DW_TAG_member` tag:

```llvm
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7,
     file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !11, templateParams: !16,
      identifier: "cc7748c842e275452db4205b190c8ff7")
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)
!13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10,
      file: !5, baseType: !15, size: 32, align: 32, offset: 64,
      flags: DIFlagPublic)
!15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!16 = !{}
!17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7,
      file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !19, templateParams: !16,
      identifier: "a2094b1381f3082d504fbd0903aa7c06")
!19 = !{!20}
!20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)
```

The BPF backend assumed that all elements of any `DICompositeType`
have the tag `DW_TAG_member` and are instances of `DIDerivedType`.
However, the single element of the outer composite type `!4` has the
tag `DW_TAG_variant_part` and is an instance of `DICompositeType`. The
unconditional `cast<DIDerivedType>` on all elements caused an assertion
failure whenever Rust code with such enums was compiled for the BPF
target.

Fix that by:

* Handling `DW_TAG_variant_part` in `visitStructType`.
* Replacing unconditional call of `cast<DIDerivedType>` over
  `DICompositeType` elements with a `switch` statement, handling
  both `DW_TAG_member` and `DW_TAG_variant_part` and casting the element
  to an appropriate type (`DIDerivedType` or `DICompositeType`).

To keep BTF simple and make BTF relocations correct, represent the
discriminator as the first element and apply an offset to all elements.

Fixes: llvm#155778
@vadorovsky vadorovsky force-pushed the bpf-data-carrying-enum branch from be94a5f to c3d8eb4 Compare October 1, 2025 09:16
; CHECK-BTF-NEXT: [2] UNION '(anon)' size=12 vlen=3
; CHECK-BTF-NEXT: '(anon)' type_id=4 bits_offset=0
; CHECK-BTF-NEXT: 'First' type_id=3 bits_offset=32
; CHECK-BTF-NEXT: 'Second' type_id=6 bits_offset=32
Contributor

I think we kind of agree that we do not like having different bit offsets for union members.

Contributor Author

Would you agree then with representing it as a struct with a discriminator and a union? Like:

[2] STRUCT '(anon)' size=12 vlen=2
        '(anon)' type_id=4 bits_offset=0 // discriminator
        '(anon)' type_id=3 bits_offset=32 // union with variants
[3] UNION '(anon)' size=8 vlen=2
        'First' type_id=4 bits_offset=0
        'Second' type_id=7 bits_offset=0

That's what I proposed initially (#155783 (comment)), and I'm quite confident that the code won't be too complicated.
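The arithmetic behind such a struct-plus-union split can be sketched in Rust as follows. This is an illustration under stated assumptions, not the code from the PR: `SplitLayout`, `split_variant_part` and `rebase_member_offset` are hypothetical names, the discriminator is assumed to sit in front of the variant data, and all sizes are assumed to be whole bytes. The discriminator's offset and size are parameters rather than hard-coded values, in line with the request to extract them from the debug info.

```rust
/// Hypothetical description of the struct-plus-union split.
#[derive(Debug, PartialEq, Eq)]
pub struct SplitLayout {
    /// Size of the outer struct in bytes.
    pub struct_size_bytes: u64,
    /// Bit offset of the discriminator member inside the struct.
    pub discr_offset_bits: u64,
    /// Bit offset of the union member inside the struct.
    pub union_offset_bits: u64,
    /// Size of the union holding the variants, in bytes.
    pub union_size_bytes: u64,
}

/// Computes where the union starts and how large it is, given the total
/// size of the variant part and the discriminator's offset and size.
/// Assumes the discriminator precedes the variant data.
pub fn split_variant_part(
    total_size_bits: u64,
    discr_offset_bits: u64,
    discr_size_bits: u64,
) -> SplitLayout {
    let union_offset_bits = discr_offset_bits + discr_size_bits;
    SplitLayout {
        struct_size_bytes: total_size_bits / 8,
        discr_offset_bits,
        union_offset_bits,
        union_size_bytes: (total_size_bits - union_offset_bits) / 8,
    }
}

/// Rebases a field offset taken from the debug info (relative to the start
/// of the enum) so that it becomes relative to the start of the union.
/// Returns `None` if the field would start before the union.
pub fn rebase_member_offset(member_offset_bits: u64, union_offset_bits: u64) -> Option<u64> {
    member_offset_bits.checked_sub(union_offset_bits)
}
```

For the example in this thread, `split_variant_part(96, 0, 32)` gives a 12-byte struct whose union starts at bit 32 and is 8 bytes large, and the fields `a` and `b` (bit offsets 32 and 64 in the debug info) are rebased to 0 and 32.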

Contributor

Yes. This should work. Thanks!

Successfully merging this pull request may close these issues.

[BPF] Rust data-carrying enums cause invalid cast to DIDerivedType