Skip to content

Conversation

@royitaqi
Copy link
Contributor

@royitaqi royitaqi commented Nov 6, 2025

The root cause of #162954 is that the two typedef DIEs are merged into one. This is because their synthetic names are the same (see below). The (same) synthetic names are then used as the hash key of the type pool in the (output) artificial type unit (code), making the two input DIEs to both be paired with the same output DIE.

CompileUnit::assignTypeNamesRec(): current child = 0x00000044
SyntheticTypeNameBuilder::addDIETypeName():
    added parent name. SyntheticName =
    added type prefix. SyntheticName ={H}
    added type name. SyntheticName ={H}BarInt   <-- Same synthetic name
    assigned to type descriptor. TypeEntryPtr = 0x0x13300a018   <-- It's the same type entry object

CompileUnit::assignTypeNamesRec(): current child = 0x0000004c
SyntheticTypeNameBuilder::addDIETypeName():
    added parent name. SyntheticName =
    added type prefix. SyntheticName ={H}
    added type name. SyntheticName ={H}BarInt   <-- Same synthetic name
    assigned to type descriptor. TypeEntryPtr = 0x0x13300a018   <-- It's the same type entry object

A solution would be to differentiate the two input DIEs' synthetic names, so that they generate different output DIEs, so that they end up being two DIEs the linked DWARF.

This patch implements said solution by adding decl file and line into the synthetic names. This is done only for typedef DIEs to minimize the change. Note that, in order for the bug to be fixed, we depend on #166673 to be fixed, too, so that the decl lines will be different for these two typedefs (currently they are both 3, so the synthetic name is still the same even with decl file/line added). Note that the usage of addDieNameFromDeclFileAndDeclLine() is hopefully not evil - see line 339 use it as a last resort if no synthetic name is generated already.

A different implementation (implemented in #166767) can be to call addReferencedODRDies() (just 10 lines below my change) to add the referenced types into the synthetic name (hopefully recursively). The advantage of this solution is that 1) the resultant synthetic name is more "correct" (see caveat of the first approach below), and, 2) it doesn't depend on #166673 to be fixed.

The two implementations have different caveats. The first won't work if there are two typedefs of the same name on the same source line. Not typical, but happens. The second won't work if any of the referenced types resolve to the same name for some reason (similar to how the two typedefs resolved to the same name in this case).

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it makes sense to not merge the DIEs since they aren't the same. Also typedefs are the only construct that i can think of atm that would allow two distinct entities with the same name. I guess technically it's a redeclaration?

I dont think we should look at line numbers though. For one, a dwarf producer may omit them. And as you say, two typedefs may be declared on the same line

Could we include the canonical referred-to type in the synthetic name? I.e unwrap the typefed type?

@royitaqi
Copy link
Contributor Author

royitaqi commented Nov 6, 2025

Could we include the canonical referred-to type in the synthetic name? I.e unwrap the typefed type?

Thanks for the advice. IIUC, that's the "different implementation" mentioned in the description of this patch.

I have just implemented it in #166767. Would love your review there.

royitaqi added a commit that referenced this pull request Nov 7, 2025
…ing referred-to types into synthetic name (#166767)

**TL;DR:** See #166675 for description of the problem, the root cause,
and one solution. This patch is the "different implementation" descried
there.

This patch tries to fix the problem by recursively including the
referred-to types into the synthetic name. This way, the synthetic name
of the typedef DIE is canonicalized. See example debug prints below:

```
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x0000004c
    SyntheticName = {H}BarInt{F}Foo<int>:()                               <- Two different names
    Assigned to type descriptor. TypeEntryPtr = 0x0000004c0x0x150020a38   <- Hence different type entries
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x00000044
    SyntheticName = {H}BarInt{H}BarInt{F}Foo<int>:()                      <- Two different names
    Assigned to type descriptor. TypeEntryPtr = 0x000000440x0x150020a60   <- Hence different type entries
```

The advantages of this approach over
#166675 are:
1. The resulting synthetic name is more "correct" than using decl file
and line (which _can_ still collide).
1. This doesn't depend on
#166673 to be fixed.

A **hypothetical** caveat is that it would work if any of the referenced
types resolve to the same name for some reason (similar to how the two
typedefs resolved to the same name before this patch).


# Tests

```
cd ~/public_llvm/build
bin/llvm-lit -a \
    ../llvm-project/llvm/test/tools/dsymutil/typedefs-with-same-name.test \
    ../llvm-project/llvm/test/tools/dsymutil/X86/DWARFLinkerParallel/odr-fwd-declaration.test
```
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Nov 7, 2025
…E by including referred-to types into synthetic name (#166767)

**TL;DR:** See #166675 for description of the problem, the root cause,
and one solution. This patch is the "different implementation" descried
there.

This patch tries to fix the problem by recursively including the
referred-to types into the synthetic name. This way, the synthetic name
of the typedef DIE is canonicalized. See example debug prints below:

```
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x0000004c
    SyntheticName = {H}BarInt{F}Foo<int>:()                               <- Two different names
    Assigned to type descriptor. TypeEntryPtr = 0x0000004c0x0x150020a38   <- Hence different type entries
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x00000044
    SyntheticName = {H}BarInt{H}BarInt{F}Foo<int>:()                      <- Two different names
    Assigned to type descriptor. TypeEntryPtr = 0x000000440x0x150020a60   <- Hence different type entries
```

The advantages of this approach over
llvm/llvm-project#166675 are:
1. The resulting synthetic name is more "correct" than using decl file
and line (which _can_ still collide).
1. This doesn't depend on
llvm/llvm-project#166673 to be fixed.

A **hypothetical** caveat is that it would work if any of the referenced
types resolve to the same name for some reason (similar to how the two
typedefs resolved to the same name before this patch).

# Tests

```
cd ~/public_llvm/build
bin/llvm-lit -a \
    ../llvm-project/llvm/test/tools/dsymutil/typedefs-with-same-name.test \
    ../llvm-project/llvm/test/tools/dsymutil/X86/DWARFLinkerParallel/odr-fwd-declaration.test
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants