Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ThinLTO][DebugInfo] Emit full type definitions when importing anonymous types. #78461

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

rickyz
Copy link
Member

@rickyz rickyz commented Jan 17, 2024

This fixes some cases of missing debuginfo caused by an interaction between:

f0d6655,
which drops the identifier from a DICompositeType in the module containing its
vtable.

and

a61f5e3,
which causes ThinLTO to import composite types as declarations when they have
an identifier.

If a virtual class's DICompositeType has no identifier due to the first change,
and contains a nested anonymous type which does have an identifier, then the
second change can cause ThinLTO to output the classes's DICompositeType as a
type definition that links to a non-defining declaration for the nested type.
Since the nested anonyous type does not have a name, debuggers are unable to
find the definition for the declaration.

Repro case:

cat > a.h <<EOF
class A {
 public:
  A();
  virtual ~A();

 private:
  union {
    int val;
  };
};
EOF

cat > a.cc <<EOF
#include "a.h"

A::A() { asm(""); }

A::~A() {}
EOF

cat > main.cc <<EOF
#include "a.h"

int main(int argc, char **argv) {
  A a;
  return 0;
}
EOF

clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'

The gdb command outputs:

type = class A {
  private:
    union {
        <incomplete type>
    };
}

and dwarfdump -i a.out shows a DW_TAG_class_type for A with an incomplete union
type (note that there is also a duplicate entry with the full union type that
comes after).

< 1><0x0000001e>    DW_TAG_class_type
                      DW_AT_containing_type       <0x0000001e>
                      DW_AT_calling_convention    DW_CC_pass_by_reference
                      DW_AT_name                  (indexed string: 0x00000007)A
                      DW_AT_byte_size             0x00000010
                      DW_AT_decl_file             0x00000001 /path/to/./a.h
                      DW_AT_decl_line             0x00000001
...
< 2><0x0000002f>      DW_TAG_member
                        DW_AT_type                  <0x00000037>
                        DW_AT_decl_file             0x00000001 /path/to/./a.h
                        DW_AT_decl_line             0x00000007
                        DW_AT_data_member_location  8
< 2><0x00000037>      DW_TAG_union_type
                        DW_AT_export_symbols        yes(1)
                        DW_AT_calling_convention    DW_CC_pass_by_value
                        DW_AT_declaration           yes(1)

This change works around this by making ThinLTO always import full definitions
for anonymous types.

…ous types.

This fixes some cases of missing debuginfo caused by an interaction between:

llvm@f0d6655,
which drops the identifier from a DICompositeType in the module containing its
vtable.

and

llvm@a61f5e3,
which causes ThinLTO to import composite types as declarations when they have
an identifier.

If a virtual class's DICompositeType has no identifier due to the first change,
and contains a nested anonymous type which does have an identifier, then the
second change can cause ThinLTO to output the classes's DICompositeType as a
type definition that links to a non-defining declaration for the nested type.
Since the nested anonyous type does not have a name, debuggers are unable to
find the definition for the declaration.

Repro case:

cat > a.h <<EOF
class A {
 public:
  A();
  virtual ~A();

 private:
  union {
    int val;
  };
};
EOF

cat > a.cc <<EOF

A::A() { asm(""); }

A::~A() {}
EOF

cat > main.cc <<EOF

int main(int argc, char **argv) {
  A a;
  return 0;
}
EOF

clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'
The gdb command outputs:

type = class A {
  private:
    union {
        <incomplete type>
    };
}

and dwarfdump -i a.out shows a DW_TAG_class_type for A with an incomplete union
type (note that there is also a duplicate entry with the full union type that
comes after).

< 1><0x0000001e>    DW_TAG_class_type
                      DW_AT_containing_type       <0x0000001e>
                      DW_AT_calling_convention    DW_CC_pass_by_reference
                      DW_AT_name                  (indexed string: 0x00000007)A
                      DW_AT_byte_size             0x00000010
                      DW_AT_decl_file             0x00000001 /path/to/./a.h
                      DW_AT_decl_line             0x00000001
...
< 2><0x0000002f>      DW_TAG_member
                        DW_AT_type                  <0x00000037>
                        DW_AT_decl_file             0x00000001 /path/to/./a.h
                        DW_AT_decl_line             0x00000007
                        DW_AT_data_member_location  8
< 2><0x00000037>      DW_TAG_union_type
                        DW_AT_export_symbols        yes(1)
                        DW_AT_calling_convention    DW_CC_pass_by_value
                        DW_AT_declaration           yes(1)

This change works around this by making ThinLTO always import full definitions
for anonymous types.
@llvmbot llvmbot added the LTO Link time optimization (regular/full LTO or ThinLTO) label Jan 17, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2024

@llvm/pr-subscribers-lto

Author: Ricky Zhou (rickyz)

Changes

This fixes some cases of missing debuginfo caused by an interaction between:

f0d6655,
which drops the identifier from a DICompositeType in the module containing its
vtable.

and

a61f5e3,
which causes ThinLTO to import composite types as declarations when they have
an identifier.

If a virtual class's DICompositeType has no identifier due to the first change,
and contains a nested anonymous type which does have an identifier, then the
second change can cause ThinLTO to output the classes's DICompositeType as a
type definition that links to a non-defining declaration for the nested type.
Since the nested anonyous type does not have a name, debuggers are unable to
find the definition for the declaration.

Repro case:

cat > a.h <<EOF
class A {
public:
A();
virtual ~A();

private:
union {
int val;
};
};
EOF

cat > a.cc <<EOF

A::A() { asm(""); }

A::~A() {}
EOF

cat > main.cc <<EOF

int main(int argc, char **argv) {
A a;
return 0;
}
EOF

clang++ -O2 -g -flto=thin -mllvm -force-import-all main.cc a.cc
gdb ./a.out -batch -ex 'pt /rmt A'
The gdb command outputs:

type = class A {
private:
union {
<incomplete type>
};
}

and dwarfdump -i a.out shows a DW_TAG_class_type for A with an incomplete union
type (note that there is also a duplicate entry with the full union type that
comes after).

< 1><0x0000001e> DW_TAG_class_type
DW_AT_containing_type <0x0000001e>
DW_AT_calling_convention DW_CC_pass_by_reference
DW_AT_name (indexed string: 0x00000007)A
DW_AT_byte_size 0x00000010
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000001
...
< 2><0x0000002f> DW_TAG_member
DW_AT_type <0x00000037>
DW_AT_decl_file 0x00000001 /path/to/./a.h
DW_AT_decl_line 0x00000007
DW_AT_data_member_location 8
< 2><0x00000037> DW_TAG_union_type
DW_AT_export_symbols yes(1)
DW_AT_calling_convention DW_CC_pass_by_value
DW_AT_declaration yes(1)

This change works around this by making ThinLTO always import full definitions
for anonymous types.


Full diff: https://github.com/llvm/llvm-project/pull/78461.diff

2 Files Affected:

  • (modified) llvm/lib/Bitcode/Reader/MetadataLoader.cpp (+21-16)
  • (modified) llvm/test/ThinLTO/X86/debuginfo-compositetype-import.ll (+9-6)
diff --git a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
index 910e97489dbbe0e..770eb83af17f9b0 100644
--- a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
+++ b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
@@ -1615,27 +1615,32 @@ Error MetadataLoader::MetadataLoaderImpl::parseOneMetadata(
     Metadata *Annotations = nullptr;
     auto *Identifier = getMDString(Record[15]);
     // If this module is being parsed so that it can be ThinLTO imported
-    // into another module, composite types only need to be imported
-    // as type declarations (unless full type definitions requested).
-    // Create type declarations up front to save memory. Also, buildODRType
-    // handles the case where this is type ODRed with a definition needed
-    // by the importing module, in which case the existing definition is
-    // used.
-    if (IsImporting && !ImportFullTypeDefinitions && Identifier &&
+    // into another module, composite types only need to be imported as
+    // type declarations (unless full type definitions are requested).
+    // Create type declarations up front to save memory. This is only
+    // done for types which have an Identifier, and are therefore
+    // subject to the ODR.
+    //
+    // buildODRType handles the case where this is type ODRed with a
+    // definition needed by the importing module, in which case the
+    // existing definition is used.
+    //
+    // We always import full definitions for anonymous composite types,
+    // as without a name, debuggers cannot easily resolve a declaration
+    // to its definition.
+    if (IsImporting && !ImportFullTypeDefinitions && Identifier && Name &&
         (Tag == dwarf::DW_TAG_enumeration_type ||
          Tag == dwarf::DW_TAG_class_type ||
          Tag == dwarf::DW_TAG_structure_type ||
          Tag == dwarf::DW_TAG_union_type)) {
       Flags = Flags | DINode::FlagFwdDecl;
-      if (Name) {
-        // This is a hack around preserving template parameters for simplified
-        // template names - it should probably be replaced with a
-        // DICompositeType flag specifying whether template parameters are
-        // required on declarations of this type.
-        StringRef NameStr = Name->getString();
-        if (!NameStr.contains('<') || NameStr.starts_with("_STN|"))
-          TemplateParams = getMDOrNull(Record[14]);
-      }
+      // This is a hack around preserving template parameters for simplified
+      // template names - it should probably be replaced with a
+      // DICompositeType flag specifying whether template parameters are
+      // required on declarations of this type.
+      StringRef NameStr = Name->getString();
+      if (!NameStr.contains('<') || NameStr.starts_with("_STN|"))
+        TemplateParams = getMDOrNull(Record[14]);
     } else {
       BaseType = getDITypeRefOrNull(Record[6]);
       OffsetInBits = Record[9];
diff --git a/llvm/test/ThinLTO/X86/debuginfo-compositetype-import.ll b/llvm/test/ThinLTO/X86/debuginfo-compositetype-import.ll
index 22e8ec4418eca3d..133930ce6c0377f 100644
--- a/llvm/test/ThinLTO/X86/debuginfo-compositetype-import.ll
+++ b/llvm/test/ThinLTO/X86/debuginfo-compositetype-import.ll
@@ -16,7 +16,8 @@
 ; CHECK: distinct !DICompositeType(tag: DW_TAG_enumeration_type, name: "enum", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 50, size: 32, flags: DIFlagFwdDecl, identifier: "enum")
 ; CHECK: distinct !DICompositeType(tag: DW_TAG_class_type, name: "class<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 728, size: 448, flags: DIFlagFwdDecl, identifier: "class")
 ; CHECK: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, size: 128, flags: DIFlagFwdDecl, templateParams: !{{[0-9]+}}, identifier: "struct_templ_simplified")
-; CHECK: distinct !DICompositeType(tag: DW_TAG_union_type, file: !{{[0-9]+}}, line: 115, size: 384, flags: DIFlagFwdDecl, identifier: "union")
+; CHECK: distinct !DICompositeType(tag: DW_TAG_union_type, name: "union", file: !{{[0-9]+}}, line: 115, size: 384, flags: DIFlagFwdDecl, identifier: "union")
+; CHECK: distinct !DICompositeType(tag: DW_TAG_union_type, file: !{{[0-9]+}}, line: 1, elements: !{{[0-9]+}}, identifier: "anon_union")
 ; CHECK: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, size: 128, flags: DIFlagFwdDecl, identifier: "struct_templ")
 ; CHECK: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "_STN|struct|<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, size: 128, flags: DIFlagFwdDecl, templateParams: !{{[0-9]+}}, identifier: "struct_templ_simplified_mangled")
 
@@ -32,7 +33,8 @@
 ; FULL: distinct !DICompositeType(tag: DW_TAG_enumeration_type, name: "enum", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 50, size: 32, elements: !{{[0-9]+}}, identifier: "enum")
 ; FULL: distinct !DICompositeType(tag: DW_TAG_class_type, name: "class<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 728, size: 448, elements: !{{[0-9]+}}, identifier: "class")
 ; FULL: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, baseType: !{{[0-9]+}}, size: 128, offset: 64, elements: !{{[0-9]+}}, vtableHolder: !{{[0-9]+}}, templateParams: !{{[0-9]+}}, identifier: "struct_templ_simplified")
-; FULL: distinct !DICompositeType(tag: DW_TAG_union_type, file: !{{[0-9]+}}, line: 115, size: 384, elements: !{{[0-9]+}}, identifier: "union")
+; FULL: distinct !DICompositeType(tag: DW_TAG_union_type, name: "union", file: !{{[0-9]+}}, line: 115, size: 384, elements: !{{[0-9]+}}, identifier: "union")
+; FULL: distinct !DICompositeType(tag: DW_TAG_union_type, file: !{{[0-9]+}}, line: 1, elements: !{{[0-9]+}}, identifier: "anon_union")
 ; FULL: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, baseType: !{{[0-9]+}}, size: 128, offset: 64, elements: !{{[0-9]+}}, vtableHolder: !{{[0-9]+}}, templateParams: !{{[0-9]+}}, identifier: "struct_templ")
 ; FULL: distinct !DICompositeType(tag: DW_TAG_structure_type, name: "_STN|struct|<>", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: 309, baseType: !{{[0-9]+}}, size: 128, offset: 64, elements: !{{[0-9]+}}, vtableHolder: !{{[0-9]+}}, templateParams: !{{[0-9]+}}, identifier: "struct_templ_simplified_mangled")
 
@@ -59,10 +61,11 @@ entry:
 !5 = !{}
 !6 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !7, isLocal: false, isDefinition: true, scopeLine: 2, isOptimized: false, unit: !0, retainedNodes: !5)
 !7 = !DISubroutineType(types: !8)
-!8 = !{!9, !10, !11, !12, !13, !14}
+!8 = !{!9, !10, !11, !12, !13, !14, !15}
 !9 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "enum", scope: !1, file: !1, line: 50, size: 32, elements: !5, identifier: "enum")
 !10 = !DICompositeType(tag: DW_TAG_class_type, name: "class<>", scope: !1, file: !1, line: 728, size: 448, elements: !5, identifier: "class")
 !11 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct", scope: !1, file: !1, line: 309, baseType: !10, size: 128, offset: 64, elements: !5, vtableHolder: !10, templateParams: !5, identifier: "struct_templ_simplified")
-!12 = distinct !DICompositeType(tag: DW_TAG_union_type, file: !1, line: 115, size: 384, elements: !5, identifier: "union")
-!13 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct<>", scope: !1, file: !1, line: 309, baseType: !10, size: 128, offset: 64, elements: !5, vtableHolder: !10, templateParams: !5, identifier: "struct_templ")
-!14 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "_STN|struct|<>", scope: !1, file: !1, line: 309, baseType: !10, size: 128, offset: 64, elements: !5, vtableHolder: !10, templateParams: !5, identifier: "struct_templ_simplified_mangled")
+!12 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "union", file: !1, line: 115, size: 384, elements: !5, identifier: "union")
+!13 = distinct !DICompositeType(tag: DW_TAG_union_type, file: !1, line: 1, elements: !5, identifier: "anon_union")
+!14 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "struct<>", scope: !1, file: !1, line: 309, baseType: !10, size: 128, offset: 64, elements: !5, vtableHolder: !10, templateParams: !5, identifier: "struct_templ")
+!15 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "_STN|struct|<>", scope: !1, file: !1, line: 309, baseType: !10, size: 128, offset: 64, elements: !5, vtableHolder: !10, templateParams: !5, identifier: "struct_templ_simplified_mangled")

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks!

@rickyz
Copy link
Member Author

rickyz commented Jan 18, 2024

Thanks for the email discussion (and for leaving all those breadcrumbs that were useful for tracking this down)! I'm not a committer, so can you merge this on my behalf?

@dwblaikie dwblaikie merged commit 1522333 into llvm:main Jan 18, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants