Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LLVM][DWARF] Add support for .debug_names with split dwarf #73872

Merged
merged 1 commit into from
Dec 4, 2023

Conversation

ayermolo
Copy link
Contributor

Enables Type Units with DWARF5 accelerator tables for split dwarf. It is still
under discussion what is the best way to implement support for de-duplication in
DWP. This will be in follow up PR.

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 30, 2023

@llvm/pr-subscribers-debuginfo

Author: Alexander Yermolovich (ayermolo)

Changes

Enables Type Units with DWARF5 accelerator tables for split dwarf. It is still
under discussion what is the best way to implement support for de-duplication in
DWP. This will be in follow up PR.


Full diff: https://github.com/llvm/llvm-project/pull/73872.diff

4 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/AccelTable.h (+6-4)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp (+28-10)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+9-8)
  • (modified) llvm/test/DebugInfo/X86/debug-names-types.ll (+94-6)
diff --git a/llvm/include/llvm/CodeGen/AccelTable.h b/llvm/include/llvm/CodeGen/AccelTable.h
index 6a69a01a8c786c4..0f35fd3514fae74 100644
--- a/llvm/include/llvm/CodeGen/AccelTable.h
+++ b/llvm/include/llvm/CodeGen/AccelTable.h
@@ -297,15 +297,15 @@ class DWARF5AccelTableData : public AccelTableData {
 };
 
 struct TypeUnitMetaInfo {
-  // Symbol for start of the TU section.
-  MCSymbol *Label;
+  // Symbol for start of the TU section or signature if this is SplitDwarf.
+  std::variant<MCSymbol *, uint64_t> LabelOrSignature;
   // Unique ID of Type Unit.
   unsigned UniqueID;
 };
 using TUVectorTy = SmallVector<TypeUnitMetaInfo, 1>;
 class DWARF5AccelTable : public AccelTable<DWARF5AccelTableData> {
   // Symbols to start of all the TU sections that were generated.
-  TUVectorTy TUSymbols;
+  TUVectorTy TUSymbolsOrHashes;
 
 public:
   struct UnitIndexAndEncoding {
@@ -313,9 +313,11 @@ class DWARF5AccelTable : public AccelTable<DWARF5AccelTableData> {
     DWARF5AccelTableData::AttributeEncoding Endoding;
   };
   /// Returns type units that were constructed.
-  const TUVectorTy &getTypeUnitsSymbols() { return TUSymbols; }
+  const TUVectorTy &getTypeUnitsSymbols() { return TUSymbolsOrHashes; }
   /// Add a type unit start symbol.
   void addTypeUnitSymbol(DwarfTypeUnit &U);
+  /// Add a type unit Signature.
+  void addTypeUnitSignature(DwarfTypeUnit &U);
   /// Convert DIE entries to explicit offset.
   /// Needs to be called after DIE offsets are computed.
   void convertDieToOffset() {
diff --git a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
index 60c54687c96662d..d6f487c18b0301e 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
@@ -201,9 +201,11 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
     char AugmentationString[8] = {'L', 'L', 'V', 'M', '0', '7', '0', '0'};
 
     Header(uint32_t CompUnitCount, uint32_t LocalTypeUnitCount,
-           uint32_t BucketCount, uint32_t NameCount)
+           uint32_t ForeignTypeUnitCount, uint32_t BucketCount,
+           uint32_t NameCount)
         : CompUnitCount(CompUnitCount), LocalTypeUnitCount(LocalTypeUnitCount),
-          BucketCount(BucketCount), NameCount(NameCount) {}
+          ForeignTypeUnitCount(ForeignTypeUnitCount), BucketCount(BucketCount),
+          NameCount(NameCount) {}
 
     void emit(Dwarf5AccelTableWriter &Ctx);
   };
@@ -220,6 +222,8 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
   MCSymbol *AbbrevStart = Asm->createTempSymbol("names_abbrev_start");
   MCSymbol *AbbrevEnd = Asm->createTempSymbol("names_abbrev_end");
   MCSymbol *EntryPool = Asm->createTempSymbol("names_entries");
+  // Indicates if this module is built with Split Dwarf enabled.
+  bool IsSplitDwarf = false;
 
   void populateAbbrevsMap();
 
@@ -238,7 +242,8 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
       ArrayRef<std::variant<MCSymbol *, uint64_t>> TypeUnits,
       llvm::function_ref<
           std::optional<DWARF5AccelTable::UnitIndexAndEncoding>(const DataT &)>
-          getIndexForEntry);
+          getIndexForEntry,
+      bool IsSplitDwarf);
 
   void emit();
 };
@@ -450,6 +455,8 @@ void Dwarf5AccelTableWriter<DataT>::emitTUList() const {
     Asm->OutStreamer->AddComment("Type unit " + Twine(TU.index()));
     if (std::holds_alternative<MCSymbol *>(TU.value()))
       Asm->emitDwarfSymbolReference(std::get<MCSymbol *>(TU.value()));
+    else if (IsSplitDwarf)
+      Asm->emitInt64(std::get<uint64_t>(TU.value()));
     else
       Asm->emitDwarfLengthOrOffset(std::get<uint64_t>(TU.value()));
   }
@@ -551,12 +558,15 @@ Dwarf5AccelTableWriter<DataT>::Dwarf5AccelTableWriter(
     ArrayRef<std::variant<MCSymbol *, uint64_t>> TypeUnits,
     llvm::function_ref<
         std::optional<DWARF5AccelTable::UnitIndexAndEncoding>(const DataT &)>
-        getIndexForEntry)
+        getIndexForEntry,
+    bool IsSplitDwarf)
     : AccelTableWriter(Asm, Contents, false),
-      Header(CompUnits.size(), TypeUnits.size(), Contents.getBucketCount(),
+      Header(CompUnits.size(), IsSplitDwarf ? 0 : TypeUnits.size(),
+             IsSplitDwarf ? TypeUnits.size() : 0, Contents.getBucketCount(),
              Contents.getUniqueNameCount()),
       CompUnits(CompUnits), TypeUnits(TypeUnits),
-      getIndexForEntry(std::move(getIndexForEntry)) {
+      getIndexForEntry(std::move(getIndexForEntry)),
+      IsSplitDwarf(IsSplitDwarf) {
   populateAbbrevsMap();
 }
 
@@ -608,7 +618,10 @@ void llvm::emitDWARF5AccelTable(
 
   for (const auto &TU : TUSymbols) {
     TUIndex[TU.UniqueID] = TUCount++;
-    TypeUnits.push_back(TU.Label);
+    if (DD.useSplitDwarf())
+      TypeUnits.push_back(std::get<uint64_t>(TU.LabelOrSignature));
+    else
+      TypeUnits.push_back(std::get<MCSymbol *>(TU.LabelOrSignature));
   }
 
   if (CompUnits.empty())
@@ -633,12 +646,17 @@ void llvm::emitDWARF5AccelTable(
           return {{CUIndex[Entry.getUnitID()],
                    {dwarf::DW_IDX_compile_unit, CUIndexForm}}};
         return std::nullopt;
-      })
+      },
+      DD.useSplitDwarf())
       .emit();
 }
 
 void DWARF5AccelTable::addTypeUnitSymbol(DwarfTypeUnit &U) {
-  TUSymbols.push_back({U.getLabelBegin(), U.getUniqueID()});
+  TUSymbolsOrHashes.push_back({U.getLabelBegin(), U.getUniqueID()});
+}
+
+void DWARF5AccelTable::addTypeUnitSignature(DwarfTypeUnit &U) {
+  TUSymbolsOrHashes.push_back({U.getTypeSignature(), U.getUniqueID()});
 }
 
 void llvm::emitDWARF5AccelTable(
@@ -650,7 +668,7 @@ void llvm::emitDWARF5AccelTable(
   std::vector<std::variant<MCSymbol *, uint64_t>> TypeUnits;
   Contents.finalize(Asm, "names");
   Dwarf5AccelTableWriter<DWARF5AccelTableData>(Asm, Contents, CUs, TypeUnits,
-                                               getIndexForEntry)
+                                               getIndexForEntry, false)
       .emit();
 }
 
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index f311931a41aa5df..ab29020bf1d7bb0 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -305,7 +305,6 @@ void Loc::MMI::addFrameIndexExpr(const DIExpression *Expr, int FI) {
 
 static AccelTableKind computeAccelTableKind(unsigned DwarfVersion,
                                             bool GenerateTypeUnits,
-                                            bool HasSplitDwarf,
                                             DebuggerKind Tuning,
                                             const Triple &TT) {
   // Honor an explicit request.
@@ -314,8 +313,7 @@ static AccelTableKind computeAccelTableKind(unsigned DwarfVersion,
 
   // Generating DWARF5 acceleration table.
   // Currently Split dwarf and non ELF format is not supported.
-  if (GenerateTypeUnits &&
-      (DwarfVersion < 5 || HasSplitDwarf || !TT.isOSBinFormatELF()))
+  if (GenerateTypeUnits && (DwarfVersion < 5 || !TT.isOSBinFormatELF()))
     return AccelTableKind::None;
 
   // Accelerator tables get emitted if targetting DWARF v5 or LLDB.  DWARF v5
@@ -403,9 +401,8 @@ DwarfDebug::DwarfDebug(AsmPrinter *A)
                        A->TM.getTargetTriple().isOSBinFormatWasm()) &&
                       GenerateDwarfTypeUnits;
 
-  TheAccelTableKind =
-      computeAccelTableKind(DwarfVersion, GenerateTypeUnits, HasSplitDwarf,
-                            DebuggerTuning, A->TM.getTargetTriple());
+  TheAccelTableKind = computeAccelTableKind(
+      DwarfVersion, GenerateTypeUnits, DebuggerTuning, A->TM.getTargetTriple());
 
   // Work around a GDB bug. GDB doesn't support the standard opcode;
   // SCE doesn't support GNU's; LLDB prefers the standard opcode, which
@@ -3532,8 +3529,12 @@ void DwarfDebug::addDwarfTypeUnitType(DwarfCompileUnit &CU,
       InfoHolder.computeSizeAndOffsetsForUnit(TU.first.get());
       InfoHolder.emitUnit(TU.first.get(), useSplitDwarf());
       if (getDwarfVersion() >= 5 &&
-          getAccelTableKind() == AccelTableKind::Dwarf)
-        AccelDebugNames.addTypeUnitSymbol(*TU.first);
+          getAccelTableKind() == AccelTableKind::Dwarf) {
+        if (useSplitDwarf())
+          AccelDebugNames.addTypeUnitSignature(*TU.first);
+        else
+          AccelDebugNames.addTypeUnitSymbol(*TU.first);
+      }
     }
     AccelTypeUnitsDebugNames.convertDieToOffset();
     AccelDebugNames.addTypeEntries(AccelTypeUnitsDebugNames);
diff --git a/llvm/test/DebugInfo/X86/debug-names-types.ll b/llvm/test/DebugInfo/X86/debug-names-types.ll
index fc23e604b386365..d32691305d1c101 100644
--- a/llvm/test/DebugInfo/X86/debug-names-types.ll
+++ b/llvm/test/DebugInfo/X86/debug-names-types.ll
@@ -1,15 +1,10 @@
 ; UNSUPPORTED: system-windows
-; This checks that .debug_names can be generated with monolithic -fdebug-type-sections, and does not generate when split-dwarf is enabled.
+; This checks that .debug_names can be generated with monolithic, and split-dwarf, when -fdebug-type-sections is enabled.
 ; Generated with: clang++ main.cpp   -g2 -gdwarf-5 -gpubnames -fdebug-types-section
 
 ; RUN: llc -mtriple=x86_64 -generate-type-units -dwarf-version=5 -filetype=obj %s -o %t
 ; RUN: llvm-dwarfdump -debug-info -debug-names %t | FileCheck %s
 
-; RUN: llc -mtriple=x86_64 -generate-type-units -dwarf-version=5 -filetype=obj -split-dwarf-file=%t.mainTypes.dwo --split-dwarf-output=%t.mainTypes.dwo %s -o %t
-; RUN: llvm-readelf --sections %t | FileCheck %s --check-prefixes=CHECK-SPLIT
-
-; CHECK-SPLIT-NOT: .debug_names
-
 ; CHECK:     .debug_info contents:
 ; CHECK:      DW_TAG_type_unit
 ; CHECK-NEXT:   DW_AT_language  (DW_LANG_C_plus_plus_14)
@@ -117,6 +112,99 @@
 ; CHECK-NEXT:   ]
 ; CHECK-NEXT: }
 
+; RUN: llc -mtriple=x86_64 -generate-type-units -dwarf-version=5 -filetype=obj -split-dwarf-file=%t.mainTypes.dwo --split-dwarf-output=%t.mainTypes.dwo %s -o %t
+; RUN: llvm-dwarfdump -debug-names %t | FileCheck %s --check-prefixes=CHECK-SPLIT
+
+; CHECK-SPLIT:          .debug_names contents
+; CHECK-SPLIT:          Foreign TU count: 1
+; CHECK-SPLIT-NEXT:     Bucket count: 4
+; CHECK-SPLIT-NEXT:     Name count: 4
+; CHECK-SPLIT-NEXT:     Abbreviations table size: 0x28
+; CHECK-SPLIT-NEXT:     Augmentation: 'LLVM0700'
+; CHECK-SPLIT-NEXT:   }
+; CHECK-SPLIT-NEXT:   Compilation Unit offsets [
+; CHECK-SPLIT-NEXT:     CU[0]: 0x00000000
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Foreign Type Unit signatures [
+; CHECK-SPLIT-NEXT:     ForeignTU[0]: 0x675d23e4f33235f2
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Abbreviations [
+; CHECK-SPLIT-NEXT:     Abbreviation [[ABBREV:0x[0-9a-f]*]] {
+; CHECK-SPLIT-NEXT:       Tag: DW_TAG_structure_type
+; CHECK-SPLIT-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:     Abbreviation [[ABBREV1:0x[0-9a-f]*]] {
+; CHECK-SPLIT-NEXT:       Tag: DW_TAG_structure_type
+; CHECK-SPLIT-NEXT:       DW_IDX_type_unit: DW_FORM_data1
+; CHECK-SPLIT-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:     Abbreviation [[ABBREV2:0x[0-9a-f]*]] {
+; CHECK-SPLIT-NEXT:       Tag: DW_TAG_base_type
+; CHECK-SPLIT-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:     Abbreviation [[ABBREV3:0x[0-9a-f]*]] {
+; CHECK-SPLIT-NEXT:       Tag: DW_TAG_subprogram
+; CHECK-SPLIT-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:     Abbreviation [[ABBREV4:0x[0-9a-f]*]] {
+; CHECK-SPLIT-NEXT:       Tag: DW_TAG_base_type
+; CHECK-SPLIT-NEXT:       DW_IDX_type_unit: DW_FORM_data1
+; CHECK-SPLIT-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Bucket 0 [
+; CHECK-SPLIT-NEXT:     Name 1 {
+; CHECK-SPLIT-NEXT:       Hash: 0xB888030
+; CHECK-SPLIT-NEXT:       String: {{.+}} "int"
+; CHECK-SPLIT-NEXT:       Entry @ {{.+}} {
+; CHECK-SPLIT-NEXT:         Abbrev: [[ABBREV2]]
+; CHECK-SPLIT-NEXT:         Tag: DW_TAG_base_type
+; CHECK-SPLIT-NEXT:         DW_IDX_die_offset: 0x00000035
+; CHECK-SPLIT-NEXT:       }
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Bucket 1 [
+; CHECK-SPLIT-NEXT:     Name 2 {
+; CHECK-SPLIT-NEXT:       Hash: 0xB887389
+; CHECK-SPLIT-NEXT:       String: {{.+}} "Foo"
+; CHECK-SPLIT-NEXT:       Entry @ {{.+}} {
+; CHECK-SPLIT-NEXT:         Abbrev: [[ABBREV1]]
+; CHECK-SPLIT-NEXT:         Tag: DW_TAG_structure_type
+; CHECK-SPLIT-NEXT:         DW_IDX_type_unit: 0x00
+; CHECK-SPLIT-NEXT:         DW_IDX_die_offset: 0x0000001f
+; CHECK-SPLIT-NEXT:       }
+; CHECK-SPLIT-NEXT:       Entry @ 0xae {
+; CHECK-SPLIT-NEXT:         Abbrev: [[ABBREV]]
+; CHECK-SPLIT-NEXT:         Tag: DW_TAG_structure_type
+; CHECK-SPLIT-NEXT:         DW_IDX_die_offset: 0x00000039
+; CHECK-SPLIT-NEXT:       }
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Bucket 2 [
+; CHECK-SPLIT-NEXT:     Name 3 {
+; CHECK-SPLIT-NEXT:       Hash: 0x7C9A7F6A
+; CHECK-SPLIT-NEXT:       String: {{.+}} "main"
+; CHECK-SPLIT-NEXT:       Entry @ {{.+}} {
+; CHECK-SPLIT-NEXT:         Abbrev: [[ABBREV3]]
+; CHECK-SPLIT-NEXT:         Tag: DW_TAG_subprogram
+; CHECK-SPLIT-NEXT:         DW_IDX_die_offset: 0x0000001a
+; CHECK-SPLIT-NEXT:       }
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT:   Bucket 3 [
+; CHECK-SPLIT-NEXT:     Name 4 {
+; CHECK-SPLIT-NEXT:       Hash: 0x7C952063
+; CHECK-SPLIT-NEXT:       String: {{.+}} "char"
+; CHECK-SPLIT-NEXT:       Entry @ {{.+}} {
+; CHECK-SPLIT-NEXT:         Abbrev: [[ABBREV4]]
+; CHECK-SPLIT-NEXT:         Tag: DW_TAG_base_type
+; CHECK-SPLIT-NEXT:         DW_IDX_type_unit: 0x00
+; CHECK-SPLIT-NEXT:         DW_IDX_die_offset: 0x00000034
+; CHECK-SPLIT-NEXT:       }
+; CHECK-SPLIT-NEXT:     }
+; CHECK-SPLIT-NEXT:   ]
+; CHECK-SPLIT-NEXT: }
+
 
 ; ModuleID = 'main.cpp'
 source_filename = "main.cpp"

@ayermolo ayermolo force-pushed the debugNamesSplitDwarf branch 2 times, most recently from 366f614 to d26811e Compare December 1, 2023 18:43
Enables Type Units with DWARF5 accelerator tables for split dwarf. It is still
under discussion what is the best way to implement support for de-duplication in
DWP. This will be in follow up PR.
@ayermolo
Copy link
Contributor Author

ayermolo commented Dec 4, 2023

@dwblaikie ping :)

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

The DWARF spec I think is a smidge vague on how the IDX_type_unit works with the local and foreign type units. (like if you had 5 local and 5 foreign type units (won't happen for us with object files, but would happen with the linked index) - presumably an IDX_type_unit value of 5 means the first foreign type unit? and 4 means the last local one?))

(also, total aside: be interesting to measure the size of .debug_names relative to .debug_gnu_pubnames and consider including a flag for producing .debug_names without the hash table in object files (if it's going to be merged into a unified index by the linker anyway, we don't need the hash table in the object file, probably))

@ayermolo
Copy link
Contributor Author

ayermolo commented Dec 4, 2023

Looks good to me.

The DWARF spec I think is a smidge vague on how the IDX_type_unit works with the local and foreign type units. (like if you had 5 local and 5 foreign type units (won't happen for us with object files, but would happen with the linked index) - presumably an IDX_type_unit value of 5 means the first foreign type unit? and 4 means the last local one?))

(also, total aside: be interesting to measure the size of .debug_names relative to .debug_gnu_pubnames and consider including a flag for producing .debug_names without the hash table in object files (if it's going to be merged into a unified index by the linker anyway, we don't need the hash table in the object file, probably))

IDX_type_unit

Yes.
On page 138, section 6.1.12
"The foreign TU list immediately follows the local TU list and they both use the same index, so that if there are N local TU entries, the index for the first foreign TU is N."

I am currently working on support for it in BOLT which have similar output as with "smart linking" since at that point binary is fully linked.
Sorry it's not clear to me why would hash table can be skpped in that case?

@ayermolo ayermolo merged commit e8f3ccd into llvm:main Dec 4, 2023
3 checks passed
@ayermolo ayermolo deleted the debugNamesSplitDwarf branch December 4, 2023 21:56
@dwblaikie
Copy link
Collaborator

Yes. On page 138, section 6.1.12 "The foreign TU list immediately follows the local TU list and they both use the same index, so that if there are N local TU entries, the index for the first foreign TU is N."

Ah, great! Thanks for the chapter-and-verse!

I am currently working on support for it in BOLT which have similar output as with "smart linking" since at that point binary is fully linked. Sorry it's not clear to me why would hash table can be skpped in that case?

Because the hash table is there for fast lookup - but if you're merging tables, you don't need fast lookup/you can walk the entries linearly when building a merged table. So far as I understand/can think of.

@cmtice - looking at .debug_names merging - is there some benefit to having the hash table on input? Or is that part of the .debug_names going to be ignored when merging? (I'd expect the latter, but perhaps I've missed something)

felipepiovezan pushed a commit to felipepiovezan/llvm-project that referenced this pull request Feb 2, 2024
Enables Type Units with DWARF5 accelerator tables for split dwarf. It is
still
under discussion what is the best way to implement support for
de-duplication in
DWP. This will be in follow up PR.

(cherry picked from commit e8f3ccd)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants