Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LLVM][DebugInfo] Refactor some code for easier sharing. #82153

Merged
merged 4 commits into from
Feb 22, 2024

Conversation

cmtice
Copy link
Contributor

@cmtice cmtice commented Feb 18, 2024

Refactor the code that calculates the offsets for the various pieces of the DWARF .debug_names index section, to make it easier to share the code with other tools, such as LLD.

Refactor the code that calculates the offsets for the various
pieces of the DWARF .debug_names index section, to make it easier to
share the code with other tools, such as LLD.
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 18, 2024

@llvm/pr-subscribers-debuginfo

Author: None (cmtice)

Changes

Refactor the code that calculates the offsets for the various pieces of the DWARF .debug_names index section, to make it easier to share the code with other tools, such as LLD.


Full diff: https://github.com/llvm/llvm-project/pull/82153.diff

2 Files Affected:

  • (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h (+21-7)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp (+46-25)
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
index a26c44bf7e9c29..0dd5ed55efe702 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
@@ -562,6 +562,17 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
     uint64_t getEntryOffset() const { return EntryOffset; }
   };
 
+  // Offsets for the start of various important tables from the start of the
+  // section.
+  struct DWARFDebugNamesOffsets {
+    uint64_t CUsBase;
+    uint64_t BucketsBase;
+    uint64_t HashesBase;
+    uint64_t StringOffsetsBase;
+    uint64_t EntryOffsetsBase;
+    uint64_t EntriesBase;
+  };
+
   /// Represents a single accelerator table within the DWARF v5 .debug_names
   /// section.
   class NameIndex {
@@ -572,12 +583,7 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
     // Base of the whole unit and of various important tables, as offsets from
     // the start of the section.
     uint64_t Base;
-    uint64_t CUsBase;
-    uint64_t BucketsBase;
-    uint64_t HashesBase;
-    uint64_t StringOffsetsBase;
-    uint64_t EntryOffsetsBase;
-    uint64_t EntriesBase;
+    DWARFDebugNamesOffsets Offsets;
 
     void dumpCUs(ScopedPrinter &W) const;
     void dumpLocalTUs(ScopedPrinter &W) const;
@@ -638,7 +644,7 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
     /// Returns the Entry at the relative `Offset` from the start of the Entry
     /// pool.
     Expected<Entry> getEntryAtRelativeOffset(uint64_t Offset) const {
-      auto OffsetFromSection = Offset + this->EntriesBase;
+      auto OffsetFromSection = Offset + this->Offsets.EntriesBase;
       return getEntry(&OffsetFromSection);
     }
 
@@ -793,6 +799,14 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
   const NameIndex *getCUNameIndex(uint64_t CUOffset);
 };
 
+// Calculates the starting offsets for various sections within the
+// debug_names section. This has been pulled out of DWARFDebugNames, to
+// allow easier sharing of the code with other tools, such as LLD.
+uint64_t FindDebugNamesOffsets(DWARFDebugNames::DWARFDebugNamesOffsets &Offsets,
+                               uint64_t HdrSize,
+                               const dwarf::DwarfFormat Format,
+                               const DWARFDebugNames::Header &Hdr);
+
 /// If `Name` is the name of a templated function that includes template
 /// parameters, returns a substring of `Name` containing no template
 /// parameters.
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index 78f819dd052aac..68adad517f9364 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -510,7 +510,7 @@ DWARFDebugNames::Abbrev DWARFDebugNames::AbbrevMapInfo::getTombstoneKey() {
 
 Expected<DWARFDebugNames::AttributeEncoding>
 DWARFDebugNames::NameIndex::extractAttributeEncoding(uint64_t *Offset) {
-  if (*Offset >= EntriesBase) {
+  if (*Offset >= Offsets.EntriesBase) {
     return createStringError(errc::illegal_byte_sequence,
                              "Incorrectly terminated abbreviation table.");
   }
@@ -536,7 +536,7 @@ DWARFDebugNames::NameIndex::extractAttributeEncodings(uint64_t *Offset) {
 
 Expected<DWARFDebugNames::Abbrev>
 DWARFDebugNames::NameIndex::extractAbbrev(uint64_t *Offset) {
-  if (*Offset >= EntriesBase) {
+  if (*Offset >= Offsets.EntriesBase) {
     return createStringError(errc::illegal_byte_sequence,
                              "Incorrectly terminated abbreviation table.");
   }
@@ -552,6 +552,37 @@ DWARFDebugNames::NameIndex::extractAbbrev(uint64_t *Offset) {
   return Abbrev(Code, dwarf::Tag(Tag), AbbrevOffset, std::move(*AttrEncOr));
 }
 
+uint64_t llvm::FindDebugNamesOffsets(
+    DWARFDebugNames::DWARFDebugNamesOffsets &Offsets,
+    uint64_t HdrSize,
+    dwarf::DwarfFormat Format,
+    const DWARFDebugNames::Header &Hdr) {
+  uint32_t DwarfSize = (Format == llvm::dwarf::DwarfFormat::DWARF32) ? 4 : 8;
+  uint64_t Offset = HdrSize;
+  Offsets.CUsBase = Offset;
+  Offset += Hdr.CompUnitCount * DwarfSize;
+  Offset += Hdr.LocalTypeUnitCount * DwarfSize;
+  Offset += Hdr.ForeignTypeUnitCount * 8;
+
+  Offsets.BucketsBase = Offset;
+  Offset += Hdr.BucketCount * 4;
+
+  Offsets.HashesBase = Offset;
+  if (Hdr.BucketCount > 0)
+    Offset += Hdr.NameCount * 4;
+
+  Offsets.StringOffsetsBase = Offset;
+  Offset += Hdr.NameCount * DwarfSize;
+
+  Offsets.EntryOffsetsBase = Offset;
+  Offset += Hdr.NameCount * DwarfSize;
+
+  Offset += Hdr.AbbrevTableSize;
+  Offsets.EntriesBase = Offset;
+
+  return Offset;
+}
+
 Error DWARFDebugNames::NameIndex::extract() {
   const DWARFDataExtractor &AS = Section.AccelSection;
   uint64_t Offset = Base;
@@ -559,25 +590,15 @@ Error DWARFDebugNames::NameIndex::extract() {
     return E;
 
   const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
-  CUsBase = Offset;
-  Offset += Hdr.CompUnitCount * SectionOffsetSize;
-  Offset += Hdr.LocalTypeUnitCount * SectionOffsetSize;
-  Offset += Hdr.ForeignTypeUnitCount * 8;
-  BucketsBase = Offset;
-  Offset += Hdr.BucketCount * 4;
-  HashesBase = Offset;
-  if (Hdr.BucketCount > 0)
-    Offset += Hdr.NameCount * 4;
-  StringOffsetsBase = Offset;
-  Offset += Hdr.NameCount * SectionOffsetSize;
-  EntryOffsetsBase = Offset;
-  Offset += Hdr.NameCount * SectionOffsetSize;
+  Offset = FindDebugNamesOffsets(Offsets, Offset, Hdr.Format, Hdr);
+
+  Offset = Offsets.EntryOffsetsBase + (Hdr.NameCount * SectionOffsetSize);
 
   if (!AS.isValidOffsetForDataOfSize(Offset, Hdr.AbbrevTableSize))
     return createStringError(errc::illegal_byte_sequence,
                              "Section too small: cannot read abbreviations.");
 
-  EntriesBase = Offset + Hdr.AbbrevTableSize;
+  Offsets.EntriesBase = Offset + Hdr.AbbrevTableSize;
 
   for (;;) {
     auto AbbrevOr = extractAbbrev(&Offset);
@@ -679,7 +700,7 @@ void DWARFDebugNames::Entry::dumpParentIdx(
     return;
   }
 
-  auto AbsoluteOffset = NameIdx->EntriesBase + FormValue.getRawUValue();
+  auto AbsoluteOffset = NameIdx->Offsets.EntriesBase + FormValue.getRawUValue();
   W.getOStream() << "Entry @ 0x" + Twine::utohexstr(AbsoluteOffset);
 }
 
@@ -708,14 +729,14 @@ std::error_code DWARFDebugNames::SentinelError::convertToErrorCode() const {
 uint64_t DWARFDebugNames::NameIndex::getCUOffset(uint32_t CU) const {
   assert(CU < Hdr.CompUnitCount);
   const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
-  uint64_t Offset = CUsBase + SectionOffsetSize * CU;
+  uint64_t Offset = Offsets.CUsBase + SectionOffsetSize * CU;
   return Section.AccelSection.getRelocatedValue(SectionOffsetSize, &Offset);
 }
 
 uint64_t DWARFDebugNames::NameIndex::getLocalTUOffset(uint32_t TU) const {
   assert(TU < Hdr.LocalTypeUnitCount);
   const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
-  uint64_t Offset = CUsBase + SectionOffsetSize * (Hdr.CompUnitCount + TU);
+  uint64_t Offset = Offsets.CUsBase + SectionOffsetSize * (Hdr.CompUnitCount + TU);
   return Section.AccelSection.getRelocatedValue(SectionOffsetSize, &Offset);
 }
 
@@ -723,7 +744,7 @@ uint64_t DWARFDebugNames::NameIndex::getForeignTUSignature(uint32_t TU) const {
   assert(TU < Hdr.ForeignTypeUnitCount);
   const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
   uint64_t Offset =
-      CUsBase +
+      Offsets.CUsBase +
       SectionOffsetSize * (Hdr.CompUnitCount + Hdr.LocalTypeUnitCount) + 8 * TU;
   return Section.AccelSection.getU64(&Offset);
 }
@@ -759,28 +780,28 @@ DWARFDebugNames::NameIndex::getNameTableEntry(uint32_t Index) const {
   assert(0 < Index && Index <= Hdr.NameCount);
   const unsigned SectionOffsetSize = dwarf::getDwarfOffsetByteSize(Hdr.Format);
   uint64_t StringOffsetOffset =
-      StringOffsetsBase + SectionOffsetSize * (Index - 1);
+      Offsets.StringOffsetsBase + SectionOffsetSize * (Index - 1);
   uint64_t EntryOffsetOffset =
-      EntryOffsetsBase + SectionOffsetSize * (Index - 1);
+      Offsets.EntryOffsetsBase + SectionOffsetSize * (Index - 1);
   const DWARFDataExtractor &AS = Section.AccelSection;
 
   uint64_t StringOffset =
       AS.getRelocatedValue(SectionOffsetSize, &StringOffsetOffset);
   uint64_t EntryOffset = AS.getUnsigned(&EntryOffsetOffset, SectionOffsetSize);
-  EntryOffset += EntriesBase;
+  EntryOffset += Offsets.EntriesBase;
   return {Section.StringSection, Index, StringOffset, EntryOffset};
 }
 
 uint32_t
 DWARFDebugNames::NameIndex::getBucketArrayEntry(uint32_t Bucket) const {
   assert(Bucket < Hdr.BucketCount);
-  uint64_t BucketOffset = BucketsBase + 4 * Bucket;
+  uint64_t BucketOffset = Offsets.BucketsBase + 4 * Bucket;
   return Section.AccelSection.getU32(&BucketOffset);
 }
 
 uint32_t DWARFDebugNames::NameIndex::getHashArrayEntry(uint32_t Index) const {
   assert(0 < Index && Index <= Hdr.NameCount);
-  uint64_t HashOffset = HashesBase + 4 * (Index - 1);
+  uint64_t HashOffset = Offsets.HashesBase + 4 * (Index - 1);
   return Section.AccelSection.getU32(&HashOffset);
 }
 

Copy link

github-actions bot commented Feb 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -562,6 +562,17 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
uint64_t getEntryOffset() const { return EntryOffset; }
};

// Offsets for the start of various important tables from the start of the
// section.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use Doxygen cmments ///?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -793,6 +799,14 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
const NameIndex *getCUNameIndex(uint64_t CUOffset);
};

// Calculates the starting offsets for various sections within the
// debug_names section. This has been pulled out of DWARFDebugNames, to
// allow easier sharing of the code with other tools, such as LLD.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been pulled out of DWARFDebugNames, to
// allow easier sharing of the code with other tools, such as LLD.

This is more of a review comment than documentation, I suggest removing this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Offset += Hdr.NameCount * SectionOffsetSize;
EntryOffsetsBase = Offset;
Offset += Hdr.NameCount * SectionOffsetSize;
Offset = FindDebugNamesOffsets(Offsets, Offset, Hdr.Format, Hdr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange that we assign to Offset, but then overwrite it on the next line?

So what's the return value for? Perhaps if it's needed by the future callers but not right now - maybe it should be in the Offsets structure, and that Offsets structure should be returned by this function, rather than as an input parameter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something feels off here. Suggestion:

  1. rename Offset to EndOfHeaderOffset
  2. remove the return value from FindDebugNamesOffset
  3. declare Offset in the line below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did something very similar to what felipe recommended.

Change 'FindDebugNamesOffsets' function: make the return type 'void' (return
nothing); update the name to start with lowercase letter.  Break 'offset'
variable in NameIndex::extract into two parts: hdrSize (to be passed to
Header::extract and findDebugNamesOffsets) and 'Offset', to be used for
offsets after call to findDebugNamesOffsets.

Change comments in header file to doxygen comments.
@cmtice
Copy link
Contributor Author

cmtice commented Feb 20, 2024

I think I have addressed all the review comments; if there's something else I need to do please let me know.

@cmtice
Copy link
Contributor Author

cmtice commented Feb 22, 2024

Ping? Is this OK to commit now?

Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@cmtice cmtice merged commit 43f1fa9 into llvm:main Feb 22, 2024
4 checks passed
Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor things to address here in follow-up commits, please. Excuse the delay.

void llvm::findDebugNamesOffsets(
DWARFDebugNames::DWARFDebugNamesOffsets &Offsets, uint64_t HdrSize,
dwarf::DwarfFormat Format, const DWARFDebugNames::Header &Hdr) {
uint32_t DwarfSize = (Format == llvm::dwarf::DwarfFormat::DWARF64) ? 8 : 4;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use dwarf::getDwarfOffsetByteSize

@@ -552,32 +552,50 @@ DWARFDebugNames::NameIndex::extractAbbrev(uint64_t *Offset) {
return Abbrev(Code, dwarf::Tag(Tag), AbbrevOffset, std::move(*AttrEncOr));
}

void llvm::findDebugNamesOffsets(
DWARFDebugNames::DWARFDebugNamesOffsets &Offsets, uint64_t HdrSize,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HdrSize isn't an accurate name for this parameter, I think @felipepiovezan mentioned EndOfHeaderOffset as a name in #82153 (comment) which seems accurate.

(the issue here is that while HdrSize and EndOfHeaderOffset would be identical for the first debug names table in a .debug_names section - they aren't identical for the second/other subsequent contributions - EndOfHeaderOffset is the offset in the .debug_names section after the header we care about here, which might be the Nth header, not necessarily the first)

@@ -552,32 +552,50 @@ DWARFDebugNames::NameIndex::extractAbbrev(uint64_t *Offset) {
return Abbrev(Code, dwarf::Tag(Tag), AbbrevOffset, std::move(*AttrEncOr));
}

void llvm::findDebugNamesOffsets(
DWARFDebugNames::DWARFDebugNamesOffsets &Offsets, uint64_t HdrSize,
dwarf::DwarfFormat Format, const DWARFDebugNames::Header &Hdr) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably remove the Format parameter, since it's accessible from the header?

@@ -552,32 +552,50 @@ DWARFDebugNames::NameIndex::extractAbbrev(uint64_t *Offset) {
return Abbrev(Code, dwarf::Tag(Tag), AbbrevOffset, std::move(*AttrEncOr));
}

void llvm::findDebugNamesOffsets(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably return the offsets from this function, rather than taking them as an out parameter? It's then clearer that they aren't in/out parameters, or something else more interesting.

MaskRay added a commit that referenced this pull request Apr 9, 2024
Address some post-review comments in #82153 and move the function inside
llvm::dwarf, used by certain free functions.

Pull Request: #88064
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants