@clayborg
Collaborator

This patch updates the reading capabilities of the LLVM DWARF parser in preparation for llvm-dwp patch #167457, which will emit .dwp files in which the compile units are DWARF32 while the .debug_str_offsets tables are emitted as DWARF64, allowing .debug_str sections to exceed 4GB in size.
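
To illustrate the distinction the description relies on: in DWARF, the format of a table is encoded in its initial length field, so it can be read from the table itself instead of being inherited from the compile unit. The following is a minimal standalone sketch of that decoding for little-endian data. The names `InitialLength`, `readLE`, and `decodeInitialLength` are hypothetical and are not LLVM APIs; the patch itself uses `DA.getInitialLength(&Offset)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical names for illustration only; these are not LLVM APIs.
enum class Format { DWARF32, DWARF64 };

struct InitialLength {
  uint64_t Length;  // unit_length, not counting the length field itself
  Format Fmt;       // format implied by how the length is encoded
  size_t FieldSize; // bytes consumed: 4 for DWARF32, 12 for DWARF64
};

// Reads a little-endian unsigned integer of Size bytes starting at P.
inline uint64_t readLE(const uint8_t *P, size_t Size) {
  uint64_t V = 0;
  for (size_t I = 0; I < Size; ++I)
    V |= static_cast<uint64_t>(P[I]) << (8 * I);
  return V;
}

// Decodes a DWARF initial length. A leading 0xffffffff escape means the real
// length follows as a 64-bit value (DWARF64); otherwise the 32-bit value is
// the length (DWARF32). Returns std::nullopt for truncated or reserved input.
inline std::optional<InitialLength> decodeInitialLength(const uint8_t *Data,
                                                        size_t Size) {
  if (Size < 4)
    return std::nullopt;
  uint64_t First = readLE(Data, 4);
  if (First == 0xffffffffu) {
    if (Size < 12)
      return std::nullopt;
    return InitialLength{readLE(Data + 4, 8), Format::DWARF64, 12};
  }
  if (First >= 0xfffffff0u) // 0xfffffff0..0xfffffffe are reserved values
    return std::nullopt;
  return InitialLength{First, Format::DWARF32, 4};
}
```

Applied to the first bytes of the test's `.debug_str_offsets.dwo` section (`FFFFFFFF 2C00000000000000`), this yields a DWARF64 table with a length of 44, even though the compile unit header in the same file is DWARF32.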

@llvmbot
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-debuginfo

Author: Greg Clayton (clayborg)

Changes

This patch updates the reading capabilities of the LLVM DWARF parser in preparation for llvm-dwp patch #167457, which will emit .dwp files in which the compile units are DWARF32 while the .debug_str_offsets tables are emitted as DWARF64, allowing .debug_str sections to exceed 4GB in size.


Full diff: https://github.com/llvm/llvm-project/pull/167986.diff

2 Files Affected:

  • (modified) llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp (+11-2)
  • (added) llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml (+88)
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
index da0bf03e1ac57..b4256ae13914c 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
@@ -1187,9 +1187,18 @@ DWARFUnit::determineStringOffsetsTableContributionDWO(DWARFDataExtractor &DA) {
   if (getVersion() >= 5) {
     if (DA.getData().data() == nullptr)
       return std::nullopt;
-    Offset += Header.getFormat() == dwarf::DwarfFormat::DWARF32 ? 8 : 16;
+    // For .dwo files, the section contribution for the .debug_str_offsets
+    // points to the string offsets table header. Decode the format from this
+    // data as llvm-dwp has been modified to be able to emit a
+    // .debug_str_offsets table as DWARF64 even if the compile unit is DWARF32.
+    // This allows .dwp files to have string tables that exceed UINT32_MAX in
+    // size.
+    uint64_t Length = 0;
+    DwarfFormat Format = dwarf::DwarfFormat::DWARF32;
+    std::tie(Length, Format) = DA.getInitialLength(&Offset);
+    Offset += 4; // Skip the DWARF version uint16_t and the uint16_t padding.
     // Look for a valid contribution at the given offset.
-    auto DescOrError = parseDWARFStringOffsetsTableHeader(DA, Header.getFormat(), Offset);
+    auto DescOrError = parseDWARFStringOffsetsTableHeader(DA, Format, Offset);
     if (!DescOrError)
       return DescOrError.takeError();
     return *DescOrError;
diff --git a/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml b/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml
new file mode 100644
index 0000000000000..3820ca7184d62
--- /dev/null
+++ b/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml
@@ -0,0 +1,88 @@
+# This YAML file will create a .dwp file that has a DWARF32 compile unit whose
+# .debug_str_offsets.dwo is in DWARF64 format. This test verifies that
+# llvm-dwarfdump can read the strings correctly and dump the
+# .debug_str_offsets.dwo info correctly. This paves the way for llvm-dwp to
+# promote some .debug_str_offsets tables for .dwo files to be DWARF64 and will
+# allow the .debug_str section to be larger than UINT32_MAX size in bytes
+# without losing data.
+
+# RUN: yaml2obj %s -o %t.dwp
+# RUN: llvm-dwarfdump --debug-str-offsets --debug-info %t.dwp | FileCheck %s
+
+# CHECK:      0x00000000: Compile Unit: length = 0x0000002a, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size = 0x08, DWO_id = 0x1158980a3c2f811b (next unit at 0x0000002e)
+
+# CHECK:      0x00000014: DW_TAG_compile_unit
+# CHECK-NEXT:               DW_AT_producer    ("Apple clang version 17.0.0 (clang-1700.4.4.1)")
+# CHECK-NEXT:               DW_AT_language    (DW_LANG_C_plus_plus_14)
+# CHECK-NEXT:               DW_AT_name        ("main.minimal.cpp")
+# CHECK-NEXT:               DW_AT_dwo_name    ("main.minimal.dwo")
+
+# CHECK:      0x0000001a:   DW_TAG_subprogram
+# CHECK-NEXT:                 DW_AT_low_pc    (indexed (00000000) address = <unresolved>)
+# CHECK-NEXT:                 DW_AT_high_pc   (0x0000000f)
+# CHECK-NEXT:                 DW_AT_frame_base        (DW_OP_reg6 RBP)
+# CHECK-NEXT:                 DW_AT_name      ("main")
+# CHECK-NEXT:                 DW_AT_decl_file (0x00)
+# CHECK-NEXT:                 DW_AT_decl_line (1)
+# CHECK-NEXT:                 DW_AT_type      (0x00000029 "int")
+# CHECK-NEXT:                 DW_AT_external  (true)
+
+# CHECK:      0x00000029:   DW_TAG_base_type
+# CHECK-NEXT:                 DW_AT_name      ("int")
+# CHECK-NEXT:                 DW_AT_encoding  (DW_ATE_signed)
+# CHECK-NEXT:                 DW_AT_byte_size (0x04)
+
+# CHECK:      0x0000002d:   NULL
+
+# CHECK:      .debug_str_offsets.dwo contents:
+# CHECK-NEXT: 0x00000000: Contribution size = 44, Format = DWARF64, Version = 5
+# CHECK-NEXT: 0x00000010: 0000000000000000 "main"
+# CHECK-NEXT: 0x00000018: 0000000000000005 "int"
+# CHECK-NEXT: 0x00000020: 0000000000000009 "Apple clang version 17.0.0 (clang-1700.4.4.1)"
+# CHECK-NEXT: 0x00000028: 0000000000000037 "main.minimal.cpp"
+# CHECK-NEXT: 0x00000030: 0000000000000048 "main.minimal.dwo"
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_REL
+  Machine:         EM_X86_64
+  SectionHeaderStringTable: .strtab
+Sections:
+  - Name:            .debug_abbrev.dwo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_EXCLUDE ]
+    AddressAlign:    0x1
+    Content:         01110125251305032576250000022E00111B1206401803253A0B3B0B49133F19000003240003253E0B0B0B000000
+  - Name:            .debug_str.dwo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_EXCLUDE, SHF_MERGE, SHF_STRINGS ]
+    AddressAlign:    0x1
+    EntSize:         0x1
+    Content:         6D61696E00696E74004170706C6520636C616E672076657273696F6E2031372E302E302028636C616E672D313730302E342E342E3129006D61696E2E6D696E696D616C2E637070006D61696E2E6D696E696D616C2E64776F00
+  - Name:            .debug_str_offsets.dwo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_EXCLUDE ]
+    AddressAlign:    0x1
+    Content:         'FFFFFFFF2C000000000000000500000000000000000000000500000000000000090000000000000037000000000000004800000000000000'
+  - Name:            .debug_info.dwo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_EXCLUDE ]
+    AddressAlign:    0x1
+    Content:         2A00000005000508000000001B812F3C0A98581101022100030402000F0000000156000001290000000301050400
+  - Name:            .debug_cu_index
+    Type:            SHT_PROGBITS
+    AddressAlign:    0x1
+    Content:         0500000003000000010000000200000000000000000000001B812F3C0A98581100000000010000000100000003000000060000000000000000000000000000002E0000002E0000001C000000
+  - Type:            SectionHeaderTable
+    Sections:
+      - Name:            .strtab
+      - Name:            .debug_abbrev.dwo
+      - Name:            .debug_str.dwo
+      - Name:            .debug_str_offsets.dwo
+      - Name:            .debug_info.dwo
+      - Name:            .debug_cu_index
+      - Name:            .symtab
+Symbols:         []
+...
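
As a rough cross-check of the test data above, the sketch below walks one DWARF v5 string offsets contribution and resolves each entry against a string section. It is a standalone illustration, not the LLVM implementation: `readLE` and `readStrings` are hypothetical helpers, and the code assumes little-endian data and a single contribution starting at offset 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads a little-endian unsigned integer of Size bytes at Offset.
// The caller is responsible for bounds checking.
static uint64_t readLE(const std::vector<uint8_t> &D, size_t Offset,
                       size_t Size) {
  uint64_t V = 0;
  for (size_t I = 0; I < Size; ++I)
    V |= static_cast<uint64_t>(D[Offset + I]) << (8 * I);
  return V;
}

// Hypothetical helper: returns the strings referenced by one v5
// .debug_str_offsets contribution, or an empty vector if it is malformed.
std::vector<std::string> readStrings(const std::vector<uint8_t> &StrOffsets,
                                     const std::string &Str) {
  std::vector<std::string> Result;
  if (StrOffsets.size() < 4)
    return Result;
  // The table's own initial length decides the entry width.
  uint64_t Length = readLE(StrOffsets, 0, 4);
  size_t Offset = 4, EntrySize = 4;
  if (Length == 0xffffffffu) { // DWARF64 escape
    if (StrOffsets.size() < 12)
      return Result;
    Length = readLE(StrOffsets, 4, 8);
    Offset = 12;
    EntrySize = 8;
  }
  // unit_length covers the version, the padding, and the entries.
  if (Length < 4 || Length > StrOffsets.size() - Offset)
    return Result;
  size_t End = Offset + static_cast<size_t>(Length);
  Offset += 4; // skip the 2-byte version and the 2-byte padding
  for (; Offset + EntrySize <= End; Offset += EntrySize) {
    uint64_t StrOff = readLE(StrOffsets, Offset, EntrySize);
    if (StrOff >= Str.size())
      break;
    // Each entry points at a NUL-terminated string in the string section.
    Result.emplace_back(Str.c_str() + StrOff);
  }
  return Result;
}
```

Fed the `.debug_str_offsets.dwo` and `.debug_str.dwo` contents from the YAML, the five 8-byte entries (0x0, 0x5, 0x9, 0x37, 0x48) resolve to the same five strings the CHECK lines expect.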

Comment on lines 1190 to 1195
// For .dwo files, the section contribution for the .debug_str_offsets
// points to the string offsets table header. Decode the format from this
// data as llvm-dwp has been modified to be able to emit a
// .debug_str_offsets table as DWARF64 even if the compile unit is DWARF32.
// This allows .dwp files to have string tables that exceed UINT32_MAX in
// size.
Collaborator

I'd probably skip the reference to llvm-dwp - Probably mention/link to https://dwarfstd.org/issues/211101.1.html

Though it barely needs a comment - the consequence falls out pretty naturally from parsing the header.

So maybe just a small FYI:

// FYI: The .debug_str_offsets.dwo section may use DWARF64 even when the rest of the file uses DWARF32, so respect whichever encoding the header/length uses.
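
The header arithmetic behind this point can be spelled out in a few lines. This is only a sketch with hypothetical helper names; the sizes follow from the layout the patch parses (an initial length, a 2-byte version, and 2 bytes of padding), which is why the old code skipped 8 bytes for DWARF32 and 16 for DWARF64.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of LLVM.
enum class Format { DWARF32, DWARF64 };

// Size of the initial length field: 4 bytes, or the 4-byte escape plus 8.
constexpr uint64_t initialLengthSize(Format F) {
  return F == Format::DWARF64 ? 12 : 4;
}

// Header size: initial length + 2-byte version + 2-byte padding.
constexpr uint64_t strOffsetsHeaderSize(Format F) {
  return initialLengthSize(F) + 2 + 2;
}

// Width of each string offset entry.
constexpr uint64_t entrySize(Format F) {
  return F == Format::DWARF64 ? 8 : 4;
}

// Number of entries implied by unit_length, which covers the version,
// the padding, and the entries, but not the length field itself.
constexpr uint64_t entryCount(Format F, uint64_t UnitLength) {
  return UnitLength < 4 ? 0 : (UnitLength - 4) / entrySize(F);
}

static_assert(strOffsetsHeaderSize(Format::DWARF32) == 8, "DWARF32 header");
static_assert(strOffsetsHeaderSize(Format::DWARF64) == 16, "DWARF64 header");
// The test's table: unit_length 44 in DWARF64 holds five 8-byte entries.
static_assert(entryCount(Format::DWARF64, 44) == 5, "test table entries");
```

The key design point is that `Format` here comes from the table's own header. Using the compile unit's format would give an 8-byte header and 4-byte entries for the test's table, which would misread every offset.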

@clayborg clayborg merged commit 8b59622 into llvm:main Nov 14, 2025
10 checks passed
@clayborg clayborg deleted the dwp-read-str-offsets-dwarf64 branch November 14, 2025 19:22
@dyung
Collaborator

dyung commented Nov 14, 2025

Hi @clayborg, the test you added in this change seems to be failing on macOS. Can you take a look?

https://lab.llvm.org/buildbot/#/builders/190/builds/30982

FAIL: LLVM::dwarfdump-dwp-str-offsets-64.yaml (builder llvm-clang-aarch64-darwin, build 30982, step test-build-unified-tree-check-all): https://lab.llvm.org/buildbot/#/builders/190/builds/30982/steps/6/logs/FAIL__LLVM__dwarfdump-dwp-str-offsets-64_yaml
******************** TEST 'LLVM :: DebugInfo/dwarfdump-dwp-str-offsets-64.yaml' FAILED ********************
Exit Code: 1
Command Output (stdout):
--
# RUN: at line 9
/Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/yaml2obj /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml -o /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/test/DebugInfo/Output/dwarfdump-dwp-str-offsets-64.yaml.tmp.dwp
# executed command: /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/yaml2obj /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml -o /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/test/DebugInfo/Output/dwarfdump-dwp-str-offsets-64.yaml.tmp.dwp
# note: command had no output on stdout or stderr
# RUN: at line 10
/Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/llvm-dwarfdump --debug-str-offsets --debug-info /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/test/DebugInfo/Output/dwarfdump-dwp-str-offsets-64.yaml.tmp.dwp | /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml
# executed command: /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/llvm-dwarfdump --debug-str-offsets --debug-info /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/test/DebugInfo/Output/dwarfdump-dwp-str-offsets-64.yaml.tmp.dwp
# .---command stderr------------
# | /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/test/DebugInfo/Output/dwarfdump-dwp-str-offsets-64.yaml.tmp.dwp: Error in creating MCRegInfo
# `-----------------------------
# executed command: /Volumes/ExternalSSD/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml
# .---command stderr------------
# | /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml:23:15: error: CHECK-NEXT: expected string not found in input
# | # CHECK-NEXT: DW_AT_frame_base (DW_OP_reg6 RBP)
# |               ^
# | <stdin>:14:28: note: scanning from here
# |  DW_AT_high_pc (0x0000000f)
# |                            ^
# | <stdin>:15:2: note: possible intended match here
# |  DW_AT_frame_base (DW_OP_reg6)
# |  ^
# | 
# | Input file: <stdin>
# | Check file: /Users/buildbot/buildbot-root2/aarch64-darwin/llvm-project/llvm/test/DebugInfo/dwarfdump-dwp-str-offsets-64.yaml
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            .
# |            .
# |            .
# |            9:  DW_AT_name ("main.minimal.cpp") 
# |           10:  DW_AT_dwo_name ("main.minimal.dwo") 
# |           11:  
# |           12: 0x0000001a: DW_TAG_subprogram 
# |           13:  DW_AT_low_pc (indexed (00000000) address = <unresolved>) 
# |           14:  DW_AT_high_pc (0x0000000f) 
# | next:23'0                                X error: no match found
# |           15:  DW_AT_frame_base (DW_OP_reg6) 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:23'1      ?                              possible intended match
# |           16:  DW_AT_name ("main") 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~
# |           17:  DW_AT_decl_file (0x00) 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~
# |           18:  DW_AT_decl_line (1) 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~
# |           19:  DW_AT_type (0x00000029 "int") 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           20:  DW_AT_external (true) 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~
# |            .
# |            .
# |            .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1
--
********************
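
A plausible reading of this log, offered as an assumption and not as a confirmed diagnosis: the "Error in creating MCRegInfo" line suggests this build had no register information available for the x86-64 file, so the dump printed `DW_OP_reg6` without the `RBP` name that the CHECK line expects. The sketch below models that behavior with a hypothetical register table; `RegNameTable` and `formatRegOp` are illustrative names, not LLVM APIs.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for target register info; not an LLVM API.
using RegNameTable = std::map<unsigned, std::string>;

// Formats DW_OP_reg<N>, appending the register name only when a name table
// is available and knows the register.
std::string formatRegOp(unsigned DwarfReg, const RegNameTable *Names) {
  std::string Out = "DW_OP_reg" + std::to_string(DwarfReg);
  if (Names) {
    auto It = Names->find(DwarfReg);
    if (It != Names->end())
      Out += " " + It->second;
  }
  return Out;
}
```

With a table that maps register 6 to `RBP`, the result matches the test's expected text; with no table, the result matches what this bot printed. A check that includes the register name therefore depends on which targets the tools were built with.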

@clayborg
Collaborator Author

Hi @clayborg, the test you added in this change seems to be failing on macOS. Can you take a look?

https://lab.llvm.org/buildbot/#/builders/190/builds/30982

Easy fix, I will make a PR real quick.

clayborg added a commit to clayborg/llvm-project that referenced this pull request Nov 14, 2025
@clayborg
Collaborator Author

Buildbot fix PR is here: #168124

clayborg added a commit that referenced this pull request Nov 14, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Nov 14, 2025
…e from targets. (#168124)

Fixes a buildbot issue stemming from
llvm/llvm-project#167986