dlav-sc
Contributor

@dlav-sc dlav-sc commented Oct 18, 2025

Currently, `-gsplit-dwarf` and `-mrelax` are incompatible options in
Clang. The issue is that `.dwo` files should not contain any
relocations, as they are not processed by the linker. However, relaxable
code emits relocations in DWARF for debug ranges that reside in the
`.dwo` file when DWARF fission is enabled.

This patch makes DWARF fission compatible with RISC-V relaxations. It
uses the `StartxEndx` DWARF forms in `.debug_rnglists.dwo`, which
allow referencing addresses from `.debug_addr` instead of using absolute
addresses. This approach eliminates relocations from `.dwo` files.
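
To illustrate why the indexed form needs no relocation, here is a minimal, standalone sketch (not the LLVM implementation) of how a `DW_RLE_startx_endx` entry and a `DW_RLE_startx_length` entry are laid out in bytes. The entry-kind codes are the ones from the DWARF v5 range-list encoding table; the helper names `appendULEB128`, `appendStartxEndx` and `appendStartxLength` are hypothetical and exist only for this example.

```cpp
#include <cstdint>
#include <vector>

// Entry-kind codes from the DWARF v5 range-list encoding table.
constexpr std::uint8_t DW_RLE_end_of_list = 0x00;
constexpr std::uint8_t DW_RLE_startx_endx = 0x02;
constexpr std::uint8_t DW_RLE_startx_length = 0x03;

// Append Value in unsigned LEB128 form (7 payload bits per byte).
void appendULEB128(std::vector<std::uint8_t> &Out, std::uint64_t Value) {
  do {
    std::uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical helper: both operands are indices into .debug_addr, so the
// entry itself contains no address and needs no relocation.
void appendStartxEndx(std::vector<std::uint8_t> &Out, std::uint64_t StartIdx,
                      std::uint64_t EndIdx) {
  Out.push_back(DW_RLE_startx_endx);
  appendULEB128(Out, StartIdx);
  appendULEB128(Out, EndIdx);
}

// Hypothetical helper: the length is a plain constant, which is only safe
// when the linker cannot change the distance between start and end.
void appendStartxLength(std::vector<std::uint8_t> &Out, std::uint64_t StartIdx,
                        std::uint64_t Length) {
  Out.push_back(DW_RLE_startx_length);
  appendULEB128(Out, StartIdx);
  appendULEB128(Out, Length);
}
```

With small indices each entry occupies three bytes, so the cost of `StartxEndx` is mostly the extra `.debug_addr` slot for the end address rather than the entry itself.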

@llvmbot added the clang, backend:RISC-V, clang:driver, clang:frontend, llvm:codegen, debuginfo, and llvm:mc labels on Oct 18, 2025
@llvmbot
Member

llvmbot commented Oct 18, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-mc
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-risc-v

Author: None (dlav-sc)

Changes


Patch is 23.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164128.diff

8 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (-3)
  • (modified) clang/lib/Driver/ToolChains/Arch/RISCV.cpp (+2-9)
  • (modified) clang/test/Driver/riscv-features.c (-7)
  • (modified) llvm/include/llvm/MC/MCSymbol.h (+2)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+5-3)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+112-35)
  • (modified) llvm/lib/MC/MCSymbol.cpp (+19-3)
  • (added) llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll (+155)
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 0581bf353d936..fe29424d8f914 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -833,9 +833,6 @@ def warn_drv_sarif_format_unstable : Warning<
   "diagnostic formatting in SARIF mode is currently unstable">,
   InGroup<DiagGroup<"sarif-format-unstable">>;
 
-def err_drv_riscv_unsupported_with_linker_relaxation : Error<
-  "%0 is unsupported with RISC-V linker relaxation (-mrelax)">;
-
 def warn_drv_loongarch_conflicting_implied_val : Warning<
   "ignoring '%0' as it conflicts with that implied by '%1' (%2)">,
   InGroup<OptionIgnored>;
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
index f2e79e71f93d4..549a8b712c602 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
@@ -130,17 +130,10 @@ void riscv::getRISCVTargetFeatures(const Driver &D, const llvm::Triple &Triple,
 #undef RESERVE_REG
 
   // -mrelax is default, unless -mno-relax is specified.
-  if (Args.hasFlag(options::OPT_mrelax, options::OPT_mno_relax, true)) {
+  if (Args.hasFlag(options::OPT_mrelax, options::OPT_mno_relax, true))
     Features.push_back("+relax");
-    // -gsplit-dwarf -mrelax requires DW_AT_high_pc/DW_AT_ranges/... indexing
-    // into .debug_addr, which is currently not implemented.
-    Arg *A;
-    if (getDebugFissionKind(D, Args, A) != DwarfFissionKind::None)
-      D.Diag(clang::diag::err_drv_riscv_unsupported_with_linker_relaxation)
-          << A->getAsString(Args);
-  } else {
+  else
     Features.push_back("-relax");
-  }
 
   // If -mstrict-align, -mno-strict-align, -mscalar-strict-align, or
   // -mno-scalar-strict-align is passed, use it. Otherwise, the
diff --git a/clang/test/Driver/riscv-features.c b/clang/test/Driver/riscv-features.c
index 1c8b52bd31997..97736ff81c799 100644
--- a/clang/test/Driver/riscv-features.c
+++ b/clang/test/Driver/riscv-features.c
@@ -68,13 +68,6 @@
 // DEFAULT-LINUX-SAME: "-target-feature" "+d"
 // DEFAULT-LINUX-SAME: "-target-feature" "+c"
 
-// RUN: not %clang -c --target=riscv64-linux-gnu -gsplit-dwarf %s 2>&1 | FileCheck %s --check-prefix=ERR-SPLIT-DWARF
-// RUN: not %clang -c --target=riscv64 -gsplit-dwarf=single %s 2>&1 | FileCheck %s --check-prefix=ERR-SPLIT-DWARF
-// RUN: %clang -### -c --target=riscv64 -mno-relax -g -gsplit-dwarf %s 2>&1 | FileCheck %s --check-prefix=SPLIT-DWARF
-
-// ERR-SPLIT-DWARF: error: -gsplit-dwarf{{.*}} is unsupported with RISC-V linker relaxation (-mrelax)
-// SPLIT-DWARF:     "-split-dwarf-file"
-
 // RUN: %clang -mabi=lp64d --target=riscv64-unknown-fuchsia -### %s -fsyntax-only 2>&1 | FileCheck %s -check-prefixes=FUCHSIA
 // FUCHSIA: "-target-feature" "+m"
 // FUCHSIA-SAME: "-target-feature" "+a"
diff --git a/llvm/include/llvm/MC/MCSymbol.h b/llvm/include/llvm/MC/MCSymbol.h
index e31d0374baf4a..eef248354b70f 100644
--- a/llvm/include/llvm/MC/MCSymbol.h
+++ b/llvm/include/llvm/MC/MCSymbol.h
@@ -383,6 +383,8 @@ inline raw_ostream &operator<<(raw_ostream &OS, const MCSymbol &Sym) {
   return OS;
 }
 
+bool isRangeRelaxable(const MCSymbol *Begin, const MCSymbol *End);
+
 } // end namespace llvm
 
 #endif // LLVM_MC_MCSYMBOL_H
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 518121e200190..12ad5c18e9600 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -493,10 +493,12 @@ void DwarfCompileUnit::attachLowHighPC(DIE &D, const MCSymbol *Begin,
   assert(End->isDefined() && "Invalid end label");
 
   addLabelAddress(D, dwarf::DW_AT_low_pc, Begin);
-  if (DD->getDwarfVersion() < 4)
-    addLabelAddress(D, dwarf::DW_AT_high_pc, End);
-  else
+  if (DD->getDwarfVersion() >= 4 &&
+      (!isDwoUnit() || !llvm::isRangeRelaxable(Begin, End))) {
     addLabelDelta(D, dwarf::DW_AT_high_pc, End, Begin);
+    return;
+  }
+  addLabelAddress(D, dwarf::DW_AT_high_pc, End);
 }
 
 // Add info for Wasm-global-based relocation.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 433877f3a8b98..4ba35953d196c 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -3275,34 +3275,70 @@ static MCSymbol *emitLoclistsTableHeader(AsmPrinter *Asm,
   return TableEnd;
 }
 
-template <typename Ranges, typename PayloadEmitter>
-static void emitRangeList(
-    DwarfDebug &DD, AsmPrinter *Asm, MCSymbol *Sym, const Ranges &R,
-    const DwarfCompileUnit &CU, unsigned BaseAddressx, unsigned OffsetPair,
-    unsigned StartxLength, unsigned EndOfList,
-    StringRef (*StringifyEnum)(unsigned),
-    bool ShouldUseBaseAddress,
-    PayloadEmitter EmitPayload) {
+namespace {
+
+struct DebugLocSpanList {
+  MCSymbol *Label;
+  const DwarfCompileUnit *CU;
+  llvm::ArrayRef<llvm::DebugLocStream::Entry> Ranges;
+};
+
+template <typename DWARFSpanList> struct DwarfRangeListTraits {};
+
+template <> struct DwarfRangeListTraits<DebugLocSpanList> {
+  static constexpr unsigned BaseAddressx = dwarf::DW_LLE_base_addressx;
+  static constexpr unsigned OffsetPair = dwarf::DW_LLE_offset_pair;
+  static constexpr unsigned StartxLength = dwarf::DW_LLE_startx_length;
+  static constexpr unsigned StartxEndx = dwarf::DW_LLE_startx_endx;
+  static constexpr unsigned EndOfList = dwarf::DW_LLE_end_of_list;
+
+  static StringRef StringifyRangeKind(unsigned Encoding) {
+    return llvm::dwarf::LocListEncodingString(Encoding);
+  }
+};
+
+template <> struct DwarfRangeListTraits<RangeSpanList> {
+  static constexpr unsigned BaseAddressx = dwarf::DW_RLE_base_addressx;
+  static constexpr unsigned OffsetPair = dwarf::DW_RLE_offset_pair;
+  static constexpr unsigned StartxLength = dwarf::DW_RLE_startx_length;
+  static constexpr unsigned StartxEndx = dwarf::DW_RLE_startx_endx;
+  static constexpr unsigned EndOfList = dwarf::DW_RLE_end_of_list;
+
+  static StringRef StringifyRangeKind(unsigned Encoding) {
+    return llvm::dwarf::RangeListEncodingString(Encoding);
+  }
+};
+
+} // namespace
+
+template <
+    typename Ranges, typename PayloadEmitter,
+    std::enable_if_t<DwarfRangeListTraits<Ranges>::BaseAddressx, bool> = true>
+static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm, const Ranges &R,
+                          bool ShouldUseBaseAddress,
+                          PayloadEmitter EmitPayload) {
 
   auto Size = Asm->MAI->getCodePointerSize();
   bool UseDwarf5 = DD.getDwarfVersion() >= 5;
 
   // Emit our symbol so we can find the beginning of the range.
-  Asm->OutStreamer->emitLabel(Sym);
+  Asm->OutStreamer->emitLabel(R.Label);
 
   // Gather all the ranges that apply to the same section so they can share
   // a base address entry.
-  SmallMapVector<const MCSection *, std::vector<decltype(&*R.begin())>, 16>
+  SmallMapVector<const MCSection *, std::vector<decltype(&*R.Ranges.begin())>,
+                 16>
       SectionRanges;
 
-  for (const auto &Range : R)
+  for (const auto &Range : R.Ranges)
     SectionRanges[&Range.Begin->getSection()].push_back(&Range);
 
-  const MCSymbol *CUBase = CU.getBaseAddress();
+  const MCSymbol *CUBase = R.CU->getBaseAddress();
   bool BaseIsSet = false;
   for (const auto &P : SectionRanges) {
     auto *Base = CUBase;
-    if ((Asm->TM.getTargetTriple().isNVPTX() && DD.tuneForGDB())) {
+    if ((Asm->TM.getTargetTriple().isNVPTX() && DD.tuneForGDB()) ||
+        (DD.useSplitDwarf() && P.first->isLinkerRelaxable())) {
       // PTX does not support subtracting labels from the code section in the
       // debug_loc section.  To work around this, the NVPTX backend needs the
       // compile unit to have no low_pc in order to have a zero base_address
@@ -3327,8 +3363,10 @@ static void emitRangeList(
         //  * or, there's more than one entry to share the base address
         Base = NewBase;
         BaseIsSet = true;
-        Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx));
-        Asm->emitInt8(BaseAddressx);
+        Asm->OutStreamer->AddComment(
+            DwarfRangeListTraits<Ranges>::StringifyRangeKind(
+                DwarfRangeListTraits<Ranges>::BaseAddressx));
+        Asm->emitInt8(DwarfRangeListTraits<Ranges>::BaseAddressx);
         Asm->OutStreamer->AddComment("  base address index");
         Asm->emitULEB128(DD.getAddressPool().getIndex(Base));
       }
@@ -3339,7 +3377,46 @@ static void emitRangeList(
       Asm->OutStreamer->emitIntValue(0, Size);
     }
 
-    for (const auto *RS : P.second) {
+    if (DD.useSplitDwarf() && UseDwarf5) {
+      // In .dwo files, we must ensure no relocations are present. For
+      // .debug_rnglists.dwo, this means that if there is at least one
+      // relocation between the start and end of a range, we must
+      // represent the range boundaries using indirect addresses from
+      // the .debug_addr section.
+      //
+      // The DWARFv5 specification (section 2.17.3) does not require
+      // range entries to be ordered. Therefore, we emit such range
+      // entries here to allow using more optimal formats (e.g.
+      // StartxLength) for other ranges.
+
+      auto RelaxableRanges = llvm::make_filter_range(P.second, [](auto &&RS) {
+        return llvm::isRangeRelaxable(RS->Begin, RS->End);
+      });
+
+      for (auto &&RS : RelaxableRanges) {
+        const auto *Begin = RS->Begin;
+        const auto *End = RS->End;
+        Asm->OutStreamer->AddComment(
+            DwarfRangeListTraits<Ranges>::StringifyRangeKind(
+                DwarfRangeListTraits<Ranges>::StartxEndx));
+        Asm->emitInt8(DwarfRangeListTraits<Ranges>::StartxEndx);
+        Asm->OutStreamer->AddComment("  start index");
+        Asm->emitULEB128(DD.getAddressPool().getIndex(Begin));
+        Asm->OutStreamer->AddComment("  end index");
+        Asm->emitULEB128(DD.getAddressPool().getIndex(End));
+        EmitPayload(*RS);
+      }
+    }
+
+    auto NonRelaxableRanges = llvm::make_filter_range(
+        P.second,
+        [HasRelaxableRanges = DD.useSplitDwarf() && UseDwarf5](auto &&RS) {
+          if (!HasRelaxableRanges)
+            return true;
+          return !llvm::isRangeRelaxable(RS->Begin, RS->End);
+        });
+
+    for (const auto *RS : NonRelaxableRanges) {
       const MCSymbol *Begin = RS->Begin;
       const MCSymbol *End = RS->End;
       assert(Begin && "Range without a begin symbol?");
@@ -3347,8 +3424,10 @@ static void emitRangeList(
       if (Base) {
         if (UseDwarf5) {
           // Emit offset_pair when we have a base.
-          Asm->OutStreamer->AddComment(StringifyEnum(OffsetPair));
-          Asm->emitInt8(OffsetPair);
+          Asm->OutStreamer->AddComment(
+              DwarfRangeListTraits<Ranges>::StringifyRangeKind(
+                  DwarfRangeListTraits<Ranges>::OffsetPair));
+          Asm->emitInt8(DwarfRangeListTraits<Ranges>::OffsetPair);
           Asm->OutStreamer->AddComment("  starting offset");
           Asm->emitLabelDifferenceAsULEB128(Begin, Base);
           Asm->OutStreamer->AddComment("  ending offset");
@@ -3358,8 +3437,10 @@ static void emitRangeList(
           Asm->emitLabelDifference(End, Base, Size);
         }
       } else if (UseDwarf5) {
-        Asm->OutStreamer->AddComment(StringifyEnum(StartxLength));
-        Asm->emitInt8(StartxLength);
+        Asm->OutStreamer->AddComment(
+            DwarfRangeListTraits<Ranges>::StringifyRangeKind(
+                DwarfRangeListTraits<Ranges>::StartxLength));
+        Asm->emitInt8(DwarfRangeListTraits<Ranges>::StartxLength);
         Asm->OutStreamer->AddComment("  start index");
         Asm->emitULEB128(DD.getAddressPool().getIndex(Begin));
         Asm->OutStreamer->AddComment("  length");
@@ -3373,8 +3454,10 @@ static void emitRangeList(
   }
 
   if (UseDwarf5) {
-    Asm->OutStreamer->AddComment(StringifyEnum(EndOfList));
-    Asm->emitInt8(EndOfList);
+    Asm->OutStreamer->AddComment(
+        DwarfRangeListTraits<Ranges>::StringifyRangeKind(
+            DwarfRangeListTraits<Ranges>::EndOfList));
+    Asm->emitInt8(DwarfRangeListTraits<Ranges>::EndOfList);
   } else {
     // Terminate the list with two 0 values.
     Asm->OutStreamer->emitIntValue(0, Size);
@@ -3384,11 +3467,9 @@ static void emitRangeList(
 
 // Handles emission of both debug_loclist / debug_loclist.dwo
 static void emitLocList(DwarfDebug &DD, AsmPrinter *Asm, const DebugLocStream::List &List) {
-  emitRangeList(DD, Asm, List.Label, DD.getDebugLocs().getEntries(List),
-                *List.CU, dwarf::DW_LLE_base_addressx,
-                dwarf::DW_LLE_offset_pair, dwarf::DW_LLE_startx_length,
-                dwarf::DW_LLE_end_of_list, llvm::dwarf::LocListEncodingString,
-                /* ShouldUseBaseAddress */ true,
+  DebugLocSpanList Ranges = {List.Label, List.CU,
+                             DD.getDebugLocs().getEntries(List)};
+  emitRangeList(DD, Asm, Ranges, /* ShouldUseBaseAddress */ true,
                 [&](const DebugLocStream::Entry &E) {
                   DD.emitDebugLocEntryLocation(E, List.CU);
                 });
@@ -3413,10 +3494,9 @@ void DwarfDebug::emitDebugLocImpl(MCSection *Sec) {
 
 // Emit locations into the .debug_loc/.debug_loclists section.
 void DwarfDebug::emitDebugLoc() {
-  emitDebugLocImpl(
-      getDwarfVersion() >= 5
-          ? Asm->getObjFileLowering().getDwarfLoclistsSection()
-          : Asm->getObjFileLowering().getDwarfLocSection());
+  emitDebugLocImpl(getDwarfVersion() >= 5
+                       ? Asm->getObjFileLowering().getDwarfLoclistsSection()
+                       : Asm->getObjFileLowering().getDwarfLocSection());
 }
 
 // Emit locations into the .debug_loc.dwo/.debug_loclists.dwo section.
@@ -3611,10 +3691,7 @@ void DwarfDebug::emitDebugARanges() {
 /// Emit a single range list. We handle both DWARF v5 and earlier.
 static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm,
                           const RangeSpanList &List) {
-  emitRangeList(DD, Asm, List.Label, List.Ranges, *List.CU,
-                dwarf::DW_RLE_base_addressx, dwarf::DW_RLE_offset_pair,
-                dwarf::DW_RLE_startx_length, dwarf::DW_RLE_end_of_list,
-                llvm::dwarf::RangeListEncodingString,
+  emitRangeList(DD, Asm, List,
                 List.CU->getCUNode()->getRangesBaseAddress() ||
                     DD.getDwarfVersion() >= 5,
                 [](auto) {});
diff --git a/llvm/lib/MC/MCSymbol.cpp b/llvm/lib/MC/MCSymbol.cpp
index b86873824cb00..e95c714ed003c 100644
--- a/llvm/lib/MC/MCSymbol.cpp
+++ b/llvm/lib/MC/MCSymbol.cpp
@@ -83,8 +83,24 @@ void MCSymbol::print(raw_ostream &OS, const MCAsmInfo *MAI) const {
   OS << '"';
 }
 
-#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
-LLVM_DUMP_METHOD void MCSymbol::dump() const {
-  dbgs() << *this;
+bool llvm::isRangeRelaxable(const MCSymbol *Begin, const MCSymbol *End) {
+  assert(Begin && "Range without a begin symbol?");
+  assert(End && "Range without an end symbol?");
+  llvm::SmallVector<const MCFragment *> RangeFragments{};
+  for (const auto *Fragment = Begin->getFragment();
+       Fragment != End->getFragment(); Fragment = Fragment->getNext()) {
+    assert(Fragment);
+    RangeFragments.push_back(Fragment);
+  }
+  assert(End->getFragment());
+  RangeFragments.push_back(End->getFragment());
+
+  bool IsRelaxableRange = llvm::any_of(RangeFragments, [](auto &&Fragment) {
+    return Fragment->isLinkerRelaxable();
+  });
+  return IsRelaxableRange;
 }
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+LLVM_DUMP_METHOD void MCSymbol::dump() const { dbgs() << *this; }
 #endif
diff --git a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
new file mode 100644
index 0000000000000..759177b2a376f
--- /dev/null
+++ b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
@@ -0,0 +1,155 @@
+; RUN: llc -dwarf-version=5 -split-dwarf-file=foo.dwo -O0 %s -mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
+; RUN: llvm-dwarfdump -v %t | FileCheck %s
+
+; In the RISC-V architecture, the .text section is subject to
+; relaxation, meaning the start address of each function can change
+; during the linking process. Therefore, the .debug_rnglists.dwo
+; section must obtain each function's start address from the .debug_addr
+; section.
+
+; Generally, a function's body can be relaxed (for example, the
+; square() and main() functions in this test, which contain call
+; instructions). For such code ranges, the compiler must place the
+; start and end addresses into the .debug_addr section and use
+; the DW_RLE_startx_endx entry form in the .debug_rnglists.dwo
+; section within the .dwo file.
+
+; However, some functions may not contain any relaxable instructions
+; (for example, the boo() function in this test). In these cases,
+; it is possible to use the more space-efficient DW_RLE_startx_length
+; range entry form.
+
+; From the code:
+
+; __attribute__((noinline)) int boo();
+
+; int square(int num) {
+;   int num1 = boo();
+;   return num1 * num;
+; }
+
+; __attribute__((noinline)) int boo() {
+;   return 8;
+; }
+
+; int main() {
+;   int a = 10;
+;   int squared = square(a);
+;   return squared;
+; }
+
+; compiled with
+
+; clang -g -S -gsplit-dwarf --target=riscv64 -march=rv64gc -O0 relax_dwo_ranges.cpp
+
+; Ensure that 'square()' function uses indexed start and end addresses
+; CHECK: .debug_info.dwo contents:
+; CHECK: DW_TAG_subprogram
+; CHECK-NEXT: DW_AT_low_pc [DW_FORM_addrx]    (indexed (00000000) address = 0x0000000000000000 ".text")
+; CHECK-NEXT: DW_AT_high_pc [DW_FORM_addrx]   (indexed (00000001) address = 0x0000000000000044 ".text")
+; CHECK: DW_AT_name {{.*}} "square") 
+; CHECK: DW_TAG_formal_parameter
+
+; Ensure there are no unnecessary addresses in the .o file
+; CHECK: .debug_addr contents:
+; CHECK: Addrs: [
+; CHECK-NEXT: 0x0000000000000000
+; CHECK-NEXT: 0x0000000000000044
+; CHECK-NEXT: 0x0000000000000046
+; CHECK-NEXT: 0x000000000000006c
+; CHECK-NEXT: 0x00000000000000b0
+; CHECK-NEXT: ]
+
+; Ensure that 'boo()' and 'main()' use DW_RLE_startx_length and DW_RLE_startx_endx
+; entries respectively
+; CHECK: .debug_rnglists.dwo contents:
+; CHECK: ranges:
+; CHECK-NEXT: 0x00000014: [DW_RLE_startx_length]:  0x0000000000000002, 0x0000000000000024 => [0x0000000000000046, 0x000000000000006a)
+; CHECK-NEXT: 0x00000017: [DW_RLE_end_of_list  ]
+; CHECK-NEXT: 0x00000018: [DW_RLE_startx_endx  ]:  0x0000000000000003, 0x0000000000000004 => [0x000000000000006c, 0x00000000000000b0)
+; CHECK-NEXT: 0x0000001b: [DW_RLE_end_of_list  ]
+; CHECK-EMPTY:
+
+
+; Function Attrs: mustprogress noinline optnone
+define dso_local noundef signext i32 @_Z6squarei(i32 noundef signext %0) #0 !dbg !11 {
+  %2 = alloca i32, align 4
+  %3 = alloca i32, align 4
+  store i32 %0, ptr %2, align 4
+    #dbg_declare(ptr %2, !16, !DIExpression(), !17)
+    #dbg_declare(ptr %3, !18, !DIExpression(), !19)
+  %4 = call noundef signext i32 @_Z3boov(), !dbg !20
+  store i32 %4, ptr %3, align 4, !dbg !19
+  %5 = load i32, ptr %3, align 4, !dbg !21
+  %6 = load i32, ptr %2, align 4, !dbg !22
+  %7 = mul nsw i32 %5, %6, !dbg !23
+  ret i32 %7, !dbg !24
+}
+
+; Function Attrs: mustprogress noinline nounwind optnone
+define dso_local noundef signext i32 @_Z3boov() #1 !dbg !25 {
+  ret i32 8, !dbg !28
+}
+
+; Function Attrs: mustprogress noinline norecurse optnone
+define dso_local noundef signext i32 @main() #2 !dbg !29 {
+  %1 = alloca i32, align 4
+  %2 = alloca i32, align 4
+  %3 = alloca i32, align 4
+  store i32 0, ptr %1, align 4
+    #dbg_declare(ptr %2, !30, !DIExpression(), !31)
+  store i32 10, ptr %2, align 4, !dbg !31
+    #dbg_declare(ptr %3, !32, !DIExpression(), !33)
+  %4 = load i32, ptr %2, align 4, !dbg !34
+  %5 = call noundef signext i32 @_Z6squarei(i32 noundef signext %4), !dbg !35
+  store i32 %5, ptr %3, align 4, !dbg !33
+  %6 = load i32, ptr %3, align 4, !dbg !36
+  ret i32 %6, !dbg !37
+}
+
+attributes #0 = { mustprogress noinline optnone "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic-rv64" "target-features"="+64bit,+relax,+f,+d" }
+attributes #1 = { mustprogress noinline nounwind optnone "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic-rv64" "target-features"="+64bit,+relax,+f,+d" }
+attributes #2 = { must...
[truncated]

@dlav-sc
Contributor Author

dlav-sc commented Oct 18, 2025

Out of curiosity, I compared the link time and final binary size with and without DWARF fission on RISC-V with `-mrelax` enabled. Since relaxation forces us to use the least compact range entry form (StartxEndx), we expect split DWARF to bring smaller improvements than usual.

For measurements, I used clang as a test program and collected results for debug builds (CMAKE_BUILD_TYPE=Debug) on my machine with the following configurations:

With options:
LLVM_USE_SPLIT_DWARF=ON and CMAKE_C_FLAGS, CMAKE_CXX_FLAGS with -gsplit-dwarf
Size: 1.15 GB
Linking time: 12m11s

Without options:
Size: 1.82 GB
Linking time: 13m26s

The results show that linking with `-gsplit-dwarf` takes roughly 9% less time than linking without fission (12m11s vs. 13m26s), and the resulting binary is roughly 37% smaller (1.15 GB vs. 1.82 GB).
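
For reference, the percentages follow directly from the figures quoted above. The snippet below is only a worked calculation; the constant and function names are made up for this example and the inputs are the numbers from this comment, not new measurements.

```cpp
#include <cstdio>

// Relative reduction of After with respect to Before, in percent.
constexpr double reductionPercent(double Before, double After) {
  return (Before - After) / Before * 100.0;
}

// Figures quoted in the comment above.
constexpr double LinkSecondsNoSplit = 13 * 60 + 26; // 13m26s
constexpr double LinkSecondsSplit = 12 * 60 + 11;   // 12m11s
constexpr double SizeGBNoSplit = 1.82;
constexpr double SizeGBSplit = 1.15;

// Prints both reductions with one decimal place.
void printReductions() {
  std::printf("link time:   %.1f%% less\n",
              reductionPercent(LinkSecondsNoSplit, LinkSecondsSplit));
  std::printf("binary size: %.1f%% smaller\n",
              reductionPercent(SizeGBNoSplit, SizeGBSplit));
}
```

This gives about 9.3% for link time and about 36.8% for binary size.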

Honestly, I'm not sure how well this benchmark reflects real-world scenarios. I might have missed some significant flags when configuring the clang debug build and I only ran these builds on my local machine, so the results might differ on other platforms. However, I believe the results still reflect the general trend.

@dlav-sc dlav-sc force-pushed the dlav-sc/dwarf_split branch from cd1ffe0 to bb526f6 Compare October 18, 2025 22:08
@dlav-sc
Contributor Author

dlav-sc commented Oct 18, 2025

@dwblaikie, @MaskRay if you have a minute, could you take a look, please?


template <
typename Ranges, typename PayloadEmitter,
std::enable_if_t<DwarfRangeListTraits<Ranges>::BaseAddressx, bool> = true>
Collaborator

Probably doesn't need the enable_if SFINAE? (emitRangeList isn't overloaded, right? So it can just fail to compile if someone passes in a range that doesn't have matching traits?)

& could you split this change out - do the trait-based change as NFC (but still send it for review) then do the relaxation stuff on top of that, would help isolate the changes?

bool llvm::isRangeRelaxable(const MCSymbol *Begin, const MCSymbol *End) {
assert(Begin && "Range without a begin symbol?");
assert(End && "Range without an end symbol?");
llvm::SmallVector<const MCFragment *> RangeFragments{};
Collaborator

I'm not quite following - why is this vector needed (rather than computing IsRelaxableRange via a single iteration over the fragments in the loop below)?
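
One possible shape of that single-pass version, sketched on a stand-in type so it compiles on its own. `FragmentStub` and `anyRelaxableBetween` are hypothetical names; the real code would walk `MCFragment` objects, and this sketch assumes, as the patch does, that the end fragment is reachable from the begin fragment.

```cpp
// Hypothetical stand-in for MCFragment: only the two members the walk needs.
struct FragmentStub {
  bool LinkerRelaxable = false;
  const FragmentStub *Next = nullptr;
};

// Returns true if any fragment from Begin through End (inclusive) is marked
// linker-relaxable. Stops at the end of the chain if End is never reached.
bool anyRelaxableBetween(const FragmentStub *Begin, const FragmentStub *End) {
  for (const FragmentStub *F = Begin; F; F = F->Next) {
    if (F->LinkerRelaxable)
      return true; // early exit, no temporary container needed
    if (F == End)
      break;
  }
  return false;
}
```

Compared with collecting the fragments first, this avoids the allocation and returns as soon as the first relaxable fragment is seen.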

@@ -0,0 +1,155 @@
; RUN: llc -dwarf-version=5 -split-dwarf-file=foo.dwo -O0 %s -mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
Collaborator

Could you generalize this test with multiple RUN lines to cover the two (DWARFv4, DWARFv5) or three (DWARFv4, DWARFv5, Split DWARFv5) configurations (or perhaps four, if you want to test Split DWARFv4 too) that are affected by this change?

I think we have an existing test somewhere that checks there are no relocations in .dwo files - maybe that's an assert, rather than a test, so we can't actually ever produce/observe a .dwo file with relocations? (if we have an existing test somewhere that actually dumps relocations - might be worth reusing that kind of testing here too in the Split DWARF case to validate there are no relocations in the .dwo file)

   bool BaseIsSet = false;
   for (const auto &P : SectionRanges) {
     auto *Base = CUBase;
-    if ((Asm->TM.getTargetTriple().isNVPTX() && DD.tuneForGDB())) {
+    if ((Asm->TM.getTargetTriple().isNVPTX() && DD.tuneForGDB()) ||
+        (DD.useSplitDwarf() && P.first->isLinkerRelaxable())) {
Collaborator

Might be worth separating these two changes too - one for DWARFv4 and one for DWARFv5?

Comment on lines +3280 to +3284
struct DebugLocSpanList {
MCSymbol *Label;
const DwarfCompileUnit *CU;
llvm::ArrayRef<llvm::DebugLocStream::Entry> Ranges;
};
Collaborator

Not sure that this structure's pulling its weight here - does it make much of a difference to stick with the original API, passing a few more parameters directly?

Comment on lines -836 to -838
def err_drv_riscv_unsupported_with_linker_relaxation : Error<
"%0 is unsupported with RISC-V linker relaxation (-mrelax)">;

Collaborator

The clang change will need to go in in a separate commit - splitting independent patches. Add the functionality to LLVM first, then enable its use in clang in a separate follow-up patch.

Comment on lines +496 to +501
if (DD->getDwarfVersion() >= 4 &&
(!isDwoUnit() || !llvm::isRangeRelaxable(Begin, End))) {
addLabelDelta(D, dwarf::DW_AT_high_pc, End, Begin);
return;
}
addLabelAddress(D, dwarf::DW_AT_high_pc, End);
Collaborator

This should probably be a separate patch too - you can do the uses of isRangeRelaxable in any order (& whichever's the first can have the isRangeRelaxable function, so it can be tested when it's added, etc), but should be separate with incremental test coverage being added, etc.

Though perhaps the Split DWARF test coverage (if we have an assert checking that relocations aren't used in .dwo files) can only be added at the end once all the functionality is in.
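
To make the condition in the quoted hunk easier to test in isolation, it can be restated as a small pure function. This is only a sketch of the decision, with hypothetical names (`HighPcForm`, `chooseHighPcForm`); the real code calls `addLabelDelta` or `addLabelAddress` on the DIE instead of returning an enum.

```cpp
enum class HighPcForm { Address, OffsetFromLowPc };

// Mirrors the condition in the hunk above: use the compact offset form from
// DWARF v4 on, except in a .dwo unit whose range may shrink at link time.
constexpr HighPcForm chooseHighPcForm(unsigned DwarfVersion, bool IsDwoUnit,
                                      bool RangeRelaxable) {
  if (DwarfVersion >= 4 && (!IsDwoUnit || !RangeRelaxable))
    return HighPcForm::OffsetFromLowPc;
  return HighPcForm::Address;
}
```

The only combination that changes relative to the old behaviour is DWARF v4 or later, in a `.dwo` unit, with a relaxable range: it now gets an address (indexed through `.debug_addr`) instead of a constant offset.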

// -gsplit-dwarf -mrelax requires DW_AT_high_pc/DW_AT_ranges/... indexing
// into .debug_addr, which is currently not implemented.
Arg *A;
if (getDebugFissionKind(D, Args, A) != DwarfFissionKind::None)
Collaborator

Huh, how'd RISCV survive even in non-split DWARF where high_pc would've been a constant relative to the low_pc without this patch/work?

Member

RISC-V has relocations for PC offsets in debug info. IIRC, R_RISCV_SET6 and R_RISCV_SUB6 would be placed as a pair on the relevant part of the debug info. There are variants of this for ulebs too, which came later.

Collaborator

OK, so this is only related to Split DWARF, good to understand.
Could you break it up into a few steps, though - the trait-based refactoring (if it's worth it), each piece of the DWARF (range list, loc list, attribute, whatever else), then the clang change.

4 participants