Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DWARF] Add option to add linkage_names to call_origin declaration refs #89640

Merged
merged 2 commits into from
Apr 23, 2024

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Apr 22, 2024

If -mllvm -add-linkage-names-to-external-call-origins is true then add
DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
DW_AT_call_origin attributes that would otherwise be omitted.


A debugger may use DW_TAG_call_origin attributes to determine whether any
frames in a call stack are missing due to optimisations (e.g. tail calls).

For example, say a() calls b() tail-calls c(), and you stop in your debugger
in c():

The callstack looks like this:

#0 c() <- stopped in this frame
#1 a()

Looking "up" from c(), call site information can be found in a(). This includes
a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at
this call site was to b(), not c() where we are currently stopped. This
indicates b()'s frame has been lost due to optimisation (or is misleading due
to ICF).

This patch makes it easier for a debugger to check whether the referenced
DIE describes the target function or not, for example by comparing the referenced
function name to the current frame.


There's already an option to apply DW_AT_linkage_name in a targeted manner: -dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to abstract subprogram DIEs (this is default for SCE tuning).

The new flag shouldn't affect non-SCE-tuned behaviour whether it is enabled or not because the non-SCE-tuned behaviour is to always add linkage names to subprogram DIEs, AFAICT.

Enabling this feature alongside -dwarf-linkage-names=Abstract has minimal impact on size of CTMark RelWithDebInfo binaries. There's a geomean increase in size by < 0.1%.

If -mllvm -add-linkage-names-to-external-call-origins is true then add
DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
DW_AT_call_origin attributes that would otherwise be omitted.

A debugger may use DW_TAG_call_origin attributes to determine whether any
frames in a callstack are missing due to optimisations (e.g. tail calls).

For example, say a() calls b() tail-calls c(), and you stop in your
debugger in c():

The callstack looks like this:
    c()
    a()

Looking "up" from c(), call site information can be found in a(). This
includes a DW_AT_call_origin referencing b()'s subprogram DIE, which means
the call at this call site was to b(), not c() where we are currently stopped.
This indicates b()'s frame has been lost due to optimisation (tail call or
ICF).

This patch makes it easier for a debugger to check whether the referenced DIE
describes the target function or not, for example by comparing the referenced
function name to the current frame.
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 22, 2024

@llvm/pr-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

If -mllvm -add-linkage-names-to-external-call-origins is true then add
DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
DW_AT_call_origin attributes that would otherwise be omitted.


A debugger may use DW_TAG_call_origin attributes to determine whether any
frames in a call stack are missing due to optimisations (e.g. tail calls).

For example, say a() calls b() tail-calls c(), and you stop in your debugger
in c():

The callstack looks like this:

#<!-- -->0 c() &lt;- stopped in this frame
#<!-- -->1 a()

Looking "up" from c(), call site information can be found in a(). This includes
a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at
this call site was to b(), not c() where we are currently stopped. This
indicates b()'s frame has been lost due to optimisation (or is misleading due
to ICF).

This patch makes it easier for a debugger to check whether the referenced
DIE describes the target function or not, for example by comparing the referenced
function name to the current frame.


There's already an option to apply DW_AT_linkage_name in a targeted manner: -dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to abstract subprogram DIEs (this is default for SCE tuning).

The new flag shouldn't change anything upstream whether it is enabled or not because the non-SCE-tuned behaviour upstream is to always add linkage names to subprogram DIEs, AFAICT.

Enabling this feature alongside -dwarf-linkage-names=Abstract has minimal impact on size of CTMark RelWithDebInfo binaries. There's a geomean increase in size by < 0.1%.


Full diff: https://github.com/llvm/llvm-project/pull/89640.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+21)
  • (added) llvm/test/DebugInfo/X86/call-origin-linkage-names.ll (+96)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 14f2a363f9be61..49954bf10abd5b 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -35,6 +35,7 @@
 #include "llvm/Target/TargetLoweringObjectFile.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Target/TargetOptions.h"
+#include "llvm/Support/CommandLine.h"
 #include <iterator>
 #include <optional>
 #include <string>
@@ -42,6 +43,20 @@
 
 using namespace llvm;
 
+/// Query value using AddLinkageNamesToDeclCallOriginsForTuning.
+cl::opt<cl::boolOrDefault> AddLinkageNamesToDeclCallOrigins(
+    "add-linkage-names-to-declaration-call-origins", cl::Hidden,
+    cl::desc("Add DW_AT_linkage_name to function declaration DIEs "
+             "referenced by DW_AT_call_origin attributes. Enabled by default "
+             "for -gsce debugger tuning."));
+
+static bool AddLinkageNamesToDeclCallOriginsForTuning(const DwarfDebug *DD) {
+  bool EnabledByDefault = DD->tuneForSCE();
+  if (EnabledByDefault)
+    return AddLinkageNamesToDeclCallOrigins != cl::boolOrDefault::BOU_FALSE;
+  return AddLinkageNamesToDeclCallOrigins == cl::boolOrDefault::BOU_TRUE;
+}
+
 static dwarf::Tag GetCompileUnitType(UnitKind Kind, DwarfDebug *DW) {
 
   //  According to DWARF Debugging Information Format Version 5,
@@ -1260,6 +1275,12 @@ DIE &DwarfCompileUnit::constructCallSiteEntryDIE(DIE &ScopeDIE,
   } else {
     DIE *CalleeDIE = getOrCreateSubprogramDIE(CalleeSP);
     assert(CalleeDIE && "Could not create DIE for call site entry origin");
+    if (AddLinkageNamesToDeclCallOriginsForTuning(DD) &&
+        !CalleeSP->isDefinition() &&
+        !CalleeDIE->findAttribute(dwarf::DW_AT_linkage_name)) {
+      addLinkageName(*CalleeDIE, CalleeSP->getLinkageName());
+    }
+
     addDIEEntry(CallSiteDIE, getDwarf5OrGNUAttr(dwarf::DW_AT_call_origin),
                 *CalleeDIE);
   }
diff --git a/llvm/test/DebugInfo/X86/call-origin-linkage-names.ll b/llvm/test/DebugInfo/X86/call-origin-linkage-names.ll
new file mode 100644
index 00000000000000..c8ca3075c0ff7a
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/call-origin-linkage-names.ll
@@ -0,0 +1,96 @@
+; RUN: llc %s --filetype=obj -o - -dwarf-linkage-names=Abstract -add-linkage-names-to-declaration-call-origins=false \
+; RUN: | llvm-dwarfdump - | FileCheck %s --check-prefixes=COMMON,DISABLE --implicit-check-not=DW_AT_linkage_name
+; RUN: llc %s --filetype=obj -o - -dwarf-linkage-names=Abstract -add-linkage-names-to-declaration-call-origins=true \
+; RUN: | llvm-dwarfdump - | FileCheck %s --check-prefixes=COMMON,ENABLE --implicit-check-not=DW_AT_linkage_name
+
+;; Check that -add-linkage-names-to-declaration-call-origins controls whether
+;; linkage names are added to declarations referenced by DW_AT_call_origin
+;; attributes.
+;;
+;; $ cat test.cpp
+;; void a();
+;; __attribute__((optnone))
+;; void b() {}
+;; void c();
+;; extern "C" {
+;;   void d();
+;; }
+;;
+;; void e() {
+;;   a(); //< Reference declaration DIE (add linkage name).
+;;   b(); //< Reference definition DIE (don't add linkage name).
+;;   c(); //< Reference definition DIE (don't add linkage name).
+;;   d(); //< Reference declaration DIE, but there's no linkage name.
+;; }
+;;
+;; __attribute__((optnone))
+;; void c() {}
+;; $ clang++ -emit-llvm -S -O1 -g
+
+; COMMON:       DW_TAG_call_site
+; ENABLE-NEXT:    DW_AT_call_origin    (0x[[a:[a-z0-9]+]] "_Z1av")
+; DISABLE-NEXT:   DW_AT_call_origin    (0x[[a:[a-z0-9]+]] "a")
+; COMMON:       DW_TAG_call_site
+; COMMON-NEXT:    DW_AT_call_origin     (0x[[b:[a-z0-9]+]] "b")
+; COMMON:       DW_TAG_call_site
+; COMMON-NEXT:    DW_AT_call_origin     (0x[[c:[a-z0-9]+]] "c")
+; COMMON:       DW_TAG_call_site
+; COMMON-NEXT:    DW_AT_call_origin     (0x[[d:[a-z0-9]+]] "d")
+
+; COMMON: 0x[[a]]: DW_TAG_subprogram
+; COMMON:   DW_AT_name  ("a")
+; ENABLE:   DW_AT_linkage_name  ("_Z1av")
+; COMMON: 0x[[b]]: DW_TAG_subprogram
+; COMMON:   DW_AT_name  ("b")
+; COMMON: 0x[[c]]: DW_TAG_subprogram
+; COMMON:   DW_AT_name  ("c")
+; COMMON: 0x[[d]]: DW_TAG_subprogram
+; COMMON:   DW_AT_name  ("d")
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @_Z1ev() local_unnamed_addr !dbg !13 {
+entry:
+  tail call void @_Z1av(), !dbg !14
+  tail call void @_Z1bv(), !dbg !15
+  tail call void @_Z1cv(), !dbg !16
+  tail call void @d(), !dbg !17
+  ret void, !dbg !18
+}
+
+define dso_local void @_Z1bv() local_unnamed_addr !dbg !9 {
+entry:
+  ret void, !dbg !12
+}
+
+declare !dbg !19 void @_Z1av() local_unnamed_addr
+
+define dso_local void @_Z1cv() local_unnamed_addr !dbg !20 {
+entry:
+  ret void, !dbg !21
+}
+declare !dbg !22 void @d() local_unnamed_addr
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 19.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.cpp", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!8 = !{!"clang version 19.0.0"}
+!9 = distinct !DISubprogram(name: "b", linkageName: "_Z1bv", scope: !1, file: !1, line: 3, type: !10, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!10 = !DISubroutineType(types: !11)
+!11 = !{null}
+!12 = !DILocation(line: 3, column: 11, scope: !9)
+!13 = distinct !DISubprogram(name: "e", linkageName: "_Z1ev", scope: !1, file: !1, line: 9, type: !10, scopeLine: 9, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!14 = !DILocation(line: 10, column: 3, scope: !13)
+!15 = !DILocation(line: 11, column: 3, scope: !13)
+!16 = !DILocation(line: 12, column: 3, scope: !13)
+!17 = !DILocation(line: 13, column: 3, scope: !13)
+!18 = !DILocation(line: 14, column: 1, scope: !13)
+!19 = !DISubprogram(name: "a", linkageName: "_Z1av", scope: !1, file: !1, line: 1, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
+!20 = distinct !DISubprogram(name: "c", linkageName: "_Z1cv", scope: !1, file: !1, line: 17, type: !10, scopeLine: 17, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!21 = !DILocation(line: 17, column: 11, scope: !20)
+!22 = !DISubprogram(name: "d", scope: !1, file: !1, line: 6, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)

Copy link

github-actions bot commented Apr 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for our debugger's needs, but I'll hold off approving to give others a chance to look at it.

@@ -35,13 +35,28 @@
#include "llvm/Target/TargetLoweringObjectFile.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Target/TargetOptions.h"
#include "llvm/Support/CommandLine.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format reminds me this is not in correct order.

; RUN: | llvm-dwarfdump - | FileCheck %s --check-prefixes=COMMON,DISABLE --implicit-check-not=DW_AT_linkage_name
; RUN: llc %s --filetype=obj -o - -dwarf-linkage-names=Abstract -add-linkage-names-to-declaration-call-origins=true \
; RUN: | llvm-dwarfdump - | FileCheck %s --check-prefixes=COMMON,ENABLE --implicit-check-not=DW_AT_linkage_name

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want more RUN lines showing default varies with tuning? I'm inclined to think it's not super necessary but curious what other reviewers think.

Comment on lines +46 to +52
/// Query value using AddLinkageNamesToDeclCallOriginsForTuning.
cl::opt<cl::boolOrDefault> AddLinkageNamesToDeclCallOrigins(
"add-linkage-names-to-declaration-call-origins", cl::Hidden,
cl::desc("Add DW_AT_linkage_name to function declaration DIEs "
"referenced by DW_AT_call_origin attributes. Enabled by default "
"for -gsce debugger tuning."));

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we /generally/ try not to use/add raw cl::opts, and instead pass things through the driver/frontend flag handling? (I might be wrong, if there's substantial precedent, let me know)

& not sure how this sits with @pogo59's general premise that debugger tuning shouldn't be reachable only through the tuning flags, but should be independently accessible (does independently accessible mean supported in the driver flags, or is -Xclang/cc1/mllvm flags sufficient to meet that requirement?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW -dwarf-linkage-names is an -mllvm flag too (happy to move the flag around as needed though - I don't have a strong opinion on it)

@dwblaikie
Copy link
Collaborator

The new flag shouldn't change anything upstream whether it is enabled or not because the non-SCE-tuned behaviour upstream is to always add linkage names to subprogram DIEs, AFAICT.

By "upstream" you mean "anyone not using -gsce"?

And any chance you've measured the impact of not always adding linkage names to subprogram DIEs for SCE lately? Is that optimization still pulling its weight for you, or would it be possible to converge on the non-sce behavior and remove some special cases here, rather than adding a new one?

Copy link
Contributor Author

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks both for the reviews.

By "upstream" you mean "anyone not using -gsce"?

Ah, yes indeed. I'll edit the summary so that it makes more sense.

And any chance you've measured the impact of not always adding linkage names to subprogram DIEs for SCE lately? Is that optimization still pulling its weight for you, or would it be possible to converge on the non-sce behavior and remove some special cases here, rather than adding a new one?

I have - across some internal codebases and configurations I measured an increase of total linked binary size including debug info of between ~4% (-O0 builds) to as much as ~19% (thinLTO -O3 builds) when using -dwarf-linkage-names=All instead of =Abstract.

Comment on lines +46 to +52
/// Query value using AddLinkageNamesToDeclCallOriginsForTuning.
cl::opt<cl::boolOrDefault> AddLinkageNamesToDeclCallOrigins(
"add-linkage-names-to-declaration-call-origins", cl::Hidden,
cl::desc("Add DW_AT_linkage_name to function declaration DIEs "
"referenced by DW_AT_call_origin attributes. Enabled by default "
"for -gsce debugger tuning."));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW -dwarf-linkage-names is an -mllvm flag too (happy to move the flag around as needed though - I don't have a strong opinion on it)

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds ok, as good as how this stuff has been implemented historically at least

@pogo59
Copy link
Collaborator

pogo59 commented Apr 22, 2024

& not sure how this sits with @pogo59's general premise that debugger tuning shouldn't be reachable only through the tuning flags, but should be independently accessible (does independently accessible mean supported in the driver flags, or is -Xclang/cc1/mllvm flags sufficient to meet that requirement?)

What we don't want is

if (DD->tuneForLLDB()) {
  doTheLLDBThing();
}

Tuning sets the defaults for a bunch of behaviors, but each of the behaviors can be individually controlled outside of the tuning parameter. That is, we could decide to abandon the tuning idea entirely, but still get exactly the same behavior for each target, if we were willing to pass the extra umpteen flags. Whether these are driver, cc1, -mllvm, or whatever, doesn't matter.

@OCHyams OCHyams merged commit 0e44ffe into llvm:main Apr 23, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants