Skip to content

[llvm][DebugInfo] Encode DW_AT_object_pointer on method declarations with DW_FORM_implicit_const #124790

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Jan 28, 2025

We started attaching DW_AT_object_pointers on method declarations in #122742. However, that caused the .debug_info section size to increase significantly (by around ~10% on some projects). This was mainly due to the large number of new DW_FORM_ref4 values. This patch tries to address that regression by changing the DW_FORM_ref4 to a DW_FORM_implicit_const for declarations. The value of DW_FORM_implicit_const will be the index of the object parameter in the list of formal parameters of the subprogram (i.e., if the first DW_TAG_formal_parameter is the object pointer, the DW_FORM_implicit_const would be 0). The DWARFv5 spec only mentions the use of the reference attribute class to for DW_AT_object_pointer. So using a DW_FORM_impilicit_const would be an extension to (and not something mandated/specified by) the standard. Though it'd make sense to extend the wording in the spec to allow for this optimization.

That way we don't pay for the 4 byte references on every attribute occurrence. In a local build of clang this barely affected the .debug_info section size (but did increase .debug_abbrev by up to 10%, which doesn't impact the total debug-info size much however).

We guarded this on LLDB tuning (since using DW_FORM_implicit_const for this purpose may surprise consumers) and DWARFv5 (since that's where DW_FORM_implicit_const was first standardized).

…with DW_FORM_implicit_const

We started attaching `DW_AT_object_pointer`s on method declarations in llvm#122742. However, that caused the `.debug_info` section size to increase significantly (by around ~10% on some projects). This was mainly due to the large number of new `DW_FORM_ref4` values. This patch tries to address that regression by changing the `DW_FORM_ref4` to a `DW_FORM_implicit_const` for declarations. That way we don't pay for the 4 byte references on every attribute occurrence. In a local build of clang this barely affected the `.debug_info` section size (but did increase `.debug_abbrev` by up to 10%, which doesn't affect the total debug-info size much however).

We guarded this on LLDB tuning (since using `DW_FORM_implicit_const` for this purpose may surprise consumers) and DWARFv5 (since that's where `DW_FORM_implicit_const` was first standardized).
@Michael137 Michael137 force-pushed the clang/dwarf-object-pointer-size-improvement branch from be1b1d6 to de16264 Compare January 28, 2025 17:22
@dwblaikie
Copy link
Collaborator

A little more detail in the description explaining that this implicit_const would be the index of the parameter that is the object pointer would be good - and that this is an extension/not the way the spec says to use this attribute.

Even though we're only using this as an lldb extension, might be worth getting some buy-in from Sony (@jmorse) and maybe @adrian-prantl or @JDevlieghere could take a look, just so we're all on the same page. (@labath - any thoughts on this?)

But all that pending the further investigation of the cost you mentioned here: #122742 (comment)

I'll see about sending a revert cl in the mean time.

(also, if this impact isn't because of bugs, but just the direct implementation of object_pointer on declarations - it makes me wonder about other ways to reduce the size of member function declarations. Like usually the file name is the same as the class's file name, so could we have an implicit_const value (-1?) to specify that the member function's file is the same as the class's file (could generalize to "-N" for N scopes back if that was useful) - could be used on local variables or other entities where their filename matches their scope's filename
Also formal parameters - have a type or an abstract origin, both of which are 4 byte refs - maybe a generalized sleb128 relative DIE offset would be helpful (bit difficult to implement in LLVM's output, since the LEB128 encoding would depend on the size of other DWARF encodings, etc, and we currently compute that all up-front, but if we switched to label-based output, we could let the assembler handle that complexity & have more legible/hand-modifiable DWARF assembly, since it wouldn't contain so many hard coded constants))

@Michael137
Copy link
Member Author

A little more detail in the description explaining that this implicit_const would be the index of the parameter that is the object pointer would be good - and that this is an extension/not the way the spec says to use this attribute.

Yup will do

Even though we're only using this as an lldb extension, might be worth getting some buy-in from Sony (@jmorse) and maybe @adrian-prantl or @JDevlieghere could take a look, just so we're all on the same page. (@labath - any thoughts on this?)

But all that pending the further investigation of the cost you mentioned here: #122742 (comment)

I'll see about sending a revert cl in the mean time.

Thanks!

(also, if this impact isn't because of bugs, but just the direct implementation of object_pointer on declarations - it makes me wonder about other ways to reduce the size of member function declarations. Like usually the file name is the same as the class's file name, so could we have an implicit_const value (-1?) to specify that the member function's file is the same as the class's file (could generalize to "-N" for N scopes back if that was useful) - could be used on local variables or other entities where their filename matches their scope's filename Also formal parameters - have a type or an abstract origin, both of which are 4 byte refs - maybe a generalized sleb128 relative DIE offset would be helpful (bit difficult to implement in LLVM's output, since the LEB128 encoding would depend on the size of other DWARF encodings, etc, and we currently compute that all up-front, but if we switched to label-based output, we could let the assembler handle that complexity & have more legible/hand-modifiable DWARF assembly, since it wouldn't contain so many hard coded constants))

Good point, definitely something we could look into. Also we could apply the new DW_AT_implicit_const form to DW_AT_object_pointers on definitions (though i suspect the size benefit not be as pronounced, but for consistency might be nice).

dwblaikie added a commit to dwblaikie/llvm-project that referenced this pull request Jan 28, 2025
…rations (llvm#122742)"

This introduces a substantial (5-10%) regression in .debug_info size, so
we're discussing alternatives in llvm#122742 and llvm#124790.

This reverts commit 7c72941.
@labath
Copy link
Collaborator

labath commented Jan 29, 2025

@labath - any thoughts on this?

Maybe?

  • I am wondering how other debuggers (gdb mainly) will react to the new attribute. It would be great if "tuning" for lldb didn't completely break other debuggers as well
  • speaking of tuning, I believe we currently don't "tune" Google binaries for lldb, so if this is conditioned on -glldb and we still emit the long attribute for gdb, then this will not change the size difference you were seeing. We probably should start tuning for lldb, but I don't know what will all of that cause (in particular it could cause a size increase due to some extra attributes emitted this way, but I don't think the difference should be big). I don't think this is Michael's problem, I'm just saying...

@Michael137
Copy link
Member Author

Michael137 commented Jan 29, 2025

@labath - any thoughts on this?

Maybe?

  • I am wondering how other debuggers (gdb mainly) will react to the new attribute. It would be great if "tuning" for lldb didn't completely break other debuggers as well

Good point, will play around with gdb to see if this breaks them when tuning for lldb

  • speaking of tuning, I believe we currently don't "tune" Google binaries for lldb, so if this is conditioned on -glldb and we still emit the long attribute for gdb, then this will not change the size difference you were seeing.

The current patch would not emit the attribute anymore for non-lldb tuned binaries (which we weren't doing anyway up until very recently). So the size regression would be mitigated there also

We probably should start tuning for lldb, but I don't know what will all of that cause (in particular it could cause a size increase due to some extra attributes emitted this way, but I don't think the difference should be big). I don't think this is Michael's problem, I'm just saying...

Yea would be an interesting experiment

@labath
Copy link
Collaborator

labath commented Jan 29, 2025

The current patch would not emit the attribute anymore for non-lldb tuned binaries (which we weren't doing anyway up until very recently). So the size regression would be mitigated there also

But, in that case, lldb (or at least explicit object arguments) would not work with gdb tuning, right? (I assume you're doing this because you need that in lldb). We could say that you need to use -glldb to get these (it wouldn't be the first feature to require it, though it may be the most visible, once people start using these), but it is a bit of an odd choice to make given that gcc (which tunes for gdb "by default") does emit them and maybe gdb also depends on them.

@Michael137
Copy link
Member Author

Michael137 commented Jan 29, 2025

The current patch would not emit the attribute anymore for non-lldb tuned binaries (which we weren't doing anyway up until very recently). So the size regression would be mitigated there also

But, in that case, lldb (or at least explicit object arguments) would not work with gdb tuning, right? (I assume you're doing this because you need that in lldb).

True, the explicit object argument support wouldn't work for gdb tuned binaries compiled with Clang.

We could say that you need to use -glldb to get these (it wouldn't be the first feature to require it, though it may be the most visible, once people start using these), but it is a bit of an odd choice to make given that gcc (which tunes for gdb "by default") does emit them and maybe gdb also depends on them.

Yea that's why I wasn't too concerned with emitting them the way GCC has been doing. I guess since they've been emitting the attribute on declarations for a while it never came up as a size issue.

So AFAICT we're left with following options:

  1. Use DW_FORM_implicit_const only with lldb tuning
  2. Use DW_FORM_implicit_const regardless of tuning
  3. Find a different way to encode the parameter that won't be potentially disruptive to other debuggers?

I might be too optimistic but given the DWARF spec is permissive, I'd hope consumers are able to deal with our choice of DW_FORM_* here. E.g., from a brief glance, LLDB should continue working just fine even if the object-pointer form is not a reference (of course we'd have to teach it about the new form to actually make use of it, but at least it doesn't do anything unexpected). Are people strongly opposed to (2)?

@Michael137 Michael137 marked this pull request as ready for review January 29, 2025 10:09
@llvmbot
Copy link
Member

llvmbot commented Jan 29, 2025

@llvm/pr-subscribers-debuginfo

Author: Michael Buch (Michael137)

Changes

We started attaching DW_AT_object_pointers on method declarations in #122742. However, that caused the .debug_info section size to increase significantly (by around ~10% on some projects). This was mainly due to the large number of new DW_FORM_ref4 values. This patch tries to address that regression by changing the DW_FORM_ref4 to a DW_FORM_implicit_const for declarations. The value of DW_FORM_implicit_const will be the index of the object parameter in the list of formal parameters of the subprogram (i.e., if the first DW_TAG_formal_parameter is the object pointer, the DW_FORM_implicit_const would be 0). The DWARFv5 spec only mentions the use of the reference attribute class to for DW_AT_object_pointer. So using a DW_FORM_impilicit_const would be an extension to (and not something mandated/specified by) the standard. Though it'd make sense to extend the wording in the spec to allow for this optimization.

That way we don't pay for the 4 byte references on every attribute occurrence. In a local build of clang this barely affected the .debug_info section size (but did increase .debug_abbrev by up to 10%, which doesn't impact the total debug-info size much however).

We guarded this on LLDB tuning (since using DW_FORM_implicit_const for this purpose may surprise consumers) and DWARFv5 (since that's where DW_FORM_implicit_const was first standardized).


Patch is 230.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124790.diff

6 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+21-7)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h (+4-2)
  • (modified) llvm/test/DebugInfo/NVPTX/debug-info.ll (+2548-2578)
  • (added) llvm/test/DebugInfo/X86/DW_AT_object_pointer-non-standard-index.ll (+79)
  • (modified) llvm/test/DebugInfo/X86/DW_AT_object_pointer.ll (+15-7)
  • (modified) llvm/test/tools/llvm-dwarfdump/X86/statistics.ll (+3-3)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index d3450b8b0556fde..05bcaa9405e7dd7 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -849,9 +849,10 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIDerivedType *DTy) {
   }
 }
 
-DIE *DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
+std::optional<unsigned>
+DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
   // Args[0] is the return type.
-  DIE *ObjectPointer = nullptr;
+  std::optional<unsigned> ObjectPointerIndex;
   for (unsigned i = 1, N = Args.size(); i < N; ++i) {
     const DIType *Ty = Args[i];
     if (!Ty) {
@@ -863,13 +864,14 @@ DIE *DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
       if (Ty->isArtificial())
         addFlag(Arg, dwarf::DW_AT_artificial);
       if (Ty->isObjectPointer()) {
-        assert(!ObjectPointer && "Can't have more than one object pointer");
-        ObjectPointer = &Arg;
+        assert(!ObjectPointerIndex &&
+               "Can't have more than one object pointer");
+        ObjectPointerIndex = i;
       }
     }
   }
 
-  return ObjectPointer;
+  return ObjectPointerIndex;
 }
 
 void DwarfUnit::constructTypeDIE(DIE &Buffer, const DISubroutineType *CTy) {
@@ -1366,8 +1368,20 @@ void DwarfUnit::applySubprogramAttributes(const DISubprogram *SP, DIE &SPDie,
 
     // Add arguments. Do not add arguments for subprogram definition. They will
     // be handled while processing variables.
-    if (auto *ObjectPointer = constructSubprogramArguments(SPDie, Args))
-      addDIEEntry(SPDie, dwarf::DW_AT_object_pointer, *ObjectPointer);
+    //
+    // Encode the object pointer as an index instead of a DIE reference in order
+    // to minimize the affect on the .debug_info size.
+    if (std::optional<unsigned> ObjectPointerIndex =
+            constructSubprogramArguments(SPDie, Args)) {
+      if (getDwarfDebug().tuneForLLDB() &&
+          getDwarfDebug().getDwarfVersion() >= 5) {
+        // 0th index in Args is the return type, hence adjust by 1. In DWARF
+        // we want the first parameter to be at index 0.
+        assert(*ObjectPointerIndex > 0);
+        addSInt(SPDie, dwarf::DW_AT_object_pointer,
+                dwarf::DW_FORM_implicit_const, *ObjectPointerIndex - 1);
+      }
+    }
   }
 
   addThrownTypes(SPDie, SP->getThrownTypes());
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
index 7a5295d826a483b..afbe90899421045 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
@@ -269,8 +269,10 @@ class DwarfUnit : public DIEUnit {
 
   /// Construct function argument DIEs.
   ///
-  /// \returns DIE of the object pointer if one exists. Nullptr otherwise.
-  DIE *constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args);
+  /// \returns The index of the object parameter in \c Args if one exists.
+  /// Returns std::nullopt otherwise.
+  std::optional<unsigned> constructSubprogramArguments(DIE &Buffer,
+                                                       DITypeRefArray Args);
 
   /// Create a DIE with the given Tag, add the DIE to its parent, and
   /// call insertDIE if MD is not null.
diff --git a/llvm/test/DebugInfo/NVPTX/debug-info.ll b/llvm/test/DebugInfo/NVPTX/debug-info.ll
index 51fb692789e2265..62b30a1f15aff11 100644
--- a/llvm/test/DebugInfo/NVPTX/debug-info.ll
+++ b/llvm/test/DebugInfo/NVPTX/debug-info.ll
@@ -98,2584 +98,2554 @@ if.end:                                           ; preds = %if.then, %entry
 ; CHECK-DAG: .file {{[0-9]+}} "{{.*}}/usr/local/cuda/include{{/|\\\\}}vector_types.h"
 
 ; CHECK:	.section	.debug_loc
-; CHECK-NEXT:	{
-; CHECK-NEXT:$L__debug_loc0:
-; CHECK-NEXT:.b64 $L__tmp8
-; CHECK-NEXT:.b64 $L__tmp10
-; CHECK-NEXT:.b8 5                                   // Loc expr size
-; CHECK-NEXT:.b8 0
-; CHECK-NEXT:.b8 144                                 // DW_OP_regx
-; CHECK-NEXT:.b8 177                                 // 2450993
-; CHECK-NEXT:.b8 204                                 //
-; CHECK-NEXT:.b8 149                                 //
-; CHECK-NEXT:.b8 1                                   //
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:$L__debug_loc1:
-; CHECK-NEXT:.b64 $L__tmp5
-; CHECK-NEXT:.b64 $L__func_end0
-; CHECK-NEXT:.b8 5                                   // Loc expr size
-; CHECK-NEXT:.b8 0
-; CHECK-NEXT:.b8 144                                 // DW_OP_regx
-; CHECK-NEXT:.b8 177                                 // 2454065
-; CHECK-NEXT:.b8 228                                 //
-; CHECK-NEXT:.b8 149                                 //
-; CHECK-NEXT:.b8 1                                   //
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:	}
-; CHECK-NEXT:	.section	.debug_abbrev
-; CHECK-NEXT:	{
-; CHECK-NEXT:.b8 1                                   // Abbreviation Code
-; CHECK-NEXT:.b8 17                                  // DW_TAG_compile_unit
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 37                                  // DW_AT_producer
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 19                                  // DW_AT_language
-; CHECK-NEXT:.b8 5                                   // DW_FORM_data2
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 16                                  // DW_AT_stmt_list
-; CHECK-NEXT:.b8 6                                   // DW_FORM_data4
-; CHECK-NEXT:.b8 27                                  // DW_AT_comp_dir
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 2                                   // Abbreviation Code
-; CHECK-NEXT:.b8 19                                  // DW_TAG_structure_type
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 3                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 4                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 5                                   // Abbreviation Code
-; CHECK-NEXT:.b8 5                                   // DW_TAG_formal_parameter
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 52                                  // DW_AT_artificial
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 6                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 7                                   // Abbreviation Code
-; CHECK-NEXT:.b8 5                                   // DW_TAG_formal_parameter
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 8                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 9                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 10                                  // Abbreviation Code
-; CHECK-NEXT:.b8 36                                  // DW_TAG_base_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 62                                  // DW_AT_encoding
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 11                                  // Abbreviation Code
-; CHECK-NEXT:.b8 13                                  // DW_TAG_member
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 56                                  // DW_AT_data_member_location
-; CHECK-NEXT:.b8 10                                  // DW_FORM_block1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 12                                  // Abbreviation Code
-; CHECK-NEXT:.b8 15                                  // DW_TAG_pointer_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 13                                  // Abbreviation Code
-; CHECK-NEXT:.b8 38                                  // DW_TAG_const_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 14                                  // Abbreviation Code
-; CHECK-NEXT:.b8 16                                  // DW_TAG_reference_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 15                                  // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 71                                  // DW_AT_specification
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 32                                  // DW_AT_inline
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 16                                  // Abbreviation Code
-; CHECK-NEXT:.b8 19                                  // DW_TAG_structure_type
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 5                                   // DW_FORM_data2
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 17                                  // Abbreviat...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jan 29, 2025

@llvm/pr-subscribers-backend-nvptx

Author: Michael Buch (Michael137)

Changes

We started attaching DW_AT_object_pointers on method declarations in #122742. However, that caused the .debug_info section size to increase significantly (by around ~10% on some projects). This was mainly due to the large number of new DW_FORM_ref4 values. This patch tries to address that regression by changing the DW_FORM_ref4 to a DW_FORM_implicit_const for declarations. The value of DW_FORM_implicit_const will be the index of the object parameter in the list of formal parameters of the subprogram (i.e., if the first DW_TAG_formal_parameter is the object pointer, the DW_FORM_implicit_const would be 0). The DWARFv5 spec only mentions the use of the reference attribute class to for DW_AT_object_pointer. So using a DW_FORM_impilicit_const would be an extension to (and not something mandated/specified by) the standard. Though it'd make sense to extend the wording in the spec to allow for this optimization.

That way we don't pay for the 4 byte references on every attribute occurrence. In a local build of clang this barely affected the .debug_info section size (but did increase .debug_abbrev by up to 10%, which doesn't impact the total debug-info size much however).

We guarded this on LLDB tuning (since using DW_FORM_implicit_const for this purpose may surprise consumers) and DWARFv5 (since that's where DW_FORM_implicit_const was first standardized).


Patch is 230.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124790.diff

6 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+21-7)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h (+4-2)
  • (modified) llvm/test/DebugInfo/NVPTX/debug-info.ll (+2548-2578)
  • (added) llvm/test/DebugInfo/X86/DW_AT_object_pointer-non-standard-index.ll (+79)
  • (modified) llvm/test/DebugInfo/X86/DW_AT_object_pointer.ll (+15-7)
  • (modified) llvm/test/tools/llvm-dwarfdump/X86/statistics.ll (+3-3)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index d3450b8b0556fde..05bcaa9405e7dd7 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -849,9 +849,10 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIDerivedType *DTy) {
   }
 }
 
-DIE *DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
+std::optional<unsigned>
+DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
   // Args[0] is the return type.
-  DIE *ObjectPointer = nullptr;
+  std::optional<unsigned> ObjectPointerIndex;
   for (unsigned i = 1, N = Args.size(); i < N; ++i) {
     const DIType *Ty = Args[i];
     if (!Ty) {
@@ -863,13 +864,14 @@ DIE *DwarfUnit::constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args) {
       if (Ty->isArtificial())
         addFlag(Arg, dwarf::DW_AT_artificial);
       if (Ty->isObjectPointer()) {
-        assert(!ObjectPointer && "Can't have more than one object pointer");
-        ObjectPointer = &Arg;
+        assert(!ObjectPointerIndex &&
+               "Can't have more than one object pointer");
+        ObjectPointerIndex = i;
       }
     }
   }
 
-  return ObjectPointer;
+  return ObjectPointerIndex;
 }
 
 void DwarfUnit::constructTypeDIE(DIE &Buffer, const DISubroutineType *CTy) {
@@ -1366,8 +1368,20 @@ void DwarfUnit::applySubprogramAttributes(const DISubprogram *SP, DIE &SPDie,
 
     // Add arguments. Do not add arguments for subprogram definition. They will
     // be handled while processing variables.
-    if (auto *ObjectPointer = constructSubprogramArguments(SPDie, Args))
-      addDIEEntry(SPDie, dwarf::DW_AT_object_pointer, *ObjectPointer);
+    //
+    // Encode the object pointer as an index instead of a DIE reference in order
+    // to minimize the affect on the .debug_info size.
+    if (std::optional<unsigned> ObjectPointerIndex =
+            constructSubprogramArguments(SPDie, Args)) {
+      if (getDwarfDebug().tuneForLLDB() &&
+          getDwarfDebug().getDwarfVersion() >= 5) {
+        // 0th index in Args is the return type, hence adjust by 1. In DWARF
+        // we want the first parameter to be at index 0.
+        assert(*ObjectPointerIndex > 0);
+        addSInt(SPDie, dwarf::DW_AT_object_pointer,
+                dwarf::DW_FORM_implicit_const, *ObjectPointerIndex - 1);
+      }
+    }
   }
 
   addThrownTypes(SPDie, SP->getThrownTypes());
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
index 7a5295d826a483b..afbe90899421045 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
@@ -269,8 +269,10 @@ class DwarfUnit : public DIEUnit {
 
   /// Construct function argument DIEs.
   ///
-  /// \returns DIE of the object pointer if one exists. Nullptr otherwise.
-  DIE *constructSubprogramArguments(DIE &Buffer, DITypeRefArray Args);
+  /// \returns The index of the object parameter in \c Args if one exists.
+  /// Returns std::nullopt otherwise.
+  std::optional<unsigned> constructSubprogramArguments(DIE &Buffer,
+                                                       DITypeRefArray Args);
 
   /// Create a DIE with the given Tag, add the DIE to its parent, and
   /// call insertDIE if MD is not null.
diff --git a/llvm/test/DebugInfo/NVPTX/debug-info.ll b/llvm/test/DebugInfo/NVPTX/debug-info.ll
index 51fb692789e2265..62b30a1f15aff11 100644
--- a/llvm/test/DebugInfo/NVPTX/debug-info.ll
+++ b/llvm/test/DebugInfo/NVPTX/debug-info.ll
@@ -98,2584 +98,2554 @@ if.end:                                           ; preds = %if.then, %entry
 ; CHECK-DAG: .file {{[0-9]+}} "{{.*}}/usr/local/cuda/include{{/|\\\\}}vector_types.h"
 
 ; CHECK:	.section	.debug_loc
-; CHECK-NEXT:	{
-; CHECK-NEXT:$L__debug_loc0:
-; CHECK-NEXT:.b64 $L__tmp8
-; CHECK-NEXT:.b64 $L__tmp10
-; CHECK-NEXT:.b8 5                                   // Loc expr size
-; CHECK-NEXT:.b8 0
-; CHECK-NEXT:.b8 144                                 // DW_OP_regx
-; CHECK-NEXT:.b8 177                                 // 2450993
-; CHECK-NEXT:.b8 204                                 //
-; CHECK-NEXT:.b8 149                                 //
-; CHECK-NEXT:.b8 1                                   //
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:$L__debug_loc1:
-; CHECK-NEXT:.b64 $L__tmp5
-; CHECK-NEXT:.b64 $L__func_end0
-; CHECK-NEXT:.b8 5                                   // Loc expr size
-; CHECK-NEXT:.b8 0
-; CHECK-NEXT:.b8 144                                 // DW_OP_regx
-; CHECK-NEXT:.b8 177                                 // 2454065
-; CHECK-NEXT:.b8 228                                 //
-; CHECK-NEXT:.b8 149                                 //
-; CHECK-NEXT:.b8 1                                   //
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:.b64 0
-; CHECK-NEXT:	}
-; CHECK-NEXT:	.section	.debug_abbrev
-; CHECK-NEXT:	{
-; CHECK-NEXT:.b8 1                                   // Abbreviation Code
-; CHECK-NEXT:.b8 17                                  // DW_TAG_compile_unit
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 37                                  // DW_AT_producer
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 19                                  // DW_AT_language
-; CHECK-NEXT:.b8 5                                   // DW_FORM_data2
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 16                                  // DW_AT_stmt_list
-; CHECK-NEXT:.b8 6                                   // DW_FORM_data4
-; CHECK-NEXT:.b8 27                                  // DW_AT_comp_dir
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 2                                   // Abbreviation Code
-; CHECK-NEXT:.b8 19                                  // DW_TAG_structure_type
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 3                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 4                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 5                                   // Abbreviation Code
-; CHECK-NEXT:.b8 5                                   // DW_TAG_formal_parameter
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 52                                  // DW_AT_artificial
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 6                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 7                                   // Abbreviation Code
-; CHECK-NEXT:.b8 5                                   // DW_TAG_formal_parameter
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 8                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 9                                   // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 135                                 // DW_AT_MIPS_linkage_name
-; CHECK-NEXT:.b8 64
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 60                                  // DW_AT_declaration
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 100                                 // DW_AT_object_pointer
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 63                                  // DW_AT_external
-; CHECK-NEXT:.b8 12                                  // DW_FORM_flag
-; CHECK-NEXT:.b8 50                                  // DW_AT_accessibility
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 10                                  // Abbreviation Code
-; CHECK-NEXT:.b8 36                                  // DW_TAG_base_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 62                                  // DW_AT_encoding
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 11                                  // Abbreviation Code
-; CHECK-NEXT:.b8 13                                  // DW_TAG_member
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 56                                  // DW_AT_data_member_location
-; CHECK-NEXT:.b8 10                                  // DW_FORM_block1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 12                                  // Abbreviation Code
-; CHECK-NEXT:.b8 15                                  // DW_TAG_pointer_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 13                                  // Abbreviation Code
-; CHECK-NEXT:.b8 38                                  // DW_TAG_const_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 14                                  // Abbreviation Code
-; CHECK-NEXT:.b8 16                                  // DW_TAG_reference_type
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 73                                  // DW_AT_type
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 15                                  // Abbreviation Code
-; CHECK-NEXT:.b8 46                                  // DW_TAG_subprogram
-; CHECK-NEXT:.b8 0                                   // DW_CHILDREN_no
-; CHECK-NEXT:.b8 71                                  // DW_AT_specification
-; CHECK-NEXT:.b8 19                                  // DW_FORM_ref4
-; CHECK-NEXT:.b8 32                                  // DW_AT_inline
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 16                                  // Abbreviation Code
-; CHECK-NEXT:.b8 19                                  // DW_TAG_structure_type
-; CHECK-NEXT:.b8 1                                   // DW_CHILDREN_yes
-; CHECK-NEXT:.b8 3                                   // DW_AT_name
-; CHECK-NEXT:.b8 8                                   // DW_FORM_string
-; CHECK-NEXT:.b8 11                                  // DW_AT_byte_size
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 58                                  // DW_AT_decl_file
-; CHECK-NEXT:.b8 11                                  // DW_FORM_data1
-; CHECK-NEXT:.b8 59                                  // DW_AT_decl_line
-; CHECK-NEXT:.b8 5                                   // DW_FORM_data2
-; CHECK-NEXT:.b8 0                                   // EOM(1)
-; CHECK-NEXT:.b8 0                                   // EOM(2)
-; CHECK-NEXT:.b8 17                                  // Abbreviat...
[truncated]

@labath
Copy link
Collaborator

labath commented Jan 29, 2025

I wouldn't say I am opposed to that (I don't consider myself a decision maker here), but emitting a (non-standard!) form that gdb doesn't understand doesn't sound that much better than not emitting anything. It might be different if we went to gdb and asked them to implement it, but I don't think anyone's planning to do that.

I guess my preferred solution would be to emit the standard form for gdb tuning, and use the non-standard thing for lldb (assuming it doesn't break gdb catastrophically), and also start using lldb tuning for Google. I don't know how hard the last part would be, but I can try to find out.

Anyway, I think I've said enough now. Let's see what others have to say to all of that. :)

@Michael137
Copy link
Member Author

I wouldn't say I am opposed to that (I don't consider myself a decision maker here), but emitting a (non-standard!) form that gdb doesn't understand doesn't sound that much better than not emitting anything. It might be different if we went to gdb and asked them to implement it, but I don't think anyone's planning to do that.

Yea fair point. I'll have a look at what happens with GDB and report back. Just to clarify though, the form itself is standard, just the use of it for DW_AT_object_pointer isn't standard

I guess my preferred solution would be to emit the standard form for gdb tuning, and use the non-standard thing for lldb (assuming it doesn't break gdb catastrophically), and also start using lldb tuning for Google. I don't know how hard the last part would be, but I can try to find out.

Ah that seems like a good middle-ground! And could be done in separate steps (i.e., not emit anything under gdb tuning until google moves adopts -glldb in order to mitigate the size regression).

@Michael137
Copy link
Member Author

Michael137 commented Jan 29, 2025

So just tried with gdb and it does seem problematic:

(gdb) b method
Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0x61 [in module /home/michaelbuch/Git/patch-lldb.out]
(gdb) b method2
Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0x71 [in module /home/michaelbuch/Git/patch-lldb.out]

Where the source is:

struct Foo {
  int x = 5;
  void method(this Foo const volatile && self) {}
  void method2(int) {}
};

int main() {
  Foo{}.method();
  Foo{}.method2(5);
}

So yea, we'll have to keep it behind a tuning for now

alexfh pushed a commit that referenced this pull request Jan 29, 2025
…rations (#122742)" (#124853)

This introduces a substantial (5-10%) regression in .debug_info size, so
we're discussing alternatives in #122742 and #124790.

This reverts commit 7c72941.
@pogo59
Copy link
Collaborator

pogo59 commented Jan 29, 2025

So just tried with gdb and it does seem problematic:

Yeah gdb is evaluating the value of the attribute then assumes it's a reference, which seems natural. It's not being picky about forms. Using form (or actually, class) to distinguish different meanings of an attribute is something that happens in DWARF, but the difference in meaning isn't specified for this case (yet). The committee might or might not bless this, might prefer to use a different attribute for specifying a parameter index. You'd have to send in a proposal and see what happens.

@jmorse
Copy link
Member

jmorse commented Jan 29, 2025

This is now on my radar and I've prodded our debugger folks; as far as I'm aware we don't make use of this information and feel it's fine behind a tuning flag.

I'm not familiar with how off-piste from the spec LLVM has gone in the past; if we use this representation (DW_FORM_implicit_const with variable index) are we shutting down any future design space that the standard might take, due to the need for backwards compatibility? Independent of that, it seems a pretty reasonable workaround.

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Jan 30, 2025
…r functions

This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
llvm#124790).
Michael137 added a commit that referenced this pull request Jan 30, 2025
…r functions (#125053)

This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
#124790).

Made it a shell test because I couldn't find any SBAPIs that i could
query to find the CV-qualifiers/etc. of member functions.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 30, 2025
…bject member functions (#125053)

This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
llvm/llvm-project#124790).

Made it a shell test because I couldn't find any SBAPIs that i could
query to find the CV-qualifiers/etc. of member functions.
@dwblaikie
Copy link
Collaborator

re: producing the GCC-thing when gdb-tuning. I'd be hesitant to endorse that until someone has a specific gdb behavior/use case for this - it's possible GCC is producing something GDB doesn't need/use (there is precedent for this - view numbering, so far as I understand, is not implemented in gdb, for instance).

So, yeah, I'd be OK with the non-standard attr+form combo when tuning for lldb and not changing non-lldb tuning behavior.

If we wanted to be more cautious, we could use an LLVM extension attribute for this - though I'd guess it'd produce the same-ish GDB behavior (it'd spit some warning about not knowing what the attribute is - but there'd be less risk that way, since GDB wouldn't be trying to interpret the index as an offset, which could collide, but is unlikely to - because it's always going to be 0)

@dwblaikie
Copy link
Collaborator

For sure I don't feel /super/ great about this - but it's a pretty big regression that both debuggers for some information that both GDB and LLDB have been getting away without for some time.

@Michael137
Copy link
Member Author

re: producing the GCC-thing when gdb-tuning. I'd be hesitant to endorse that until someone has a specific gdb behavior/use case for this - it's possible GCC is producing something GDB doesn't need/use (there is precedent for this - view numbering, so far as I understand, is not implemented in gdb, for instance).

So, yeah, I'd be OK with the non-standard attr+form combo when tuning for lldb and not changing non-lldb tuning behavior.

If we wanted to be more cautious, we could use an LLVM extension attribute for this - though I'd guess it'd produce the same-ish GDB behavior (it'd spit some warning about not knowing what the attribute is - but there'd be less risk that way, since GDB wouldn't be trying to interpret the index as an offset, which could collide, but is unlikely to - because it's always going to be 0)

Happy to go either way. Though it'd be nice not to have to maintain an LLVM extension which will be made obsolete by one of the ongoing DWARF proposals re. encoding DW_AT_object_pointer (be it using index or DIE relative offsets). We can also wait for the DWARF committee discussions to settle since there is no rush with this. It'll be useful for codebases adopting explicit object parameters, but LLDB's support is blocked on a couple of other things, not just this patch.

@dwblaikie
Copy link
Collaborator

re: producing the GCC-thing when gdb-tuning. I'd be hesitant to endorse that until someone has a specific gdb behavior/use case for this - it's possible GCC is producing something GDB doesn't need/use (there is precedent for this - view numbering, so far as I understand, is not implemented in gdb, for instance).
So, yeah, I'd be OK with the non-standard attr+form combo when tuning for lldb and not changing non-lldb tuning behavior.

If we wanted to be more cautious, we could use an LLVM extension attribute for this - though I'd guess it'd produce the same-ish GDB behavior (it'd spit some warning about not knowing what the attribute is - but there'd be less risk that way, since GDB wouldn't be trying to interpret the index as an offset, which could collide, but is unlikely to - because it's always going to be 0)

Happy to go either way. Though it'd be nice not to have to maintain an LLVM extension which will be made obsolete by one of the ongoing DWARF proposals re. encoding DW_AT_object_pointer (be it using index or DIE relative offsets). We can also wait for the DWARF committee discussions to settle since there is no rush with this. It'll be useful for codebases adopting explicit object parameters, but LLDB's support is blocked on a couple of other things, not just this patch.

nod If you're not in too much of a rush, would be nice to see how it shakes out.

It's a bit of a stretch ask, but if you happened to have time/interest to gather the data I sort of volunteered to gather, that'd be handy - basically hacking up llvm-dwarfdump to gather data on the number of fixed-size DIEs (DIEs without LEBs, etc - where the size is known by the abbrev alone), how many of those would become non-fixed size if we used a LEB for DIE-to-DIE references (any form of class "reference"), and how many bytes would be saved by using DIE-to-DIE references (looking at the offsets of the referring DIE and the referent DIE, taking the difference and figuring out how long the LEB encoding for that difference is)... hopefully that'd help inform the design decision.

I do worry that the indexed approach is a bit too niche - expedient, probably fairly benign (because it's niche, doesn't have a large area of effect, at least) - but really only comes up because LLVM hasn't emitted this particular attribute in this particular place before. Otherwise we wouldn't be looking at this particular place as being especially problematic/needing a unique fix.

@pogo59
Copy link
Collaborator

pogo59 commented Feb 5, 2025

Just to throw out another thought on the fixed/variable point:

If LLVM used the fixed-size addr index forms, I'd expect a slight decrease in .debug_info size, because each byte of a fixed-size form can encode a larger range than the same number of bytes of LEB128. Offhand I don't recall typical index ranges of .debug_addr, which you could use to figure out whether it would matter.

I could see using addr1 for the small indexes and addrx for anything over 255. It would not reduce encoded size much, but might increase the number of fixed-size DIEs, making scans that took advantage of that more efficient.

@dwblaikie
Copy link
Collaborator

Just to throw out another thought on the fixed/variable point:

If LLVM used the fixed-size addr index forms, I'd expect a slight decrease in .debug_info size, because each byte of a fixed-size form can encode a larger range than the same number of bytes of LEB128. Offhand I don't recall typical index ranges of .debug_addr, which you could use to figure out whether it would matter.

I could see using addr1 for the small indexes and addrx for anything over 255. It would not reduce encoded size much, but might increase the number of fixed-size DIEs, making scans that took advantage of that more efficient.

Yeah - I mostly use that for evidence that maybe the performance benefit of fixed-size DIEs is overstated, since we've never bothered to fix/change the addrx encoding to use the fixed size.

(FWIW, I was curious - roughly 1400 addresses per address pool/object file is what I found in a debug build of clang, so, that means they fit in 2 ULEB128 bytes, but 128, or ~10% of the indexes are twice as large as they should be - assuming each index is used a similar number of times - that's a 5% increase in index encoding than needed (what % of .debug_info is addrx, I don't know, so not sure what the size cost is in more general terms))

@Michael137
Copy link
Member Author

Just looping back. Looks like the DWARF committee has accepted this use of DW_FORM_implicit_const in https://dwarfstd.org/issues/250130.1.html for DWARFv6. Can we go ahead with this?

// to minimize the affect on the .debug_info size.
if (std::optional<unsigned> ObjectPointerIndex =
constructSubprogramArguments(SPDie, Args)) {
if (getDwarfDebug().tuneForLLDB() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is spec conformant, I don't think we need to hide in LLDB tuning.

Copy link
Member Author

@Michael137 Michael137 Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GDB is currently not equipped with handling this unfortunately. See thread starting with #124790 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But let me try playing around with GDB again to remind myself whether it's an issue or not

Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need the LLDB tuning conditional.

@Michael137
Copy link
Member Author

Michael137 commented Jun 13, 2025

We probably don't need the LLDB tuning conditional.

Tried with gdb and without the tuning we'd get:

Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0x4d

when setting a breakpoint on a function with the new implicit object pointer encoding.

I'll wait for others to chime in before merging (will wait til monday)

@Michael137 Michael137 merged commit 711f6a8 into llvm:main Jun 16, 2025
10 of 11 checks passed
@Michael137 Michael137 deleted the clang/dwarf-object-pointer-size-improvement branch June 16, 2025 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants