
@dzhidzhoev
Member

@dzhidzhoev dzhidzhoev commented Nov 12, 2025

This fixes the issue reported in #166855 (comment) that had been revealed after #166855 was merged.

CodeGenFunction::GenerateVarArgsThunk creates thunks for vararg functions by cloning and modifying them.

According to https://reviews.llvm.org/D39396, CodeGenFunction::GenerateVarArgsThunk may be called before metadata nodes are resolved. So, it tries to avoid remapping DISubprogram and all metadata nodes it references inside CloneFunction() by manually cloning DISubprogram.

If the optimization level is not OptNone, the DILocalVariables of a function are saved in the retainedNodes field of its DISubprogram. When CodeGenFunction::GenerateVarArgsThunk clones such a DISubprogram without remapping, it produces a subprogram whose retained nodes are still scoped to the original subprogram. This triggers the Verifier checks added in #166855.

To solve that, the retained nodes list of the cloned DISubprogram is cleared.
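
As a rough illustration of the scoping problem described above, here is a toy C++ model. `Subprogram`, `LocalVariable`, and the three functions are hypothetical stand-ins invented for this sketch; they are not LLVM classes or APIs, and the check is only an approximation of what a verifier might enforce.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for debug-info metadata; these are NOT the LLVM classes.
struct Subprogram;

struct LocalVariable {
  std::string name;
  const Subprogram *scope; // the subprogram this variable belongs to
};

struct Subprogram {
  std::string linkageName;
  // Shared pointers, so a plain copy still points at the original's
  // variables, mimicking a clone made without remapping.
  std::vector<std::shared_ptr<LocalVariable>> retainedNodes;
};

// Approximates a scoping check: every retained node must be scoped to
// the subprogram that retains it.
bool retainedNodesAreWellScoped(const Subprogram &sp) {
  for (const auto &node : sp.retainedNodes)
    if (node->scope != &sp)
      return false;
  return true;
}

// Copies the retained list as-is; scopes keep pointing at the original.
Subprogram cloneWithoutRemapping(const Subprogram &original) {
  return original;
}

// The workaround modeled here: drop the retained nodes that still
// belong to the original subprogram.
Subprogram cloneAndClearRetainedNodes(const Subprogram &original) {
  Subprogram clone = cloneWithoutRemapping(original);
  clone.retainedNodes.clear();
  assert(retainedNodesAreWellScoped(clone)); // an empty list is trivially fine
  return clone;
}
```

In this model, a subprogram that retains a variable scoped to itself passes the check, its unmodified copy fails it, and the copy with a cleared list passes again, at the cost of no longer retaining anything.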

…bprogram

This fixes an issue reported in
llvm#166855 (comment),
that had been revealed after llvm#166855 was merged.

`CodeGenFunction::GenerateVarArgsThunk` creates thunks for vararg
functions by cloning and modifying the function. It is different
from `CodeGenFunction::generateThunk`, which is used for Itanium ABI.

According to https://reviews.llvm.org/D39396,
`CodeGenFunction::GenerateVarArgsThunk` may be called before metadata
nodes are resolved. So, it tries to avoid remapping DISubprogram and all
metadata nodes it references inside `CloneFunction()` by manually
cloning DISubprogram.

When optimization level is not OptNone, all local variables are saved in
DISubprogram's retainedNodes field. When `CodeGenFunction::GenerateVarArgsThunk`
clones such DISubprogram without remapping, it produces a subprogram
with incorrectly-scoped retained nodes. It triggers Verifier checks
added in llvm#166855.

To solve that, retained nodes list of a cloned DISubprogram is cleared.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Nov 12, 2025
@llvmbot
Member

llvmbot commented Nov 12, 2025

@llvm/pr-subscribers-clang-codegen

Author: Vladislav Dzhidzhoev (dzhidzhoev)

Changes

This fixes the issue reported in #166855 (comment) that had been revealed after #166855 was merged.

CodeGenFunction::GenerateVarArgsThunk creates thunks for vararg functions by cloning and modifying a function. It is different from CodeGenFunction::generateThunk, which is used for Itanium ABI.

According to https://reviews.llvm.org/D39396, CodeGenFunction::GenerateVarArgsThunk may be called before metadata nodes are resolved. So, it tries to avoid remapping DISubprogram and all metadata nodes it references inside CloneFunction() by manually cloning DISubprogram.

When optimization level is not OptNone, DILocalVariables for a function are saved in DISubprogram's retainedNodes field. When CodeGenFunction::GenerateVarArgsThunk clones such DISubprogram without remapping, it produces a subprogram with incorrectly-scoped retained nodes. It triggers Verifier checks added in #166855.

To solve that, retained nodes list of a cloned DISubprogram is cleared.


Full diff: https://github.com/llvm/llvm-project/pull/167758.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGVTables.cpp (+3)
  • (modified) clang/test/CodeGenCXX/tmp-md-nodes1.cpp (+16)
  • (modified) clang/test/CodeGenCXX/tmp-md-nodes2.cpp (+16)
diff --git a/clang/lib/CodeGen/CGVTables.cpp b/clang/lib/CodeGen/CGVTables.cpp
index e14e883a55ac5..736d1e4d2f1f5 100644
--- a/clang/lib/CodeGen/CGVTables.cpp
+++ b/clang/lib/CodeGen/CGVTables.cpp
@@ -125,6 +125,9 @@ static void resolveTopLevelMetadata(llvm::Function *Fn,
   if (!DIS)
     return;
   auto *NewDIS = llvm::MDNode::replaceWithDistinct(DIS->clone());
+  // As DISubprogram remapping is avoided, clear retained nodes list of
+  // cloned DISubprogram from retained nodes local to original DISubprogram.
+  NewDIS->replaceRetainedNodes(llvm::MDTuple::get(Fn->getContext(), {}));
   VMap.MD()[DIS].reset(NewDIS);
 
   // Find all llvm.dbg.declare intrinsics and resolve the DILocalVariable nodes
diff --git a/clang/test/CodeGenCXX/tmp-md-nodes1.cpp b/clang/test/CodeGenCXX/tmp-md-nodes1.cpp
index 524b2c08c1ad5..f39dca3edaed1 100644
--- a/clang/test/CodeGenCXX/tmp-md-nodes1.cpp
+++ b/clang/test/CodeGenCXX/tmp-md-nodes1.cpp
@@ -2,6 +2,14 @@
 // RUN: %clang_cc1 -O0 -triple %itanium_abi_triple -debug-info-kind=limited -emit-llvm %s -o - | \
 // RUN: FileCheck %s
 
+// Trigger GenerateVarArgsThunk.
+// RUN: %clang_cc1 -O0 -triple riscv64-linux-gnu -debug-info-kind=limited -emit-llvm %s -o - | \
+// RUN: FileCheck %s
+
+// Check that retainedNodes are properly maintained at function cloning.
+// RUN: %clang_cc1 -O1 -triple riscv64-linux-gnu -debug-info-kind=limited -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefixes=CHECK,CHECK-DI
+
 // This test simply checks that the varargs thunk is created. The failing test
 // case asserts.
 
@@ -16,3 +24,11 @@ struct CharlieImpl : Charlie, Alpha {
 } delta;
 
 // CHECK: define {{.*}} void @_ZThn{{[48]}}_N11CharlieImpl5bravoEz(
+
+// CHECK-DI: distinct !DISubprogram({{.*}}, linkageName: "_ZN11CharlieImpl5bravoEz", {{.*}}, retainedNodes: [[RN1:![0-9]+]]
+// A non-empty retainedNodes list of original DISubprogram.
+// CHECK-DI: [[RN1]] = !{!{{.*}}}
+
+// CHECK-DI: distinct !DISubprogram({{.*}}, linkageName: "_ZN11CharlieImpl5bravoEz", {{.*}}, retainedNodes: [[EMPTY:![0-9]+]]
+// An empty retainedNodes list of cloned DISubprogram.
+// CHECK-DI: [[EMPTY]] = !{}
diff --git a/clang/test/CodeGenCXX/tmp-md-nodes2.cpp b/clang/test/CodeGenCXX/tmp-md-nodes2.cpp
index 8500cf3c42393..0c323ae4f58aa 100644
--- a/clang/test/CodeGenCXX/tmp-md-nodes2.cpp
+++ b/clang/test/CodeGenCXX/tmp-md-nodes2.cpp
@@ -2,6 +2,14 @@
 // RUN: %clang_cc1 -O0 -triple %itanium_abi_triple -debug-info-kind=limited -emit-llvm %s -o - | \
 // RUN: FileCheck %s
 
+// Trigger GenerateVarArgsThunk.
+// RUN: %clang_cc1 -O0 -triple riscv64-linux-gnu -debug-info-kind=limited -emit-llvm %s -o - | \
+// RUN: FileCheck %s
+
+// Check that retainedNodes are properly maintained at function cloning.
+// RUN: %clang_cc1 -O1 -triple riscv64-linux-gnu -debug-info-kind=limited -emit-llvm %s -o - | \
+// RUN: FileCheck %s --check-prefixes=CHECK,CHECK-DI
+
 // This test simply checks that the varargs thunk is created. The failing test
 // case asserts.
 
@@ -31,3 +39,11 @@ BOOL CBdVfsImpl::ReqCacheHint( CMsgAgent* p_ma, CACHE_HINT hint, ... ) {
 }
 
 // CHECK: define {{.*}} @_ZThn{{[48]}}_N10CBdVfsImpl12ReqCacheHintEP9CMsgAgentN3CFs10CACHE_HINTEz(
+
+// An empty retainedNodes list of cloned DISubprogram.
+// CHECK-DI: [[EMPTY:![0-9]+]] = !{}
+// CHECK-DI: distinct !DISubprogram({{.*}}, linkageName: "_ZN10CBdVfsImpl12ReqCacheHintEP9CMsgAgentN3CFs10CACHE_HINTEz", {{.*}}, retainedNodes: [[RN1:![0-9]+]]
+// A non-empty retainedNodes list of original DISubprogram.
+// CHECK-DI: [[RN1]] = !{!{{.*}}}
+
+// CHECK-DI: distinct !DISubprogram({{.*}}, linkageName: "_ZN10CBdVfsImpl12ReqCacheHintEP9CMsgAgentN3CFs10CACHE_HINTEz", {{.*}}, retainedNodes: [[EMPTY]]

@dzhidzhoev dzhidzhoev changed the title [clang][DebugInfo] Clear retained nodes list of a vararg trunk's DISubprogram [clang][DebugInfo] Clear retained nodes list of vararg trunk's DISubprogram Nov 12, 2025
@llvm llvm deleted a comment from llvmbot Nov 12, 2025
@kongy
Collaborator

kongy commented Nov 13, 2025

Verified that this change fixes the breakage in #166855 (comment)

Member

@jmorse jmorse left a comment


Interesting -- what are the consequences of retainedNodes being cleared for these thunks? I see from the historic comments that it's not "just" a thunk, it's a full copy of the function in question.

It's interesting to see how past design decisions create obstacles now that the debug-info metadata is being refined to fix all the scoping issues. Should we consider the current design of GenerateVarArgsThunk to be technical debt in some way, that needs to be cleared up in the future? If so, the retained-nodes-clearing code should probably have a comment signalling this. If I understand correctly we're more or less forced into clearing retained nodes, rather than it being a deliberate implementation decision.

(IMO the code is fine, I just feel we should talk about the consequences of this direction a bit more).

@dzhidzhoev
Member Author

dzhidzhoev commented Nov 13, 2025

Interesting -- what are the consequences of retainedNodes being cleared for these thunks? I see from the historic comments that it's not "just" a thunk, it's a full copy of the function in question.

What CodeGenFunction::GenerateVarArgsThunk and related code do currently, is:

  1. A function, a DISubprogram, and local variables are emitted for a C++ function from a base class.
  2. A "thunk", which is a modified function clone, is created by CodeGenFunction::GenerateVarArgsThunk. It clones the original subprogram, and clones the variables encountered in #dbg_value nodes of the original function. But the cloned subprogram still references local variables of the original subprogram retained in its retainedNodes list.

Even if the retained nodes list of a subprogram does not contain local variables (as is the case with -O0), we emit the local variables for which we have location information through #dbg_value nodes.
So, to my understanding, we don't lose anything, compared to the current implementation, because currently retainedNodes list of a cloned DISubprogram contains entities of the original subprogram (and, as a result, they are emitted inside DW_TAG_subprogram for the original subprogram).
Compared to an option with proper remapping of cloned DISubprogram, we may lose some information about optimized-out variables, labels and imported entities.
I can add a comment about that.

It's interesting to see how past design decisions create obstacles now that the debug-info metadata is being refined to fix all the scoping issues. Should we consider the current design of GenerateVarArgsThunk to be technical debt in some way, that needs to be cleared up in the future?

It is called in a context where DIBuilder::finalize has not been called yet, so some nodes may indeed be unresolved when CloneFunction is called. I don't have an answer to the question of whether we can somehow emit thunks later.

@dwblaikie
Collaborator

Compared to an option with proper remapping of cloned DISubprogram, we may lose some information about optimized-out variables, labels and imported entities.

Right, this.

The retained list is populated for optimized builds because an optimized build might optimize away the instructions that reference the parameter variables - and if we lose those, then we may emit the wrong signature for the function & callers could get the ABI wrong in debugger expression evaluation.

Stripping the retained list is at least "better" than having things inside the function that have a scope that is some other function (we already have this constraint for debug locations - I'm surprised we didn't have it for variable locations?) - but still pretty broken/wrong (it doesn't produce invalid metadata (the current state does), but the transformation is invalid/lossy/unfortunate).

At least that's my understanding?
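
A minimal sketch of the trade-off discussed here, under a deliberately simplified assumption: a variable gets described for a function if a surviving debug record mentions it or if the subprogram retains it. `FunctionDebugInfo` and `describedVariables` are hypothetical names for this illustration only, and this is not how LLVM's DWARF emission is actually structured.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model, not LLVM's DWARF emitter.
struct FunctionDebugInfo {
  // Variables still mentioned by debug records after optimization.
  std::vector<std::string> debugRecordVariables;
  // Variables kept alive through the subprogram's retained list.
  std::vector<std::string> retainedNodes;
};

// A variable is described if either source mentions it.
std::set<std::string> describedVariables(const FunctionDebugInfo &fn) {
  std::set<std::string> result(fn.debugRecordVariables.begin(),
                               fn.debugRecordVariables.end());
  result.insert(fn.retainedNodes.begin(), fn.retainedNodes.end());
  return result;
}
```

For example, if one parameter keeps its debug record and another has been optimized out (which parameter that is in the real test case is an assumption), then a function with a populated retained list still describes both, while a clone with an empty retained list describes only the first. That is the lossy part of the transformation in this model.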

@dzhidzhoev
Member Author

Compared to an option with proper remapping of cloned DISubprogram, we may lose some information about optimized-out variables, labels and imported entities.

Right, this.

The retained list is populated for optimized builds because an optimized build might optimize away the instructions that reference the parameter variables - and if we lose those, then we may emit the wrong signature for the function & callers could get the ABI wrong in debugger expression evaluation.

Stripping the retained list is at least "better" than having things inside the function that have a scope that is some other function (we already have this constraint for debug locations - I'm surprised we didn't have it for variable locations?) - but still pretty broken/wrong (it doesn't produce invalid metadata (the current state does), but the transformation is invalid/lossy/unfortunate).

At least that's my understanding?

Yes, I agree. Missed that part.
If we look at clang/test/CodeGenCXX/tmp-md-nodes2.cpp with or without this fix, the DW_TAG_subprogram for the thunk is missing formal parameters (see 0x000000f9: DW_TAG_subprogram): https://godbolt.org/z/aPnMzMzWq.

@jmorse
Member

jmorse commented Nov 14, 2025

Overall I feel OK about this patch as it stands, mostly because solving a problem in one place (the overall function-local-types project) is creating problems in a smaller / rarer place. I feel we should have a plan for fixing GenerateVarArgsThunk though, to ensure we're not backing into a corner we can't get out of. From my reading of the previous tickets, particularly #33277, it sounds like this arrangement was implemented to avoid too much metadata cloning, and I wonder whether LLVM does that any better 8 years later. Or perhaps we can implement the originally-desired solution which was to defer when these functions are generated.

@wolfy1961 do you have any recollection of https://reviews.llvm.org/D39396 and whether the issue with cloning temporary debug-info might have changed in the meantime?

@dzhidzhoev
Member Author

I feel we should have a plan for fixing GenerateVarArgsThunk though, to ensure we're not backing into a corner we can't get out of.

I've opened an issue to track that: #168082.

Or perhaps we can implement the originally-desired solution which was to defer when these functions are generated.

I vote in favor of this option, considering that #165032 wants to track types in retainedNodes, and clang relies on temporary nodes when creating composite types. I think we may have more problems if we try to resolve everything referenced by a DISubprogram eagerly. But this is the view from LLVM's side 🙂

@dzhidzhoev
Member Author

dzhidzhoev commented Nov 14, 2025

The retained list is populated for optimized builds because an optimized build might optimize away the instructions that reference the parameter variables - and if we lose those, then we may emit the wrong signature for the function & callers could get the ABI wrong in debugger expression evaluation.

Interestingly, expr -i false -- (((CFs*)new CBdVfsImpl()))->ReqCacheHint(new CMsgAgent(), static_cast<CFs::CACHE_HINT>(0)) works well in lldb. And it jumps into the right thunk.
However, frame variable -l doesn't show optimized-out parameters from within the thunk, although it does show them inside the base function.

@wolfy1961
Collaborator

@wolfy1961 do you have any recollection of https://reviews.llvm.org/D39396 and whether the issue with cloning temporary debug-info might have changed in the meantime?

I'm not aware of any changes in this area but it's been a while since I did any work there, so my lack of knowledge may not mean much. Sorry I can't be of more help.

Member

@jmorse jmorse left a comment


LGTM, although we should try and fix GenerateVarArgsThunk someday soon

@dzhidzhoev dzhidzhoev merged commit 82ba3f5 into llvm:main Nov 17, 2025
10 checks passed

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo
