Skip to content

Conversation

bjope
Copy link
Collaborator

@bjope bjope commented Sep 8, 2025

Inliner/IROutliner/CodeExtractor all uses the
updateLoopMetadataDebugLocations helper in order to modify debug
location related to loop metadata. However, the helper has only
been updating DILocation nodes found as operands to the first level
of the MD_loop metadata. There could however be more DILocations
as part of the various kinds of followup metadata. A typical example
would be llvm.loop metadata like this

!6 = distinct !{!6, !7, !8, !9, !10, !11}
!7 = !DILocation(line: 6, column: 3, scope: !3)
!8 = !DILocation(line: 7, column: 22, scope: !3)
!11 = !{!"llvm.loop.distribute.followup_all", !7, !8, ..., !14}
!14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, ...}

Instead of just updating !7 and !8 in !6, this patch make sure that
we now recursively update the DILocations in !11 and !14 as well.

Fixes #141568

Adding test case showing that we do not update DILocations in
followup loop metadata when inlining.
…ocations

Inliner/IROutliner/CodeExtractor all uses the
updateLoopMetadataDebugLocations helper in order to modify debug
location related to loop metadata. However, the helper has only
been updating DILocation nodes found as operands to the first level
of the MD_loop metadata. There could however be more DILocations
as part of the various kinds of followup metadata. A typical example
would be llvm.loop metadata like this

  !6 = distinct !{!6, !7, !8, !9, !10, !11}
  !7 = !DILocation(line: 6, column: 3, scope: !3)
  !8 = !DILocation(line: 7, column: 22, scope: !3)
  !11 = !{!"llvm.loop.distribute.followup_all", !7, !8, ..., !14}
  !14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, ...}

Instead of just updating !7 and !8 in !6, this patch make sure that
we now recursively update the DILocations in !11 and !14 as well.

Fixes llvm#141568
@llvmbot
Copy link
Member

llvmbot commented Sep 8, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Björn Pettersson (bjope)

Changes

Inliner/IROutliner/CodeExtractor all uses the
updateLoopMetadataDebugLocations helper in order to modify debug
location related to loop metadata. However, the helper has only
been updating DILocation nodes found as operands to the first level
of the MD_loop metadata. There could however be more DILocations
as part of the various kinds of followup metadata. A typical example
would be llvm.loop metadata like this

!6 = distinct !{!6, !7, !8, !9, !10, !11}
!7 = !DILocation(line: 6, column: 3, scope: !3)
!8 = !DILocation(line: 7, column: 22, scope: !3)
!11 = !{!"llvm.loop.distribute.followup_all", !7, !8, ..., !14}
!14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, ...}

Instead of just updating !7 and !8 in !6, this patch make sure that
we now recursively update the DILocations in !11 and !14 as well.

Fixes #141568


Full diff: https://github.com/llvm/llvm-project/pull/157557.diff

2 Files Affected:

  • (modified) llvm/lib/IR/DebugInfo.cpp (+30-2)
  • (added) llvm/test/Transforms/Inline/dilocation-loop-metadata-update.ll (+83)
diff --git a/llvm/lib/IR/DebugInfo.cpp b/llvm/lib/IR/DebugInfo.cpp
index 166521a276643..2ce66d00b8ef9 100644
--- a/llvm/lib/IR/DebugInfo.cpp
+++ b/llvm/lib/IR/DebugInfo.cpp
@@ -375,6 +375,32 @@ bool DebugInfoFinder::addScope(DIScope *Scope) {
   return true;
 }
 
+// Recursively handle DILocations in followup metadata etc.
+static Metadata *updateLoopMetadataDebugLocationsRecursive(
+    Metadata *MetadataIn, function_ref<Metadata *(Metadata *)> Updater) {
+  const MDNode *M = dyn_cast_or_null<MDNode>(MetadataIn);
+  // The loop metadata options should start with a MDString.
+  if (!M || M->getNumOperands() < 1 || !isa<MDString>(M->getOperand(0)))
+    return nullptr;
+
+  bool Updated = false;
+  SmallVector<Metadata *, 4> MDs {M->getOperand(0)};
+  for (Metadata *MD : llvm::drop_begin(M->operands())) {
+    if (!MD) {
+      MDs.push_back(nullptr);
+    } else if (Metadata *NewMD = updateLoopMetadataDebugLocationsRecursive(
+                   MD, Updater)) {
+      MDs.push_back(NewMD);
+      Updated = true;
+    } else if (Metadata *NewMD = Updater(MD)) {
+      MDs.push_back(NewMD);
+      Updated |= NewMD != MD;
+    }
+  }
+
+  return Updated ? MDNode::get(M->getContext(), MDs) : nullptr;
+}
+
 static MDNode *updateLoopMetadataDebugLocationsImpl(
     MDNode *OrigLoopID, function_ref<Metadata *(Metadata *)> Updater) {
   assert(OrigLoopID && OrigLoopID->getNumOperands() > 0 &&
@@ -385,10 +411,12 @@ static MDNode *updateLoopMetadataDebugLocationsImpl(
   // Save space for the self-referential LoopID.
   SmallVector<Metadata *, 4> MDs = {nullptr};
 
-  for (unsigned i = 1; i < OrigLoopID->getNumOperands(); ++i) {
-    Metadata *MD = OrigLoopID->getOperand(i);
+  for (Metadata *MD : llvm::drop_begin(OrigLoopID->operands())) {
     if (!MD)
       MDs.push_back(nullptr);
+    else if (Metadata *NewMD = updateLoopMetadataDebugLocationsRecursive(
+                 MD, Updater))
+      MDs.push_back(NewMD);
     else if (Metadata *NewMD = Updater(MD))
       MDs.push_back(NewMD);
   }
diff --git a/llvm/test/Transforms/Inline/dilocation-loop-metadata-update.ll b/llvm/test/Transforms/Inline/dilocation-loop-metadata-update.ll
new file mode 100644
index 0000000000000..1bc132663331b
--- /dev/null
+++ b/llvm/test/Transforms/Inline/dilocation-loop-metadata-update.ll
@@ -0,0 +1,83 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=inline -S | FileCheck %s
+
+; When inlining we need to update DILocation recursively for the followup
+; metadata when updating llvm.loop metadata.
+
+define void @a() !dbg !3 {
+; CHECK-LABEL: define void @a(
+; CHECK-SAME: ) !dbg [[DBG3:![0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    br label %[[FOR_BODY:.*]]
+; CHECK:       [[FOR_BODY]]:
+; CHECK-NEXT:    br label %[[FOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
+;
+entry:
+  br label %for.body
+
+for.body:                                         ; preds = %for.body, %entry
+  br label %for.body, !llvm.loop !6
+}
+
+define void @f() !dbg !17 {
+; CHECK-LABEL: define void @f(
+; CHECK-SAME: ) !dbg [[DBG17:![0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    br label %[[A_EXIT:.*]]
+; CHECK:       [[A_EXIT]]:
+; CHECK-NEXT:    br label %[[A_EXIT]], !llvm.loop [[LOOP18:![0-9]+]]
+; CHECK:       [[A_EXIT1:.*:]]
+; CHECK-NEXT:    ret void
+;
+entry:
+  call void @a(), !dbg !18
+  ret void
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "foo.c", directory: "/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = distinct !DISubprogram(name: "a", scope: !1, file: !1, line: 3, type: !4, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!4 = !DISubroutineType(types: !5)
+!5 = !{}
+!6 = distinct !{!6, !7, !8, !9, !10, !11}
+!7 = !DILocation(line: 6, column: 3, scope: !3)
+!8 = !DILocation(line: 7, column: 22, scope: !3)
+!9 = !{!"llvm.loop.mustprogress"}
+!10 = !{!"llvm.loop.distribute.enable", i1 true}
+!11 = !{!"llvm.loop.distribute.followup_all", !7, !8, !9, !12, !13, !14}
+!12 = !{!"llvm.loop.vectorize.width", i32 8}
+!13 = !{!"llvm.loop.vectorize.enable", i1 true}
+!14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, !9, !15, !16}
+!15 = !{!"llvm.loop.isvectorized"}
+!16 = !{!"llvm.loop.unroll.count", i32 1}
+!17 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 9, type: !4, scopeLine: 9, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!18 = !DILocation(line: 9, column: 12, scope: !17)
+;.
+; CHECK: [[META0:![0-9]+]] = distinct !DICompileUnit(language: DW_LANG_C11, file: [[META1:![0-9]+]], producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, splitDebugInlining: false, nameTableKind: None)
+; CHECK: [[META1]] = !DIFile(filename: "{{.*}}foo.c", directory: {{.*}})
+; CHECK: [[DBG3]] = distinct !DISubprogram(name: "a", scope: [[META1]], file: [[META1]], line: 3, type: [[META4:![0-9]+]], scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: [[META0]])
+; CHECK: [[META4]] = !DISubroutineType(types: [[META5:![0-9]+]])
+; CHECK: [[META5]] = !{}
+; CHECK: [[LOOP6]] = distinct !{[[LOOP6]], [[META7:![0-9]+]], [[META8:![0-9]+]], [[META9:![0-9]+]], [[META10:![0-9]+]], [[META11:![0-9]+]]}
+; CHECK: [[META7]] = !DILocation(line: 6, column: 3, scope: [[DBG3]])
+; CHECK: [[META8]] = !DILocation(line: 7, column: 22, scope: [[DBG3]])
+; CHECK: [[META9]] = !{!"llvm.loop.mustprogress"}
+; CHECK: [[META10]] = !{!"llvm.loop.distribute.enable", i1 true}
+; CHECK: [[META11]] = !{!"llvm.loop.distribute.followup_all", [[META7]], [[META8]], [[META9]], [[META12:![0-9]+]], [[META13:![0-9]+]], [[META14:![0-9]+]]}
+; CHECK: [[META12]] = !{!"llvm.loop.vectorize.width", i32 8}
+; CHECK: [[META13]] = !{!"llvm.loop.vectorize.enable", i1 true}
+; CHECK: [[META14]] = !{!"llvm.loop.vectorize.followup_all", [[META7]], [[META8]], [[META9]], [[META15:![0-9]+]], [[META16:![0-9]+]]}
+; CHECK: [[META15]] = !{!"llvm.loop.isvectorized"}
+; CHECK: [[META16]] = !{!"llvm.loop.unroll.count", i32 1}
+; CHECK: [[DBG17]] = distinct !DISubprogram(name: "f", scope: [[META1]], file: [[META1]], line: 9, type: [[META4]], scopeLine: 9, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: [[META0]])
+; CHECK: [[LOOP18]] = distinct !{[[LOOP18]], [[META19:![0-9]+]], [[META21:![0-9]+]], [[META9]], [[META10]], [[META22:![0-9]+]]}
+; CHECK: [[META19]] = !DILocation(line: 6, column: 3, scope: [[DBG3]], inlinedAt: [[META20:![0-9]+]])
+; CHECK: [[META20]] = distinct !DILocation(line: 9, column: 12, scope: [[DBG17]])
+; CHECK: [[META21]] = !DILocation(line: 7, column: 22, scope: [[DBG3]], inlinedAt: [[META20]])
+; CHECK: [[META22]] = !{!"llvm.loop.distribute.followup_all", [[META19]], [[META21]], [[META9]], [[META12]], [[META13]], [[META23:![0-9]+]]}
+; CHECK: [[META23]] = !{!"llvm.loop.vectorize.followup_all", [[META19]], [[META21]], [[META9]], [[META15]], [[META16]]}
+;.

Copy link

github-actions bot commented Sep 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@bjope bjope requested a review from Meinersbur September 10, 2025 13:59
@bjope bjope requested review from jmorse, SLTozer and nikic September 20, 2025 20:27
@bjope
Copy link
Collaborator Author

bjope commented Sep 23, 2025

Ping! Looking for someone who wants to review this :-)

Copy link
Contributor

@SLTozer SLTozer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small question - is a "visited" list necessary to avoid updating the same metadata multiple times? It might not be an issue, or even something that already happens, but it does seem more likely that we could end up updating the same DILocation multiple times from your example.

Aside from that, and some inline questions, this LGTM.

// Recursively handle DILocations in followup metadata etc.
static Metadata *updateLoopMetadataDebugLocationsRecursive(
Metadata *MetadataIn, function_ref<Metadata *(Metadata *)> Updater) {
const MDNode *M = dyn_cast_or_null<MDNode>(MetadataIn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor detail, but I think this should probably be MDTuple rather than MDNode - we're already implicitly assuming an MDTuple when we choose to replace it with MDNode::get below, so better to make that explicit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest update of this PR.

}
}

return Updated ? MDNode::get(M->getContext(), MDs) : nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that there are no distinct loop metadata options? I can't see a reason they would exist, but may be worth adding an assert to that effect anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an assert in the latest update of this PR.

}
}

return Updated ? MDNode::get(M->getContext(), MDs) : nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Updated ? MDNode::get(M->getContext(), MDs) : nullptr;
return Updated ? MDNode::get(M->getContext(), MDs) : MetadataIn;

[suggestion] I would return the old MD if there is no change. The caller can still check whether it is the same, but is not required to.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've changed it to return the input node if not making changes, and then I could collapse the if-else logic a bit.

updateLoopMetadataDebugLocationsRecursive(MD, Updater)) {
MDs.push_back(NewMD);
Updated = true;
} else if (Metadata *NewMD = Updater(MD)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, Updater returning nullptr means to drop the node?

Updated should be set to true in that case since it has changed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find. This was fixed in the latest update of the PR.

Let updateLoopMetadataDebugLocationsRecursive return the input
node unless making changes.
@bjope
Copy link
Collaborator Author

bjope commented Sep 24, 2025

One small question - is a "visited" list necessary to avoid updating the same metadata multiple times? It might not be an issue, or even something that already happens, but it does seem more likely that we could end up updating the same DILocation multiple times from your example.

We may end up finding same DILocation multiple times, but that could happen also without the recursion. And I think we are expected to use the Updater callback for each.

I guess there could be a problem if there are self-references/cycles in the followup tuples, such as:

!11 = !{!"llvm.loop.distribute.followup_all", !7, !8, !9, !12, !13, !11}

or

!11 = !{!"llvm.loop.distribute.followup_all", !7, !8, !9, !12, !13, !14}
!14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, !9, !15, !16, !11}

I wonder if that is supported. And also not quite sure how to deal with it. We could perhaps detect it and use and assert instead of infinite recursion. Getting complicated if we want to track and fixup such cycles based on if there were any "Updates" or not.

@bjope bjope requested review from Meinersbur and SLTozer September 25, 2025 08:36
Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have never considered loop properties to be recursive. Recursive MDNodes cannot be uniqued1 and hence I would avoid creating them. Clang does not create them. I would not require handling those in this PR, it can be fixed when there is a use case for such metadata. But a TODO comment could be helpful for anyone encountering an endless recursion here. If you decide to add handling for such nodes, be sure to add a test case since otherwise it will bitrot.

And also not quite sure how to deal with it.

Add a visited DenseMap argument remembering the previous result of Updater. This of course assumes the Updater itself is not recursive. It's not worth maintaining the DenseMap just for an assert that gets removed in release build. If it occurs, a fatal error will occur anyway.

Footnotes

  1. This is why loop metadata has a recursive first element, it was meant to be a "LoopID" before MDNode supported distinct. Yet, it cannot be used as an identifier for a specific loop because various passes (e.g inline) clone entire loops without creating a new "LoopID" for each new loop. I'd rather understand it as a bag of properties.

Comment on lines 389 to 391
if (!MD)
MDs.push_back(nullptr);
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!MD)
MDs.push_back(nullptr);
else {
if (!MD) {
MDs.push_back(nullptr);
continue;
}

[style] Use Early Exits and continue to Simplify Code

@bjope
Copy link
Collaborator Author

bjope commented Sep 25, 2025

I have never considered loop properties to be recursive. Recursive MDNodes cannot be uniqued1 and hence I would avoid creating them. Clang does not create them. I would not require handling those in this PR, it can be fixed when there is a use case for such metadata. But a TODO comment could be helpful for anyone encountering an endless recursion here. If you decide to add handling for such nodes, be sure to add a test case since otherwise it will bitrot.

Thanks for the help and the review!

I've added a TODO for now. The pre-commit checks seems to be struggling a bit, but I think that will resolve itself after some rebasing against main...

@bjope bjope enabled auto-merge (squash) September 25, 2025 12:45
@bjope bjope merged commit 0956feb into llvm:main Sep 25, 2025
9 checks passed
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
…ocations (llvm#157557)

Inliner/IROutliner/CodeExtractor all uses the
updateLoopMetadataDebugLocations helper in order to modify debug
location related to loop metadata. However, the helper has only
been updating DILocation nodes found as operands to the first level
of the MD_loop metadata. There could however be more DILocations
as part of the various kinds of followup metadata. A typical example
would be llvm.loop metadata like this

  !6 = distinct !{!6, !7, !8, !9, !10, !11}
  !7 = !DILocation(line: 6, column: 3, scope: !3)
  !8 = !DILocation(line: 7, column: 22, scope: !3)
  !11 = !{!"llvm.loop.distribute.followup_all", !7, !8, ..., !14}
  !14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, ...}

Instead of just updating !7 and !8 in !6, this patch make sure that
we now recursively update the DILocations in !11 and !14 as well.

Fixes llvm#141568
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"!dbg attachment points at wrong subprogram for function" after InlinerPass
4 participants