Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo][RemoveDIs] Handle a debug-info splicing corner case #73810

Closed
wants to merge 2 commits into from

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Nov 29, 2023

A large amount of complexity when it comes to shuffling DPValue objects around is pushed into BasicBlock::spliceDebugInfo, and it gets comprehensive testing there via the unit tests. It turns out that there's a corner case though: splicing instructions and debug-info to the end() iterator requires blocks of DPValues to be concatenated, but the DPValues don't behave normally as they're dangling at the end of a block. While this splicing-to-an-empty-block case is rare, and it's even rarer for it to contain debug-info, it does happen occasionally.

Fix this by wrapping spliceDebugInfo with an outer layer that removes any dangling DPValues in the destination block -- that way the main splicing function (renamed to spliceDebugInfoImpl) doesn't need to worry about that scenario. See the diagram in the added function for more info.

A large amount of complexity when it comes to shuffling DPValue objects
around is pushed into BasicBlock::spliceDebugInfo, and it gets
comprehensive testing there via the unit tests. It turns out that there's a
corner case though: splicing instructions and debug-info to the end()
iterator requires blocks of DPValues to be concatenated, but the DPValues
don't behave normally as they're dangling at the end of a block. While this
splicing-to-an-empty-block case is rare, and it's even rarer for it to
contain debug-info, it does happen occasionally.

Fix this by wrapping spliceDebugInfo with an outer layer that removes any
dangling DPValues in the destination block -- that way the main splicing
function (renamed to spliceDebugInfoImpl) doesn't need to worry about that
scenario. See the diagram in the added function for more info.
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 29, 2023

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

A large amount of complexity when it comes to shuffling DPValue objects around is pushed into BasicBlock::spliceDebugInfo, and it gets comprehensive testing there via the unit tests. It turns out that there's a corner case though: splicing instructions and debug-info to the end() iterator requires blocks of DPValues to be concatenated, but the DPValues don't behave normally as they're dangling at the end of a block. While this splicing-to-an-empty-block case is rare, and it's even rarer for it to contain debug-info, it does happen occasionally.

Fix this by wrapping spliceDebugInfo with an outer layer that removes any dangling DPValues in the destination block -- that way the main splicing function (renamed to spliceDebugInfoImpl) doesn't need to worry about that scenario. See the diagram in the added function for more info.


Full diff: https://github.com/llvm/llvm-project/pull/73810.diff

4 Files Affected:

  • (modified) llvm/include/llvm/IR/BasicBlock.h (+3)
  • (modified) llvm/include/llvm/IR/DebugProgramInstruction.h (+1)
  • (modified) llvm/lib/IR/BasicBlock.cpp (+84)
  • (modified) llvm/unittests/IR/BasicBlockDbgInfoTest.cpp (+144)
diff --git a/llvm/include/llvm/IR/BasicBlock.h b/llvm/include/llvm/IR/BasicBlock.h
index ec916acc25151c8..d40582e9c61691d 100644
--- a/llvm/include/llvm/IR/BasicBlock.h
+++ b/llvm/include/llvm/IR/BasicBlock.h
@@ -528,6 +528,9 @@ class BasicBlock final : public Value, // Basic blocks are data objects also
   void spliceDebugInfo(BasicBlock::iterator ToIt, BasicBlock *FromBB,
                        BasicBlock::iterator FromBeginIt,
                        BasicBlock::iterator FromEndIt);
+  void spliceDebugInfoImpl(BasicBlock::iterator ToIt, BasicBlock *FromBB,
+                           BasicBlock::iterator FromBeginIt,
+                           BasicBlock::iterator FromEndIt);
 
 public:
   /// Returns a pointer to the symbol table if one exists.
diff --git a/llvm/include/llvm/IR/DebugProgramInstruction.h b/llvm/include/llvm/IR/DebugProgramInstruction.h
index cfee2a87b75c7b5..2489a3cf7878ce1 100644
--- a/llvm/include/llvm/IR/DebugProgramInstruction.h
+++ b/llvm/include/llvm/IR/DebugProgramInstruction.h
@@ -286,6 +286,7 @@ class DPMarker {
   /// representation has been converted, and the ordering of DPValues is
   /// meaningful in the same was a dbg.values.
   simple_ilist<DPValue> StoredDPValues;
+  bool empty() const { return StoredDPValues.empty(); }
 
   const BasicBlock *getParent() const;
   BasicBlock *getParent();
diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index 6c08ca1efc65288..41996b3b3b1c7b4 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -829,6 +829,90 @@ void BasicBlock::spliceDebugInfoEmptyBlock(BasicBlock::iterator Dest,
 void BasicBlock::spliceDebugInfo(BasicBlock::iterator Dest, BasicBlock *Src,
                                  BasicBlock::iterator First,
                                  BasicBlock::iterator Last) {
+  /* Do a quick normalisation before calling the real splice implementation. We
+     might be operating on a degenerate basic block that has no instructions
+     in it, a legitimate transient state. In that case, Dest will be end() and
+     any DPValues temporarily stored in the TrailingDPValues map in LLVMContext.
+     We might illustrate it thus:
+
+                         Dest
+                           |
+     this-block:    ~~~~~~~~
+      Src-block             ++++B---B---B---B:::C
+                                |               |
+                               First           Last
+
+     However: does the caller expect the "~" DPValues to end up before or after
+     the spliced segment? This is communciated in the "Head" bit of Dest, which
+     signals whether the caller called begin() or end() on this block.
+
+     If the head bit is set, then all is well, we leave DPValues trailing just
+     like how dbg.value instructions would trail after instructions spliced to
+     the beginning of this block.
+
+     If the head bit isn't set, then try to jam the "~" DPValues onto the front
+     of the First instruction, then splice like normal, which joins the "~"
+     DPValues with the "+" DPValues. However if the "+" DPValues are supposed to
+     be left behind in Src, then:
+      * detach the "+" DPValues,
+      * move the "~" DPValues onto First,
+      * splice like normal,
+      * replace the "+" DPValues onto the Last position.
+     Complicated, but gets the job done. */
+
+  // If we're inserting at end(), and not in front of dangling DPValues, then
+  // move the DPValues onto "First". They'll then be moved naturally in the
+  // splice process.
+  DPMarker *MoreDanglingDPValues = nullptr;
+  DPMarker *OurTrailingDPValues = getTrailingDPValues();
+  if (Dest == end() && !Dest.getHeadBit() && OurTrailingDPValues) {
+    // Are the "+" DPValues not supposed to move? If so, detach them
+    // temporarily.
+    if (!First.getHeadBit() && Src->getMarker(First) &&
+        !Src->getMarker(First)->empty()) {
+      MoreDanglingDPValues = Src->getMarker(First);
+      MoreDanglingDPValues->removeFromParent();
+    }
+
+    if (DPMarker *CurMarker = Src->getMarker(First);
+        CurMarker && !CurMarker->empty()) {
+      // Place them at the front, it would look like this:
+      //            Dest
+      //              |
+      // this-block:
+      // Src-block: ~~~~~~~~++++B---B---B---B:::C
+      //                        |               |
+      //                       First           Last
+      CurMarker->absorbDebugValues(*OurTrailingDPValues, true);
+      delete OurTrailingDPValues;
+    } else {
+      // No current marker, create one and absorb in. (FIXME: we can avoid an
+      // allocation in the future).
+      CurMarker = Src->createMarker(&*First);
+      CurMarker->absorbDebugValues(*OurTrailingDPValues, false);
+      delete OurTrailingDPValues;
+    }
+    deleteTrailingDPValues();
+    First.setHeadBit(true);
+  }
+
+  // Call the main debug-info-splicing implementation.
+  spliceDebugInfoImpl(Dest, Src, First, Last);
+
+  // Do we have some "+" DPValues hanging around that weren't supposed to move,
+  // and we detached to make things easier?
+  if (!MoreDanglingDPValues)
+    return;
+
+  // FIXME: we could avoid an allocation here sometimes.
+  DPMarker *LastMarker = Src->createMarker(Last);
+  LastMarker->absorbDebugValues(*MoreDanglingDPValues, true);
+  MoreDanglingDPValues->eraseFromParent();
+}
+
+void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
+                                     BasicBlock::iterator First,
+                                     BasicBlock::iterator Last) {
   // Find out where to _place_ these dbg.values; if InsertAtHead is specified,
   // this will be at the start of Dest's debug value range, otherwise this is
   // just Dest's marker.
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index 481cd181d3848e7..ec8e2302a21730a 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -1110,5 +1110,149 @@ TEST(BasicBlockDbgInfoTest, DbgSpliceTrailing) {
   UseNewDbgInfoFormat = false;
 }
 
+// Similar to the above, what if we splice into an empty block with debug-info,
+// with debug-info at the start of the moving range, that we intend to be
+// transferred. The dbg.value of %a should remain at the start, but come ahead
+// of the i16 0 dbg.value.
+TEST(BasicBlockDbgInfoTest, DbgSpliceToEmpty1) {
+  LLVMContext C;
+  UseNewDbgInfoFormat = true;
+
+  std::unique_ptr<Module> M = parseIR(C, R"(
+    define i16 @f(i16 %a) !dbg !6 {
+    entry:
+      call void @llvm.dbg.value(metadata i16 %a, metadata !9, metadata !DIExpression()), !dbg !11
+      br label %exit
+
+    exit:
+      call void @llvm.dbg.value(metadata i16 0, metadata !9, metadata !DIExpression()), !dbg !11
+      %b = add i16 %a, 1, !dbg !11
+      call void @llvm.dbg.value(metadata i16 1, metadata !9, metadata !DIExpression()), !dbg !11
+      ret i16 0, !dbg !11
+    }
+    declare void @llvm.dbg.value(metadata, metadata, metadata) #0
+    attributes #0 = { nounwind readnone speculatable willreturn }
+
+    !llvm.dbg.cu = !{!0}
+    !llvm.module.flags = !{!5}
+
+    !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
+    !1 = !DIFile(filename: "t.ll", directory: "/")
+    !2 = !{}
+    !5 = !{i32 2, !"Debug Info Version", i32 3}
+    !6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !8)
+    !7 = !DISubroutineType(types: !2)
+    !8 = !{!9}
+    !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+    !10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned)
+    !11 = !DILocation(line: 1, column: 1, scope: !6)
+)");
+
+  Function &F = *M->getFunction("f");
+  BasicBlock &Entry = F.getEntryBlock();
+  BasicBlock &Exit = *Entry.getNextNode();
+  M->convertToNewDbgValues();
+
+  // Begin by forcing entry block to have dangling DPValue.
+  Entry.getTerminator()->eraseFromParent();
+  ASSERT_NE(Entry.getTrailingDPValues(), nullptr);
+  EXPECT_TRUE(Entry.empty());
+
+  // Now transfer the entire contents of the exit block into the entry. This
+  // includes both dbg.values.
+  Entry.splice(Entry.end(), &Exit, Exit.begin(), Exit.end());
+
+  // We should now have two dbg.values on the first instruction, and they
+  // should be in the correct order of %a, then 0.
+  Instruction *BInst = &*Entry.begin();
+  ASSERT_TRUE(BInst->hasDbgValues());
+  EXPECT_EQ(BInst->DbgMarker->StoredDPValues.size(), 2u);
+  SmallVector<DPValue *, 2> DPValues;
+  for (DPValue &DPV : BInst->getDbgValueRange())
+    DPValues.push_back(&DPV);
+
+  EXPECT_EQ(DPValues[0]->getVariableLocationOp(0), F.getArg(0));
+  Value *SecondDPVValue = DPValues[1]->getVariableLocationOp(0);
+  ASSERT_TRUE(isa<ConstantInt>(SecondDPVValue));
+  EXPECT_EQ(cast<ConstantInt>(SecondDPVValue)->getZExtValue(), 0ull);
+
+  // No trailing DPValues in the entry block now.
+  EXPECT_EQ(Entry.getTrailingDPValues(), nullptr);
+
+  UseNewDbgInfoFormat = false;
+}
+
+// Similar test again, but this time: splice the contents of exit into entry,
+// with the intention of leaving the first dbg.value (i16 0) behind.
+TEST(BasicBlockDbgInfoTest, DbgSpliceToEmpty2) {
+  LLVMContext C;
+  UseNewDbgInfoFormat = true;
+
+  std::unique_ptr<Module> M = parseIR(C, R"(
+    define i16 @f(i16 %a) !dbg !6 {
+    entry:
+      call void @llvm.dbg.value(metadata i16 %a, metadata !9, metadata !DIExpression()), !dbg !11
+      br label %exit
+
+    exit:
+      call void @llvm.dbg.value(metadata i16 0, metadata !9, metadata !DIExpression()), !dbg !11
+      %b = add i16 %a, 1, !dbg !11
+      call void @llvm.dbg.value(metadata i16 1, metadata !9, metadata !DIExpression()), !dbg !11
+      ret i16 0, !dbg !11
+    }
+    declare void @llvm.dbg.value(metadata, metadata, metadata) #0
+    attributes #0 = { nounwind readnone speculatable willreturn }
+
+    !llvm.dbg.cu = !{!0}
+    !llvm.module.flags = !{!5}
+
+    !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
+    !1 = !DIFile(filename: "t.ll", directory: "/")
+    !2 = !{}
+    !5 = !{i32 2, !"Debug Info Version", i32 3}
+    !6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !8)
+    !7 = !DISubroutineType(types: !2)
+    !8 = !{!9}
+    !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+    !10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned)
+    !11 = !DILocation(line: 1, column: 1, scope: !6)
+)");
+
+  Function &F = *M->getFunction("f");
+  BasicBlock &Entry = F.getEntryBlock();
+  BasicBlock &Exit = *Entry.getNextNode();
+  M->convertToNewDbgValues();
+
+  // Begin by forcing entry block to have dangling DPValue.
+  Entry.getTerminator()->eraseFromParent();
+  ASSERT_NE(Entry.getTrailingDPValues(), nullptr);
+  EXPECT_TRUE(Entry.empty());
+
+  // Now transfer into the entry block -- fetching the first instruction with
+  // begin and then calling getIterator clears the "head" bit, meaning that the
+  // range to move will not include any leading DPValues.
+  Entry.splice(Entry.end(), &Exit, Exit.begin()->getIterator(), Exit.end());
+
+  // We should now have one dbg.values on the first instruction, %a.
+  Instruction *BInst = &*Entry.begin();
+  ASSERT_TRUE(BInst->hasDbgValues());
+  EXPECT_EQ(BInst->DbgMarker->StoredDPValues.size(), 1u);
+  SmallVector<DPValue *, 2> DPValues;
+  for (DPValue &DPV : BInst->getDbgValueRange())
+    DPValues.push_back(&DPV);
+
+  EXPECT_EQ(DPValues[0]->getVariableLocationOp(0), F.getArg(0));
+  // No trailing DPValues in the entry block now.
+  EXPECT_EQ(Entry.getTrailingDPValues(), nullptr);
+
+  // We should have nothing left in the exit block...
+  EXPECT_TRUE(Exit.empty());
+  // ... except for some dangling DPValues.
+  EXPECT_NE(Exit.getTrailingDPValues(), nullptr);
+  EXPECT_FALSE(Exit.getTrailingDPValues()->empty());
+  Exit.deleteTrailingDPValues();
+
+  UseNewDbgInfoFormat = false;
+}
 } // End anonymous namespace.
 #endif // EXPERIMENTAL_DEBUGINFO_ITERATORS

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The enjoyable diagrams helped make the process clear (again), thanks for those!

Src-block ++++B---B---B---B:::C
| |
First Last

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Src-block: (added :)

Comment on lines 871 to 872
if (!First.getHeadBit() && Src->getMarker(First) &&
!Src->getMarker(First)->empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!First.getHeadBit() && Src->getMarker(First) &&
!Src->getMarker(First)->empty()) {
if (!First.getHeadBit() && Src->hadDbgValues()) {

Comment on lines 877 to 878
if (DPMarker *CurMarker = Src->getMarker(First);
CurMarker && !CurMarker->empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (DPMarker *CurMarker = Src->getMarker(First);
CurMarker && !CurMarker->empty()) {
if (Src->hasDbgValues()) {
DPMarker *CurMarker = Src->getMarker();

nit

// allocation in the future).
CurMarker = Src->createMarker(&*First);
CurMarker->absorbDebugValues(*OurTrailingDPValues, false);
delete OurTrailingDPValues;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason OurTrailingDPValues is deleted and MoreDanglingDPValues is cleaned up with eraseFromParent?

jmorse added a commit that referenced this pull request Dec 1, 2023
A large amount of complexity when it comes to shuffling DPValue objects
around is pushed into BasicBlock::spliceDebugInfo, and it gets
comprehensive testing there via the unit tests. It turns out that there's a
corner case though: splicing instructions and debug-info to the end()
iterator requires blocks of DPValues to be concatenated, but the DPValues
don't behave normally as they're dangling at the end of a block. While this
splicing-to-an-empty-block case is rare, and it's even rarer for it to
contain debug-info, it does happen occasionally.

Fix this by wrapping spliceDebugInfo with an outer layer that removes any
dangling DPValues in the destination block -- that way the main splicing
function (renamed to spliceDebugInfoImpl) doesn't need to worry about that
scenario. See the diagram in the added function for more info.
@jmorse
Copy link
Member Author

jmorse commented Dec 1, 2023

Followed feedback but merged in 37f2f48 to avoid mangling the merge conflict.

@jmorse jmorse closed this Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants