Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RemoveDIs][DebugInfo] Verifier and printing fixes for DPLabel #83242

Merged
merged 3 commits into from
Mar 4, 2024

Conversation

SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Feb 28, 2024

DPLabel, the RemoveDI version of llvm.dbg.label, was landed recently at the same time the RemoveDIs IR printing and verifier patches were landing. The patches were updated to not miscompile, but did not have full-featured and correct support for DPLabel built in; this patch makes the remaining set of changes to enable DPLabel support.

@SLTozer SLTozer self-assigned this Feb 28, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 28, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Stephen Tozer (SLTozer)

Changes

DPLabel, the RemoveDI version of llvm.dbg.label, was landed recently at the same time the RemoveDIs IR printing and verifier patches were landing. The patches were updated to not miscompile, but did not have full-featured and correct support for DPLabel built in; this patch makes the remaining set of changes to enable DPLabel support.


Full diff: https://github.com/llvm/llvm-project/pull/83242.diff

5 Files Affected:

  • (modified) llvm/include/llvm/IR/DebugProgramInstruction.h (+5-7)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+2)
  • (modified) llvm/lib/IR/DebugProgramInstruction.cpp (+11-1)
  • (modified) llvm/lib/IR/Verifier.cpp (+49-13)
  • (modified) llvm/test/DebugInfo/print-non-instruction-debug-info.ll (+2-2)
diff --git a/llvm/include/llvm/IR/DebugProgramInstruction.h b/llvm/include/llvm/IR/DebugProgramInstruction.h
index 2dd546ce709c76..3b35c7bd45631a 100644
--- a/llvm/include/llvm/IR/DebugProgramInstruction.h
+++ b/llvm/include/llvm/IR/DebugProgramInstruction.h
@@ -174,13 +174,10 @@ inline raw_ostream &operator<<(raw_ostream &OS, const DbgRecord &R) {
 /// llvm.dbg.label intrinsic.
 /// FIXME: Rename DbgLabelRecord when DPValue is renamed to DbgVariableRecord.
 class DPLabel : public DbgRecord {
-  DILabel *Label;
+  TrackingMDNodeRef Label;
 
 public:
-  DPLabel(DILabel *Label, DebugLoc DL)
-      : DbgRecord(LabelKind, DL), Label(Label) {
-    assert(Label && "Unexpected nullptr");
-  }
+  DPLabel(DILabel *Label, DebugLoc DL);
 
   DPLabel *clone() const;
   void print(raw_ostream &O, bool IsForDebug = false) const;
@@ -188,8 +185,9 @@ class DPLabel : public DbgRecord {
   DbgLabelInst *createDebugIntrinsic(Module *M,
                                      Instruction *InsertBefore) const;
 
-  void setLabel(DILabel *NewLabel) { Label = NewLabel; }
-  DILabel *getLabel() const { return Label; }
+  void setLabel(DILabel *NewLabel);
+  DILabel *getLabel() const;
+  MDNode *getRawLabel() const { return Label; };
 
   /// Support type inquiry through isa, cast, and dyn_cast.
   static bool classof(const DbgRecord *E) {
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 4e1e48b4ad4a35..9c209c7fb4d46e 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -4660,6 +4660,8 @@ void AssemblyWriter::printDPLabel(const DPLabel &Label) {
   auto WriterCtx = getContext();
   Out << "#dbg_label(";
   WriteAsOperandInternal(Out, Label.getLabel(), WriterCtx, true);
+  Out << ", ";
+  WriteAsOperandInternal(Out, Label.getDebugLoc(), WriterCtx, true);
   Out << ")";
 }
 
diff --git a/llvm/lib/IR/DebugProgramInstruction.cpp b/llvm/lib/IR/DebugProgramInstruction.cpp
index 3a8b94a87bbcf0..987f1c8f636e6a 100644
--- a/llvm/lib/IR/DebugProgramInstruction.cpp
+++ b/llvm/lib/IR/DebugProgramInstruction.cpp
@@ -123,6 +123,14 @@ DbgRecord::createDebugIntrinsic(Module *M, Instruction *InsertBefore) const {
   llvm_unreachable("unsupported DbgRecord kind");
 }
 
+DPLabel::DPLabel(DILabel *Label, DebugLoc DL)
+    : DbgRecord(LabelKind, DL), Label(Label) {
+  assert(Label && "Unexpected nullptr");
+}
+
+void DPLabel::setLabel(DILabel *NewLabel) { Label.reset(NewLabel); }
+DILabel *DPLabel::getLabel() const { return cast<DILabel>(Label); }
+
 DPValue *DPValue::createDPValue(Value *Location, DILocalVariable *DV,
                                 DIExpression *Expr, const DILocation *DI) {
   return new DPValue(ValueAsMetadata::get(Location), DV, Expr, DI,
@@ -337,7 +345,9 @@ DbgRecord *DbgRecord::clone() const {
 
 DPValue *DPValue::clone() const { return new DPValue(*this); }
 
-DPLabel *DPLabel::clone() const { return new DPLabel(Label, getDebugLoc()); }
+DPLabel *DPLabel::clone() const {
+  return new DPLabel(getLabel(), getDebugLoc());
+}
 
 DbgVariableIntrinsic *
 DPValue::createDebugIntrinsic(Module *M, Instruction *InsertBefore) const {
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 3741e5deaa4cd1..2a93e47fb9c7df 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -547,6 +547,7 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport {
 
   void visitTemplateParams(const MDNode &N, const Metadata &RawParams);
 
+  void visit(DPLabel &DPL);
   void visit(DPValue &DPV);
   // InstVisitor overrides...
   using InstVisitor<Verifier>::visit;
@@ -683,10 +684,17 @@ void Verifier::visitDbgRecords(Instruction &I) {
           &I);
   CheckDI(!isa<PHINode>(&I) || !I.hasDbgValues(),
           "PHI Node must not have any attached DbgRecords", &I);
-  for (DPValue &DPV : DPValue::filter(I.getDbgValueRange())) {
-    CheckDI(DPV.getMarker() == I.DbgMarker, "DbgRecord had invalid DbgMarker",
-            &I, &DPV);
-    visit(DPV);
+  for (DbgRecord &DR : I.getDbgValueRange()) {
+    CheckDI(DR.getMarker() == I.DbgMarker, "DbgRecord had invalid DbgMarker",
+            &I, &DR);
+    if (MDNode *N = DR.getDebugLoc().getAsMDNode()) {
+      CheckDI(isa<DILocation>(N), "invalid #dbg record location", &DR, N);
+      visitDILocation(*cast<DILocation>(N));
+    }
+    if (auto *DPV = dyn_cast<DPValue>(&DR))
+      visit(*DPV);
+    else if (auto *DPL = dyn_cast<DPLabel>(&DR))
+      visit(*DPL);
   }
 }
 
@@ -6187,6 +6195,34 @@ static DISubprogram *getSubprogram(Metadata *LocalScope) {
   return nullptr;
 }
 
+void Verifier::visit(DPLabel &DPL) {
+  CheckDI(isa<DILabel>(DPL.getRawLabel()),
+          "invalid #dbg_label intrinsic variable", &DPL, DPL.getRawLabel());
+
+  // Ignore broken !dbg attachments; they're checked elsewhere.
+  if (MDNode *N = DPL.getDebugLoc().getAsMDNode())
+    if (!isa<DILocation>(N))
+      return;
+
+  BasicBlock *BB = DPL.getParent();
+  Function *F = BB ? BB->getParent() : nullptr;
+
+  // The scopes for variables and !dbg attachments must agree.
+  DILabel *Label = DPL.getLabel();
+  DILocation *Loc = DPL.getDebugLoc();
+  Check(Loc, "#dbg_label record requires a !dbg attachment", &DPL, BB, F);
+
+  DISubprogram *LabelSP = getSubprogram(Label->getRawScope());
+  DISubprogram *LocSP = getSubprogram(Loc->getRawScope());
+  if (!LabelSP || !LocSP)
+    return;
+
+  CheckDI(LabelSP == LocSP,
+          "mismatched subprogram between #dbg_label label and !dbg attachment",
+          &DPL, BB, F, Label, Label->getScope()->getSubprogram(), Loc,
+          Loc->getScope()->getSubprogram());
+}
+
 void Verifier::visit(DPValue &DPV) {
   CheckDI(DPV.getType() == DPValue::LocationType::Value ||
               DPV.getType() == DPValue::LocationType::Declare ||
@@ -6223,16 +6259,20 @@ void Verifier::visit(DPValue &DPV) {
               "inst not in same function as #dbg_assign", I, &DPV);
   }
 
-  if (MDNode *N = DPV.getDebugLoc().getAsMDNode()) {
-    CheckDI(isa<DILocation>(N), "invalid #dbg record location", &DPV, N);
-    visitDILocation(*cast<DILocation>(N));
-  }
+  // This check is redundant with one in visitLocalVariable().
+  DILocalVariable *Var = DPV.getVariable();
+  CheckDI(isType(Var->getRawType()), "invalid type ref", Var,
+          Var->getRawType());
+
+  // Ignore broken !dbg attachments; they're checked elsewhere.
+  if (MDNode *N = DPV.getDebugLoc().getAsMDNode())
+    if (!isa<DILocation>(N))
+      return;
 
   BasicBlock *BB = DPV.getParent();
   Function *F = BB ? BB->getParent() : nullptr;
 
   // The scopes for variables and !dbg attachments must agree.
-  DILocalVariable *Var = DPV.getVariable();
   DILocation *Loc = DPV.getDebugLoc();
   CheckDI(Loc, "missing #dbg record DILocation", &DPV, BB, F);
 
@@ -6245,10 +6285,6 @@ void Verifier::visit(DPValue &DPV) {
           "mismatched subprogram between #dbg record variable and DILocation",
           &DPV, BB, F, Var, Var->getScope()->getSubprogram(), Loc,
           Loc->getScope()->getSubprogram());
-
-  // This check is redundant with one in visitLocalVariable().
-  CheckDI(isType(Var->getRawType()), "invalid type ref", Var,
-          Var->getRawType());
 }
 
 void Verifier::visitVPIntrinsic(VPIntrinsic &VPI) {
diff --git a/llvm/test/DebugInfo/print-non-instruction-debug-info.ll b/llvm/test/DebugInfo/print-non-instruction-debug-info.ll
index f8271df146fe96..2e765619fcb896 100644
--- a/llvm/test/DebugInfo/print-non-instruction-debug-info.ll
+++ b/llvm/test/DebugInfo/print-non-instruction-debug-info.ll
@@ -21,8 +21,8 @@
 ; CHECK-NEXT: {{^}}  %[[VAL_ADD:[0-9a-zA-Z]+]] = add i32 %[[VAL_A]], 5
 ; OLDDBG-NEXT: call void @llvm.dbg.value(metadata !DIArgList(i32 %[[VAL_A]], i32 %[[VAL_ADD]]), metadata ![[VAR_A]], metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus)), !dbg ![[LOC_3:[0-9]+]]
 ; NEWDBG-NEXT: {{^}}    #dbg_value(!DIArgList(i32 %[[VAL_A]], i32 %[[VAL_ADD]]), ![[VAR_A]], !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus), ![[LOC_3:[0-9]+]])
-; OLDDBG-NEXT: call void @llvm.dbg.label(metadata ![[LABEL_ID:[0-9]+]])
-; NEWDBG-NEXT: {{^}}    #dbg_label(![[LABEL_ID:[0-9]+]])
+; OLDDBG-NEXT: call void @llvm.dbg.label(metadata ![[LABEL_ID:[0-9]+]]), !dbg ![[LOC_3]]
+; NEWDBG-NEXT: {{^}}    #dbg_label(![[LABEL_ID:[0-9]+]], ![[LOC_3]])
 ; CHECK-NEXT: {{^}}  store i32 %[[VAL_ADD]]{{.+}}, !DIAssignID ![[ASSIGNID:[0-9]+]]
 ; OLDDBG-NEXT: call void @llvm.dbg.assign(metadata i32 %[[VAL_ADD]], metadata ![[VAR_B]], metadata !DIExpression(), metadata ![[ASSIGNID]], metadata ptr %[[VAL_B]], metadata !DIExpression()), !dbg ![[LOC_4:[0-9]+]]
 ; NEWDBG-NEXT: {{^}}    #dbg_assign(i32 %[[VAL_ADD]], ![[VAR_B]], !DIExpression(), ![[ASSIGNID]], ptr %[[VAL_B]], !DIExpression(), ![[LOC_4:[0-9]+]])

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, for full coverage you might want to add test coverage for the verifier errors too. I can't remember whether we've already done that for other RemoveDI records.

// The scopes for variables and !dbg attachments must agree.
DILabel *Label = DPL.getLabel();
DILocation *Loc = DPL.getDebugLoc();
Check(Loc, "#dbg_label record requires a !dbg attachment", &DPL, BB, F);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why this is Check and not CheckDI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply because the same check for llvm.dbg.label is also a Check - no reason we can't change that here, of course!

@SLTozer
Copy link
Contributor Author

SLTozer commented Feb 28, 2024

LGTM, for full coverage you might want to add test coverage for the verifier errors too. I can't remember whether we've already done that for other RemoveDI records.

We've not; with that said, right now all the verifier tests are performed by llvm-as, and it would be good to keep that in place - in particular because it would make all the testing automatic, i.e. no large addition of new files that are almost identical to the old files, just an immediate coverage switchover. Doing so requires llvm-as to support conversion to the new format however, with or without a flag - would that be suitable as a separate patch, or would it be better to do the test cloning for now (replacing llvm-as with opt -verify)?

@jmorse
Copy link
Member

jmorse commented Feb 28, 2024

Ooooooo. I'm going to say adding a try-experimental... flag to llvm-as is better as it's more fitting into what we've done in the past, and it's more grepable. It's not a particular strong opinion though.

@SLTozer
Copy link
Contributor Author

SLTozer commented Feb 28, 2024

Added various fixes to the verifier to make the parsing more permissive; I've copied the verifier tests, and they can be run with llvm-as, but only when parsing is enabled (the verifier first runs before we convert back at the end of parsing, so to verify the new format we have to read a file in the new format).

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - some questions and nits inline that are non-blocking


/// Construct from an \a MDNode.
///
/// Note: if \c N does not have the template type, a verifier check will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should N in the comment be Param?

/// A typed tracking MDNode reference that does not require a definition for its
/// parameter type. Necessary to avoid including DebugInfoMetadata.h, which has
/// a significant impact on compile times if included in this file.
template <typename T> class DbgRecordParamRef {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this tactic used anywhere else? If so, would it be worth putting in its own header / the TypedTrackingMDNodeRef header?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tactic is used for DebugLoc; that case specializes further for DILocations though, such that we can't trivially share a definition - at the very least we wouldn't deduplicate very much code. In principle I think this could be available in a broader header, //but// imo it should be done at the point we need it rather than preemptively.

CheckDI(!BB.getTrailingDPValues(), "Basic Block has trailing DbgRecords!",
&BB);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang-format overreaching here?

Copy link
Contributor Author

@SLTozer SLTozer Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, a hanging change from an intermediate revision - will remove!

@OCHyams
Copy link
Contributor

OCHyams commented Mar 4, 2024

Actually before landing, could you please summarise the testing strategy? It's not clear from the conversation above.

@SLTozer
Copy link
Contributor Author

SLTozer commented Mar 4, 2024

Actually before landing, could you please summarise the testing strategy? It's not clear from the conversation above.

We want to be able to test this patch with llvm-as, because that's how we prefer to test verifier changes. However, llvm-as won't currently verify any of the test cases in RemoveDI mode because it doesn't support RemoveDIs, but if the test is written in the RemoveDI format then the file will be verifier in RemoveDI mode after parsing; therefore, the testing for this patch is dependent on parser support, which comes afterwards. The two are more-or-less landing alongside each other, so I think it's reasonable to push the tests to a subsequent patch.

@SLTozer SLTozer merged commit b3c2c5a into llvm:main Mar 4, 2024
3 of 4 checks passed
@eaeltsin
Copy link
Contributor

Heads-up - we got a crash during LTO that reduced to this commit.

The error message is:

fragment covers entire variable
#dbg_value(i64 %37, !12949, !DIExpression(DW_OP_LLVM_fragment, 0, 64), !13072)
!12949 = !DILocalVariable(name: "foo", scope: !12950, file: !7650, line: 744, type: !553)
fragment is larger than or outside of variable
#dbg_value(i64 %43, !12949, !DIExpression(DW_OP_LLVM_fragment, 64, 64), !13072)
!12949 = !DILocalVariable(name: "foo", scope: !12950, file: !7650, line: 744, type: !553)
fatal error: error in backend: Broken module found, compilation aborted!

Stack trace fragment:

...
 #8 0x0000562ac005fc72 llvm::detail::PassModel<llvm::Module, llvm::VerifierPass, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)
 #9 0x0000562ac4a5aa55 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)
#10 0x0000562ac0209b48 llvm::lto::opt(llvm::lto::Config const&, llvm::TargetMachine*, unsigned int, llvm::Module&, bool, llvm::ModuleSummaryIndex*, llvm::ModuleSummaryIndex const*, std::__u::vector<unsigned char, std::__u::allocator<unsigned char>> const&)
#11 0x0000562ac020bfa5 llvm::lto::thinBackend(
...

We are working on a reduced reproducer, but I'm not sure how much time we need.

In the meantime, does the information above give you any hints? Any ideas how to fix this forward?

We definitely need this commit, because it fixes non-determinism. If we decide to revert, we need another fix for non-determinism instead.

@SLTozer
Copy link
Contributor Author

SLTozer commented Mar 11, 2024

I can't say with certainty from just that information, but this error could appear if there is an ODR violation. While you work on the reduced reproducer (which can be time consuming with LTO builds in my experience), can you verify whether, for the type of the variable that this error appears for, there are multiple non-identical class definitions with that type's name?

@eaeltsin
Copy link
Contributor

I can't say with certainty from just that information, but this error could appear if there is an ODR violation. While you work on the reduced reproducer (which can be time consuming with LTO builds in my experience), can you verify whether, for the type of the variable that this error appears for, there are multiple non-identical class definitions with that type's name?

Thanks for the hint, this was indeed an ODR violation!

Is it possible to report this as an error instead of crashing?

@pogo59
Copy link
Collaborator

pogo59 commented Mar 11, 2024

Is it possible to report this as an error instead of crashing?

I think so, but it would have to happen during the module loading stage of LTO. Currently the loader assumes that if it sees a second record with the same name, it's a duplicate and can be thrown away. ODR detection would instead need to do a deeper comparison to check whether it was a false duplicate.

IIRC @adrian-prantl pushed back on this the first time we ran into it, due to the increased expense of the checking for duplicates (which are likely to be common in LTO). I don't think we ever tried to measure the cost though. If it's high, we could possibly put it under an off-by-default option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants