Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo] Make DIArgList inherit from Metadata and always unique #72147

Merged
merged 1 commit into from
Nov 17, 2023

Conversation

SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Nov 13, 2023

This patch changes the DIArgList class's inheritance from MDNode to Metadata, ReplaceableMetadataImpl, and ensures that it is always unique, i.e. a distinct DIArgList should never be produced.

Currently a distinct DIArgList will never be emitted to bitcode or IR, but they could appear in-memory if a DIArgList had one of its arguments RAUWd such that the contents of the DIArgList were identical to an already-existing DIArgList. Adding support for reuniquing DIArgLists in these cases caused some significant performance regressions (~3%); these are resolved by changing DIArgList to no longer inherit from MDNode.

Conceptually, having DIArgList inherit from MDNode does not make a lot of sense - part of the initial reason for the relationship was to make use of MDNode's RAUW support, but in the end the number of exceptions and special cases for DIArgLists makes it probably not worth the shared logic. Outside of the RAUW logic, DIArgLists don't make use of any features of MDNodes, so it makes sense to simply disentangle them.

This should not result in any changes to IR or bitcode; the parsing and printing format for DIArgList is unchanged, and the order in which it appears should also be identical. As a minor note, this patch also fixes a gap in the verifier, where the ValueAsMetadata operands to a DIArgList would not be visited.

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 13, 2023

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Stephen Tozer (SLTozer)

Changes

This patch changes the DIArgList class's inheritance from MDNode to Metadata, ReplaceableMetadataImpl, and ensures that it is always unique, i.e. a distinct DIArgList should never be produced.

Currently a distinct DIArgList will never be emitted to bitcode or IR, but they could appear in-memory if a DIArgList had one of its arguments RAUWd such that the contents of the DIArgList were identical to an already-existing DIArgList. Adding support for reuniquing DIArgLists in these cases caused some significant performance regressions (~3%); these are resolved by changing DIArgList to no longer inherit from MDNode.

Conceptually, having DIArgList inherit from MDNode does not make a lot of sense - part of the initial reason for the relationship was to make use of MDNode's RAUW support, but in the end the number of exceptions and special cases for DIArgLists makes it probably not worth the shared logic. Outside of the RAUW logic, DIArgLists don't make use of any features of MDNodes, so it makes sense to simply disentangle them.

This should not result in any changes to IR or bitcode; the parsing and printing format for DIArgList is unchanged, and the order in which it appears should also be identical. As a minor note, this patch also fixes a gap in the verifier, where the ValueAsMetadata operands to a DIArgList would not be visited.


Patch is 22.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72147.diff

13 Files Affected:

  • (modified) llvm/include/llvm/AsmParser/LLParser.h (+1-2)
  • (modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+10-21)
  • (modified) llvm/include/llvm/IR/Metadata.def (+1-1)
  • (modified) llvm/include/llvm/IR/Metadata.h (+1-4)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+9-9)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+7-5)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+4-7)
  • (modified) llvm/lib/IR/DebugInfoMetadata.cpp (+27-35)
  • (modified) llvm/lib/IR/LLVMContextImpl.cpp (+5-8)
  • (modified) llvm/lib/IR/LLVMContextImpl.h (+35-3)
  • (modified) llvm/lib/IR/Metadata.cpp (+5-13)
  • (modified) llvm/lib/IR/TypeFinder.cpp (+5-8)
  • (modified) llvm/lib/IR/Verifier.cpp (+9-7)
diff --git a/llvm/include/llvm/AsmParser/LLParser.h b/llvm/include/llvm/AsmParser/LLParser.h
index eca908a24aac7b2..810f3668d05d449 100644
--- a/llvm/include/llvm/AsmParser/LLParser.h
+++ b/llvm/include/llvm/AsmParser/LLParser.h
@@ -548,6 +548,7 @@ namespace llvm {
     bool parseMetadataAsValue(Value *&V, PerFunctionState &PFS);
     bool parseValueAsMetadata(Metadata *&MD, const Twine &TypeMsg,
                               PerFunctionState *PFS);
+    bool parseDIArgList(Metadata *&MD, PerFunctionState *PFS);
     bool parseMetadata(Metadata *&MD, PerFunctionState *PFS);
     bool parseMDTuple(MDNode *&MD, bool IsDistinct = false);
     bool parseMDNode(MDNode *&N);
@@ -569,8 +570,6 @@ namespace llvm {
 #define HANDLE_SPECIALIZED_MDNODE_LEAF(CLASS)                                  \
   bool parse##CLASS(MDNode *&Result, bool IsDistinct);
 #include "llvm/IR/Metadata.def"
-    bool parseDIArgList(MDNode *&Result, bool IsDistinct,
-                        PerFunctionState *PFS);
 
     // Function Parsing.
     struct ArgInfo {
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 1fe054316b75c82..3a3630a4f5846ad 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3728,51 +3728,40 @@ class DIMacroFile : public DIMacroNode {
 
 /// List of ValueAsMetadata, to be used as an argument to a dbg.value
 /// intrinsic.
-class DIArgList : public MDNode {
+class DIArgList : public Metadata, ReplaceableMetadataImpl {
+  friend class ReplaceableMetadataImpl;
   friend class LLVMContextImpl;
-  friend class MDNode;
   using iterator = SmallVectorImpl<ValueAsMetadata *>::iterator;
 
   SmallVector<ValueAsMetadata *, 4> Args;
 
-  DIArgList(LLVMContext &C, StorageType Storage,
-            ArrayRef<ValueAsMetadata *> Args)
-      : MDNode(C, DIArgListKind, Storage, std::nullopt),
+  DIArgList(LLVMContext &Context, ArrayRef<ValueAsMetadata *> Args)
+      : Metadata(DIArgListKind, Uniqued), ReplaceableMetadataImpl(Context),
         Args(Args.begin(), Args.end()) {
     track();
   }
   ~DIArgList() { untrack(); }
 
-  static DIArgList *getImpl(LLVMContext &Context,
-                            ArrayRef<ValueAsMetadata *> Args,
-                            StorageType Storage, bool ShouldCreate = true);
-
-  TempDIArgList cloneImpl() const {
-    return getTemporary(getContext(), getArgs());
-  }
-
   void track();
   void untrack();
-  void dropAllReferences();
+  void dropAllReferences(bool Untrack);
 
 public:
-  DEFINE_MDNODE_GET(DIArgList, (ArrayRef<ValueAsMetadata *> Args), (Args))
-
-  TempDIArgList clone() const { return cloneImpl(); }
+  static DIArgList *get(LLVMContext &Context, ArrayRef<ValueAsMetadata *> Args);
 
   ArrayRef<ValueAsMetadata *> getArgs() const { return Args; }
 
   iterator args_begin() { return Args.begin(); }
   iterator args_end() { return Args.end(); }
 
-  ReplaceableMetadataImpl *getReplaceableUses() {
-    return Context.getReplaceableUses();
-  }
-
   static bool classof(const Metadata *MD) {
     return MD->getMetadataID() == DIArgListKind;
   }
 
+  SmallVector<DPValue *> getAllDPValueUsers() {
+    return ReplaceableMetadataImpl::getAllDPValueUsers();
+  }
+
   void handleChangedOperand(void *Ref, Metadata *New);
 };
 
diff --git a/llvm/include/llvm/IR/Metadata.def b/llvm/include/llvm/IR/Metadata.def
index 36c34c1d2347c08..a3cfb9ad6e3e785 100644
--- a/llvm/include/llvm/IR/Metadata.def
+++ b/llvm/include/llvm/IR/Metadata.def
@@ -77,6 +77,7 @@ HANDLE_METADATA_BRANCH(ValueAsMetadata)
 HANDLE_METADATA_LEAF(ConstantAsMetadata)
 HANDLE_METADATA_LEAF(LocalAsMetadata)
 HANDLE_METADATA_LEAF(DistinctMDOperandPlaceholder)
+HANDLE_METADATA_LEAF(DIArgList)
 HANDLE_MDNODE_BRANCH(MDNode)
 HANDLE_MDNODE_LEAF_UNIQUABLE(MDTuple)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DILocation)
@@ -115,7 +116,6 @@ HANDLE_SPECIALIZED_MDNODE_BRANCH(DIMacroNode)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DIMacro)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DIMacroFile)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DICommonBlock)
-HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DIArgList)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DIStringType)
 HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DIGenericSubrange)
 
diff --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index a245dabe086f045..4498423c4c460d9 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -1037,7 +1037,6 @@ struct TempMDNodeDeleter {
 class MDNode : public Metadata {
   friend class ReplaceableMetadataImpl;
   friend class LLVMContextImpl;
-  friend class DIArgList;
 
   /// The header that is coallocated with an MDNode along with its "small"
   /// operands. It is located immediately before the main body of the node.
@@ -1220,9 +1219,7 @@ class MDNode : public Metadata {
   bool isDistinct() const { return Storage == Distinct; }
   bool isTemporary() const { return Storage == Temporary; }
 
-  bool isReplaceable() const {
-    return isTemporary() || getMetadataID() == DIArgListKind;
-  }
+  bool isReplaceable() const { return isTemporary(); }
 
   /// RAUW a temporary.
   ///
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 9940bfb15d1979e..a188644ed058c10 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -5490,13 +5490,9 @@ bool LLParser::parseDIExpression(MDNode *&Result, bool IsDistinct) {
   return false;
 }
 
-bool LLParser::parseDIArgList(MDNode *&Result, bool IsDistinct) {
-  return parseDIArgList(Result, IsDistinct, nullptr);
-}
 /// ParseDIArgList:
 ///   ::= !DIArgList(i32 7, i64 %0)
-bool LLParser::parseDIArgList(MDNode *&Result, bool IsDistinct,
-                              PerFunctionState *PFS) {
+bool LLParser::parseDIArgList(Metadata *&MD, PerFunctionState *PFS) {
   assert(PFS && "Expected valid function state");
   assert(Lex.getKind() == lltok::MetadataVar && "Expected metadata type name");
   Lex.Lex();
@@ -5516,7 +5512,7 @@ bool LLParser::parseDIArgList(MDNode *&Result, bool IsDistinct,
   if (parseToken(lltok::rparen, "expected ')' here"))
     return true;
 
-  Result = GET_OR_DISTINCT(DIArgList, (Context, Args));
+  MD = DIArgList::get(Context, Args);
   return false;
 }
 
@@ -5630,13 +5626,17 @@ bool LLParser::parseValueAsMetadata(Metadata *&MD, const Twine &TypeMsg,
 ///  ::= !DILocation(...)
 bool LLParser::parseMetadata(Metadata *&MD, PerFunctionState *PFS) {
   if (Lex.getKind() == lltok::MetadataVar) {
-    MDNode *N;
     // DIArgLists are a special case, as they are a list of ValueAsMetadata and
     // so parsing this requires a Function State.
     if (Lex.getStrVal() == "DIArgList") {
-      if (parseDIArgList(N, false, PFS))
+      Metadata *AL;
+      if (parseDIArgList(AL, PFS))
         return true;
-    } else if (parseSpecializedMDNode(N)) {
+      MD = AL;
+      return false;
+    }
+    MDNode *N;
+    if (parseSpecializedMDNode(N)) {
       return true;
     }
     MD = N;
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index d16b5c7781c2413..9c21cc69179e555 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -336,8 +336,7 @@ class ModuleBitcodeWriter : public ModuleBitcodeWriterBase {
                     unsigned Abbrev);
   void writeDIMacroFile(const DIMacroFile *N, SmallVectorImpl<uint64_t> &Record,
                         unsigned Abbrev);
-  void writeDIArgList(const DIArgList *N, SmallVectorImpl<uint64_t> &Record,
-                      unsigned Abbrev);
+  void writeDIArgList(const DIArgList *N, SmallVectorImpl<uint64_t> &Record);
   void writeDIModule(const DIModule *N, SmallVectorImpl<uint64_t> &Record,
                      unsigned Abbrev);
   void writeDIAssignID(const DIAssignID *N, SmallVectorImpl<uint64_t> &Record,
@@ -1975,13 +1974,12 @@ void ModuleBitcodeWriter::writeDIMacroFile(const DIMacroFile *N,
 }
 
 void ModuleBitcodeWriter::writeDIArgList(const DIArgList *N,
-                                         SmallVectorImpl<uint64_t> &Record,
-                                         unsigned Abbrev) {
+                                         SmallVectorImpl<uint64_t> &Record) {
   Record.reserve(N->getArgs().size());
   for (ValueAsMetadata *MD : N->getArgs())
     Record.push_back(VE.getMetadataID(MD));
 
-  Stream.EmitRecord(bitc::METADATA_ARG_LIST, Record, Abbrev);
+  Stream.EmitRecord(bitc::METADATA_ARG_LIST, Record);
   Record.clear();
 }
 
@@ -2264,6 +2262,10 @@ void ModuleBitcodeWriter::writeMetadataRecords(
 #include "llvm/IR/Metadata.def"
       }
     }
+    if (auto *AL = dyn_cast<DIArgList>(MD)) {
+      writeDIArgList(AL, Record);
+      continue;
+    }
     writeValueAsMetadata(cast<ValueAsMetadata>(MD), Record);
   }
 }
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 6d66b34423949fb..a7d667d79880c1a 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -1265,9 +1265,8 @@ void SlotTracker::CreateFunctionSlot(const Value *V) {
 void SlotTracker::CreateMetadataSlot(const MDNode *N) {
   assert(N && "Can't insert a null Value into SlotTracker!");
 
-  // Don't make slots for DIExpressions or DIArgLists. We just print them inline
-  // everywhere.
-  if (isa<DIExpression>(N) || isa<DIArgList>(N))
+  // Don't make slots for DIExpressions. We just print them inline everywhere.
+  if (isa<DIExpression>(N))
     return;
 
   unsigned DestSlot = mdnNext;
@@ -3516,8 +3515,6 @@ void AssemblyWriter::printNamedMDNode(const NamedMDNode *NMD) {
     // Write DIExpressions inline.
     // FIXME: Ban DIExpressions in NamedMDNodes, they will serve no purpose.
     MDNode *Op = NMD->getOperand(i);
-    assert(!isa<DIArgList>(Op) &&
-           "DIArgLists should not appear in NamedMDNodes");
     if (auto *Expr = dyn_cast<DIExpression>(Op)) {
       writeDIExpression(Out, Expr, AsmWriterContext::getEmpty());
       continue;
@@ -4918,7 +4915,7 @@ static void printMetadataImplRec(raw_ostream &ROS, const Metadata &MD,
   WriteAsOperandInternal(OS, &MD, WriterCtx, /* FromValue */ true);
 
   auto *N = dyn_cast<MDNode>(&MD);
-  if (!N || isa<DIExpression>(MD) || isa<DIArgList>(MD))
+  if (!N || isa<DIExpression>(MD))
     return;
 
   OS << " = ";
@@ -4986,7 +4983,7 @@ static void printMetadataImpl(raw_ostream &ROS, const Metadata &MD,
   WriteAsOperandInternal(OS, &MD, *WriterCtx, /* FromValue */ true);
 
   auto *N = dyn_cast<MDNode>(&MD);
-  if (OnlyAsOperand || !N || isa<DIExpression>(MD) || isa<DIArgList>(MD))
+  if (OnlyAsOperand || !N || isa<DIExpression>(MD))
     return;
 
   OS << " = ";
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index 0e14ec90b51f1fb..caa55a10d29e878 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -2106,11 +2106,14 @@ DIMacroFile *DIMacroFile::getImpl(LLVMContext &Context, unsigned MIType,
   DEFINE_GETIMPL_STORE(DIMacroFile, (MIType, Line), Ops);
 }
 
-DIArgList *DIArgList::getImpl(LLVMContext &Context,
-                              ArrayRef<ValueAsMetadata *> Args,
-                              StorageType Storage, bool ShouldCreate) {
-  DEFINE_GETIMPL_LOOKUP(DIArgList, (Args));
-  DEFINE_GETIMPL_STORE_NO_OPS(DIArgList, (Args));
+DIArgList *DIArgList::get(LLVMContext &Context,
+                          ArrayRef<ValueAsMetadata *> Args) {
+  auto ExistingIt = Context.pImpl->DIArgLists.find_as(DIArgListKeyInfo(Args));
+  if (ExistingIt != Context.pImpl->DIArgLists.end())
+    return *ExistingIt;
+  DIArgList *NewArgList = new DIArgList(Context, Args);
+  Context.pImpl->DIArgLists.insert(NewArgList);
+  return NewArgList;
 }
 
 void DIArgList::handleChangedOperand(void *Ref, Metadata *New) {
@@ -2118,12 +2121,9 @@ void DIArgList::handleChangedOperand(void *Ref, Metadata *New) {
   assert((!New || isa<ValueAsMetadata>(New)) &&
          "DIArgList must be passed a ValueAsMetadata");
   untrack();
-  bool Uniq = isUniqued();
-  if (Uniq) {
-    // We need to update the uniqueness once the Args are updated since they
-    // form the key to the DIArgLists store.
-    eraseFromStore();
-  }
+  // We need to update the set storage once the Args are updated since they
+  // form the key to the DIArgLists store.
+  getContext().pImpl->DIArgLists.erase(this);
   ValueAsMetadata *NewVM = cast_or_null<ValueAsMetadata>(New);
   for (ValueAsMetadata *&VM : Args) {
     if (&VM == OldVMPtr) {
@@ -2133,28 +2133,19 @@ void DIArgList::handleChangedOperand(void *Ref, Metadata *New) {
         VM = ValueAsMetadata::get(PoisonValue::get(VM->getValue()->getType()));
     }
   }
-  if (Uniq) {
-    // In the RemoveDIs project (eliminating debug-info-intrinsics), DIArgLists
-    // can be referred to by DebugValueUser objects, which necessitates them
-    // being unique and replaceable metadata. This causes a slight
-    // performance regression that's to be avoided during the early stages of
-    // the RemoveDIs prototype, see D154080.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-    MDNode *UniqueArgList = uniquify();
-    if (UniqueArgList != this) {
-      replaceAllUsesWith(UniqueArgList);
-      // Clear this here so we don't try to untrack in the destructor.
-      Args.clear();
-      delete this;
-      return;
-    }
-#else
-    // Otherwise, don't fully unique, become distinct instead. See D108968,
-    // there's a latent bug that presents here as nondeterminism otherwise.
-    if (uniquify() != this)
-      storeDistinctInContext();
-#endif
+  // We've changed the contents of this DIArgList, and the set storage may
+  // already contain a DIArgList with our new set of args; if it does, then we
+  // must RAUW this with the existing DIArgList, otherwise we simply insert this
+  // back into the set storage.
+  DIArgList *ExistingArgList = getUniqued(getContext().pImpl->DIArgLists, this);
+  if (ExistingArgList) {
+    replaceAllUsesWith(ExistingArgList);
+    // Clear this here so we don't try to untrack in the destructor.
+    Args.clear();
+    delete this;
+    return;
   }
+  getContext().pImpl->DIArgLists.insert(this);
   track();
 }
 void DIArgList::track() {
@@ -2167,8 +2158,9 @@ void DIArgList::untrack() {
     if (VAM)
       MetadataTracking::untrack(&VAM, *VAM);
 }
-void DIArgList::dropAllReferences() {
-  untrack();
+void DIArgList::dropAllReferences(bool Untrack) {
+  if (Untrack)
+    untrack();
   Args.clear();
-  MDNode::dropAllReferences();
+  ReplaceableMetadataImpl::resolveAllUses(/* ResolveUsers */ false);
 }
diff --git a/llvm/lib/IR/LLVMContextImpl.cpp b/llvm/lib/IR/LLVMContextImpl.cpp
index 406850b7de24816..993deafe6627010 100644
--- a/llvm/lib/IR/LLVMContextImpl.cpp
+++ b/llvm/lib/IR/LLVMContextImpl.cpp
@@ -68,15 +68,8 @@ LLVMContextImpl::~LLVMContextImpl() {
 
   // Drop references for MDNodes.  Do this before Values get deleted to avoid
   // unnecessary RAUW when nodes are still unresolved.
-  for (auto *I : DistinctMDNodes) {
-    // We may have DIArgList that were uniqued, and as it has a custom
-    // implementation of dropAllReferences, it needs to be explicitly invoked.
-    if (auto *AL = dyn_cast<DIArgList>(I)) {
-      AL->dropAllReferences();
-      continue;
-    }
+  for (auto *I : DistinctMDNodes)
     I->dropAllReferences();
-  }
 #define HANDLE_MDNODE_LEAF_UNIQUABLE(CLASS)                                    \
   for (auto *I : CLASS##s)                                                     \
     I->dropAllReferences();
@@ -87,6 +80,10 @@ LLVMContextImpl::~LLVMContextImpl() {
     Pair.second->dropUsers();
   for (auto &Pair : MetadataAsValues)
     Pair.second->dropUse();
+  // Do not untrack ValueAsMetadata references for DIArgLists, as they have
+  // already been more efficiently untracked above.
+  for (DIArgList *AL : DIArgLists)
+    AL->dropAllReferences(/* Untrack */ false);
 
   // Destroy MDNodes.
   for (MDNode *I : DistinctMDNodes)
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index ebc444fcb6896e9..b55107beba556c8 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -1308,11 +1308,13 @@ template <> struct MDNodeKeyImpl<DIMacroFile> {
   }
 };
 
-template <> struct MDNodeKeyImpl<DIArgList> {
+// DIArgLists are not MDNodes, but we still want to unique them in a DenseSet
+// based on a hash of their arguments.
+struct DIArgListKeyInfo {
   ArrayRef<ValueAsMetadata *> Args;
 
-  MDNodeKeyImpl(ArrayRef<ValueAsMetadata *> Args) : Args(Args) {}
-  MDNodeKeyImpl(const DIArgList *N) : Args(N->getArgs()) {}
+  DIArgListKeyInfo(ArrayRef<ValueAsMetadata *> Args) : Args(Args) {}
+  DIArgListKeyInfo(const DIArgList *N) : Args(N->getArgs()) {}
 
   bool isKeyOf(const DIArgList *RHS) const { return Args == RHS->getArgs(); }
 
@@ -1321,6 +1323,35 @@ template <> struct MDNodeKeyImpl<DIArgList> {
   }
 };
 
+/// DenseMapInfo for DIArgList.
+struct DIArgListInfo {
+  using KeyTy = DIArgListKeyInfo;
+
+  static inline DIArgList *getEmptyKey() {
+    return DenseMapInfo<DIArgList *>::getEmptyKey();
+  }
+
+  static inline DIArgList *getTombstoneKey() {
+    return DenseMapInfo<DIArgList *>::getTombstoneKey();
+  }
+
+  static unsigned getHashValue(const KeyTy &Key) { return Key.getHashValue(); }
+
+  static unsigned getHashValue(const DIArgList *N) {
+    return KeyTy(N).getHashValue();
+  }
+
+  static bool isEqual(const KeyTy &LHS, const DIArgList *RHS) {
+    if (RHS == getEmptyKey() || RHS == getTombstoneKey())
+      return false;
+    return LHS.isKeyOf(RHS);
+  }
+
+  static bool isEqual(const DIArgList *LHS, const DIArgList *RHS) {
+    return LHS == RHS;
+  }
+};
+
 /// DenseMapInfo for MDNode subclasses.
 template <class NodeTy> struct MDNodeInfo {
   using KeyTy = MDNodeKeyImpl<NodeTy>;
@@ -1471,6 +1502,7 @@ class LLVMContextImpl {
   StringMap<MDString, BumpPtrAllocator> MDStringCache;
   DenseMap<Value *, ValueAsMetadata *> ValuesAsMetadata;
   DenseMap<Metadata *, MetadataAsValue *> MetadataAsValues;
+  DenseSet<DIArgList *, DIArgListInfo> DIArgLists;
 
 #define HANDLE_MDNODE_LEAF_UNIQUABLE(CLASS)                                    \
   DenseSet<CLASS *, CLASS##Info> CLASS##s;
diff --git a/llvm/lib/IR/Metadata.cpp b/llvm/lib/IR/Metadata.cpp
index e040387fceb7436..415e256c817b82f 100644
--- a/llvm/lib/IR/Metadata.cpp
+++ b/llvm/lib/IR/Metadata.cpp
@@ -413,33 +413,25 @@ void ReplaceableMetadataImpl::resolveAllUses(bool ResolveUsers) {
 // commentry in DIArgList::handleChangedOperand for details. Hidden behind
 // conditional compilation to avoid a compile time regression.
 ReplaceableMetadataImpl *ReplaceableMetadataImpl::getOrCreate(Metadata &MD) {
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-  if (auto *ArgList = dyn_cast<DIArgList>(&MD))
-    return ArgList->Context.getOrCreateReplaceableUses();
-#endif
   if (auto *N = dyn_cast<MDNode>(&MD))
     return N->isResolved() ? nullptr : N->Context.getOrCreateReplaceableUses();
+  if (auto ArgList = dyn_cast<DIArgList>(&MD))
+    return ArgList;
   return dyn_cast<ValueAsMetadata>(&MD);
 }
 
 ReplaceableMetadataImpl *ReplaceableMetadataImpl::getIfExists(Metadata &MD) {
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-  if (auto *ArgList = dyn_cast<DIArgList>(&MD))
-    return ArgList->Context.getReplaceableUses();
-#endif
   if (auto *N = dyn_cast<MDNode>(&MD))
     return N->isResolved() ? nullptr : N->Context.getReplaceableUses();
+  if (auto ArgList = dyn_cast<DIArgList>(&MD))
+    return ArgList;
   return dyn_cast<ValueAsMetadata>(&MD);
 }
 
 bool ReplaceableMetadataImpl::isReplaceable(const Metadata &MD) {
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-  if (isa<DIArgList>(&MD))
-    return true;
-#endif
   if (auto *N = dyn_cast<MDNode>(&MD))
     return !N->isResolved();
-  return isa<ValueAsMetadata>(&MD);
+  return isa<ValueAsMetadata>(&MD) || isa<DIArgList>(&MD);
 }
 
 static DISubprogram *getLocalFunctionMetadata(Value *V) {
diff --git a/llvm/lib/IR/TypeFinder.cpp b/llvm/lib/IR/TypeFinder.cpp
index 904af7e737ccfca..003155a4af4877b 100644
--- a/llvm/lib/IR/TypeFinder.cpp
+++ b/llvm/lib/IR/TypeFinder.cpp
@@ -136,6 +136,11 @@ void TypeFinder::incorporateValue(const Value *V) {
       return incorporateMDNode(N);
     if (const auto *MDV = dyn_cast<ValueAsMetadata>(M->getMetadata()))
       return incorporateValue(MDV->getValue());
+    if (const auto *AL = dyn_cast<DIArgList>(M->getMetadata())) {
+      for (auto *Arg : AL->getArgs())
+        incorporateValue(Arg->getVa...
[truncated]

@SLTozer
Copy link
Contributor Author

SLTozer commented Nov 15, 2023

@dwblaikie Any chance you meant to comment that on this review? #70701

@dwblaikie
Copy link
Collaborator

@dwblaikie Any chance you meant to comment that on this review? #70701

Ah, yep - sorry for the noise. I'll delete my misplaced comment.

@OCHyams
Copy link
Contributor

OCHyams commented Nov 16, 2023

LGTM, sounds like a significant win. I would feel more comfortable if another pair of eyes could look over the changes too though.

@jmorse
Copy link
Member

jmorse commented Nov 16, 2023

Digging into this now, thanks for making this adjustment. Two high level concerns:

  • It looks like some files might end up being different according to the compile time tracker [0], in that one or two of the bitcode files report file-size differences. It's not clear to me whether that's actually a spurious difference though, as the numeric sizes aren't different.
  • IIRC the reason why https://reviews.llvm.org/D108968 was filed was because of a nondeterminism problem when uniquifying DIArgLists that only appears deep in LTO builds. We should be careful about re-enabling DIArgList uniquification in light of that, I see it's being re-implemented here (will look), is that likely to solve the reported nondeterminism issue?

[0] http://llvm-compile-time-tracker.com/compare.php?from=9e618e5ed4c9f2c24e2c308e7f28e505e788f5fa&to=9552698bada4e3809331d57c0a78627161767ae6&stat=size-file&details=on

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM with some understanding-confirmations

Record.reserve(N->getArgs().size());
for (ValueAsMetadata *MD : N->getArgs())
Record.push_back(VE.getMetadataID(MD));

Stream.EmitRecord(bitc::METADATA_ARG_LIST, Record, Abbrev);
Stream.EmitRecord(bitc::METADATA_ARG_LIST, Record);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, what are the implications of not having an abbrev here -- will we end up with a potentially suboptimal encoding or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DIArgLists have no abbreviations, so this is a no-op I believe.

Comment on lines +1268 to +1269
// Don't make slots for DIExpressions. We just print them inline everywhere.
if (isa<DIExpression>(N))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm my understanding: the mechanism of print DIArgLists isn't changing at all, but as they're no longer MDNodes, they don't get considered for a metadata-node-number slot so don't need to appear here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this function only gets called for MDNodes, so DIArgList doesn't need to care anymore.

Comment on lines +418 to +419
if (auto ArgList = dyn_cast<DIArgList>(&MD))
return ArgList;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all sounds good given that there's no performance regression

@SLTozer
Copy link
Contributor Author

SLTozer commented Nov 16, 2023

* It looks like some files might end up being different according to the compile time tracker [0], in that one or two of the bitcode files report file-size differences. It's not clear to me whether that's actually a spurious difference though, as the numeric sizes aren't different.

I'll try and reproduce this before merging to make sure nothing bad is going on - I have suspicions that this is an innocent case of us being able to merge debug history ranges that previously were pinged as different due to distinct DIArgLists, but there's a (+0.00%) size increase for 7zip, so I'll need to figure out whether anything weird is happening there.

* IIRC the reason why https://reviews.llvm.org/D108968 was filed was because of a nondeterminism problem when uniquifying DIArgLists that only appears deep in LTO builds. We should be careful about re-enabling DIArgList uniquification in light of that, I see it's being re-implemented here (will look), is that likely to solve the reported nondeterminism issue?

The non-determinism problem solved in that patch was essentially that we weren't properly reuniquing DIArgLists, so we ended up in situations where equal DIArgLists had different pointers. This change should properly reunique DIArgLists, and should actually close a potential case where equal DIArgLists can have different pointers (i.e. if one is made distinct).

@SLTozer
Copy link
Contributor Author

SLTozer commented Nov 17, 2023

Update, doesn't look like anything serious - I can do some serious digging to verify whether the additional dbg.value produced (which is a rare case) is correct or not, but it is not horribly wrong (it just extends the range of an existing dbg.value), and if there is anything incorrect about its value or placement it is almost certainly exposed by this patch, rather than caused by it - does that seem good enough to merge this?

@jmorse
Copy link
Member

jmorse commented Nov 17, 2023

Urgh -- an extra dbg.value from an underlying metadata change shows something is majorly wrong. Although I'd hazard a guess that it isn't this patch, it's something else being exercised. (I'll dig into it).

@SLTozer
Copy link
Contributor Author

SLTozer commented Nov 17, 2023

Urgh -- an extra dbg.value from an underlying metadata change shows something is majorly wrong. Although I'd hazard a guess that it isn't this patch, it's something else being exercised. (I'll dig into it).

There is a practical reason to expect a debug info change - prior to this patch, we have the possibility that a DIArgList will be made distinct, and so an equivalent DIArgList that has been properly uniqued will exist at a different address. Since we generally assume DIArgLists are uniqued (as with ValueAsMetadata), all the checks for equality are tests for pointer equality. This change ensures that DIArgLists are actually always uniqued, so value equality will always mean pointer equality, so we may see situations where we correctly identify two locations as being equal when previously we wouldn't. There are probably passes where this results in a debug info change - albeit very rarely, since the distinct case seems to have been quite rare to begin with.

@jmorse
Copy link
Member

jmorse commented Nov 17, 2023

I was a bit skeptical about this ^, after some back and forth with @SLTozer though we reckon it's probably isIdenticalToWhenDefined returning something different now when merging blocks in SimplifyCFG, which would result in dbg.values not-being-dropped where they would have been before. With that in mind, LGTM again.

@SLTozer SLTozer merged commit e77af7e into llvm:main Nov 17, 2023
2 of 3 checks passed
@SLTozer
Copy link
Contributor Author

SLTozer commented Nov 17, 2023

MemSanitizer buildbot failure, will revert until the problem's fixed.

SLTozer added a commit that referenced this pull request Nov 17, 2023
SLTozer added a commit that referenced this pull request Nov 17, 2023
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
…lvm#72147)

This patch changes the `DIArgList` class's inheritance from `MDNode` to
`Metadata, ReplaceableMetadataImpl`, and ensures that it is always
unique, i.e. a distinct DIArgList should never be produced.

This should not result in any changes to IR or bitcode parsing and
printing, as the format for DIArgList is unchanged, and the order in which it
appears should also be identical. As a minor note, this patch also fixes
a gap in the verifier, where the ValueAsMetadata operands to a DIArgList
would not be visited.
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…lvm#72147)

This patch changes the `DIArgList` class's inheritance from `MDNode` to
`Metadata, ReplaceableMetadataImpl`, and ensures that it is always
unique, i.e. a distinct DIArgList should never be produced.

This should not result in any changes to IR or bitcode parsing and
printing, as the format for DIArgList is unchanged, and the order in which it
appears should also be identical. As a minor note, this patch also fixes
a gap in the verifier, where the ValueAsMetadata operands to a DIArgList
would not be visited.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants