Skip to content

Conversation

vitalybuka
Copy link
Collaborator

@vitalybuka vitalybuka commented Sep 6, 2025

In #157045 I didn't realize that IRMover object is
one for entire LTO, but IRLinker is created per
module. We need one cache for combined module.

Also update StringRef key to just a pointer, which
should be enough for this task.

I timed "IRLinker::linkNamedMDNodes" code on one
of our internal binaries, not very large, but
above average.

Before #146020: 0.4859s
After #146020: 624.4686s
After #157045: 322.3493s
After this patch: 0.5574s

Created using spr 1.3.6
@llvmbot llvmbot added the LTO Link time optimization (regular/full LTO or ThinLTO) label Sep 6, 2025
@vitalybuka vitalybuka requested a review from arsenm September 6, 2025 01:52
@llvmbot
Copy link
Member

llvmbot commented Sep 6, 2025

@llvm/pr-subscribers-lto

Author: Vitaly Buka (vitalybuka)

Changes

In #157045 I didn't realize that IRMover object is
one for entire LTO, but IRLinker is created per
module. We need one chache for combined module.

I timed "IRLinker::linkNamedMDNodes" code on one
of our internal binaries, not very large, but
above average.

Before #146020: 0.4859s
After #146020: 624.4686s
After #157045: 322.3493s
After this patch: 0.5574s


Full diff: https://github.com/llvm/llvm-project/pull/157218.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Linker/IRMover.h (+6)
  • (modified) llvm/lib/Linker/IRMover.cpp (+9-7)
diff --git a/llvm/include/llvm/Linker/IRMover.h b/llvm/include/llvm/Linker/IRMover.h
index 39929311a3227..1dac5605f4961 100644
--- a/llvm/include/llvm/Linker/IRMover.h
+++ b/llvm/include/llvm/Linker/IRMover.h
@@ -10,6 +10,7 @@
 #define LLVM_LINKER_IRMOVER_H
 
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/FunctionExtras.h"
 #include "llvm/Support/Compiler.h"
@@ -19,6 +20,8 @@ namespace llvm {
 class Error;
 class GlobalValue;
 class Metadata;
+class MDNode;
+class NamedMDNode;
 class Module;
 class StructType;
 class TrackingMDRef;
@@ -67,6 +70,8 @@ class IRMover {
   using LazyCallback =
       llvm::unique_function<void(GlobalValue &GV, ValueAdder Add)>;
 
+  typedef DenseMap<const NamedMDNode *, DenseSet<const MDNode *>> NamedMDNodesT;
+
   /// Move in the provide values in \p ValuesToLink from \p Src.
   ///
   /// - \p AddLazyFor is a call back that the IRMover will call when a global
@@ -86,6 +91,7 @@ class IRMover {
   Module &Composite;
   IdentifiedStructTypeSet IdentifiedStructTypes;
   MDMapT SharedMDs; ///< A Metadata map to use for all calls to \a move().
+  NamedMDNodesT NamedMDNodes; ///< Cache for IRMover::linkNamedMDNodes().
 };
 
 } // End llvm namespace
diff --git a/llvm/lib/Linker/IRMover.cpp b/llvm/lib/Linker/IRMover.cpp
index f53fe8f84fe3c..489e03d5fdd60 100644
--- a/llvm/lib/Linker/IRMover.cpp
+++ b/llvm/lib/Linker/IRMover.cpp
@@ -293,7 +293,7 @@ class IRLinker {
   std::unique_ptr<Module> SrcM;
 
   // Lookup table to optimize IRMover::linkNamedMDNodes().
-  DenseMap<StringRef, DenseSet<MDNode *>> NamedMDNodes;
+  IRMover::NamedMDNodesT &NamedMDNodes;
 
   /// See IRMover::move().
   IRMover::LazyCallback AddLazyFor;
@@ -440,10 +440,12 @@ class IRLinker {
   IRLinker(Module &DstM, MDMapT &SharedMDs,
            IRMover::IdentifiedStructTypeSet &Set, std::unique_ptr<Module> SrcM,
            ArrayRef<GlobalValue *> ValuesToLink,
-           IRMover::LazyCallback AddLazyFor, bool IsPerformingImport)
-      : DstM(DstM), SrcM(std::move(SrcM)), AddLazyFor(std::move(AddLazyFor)),
-        TypeMap(Set), GValMaterializer(*this), LValMaterializer(*this),
-        SharedMDs(SharedMDs), IsPerformingImport(IsPerformingImport),
+           IRMover::LazyCallback AddLazyFor, bool IsPerformingImport,
+           IRMover::NamedMDNodesT &NamedMDNodes)
+      : DstM(DstM), SrcM(std::move(SrcM)), NamedMDNodes(NamedMDNodes),
+        AddLazyFor(std::move(AddLazyFor)), TypeMap(Set),
+        GValMaterializer(*this), LValMaterializer(*this), SharedMDs(SharedMDs),
+        IsPerformingImport(IsPerformingImport),
         Mapper(ValueMap, RF_ReuseAndMutateDistinctMDs | RF_IgnoreMissingLocals,
                &TypeMap, &GValMaterializer),
         IndirectSymbolMCID(Mapper.registerAlternateMappingContext(
@@ -1138,7 +1140,7 @@ void IRLinker::linkNamedMDNodes() {
 
     NamedMDNode *DestNMD = DstM.getOrInsertNamedMetadata(NMD.getName());
 
-    auto &Inserted = NamedMDNodes[DestNMD->getName()];
+    auto &Inserted = NamedMDNodes[DestNMD];
     if (Inserted.empty()) {
       // Must be the first module, copy everything from DestNMD.
       Inserted.insert(DestNMD->operands().begin(), DestNMD->operands().end());
@@ -1683,6 +1685,6 @@ Error IRMover::move(std::unique_ptr<Module> Src,
                     LazyCallback AddLazyFor, bool IsPerformingImport) {
   IRLinker TheIRLinker(Composite, SharedMDs, IdentifiedStructTypes,
                        std::move(Src), ValuesToLink, std::move(AddLazyFor),
-                       IsPerformingImport);
+                       IsPerformingImport, NamedMDNodes);
   return TheIRLinker.run();
 }

@vitalybuka vitalybuka requested a review from wenju-he September 6, 2025 01:52
@vitalybuka vitalybuka enabled auto-merge (squash) September 6, 2025 01:52
@vitalybuka vitalybuka requested a review from Copilot September 6, 2025 01:53
Copilot

This comment was marked as outdated.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@vitalybuka vitalybuka requested a review from Copilot September 6, 2025 01:56
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a performance regression in the IRMover class by moving the NamedMDNodes cache from the IRLinker level to the IRMover level. The issue was that IRLinker objects are created per module during LTO, but the IRMover object persists for the entire LTO process, causing the cache to be rebuilt repeatedly.

Key Changes

  • Moved NamedMDNodes cache from IRLinker to IRMover class
  • Updated function signature to pass the cache by reference
  • Changed cache key from StringRef to const NamedMDNode*

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
llvm/include/llvm/Linker/IRMover.h Added NamedMDNodesT type definition and cache member to IRMover class
llvm/lib/Linker/IRMover.cpp Moved NamedMDNodes cache from IRLinker to IRMover and updated references

@vitalybuka vitalybuka requested a review from pcc September 6, 2025 01:58
@arsenm
Copy link
Contributor

arsenm commented Sep 6, 2025

After this patch: 0.5574s

Maybe this should be using small sets? I wouldn't expect there to be many duplicates to resolve

@vitalybuka vitalybuka merged commit 3386273 into main Sep 6, 2025
9 checks passed
@vitalybuka vitalybuka deleted the users/vitalybuka/spr/irmover-proper-fix-of-performance-regression-of-146020 branch September 6, 2025 02:24
vitalybuka added a commit that referenced this pull request Sep 8, 2025
Follow up to #157218.
But I can't measure a difference on
my example.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants