@aaupov aaupov commented Jul 25, 2025

While registering profiled functions, collect the sampled addresses into a set
first so that each address is looked up only once.
This speeds up DataAggregator::preprocessProfile.

Test Plan:
For an intermediate-size pre-aggregated profile (10MB), this reduces parsing
time from ~0.41s to ~0.16s.
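
The idea described above (deduplicate addresses, then do one containing-function lookup per unique address) can be sketched outside of BOLT roughly as follows. This is a minimal illustration, not the code from this patch: `Function`, `findContaining`, and `markProfiled` are hypothetical stand-ins, and a `std::map` keyed by start address is assumed in place of BOLT's own function lookup.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a function record (not BOLT's BinaryFunction).
struct Function {
  uint64_t Start;
  uint64_t Size;
  bool HasProfile;
};

// Hypothetical range lookup: returns the function whose [Start, Start + Size)
// range contains Addr, or nullptr if there is none.
Function *findContaining(std::map<uint64_t, Function> &Funcs, uint64_t Addr) {
  auto It = Funcs.upper_bound(Addr); // first function starting after Addr
  if (It == Funcs.begin())
    return nullptr;
  --It; // last function starting at or before Addr
  Function &F = It->second;
  return Addr < F.Start + F.Size ? &F : nullptr;
}

// Marks every function hit by at least one sample. Duplicate sample addresses
// are collapsed first, so the lookup runs once per unique address.
// Returns the number of lookups performed.
std::size_t markProfiled(std::map<uint64_t, Function> &Funcs,
                         const std::vector<uint64_t> &SampleAddrs) {
  std::unordered_set<uint64_t> Unique(SampleAddrs.begin(), SampleAddrs.end());
  std::size_t Lookups = 0;
  for (uint64_t Addr : Unique) {
    ++Lookups;
    if (Function *F = findContaining(Funcs, Addr))
      F->HasProfile = true;
  }
  return Lookups;
}
```

With hot code, many samples share the same address, so the number of lookups drops from the number of samples to the number of distinct addresses.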

Created using spr 1.3.4
llvmbot commented Jul 25, 2025

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)


Full diff: https://github.com/llvm/llvm-project/pull/150623.diff

2 Files Affected:

  • (modified) bolt/include/bolt/Profile/DataAggregator.h (+3)
  • (modified) bolt/lib/Profile/DataAggregator.cpp (+21-18)
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index db0f6903185b7..cb1b87f8d0d65 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -502,6 +502,9 @@ class DataAggregator : public DataReader {
   /// entries).
   void imputeFallThroughs();
 
+  /// Register profiled functions for lite mode.
+  void registerProfiledFunctions();
+
   /// Debugging dump methods
   void dump() const;
   void dump(const PerfBranchSample &Sample) const;
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 3604fdd3a94b4..c13fa6dbe582b 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -581,6 +581,26 @@ void DataAggregator::imputeFallThroughs() {
     outs() << "BOLT-INFO: imputed " << InferredTraces << " traces\n";
 }
 
+void DataAggregator::registerProfiledFunctions() {
+  DenseSet<uint64_t> Addrs;
+  for (const auto &Trace : llvm::make_first_range(Traces)) {
+    if (Trace.Branch != Trace::FT_ONLY &&
+        Trace.Branch != Trace::FT_EXTERNAL_ORIGIN)
+      Addrs.insert(Trace.Branch);
+    Addrs.insert(Trace.From);
+  }
+
+  for (const auto [PC, _] : BasicSamples)
+    Addrs.insert(PC);
+
+  for (const PerfMemSample &MemSample : MemSamples)
+    Addrs.insert(MemSample.PC);
+
+  for (const uint64_t Addr : Addrs)
+    if (BinaryFunction *Func = getBinaryFunctionContainingAddress(Addr))
+      Func->setHasProfileAvailable();
+}
+
 Error DataAggregator::preprocessProfile(BinaryContext &BC) {
   this->BC = &BC;
 
@@ -603,6 +623,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
       exit(0);
   }
 
+  registerProfiledFunctions();
   return Error::success();
 }
 
@@ -1347,10 +1368,6 @@ std::error_code DataAggregator::parseAggregatedLBREntry() {
   }
 
   const uint64_t FromOffset = Addr[0]->Offset;
-  BinaryFunction *FromFunc = getBinaryFunctionContainingAddress(FromOffset);
-  if (FromFunc)
-    FromFunc->setHasProfileAvailable();
-
   int64_t Count = Counters[0];
   int64_t Mispreds = Counters[1];
 
@@ -1361,11 +1378,6 @@ std::error_code DataAggregator::parseAggregatedLBREntry() {
     return std::error_code();
   }
 
-  const uint64_t ToOffset = Addr[1]->Offset;
-  BinaryFunction *ToFunc = getBinaryFunctionContainingAddress(ToOffset);
-  if (ToFunc)
-    ToFunc->setHasProfileAvailable();
-
   /// For fall-through types, adjust locations to match Trace container.
   if (Type == FT || Type == FT_EXTERNAL_ORIGIN || Type == FT_EXTERNAL_RETURN) {
     Addr[2] = Location(Addr[1]->Offset); // Trace To
@@ -1613,9 +1625,6 @@ std::error_code DataAggregator::parseBranchEvents() {
   Traces.reserve(TraceMap.size());
   for (const auto &[Trace, Info] : TraceMap) {
     Traces.emplace_back(Trace, Info);
-    for (const uint64_t Addr : {Trace.Branch, Trace.From})
-      if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Addr))
-        BF->setHasProfileAvailable();
   }
   clear(TraceMap);
 
@@ -1676,9 +1685,6 @@ std::error_code DataAggregator::parseBasicEvents() {
       continue;
     ++NumTotalSamples;
 
-    if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
-      BF->setHasProfileAvailable();
-
     ++BasicSamples[Sample->PC];
     EventNames.insert(Sample->EventName);
   }
@@ -1716,9 +1722,6 @@ std::error_code DataAggregator::parseMemEvents() {
     if (std::error_code EC = Sample.getError())
       return EC;
 
-    if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
-      BF->setHasProfileAvailable();
-
     MemSamples.emplace_back(std::move(Sample.get()));
   }
 

@aaupov aaupov closed this Jul 25, 2025