kaadam commented Oct 16, 2025

Extend perf2bolt with a new option that reads textual perf-script output, i.e. the output of Linux Perf's 'script' command.

This option makes it possible to add a large SPE test to the 'bolt-tests' repository to cover Arm SPE aggregation.

Why does the test need a textual SPE profile?

  • Collecting an Arm SPE profile with Linux Perf requires an Arm development device with SPE support.
  • Decoding SPE data also requires a sufficiently recent version of Linux Perf.
    • The minimum required version is v6.15.

To sidestep these requirements, it is easier to provide a pre-generated profile in textual format.

How can this type of profile be generated?

  1. Collect a profile with Linux Perf:
    $ perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY

  2. Convert it to a textual profile with Linux Perf's script command:
    $ perf script --show-mmap-events -F pid,brstack --itrace=bl -i perf.data
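
For readers unfamiliar with the brstack text, here is a minimal C++ sketch of how one branch line produced by step 2 could be decoded. It is not the parser BOLT uses. `Branch`, `BranchSample`, and `parseBranchLine` are hypothetical names, and the sketch assumes each line has the shape `PID FROM/TO/...` with hexadecimal addresses, ignoring every field after `TO`.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// One taken branch: source and target address.
struct Branch {
  uint64_t From;
  uint64_t To;
};

// One sample line: the PID followed by its branch stack.
struct BranchSample {
  int Pid;
  std::vector<Branch> Branches;
};

// Parses a single "PID FROM/TO/... FROM/TO/..." line.
// Returns std::nullopt if the PID is missing, an entry is malformed,
// or the line carries no branch entries at all.
std::optional<BranchSample> parseBranchLine(const std::string &Line) {
  std::istringstream In(Line);
  BranchSample Sample;
  if (!(In >> Sample.Pid))
    return std::nullopt;

  std::string Entry;
  while (In >> Entry) {
    // FROM ends at the first '/', TO ends at the second one.
    std::size_t First = Entry.find('/');
    if (First == std::string::npos)
      return std::nullopt;
    std::size_t Second = Entry.find('/', First + 1);
    if (Second == std::string::npos)
      return std::nullopt;
    try {
      // Base 16 accepts an optional "0x" prefix (assumed address format).
      uint64_t From = std::stoull(Entry.substr(0, First), nullptr, 16);
      uint64_t To = std::stoull(
          Entry.substr(First + 1, Second - First - 1), nullptr, 16);
      Sample.Branches.push_back({From, To});
    } catch (const std::exception &) {
      return std::nullopt;
    }
  }
  if (Sample.Branches.empty())
    return std::nullopt;
  return Sample;
}
```

Calling `parseBranchLine` on a line such as ` 1234  0x4b196f/0x4b19e0/P/-/-/1/COND/-` would yield PID 1234 and a single branch from 0x4b196f to 0x4b19e0, while a line without a leading PID is rejected.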

Extend perf2bolt with a new option that reads textual perf-script
output, i.e. the output of Linux Perf's 'script' command.

This option makes it possible to add a large SPE test to the
'bolt-tests' repository to cover Arm SPE aggregation.

Why does the test need a textual SPE profile?
- Collecting an Arm SPE profile with Linux Perf requires an Arm
  development device with SPE support.
- Decoding SPE data also requires a sufficiently recent version of
  Linux Perf.
  The minimum required version is v6.15.

To sidestep these requirements, it is easier to provide a
pre-generated profile in textual format.

How can this type of profile be generated?

 1) Collect a profile with Linux Perf using the '-b' or '--spe' option.

 2) Convert it to a textual profile with Linux Perf's script command:
    $ perf script --show-mmap-events -F pid,brstack --itrace=bl -i perf.data
kaadam marked this pull request as ready for review October 21, 2025 09:55
llvmbot added the BOLT label Oct 21, 2025
llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-bolt

Author: Ádám Kallai (kaadam)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/163785.diff

2 Files Affected:

  • (modified) bolt/include/bolt/Profile/DataAggregator.h (+26)
  • (modified) bolt/lib/Profile/DataAggregator.cpp (+86-3)
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index cb1b87f8d0d65..de88a8bb8ad1e 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -440,6 +440,32 @@ class DataAggregator : public DataReader {
   /// B 4b196f 4b19e0 2 0
   void parsePreAggregated();
 
+  /// Detect whether the parsed line is an mmap event or not.
+  bool isMMapEvent(StringRef Line);
+
+  /// Coordinate reading and parsing a hybrid perf-script trace created by
+  /// the following Linux perf script command:
+  /// 'perf script --show-mmap-events -F pid,brstack --itrace=bl -i perf.data'
+  ///
+  /// Note:
+  /// The original perf.data should be recorded with '-b' or with Arm SPE.
+  ///
+  /// What the output of this command looks like:
+  /// {<name> .* <sec>.<usec>: }PERF_RECORD_MMAP2 <pid>/<tid>: .* <file_name>
+  /// {<name> .* <sec>.<usec>: }PERF_RECORD_MMAP2 <pid>/<tid>: .* <file_name>
+  ///  PID  {FROM/TO/P/-/-/1/COND/-}+
+  ///  PID  {FROM/TO/P/-/-/1/COND/-}+
+  ///
+  /// A hybrid profile contains mmap events along with branch events.
+  /// An mmap event might appear among the branch events, therefore
+  /// BOLT reads the hybrid profile, picks out the mmap events, and treats
+  /// all other lines as branch events.
+  /// It then points ParsingBuf at each group in turn and calls the matching
+  /// parser: parseMMapEvents() or parseBranchEvents().
+  ///
+  /// This option is only for testing purposes.
+  void parsePerfScriptEvents();
+
   /// Parse the full output of pre-aggregated LBR samples generated by
   /// an external tool.
   std::error_code parsePreAggregatedLBRSamples();
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index c13fa6dbe582b..8a2119480d49b 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -115,6 +115,12 @@ cl::opt<std::string>
                             "perf-script output in a textual format"),
                    cl::ReallyHidden, cl::init(""), cl::cat(AggregatorCategory));
 
+cl::opt<bool>
+    ReadPerfScript("perfscript",
+                   cl::desc("skip perf and read perf-script trace created by "
+                            "Linux perf tool with script command"),
+                   cl::ReallyHidden, cl::cat(AggregatorCategory));
+
 static cl::opt<bool>
 TimeAggregator("time-aggr",
   cl::desc("time BOLT aggregator"),
@@ -184,7 +190,8 @@ void DataAggregator::start() {
 
   // Don't launch perf for pre-aggregated files or when perf input is specified
   // by the user.
-  if (opts::ReadPreAggregated || !opts::ReadPerfEvents.empty())
+  if (opts::ReadPreAggregated || opts::ReadPerfScript ||
+      !opts::ReadPerfEvents.empty())
     return;
 
   findPerfExecutable();
@@ -226,7 +233,7 @@ void DataAggregator::start() {
 }
 
 void DataAggregator::abort() {
-  if (opts::ReadPreAggregated)
+  if (opts::ReadPreAggregated || opts::ReadPerfScript)
     return;
 
   std::string Error;
@@ -326,7 +333,7 @@ void DataAggregator::processFileBuildID(StringRef FileBuildID) {
 }
 
 bool DataAggregator::checkPerfDataMagic(StringRef FileName) {
-  if (opts::ReadPreAggregated)
+  if (opts::ReadPreAggregated || opts::ReadPerfScript)
     return true;
 
   Expected<sys::fs::file_t> FD = sys::fs::openNativeFileForRead(FileName);
@@ -372,6 +379,80 @@ void DataAggregator::parsePreAggregated() {
   }
 }
 
+bool DataAggregator::isMMapEvent(StringRef Line) {
+  // Shortcut: lines this short cannot be mmap events, so skip the search.
+  if (Line.size() < 50)
+    return false;
+
+  // Check whether PERF_RECORD_MMAP or PERF_RECORD_MMAP2 appears in the line.
+  return Line.contains("PERF_RECORD_MMAP");
+}
+
+void DataAggregator::parsePerfScriptEvents() {
+  outs() << "PERF2BOLT: parsing hybrid perf-script events...\n";
+  NamedRegionTimer T("parsePerfScriptEvents", "Parsing perf-script events",
+                     TimerGroupName, TimerGroupDesc, opts::TimeAggregator);
+
+  ErrorOr<std::unique_ptr<MemoryBuffer>> MB =
+      MemoryBuffer::getFileOrSTDIN(Filename);
+  if (std::error_code EC = MB.getError()) {
+    errs() << "PERF2BOLT-ERROR: cannot open " << Filename << ": "
+           << EC.message() << "\n";
+    exit(1);
+  }
+
+  FileBuf = std::move(*MB);
+  ParsingBuf = FileBuf->getBuffer();
+  Col = 0;
+  Line = 1;
+  std::string MMapEvents = "";
+  std::string BranchEvents = "";
+
+  if (!hasData())
+    return;
+
+  while (hasData()) {
+    size_t LineEnd = ParsingBuf.find_first_of("\n");
+    if (LineEnd == StringRef::npos) {
+      reportError("expected rest of line");
+      errs() << "Found: " << ParsingBuf << "\n";
+      exit(1); // LineEnd + 1 would wrap to 0 and the loop would never end
+    }
+    StringRef Event = ParsingBuf.substr(0, LineEnd);
+
+    if (isMMapEvent(Event)) {
+      MMapEvents += Event.str();
+      MMapEvents += "\n";
+    } else {
+      BranchEvents += Event.str();
+      BranchEvents += '\n';
+    }
+
+    ParsingBuf = ParsingBuf.drop_front(LineEnd + 1);
+    Col = 0;
+    Line += 1;
+  }
+
+  // Set ParsingBuf for MMapEvents
+  ParsingBuf = StringRef(MMapEvents);
+  Col = 0;
+  Line = 1;
+  if (!ParsingBuf.empty() && parseMMapEvents()) {
+    errs() << "PERF2BOLT: failed to parse mmap events from the perf-script "
+              "file.\n";
+    exit(1);
+  }
+
+  // Set ParsingBuf for BranchEvents
+  ParsingBuf = StringRef(BranchEvents);
+  Col = 0;
+  Line = 1;
+  if (!ParsingBuf.empty() && parseBranchEvents()) {
+    errs() << "PERF2BOLT: failed to parse samples from perf-script file.\n";
+    exit(1);
+  }
+}
+
 void DataAggregator::filterBinaryMMapInfo() {
   if (opts::FilterPID) {
     auto MMapInfoIter = BinaryMMapInfo.find(opts::FilterPID);
@@ -606,6 +687,8 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
 
   if (opts::ReadPreAggregated) {
     parsePreAggregated();
+  } else if (opts::ReadPerfScript) {
+    parsePerfScriptEvents();
   } else {
     parsePerfData(BC);
   }
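
The classification step in parsePerfScriptEvents() can be illustrated outside of BOLT with the standard library only. This is a simplified sketch, not the patch itself: `isMMapLine` and `splitHybridTrace` are hypothetical names, the 50-character shortcut is omitted, and a final line without a trailing newline is accepted rather than reported as an error.

```cpp
#include <sstream>
#include <string>
#include <string_view>
#include <utility>

// Same substring test as isMMapEvent(): matches both PERF_RECORD_MMAP
// and PERF_RECORD_MMAP2 lines.
bool isMMapLine(std::string_view Line) {
  return Line.find("PERF_RECORD_MMAP") != std::string_view::npos;
}

// Splits a hybrid trace into {mmap lines, branch lines}. The relative
// order of lines is preserved within each group, and every emitted line
// is terminated with '\n'.
std::pair<std::string, std::string>
splitHybridTrace(const std::string &Trace) {
  std::string MMapEvents;
  std::string BranchEvents;
  std::istringstream In(Trace);
  std::string Line;
  // std::getline also returns a last line that lacks a trailing newline.
  while (std::getline(In, Line)) {
    std::string &Out = isMMapLine(Line) ? MMapEvents : BranchEvents;
    Out += Line;
    Out += '\n';
  }
  return {MMapEvents, BranchEvents};
}
```

Each of the two resulting strings could then be handed to its own parser, which corresponds to how the patch points ParsingBuf at MMapEvents and BranchEvents before calling parseMMapEvents() and parseBranchEvents().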
