Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BasicBlockSections] Introduce the path cloning profile format to BasicBlockSectionsProfileReader. #67214

Merged
merged 3 commits into from
Oct 12, 2023

Conversation

rlavaee
Copy link
Contributor

@rlavaee rlavaee commented Sep 23, 2023

Following up on prior RFC (https://lists.llvm.org/pipermail/llvm-dev/2020-September/145357.html) we can now improve above our highly-optimized basic-block-sections binary (e.g., 2% for clang) by applying path cloning. Cloning can improve performance by reducing taken branches.

This patch prepares the profile format for applying cloning actions.

The basic block cloning profile format extends the basic block sections profile in two ways.

  1. Specifies the cloning paths with a 'p' specifier. For example, p 1 4 5 specifies that blocks with BB ids 4 and 5 must be cloned along the edge 1 --> 4.
  2. For each cloned block, it will appear in the cluster info as <bb_id>.<clone_id> where clone_id is the id associated with this clone.

For example, the following profile specifies one cloned block (2) and determines its cluster position as well.

f foo
p 1 2
c 0 1 2.1 3 2 5

This patch keeps backward-compatibility (retains the behavior for old profile formats). This feature is only introduced for profile version >= 1.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 23, 2023

@llvm/pr-subscribers-backend-x86

Changes

…icBlockSectionsProfileReader.

Following up on prior RFC (https://lists.llvm.org/pipermail/llvm-dev/2020-September/145357.html) we can now improve above our highly-optimized basic-block-sections binary (e.g., 2% for clang) by applying path cloning. Cloning can improve performance by reducing taken branches.

This patch prepares the profile format for applying cloning actions.

The basic block cloning profile format extends the basic block sections profile in two ways.

  1. Specifies the cloning paths with a 'p' specifier. For example, p 1 4 5 specifies that blocks with BB ids 4 and 5 must be cloned along the edge 1 --> 4.
  2. For each cloned block, it will appear in the cluster info as &lt;bb_id&gt;.&lt;clone_id&gt; where clone_id is the id associated with this clone.

For example, the following profile specifies one cloned block (2) and determines its cluster position as well.

f foo
p 1 2
c 0 1 2.1 3 2 5

This patch keeps backward-compatibility (retains the behavior for old profile formats). This feature is only introduced for profile version >= 1.

Differential Revision: https://reviews.llvm.org/D158442


Patch is 22.75 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67214.diff

4 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h (+50-11)
  • (modified) llvm/lib/CodeGen/BasicBlockSections.cpp (+27-43)
  • (modified) llvm/lib/CodeGen/BasicBlockSectionsProfileReader.cpp (+83-31)
  • (modified) llvm/test/CodeGen/X86/basic-block-sections-clusters-error.ll (+28-7)
diff --git a/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h b/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
index ad26eee642ead4f..f04eab8172253d9 100644
--- a/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
+++ b/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
@@ -28,17 +28,53 @@
 
 namespace llvm {
 
-// The cluster information for a machine basic block.
-struct BBClusterInfo {
-  // Unique ID for this basic block.
+// This structure represents a unique ID for every block specified in the
+// profile.
+struct ProfileBBID {
   unsigned BBID;
+  unsigned CloneID;
+};
+
+// Provides DenseMapInfo for ProfileBBID.
+template <> struct DenseMapInfo<ProfileBBID> {
+  static inline ProfileBBID getEmptyKey() {
+    unsigned EmptyKey = DenseMapInfo<unsigned>::getEmptyKey();
+    return ProfileBBID{EmptyKey, EmptyKey};
+  }
+  static inline ProfileBBID getTombstoneKey() {
+    unsigned TombstoneKey = DenseMapInfo<unsigned>::getTombstoneKey();
+    return ProfileBBID{TombstoneKey, TombstoneKey};
+  }
+  static unsigned getHashValue(const ProfileBBID &Val) {
+    std::pair<unsigned, unsigned> PairVal =
+        std::make_pair(Val.BBID, Val.CloneID);
+    return DenseMapInfo<std::pair<unsigned, unsigned>>::getHashValue(PairVal);
+  }
+  static bool isEqual(const ProfileBBID &LHS, const ProfileBBID &RHS) {
+    return DenseMapInfo<unsigned>::isEqual(LHS.BBID, RHS.BBID) &&
+           DenseMapInfo<unsigned>::isEqual(LHS.CloneID, RHS.CloneID);
+  }
+};
+
+// This struct represents the cluster information for a machine basic block,
+// which is specifed by a unique ID.
+template <typename BBIDType> struct BBProfile {
+  // Basic block ID.
+  BBIDType BasicBlockID;
   // Cluster ID this basic block belongs to.
   unsigned ClusterID;
   // Position of basic block within the cluster.
   unsigned PositionInCluster;
 };
 
-using ProgramBBClusterInfoMapTy = StringMap<SmallVector<BBClusterInfo>>;
+// This represents the profile for one function.
+struct RawFunctionProfile {
+  // BB Cluster information specified by `ProfileBBID`s (before cloning).
+  SmallVector<BBProfile<ProfileBBID>> RawBBProfiles;
+  // Paths to clone. A path a -> b -> c -> d implies cloning b, c, and d along
+  // the edge a -> b.
+  SmallVector<SmallVector<unsigned>> ClonePaths;
+};
 
 class BasicBlockSectionsProfileReader : public ImmutablePass {
 public:
@@ -70,8 +106,8 @@ class BasicBlockSectionsProfileReader : public ImmutablePass {
   // function. If the first element is true and the second element is empty, it
   // means unique basic block sections are desired for all basic blocks of the
   // function.
-  std::pair<bool, SmallVector<BBClusterInfo>>
-  getBBClusterInfoForFunction(StringRef FuncName) const;
+  std::pair<bool, RawFunctionProfile>
+  getRawProfileForFunction(StringRef FuncName) const;
 
   // Initializes the FunctionNameToDIFilename map for the current module and
   // then reads the profile for matching functions.
@@ -91,6 +127,9 @@ class BasicBlockSectionsProfileReader : public ImmutablePass {
         inconvertibleErrorCode());
   }
 
+  // Parses a `ProfileBBID` from `S`.
+  Expected<ProfileBBID> parseProfileBBID(StringRef S) const;
+
   // Reads the basic block sections profile for functions in this module.
   Error ReadProfile();
 
@@ -113,11 +152,11 @@ class BasicBlockSectionsProfileReader : public ImmutablePass {
 
   // This encapsulates the BB cluster information for the whole program.
   //
-  // For every function name, it contains the cluster information for (all or
-  // some of) its basic blocks. The cluster information for every basic block
-  // includes its cluster ID along with the position of the basic block in that
-  // cluster.
-  ProgramBBClusterInfoMapTy ProgramBBClusterInfo;
+  // For every function name, it contains the cloning and cluster information
+  // for (all or some of) its basic blocks. The cluster information for every
+  // basic block includes its cluster ID along with the position of the basic
+  // block in that cluster.
+  StringMap<RawFunctionProfile> RawProgramProfile;
 
   // Some functions have alias names. We use this map to find the main alias
   // name for which we have mapping in ProgramBBClusterInfo.
diff --git a/llvm/lib/CodeGen/BasicBlockSections.cpp b/llvm/lib/CodeGen/BasicBlockSections.cpp
index de7c17082fa4bb9..5e61a96c696aff0 100644
--- a/llvm/lib/CodeGen/BasicBlockSections.cpp
+++ b/llvm/lib/CodeGen/BasicBlockSections.cpp
@@ -168,31 +168,6 @@ updateBranches(MachineFunction &MF,
   }
 }
 
-// This function provides the BBCluster information associated with a function.
-// Returns true if a valid association exists and false otherwise.
-bool getBBClusterInfoForFunction(
-    const MachineFunction &MF,
-    BasicBlockSectionsProfileReader *BBSectionsProfileReader,
-    DenseMap<unsigned, BBClusterInfo> &V) {
-
-  // Find the assoicated cluster information.
-  std::pair<bool, SmallVector<BBClusterInfo, 4>> P =
-      BBSectionsProfileReader->getBBClusterInfoForFunction(MF.getName());
-  if (!P.first)
-    return false;
-
-  if (P.second.empty()) {
-    // This indicates that sections are desired for all basic blocks of this
-    // function. We clear the BBClusterInfo vector to denote this.
-    V.clear();
-    return true;
-  }
-
-  for (const BBClusterInfo &BBCI : P.second)
-    V[BBCI.BBID] = BBCI;
-  return true;
-}
-
 // This function sorts basic blocks according to the cluster's information.
 // All explicitly specified clusters of basic blocks will be ordered
 // accordingly. All non-specified BBs go into a separate "Cold" section.
@@ -200,12 +175,12 @@ bool getBBClusterInfoForFunction(
 // clusters, they are moved into a single "Exception" section. Eventually,
 // clusters are ordered in increasing order of their IDs, with the "Exception"
 // and "Cold" succeeding all other clusters.
-// FuncBBClusterInfo represent the cluster information for basic blocks. It
+// BBProfilesByBBID represents the cluster information for basic blocks. It
 // maps from BBID of basic blocks to their cluster information. If this is
 // empty, it means unique sections for all basic blocks in the function.
-static void
-assignSections(MachineFunction &MF,
-               const DenseMap<unsigned, BBClusterInfo> &FuncBBClusterInfo) {
+static void assignSections(
+    MachineFunction &MF,
+    const DenseMap<unsigned, BBProfile<unsigned>> &BBProfilesByBBID) {
   assert(MF.hasBBSections() && "BB Sections is not set for function.");
   // This variable stores the section ID of the cluster containing eh_pads (if
   // all eh_pads are one cluster). If more than one cluster contain eh_pads, we
@@ -216,17 +191,17 @@ assignSections(MachineFunction &MF,
     // With the 'all' option, every basic block is placed in a unique section.
     // With the 'list' option, every basic block is placed in a section
     // associated with its cluster, unless we want individual unique sections
-    // for every basic block in this function (if FuncBBClusterInfo is empty).
+    // for every basic block in this function (if BBProfilesByBBID is empty).
     if (MF.getTarget().getBBSectionsType() == llvm::BasicBlockSection::All ||
-        FuncBBClusterInfo.empty()) {
+        BBProfilesByBBID.empty()) {
       // If unique sections are desired for all basic blocks of the function, we
       // set every basic block's section ID equal to its original position in
       // the layout (which is equal to its number). This ensures that basic
       // blocks are ordered canonically.
       MBB.setSectionID(MBB.getNumber());
     } else {
-      auto I = FuncBBClusterInfo.find(*MBB.getBBID());
-      if (I != FuncBBClusterInfo.end()) {
+      auto I = BBProfilesByBBID.find(*MBB.getBBID());
+      if (I != BBProfilesByBBID.end()) {
         MBB.setSectionID(I->second.ClusterID);
       } else {
         // BB goes into the special cold section if it is not specified in the
@@ -333,16 +308,25 @@ bool BasicBlockSections::runOnMachineFunction(MachineFunction &MF) {
     return true;
   }
 
-  BBSectionsProfileReader = &getAnalysis<BasicBlockSectionsProfileReader>();
+  DenseMap<unsigned, BBProfile<unsigned>> BBProfilesByBBID;
+  if (BBSectionsType == BasicBlockSection::List) {
+    auto [HasProfile, RawProfile] =
+        getAnalysis<BasicBlockSectionsProfileReader>().getRawProfileForFunction(
+            MF.getName());
+    if (!HasProfile)
+      return true;
+    // TODO: Apply the path cloning profile.
+    for (const BBProfile<ProfileBBID> &BBP : RawProfile.RawBBProfiles) {
+      assert(!BBP.BasicBlockID.CloneID && "Path cloning is not supported yet.");
+      BBProfilesByBBID.try_emplace(BBP.BasicBlockID.BBID,
+                                   BBProfile<unsigned>{BBP.BasicBlockID.BBID,
+                                                       BBP.ClusterID,
+                                                       BBP.PositionInCluster});
+    }
+  }
 
-  // Map from BBID of blocks to their cluster information.
-  DenseMap<unsigned, BBClusterInfo> FuncBBClusterInfo;
-  if (BBSectionsType == BasicBlockSection::List &&
-      !getBBClusterInfoForFunction(MF, BBSectionsProfileReader,
-                                   FuncBBClusterInfo))
-    return true;
   MF.setBBSectionsType(BBSectionsType);
-  assignSections(MF, FuncBBClusterInfo);
+  assignSections(MF, BBProfilesByBBID);
 
   // We make sure that the cluster including the entry basic block precedes all
   // other clusters.
@@ -376,8 +360,8 @@ bool BasicBlockSections::runOnMachineFunction(MachineFunction &MF) {
     // If the two basic block are in the same section, the order is decided by
     // their position within the section.
     if (XSectionID.Type == MBBSectionID::SectionType::Default)
-      return FuncBBClusterInfo.lookup(*X.getBBID()).PositionInCluster <
-             FuncBBClusterInfo.lookup(*Y.getBBID()).PositionInCluster;
+      return BBProfilesByBBID.lookup(*X.getBBID()).PositionInCluster <
+             BBProfilesByBBID.lookup(*Y.getBBID()).PositionInCluster;
     return X.getNumber() < Y.getNumber();
   };
 
diff --git a/llvm/lib/CodeGen/BasicBlockSectionsProfileReader.cpp b/llvm/lib/CodeGen/BasicBlockSectionsProfileReader.cpp
index ef5f1251f5324e4..a30b503c728e34c 100644
--- a/llvm/lib/CodeGen/BasicBlockSectionsProfileReader.cpp
+++ b/llvm/lib/CodeGen/BasicBlockSectionsProfileReader.cpp
@@ -13,6 +13,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/CodeGen/BasicBlockSectionsProfileReader.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
@@ -34,17 +35,36 @@ INITIALIZE_PASS(BasicBlockSectionsProfileReader, "bbsections-profile-reader",
                 "Reads and parses a basic block sections profile.", false,
                 false)
 
+Expected<ProfileBBID>
+BasicBlockSectionsProfileReader::parseProfileBBID(StringRef S) const {
+  SmallVector<StringRef, 2> Parts;
+  S.split(Parts, '.');
+  if (Parts.size() > 2)
+    return createProfileParseError(Twine("unable to parse basic block id: '") +
+                                   S + "'");
+  unsigned long long BBID;
+  if (getAsUnsignedInteger(Parts[0], 10, BBID))
+    return createProfileParseError(
+        Twine("unable to parse BB id: '" + Parts[0]) +
+        "': unsigned integer expected");
+  unsigned long long CloneID = 0;
+  if (Parts.size() > 1 && getAsUnsignedInteger(Parts[1], 10, CloneID))
+    return createProfileParseError(Twine("unable to parse clone id: '") +
+                                   Parts[1] + "': unsigned integer expected");
+  return ProfileBBID{static_cast<unsigned>(BBID),
+                     static_cast<unsigned>(CloneID)};
+}
+
 bool BasicBlockSectionsProfileReader::isFunctionHot(StringRef FuncName) const {
-  return getBBClusterInfoForFunction(FuncName).first;
+  return getRawProfileForFunction(FuncName).first;
 }
 
-std::pair<bool, SmallVector<BBClusterInfo>>
-BasicBlockSectionsProfileReader::getBBClusterInfoForFunction(
+std::pair<bool, RawFunctionProfile>
+BasicBlockSectionsProfileReader::getRawProfileForFunction(
     StringRef FuncName) const {
-  auto R = ProgramBBClusterInfo.find(getAliasName(FuncName));
-  return R != ProgramBBClusterInfo.end()
-             ? std::pair(true, R->second)
-             : std::pair(false, SmallVector<BBClusterInfo>{});
+  auto R = RawProgramProfile.find(getAliasName(FuncName));
+  return R != RawProgramProfile.end() ? std::pair(true, R->second)
+                                      : std::pair(false, RawFunctionProfile());
 }
 
 // Reads the version 1 basic block sections profile. Profile for each function
@@ -61,8 +81,24 @@ BasicBlockSectionsProfileReader::getBBClusterInfoForFunction(
 // aliases. Basic block clusters are specified by 'c' and specify the cluster of
 // basic blocks, and the internal order in which they must be placed in the same
 // section.
+// This profile can also specify cloning paths which instruct the compiler to
+// clone basic blocks along a path. The cloned blocks are then specified in the
+// cluster information.
+// The following profile lists two cloning paths (starting with 'p') for function
+// bar and places the total 11 blocks within two clusters. Each cloned block is
+// identified by its original block id, along with its clone id. A block cloned
+// multiple times (2 in this example) appears with distinct clone ids (2.1
+// and 2.2).
+// ---------------------------
+// 
+// f main
+// f bar
+// p 1 2 3
+// p 4 2 5
+// c 2 3 5 6 7
+// c 1 2.1 3.1 4 2.2 5.1
 Error BasicBlockSectionsProfileReader::ReadV1Profile() {
-  auto FI = ProgramBBClusterInfo.end();
+  auto FI = RawProgramProfile.end();
 
   // Current cluster ID corresponding to this function.
   unsigned CurrentCluster = 0;
@@ -71,7 +107,7 @@ Error BasicBlockSectionsProfileReader::ReadV1Profile() {
 
   // Temporary set to ensure every basic block ID appears once in the clusters
   // of a function.
-  SmallSet<unsigned, 4> FuncBBIDs;
+  DenseSet<ProfileBBID> FuncBBIDs;
 
   // Debug-info-based module filename for the current function. Empty string
   // means no filename.
@@ -106,7 +142,7 @@ Error BasicBlockSectionsProfileReader::ReadV1Profile() {
       if (!FunctionFound) {
         // Skip the following profile by setting the profile iterator (FI) to
         // the past-the-end element.
-        FI = ProgramBBClusterInfo.end();
+        FI = RawProgramProfile.end();
         DIFilename = "";
         continue;
       }
@@ -115,7 +151,7 @@ Error BasicBlockSectionsProfileReader::ReadV1Profile() {
 
       // Prepare for parsing clusters of this function name.
       // Start a new cluster map for this function name.
-      auto R = ProgramBBClusterInfo.try_emplace(Values.front());
+      auto R = RawProgramProfile.try_emplace(Values.front());
       // Report error when multiple profiles have been specified for the same
       // function.
       if (!R.second)
@@ -132,27 +168,44 @@ Error BasicBlockSectionsProfileReader::ReadV1Profile() {
     case 'c': // Basic block cluster specifier.
       // Skip the profile when we the profile iterator (FI) refers to the
       // past-the-end element.
-      if (FI == ProgramBBClusterInfo.end())
+      if (FI == RawProgramProfile.end())
         break;
       // Reset current cluster position.
       CurrentPosition = 0;
-      for (auto BBIDStr : Values) {
-        unsigned long long BBID;
-        if (getAsUnsignedInteger(BBIDStr, 10, BBID))
-          return createProfileParseError(Twine("unsigned integer expected: '") +
-                                         BBIDStr + "'");
-        if (!FuncBBIDs.insert(BBID).second)
+      for (auto BasicBlockIDStr : Values) {
+        auto BasicBlockID = parseProfileBBID(BasicBlockIDStr);
+        if (!BasicBlockID)
+          return BasicBlockID.takeError();
+        if (!FuncBBIDs.insert(*BasicBlockID).second)
           return createProfileParseError(
-              Twine("duplicate basic block id found '") + BBIDStr + "'");
-        if (BBID == 0 && CurrentPosition)
+              Twine("duplicate basic block id found '") + BasicBlockIDStr +
+              "'");
+
+        if (!BasicBlockID->BBID && CurrentPosition)
           return createProfileParseError(
-              "entry BB (0) does not begin a cluster");
+              "entry BB (0) does not begin a cluster.");
 
-        FI->second.emplace_back(
-            BBClusterInfo{((unsigned)BBID), CurrentCluster, CurrentPosition++});
+        FI->second.RawBBProfiles.emplace_back(BBProfile<ProfileBBID>{
+            *std::move(BasicBlockID), CurrentCluster, CurrentPosition++});
       }
       CurrentCluster++;
       continue;
+    case 'p': { // Basic block cloning path specifier.
+      SmallSet<unsigned, 5> BBsInPath;
+      FI->second.ClonePaths.push_back({});
+      for (size_t I = 0; I < Values.size(); ++I) {
+        auto BBIDStr = Values[I];
+        unsigned long long BBID = 0;
+        if (getAsUnsignedInteger(BBIDStr, 10, BBID))
+          return createProfileParseError(Twine("unsigned integer expected: '") +
+                                         BBIDStr + "'");
+        if (I != 0 && !BBsInPath.insert(BBID).second)
+          return createProfileParseError(
+              Twine("duplicate cloned block in path: '") + BBIDStr + "'");
+        FI->second.ClonePaths.back().push_back(BBID);
+      }
+      continue;
+    }
     default:
       return createProfileParseError(Twine("invalid specifier: '") +
                                      Twine(Specifier) + "'");
@@ -162,8 +215,7 @@ Error BasicBlockSectionsProfileReader::ReadV1Profile() {
 }
 
 Error BasicBlockSectionsProfileReader::ReadV0Profile() {
-  auto FI = ProgramBBClusterInfo.end();
-
+  auto FI = RawProgramProfile.end();
   // Current cluster ID corresponding to this function.
   unsigned CurrentCluster = 0;
   // Current position in the current cluster.
@@ -184,7 +236,7 @@ Error BasicBlockSectionsProfileReader::ReadV0Profile() {
     if (S.consume_front("!")) {
       // Skip the profile when we the profile iterator (FI) refers to the
       // past-the-end element.
-      if (FI == ProgramBBClusterInfo.end())
+      if (FI == RawProgramProfile.end())
         continue;
       SmallVector<StringRef, 4> BBIDs;
       S.split(BBIDs, ' ');
@@ -202,8 +254,8 @@ Error BasicBlockSectionsProfileReader::ReadV0Profile() {
           return createProfileParseError(
               "entry BB (0) does not begin a cluster");
 
-        FI->second.emplace_back(
-            BBClusterInfo{((unsigned)BBID), CurrentCluster, CurrentPosition++});
+        FI->second.RawBBProfiles.emplace_back(
+            BBProfile<ProfileBBID>({{static_cast<unsigned>(BBID), 0}, CurrentCluster, CurrentPosition++}));
       }
       CurrentCluster++;
     } else {
@@ -237,7 +289,7 @@ Error BasicBlockSectionsProfileReader::ReadV0Profile() {
       if (!FunctionFound) {
         // Skip the following profile by setting the profile iterator (FI) to
         // the past-the-end element.
-        FI = ProgramBBClusterInfo.end();
+        FI = RawProgramProfile.end();
         continue;
       }
       for (size_t i = 1; i < Aliases.size(); ++i)
@@ -245,7 +297,7 @@ Error BasicBlockSectionsProfileReader::ReadV0Profile() {
 
       // Prepare for parsing clusters of this function name.
       // Start a new cluster map for this function name.
-      auto R = ProgramBBClusterInfo.try_emplace(Aliases.front());
+      auto R = RawProgramProfile.try_emplace(Aliases.front());
       // Report error when multiple profiles have been specified for the same
       // function.
       if (!R.second)
@@ -261,7 +313,7 @@ Error BasicBlockSectionsProfileReader::ReadV0Profile() {
 
 // Basic Block Sections can be enabled for a subset of machine basic blocks.
 // This is done by passing a file containing names of functions for which basic
-// block sections are desired.  Additionally, machine basic block ids of the
+// block sections are desired. Additionally, machine basic block ids of the
 // functions can also be specified for a finer granularity. Moreover, a cluster
 // of basic blocks could be assigned to the same section.
 // Optionally, a debug-info filename can be specified for each function to allow
diff --git a/llvm/test/CodeGen/X86/basic-block-sections-clusters-error.ll b/llvm/test/CodeGen/X86/basic-block-sections-clusters-error.ll
index 5577601c02cfd79..597d8f6707ec...
[truncated]

@github-actions
Copy link

github-actions bot commented Sep 23, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@rlavaee rlavaee requested review from a team as code owners September 28, 2023 21:38
@rlavaee rlavaee removed request for a team September 28, 2023 21:43
@rlavaee rlavaee changed the title [BasicBlockSections] Introduce the path cloning profile format to Bas… [BasicBlockSections] Introduce the path cloning profile format to BasicBlockSectionsProfileReader. Sep 28, 2023
@rlavaee rlavaee force-pushed the cloning_profile branch 2 times, most recently from c059d00 to 508159e Compare October 4, 2023 21:26
@rlavaee
Copy link
Contributor Author

rlavaee commented Oct 9, 2023

ping @tmsri

Copy link
Member

@tmsri tmsri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one major comment I have is that I am not convinced about the way you represent cloning decisions. I think a block cloning decision can be represented by exactly specifying the edge along which you want to clone. This keeps the specification really simple. I would like your opinion on this one. Further, whatever spec you go with, you must give more clear instructions on what constitutes a valid cloning directive and what isn't. I am inclined to choose a spec that is simple.

@rlavaee
Copy link
Contributor Author

rlavaee commented Oct 11, 2023

The one major comment I have is that I am not convinced about the way you represent cloning decisions. I think a block cloning decision can be represented by exactly specifying the edge along which you want to clone. This keeps the specification really simple. I would like your opinion on this one. Further, whatever spec you go with, you must give more clear instructions on what constitutes a valid cloning directive and what isn't. I am inclined to choose a spec that is simple.

Expanded the function comment a bit more. The next patch (applying the cloning decisions) will implement the validation logic for path cloning.

@rlavaee rlavaee closed this Oct 11, 2023
@rlavaee rlavaee reopened this Oct 11, 2023
@rlavaee
Copy link
Contributor Author

rlavaee commented Oct 11, 2023

Thanks for the review.

// f main
// f bar
// p 1 3 4 # cloning path 1
// p 4 2 # cloning path 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I confused 'c' which is clusters from 'p' which is cloning paths. I am convinced that 'p' is quite optimal and captures cloning decisions correctly. I will take back my original comment on this one.

…icBlockSectionsProfileReader.

Following up on prior RFC (https://lists.llvm.org/pipermail/llvm-dev/2020-September/145357.html) we can now improve above our highly-optimized basic-block-sections binary (e.g., 2% for clang) by applying path cloning. Cloning can improve performance by reducing taken branches.

This patch prepares the profile format for applying cloning actions.

The basic block cloning profile format extends the basic block sections profile in two ways.

  1. Specifies the cloning paths with a 'p' specifier. For example, `p 1 4 5` specifies that blocks with BB ids 4 and 5 must be cloned along the edge 1 --> 4.
  2. For each cloned block, it will appear in the cluster info as `<bb_id>.<clone_id>` where `clone_id` is the id associated with this clone.

For example, the following profile specifies one cloned block (2) and determines its cluster position as well.
```
f foo
p 1 2
c 0 1 2.1 3 2 5
```

This patch keeps backward-compatibility (retains the behavior for old profile formats). This feature is only introduced for profile version >= 1.
Changing variable names and comments.
@rlavaee rlavaee merged commit 28b9126 into llvm:main Oct 12, 2023
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants