Adding Matching and Inference Functionality to Propeller #160706
base: main
Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-llvm-mc
@llvm/pr-subscribers-pgo

Author: None (wdx727)

Changes

We have optimized the implementation that introduces the "matching and inference" technique into Propeller. In this new implementation, we have made every effort to avoid introducing new compilation parameters while staying compatible with Propeller's current usage. Instead of creating a new profile format, we reuse the existing one employed by Propeller. The new implementation is fully compatible with Propeller's current usage patterns and reduces the amount of code changes. For detailed information, please refer to the following RFC: https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238

Patch is 59.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/160706.diff

24 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/BasicBlockMatchingAndInference.h b/llvm/include/llvm/CodeGen/BasicBlockMatchingAndInference.h
new file mode 100644
index 0000000000000..66209d7685ecc
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/BasicBlockMatchingAndInference.h
@@ -0,0 +1,50 @@
+#ifndef LLVM_CODEGEN_BASIC_BLOCK_AND_INFERENCE_H
+#define LLVM_CODEGEN_BASIC_BLOCK_AND_INFERENCE_H
+
+#include "llvm/CodeGen/BasicBlockSectionsProfileReader.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/Transforms/Utils/SampleProfileInference.h"
+
+namespace llvm {
+
+class BasicBlockMatchingAndInference : public MachineFunctionPass {
+private:
+ using Edge = std::pair<const MachineBasicBlock *, const MachineBasicBlock *>;
+ using BlockWeightMap = DenseMap<const MachineBasicBlock *, uint64_t>;
+ using EdgeWeightMap = DenseMap<Edge, uint64_t>;
+ using BlockEdgeMap = DenseMap<const MachineBasicBlock *,
+ SmallVector<const MachineBasicBlock *, 8>>;
+
+ struct WeightInfo {
+ // Weight of basic blocks.
+ BlockWeightMap BlockWeights;
+ // Weight of edges.
+ EdgeWeightMap EdgeWeights;
+ };
+
+public:
+ static char ID;
+ BasicBlockMatchingAndInference();
+
+ StringRef getPassName() const override {
+ return "Basic Block Matching and Inference";
+ }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+
+ bool runOnMachineFunction(MachineFunction &F) override;
+
+ std::optional<WeightInfo> getWeightInfo(StringRef FuncName) const;
+
+private:
+ StringMap<WeightInfo> ProgramWeightInfo;
+
+ WeightInfo initWeightInfoByMatching(MachineFunction &MF);
+
+ void generateWeightInfoByInference(MachineFunction &MF,
+ WeightInfo &MatchWeight);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_CODEGEN_BASIC_BLOCK_AND_INFERENCE_H
diff --git a/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h b/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
index 08e6a0e3ef629..a27b921fb1205 100644
--- a/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
+++ b/llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
@@ -31,6 +31,22 @@
namespace llvm {
+using Edge = std::pair<uint64_t, uint64_t>;
+using BlockWeightMap = DenseMap<uint64_t, uint64_t>;
+using EdgeWeightMap = DenseMap<Edge, uint64_t>;
+using BlockHashMap = DenseMap<uint64_t, uint64_t>;
+
+// This represents the weights of basic blocks and edges, and the hashes of
+// basic blocks for one function.
+struct WeightAndHashInfo {
+ // Weight of basic blocks.
+ BlockWeightMap BlockWeights;
+ // Weight of edges.
+ EdgeWeightMap EdgeWeights;
+ // Hashes of basic blocks.
+ BlockHashMap BlockHashes;
+};
+
// This struct represents the cluster information for a machine basic block,
// which is specifed by a unique ID (`MachineBasicBlock::BBID`).
struct BBClusterInfo {
@@ -98,6 +114,10 @@ class BasicBlockSectionsProfileReader {
SmallVector<SmallVector<unsigned>>
getClonePathsForFunction(StringRef FuncName) const;
+ // Returns the weight and hash info for the given function.
+ std::pair<bool, WeightAndHashInfo>
+ getWeightAndHashInfoForFunction(StringRef FuncName) const;
+
private:
StringRef getAliasName(StringRef FuncName) const {
auto R = FuncAliasMap.find(FuncName);
@@ -118,6 +138,16 @@ class BasicBlockSectionsProfileReader {
// positive integer.
Expected<UniqueBBID> parseUniqueBBID(StringRef S) const;
+ // Parses the weights of basic blocks and edges.
+ Error parseWight(StringRef S, BlockWeightMap &BlockWeights,
+ EdgeWeightMap &EdgeWeights);
+
+ // Parses the hash of basic block.
+ Error parseBBHash(StringRef S, BlockHashMap &BlockHashes);
+
+ // Parse a pair in the form of "xxx:xxx"
+ Expected<std::pair<uint64_t, uint64_t>> parsePairItem(StringRef S) const;
+
// Reads the basic block sections profile for functions in this module.
Error ReadProfile();
@@ -146,6 +176,10 @@ class BasicBlockSectionsProfileReader {
// block in that cluster.
StringMap<FunctionPathAndClusterInfo> ProgramPathAndClusterInfo;
+ // This contains the weights of basic blocks and edges, and the hashes of
+ // basic blocks of the whole program.
+ StringMap<WeightAndHashInfo> ProgramWeightAndHashInfo;
+
// Some functions have alias names. We use this map to find the main alias
// name which appears in ProgramPathAndClusterInfo as a key.
StringMap<StringRef> FuncAliasMap;
@@ -204,6 +238,9 @@ class BasicBlockSectionsProfileReaderWrapperPass : public ImmutablePass {
SmallVector<SmallVector<unsigned>>
getClonePathsForFunction(StringRef FuncName) const;
+ std::pair<bool, WeightAndHashInfo>
+ getWeightAndHashInfoForFunction(StringRef FuncName) const;
+
// Initializes the FunctionNameToDIFilename map for the current module and
// then reads the profile for the matching functions.
bool doInitialization(Module &M) override;
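The header above only declares `parsePairItem` and documents its input as a pair in the form "xxx:xxx"; the definition is not part of the truncated patch. A minimal sketch of such a parser follows, assuming both halves are unsigned decimal integers. The names `parseU64Sketch` and `parsePairSketch` are hypothetical, and the sketch uses the standard library rather than LLVM's `StringRef` and `Expected`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper: parses an unsigned decimal integer, rejecting empty
// input, non-digit characters, and values that do not fit in 64 bits.
static bool parseU64Sketch(const std::string &S, uint64_t &Out) {
  if (S.empty())
    return false;
  uint64_t V = 0;
  for (char C : S) {
    if (C < '0' || C > '9')
      return false;
    uint64_t Digit = static_cast<uint64_t>(C - '0');
    if (V > (UINT64_MAX - Digit) / 10)
      return false; // V * 10 + Digit would overflow.
    V = V * 10 + Digit;
  }
  Out = V;
  return true;
}

// Hypothetical stand-in for a "first:second" pair parser. Returns false on
// malformed input instead of producing an llvm::Error.
static bool parsePairSketch(const std::string &S,
                            std::pair<uint64_t, uint64_t> &Out) {
  std::string::size_type Colon = S.find(':');
  if (Colon == std::string::npos)
    return false;
  uint64_t First = 0, Second = 0;
  if (!parseU64Sketch(S.substr(0, Colon), First) ||
      !parseU64Sketch(S.substr(Colon + 1), Second))
    return false;
  Out = std::make_pair(First, Second);
  return true;
}
```

Whether the real profile stores hashes in decimal or hexadecimal is not visible in the truncated diff, so the decimal assumption here is for illustration only.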
diff --git a/llvm/include/llvm/CodeGen/MachineBlockHashInfo.h b/llvm/include/llvm/CodeGen/MachineBlockHashInfo.h
new file mode 100644
index 0000000000000..5de1b567e0309
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/MachineBlockHashInfo.h
@@ -0,0 +1,106 @@
+#ifndef LLVM_CODEGEN_MACHINEBLOCKHASHINFO_H
+#define LLVM_CODEGEN_MACHINEBLOCKHASHINFO_H
+
+#include "llvm/CodeGen/MachineFunctionPass.h"
+
+namespace llvm {
+
+/// An object wrapping several components of a basic block hash. The combined
+/// (blended) hash is represented and stored as one uint64_t, while individual
+/// components are of smaller size (e.g., uint16_t or uint8_t).
+struct BlendedBlockHash {
+private:
+ static uint64_t combineHashes(uint16_t Hash1, uint16_t Hash2, uint16_t Hash3,
+ uint16_t Hash4) {
+ uint64_t Hash = 0;
+
+ Hash |= uint64_t(Hash4);
+ Hash <<= 16;
+
+ Hash |= uint64_t(Hash3);
+ Hash <<= 16;
+
+ Hash |= uint64_t(Hash2);
+ Hash <<= 16;
+
+ Hash |= uint64_t(Hash1);
+
+ return Hash;
+ }
+
+ static void parseHashes(uint64_t Hash, uint16_t &Hash1, uint16_t &Hash2,
+ uint16_t &Hash3, uint16_t &Hash4) {
+ Hash1 = Hash & 0xffff;
+ Hash >>= 16;
+
+ Hash2 = Hash & 0xffff;
+ Hash >>= 16;
+
+ Hash3 = Hash & 0xffff;
+ Hash >>= 16;
+
+ Hash4 = Hash & 0xffff;
+ Hash >>= 16;
+ }
+
+public:
+ explicit BlendedBlockHash() {}
+
+ explicit BlendedBlockHash(uint64_t CombinedHash) {
+ parseHashes(CombinedHash, Offset, OpcodeHash, InstrHash, NeighborHash);
+ }
+
+ /// Combine the blended hash into uint64_t.
+ uint64_t combine() const {
+ return combineHashes(Offset, OpcodeHash, InstrHash, NeighborHash);
+ }
+
+ /// Compute a distance between two given blended hashes. The smaller the
+ /// distance, the more similar two blocks are. For identical basic blocks,
+ /// the distance is zero.
+ uint64_t distance(const BlendedBlockHash &BBH) const {
+ assert(OpcodeHash == BBH.OpcodeHash &&
+ "incorrect blended hash distance computation");
+ uint64_t Dist = 0;
+ // Account for NeighborHash
+ Dist += NeighborHash == BBH.NeighborHash ? 0 : 1;
+ Dist <<= 16;
+ // Account for InstrHash
+ Dist += InstrHash == BBH.InstrHash ? 0 : 1;
+ Dist <<= 16;
+ // Account for Offset
+ Dist += (Offset >= BBH.Offset ? Offset - BBH.Offset : BBH.Offset - Offset);
+ return Dist;
+ }
+
+ /// The offset of the basic block from the function start.
+ uint16_t Offset{0};
+ /// (Loose) Hash of the basic block instructions, excluding operands.
+ uint16_t OpcodeHash{0};
+ /// (Strong) Hash of the basic block instructions, including opcodes and
+ /// operands.
+ uint16_t InstrHash{0};
+ /// Hash of the (loose) basic block together with (loose) hashes of its
+ /// successors and predecessors.
+ uint16_t NeighborHash{0};
+};
+
+class MachineBlockHashInfo : public MachineFunctionPass {
+ DenseMap<unsigned, uint64_t> MBBHashInfo;
+
+public:
+ static char ID;
+ MachineBlockHashInfo();
+
+ StringRef getPassName() const override { return "Basic Block Hash Compute"; }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+
+ bool runOnMachineFunction(MachineFunction &F) override;
+
+ uint64_t getMBBHash(const MachineBasicBlock &MBB);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_CODEGEN_MACHINEBLOCKHASHINFO_H
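To make the bit layout of `BlendedBlockHash` easier to follow, here is a standalone sketch that mirrors the packing and distance logic from the header above. `BlendedHashSketch` is a hypothetical simplified stand-in, not the class from the patch; only the field names and the arithmetic are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified copy of the blended hash, for illustration only.
struct BlendedHashSketch {
  uint16_t Offset = 0;
  uint16_t OpcodeHash = 0;
  uint16_t InstrHash = 0;
  uint16_t NeighborHash = 0;

  // Layout: Offset in bits 0-15, OpcodeHash in 16-31, InstrHash in 32-47,
  // NeighborHash in 48-63.
  uint64_t combine() const {
    return uint64_t(Offset) | (uint64_t(OpcodeHash) << 16) |
           (uint64_t(InstrHash) << 32) | (uint64_t(NeighborHash) << 48);
  }

  // Inverse of combine().
  static BlendedHashSketch parse(uint64_t H) {
    BlendedHashSketch B;
    B.Offset = H & 0xffff;
    B.OpcodeHash = (H >> 16) & 0xffff;
    B.InstrHash = (H >> 32) & 0xffff;
    B.NeighborHash = (H >> 48) & 0xffff;
    return B;
  }

  // A NeighborHash mismatch costs 2^32, an InstrHash mismatch costs 2^16,
  // and the offset difference (always below 2^16) acts as the tie-breaker.
  uint64_t distance(const BlendedHashSketch &O) const {
    assert(OpcodeHash == O.OpcodeHash && "only compare within one bucket");
    uint64_t D = NeighborHash == O.NeighborHash ? 0 : 1;
    D <<= 16;
    D += InstrHash == O.InstrHash ? 0 : 1;
    D <<= 16;
    D += Offset >= O.Offset ? Offset - O.Offset : O.Offset - Offset;
    return D;
  }
};
```

Because each component occupies its own 16-bit lane of the distance, comparing two distances is effectively a lexicographic comparison: neighbor agreement first, then exact instruction agreement, then proximity of offsets.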
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index d214ab9306c2f..063dd43e80638 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -67,6 +67,13 @@ namespace llvm {
MachineFunctionPass *createBasicBlockPathCloningPass();
+ /// createBasicBlockMatchingAndInferencePass - This pass enables matching
+ /// and inference when using propeller.
+ MachineFunctionPass *createBasicBlockMatchingAndInferencePass();
+
+ /// createMachineBlockHashInfoPass - This pass computes basic block hashes.
+ MachineFunctionPass *createMachineBlockHashInfoPass();
+
/// createMachineFunctionSplitterPass - This pass splits machine functions
/// using profile information.
MachineFunctionPass *createMachineFunctionSplitterPass();
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 1ce36a95317b4..3172b135426f6 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -53,6 +53,7 @@ void initializeAlwaysInlinerLegacyPassPass(PassRegistry &);
void initializeAssignmentTrackingAnalysisPass(PassRegistry &);
void initializeAssumptionCacheTrackerPass(PassRegistry &);
void initializeAtomicExpandLegacyPass(PassRegistry &);
+void initializeBasicBlockMatchingAndInferencePass(PassRegistry &);
void initializeBasicBlockPathCloningPass(PassRegistry &);
void initializeBasicBlockSectionsProfileReaderWrapperPassPass(PassRegistry &);
void initializeBasicBlockSectionsPass(PassRegistry &);
@@ -185,6 +186,7 @@ void initializeMIRCanonicalizerPass(PassRegistry &);
void initializeMIRNamerPass(PassRegistry &);
void initializeMIRPrintingPassPass(PassRegistry &);
void initializeMachineBlockFrequencyInfoWrapperPassPass(PassRegistry &);
+void initializeMachineBlockHashInfoPass(PassRegistry &);
void initializeMachineBlockPlacementLegacyPass(PassRegistry &);
void initializeMachineBlockPlacementStatsLegacyPass(PassRegistry &);
void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &);
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 87e4dbe448091..bbf07d87bb318 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -831,6 +831,7 @@ struct BBAddrMap {
bool BrProb : 1;
bool MultiBBRange : 1;
bool OmitBBEntries : 1;
+ bool BBHash : 1;
bool hasPGOAnalysis() const { return FuncEntryCount || BBFreq || BrProb; }
@@ -842,7 +843,8 @@ struct BBAddrMap {
(static_cast<uint8_t>(BBFreq) << 1) |
(static_cast<uint8_t>(BrProb) << 2) |
(static_cast<uint8_t>(MultiBBRange) << 3) |
- (static_cast<uint8_t>(OmitBBEntries) << 4);
+ (static_cast<uint8_t>(OmitBBEntries) << 4) |
+ (static_cast<uint8_t>(BBHash) << 5);
}
// Decodes from minimum bit width representation and validates no
@@ -851,7 +853,7 @@ struct BBAddrMap {
Features Feat{
static_cast<bool>(Val & (1 << 0)), static_cast<bool>(Val & (1 << 1)),
static_cast<bool>(Val & (1 << 2)), static_cast<bool>(Val & (1 << 3)),
- static_cast<bool>(Val & (1 << 4))};
+ static_cast<bool>(Val & (1 << 4)), static_cast<bool>(Val & (1 << 5))};
if (Feat.encode() != Val)
return createStringError(
std::error_code(), "invalid encoding for BBAddrMap::Features: 0x%x",
@@ -861,9 +863,9 @@ struct BBAddrMap {
bool operator==(const Features &Other) const {
return std::tie(FuncEntryCount, BBFreq, BrProb, MultiBBRange,
- OmitBBEntries) ==
+ OmitBBEntries, BBHash) ==
std::tie(Other.FuncEntryCount, Other.BBFreq, Other.BrProb,
- Other.MultiBBRange, Other.OmitBBEntries);
+ Other.MultiBBRange, Other.OmitBBEntries, Other.BBHash);
}
};
@@ -914,9 +916,10 @@ struct BBAddrMap {
uint32_t Size = 0; // Size of the basic block.
Metadata MD = {false, false, false, false,
false}; // Metdata for this basic block.
+ uint64_t Hash = 0; // Hash for this basic block.
- BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD)
- : ID(ID), Offset(Offset), Size(Size), MD(MD){};
+ BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD, uint64_t Hash = 0)
+ : ID(ID), Offset(Offset), Size(Size), MD(MD), Hash(Hash){};
bool operator==(const BBEntry &Other) const {
return ID == Other.ID && Offset == Other.Offset && Size == Other.Size &&
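The `Features` change above assigns bit 5 of the feature byte to `BBHash` and relies on an encode/decode round trip to reject unknown bits. The following standalone sketch illustrates that scheme; `FeaturesSketch` and `decodeFeaturesSketch` are hypothetical names, and the real code returns an `Expected<Features>` rather than a `bool`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified copy of BBAddrMap::Features with the new bit.
struct FeaturesSketch {
  bool FuncEntryCount;
  bool BBFreq;
  bool BrProb;
  bool MultiBBRange;
  bool OmitBBEntries;
  bool BBHash; // New in this patch: bit 5.

  uint8_t encode() const {
    return static_cast<uint8_t>(
        (FuncEntryCount << 0) | (BBFreq << 1) | (BrProb << 2) |
        (MultiBBRange << 3) | (OmitBBEntries << 4) | (BBHash << 5));
  }
};

// Decodes a feature byte and rejects it if re-encoding does not reproduce
// the input, which happens exactly when an unknown bit (6 or 7) is set.
static bool decodeFeaturesSketch(uint8_t Val, FeaturesSketch &Out) {
  FeaturesSketch F{(Val & (1 << 0)) != 0, (Val & (1 << 1)) != 0,
                   (Val & (1 << 2)) != 0, (Val & (1 << 3)) != 0,
                   (Val & (1 << 4)) != 0, (Val & (1 << 5)) != 0};
  if (F.encode() != Val)
    return false;
  Out = F;
  return true;
}
```

One consequence of this validation style is that a reader built before the patch would treat a feature byte with bit 5 set as an invalid encoding, so readers and writers need to agree on the format.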
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index dfdfa055d65fa..9427042db4303 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -162,6 +162,7 @@ struct BBAddrMapEntry {
llvm::yaml::Hex64 AddressOffset;
llvm::yaml::Hex64 Size;
llvm::yaml::Hex64 Metadata;
+ llvm::yaml::Hex64 Hash;
};
uint8_t Version;
llvm::yaml::Hex8 Feature;
diff --git a/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h b/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
index 7231e45fe8eb7..2b4db171bfdfb 100644
--- a/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
+++ b/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
@@ -130,6 +130,11 @@ template <typename FT> class SampleProfileInference {
SampleProfileInference(FunctionT &F, BlockEdgeMap &Successors,
BlockWeightMap &SampleBlockWeights)
: F(F), Successors(Successors), SampleBlockWeights(SampleBlockWeights) {}
+ SampleProfileInference(FunctionT &F, BlockEdgeMap &Successors,
+ BlockWeightMap &SampleBlockWeights,
+ EdgeWeightMap &SampleEdgeWeights)
+ : F(F), Successors(Successors), SampleBlockWeights(SampleBlockWeights),
+ SampleEdgeWeights(SampleEdgeWeights) {}
/// Apply the profile inference algorithm for a given function
void apply(BlockWeightMap &BlockWeights, EdgeWeightMap &EdgeWeights);
@@ -157,6 +162,9 @@ template <typename FT> class SampleProfileInference {
/// Map basic blocks to their sampled weights.
BlockWeightMap &SampleBlockWeights;
+
+ /// Map edges to their sampled weights.
+ EdgeWeightMap SampleEdgeWeights;
};
template <typename BT>
@@ -266,6 +274,14 @@ FlowFunction SampleProfileInference<BT>::createFlowFunction(
FlowJump Jump;
Jump.Source = BlockIndex[BB];
Jump.Target = BlockIndex[Succ];
+ auto It = SampleEdgeWeights.find(std::make_pair(BB, Succ));
+ if (It != SampleEdgeWeights.end()) {
+ Jump.HasUnknownWeight = false;
+ Jump.Weight = It->second;
+ } else {
+ Jump.HasUnknownWeight = true;
+ Jump.Weight = 0;
+ }
Func.Jumps.push_back(Jump);
}
}
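The `createFlowFunction` hunk above marks a jump as having a known weight only when its edge appears in `SampleEdgeWeights`. A minimal sketch of that lookup follows, using integer block IDs and `std::map` in place of the LLVM containers; `JumpSketch` and `buildJumpsSketch` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical simplified jump record, loosely modeled on FlowJump.
struct JumpSketch {
  int Source;
  int Target;
  uint64_t Weight;
  bool HasUnknownWeight;
};

// For every CFG edge, use the sampled weight when one exists; otherwise
// record weight 0 and flag it as unknown so inference is free to choose it.
static std::vector<JumpSketch> buildJumpsSketch(
    const std::vector<std::pair<int, int>> &Edges,
    const std::map<std::pair<int, int>, uint64_t> &SampleEdgeWeights) {
  std::vector<JumpSketch> Jumps;
  for (const auto &E : Edges) {
    JumpSketch J;
    J.Source = E.first;
    J.Target = E.second;
    auto It = SampleEdgeWeights.find(E);
    if (It != SampleEdgeWeights.end()) {
      J.Weight = It->second;
      J.HasUnknownWeight = false;
    } else {
      J.Weight = 0;
      J.HasUnknownWeight = true;
    }
    Jumps.push_back(J);
  }
  return Jumps;
}
```

Note that with the existing three-argument constructor the edge map is empty, so every jump ends up flagged as unknown in this scheme.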
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index bdcd54a135da9..41c084a4e4e49 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -40,6 +40,7 @@
#include "llvm/CodeGen/GCMetadataPrinter.h"
#include "llvm/CodeGen/LazyMachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBlockHashInfo.h"
#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
#include "llvm/CodeGen/MachineConstantPool.h"
#include "llvm/CodeGen/MachineDominators.h"
@@ -180,6 +181,8 @@ static cl::opt<bool> PrintLatency(
cl::desc("Print instruction latencies as verbose asm comments"), cl::Hidden,
cl::init(false));
+extern cl::opt<bool> EmitBBHash;
+
STATISTIC(EmittedInsts, "Number of machine instrs printed");
char AsmPrinter::ID = 0;
@@ -454,6 +457,8 @@ void AsmPrinter::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<GCModuleInfo>();
AU.addRequired<LazyMachineBlockFrequencyInfoPass>();
AU.addRequired<MachineBranchProbabilityInfoWrapperPass>();
+ if (EmitBBHash)
+ AU.addRequired<MachineBlockHashInfo>();
}
bool AsmPrinter::doInitialization(Module &M) {
@@ -1419,7 +1424,8 @@ getBBAddrMapFeature(const MachineFunction &MF, int NumMBBSectionRanges) {
}
return {FuncEntryCountEnabled, BBFreqEnabled, BrProbEnabled,
MF.hasBBSections() && NumMBBSectionRanges > 1,
- static_cast<bool>(BBAddrMapSkipEmitBBEntries)};
+ static_cast<bool>(BBAddrMapSkipEmitBBEntries),
+ static_cast<bool>(EmitBBHash)};
}
void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
@@ -1477,6 +1483,8 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
PrevMBBEndSymbol = MBBSymbol;
}
+ auto MBHI = Features.BBHash ? &getAnalysis<MachineBlockHashInfo>() : nullptr;
+
if (!Features.OmitBBEntries) {
// TODO: Remove this check when version 1 is deprecated.
if (BBAddrMapVersion > 1) {
@@ -1496,6 +1504,10 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
emitLabelDifferenceAsULEB128(MBB.getEndSymbol(), MBBSymbol);
// Emit the Metadata.
OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
+ // Emit the Hash.
+ if (MBHI) {
+ OutStreamer->emitULEB128IntValue(MBHI->getMBBHash(MBB));
+ }
}
PrevMBBEndSymbol = MBB.getEndSymbol();
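The hunk above emits each block hash with `emitULEB128IntValue`. As a rough guide to the size cost, the sketch below implements the standard ULEB128 encoding by hand; `encodeULEB128Sketch` and `decodeULEB128Sketch` are hypothetical helper names, not the LLVM functions. Since ULEB128 stores 7 payload bits per byte, a 64-bit value whose top bit is set takes 10 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes V as ULEB128: 7 payload bits per byte, least significant group
// first, with the high bit set on every byte except the last.
static std::vector<uint8_t> encodeULEB128Sketch(uint64_t V) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(Byte);
  } while (V != 0);
  return Out;
}

// Decodes a ULEB128 value from the start of In. Stops at the first byte
// without the continuation bit, or once 64 bits have been consumed.
static uint64_t decodeULEB128Sketch(const std::vector<uint8_t> &In) {
  uint64_t V = 0;
  unsigned Shift = 0;
  for (uint8_t B : In) {
    if (Shift >= 64)
      break;
    V |= uint64_t(B & 0x7f) << Shift;
    Shift += 7;
    if ((B & 0x80) == 0)
      break;
  }
  return V;
}
```

Blended hashes are unlikely to have many leading zero bits, so under this encoding most of them would presumably occupy 9 or 10 bytes per basic block.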
diff --git a/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp b/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp
new file mode 100644
index 0000000000000..e2776162043ff
--- /dev/null
+++ b/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp
@@ -0,0 +1,168 @@
+#include "llvm/CodeGen/BasicBlockMatchingAndInference.h"
+#include "llvm/CodeGen/BasicBlockSectionsProfileReader.h"
+#include "llvm/CodeGen/MachineBlockHashInfo.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/InitializePasses.h"
+#include <llvm/Support/CommandLine.h>
+
+using namespace llvm;
+
+static cl::opt<float>
+ PropellerInferThreshold("propeller-infer-threshold",
+ cl::desc("Threshold for infer stale profile"),
+ cl::init(0.6), cl::Optional);
+
+/// The object is used to identify and match basic blocks given their hashes.
+class StaleMatcher {
+public:
+ /// Initialize stale matcher.
+ void init(const std::vector<MachineBasicBlock *> &Blocks,
+ const std::vector<BlendedBlockHash> &Hashes) {
+ assert(Blocks.size() == Hashes.size() &&
+ "incorrect matcher initialization");
+ for (size_t I = 0; I < Blocks.size(); I++) {
+ MachineBasicBlock *Block = Blocks[I];
+ uint16_t OpHash = Hashes[I].OpcodeHash;
+ OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+ }
+ }
+
+ /// Find the most similar block for a given hash.
+ MachineBasicBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+ auto BlockIt = OpHashToBlocks.find(BlendedHash.OpcodeHash);
+ if (BlockIt == OpHashToBlocks.end()) {
+ return nullptr;
+ }
+ MachineBasicBlock *BestBlock = nullptr;
+ uint64_t BestDist = std::numeric_limits<uint64_t>::max();
+ for (auto It : BlockIt->second) {
+ MachineBasicBlock *Block = It.second;
+ BlendedBlockHash Hash = It.first;
+ uint64_t Dist = Hash.distance(BlendedHash);
+ if (BestBlock == nullptr || Dist < BestDist) {
+ BestDist = Dist;
+ BestBlock = Block;
+ }
+ }
+ return BestBlock;
+ }
+
+private:
+ using HashBlockPairType = std::pair<BlendedBlockHash, MachineBasicBlock *>;
+ std::unordered_map<uint16_t, std::vector<HashBlockPairType>> OpHashToBlocks;
+};
+
+INITIALIZE_PASS_BEGIN(BasicBlockMatchingAndInference,
+ "machine-block-match-infer",
+ "Machine Block Matching and Inference Analysis", true,
+ true)
+INITIALIZE_PASS_DEPENDENCY(MachineBlockHashInfo)
+INITIALIZE_PASS_DEPENDENCY(BasicBlockSectionsProfileReaderWrapperPass)
+INITIALIZE_PASS_END(BasicBlockMatchingAndInference, "machine-block-match-infer",
+ "Machine Block Matching and Inference Analysis", true, true)
+
+char BasicBlockMatch...
[truncated]
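The `StaleMatcher` shown above buckets candidate blocks by `OpcodeHash` and then picks the candidate with the smallest blended distance. A standalone sketch of that two-step lookup follows, using integer block IDs instead of `MachineBasicBlock *`. `MiniHash`, `miniDistance`, and `MatcherSketch` are hypothetical names, and the distance function repeats the arithmetic from `BlendedBlockHash::distance`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical simplified hash record with the four 16-bit components.
struct MiniHash {
  uint16_t Offset;
  uint16_t OpcodeHash;
  uint16_t InstrHash;
  uint16_t NeighborHash;
};

// Same ordering as in the patch: neighbor mismatch outweighs instruction
// mismatch, which outweighs any offset difference.
static uint64_t miniDistance(const MiniHash &A, const MiniHash &B) {
  uint64_t D = A.NeighborHash == B.NeighborHash ? 0 : 1;
  D <<= 16;
  D += A.InstrHash == B.InstrHash ? 0 : 1;
  D <<= 16;
  D += A.Offset >= B.Offset ? A.Offset - B.Offset : B.Offset - A.Offset;
  return D;
}

class MatcherSketch {
public:
  // Registers a block of the current function under its opcode hash.
  void add(int BlockId, const MiniHash &H) {
    Buckets[H.OpcodeHash].push_back(std::make_pair(H, BlockId));
  }

  // Returns the closest block with the same opcode hash, or -1 if the
  // bucket is empty, i.e. no block has the same opcode sequence hash.
  int match(const MiniHash &Query) const {
    auto It = Buckets.find(Query.OpcodeHash);
    if (It == Buckets.end())
      return -1;
    int Best = -1;
    uint64_t BestDist = std::numeric_limits<uint64_t>::max();
    for (const auto &Entry : It->second) {
      uint64_t D = miniDistance(Entry.first, Query);
      if (Best == -1 || D < BestDist) {
        BestDist = D;
        Best = Entry.second;
      }
    }
    return Best;
  }

private:
  std::unordered_map<uint16_t, std::vector<std::pair<MiniHash, int>>> Buckets;
};
```

In this scheme a profiled block whose opcode hash changed between builds is never matched, while a block that only moved or had operands changed can still be matched to its nearest counterpart in the same bucket.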
@llvm/pr-subscribers-llvm-binary-utilities

Author: None (wdx727)

The description and truncated patch in this notification are identical to those in the notification for @llvm/pr-subscribers-llvm-mc and @llvm/pr-subscribers-pgo.
bool operator==(const BBEntry &Other) const {
return ID == Other.ID && Offset == Other.Offset && Size == Other.Size &&
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index dfdfa055d65fa..9427042db4303 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -162,6 +162,7 @@ struct BBAddrMapEntry {
llvm::yaml::Hex64 AddressOffset;
llvm::yaml::Hex64 Size;
llvm::yaml::Hex64 Metadata;
+ llvm::yaml::Hex64 Hash;
};
uint8_t Version;
llvm::yaml::Hex8 Feature;
diff --git a/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h b/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
index 7231e45fe8eb7..2b4db171bfdfb 100644
--- a/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
+++ b/llvm/include/llvm/Transforms/Utils/SampleProfileInference.h
@@ -130,6 +130,11 @@ template <typename FT> class SampleProfileInference {
SampleProfileInference(FunctionT &F, BlockEdgeMap &Successors,
BlockWeightMap &SampleBlockWeights)
: F(F), Successors(Successors), SampleBlockWeights(SampleBlockWeights) {}
+ SampleProfileInference(FunctionT &F, BlockEdgeMap &Successors,
+ BlockWeightMap &SampleBlockWeights,
+ EdgeWeightMap &SampleEdgeWeights)
+ : F(F), Successors(Successors), SampleBlockWeights(SampleBlockWeights),
+ SampleEdgeWeights(SampleEdgeWeights) {}
/// Apply the profile inference algorithm for a given function
void apply(BlockWeightMap &BlockWeights, EdgeWeightMap &EdgeWeights);
@@ -157,6 +162,9 @@ template <typename FT> class SampleProfileInference {
/// Map basic blocks to their sampled weights.
BlockWeightMap &SampleBlockWeights;
+
+ /// Map edges to their sampled weights.
+ EdgeWeightMap SampleEdgeWeights;
};
template <typename BT>
@@ -266,6 +274,14 @@ FlowFunction SampleProfileInference<BT>::createFlowFunction(
FlowJump Jump;
Jump.Source = BlockIndex[BB];
Jump.Target = BlockIndex[Succ];
+ auto It = SampleEdgeWeights.find(std::make_pair(BB, Succ));
+ if (It != SampleEdgeWeights.end()) {
+ Jump.HasUnknownWeight = false;
+ Jump.Weight = It->second;
+ } else {
+ Jump.HasUnknownWeight = true;
+ Jump.Weight = 0;
+ }
Func.Jumps.push_back(Jump);
}
}
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index bdcd54a135da9..41c084a4e4e49 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -40,6 +40,7 @@
#include "llvm/CodeGen/GCMetadataPrinter.h"
#include "llvm/CodeGen/LazyMachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBlockHashInfo.h"
#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
#include "llvm/CodeGen/MachineConstantPool.h"
#include "llvm/CodeGen/MachineDominators.h"
@@ -180,6 +181,8 @@ static cl::opt<bool> PrintLatency(
cl::desc("Print instruction latencies as verbose asm comments"), cl::Hidden,
cl::init(false));
+extern cl::opt<bool> EmitBBHash;
+
STATISTIC(EmittedInsts, "Number of machine instrs printed");
char AsmPrinter::ID = 0;
@@ -454,6 +457,8 @@ void AsmPrinter::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<GCModuleInfo>();
AU.addRequired<LazyMachineBlockFrequencyInfoPass>();
AU.addRequired<MachineBranchProbabilityInfoWrapperPass>();
+ if (EmitBBHash)
+ AU.addRequired<MachineBlockHashInfo>();
}
bool AsmPrinter::doInitialization(Module &M) {
@@ -1419,7 +1424,8 @@ getBBAddrMapFeature(const MachineFunction &MF, int NumMBBSectionRanges) {
}
return {FuncEntryCountEnabled, BBFreqEnabled, BrProbEnabled,
MF.hasBBSections() && NumMBBSectionRanges > 1,
- static_cast<bool>(BBAddrMapSkipEmitBBEntries)};
+ static_cast<bool>(BBAddrMapSkipEmitBBEntries),
+ static_cast<bool>(EmitBBHash)};
}
void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
@@ -1477,6 +1483,8 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
PrevMBBEndSymbol = MBBSymbol;
}
+ auto MBHI = Features.BBHash ? &getAnalysis<MachineBlockHashInfo>() : nullptr;
+
if (!Features.OmitBBEntries) {
// TODO: Remove this check when version 1 is deprecated.
if (BBAddrMapVersion > 1) {
@@ -1496,6 +1504,10 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
emitLabelDifferenceAsULEB128(MBB.getEndSymbol(), MBBSymbol);
// Emit the Metadata.
OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
+ // Emit the Hash.
+ if (MBHI) {
+ OutStreamer->emitULEB128IntValue(MBHI->getMBBHash(MBB));
+ }
}
PrevMBBEndSymbol = MBB.getEndSymbol();
diff --git a/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp b/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp
new file mode 100644
index 0000000000000..e2776162043ff
--- /dev/null
+++ b/llvm/lib/CodeGen/BasicBlockMatchingAndInference.cpp
@@ -0,0 +1,168 @@
+#include "llvm/CodeGen/BasicBlockMatchingAndInference.h"
+#include "llvm/CodeGen/BasicBlockSectionsProfileReader.h"
+#include "llvm/CodeGen/MachineBlockHashInfo.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/InitializePasses.h"
+#include <llvm/Support/CommandLine.h>
+
+using namespace llvm;
+
+static cl::opt<float>
+ PropellerInferThreshold("propeller-infer-threshold",
+ cl::desc("Threshold for infer stale profile"),
+ cl::init(0.6), cl::Optional);
+
+/// The object is used to identify and match basic blocks given their hashes.
+class StaleMatcher {
+public:
+ /// Initialize stale matcher.
+ void init(const std::vector<MachineBasicBlock *> &Blocks,
+ const std::vector<BlendedBlockHash> &Hashes) {
+ assert(Blocks.size() == Hashes.size() &&
+ "incorrect matcher initialization");
+ for (size_t I = 0; I < Blocks.size(); I++) {
+ MachineBasicBlock *Block = Blocks[I];
+ uint16_t OpHash = Hashes[I].OpcodeHash;
+ OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+ }
+ }
+
+ /// Find the most similar block for a given hash.
+ MachineBasicBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+ auto BlockIt = OpHashToBlocks.find(BlendedHash.OpcodeHash);
+ if (BlockIt == OpHashToBlocks.end()) {
+ return nullptr;
+ }
+ MachineBasicBlock *BestBlock = nullptr;
+ uint64_t BestDist = std::numeric_limits<uint64_t>::max();
+ for (auto It : BlockIt->second) {
+ MachineBasicBlock *Block = It.second;
+ BlendedBlockHash Hash = It.first;
+ uint64_t Dist = Hash.distance(BlendedHash);
+ if (BestBlock == nullptr || Dist < BestDist) {
+ BestDist = Dist;
+ BestBlock = Block;
+ }
+ }
+ return BestBlock;
+ }
+
+private:
+ using HashBlockPairType = std::pair<BlendedBlockHash, MachineBasicBlock *>;
+ std::unordered_map<uint16_t, std::vector<HashBlockPairType>> OpHashToBlocks;
+};
+
+INITIALIZE_PASS_BEGIN(BasicBlockMatchingAndInference,
+ "machine-block-match-infer",
+ "Machine Block Matching and Inference Analysis", true,
+ true)
+INITIALIZE_PASS_DEPENDENCY(MachineBlockHashInfo)
+INITIALIZE_PASS_DEPENDENCY(BasicBlockSectionsProfileReaderWrapperPass)
+INITIALIZE_PASS_END(BasicBlockMatchingAndInference, "machine-block-match-infer",
+ "Machine Block Matching and Inference Analysis", true, true)
+
+char BasicBlockMatch...
[truncated]
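The StaleMatcher in the truncated diff above buckets blocks by their 16-bit OpcodeHash and then picks the candidate with the smallest hash distance. Below is a minimal standalone sketch of that lookup, not the patch's code. Plain integer block ids stand in for MachineBasicBlock pointers, and SketchHash, its Detail field, and its distance function are hypothetical, since the patch's BlendedBlockHash::distance is not visible in the truncated diff.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for BlendedBlockHash.
struct SketchHash {
  uint16_t OpcodeHash; // bucket key, as in the patch
  uint16_t Detail;     // hypothetical finer-grained component

  // Hypothetical distance: absolute difference of the Detail component.
  uint64_t distance(const SketchHash &Other) const {
    return Detail > Other.Detail ? Detail - Other.Detail
                                 : Other.Detail - Detail;
  }
};

class SketchMatcher {
public:
  // Group every (hash, block id) pair under its opcode hash.
  void init(const std::vector<int> &BlockIds,
            const std::vector<SketchHash> &Hashes) {
    for (std::size_t I = 0; I < BlockIds.size() && I < Hashes.size(); ++I)
      Buckets[Hashes[I].OpcodeHash].emplace_back(Hashes[I], BlockIds[I]);
  }

  // Return the closest block id within the same opcode bucket, or -1.
  int matchBlock(const SketchHash &Query) const {
    auto It = Buckets.find(Query.OpcodeHash);
    if (It == Buckets.end())
      return -1;
    int Best = -1;
    uint64_t BestDist = std::numeric_limits<uint64_t>::max();
    for (const auto &[Hash, Id] : It->second) {
      uint64_t Dist = Hash.distance(Query);
      if (Best == -1 || Dist < BestDist) {
        BestDist = Dist;
        Best = Id;
      }
    }
    return Best;
  }

private:
  std::unordered_map<uint16_t, std::vector<std::pair<SketchHash, int>>> Buckets;
};
```

A query whose opcode hash has no bucket is left unmatched, which mirrors the nullptr return in the patch.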
This needs to be split into at least two PRs, with the first one enabling the decoding (changes in ELF.cpp, etc.), and the second one enabling the encoding (changes in CodeGen). Please also add a new version number for this (where the feature is only supported for this version and up) like the other features: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/ELF.cpp#L850-854.
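The version gating requested above could look roughly like the following sketch. It is a standalone illustration, not the check in ELF.cpp. The function name checkBBHashVersion and the MinVersion parameter are hypothetical, and the actual version number chosen for the feature is not stated in this thread.

```cpp
#include <cstdint>
#include <string>

// Returns an empty string when the version/feature combination is accepted,
// otherwise an error message. MinVersion is a hypothetical parameter standing
// in for whatever version first supports the BB hash feature.
std::string checkBBHashVersion(uint8_t Version, bool BBHashEnabled,
                               uint8_t MinVersion) {
  if (BBHashEnabled && Version < MinVersion)
    return "version should be >= " +
           std::to_string(static_cast<unsigned>(MinVersion)) +
           " for SHT_LLVM_BB_ADDR_MAP when the BB hash feature is enabled: "
           "version = " +
           std::to_string(static_cast<unsigned>(Version));
  return "";
}
```

Sections that do not enable the feature are accepted at any version, so existing producers and consumers are unaffected.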
Thanks. We have split the original PR into two separate ones. The current PR only contains modifications related to ELF and basic block hash calculation. Additionally, we have updated the version number of SHT_LLVM_BB_ADDR_MAP.
This is still pending. All the code related to emitting the BB hash as part of codegen must be split into a separate PR.
Done. Currently, this PR only relates to ELF modifications. The calculation of the basic block hash will be submitted in the next PR.
Please also add obj2yaml and yaml2obj tests (like tools/yaml2obj/ELF/bb-addr-map.yaml and ./tools/obj2yaml/ELF/bb-addr-map.yaml). These should ideally be done in a separate PR and they will test the ELFEmitter.cpp and obj2yaml/elf2yaml.cpp changes. If you want to include them here, it would be fine too.
This is also a codegen change and must be deferred.
This is also a codegen change and must be deferred.
How can we increment the version of SHT_LLVM_BB_ADDR_MAP without modifying MCContext.h? Should I refrain from changing the version of SHT_LLVM_BB_ADDR_MAP for the time being?
Correct. We don't want to change the emitted BBAddrMap just yet. The reason for this complexity is that the Propeller tooling is released independently. So if we change the version right away, there is a chance that the compiler gets released before the Propeller tooling. I understand this is somewhat inconvenient. Once we move Propeller into LLVM, this will be resolved. Nonetheless, splitting the change into smaller parts makes code review easier, and this PR would definitely benefit from it.
Got it. Done.
llvm/lib/ObjectYAML/ELFEmitter.cpp
@@ -1526,6 +1526,9 @@ void ELFState<ELFT>::writeSectionContent(
}
SHeader.sh_size += CBA.writeULEB128(BBE.Size);
SHeader.sh_size += CBA.writeULEB128(BBE.Metadata);
+if (FeatureOrErr->BBHash && BBE.Hash) {
This doesn't seem right.
We should write the hash if the feature is enabled.
This doesn't seem right. We should write the hash if the feature is enabled.
Done.
Actually, this is not what I thought because BBE.Hash is std::optional. Since this is YAML, you should use || instead. So we would emit the hash if either the feature is enabled (even if BBE.Hash is zero) or BBE.Hash has value (even if feature is disabled). In the latter case, we don't need to enable the feature value. Please also use BBE.Hash.has_value() to disambiguate against value comparison.
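The emission rule described in this comment can be sketched in isolation as follows. This is a simplified illustration, not the ELFEmitter.cpp code. The function name hashToEmit is hypothetical, and a plain std::optional<uint64_t> stands in for the YAML entry's Hash field.

```cpp
#include <cstdint>
#include <optional>

// Decide whether a hash is written for one YAML basic block entry, and which
// value. A hash is written when the feature bit is set (defaulting to 0 if
// the YAML omits it) or when the YAML supplies a hash explicitly.
std::optional<uint64_t> hashToEmit(bool FeatureBBHash,
                                   std::optional<uint64_t> YamlHash) {
  if (FeatureBBHash || YamlHash.has_value())
    return YamlHash.value_or(0);
  return std::nullopt; // nothing is written for this entry
}
```

Using has_value() keeps the test about presence, so an explicit hash of 0 in the YAML is still emitted.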
Got it. Done.
Done.
llvm/include/llvm/Object/ELFTypes.h
BBEntry(uint32_t ID, uint32_t Offset, uint32_t Size, Metadata MD,
-SmallVector<uint32_t, 1> CallsiteEndOffsets)
+SmallVector<uint32_t, 1> CallsiteEndOffsets, uint64_t Hash = 0)
We don't need the default value here.
Done.
llvm/lib/ObjectYAML/ELFEmitter.cpp
+if (FeatureOrErr->BBHash || BBE.Hash.has_value()) {
+auto Hash = BBE.Hash.has_value() ?
+BBE.Hash.value() : llvm::yaml::Hex64(0);;
+SHeader.sh_size += CBA.writeULEB128(Hash);
A couple of points.
- ULEB128 gives you size savings for smaller values (which could fit within a smaller number of bytes), but your random hashes can have their most significant bits set. So there wouldn't be any size savings (and you may even add extra bytes because of the ULEB encoding).
- This gets me thinking about whether we actually need 64-bit values for the hash. Since we're storing a hash for every basic block, we might be able to use a smaller number of bytes if our inference algorithm is smart enough. Even with a single byte, collision chance is 1/(2^8) which could be acceptable. We need to remember that this is about performance and a best-effort solution is fine.
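The size argument in the first point can be checked with a small helper. ULEB128 stores 7 payload bits per byte, so any value of 2^56 or above needs 9 or 10 bytes instead of the 8 bytes of a fixed-width encoding. The helper name uleb128Size is hypothetical; it only counts bytes and is not LLVM's encoder.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes a ULEB128 encoding of Value occupies: each byte carries
// 7 payload bits, and at least one byte is always written.
std::size_t uleb128Size(uint64_t Value) {
  std::size_t Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}
```

For a uniformly distributed 64-bit hash, 255 out of 256 values are at least 2^56, so nearly every hash would be larger under ULEB128 than as a raw 8-byte field.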
It should be noted that using ULEB encoding does not help reduce data size. In fact, a 64-bit hash value consists of 4 hashes at different levels, with each hash value being 16 bits. This structure allows us to perform matching at different levels during the matching process, thereby improving matching accuracy.
If a smaller number of bits is used to represent the hash, the matching accuracy will decrease. Nevertheless, this approach may still be feasible. To determine the extent to which the reduced matching accuracy ultimately affects the inference results, further experiments need to be conducted for evaluation.
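As an illustration of the layout described here, a 64-bit value can be split into four 16-bit components and recombined as in the sketch below. Only the OpcodeHash name appears in the diff. The struct name, the Level0, Level2 and Level3 names, and the bit positions of each component are assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical four-component view of a 64-bit block hash.
struct FourLevelHash {
  uint16_t Level0;     // hypothetical component, bits 0-15
  uint16_t OpcodeHash; // name taken from the patch, bits 16-31 (assumed)
  uint16_t Level2;     // hypothetical component, bits 32-47
  uint16_t Level3;     // hypothetical component, bits 48-63

  // Pack the four 16-bit components into one 64-bit value.
  uint64_t combine() const {
    return (static_cast<uint64_t>(Level3) << 48) |
           (static_cast<uint64_t>(Level2) << 32) |
           (static_cast<uint64_t>(OpcodeHash) << 16) |
           static_cast<uint64_t>(Level0);
  }

  // Recover the components from a packed 64-bit value.
  static FourLevelHash split(uint64_t V) {
    return FourLevelHash{static_cast<uint16_t>(V),
                         static_cast<uint16_t>(V >> 16),
                         static_cast<uint16_t>(V >> 32),
                         static_cast<uint16_t>(V >> 48)};
  }
};
```

Keeping the components separate is what allows a matcher to compare at one level first and fall back to another, which a single combined hash would not support.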
If your hashes are always 64-bit, then you don't really need ULEB.
I'd like us to consider the size implications a little bit, since we're planning to include the section in our production binaries. 8 extra bytes per basic block could be a huge overhead (could almost double the total section size).
Do you need the 4 hashes to be separate or can you combine them to form a single hash? If we can combine them, then it's possible to define a more flexible encoding. Every function stores the number of hashing bytes once. Then we read the hash values for the specified number of bytes. Different functions can have varying number of hash bytes. So for larger functions we can utilize more bytes (even more than 8). WDYT?
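The flexible encoding proposed here was not adopted in this PR, but the idea can be sketched as follows: each function stores its hash width once, followed by that many bytes per block. Everything in this sketch is hypothetical, including the function names, the one-byte width prefix, the little-endian byte order, and the cap at 8 bytes (the comment above also allows wider hashes).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode one function's block hashes: a width byte, then each hash truncated
// to NumHashBytes little-endian bytes. Widths above 8 are clamped here.
std::vector<uint8_t> encodeFunctionHashes(const std::vector<uint64_t> &Hashes,
                                          uint8_t NumHashBytes) {
  if (NumHashBytes > 8)
    NumHashBytes = 8;
  std::vector<uint8_t> Out;
  Out.push_back(NumHashBytes);
  for (uint64_t H : Hashes)
    for (uint8_t I = 0; I < NumHashBytes; ++I)
      Out.push_back(static_cast<uint8_t>(H >> (8 * I)));
  return Out;
}

// Decode the hashes back; malformed input yields an empty result.
std::vector<uint64_t> decodeFunctionHashes(const std::vector<uint8_t> &Data) {
  std::vector<uint64_t> Hashes;
  if (Data.empty())
    return Hashes;
  const std::size_t Width = Data[0];
  if (Width == 0 || Width > 8)
    return Hashes;
  for (std::size_t Pos = 1; Pos + Width <= Data.size(); Pos += Width) {
    uint64_t H = 0;
    for (std::size_t I = 0; I < Width; ++I)
      H |= static_cast<uint64_t>(Data[Pos + I]) << (8 * I);
    Hashes.push_back(H);
  }
  return Hashes;
}
```

With a width of 2, two blocks cost 5 bytes in total instead of 16, at the price of truncated hashes and therefore more collisions.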
We have encoded the hashes in their original 64-bit form instead of using ULEB encoding.
Since employing different hash widths for different functions would introduce additional complexity, we plan to maintain the current 64-bit hash format for now. This will allow the matching and inference features to become available in propeller first. Further research and experimentation will be conducted before we proceed with hash bit compression. Would you be comfortable with this approach?
Sounds good.
llvm/tools/obj2yaml/elf2yaml.cpp
We don't need to do this. Hash should be defined as optional and set if (FeatureOrErr->BBHash). Then we always push_back {ID, Offset, Size, Metadata, std::move(CallsiteEndOffsets), Hash}.
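The shape suggested in this comment, with a single push_back and an optional hash that is only populated when the feature bit is set, might look like the sketch below. YamlBBEntry and appendEntry are hypothetical stand-ins for the obj2yaml types and code, and the call-site offsets are omitted for brevity.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, reduced stand-in for the YAML basic block entry.
struct YamlBBEntry {
  uint32_t ID;
  uint64_t Offset;
  uint64_t Size;
  uint64_t Metadata;
  std::optional<uint64_t> Hash; // stays empty when the feature is disabled
};

// Append one decoded entry. The hash is set only when the feature bit is on,
// so there is one push_back instead of two branches building the entry.
void appendEntry(std::vector<YamlBBEntry> &Entries, bool FeatureBBHash,
                 uint32_t ID, uint64_t Offset, uint64_t Size,
                 uint64_t Metadata, uint64_t DecodedHash) {
  std::optional<uint64_t> Hash;
  if (FeatureBBHash)
    Hash = DecodedHash;
  Entries.push_back(YamlBBEntry{ID, Offset, Size, Metadata, Hash});
}
```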
Done.
This is YAML code so the check here should be checking whether BBE.Hash.has_value() instead of the feature.
The BBE is not YAML.
llvm/lib/ObjectYAML/ELFEmitter.cpp
@@ -1526,6 +1526,12 @@ void ELFState<ELFT>::writeSectionContent(
}
SHeader.sh_size += CBA.writeULEB128(BBE.Size);
SHeader.sh_size += CBA.writeULEB128(BBE.Metadata);
+if (FeatureOrErr->BBHash || BBE.Hash.has_value()) {
+auto Hash = BBE.Hash.has_value() ?
Spell out the type explicitly here: uint64_t
Done.
BTW, if you have the Codegen PR, we can start reviewing that one too. We just need to have about a 1 week delay between pushing the PRs upstream.
…ic block hash to the SHT_LLVM_BB_ADDR_MAP section.
The PR related to CodeGen is ready. #162963
We have optimized the implementation of introducing the "matching and inference" technique into Propeller. In this new implementation, we have made every effort to avoid introducing new compilation parameters while ensuring compatibility with Propeller's current usage. Instead of creating a new profile format, we reused the existing one employed by Propeller. This new implementation is fully compatible with Propeller's current usage patterns and reduces the amount of code changes. For detailed information, please refer to the following RFC: https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238.
We plan to submit the relevant changes in several pull requests (PRs). The current one is the first PR, which adds the basic block hash to the SHT_LLVM_BB_ADDR_MAP section.
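For reference, the feature-byte change in ELFTypes.h assigns the new BBHash flag to bit 5 and rejects bytes with unknown bits by re-encoding the decoded value. The sketch below reproduces that round trip in a standalone form. It is a simplified stand-in for BBAddrMap::Features, and it returns std::optional where the real code returns an error via createStringError.

```cpp
#include <cstdint>
#include <optional>

// Simplified stand-in for BBAddrMap::Features with the new BBHash flag.
struct Features {
  bool FuncEntryCount : 1;
  bool BBFreq : 1;
  bool BrProb : 1;
  bool MultiBBRange : 1;
  bool OmitBBEntries : 1;
  bool BBHash : 1;

  // Pack the flags into one byte, one bit per flag; BBHash is bit 5.
  uint8_t encode() const {
    return (static_cast<uint8_t>(FuncEntryCount) << 0) |
           (static_cast<uint8_t>(BBFreq) << 1) |
           (static_cast<uint8_t>(BrProb) << 2) |
           (static_cast<uint8_t>(MultiBBRange) << 3) |
           (static_cast<uint8_t>(OmitBBEntries) << 4) |
           (static_cast<uint8_t>(BBHash) << 5);
  }

  // Decode a feature byte; bytes with bits outside the known set fail the
  // round-trip check and are rejected.
  static std::optional<Features> decode(uint8_t Val) {
    Features Feat{
        static_cast<bool>(Val & (1 << 0)), static_cast<bool>(Val & (1 << 1)),
        static_cast<bool>(Val & (1 << 2)), static_cast<bool>(Val & (1 << 3)),
        static_cast<bool>(Val & (1 << 4)), static_cast<bool>(Val & (1 << 5))};
    if (Feat.encode() != Val)
      return std::nullopt;
    return Feat;
  }
};
```

A feature byte of 0x20 therefore decodes with only BBHash set, while 0x40 is still treated as an invalid encoding.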