[BOLT] stale profile matching [part 1 out of 2]
BOLT often has to deal with profiles collected on binaries built several
revisions behind the release. As a result, a certain percentage of functions
are considered stale and are not optimized. This diff adds the ability to match
profiles to functions that are not 100% binary identical, which increases
optimization coverage and boosts application performance.

The algorithm consists of two phases, matching and inference:
- In the matching phase, we try to "guess" as many block and jump counts from
  the stale profile as possible. To this end, the content of each basic block
  is hashed and stored in the (yaml) profile. When BOLT optimizes a binary,
  it computes block hashes and identifies the corresponding entries in the
  stale profile. This yields a partial profile for every CFG in the binary.
- In the inference phase, we employ a network flow-based algorithm (profi) to
  reconstruct "realistic" block and jump counts from the partial profile
  generated in the matching phase. In practice, we don't always produce proper
  profile data, but the majority (e.g., >90%) of CFGs get the correct counts.
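
The matching phase described above can be illustrated with a minimal sketch.
This is not the code from this diff: StaleBlock, BinaryBlock, hashBlock and
matchStaleProfile are hypothetical names, the block "content" is a plain string
standing in for instructions, and FNV-1a is used only as a placeholder hash.
Blocks with no matching hash keep an empty count, which is the "partial
profile" that the inference phase would later complete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins; these are not BOLT's data structures.
struct StaleBlock {
  uint64_t Hash;  // block hash recorded in the stale (yaml) profile
  uint64_t Count; // execution count recorded for that block
};

struct BinaryBlock {
  std::string Content;           // stand-in for the block's instructions
  std::optional<uint64_t> Count; // empty until a stale entry matches
};

// Placeholder hash (64-bit FNV-1a); the real block hash is defined by BOLT.
uint64_t hashBlock(const std::string &Content) {
  uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : Content) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// Matching phase: copy counts to blocks whose hash appears in the stale
// profile. Returns the number of matched blocks; unmatched blocks are left
// without a count.
size_t matchStaleProfile(std::vector<BinaryBlock> &Blocks,
                         const std::vector<StaleBlock> &Profile) {
  std::unordered_map<uint64_t, uint64_t> CountByHash;
  for (const StaleBlock &SB : Profile)
    CountByHash.emplace(SB.Hash, SB.Count); // first entry wins on duplicates

  size_t NumMatched = 0;
  for (BinaryBlock &BB : Blocks) {
    auto It = CountByHash.find(hashBlock(BB.Content));
    if (It == CountByHash.end())
      continue;
    BB.Count = It->second;
    ++NumMatched;
  }
  return NumMatched;
}
```

Calling matchStaleProfile on a function whose blocks partly changed since the
profile was collected fills in counts only for the unchanged blocks; the sketch
does not attempt jump counts or the flow-based inference step.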

This is the first part of the change; the next stacked diff extends the block
hashing and provides perf evaluation numbers.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D144500
spupyrev committed Jun 6, 2023
1 parent 3cb6ead commit 4426827
Showing 8 changed files with 681 additions and 0 deletions.
10 changes: 10 additions & 0 deletions bolt/include/bolt/Core/BinaryFunction.h
@@ -384,6 +384,10 @@ class BinaryFunction {
/// Indicates the type of profile the function is using.
uint16_t ProfileFlags{PF_NONE};

/// True if the function's input profile data has been inaccurate but has
/// been adjusted by the profile inference algorithm.
bool HasInferredProfile{false};

/// For functions with mismatched profile we store all call profile
/// information at a function level (as opposed to tying it to
/// specific call sites).
@@ -1566,6 +1570,12 @@ class BinaryFunction {
/// Return flags describing a profile for this function.
uint16_t getProfileFlags() const { return ProfileFlags; }

/// Return true if the function's input profile data has been inaccurate but
/// has been corrected by the profile inference algorithm.
bool hasInferredProfile() const { return HasInferredProfile; }

void setHasInferredProfile(bool Inferred) { HasInferredProfile = Inferred; }

void addCFIInstruction(uint64_t Offset, MCCFIInstruction &&Inst) {
assert(!Instructions.empty());

4 changes: 4 additions & 0 deletions bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -70,6 +70,10 @@ class YAMLProfileReader : public ProfileReaderBase {
bool parseFunctionProfile(BinaryFunction &Function,
const yaml::bolt::BinaryFunctionProfile &YamlBF);

/// Infer function profile from stale data (collected on older binaries).
bool inferStaleProfile(BinaryFunction &Function,
const yaml::bolt::BinaryFunctionProfile &YamlBF);

/// Initialize maps for profile matching.
void buildNameMaps(std::map<uint64_t, BinaryFunction> &Functions);

19 changes: 19 additions & 0 deletions bolt/lib/Passes/BinaryPasses.cpp
@@ -1341,10 +1341,13 @@ void PrintProfileStats::runOnFunctions(BinaryContext &BC) {
void PrintProgramStats::runOnFunctions(BinaryContext &BC) {
uint64_t NumRegularFunctions = 0;
uint64_t NumStaleProfileFunctions = 0;
uint64_t NumAllStaleFunctions = 0;
uint64_t NumInferredFunctions = 0;
uint64_t NumNonSimpleProfiledFunctions = 0;
uint64_t NumUnknownControlFlowFunctions = 0;
uint64_t TotalSampleCount = 0;
uint64_t StaleSampleCount = 0;
uint64_t InferredSampleCount = 0;
std::vector<const BinaryFunction *> ProfiledFunctions;
const char *StaleFuncsHeader = "BOLT-INFO: Functions with stale profile:\n";
for (auto &BFI : BC.getBinaryFunctions()) {
@@ -1379,6 +1382,11 @@ void PrintProgramStats::runOnFunctions(BinaryContext &BC) {

if (Function.hasValidProfile()) {
ProfiledFunctions.push_back(&Function);
if (Function.hasInferredProfile()) {
++NumInferredFunctions;
InferredSampleCount += SampleCount;
++NumAllStaleFunctions;
}
} else {
if (opts::ReportStaleFuncs) {
outs() << StaleFuncsHeader;
@@ -1387,6 +1395,7 @@ void PrintProgramStats::runOnFunctions(BinaryContext &BC) {
}
++NumStaleProfileFunctions;
StaleSampleCount += SampleCount;
++NumAllStaleFunctions;
}
}
BC.NumProfiledFuncs = ProfiledFunctions.size();
@@ -1433,6 +1442,16 @@ void PrintProgramStats::runOnFunctions(BinaryContext &BC) {
exit(1);
}
}
if (NumInferredFunctions) {
outs() << format("BOLT-INFO: inferred profile for %d (%.2f%% of profiled, "
"%.2f%% of stale) functions responsible for %.2f%% samples"
" (%zu out of %zu)\n",
NumInferredFunctions,
100.0 * NumInferredFunctions / NumAllProfiledFunctions,
100.0 * NumInferredFunctions / NumAllStaleFunctions,
100.0 * InferredSampleCount / TotalSampleCount,
InferredSampleCount, TotalSampleCount);
}

if (const uint64_t NumUnusedObjects = BC.getNumUnusedProfiledObjects()) {
outs() << "BOLT-INFO: profile for " << NumUnusedObjects
2 changes: 2 additions & 0 deletions bolt/lib/Profile/CMakeLists.txt
@@ -4,13 +4,15 @@ add_llvm_library(LLVMBOLTProfile
DataReader.cpp
Heatmap.cpp
ProfileReaderBase.cpp
StaleProfileMatching.cpp
YAMLProfileReader.cpp
YAMLProfileWriter.cpp

DISABLE_LLVM_LINK_LLVM_DYLIB

LINK_COMPONENTS
Support
TransformUtils
)

target_link_libraries(LLVMBOLTProfile