Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BOLT] Match functions with name similarity #95884

Merged

Conversation

shawbyoung
Copy link
Contributor

@shawbyoung shawbyoung commented Jun 18, 2024

A mapping - from namespace to associated binary functions - is used to
match function profiles to binary based on the
'--name-similarity-function-matching-threshold' flag set edit distance
threshold. The flag is set to 0 (exact name matching) by default as it is
expensive, requiring the processing of all BFs.

Test Plan: Added name-similarity-function-matching.test. On a binary
with 5M functions, rewrite passes took ~520s without the flag and
~2018s with the flag set to 20.

sayhaan and others added 2 commits June 17, 2024 23:14
Created using spr 1.3.4
@llvmbot
Copy link
Collaborator

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-bolt

Author: shaw young (shawbyoung)

Changes

Matching functions based on edit
distance.

Test Plan: tbd


Full diff: https://github.com/llvm/llvm-project/pull/95884.diff

1 Files Affected:

  • (modified) bolt/lib/Profile/YAMLProfileReader.cpp (+29)
diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp b/bolt/lib/Profile/YAMLProfileReader.cpp
index f25f59201f1cd..a4f8ba9a440b6 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -13,6 +13,7 @@
 #include "bolt/Profile/ProfileYAMLMapping.h"
 #include "bolt/Utils/Utils.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/edit_distance.h"
 #include "llvm/Support/CommandLine.h"
 
 using namespace llvm;
@@ -23,6 +24,10 @@ extern cl::opt<unsigned> Verbosity;
 extern cl::OptionCategory BoltOptCategory;
 extern cl::opt<bool> InferStaleProfile;
 
+cl::opt<unsigned> NameSimilarityFunctionMatchingThreshold(
+    "name-similarity-function-matching-threshold", cl::desc("edit distance."),
+    cl::init(0), cl::Hidden, cl::cat(BoltOptCategory));
+
 static llvm::cl::opt<bool>
     IgnoreHash("profile-ignore-hash",
                cl::desc("ignore hash while reading function profile"),
@@ -415,6 +420,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
     if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
       matchProfileToFunction(YamlBF, *BF);
 
+  // Uses name similarity to match functions that were not matched by name.
+  for (auto YamlBF : YamlBP.Functions) {
+    if (YamlBF.Used)
+      continue;
+
+    unsigned MinEditDistance = UINT_MAX;
+    BinaryFunction *ClosestNameBF = nullptr;
+
+    for (auto &[_, BF] : BC.getBinaryFunctions()) {
+      if (ProfiledFunctions.count(&BF))
+        continue;
+
+      unsigned BFEditDistance = BF.getOneName().edit_distance(YamlBF.Name);
+      if (BFEditDistance < MinEditDistance) {
+        MinEditDistance = BFEditDistance;
+        ClosestNameBF = &BF;
+      }
+    }
+
+    if (ClosestNameBF &&
+      MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold)
+      matchProfileToFunction(YamlBF, *ClosestNameBF);
+  }
+
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
     if (!YamlBF.Used && opts::Verbosity >= 1)
       errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also note the runtime of the namespace-filtered pairwise checks? This would guide us in whether to add a BB count filtering.

bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
@shawbyoung shawbyoung marked this pull request as draft June 24, 2024 23:05
shawbyoung and others added 3 commits June 24, 2024 16:33
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@shawbyoung
Copy link
Contributor Author

shawbyoung commented Jun 25, 2024

Can you please also note the runtime of the namespace-filtered pairwise checks? This would guide us in whether to add a BB count filtering.

Ran BOLT on a large binary with 4917369 functions with name similarity function matching - total execution time of rewrite passes was 2235.2883 seconds (1989.1823 wall clock) with an edit distance threshold of 20 and 2194.1708 seconds (1986.2007 wall clock) with an edit distance threshold of 10. Without name similarity matching, total execution time of rewrite passes was 520.4121 seconds (296.4178 wall clock).

@shawbyoung shawbyoung marked this pull request as ready for review June 25, 2024 16:30
maksfb and others added 2 commits June 25, 2024 11:51
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for checking the runtime, and it's quite high. We'll need extra filtering by block count to keep runtime low - please add it as we discussed.

bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
@shawbyoung
Copy link
Contributor Author

Thank you for checking the runtime, and it's quite high. We'll need extra filtering by block count to keep runtime low - please add it as we discussed.

if (BF->size() != YamlBF.NumBasicBlocks)
continue;

Functions are filtered by block size here @aaupov

maksfb and others added 2 commits June 26, 2024 11:53
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refactor new code into a separate function. Add a comment on how the matching is done such that the interface can be understood without reading the code.

bolt/test/X86/name-similarity-function-matching.test Outdated Show resolved Hide resolved
bolt/test/X86/name-similarity-function-matching.test Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
maksfb and others added 4 commits June 28, 2024 10:22
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@shawbyoung shawbyoung changed the title [BOLT] Name similarity function matching [BOLT] Match functions with name similarity Jun 28, 2024
maksfb and others added 2 commits June 28, 2024 14:28
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG in general with some comments

bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
maksfb and others added 2 commits July 3, 2024 08:39
Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG with a couple of nits.

bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Outdated Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Show resolved Hide resolved
bolt/lib/Profile/YAMLProfileReader.cpp Show resolved Hide resolved
maksfb and others added 7 commits July 3, 2024 09:49
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@shawbyoung shawbyoung changed the base branch from users/shawbyoung/spr/main.bolt-name-similarity-function-matching to main July 3, 2024 18:35
Created using spr 1.3.4
@shawbyoung shawbyoung removed the request for review from JDevlieghere July 3, 2024 18:38
@shawbyoung shawbyoung merged commit 97dc508 into main Jul 3, 2024
4 of 5 checks passed
@shawbyoung shawbyoung deleted the users/shawbyoung/spr/bolt-name-similarity-function-matching branch July 3, 2024 18:39
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 3, 2024

LLVM Buildbot has detected a new failure on builder bolt-x86_64-ubuntu-shared running on bolt-worker while building bolt at step 5 "build-bolt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/775

Here is the relevant piece of the build log for the reference:

Step 5 (build-bolt) failure: build (failure)
0.010 [90/2/1] Performing build step for 'bolt_rt'
ninja: no work to do.
0.013 [89/2/2] Generating VCSRevision.h
0.020 [7/4/3] No install step for 'bolt_rt'
0.035 [6/4/4] Completed 'bolt_rt'
6.100 [6/3/5] Building CXX object tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/StaleProfileMatching.cpp.o
6.692 [6/2/6] Building CXX object tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o
6.728 [5/2/7] Linking CXX shared library lib/libLLVMBOLTProfile.so.19.0git
FAILED: lib/libLLVMBOLTProfile.so.19.0git 
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete -fuse-ld=lld -Wl,--color-diagnostics   -Wl,--gc-sections -shared -Wl,-soname,libLLVMBOLTProfile.so.19.0git -o lib/libLLVMBOLTProfile.so.19.0git tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/BoltAddressTranslation.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/DataAggregator.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/DataReader.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/Heatmap.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/StaleProfileMatching.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileWriter.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/lib:"  lib/libLLVMBOLTCore.so.19.0git  lib/libLLVMBOLTUtils.so.19.0git  lib/libLLVMTransformUtils.so.19.0git  lib/libLLVMSupport.so.19.0git  -Wl,-rpath-link,/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/lib && :
ld.lld: error: undefined symbol: llvm::ItaniumPartialDemangler::ItaniumPartialDemangler()
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))

ld.lld: error: undefined symbol: llvm::demangle[abi:cxx11](std::basic_string_view<char, std::char_traits<char>>)
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))

ld.lld: error: undefined symbol: llvm::ItaniumPartialDemangler::partialDemangle(char const*)
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))

ld.lld: error: undefined symbol: llvm::ItaniumPartialDemangler::getFunctionDeclContextName(char*, unsigned long*) const
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))

ld.lld: error: undefined symbol: llvm::ItaniumPartialDemangler::~ItaniumPartialDemangler()
>>> referenced by YAMLProfileReader.cpp
>>>               tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o:(llvm::bolt::YAMLProfileReader::matchWithNameSimilarity(llvm::bolt::BinaryContext&) (.localalias))
collect2: error: ld returned 1 exit status
19.233 [5/1/8] Building CXX object tools/bolt/lib/Rewrite/CMakeFiles/LLVMBOLTRewrite.dir/RewriteInstance.cpp.o
ninja: build stopped: subcommand failed.

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
A mapping - from namespace to associated binary functions - is used to
match function profiles to binary based on the
'--name-similarity-function-matching-threshold' flag set edit distance
threshold. The flag is set to 0 (exact name matching) by default as it is
expensive, requiring the processing of all BFs.

Test Plan: Added name-similarity-function-matching.test. On a binary
with 5M functions, rewrite passes took ~520s without the flag and
~2018s with the flag set to 20.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants