[BOLT][NFC] Unify two symbol table iterations #90724
@llvm/pr-subscribers-bolt
Author: Donghee Na (corona10)
Changes
Simple refactoring which was suggested in #90661 (comment)
Full diff: https://github.com/llvm/llvm-project/pull/90724.diff
1 Files Affected:
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 23f79e3c135a78..2e91856c0ebfe6 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -820,7 +820,26 @@ void RewriteInstance::discoverFileObjects() {
return std::hash<decltype(DataRefImpl::p)>{}(S.getRawDataRefImpl().p);
}
};
+
+ // Sort symbols in the file by value. Ignore symbols from non-allocatable
+ // sections. We memoize getAddress(), as it has rather high overhead.
+ struct SymbolInfo {
+ uint64_t Address;
+ SymbolRef Symbol;
+ };
+ auto isSymbolInMemory = [this](const SymbolRef &Sym) {
+ if (cantFail(Sym.getType()) == SymbolRef::ST_File)
+ return false;
+ if (cantFail(Sym.getFlags()) & SymbolRef::SF_Absolute)
+ return true;
+ if (cantFail(Sym.getFlags()) & SymbolRef::SF_Undefined)
+ return false;
+ BinarySection Section(*BC, *cantFail(Sym.getSection()));
+ return Section.isAllocatable();
+ };
+ std::vector<SymbolInfo> SortedSymbols;
std::unordered_map<SymbolRef, StringRef, SymbolRefHash> SymbolToFileName;
+
for (const ELFSymbolRef &Symbol : InputFile->symbols()) {
Expected<StringRef> NameOrError = Symbol.getName();
if (NameOrError && NameOrError->starts_with("__asan_init")) {
@@ -835,6 +854,8 @@ void RewriteInstance::discoverFileObjects() {
"support. Cannot optimize.\n";
exit(1);
}
+ if (isSymbolInMemory(Symbol))
+ SortedSymbols.push_back({cantFail(Symbol.getAddress()), Symbol});
if (cantFail(Symbol.getFlags()) & SymbolRef::SF_Undefined)
continue;
@@ -856,27 +877,6 @@ void RewriteInstance::discoverFileObjects() {
SymbolToFileName[Symbol] = FileSymbolName;
}
- // Sort symbols in the file by value. Ignore symbols from non-allocatable
- // sections. We memoize getAddress(), as it has rather high overhead.
- struct SymbolInfo {
- uint64_t Address;
- SymbolRef Symbol;
- };
- std::vector<SymbolInfo> SortedSymbols;
- auto isSymbolInMemory = [this](const SymbolRef &Sym) {
- if (cantFail(Sym.getType()) == SymbolRef::ST_File)
- return false;
- if (cantFail(Sym.getFlags()) & SymbolRef::SF_Absolute)
- return true;
- if (cantFail(Sym.getFlags()) & SymbolRef::SF_Undefined)
- return false;
- BinarySection Section(*BC, *cantFail(Sym.getSection()));
- return Section.isAllocatable();
- };
- for (const SymbolRef &Symbol : InputFile->symbols())
- if (isSymbolInMemory(Symbol))
- SortedSymbols.push_back({cantFail(Symbol.getAddress()), Symbol});
-
auto CompareSymbols = [this](const SymbolInfo &A, const SymbolInfo &B) {
if (A.Address != B.Address)
return A.Address < B.Address;
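The shape of the refactoring in the diff above can be shown in isolation. The following is a minimal sketch, not BOLT code: `Sym`, `SymInfo`, `Result`, and `collectSinglePass` are hypothetical stand-ins, the file-name bookkeeping is simplified, and the `SF_Absolute` case and `cantFail` error handling are left out. It only illustrates how the work of a second loop over the symbol table (collecting in-memory symbols with their memoized addresses) can be folded into the first loop.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an object-file symbol; not an LLVM type.
struct Sym {
  std::string Name;
  uint64_t Address;
  bool IsFile;        // loosely analogous to SymbolRef::ST_File
  bool IsUndefined;   // loosely analogous to SymbolRef::SF_Undefined
  bool IsAllocatable; // whether the containing section is loaded into memory
};

// Pairs a symbol with its address so the address is computed only once.
struct SymInfo {
  uint64_t Address;
  const Sym *Symbol;
};

struct Result {
  std::vector<SymInfo> InMemory;
  std::unordered_map<std::string, std::string> SymbolToFileName;
};

// One pass over the symbols does both jobs. The returned SymInfo entries
// point into Symbols, so Symbols must outlive the result.
Result collectSinglePass(const std::vector<Sym> &Symbols) {
  Result R;
  std::string CurrentFile;
  for (const Sym &S : Symbols) {
    // Work that previously lived in a separate loop: record every
    // in-memory symbol together with its address.
    if (!S.IsFile && !S.IsUndefined && S.IsAllocatable)
      R.InMemory.push_back({S.Address, &S});

    // Work of the original first loop (simplified): remember which file
    // symbol most recently preceded each defined symbol.
    if (S.IsUndefined)
      continue;
    if (S.IsFile) {
      CurrentFile = S.Name;
      continue;
    }
    if (!CurrentFile.empty())
      R.SymbolToFileName[S.Name] = CurrentFile;
  }
  return R;
}
```

Note that the in-memory check has to run before the early `continue` for undefined symbols, mirroring where the diff places the `isSymbolInMemory` call.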
I checked that the build succeeds on my local machine, but I'm not familiar with how to run only the BOLT tests. Is there a guide for that?
Okay
Thanks Donghee! Looks good, but please retitle to "[BOLT][NFC] Unify two symbol table iterations".
Yeah, I will share the info soon :)
@aaupov
https://github.com/llvm/llvm-project/blob/main/bolt/docs/OptimizingClang.md Is this still the recommended setup?
The guide covers more than what we need in this case. You can just run BOLT on any Clang binary; for instance, you can try BOLT instrumentation.
Hmm, neither my system clang-18 nor the pre-built clang binary from the LLVM release page can be processed, I guess because of the options they were compiled with.
Sorry, yes, instrumentation requires relocations in the input binary. I've checked that rewriting in
There was an issue with the Fedora 39 binary, so I tested with the pre-built release binary from GitHub, which gives more meaningful information.
CPU: AMD Ryzen™ 7 8845HS
AS-IS
TO-BE
Let me test it with a large binary.
To isolate the effect of the change, I added llvm-project/bolt/lib/Rewrite/RewriteInstance.cpp Lines 876 to 880 in ffc9a30
For chromium 5735 with 1.2M symbols on Intel ADL i7-12700K the effect is the following (warmup=3, 10 runs):
after:
So merging the loops has the opposite effect to what I had hoped. Thanks for putting in the effort nonetheless!
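For readers who want to reproduce this kind of comparison on a smaller scale, a warmup-then-measure loop can be sketched as follows. This is only an illustration under assumptions: `timeRuns` and `meanMillis` are hypothetical helpers, not part of BOLT, and this is not necessarily how the numbers in this thread were collected. The idea is to discard a few warmup iterations and then time a fixed number of runs with a monotonic clock.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: runs Work `Warmup` times without timing it, then
// `Runs` more times, returning each timed run's duration in milliseconds.
std::vector<double> timeRuns(const std::function<void()> &Work,
                             std::size_t Warmup, std::size_t Runs) {
  // Warmup iterations let caches and the allocator settle before measuring.
  for (std::size_t I = 0; I < Warmup; ++I)
    Work();

  std::vector<double> Millis;
  Millis.reserve(Runs);
  for (std::size_t I = 0; I < Runs; ++I) {
    // steady_clock is monotonic, so wall-clock adjustments cannot skew results.
    const auto Start = std::chrono::steady_clock::now();
    Work();
    const auto End = std::chrono::steady_clock::now();
    Millis.push_back(
        std::chrono::duration<double, std::milli>(End - Start).count());
  }
  return Millis;
}

// Arithmetic mean of the samples; returns 0 for an empty vector.
double meanMillis(const std::vector<double> &Samples) {
  if (Samples.empty())
    return 0.0;
  return std::accumulate(Samples.begin(), Samples.end(), 0.0) /
         static_cast<double>(Samples.size());
}
```

Calling `timeRuns(Work, 3, 10)` once for the two-loop variant and once for the merged variant, then comparing `meanMillis` of each, mirrors the warmup=3, 10 runs setup mentioned above.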
Next time I will consider this binary for the benchmark.