-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Exegesis] Do not assume the size and layout of the assembled snippet #79636
Conversation
Currently llvm-exegesis assumes that there will only be 3 symbols in the snippet object, in which the benchmarking function 'foo' is always the last symbol. These assumptions do not hold for object file formats of other targets we support downstream. I think it would be more ideal to generalize this part of the logics into a simple search on all symbols, as proposed by this patch.
@llvm/pr-subscribers-tools-llvm-exegesis Author: Min-Yih Hsu (mshockwave) ChangesCurrently llvm-exegesis assumes that there will only be 3 symbols in the snippet object, in which the benchmarking function 'foo' is always the last symbol. These assumptions do not hold for object file formats of other targets we support downstream. I think it would be more ideal to generalize this part of the logics into a simple search on all symbols, as proposed by this patch. Full diff: https://github.com/llvm/llvm-project/pull/79636.diff 1 Files Affected:
diff --git a/llvm/tools/llvm-exegesis/lib/Assembler.cpp b/llvm/tools/llvm-exegesis/lib/Assembler.cpp
index 9f03a4e3a5a6ff..366971ed6a546d 100644
--- a/llvm/tools/llvm-exegesis/lib/Assembler.cpp
+++ b/llvm/tools/llvm-exegesis/lib/Assembler.cpp
@@ -365,11 +365,18 @@ Expected<ExecutableFunction> ExecutableFunction::create(
auto SymbolSizes = object::computeSymbolSizes(*ObjectFileHolder.getBinary());
// Get the size of the function that we want to call into (with the name of
- // FunctionID). This should always be the third symbol returned by
- // calculateSymbolSizes.
- assert(SymbolSizes.size() == 3);
- assert(cantFail(std::get<0>(SymbolSizes[2]).getName()) == FunctionID);
- uintptr_t CodeSize = std::get<1>(SymbolSizes[2]);
+ // FunctionID).
+ auto SymbolIt = llvm::find_if(SymbolSizes, [&](const auto &Pair) {
+ auto SymbolName = Pair.first.getName();
+ if (SymbolName)
+ return *SymbolName == FunctionID;
+ // Suppress the error.
+ llvm::consumeError(SymbolName.takeError());
+ return false;
+ });
+ assert(SymbolIt != SymbolSizes.end() &&
+ "Cannot find the symbol for FunctionID");
+ uintptr_t CodeSize = SymbolIt->second;
auto EJITOrErr = orc::LLJITBuilder().create();
if (!EJITOrErr)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit. Otherwise LGTM.
Thanks for fixing this. Is it possible to have a test in-tree for this or is the object file code for that downstream as well? It would be nice to have a test if possible. I mainly implemented this the way I did as I wanted to avoid the search (which I guess isn't actually that expensive) and figured exegesis de-facto only supports ELF, so this should be fine (didn't test any different object file formats though).
I thought about this too: ideally we could insert an artificially-created object file with more than 3 symbols right before we ran it with JIT. But IIUC llvm-exegesis currently lacks such functionality (I think using snippet file as input doesn't exercise this program path) so it would be difficult to write a test at this moment. |
Using the snippet file does exercise this path (as the assembly needs to be wrapped in setup code, duplicated/wrapped in a loop, and then emitted into an in-memory object file), but it only lets you load in textual assembly that gets assembled (through the same path) within |
Good to know, thanks! I'll merge this shortly. |
Currently llvm-exegesis assumes that there will only be 3 symbols in the snippet object, in which the benchmarking function 'foo' is always the last symbol.
These assumptions do not hold for object file formats of other targets we support downstream. I think it would be more ideal to generalize this part of the logics into a simple search on all symbols, as proposed by this patch.