Integrating LLVM optimizations with wasm-opt #7634
Conversation
I think there is a lot of potential here! Btw, I remembered in #7637 (comment) that our dataflow IR may be useful here, which is SSA-like: https://github.com/WebAssembly/binaryen/tree/main/src/dataflow There is a simple pass that uses it, https://github.com/WebAssembly/binaryen/blob/main/src/passes/DataFlowOpts.cpp I'm not sure, but an option might be to use the existing Binaryen IR => DataFlow IR, and add DataFlow IR => LLVM IR (and the last part could be simpler since it would be SSA => SSA).
// Create global memory buffer
ArrayType* memType = ArrayType::get(llvmBuilder->getInt8Ty(), totalSize);
GlobalVariable* llvmMem = new GlobalVariable(
How would this be rewritten back into, presumably, multi-memory WASM? Or should this pass only work on single-memory WASM?
I'm still relatively new to compilers and WebAssembly (currently studying both), so please forgive any naivety in my code. This was just a small attempt...
As a newbie, I'm trying to learn compilers and systems like LLVM and Wasm. However, the docs are huge and not very beginner-friendly. Any advice on where and how to start? Thanks!
Value* visitStore(Store* store) {
  // 1. Get the @wasm_memory global variable.
  GlobalVariable* wasmMemory =
    llvmMod->getGlobalVariable("wasm_memory", true);
If multiple memories are supported, this would be unsound.
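A minimal sketch of one way this could stay sound under multi-memory, assuming the pass emits one LLVM global per wasm memory and names it after that memory (the naming scheme and helper below are hypothetical, not part of this PR; only the trailing elided part matches the original code):

// Hypothetical variant: look up the global backing the specific memory this
// store targets, rather than a single hard-coded "wasm_memory" global.
Value* visitStore(Store* store) {
  // Binaryen's Store records which memory it accesses in store->memory.
  std::string memName = "wasm_memory." + std::string(store->memory.str);
  GlobalVariable* wasmMemory =
    llvmMod->getGlobalVariable(memName, /*AllowInternal=*/true);
  assert(wasmMemory && "each wasm memory should have been emitted as a global");
  // ... address computation and the actual store continue as in the PR ...
}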
The initial thought would be:
- for MVP, we just need to map each instruction while carefully dealing with semantics gaps such as UB in LLVM, inconsistencies in the floating-point spec, and other low-level differences between the source and target semantics (see the sketch after this list);
- for non-MVP (GC), we could perform code slicing to collect the non-GC parts (which are LLVM-optimizable), transpile and send them to the LLVM optimizer, then retrieve the optimized code and "stitch" it back in.
So, in my humble view, it's better to start with MVP first.
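To make the first bullet concrete, here is a rough sketch (not code from this PR) of bridging one such semantics gap: wasm's i32.div_s traps on division by zero and on INT32_MIN / -1, while LLVM's sdiv makes those cases undefined behavior, so the translator has to guard them explicitly. The trapBB block is assumed to branch to whatever trap handling the pass sets up.

#include <cstdint>
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// Lower wasm's i32.div_s so the inputs that wasm defines as traps never reach
// LLVM's sdiv, where they would be undefined behavior.
Value* lowerI32DivS(IRBuilder<>& b, Value* lhs, Value* rhs, BasicBlock* trapBB) {
  Function* func = b.GetInsertBlock()->getParent();
  Value* zero = ConstantInt::get(b.getInt32Ty(), 0);
  Value* minusOne = ConstantInt::get(b.getInt32Ty(), -1, /*IsSigned=*/true);
  Value* intMin = ConstantInt::get(b.getInt32Ty(), INT32_MIN, /*IsSigned=*/true);
  // Trap if rhs == 0, or if lhs == INT32_MIN and rhs == -1 (signed overflow).
  Value* divByZero = b.CreateICmpEQ(rhs, zero);
  Value* overflow =
    b.CreateAnd(b.CreateICmpEQ(lhs, intMin), b.CreateICmpEQ(rhs, minusOne));
  Value* mustTrap = b.CreateOr(divByZero, overflow);
  BasicBlock* divBB = BasicBlock::Create(b.getContext(), "div.ok", func);
  b.CreateCondBr(mustTrap, trapBB, divBB);
  // Only the non-trapping path performs the division.
  b.SetInsertPoint(divBB);
  return b.CreateSDiv(lhs, rhs);
}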
Value* visitConst(Const* c) {
  assert(c->type.isBasic());
  switch (c->type.getBasic()) {
Note LLVM supports WASM's externref
There is now a proposal to add wasm input to upstream LLVM: https://discourse.llvm.org/t/rfc-mlir-dialect-for-webassembly/86758 If accepted, that could be very useful here, as it would let some wasm modules be read by LLVM, optimized, and re-emitted by LLVM. They will never support all of wasm (like GC, I assume), but we could do work on our side to "filter" out the parts they can't handle, let them optimize, and then re-apply the filtered parts, something like that. That might still be a lot of work for us, but a lot less than otherwise.
Thanks for sharing, and I'll read it carefully.
This draft is about leveraging LLVM optimizations to benefit wasm-opt.
Languages like C/C++ and Rust go through LLVM and benefit a lot from its optimizations. However, not all wasm producers come from LLVM (GC languages like Java, Kotlin, Dart, etc.). wasm-opt aims to take the role of a toolchain optimizer, but because it works at the AST level it misses some optimizations. For example, wasm-opt cannot eliminate a redundant store, i.e. the first of two stores to the same address with no intervening read.
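The original example is not reproduced here; as a hypothetical stand-in (not the PR's actual example), a source pattern like the following lowers to two i32.store instructions to the same address, and LLVM's dead store elimination removes the first one easily:

// Hypothetical illustration: the first store is dead because it is
// overwritten before any read can observe it.
void writeTwice(int* p) {
  *p = 1; // lowers to an i32.store whose value is never observed
  *p = 2; // only this store survives dead store elimination
}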
The general idea is: translate Binaryen IR (produced from LLVM-compatible code) into LLVM IR, let LLVM's optimizer work on it, and then translate the optimized result back. The most closely related work is Speeding up SMT Solving via Compiler Optimization (FSE 2023), which uses a similar approach, translating SMT queries into LLVM IR to benefit from LLVM optimizations.
An earlier prototype implementing this idea can be found in this PR: main...kripken:binaryen:llvm. That experiment used existing tools like wabt, emcc, and llvm-opt, but a direct 1-to-1 translation may be better.
(I'll continue this if time allows)
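For reference, a rough sketch of the LLVM side of that round trip, using LLVM's new pass manager; the Binaryen IR => LLVM IR and LLVM IR => Binaryen IR translators are the part this draft would have to provide and are not shown here:

#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"

// Run roughly what `opt -O2` runs (which includes dead store elimination) on a
// module produced by the hypothetical Binaryen IR => LLVM IR translation,
// before translating the optimized result back.
void optimizeWithLLVM(llvm::Module& llvmModule) {
  llvm::PassBuilder pb;
  llvm::LoopAnalysisManager lam;
  llvm::FunctionAnalysisManager fam;
  llvm::CGSCCAnalysisManager cgam;
  llvm::ModuleAnalysisManager mam;
  pb.registerModuleAnalyses(mam);
  pb.registerCGSCCAnalyses(cgam);
  pb.registerFunctionAnalyses(fam);
  pb.registerLoopAnalyses(lam);
  pb.crossRegisterProxies(lam, fam, cgam, mam);
  llvm::ModulePassManager mpm =
    pb.buildPerModuleDefaultPipeline(llvm::OptimizationLevel::O2);
  mpm.run(llvmModule, mam);
}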