Another Performance Problem Compared With Other Runtime Tools #759

Open
luxinyi0105 opened this issue Sep 11, 2023 · 3 comments
Labels
stack-machine: An issue tagged to the deprecated stack-machine `wasmi` engine backend.

Comments


luxinyi0105 commented Sep 11, 2023

Description

Similar to issue #754, I found another test case that takes much longer to run in wasmi than in other runtime tools.

Versions and Environment

Tool version: wasmi_cli 0.31.0
Operating system: Ubuntu 22.04.1
Architecture: x86_64

Explanation of TestCase

The test case is a mutated Wasm file. The original Wasm file was obtained by compiling a C program generated with Csmith using the Emscripten compiler (emcc).

The original C program is c_file.c, the result of compiling it with Emscripten is wasm_file.wasm, and its wat format is wat_file.wat.

We mutated the wat file by changing every i32/i64.add to i32/i64.sub and every i32/i64.shl to i32/i64.rotr. The result of the mutation is mutated_file.wat, and its wasm format is mutated_file.wasm.
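
The mutation described above amounts to a textual substitution on the wat file. The following is a minimal Python sketch of such a substitution, not necessarily the script used for this report. The function name `mutate_wat` is hypothetical, and the sketch assumes the opcodes appear as plain `i32.add`-style tokens.

```python
import re

# Opcode substitutions described in the report: add -> sub, shl -> rotr.
_SUBSTITUTIONS = {"add": "sub", "shl": "rotr"}

# Match i32/i64 add or shl only as a whole token, so identifiers such as
# "$i32.add_helper" and other opcodes (f32.add, i32x4.add) are left alone.
_PATTERN = re.compile(r"(?<![\w.$])(i32|i64)\.(add|shl)(?![\w.])")


def mutate_wat(wat_text: str) -> str:
    """Return wat_text with i32/i64.add -> sub and i32/i64.shl -> rotr."""
    return _PATTERN.sub(
        lambda m: f"{m.group(1)}.{_SUBSTITUTIONS[m.group(2)]}", wat_text
    )
```

Note that this sketch would also rewrite matching tokens inside wat comments or string literals, which is harmless for comments but may matter for data segments.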

Extra Info

Although the execution times differ, all runtime tools produce the same result: a RuntimeError due to an out-of-bounds memory access. The execution times of wasmi and the other runtime tools are as follows:

Runtime                             Execution time
wasmer                              28.679 s
wasmtime                            22.521 s
wasmedge with AOT mode              9.485 s
wasm-micro-runtime with JIT mode    2 m 55.342 s
wasm3                               15 m 54.901 s
wasmi                               140 m 58.431 s

I'm not sure whether the problem in this issue is the same as the one in issue #754; the details still need to be confirmed by you. Thanks a lot!

Robbepop (Collaborator) commented Sep 11, 2023

Thanks a lot @luxinyi0105 !

WAMR (JIT mode) also seems to be extremely slow for a JIT. Maybe you could open an issue there as well?
It also seems that this mutated Wasm file makes life significantly worse for interpreters, and perhaps more generally for Wasm runtimes that do not perform many optimizations under the hood.

Note that Wasmer, Wasmtime (and most certainly WasmEdge with AoT) are known for performing many optimizations on the input Wasm files before and during execution, while a Wasm interpreter applies only a few, if any. This could at least partly explain the huge differences we see here.

A 43x slowdown from Wasmtime to Wasm3 is also larger than what I usually measure for Wasm3. Usually Wasm3 is 5-20x slower than Wasmtime.
It is a bit worrying that wasmi is roughly 10x slower than Wasm3 here even though both are interpreter-based Wasm runtimes. The usual slowdown is roughly 2-3x. But again, Wasm3 applies more optimizations to its Wasm input than wasmi does, which could explain the large difference. Wasm runtimes are usually fed highly optimized Wasm inputs, because that is one of the main benefits.
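
The ratios discussed above can be checked against the timings reported in the issue description. This is a small Python sketch for that arithmetic; the helper names `to_seconds` and `slowdown` are made up for illustration.

```python
import re


def to_seconds(duration: str) -> float:
    """Convert a duration such as '15 m 54.901 s' or '28.679 s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)\s*m\s*)?(\d+(?:\.\d+)?)\s*s", duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Timings as reported in the issue description.
REPORTED = {
    "wasmer": "28.679 s",
    "wasmtime": "22.521 s",
    "wasmedge (AOT)": "9.485 s",
    "wasm-micro-runtime (JIT)": "2 m 55.342 s",
    "wasm3": "15 m 54.901 s",
    "wasmi": "140 m 58.431 s",
}


def slowdown(runtime: str, baseline: str) -> float:
    """Return how many times slower `runtime` is than `baseline`."""
    return to_seconds(REPORTED[runtime]) / to_seconds(REPORTED[baseline])


if __name__ == "__main__":
    print(f"wasm3 vs wasmtime: {slowdown('wasm3', 'wasmtime'):.1f}x")
    print(f"wasmi vs wasm3:    {slowdown('wasmi', 'wasm3'):.1f}x")
```

With the reported numbers this gives about 42.4x for Wasm3 relative to Wasmtime and about 8.9x for wasmi relative to Wasm3, in line with the rounded figures quoted in the comment.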

We could try to check this by applying Binaryen's wasm-opt tool onto the mutated Wasm file and see how the Wasm runtimes perform on this post-optimized Wasm file.

The register-machine based wasmi engine (#729) will implement many more optimizations on its Wasm input and will probably (hopefully) behave more similarly to Wasm3. So it would be nice to test it too. However, it is currently experimental and might not work.

Robbepop (Collaborator) commented Sep 22, 2023

@luxinyi0105 I have a new suspicion about the performance tests. You mutated the Wasm blobs without applying proper optimizations afterwards. This causes optimizing JIT and AoT Wasm engines to greatly outperform non-optimizing, interpreter-based Wasm runtimes such as Wasm3 and wasmi.
Therefore, if you want to profile mutated Wasm blobs, you should run wasm-opt -O3 on them afterwards to balance out those effects on performance.
Unoptimized Wasm blobs are usually not run on production Wasm runtimes; instead, the runtimes usually receive already highly optimized Wasm blobs.
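
The suggested workflow (optimize the mutated blob, then time each runtime on the result) could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes Binaryen's `wasm-opt` is installed and on the PATH.

```python
import subprocess
import time


def wasm_opt_command(src: str, dst: str, level: str = "-O3") -> list[str]:
    """Build a wasm-opt invocation that optimizes src and writes dst."""
    return ["wasm-opt", level, src, "-o", dst]


def timed_run(command: list[str]) -> float:
    """Run command to completion and return the elapsed wall-clock seconds.

    The exit code is deliberately not checked: the test case in this
    issue is expected to end with an out-of-bounds memory access error.
    """
    start = time.perf_counter()
    subprocess.run(command, check=False)
    return time.perf_counter() - start
```

For example, `subprocess.run(wasm_opt_command("mutated_file.wasm", "mutated_file.opt.wasm"), check=True)` would produce the optimized blob (the output file name is an assumption), after which `timed_run` can be called with the command line of each runtime being compared.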

For example, at Parity we optimize our Wasm smart contracts using LLVM, fat LTO and codegen-units=1, followed by a post-optimization step with the aforementioned wasm-opt tool from Binaryen, before we actually execute them via wasmi. This ensures a high degree of performance even in interpreter mode.

Robbepop (Collaborator) commented

Note that this issue might be fixed by the upcoming release of the wasmi register-machine engine backend.

Robbepop added the stack-machine label Dec 8, 2023