Add a compile-time feature for call hooks #8795
This commit moves the `Store::call_hook` API behind a Cargo feature named `call-hook`. This helps speed up the path from wasm into the host by avoiding branches at the start and the end of the execution. In the microbenchmark from a thread on [Zulip], this locally leads to significant performance gains, so having an option to disable the hook at the crate layer seems like a reasonable way to thread this needle for now. The downside is that this requires a crate feature at all, but I'm not sure of a better solution: LLVM can't detect that `Store::call_hook` is never invoked, so it can't optimize the branch away on its own.

[Zulip]: https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Performance.20regression.20since.20rust.201.2E65/near/444505571
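As a rough illustration of the idea, here is a minimal sketch of gating a hook behind a Cargo feature. It is not Wasmtime's actual implementation: the `Store` type, the `call_host` and `host_calls` methods, and the simplified `CallHook` enum and hook signature are all hypothetical stand-ins. Only the feature name `call-hook` and the method name `call_hook` come from the PR.

```rust
/// Simplified, hypothetical stand-in for the hook event type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallHook {
    CallingHost,
    ReturningFromHost,
}

/// Hypothetical stand-in for a store; not Wasmtime's real `Store`.
pub struct Store {
    // The field only exists when the feature is enabled, so builds
    // without it pay neither for the storage nor for the check.
    #[cfg(feature = "call-hook")]
    hook: Option<Box<dyn FnMut(CallHook) + Send>>,
    host_calls: u64,
}

impl Store {
    pub fn new() -> Self {
        Store {
            #[cfg(feature = "call-hook")]
            hook: None,
            host_calls: 0,
        }
    }

    /// Registering a hook is only possible when the feature is compiled in.
    #[cfg(feature = "call-hook")]
    pub fn call_hook(&mut self, hook: impl FnMut(CallHook) + Send + 'static) {
        self.hook = Some(Box::new(hook));
    }

    /// With the feature: a runtime branch on whether a hook is installed.
    #[cfg(feature = "call-hook")]
    #[inline]
    fn run_hook(&mut self, event: CallHook) {
        if let Some(hook) = self.hook.as_mut() {
            hook(event);
        }
    }

    /// Without the feature: an empty function, so no branch remains.
    #[cfg(not(feature = "call-hook"))]
    #[inline]
    fn run_hook(&mut self, _event: CallHook) {}

    /// Hypothetical wasm-to-host transition: hook, host function, hook.
    pub fn call_host(&mut self, f: impl FnOnce() -> u32) -> u32 {
        self.run_hook(CallHook::CallingHost);
        let ret = f();
        self.host_calls += 1;
        self.run_hook(CallHook::ReturningFromHost);
        ret
    }

    /// Number of host calls made through `call_host`.
    pub fn host_calls(&self) -> u64 {
        self.host_calls
    }
}
```

In this sketch the decision moves from run time to compile time: without the feature, `run_hook` is empty and `call_hook` does not exist, so a caller cannot install a hook that would silently never fire.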
I think this is the correct choice - call hooks fill a pretty particular need, and I'd even be fine if this feature was not enabled by default. Have we surveyed if anyone besides Fastly's embedding even makes use of it?
I haven't surveyed myself, but carrying a Cargo feature for it isn't too bad. Most of our Cargo features have been relatively low overhead to maintain so far. The main question to me is whether this should be off-by-default instead of on-by-default, as it can affect benchmarks. Even then, it's pretty rare that anyone benchmarks host-to-wasm calls or vice versa; most benchmarking is just of the wasm itself.
I think regardless of who might encounter this in benchmarks, as long as it's not essentially always used, it makes sense to disable it by default: we should keep the call overhead low wherever we can, after all, and requiring people to disable very specific features for best performance doesn't seem great to me.
I can try to investigate this a bit. I believe the
Measuring locally from the benchmark from Zulip, which is very heavy in wasm->host calls and the host call itself does nothing, enabling the Put another way I don't think it's worth over-rotating too much on
With #8807 the performance is on-par now with the
This commit disables the `call-hook` feature for the Wasmtime crate added in bytecodealliance#8795 by default. The rationale is that this has a slight cost to all embeddings even if the feature isn't used and it's not expected to be that widely used of a feature, so off-by-default seems like a more appropriate default.
* Disable `call-hook` crate feature by default

  This commit disables the `call-hook` feature for the Wasmtime crate added in #8795 by default. The rationale is that this has a slight cost to all embeddings even if the feature isn't used and it's not expected to be that widely used of a feature, so off-by-default seems like a more appropriate default.

* Enable all features in doc build
* More doc fixes