Add *_unchecked variants of Func APIs for the C API #3350

Merged · 2 commits · Sep 24, 2021

Conversation

alexcrichton (Member)

This commit is hopefully my last installment in the saga of optimizing
function calls in/out of WebAssembly modules in the C API. This is yet
another alternative approach to #3345 (sorry), but it also contains
everything necessary to make the C API fast. As in #3345, the general
idea is to move checks out of the call path, in the same style as
`TypedFunc`.

This new strategy takes inspiration from previous attempts and
effectively "just" exposes how we previously passed `*mut u128` through
trampolines for arguments/results. This storage format is formalized
through a new `ValRaw` union that is exposed from the `wasmtime` crate.
Doing this made it relatively easy to expose two new APIs:

  • `Func::new_unchecked`
  • `Func::call_unchecked`

These are the same as their checked equivalents except that they're
`unsafe` and they work with `*mut ValRaw` rather than safe slices of
`Val`. Working with these skips type checks entirely and requires
callers/embedders to uphold the type contract themselves.
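
For illustration, here is a minimal sketch of what the raw call path
might look like on the Rust side. The exact signature of
`Func::call_unchecked` and the way `ValRaw` is constructed and read have
shifted across wasmtime releases, so treat the union field access and
the example wat module below as assumptions rather than the precise API
added in this PR.

```rust
use anyhow::Result;
use wasmtime::{Engine, Instance, Module, Store, ValRaw};

fn main() -> Result<()> {
    let engine = Engine::default();
    let module = Module::new(
        &engine,
        r#"(module
             (func (export "add") (param i64 i64) (result i64)
               local.get 0
               local.get 1
               i64.add))"#,
    )?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;
    let add = instance.get_func(&mut store, "add").expect("`add` is exported");

    // One buffer holds both parameters and results; it must be at least as
    // large as the bigger of the two lists. Nothing on this path checks that
    // the contents match the function's type; that's the caller's job.
    let mut space = [ValRaw { i64: 1 }, ValRaw { i64: 2 }];
    unsafe {
        add.call_unchecked(&mut store, space.as_mut_ptr())?;
        // The single result overwrites the first slot.
        println!("1 + 2 = {}", space[0].i64);
    }
    Ok(())
}
```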

These two new functions are then exposed via the C API with new
functions, enabling C to have a fast path for calling/defining
functions. This fast path is akin to `Func::wrap` in Rust, although that
API can't be built in C because C doesn't have generics the way Rust
does.
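
On the defining side, `Func::new_unchecked` plays the corresponding role
for host functions. The sketch below assumes the callback receives a raw
`*mut ValRaw` it must read arguments from and write results into,
matching the declared `FuncType`; the exact callback signature and
return type are assumptions on my part and have been adjusted in later
releases, so this is illustrative rather than authoritative.

```rust
use wasmtime::{Engine, Func, FuncType, Store, ValRaw, ValType};

fn main() {
    let engine = Engine::default();
    let mut store = Store::new(&engine, ());

    // The declared type is the only source of truth here: nothing verifies
    // that the callback actually reads/writes values of these types.
    let ty = FuncType::new([ValType::I64], [ValType::I64]);

    // A host function that doubles its single i64 argument, written against
    // the raw-value path instead of safe `Val` slices.
    let double = unsafe {
        Func::new_unchecked(&mut store, ty, |_caller, raw: *mut ValRaw| {
            // SAFETY: `raw` points at storage sized for the parameters and
            // results declared in `ty`.
            let arg = unsafe { (*raw).i64 };
            unsafe { (*raw).i64 = arg.wrapping_mul(2) };
            Ok(())
        })
    };
    let _ = double; // would typically be passed as an import to an instance
}
```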

As for benchmarks, the numbers measured here are:

  • `nop` - Call a wasm function from the host that does nothing and
    returns nothing.
  • `i64` - Call a wasm function from the host, the wasm function calls a
    host function, and the host function returns an `i64` all the way out to
    the original caller.
  • `many` - Call a wasm function from the host, the wasm calls a
    host function with 5 `i32` parameters, and then an `i64` result is
    returned back to the original host caller.
  • `i64` host - Just the overhead of the wasm calling the host, so the
    wasm calls the host function in a loop.
  • `many` host - Same as `i64` host, but calling the `many` host function.

All numbers in this table are in nanoseconds, and each is a single
measurement, so expect some variation in the precise values.

| Name      | Rust (ns) | C before (ns) | C after (ns) |
|-----------|-----------|---------------|--------------|
| nop       | 19        | 112           | 25           |
| i64       | 22        | 207           | 32           |
| many      | 27        | 189           | 34           |
| i64 host  | 2         | 38            | 5            |
| many host | 7         | 75            | 8            |

The main conclusion here is that the C API is significantly faster than
before when using the `*_unchecked` variants of the APIs. The Rust
implementation is still the ceiling (or floor, I guess?) for
performance. The main reason C is slower than Rust is that a little bit
more data has to travel through memory, whereas on the Rust side we can
monomorphize and inline a bit more to avoid that. Overall, though, the
costs are way down from where they were originally, and I don't plan on
doing a whole lot more myself at this time. There are various things we
could theoretically do that I've considered, but implementation-wise I
think they'll be much weightier.

github-actions bot added the wasmtime:api, wasmtime:c-api, and wasmtime:ref-types labels on Sep 14, 2021
github-actions (bot)

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:c-api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types
  • peterhuene: wasmtime:api, wasmtime:c-api


alexcrichton (Member, Author)

@peterhuene I hope you don't mind a ping for review here, but if you're busy no worries and I can tag someone else!

peterhuene (Member) left a comment:

Looks good, just a few comments.

Three review threads on crates/c-api/include/wasmtime/val.h (all resolved).