Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Rust/C functions to "hook up" to multi-value functions #1178

Closed
alexcrichton opened this issue Jan 29, 2020 · 7 comments · Fixed by #2806
Closed

Enable Rust/C functions to "hook up" to multi-value functions #1178

alexcrichton opened this issue Jan 29, 2020 · 7 comments · Fixed by #2806
Labels
cranelift Issues related to the Cranelift code generator

Comments

@alexcrichton
Copy link
Member

Over in wasmtime I've been exploring the area of hooking up compile-time functions (those defined in Rust or C) directly with cranelift-generated functions. For example a wasm module like this:

(module
  (import "" "" (func (param i32)))
  (start (func 0)))

could get hooked up directly to a Rust function that looked like:

extern "C" fn the_import(vmctx: *mut VMContext, param: i32) { /* ... */ }

(more or less). When I say "hooked up directly" here I mean that zero cranelift-generated shim functions are required to have cranelift call out to the imported function. Today the wasmtime::Func type always uses cranelift to generate a fresh jit function which has an appropriate trampoline.

The goal here is to basically make extraction of a function pointer and calling it or providing a function pointer as an import as cheap of an operation as possible.

This all works well today with the default ABIs implemented in Cranelift and in general we don't have any issues. For example #839 is a start of how this might look in the wasmtime crate.

A wrench is thrown into the works with multi-value wasm functions, however. The ABI of a multi-value return function is fundamentally incompatible with anything you can write in C/Rust/etc today. In talking with @sunfishcode it looks like the multi-value return ABI (when dealing with more than one return value) is modeled after the concept of a multi-value return in LLVM. This is, however, purely an aspect of LLVM IR where you can return a tuple, and you can't actually write stable Rust or C code to generate LLVM IR that has this form of a tuple return. This means that if you have a module like:

(module
  (import "" "" (func (result i32 i32)))
)

it's fundamentally impossible to provide a Rust/C function pointer as the import there. There's no way to get the compiler to generate a function that supports the ABI that cranelift expects, meaning that cranelift is required to generate a jit function shim to call between Rust and wasm.

In some discussion, I think there's two possible ways to fix this:

Change Cranelift to behave as if it's returning a struct, not a tuple

Currently cranelift's multi-value return ABI is modeled after what the multi-return ABI looks like in LLVM/gcc. While there is not source-language-level equivalent to this it's all compiler-internal details and you can generate LLVM IR to match up with what cranelift generates today.

An alternative, though, is to update Cranelift's interpretation of a multi-value return to "behave as if an aggregate struct was returned with the multi-value fields" if there is more than one return value. This feels a bit weird, but for example given:

(module
  (import "" "" (func (result i32 i32)))
)

you could hook that up natively to:

#[repr(C)]
struct A(i32, i32);

extern "C" fn foo(vmctx: *mut VMContext) -> A {
    // ... 
}

Or perhaps instead of changing cranelift's default we could simply add a new ABI which matches a "struct-like return" ABI rather than the current tuple-like return ABI.

In any case this seems like it would be a significant undertaking. The rules for how to return an aggregate are really complicated and at least to me make basically no sense. Trying to have Cranelift match exactly what the system ABI looks like is likely a very large undertaking which would take quite some time to get right (and likely a bunch of internal refactoring).

This leads us to another alternative...

Change cranelift to always use an out-pointer for multi-value returns

Instead of trying to match exactly what the system ABI looks like for returning aggregates, we could change cranelift's system ABI (or add a new ABI, or just do this all at the wasm layer) to do something like follows:

  • If a function returns 0 values or 1 value, do the normal thing you'd expect (match the system ABI)
  • If a function returns 2 or more values, then always synthesize an out-pointer where the layout of the out-pointer is the same as a C struct with fields of the types of the return value.

This way the above example could be hooked up to a function like this:

#[repr(C)]
struct A(i32, i32);

extern "C" fn foo(outptr: *mut A, vmctx: *mut VMContext) {
    // ... 
}

The downside of this approach is that it may be wasm-specific or a bit of a hack in cranelift, but the upside is that there's no mucking around with the system ABI and trying to do clever things like packing (i32, i32) into a 64-bit register.

This feels like the better solution to me, but I'm curious to hear what others think as well! If others have questions I certainly don't mind answering them as well.

@bjorn3
Copy link
Contributor

bjorn3 commented Jan 29, 2020

If a function returns 2 or more values, then always synthesize an out-pointer where the layout of the out-pointer is the same as a C struct with fields of the types of the return value.

For exactly two return values, cg_clif needs to pass both in registers to match the native abi. This is what is currently implemented for the SystemV abi.

@bnjbvr
Copy link
Member

bnjbvr commented Feb 21, 2020

Change cranelift to always use an out-pointer for multi-value returns

I like this approach because it would be nice and simple to do so. Cranelift has several use cases, though, and it would create a privilege for a given use case to only adopt one possible way of returning multiple values; other implementations would need thunks to adapt to this ABI, adding a performance penalty.

So I'm all in favor of implementing this in addition to the current thing (that could be called "systemv_multiple_results"), if you want to simplify the way it's done in Wasmtime.

Spidermonkey does its own thing, and this is subject to change in the future too: if it's a multiple results function, the first result goes into a register, and all the following will go onto the stack. So we will have this requirement of having several multiple-returns ABI specifications anyway.

@alexcrichton
Copy link
Member Author

@bnjbvr oh right sorry I realize now how I was saying that things should outright change, but what I meant and a much better way to word it is how you put it, adding support for a different ABI. That'd definitely work well for wasmtime I think!

@bnjbvr
Copy link
Member

bnjbvr commented Feb 21, 2020

@alexcrichton No worries! This plan sounds good.

@fitzgen
Copy link
Member

fitzgen commented Feb 21, 2020

The other wrinkle is that by the time we allocate a struct return or not, we don't really know if the values we're returning are "actually" multi-value or simply got split during legalization into multiple values.

A dedicated ABI for always using return pointers seems like a good plan, and should make dealing with the above easier.

@alexcrichton alexcrichton transferred this issue from bytecodealliance/cranelift Feb 28, 2020
@alexcrichton alexcrichton added the cranelift Issues related to the Cranelift code generator label Feb 28, 2020
@nlewycky
Copy link

We're looking into implementing multi-value return with the same ABI as a C struct return for our wasm implementation.

// We pack the values.
#[repr(C)]
pub struct S2 ( i32, i32 );
extern fn test1() -> S2 {
    S2(1, 2)
}

// Needs to be returned sret, but cranelift returns RAX, RDX, RCX.
#[repr(C)]
pub struct S3 ( i64, i64, i64 );
extern fn test2() -> S3 {
    S3(1, 2, 3)
}

The way I've done it so far is that the user of cranelift is responsible for doing packing and unpacking of pairs of 32-bit value into a single 64-bit value (including writing out a different function signature if needed, for example, a wasm signature returning I32, F32, I32 becomes a cranelift signature that returns i64, i32), while cranelift is responsible for doing sret conversion when needed. Unfortunately, cranelift doesn't always perform sret conversion when the equivalent C struct would be returned sret.

This means that either the caller also needs to perform sret conversion in the remaining cases, or we can write a patch that adds a new system_v_multiple_results ABI to cranelift. The new ABI would disable usage of RCX as a return and change when cranelift applies sret, but would not resolve the issue with packing and unpacking of values because there is currently no way for legalize_args to assign two return values into one register (ValueConversion::IntSplit does the opposite, splitting a single value across multiple registers). Would you be interested in a patch which adds that?

@sunfishcode
Copy link
Member

I would like Cranelift to avoid evolving in the direction of C-compatible struct calling convention support. A full implementation, especially on x86-64, would require fairly detailed knowledge of the C type system, which is otherwise out of scope for Cranelift, so most likely we'd end up with partial support, which is awkward in terms of users knowing what to expect, developers knowing what features we intend to support and what's out of scope, and awkwardly overlapping functionality with the full support we hope to add in Cranelift frontends in the future.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Apr 5, 2021
For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes bytecodealliance#1178
alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Apr 5, 2021
For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes bytecodealliance#1178
alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Apr 5, 2021
For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes bytecodealliance#1178
alexcrichton added a commit that referenced this issue Apr 7, 2021
* Fully support multiple returns in Wasmtime

For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes #1178

* Tweak some names and add docs

* "fix" lightbeam compile

* Fix TODO with dummy environ

* Unwind info is a property of the target, not the ABI

* Remove lightbeam unused imports

* Attempt to fix arm64

* Document new ABIs aren't stable

* Fix filetests to use the right target

* Don't always do 64-bit stores with cranelift

This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.

* At least get tests passing on the old backend

* Fix a typo

* Add some filetests with mixed abi calls

* Get `multi` example working

* Fix doctests on old x86 backend

* Add a mixture of wasmtime/system_v tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants