proposal: cmd/compile: relax wasm32 function import signature type constraints #66984

Open
johanbrandhorst opened this issue Apr 23, 2024 · 9 comments
@johanbrandhorst
Member

johanbrandhorst commented Apr 23, 2024

Background

#59149 removed the package restrictions on the use of go:wasmimport, but established strict constraints on the types that can be used as input and result parameters. The motivation for this was that supporting rich types between the host and the client would require sophisticated and expensive runtime type conversions because of the mismatch between the 64 bit architecture of the client and the 32 bit architecture of the host.

With the upcoming 32 bit wasm port, this problem goes away, as both client and host will use 32 bit pointers.

Proposal

Relax the constraints on types that can be used as input and result parameters with the go:wasmimport compiler directive, on ports using the wasm32 architecture only. This currently limits this proposal to wasip1/wasm32.

The following types would be allowed as input parameters:

  • bool
  • int, uint, int8, uint8, int16, uint16, int32, uint32, int64, uint64
  • float32, float64
  • string
  • struct where all fields are allowed types
  • [...]T array where T is an allowed type
  • uintptr, unsafe.Pointer, *T where T is an allowed type

The following types would remain disallowed:

  • chan T
  • complex64, complex128
  • func
  • interface
  • map[T]U
  • []T

Only simple scalar types (bool, (u|)int(|8|16|32|64), float(32|64), uintptr, unsafe.Pointer) would be allowed as the result parameter type.

Discussion

Compatibility guarantees

The Go spec does not specify the struct layout and leaves it up to implementations to decide. As such, we cannot provide a guaranteed ABI without having to change the spec or force future layout changes to provide runtime conversion of data. This proposal suggests making it clear to users through documentation that there are no guarantees of compatibility across versions of the Go compiler.

Type conversion rules

The following conversion rules would be automatically applied by the compiler for the respective parameter type:

| Go type | Type passed to host (per Wasm spec) | Type read from host |
|---|---|---|
| bool | i32 | i32 |
| int, uint, int8, uint8, int16, uint16, int32, uint32 | i32 (each) | i32 (each) |
| int64, uint64 | i64 (each) | i64 (each) |
| float32 | f32 | f32 |
| float64 | f64 | f64 |
| string | Assigned to two call parameters as a (i32, i32) tuple of (pointer, len). | N/A |
| struct | Struct fields are assigned to call parameters in order (i.e. field 1 goes into param 1) according to its type conversion rule. This follows the C struct value semantics. | N/A |
| [...]T | Each entry is assigned to call parameters in order according to its type conversion rule. | N/A |
| uintptr | i32 | i32 |
| unsafe.Pointer | i32 | i32 |
| *T | i32 | N/A |

Result parameters

Result parameters are more restricted since pointer values from the host cannot be managed safely by the GC, and Wasm practically does not allow more than 1 result parameter. Only basic scalar values and unsafe.Pointer are allowed as the result parameter type.

Supporting slices, maps

Both slices and maps are disallowed because of the uncertainty around the memory underlying these types and interactions with struct and array rules. Users who wish to pass slices can do so manually as a (&slice[0], len(slice)) pair or via unsafe.Pointer. There is no clear way to support passing or returning map data from the host other than by using unsafe.Pointer and making assumptions about the underlying data.

Related proposals

structs.HostLayout

#66408 proposes a way for users to request that struct layout is host compatible. It does not create a conflict with the ideas put forth in this proposal.

go:wasmexport

The proposed relaxing of constraints would also apply to uses of go:wasmexport, as described in #65199.

Future work

WASI Preview 2 (AKA WASI 0.2)

WASI Preview 2 defines its API in terms of the Component Model, with a rich type system and an IDL language, WIT. The Component Model also defines a Canonical ABI with a specification for lifting and lowering Component Model types into and out of linear memory. This proposal does not attempt to define the ABI for any hypothetical wasip2 target, and would leave such decisions for any future wasip2 proposal.

Contributors

@johanbrandhorst, @evanphx, @achille-roussel, @dgryski, @ydnar

CC @cherrymui @golang/wasm

@gopherbot gopherbot added this to the Proposal milestone Apr 23, 2024
@dr2chase
Contributor

dr2chase commented Apr 23, 2024

"This follows the C struct value semantics" is just a hair vague; are 8-byte quantities (float64, int64, uint64) stored at a 4-byte or 8-byte alignment? It was my understanding (and the purpose of #66408) to specify a 4-byte alignment for fields of those types when they occur in structs passed to wasm32 (tagged structs.HostLayout).

(edited to note error, the host alignment for 8-byte integers and floats is 8 bytes).

@ydnar

ydnar commented Apr 23, 2024

Ideally 8-byte values would always be 8-byte aligned in the wasm32 port.

@evanphx
Contributor

evanphx commented Apr 23, 2024

@dr2chase Looking at what clang does, it uses 8-byte alignment for 64-bit quantities, so we'd match that.

@dr2chase
Contributor

dr2chase commented Apr 23, 2024

You are right, I got it backwards. But that is what you are expecting for anything that has pointers-to-it passed to the wasm host platform, yes?

@cherrymui
Member

Thanks for the proposal! A few questions:

  • 8-byte alignment for 64-bit values, as mentioned above. cmd/compile: create GOARCH=wasm32 #63131 doesn't seem to have a definitive answer, and currently it seems the CL doesn't implement 64-bit alignment.
  • If we don't always align 64-bit values to 8 bytes (doing so would differ from all 32-bit architectures we currently support and would probably require quite some work), we should align 64-bit values to 8 bytes when structs.HostLayout is specified. So proposal: structs: add HostLayout "directive" type #66408 is closely related.
  • structs and arrays. What is the ABI specification exactly? The C ABI on, say ELF AMD64, is pretty complex for passing structs and arrays. Small fields may be packed into one word. Large structs may be passed indirectly (stored on stack, passing a pointer to the callee). Do we have a specification for this?
  • string. What does a string look like on the Wasm/WASI side? I couldn't find its specification in the WASI P1 doc. In the Component Model doc https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md (which I guess is for WASI P2, not P1?), a string is specified as two i32 values, which is similar to Go's string, which is good. But it also allows three encodings: UTF-8, UTF-16, and "latin1+utf16" differentiated by a high bit. The second and third encodings are not compatible with Go strings. Do we require UTF-8 encoding? Or do we not allow passing Go strings directly?

Besides, for structs, arrays of structs, and pointer to structs, I would suggest we allow only structs with structs.HostLayout to be passed. The reason is that in the Go spec we don't require struct fields to be laid out in memory in source order, and it may well change in a future Go release. structs.HostLayout specifies a fixed layout. Structs without that marker can change. This gives a clear way to say which structs should have a fixed layout, which are okay to change.

Thanks.

@dr2chase
Contributor

Two other questions, first:

type w32thing struct {
    _ structs.HostLayout
    a uint8
    b uint16
}

Is this laid out a_bb or is it aaaabbbb? What sizes do I use for struct fields? I assume it is the smaller ones, but I wanted to verify this else it would be a problem.

Second, passing pointers to 8-byte primitive types to the host will be tricky unless those references come from fields in structures tagged with HostLayout -- otherwise, they may not be aligned. So

type wx struct {
    _ structs.HostLayout
    x int64
}
func f(x int64, w wx) {
    someWasmFunc(&x)      // might not work, x might not be 8-byte aligned
    someWasmFunc(&w.x)    // this will work because w is a wx and its x field is 8-byte aligned
    someOtherWasmFunc(&w) // if it used *wx for its parameter type instead of *int64
}
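The alignment concern can be checked at run time by inspecting the address itself. The sketch below uses a hypothetical `isAligned` helper; what it reports for a plain `int64` variable depends on the target architecture, so no output is shown.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of n.
// Hypothetical helper for illustration only.
func isAligned(addr, n uintptr) bool {
	return addr%n == 0
}

func main() {
	var x int64
	// Alignof reports the alignment the compiler guarantees for the type;
	// the second value reports whether this particular variable happens to
	// sit on an 8-byte boundary.
	fmt.Println(unsafe.Alignof(x), isAligned(uintptr(unsafe.Pointer(&x)), 8))
}
```

On 32-bit ports where the guaranteed alignment of int64 is 4, the second value may be true or false from one variable to the next, which is the hazard described above.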

@johanbrandhorst
Member Author

johanbrandhorst commented Apr 25, 2024

Thanks for the quick feedback! I've tried to answer each question:

> structs and arrays. What is the ABI specification exactly? The C ABI on, say ELF AMD64, is pretty complex for passing structs and arrays. Small fields may be packed into one word. Large structs may be passed indirectly (stored on stack, passing a pointer to the callee). Do we have a specification for this?

The specification falls out of the table of transformations (I think?). The current plan isn't to introduce any sort of magic around large structs or field packing. Struct fields are added as call parameters, from the first field to the last, according to the conversion rules for the type of the field. Examples:

type foo struct {
    a int
    b string
    c [2]float32
}

With a function signature of

//go:wasmimport some_module some_function
func wasmFunc(in foo) int

Would roughly translate to (in WAT format)

;; $a is of type `i32` holding the value of `a`
;; $b_addr is of type `i32` and is a pointer to the start of the bytes for the Go string `b`
;; $b_len is of type `i32` and is the length in bytes to read from `$b_addr` to get the whole string
;; $c_0 is of type `f32` and is the value of `c[0]`
;; $c_1 is of type `f32` and is the value of `c[1]`
call $some_function (local.get $a) (local.get $b_addr) (local.get $b_len) (local.get $c_0) (local.get $c_1)

Struct fields would be expanded into call parameters before subsequent fields at the same level.

> What does a string look like on Wasm/WASI side?

For wasip1, we will treat Go string parameters simply as a (*byte, int) tuple. There will be no encoding constraints, just as with regular Go strings. To the Wasm host, it will look identical to using struct { a *byte; b int } as a parameter. For wasip2, those constraints would have to be considered in a hypothetical future wasip2 proposal.
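A minimal sketch of the (*byte, int) view of a string, assuming Go 1.20 or later for `unsafe.StringData`. `stringArgs` is a hypothetical helper; the compiler would do the equivalent internally when lowering a string parameter.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringArgs returns the pointer to the string's bytes and its length in
// bytes. For an empty string the pointer value is unspecified.
// Hypothetical helper for illustration only.
func stringArgs(s string) (*byte, int) {
	return unsafe.StringData(s), len(s)
}

func main() {
	p, n := stringArgs("héllo")
	// The length counts bytes, not characters, and the bytes are passed
	// through as-is with no encoding check.
	fmt.Println(n, *p) // prints "6 104"
}
```

The length is 6 rather than 5 because "é" occupies two bytes in the UTF-8 source literal, which is what the host would see.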

Making structs.HostLayout required for structs, arrays of structs and pointers to structs

This sounds like a great idea, and we should also extend it to pointers to 8 byte sized primitive types to guarantee alignment, as suggested by @dr2chase's last question. This would avoid any question around alignment issues for pointers. It hurts the ergonomics a little bit but that's a price worth paying, I think.

> type w32thing struct {
>     _ structs.HostLayout
>     a uint8
>     b uint16
> }
>
> Is this laid out a_bb or is it aaaabbbb? What sizes do I use for struct fields? I assume it is the smaller ones, but I wanted to verify this else it would be a problem.

I'm a little confused by the question to be honest. If this type was used as an input to a Wasm call, it would look like this:

;; $a is of type `i32`
;; $b is of type `i32`
call $some_function (local.get $a) (local.get $b)

I suppose that might mean the memory looks like this: a___bb__? We're not passing a pointer to the struct or the fields, so we'd need to copy the values into locals, which will be of type i32 (I think)? Admittedly my grasp of this exact part of the code is a bit weak so I appreciate corrections.

@cherrymui
Member

Thanks for the response!

> Structs fields are added as call parameters, from the first field to the last, according to the conversion rules for the type of the field.

This sounds like a reasonable choice. Is this ABI specified anywhere in Wasm/WASI docs? Or the Wasm side has to define the function taking parameters element-wise?

> For wasip1, we will treat Go string parameters simply as a (*byte, int) tuple. There will be no encoding constraints, just as with regular Go strings. To the Wasm host, it will look identical to using struct { a *byte; b int } as a parameter.

This sounds reasonable as well. Is it specified anywhere in Wasm/WASI docs?

Thanks.

@johanbrandhorst
Member Author

> This sounds like a reasonable choice. Is this ABI specified anywhere in Wasm/WASI docs? Or the Wasm side has to define the function taking parameters element-wise?

I don't know about this being an official ABI so much as just a consequence of the Wasm spec around function calls and how we can apply Go semantics to it. We're limited to the i32, i64, f32 and f64 value types, and the call instruction takes a function index and arguments from the stack. In order to simulate pass-by-value for structs, we have to flatten each field to one of the allowed value types.

> This sounds reasonable as well. Is it specified anywhere in Wasm/WASI docs?

Not sure there's a doc anywhere, but practically, definitions like path_create_directory, which take a string parameter, use this pattern: https://cs.opensource.google/go/go/+/refs/tags/go1.22.2:src/syscall/fs_wasip1.go;l=230.
