Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add experimental support for the anyref type #1002

Merged
merged 1 commit into from Feb 20, 2019

Conversation

alexcrichton
Copy link
Contributor

This commit adds experimental support to wasm-bindgen to emit and
leverage the anyref native wasm type. This native type is still in a
proposal status (the reference-types proposal). The intention of
anyref is to be able to directly hold JS values in wasm and pass the
to imported functions, namely to empower eventual host bindings (now
renamed WebIDL bindings) integration where we can skip JS shims
altogether for many imports.

This commit doesn't actually affect wasm-bindgen's behavior at all
as-is, but rather this support requires an opt-in env var to be
configured. Once the support is stable in browsers it's intended that
this will add a CLI switch for turning on this support, eventually
defaulting it to true in the far future.

The basic strategy here is to take the stack and slab globals in the
generated JS glue and move them into wasm using a table. This new table
in wasm is managed at the fringes via injected shims. At
wasm-bindgen-time the CLI will rewrite exports and imports with shims
that actually use anyref if needed, performing loads/stores inside the
wasm module instead of externally in the wasm module.

This should provide a boost over what we have today, but it's not a
fantastic strategy long term. We have a more grand vision for anyref
being a first-class type in the language, but that's on a much longer
horizon and this is currently thought to be the best we can do in terms
of integration in the near future.

The stack/heap JS tables are combined into one wasm table. The stack
starts at the end of the table and grows down with a stack pointer (also
injected). The heap starts at the end and grows up (state managed in
linear memory). The anyref transformation here will hook up various
intrinsics in wasm-bindgen to the runtime functionality if the anyref
supoprt is enabled.

The main tricky treatment here was applied to closures, where we need JS
to use a different function pointer than the one Rust gives it to use a
JS function pointer empowered with anyref. This works by switching up a
bit how descriptors work, embedding the shims to call inside descriptors
rather than communicated at runtime. This means that we're accessing
constant values in the generated JS and we can just update the constant
value accessed.

@alexcrichton
Copy link
Contributor Author

This is intended to get some early review and not actually merge at this time. This is all still entirely untested other than hand-inspecting wat output, we're still waiting on support in an engine to play with this!

@alexcrichton
Copy link
Contributor Author

Also if there are thoughts about landing parts of this first, I'd totally be down for that. Some separable pieces that could be landed first are:

  • Refactoring js/mod.rs to use self.get_object(..) and friends everywhere
  • Moving invoke functions from runtime to descriptors for closures (making them constants in JS bindings)

alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This is split out from rustwasm#1002 and is intended to fix the tool's handling
of the `start` function. For the most accurate emulation of the wasm ESM
spec I believe we need to defer execution of the start function until
all our exports are wired up which should allow valid cyclical
references during instantiation.

The fix here is to remove the start function, if one is present, and
inject an invocation of it at the end of initialization (after our
exports are wired up). This fixes tests on rustwasm#1002, but doesn't have any
direct analogue for tests here just yet.

Along the way because multiple files now come out of `wasm2es6js` by
default I've added an `--out-dir` argument as well as `-o` to ensure
that a folder for all outputs can be specified.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This commit fixes a case in `wasm2es6js` where if an imported function
was reexported it wasn't handled correctly. This doesn't have a direct
test but came up during the development of rustwasm#1002
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
Currently closure shims are communicated to JS at runtime, although at
runtime the same constant value is always passed to JS! More pressing,
however, work in rustwasm#1002 requires knowledge of closure descriptor indices
at `wasm-bindgen` time which is not currently known.

Since the closure descriptor shims and such are already constant values,
this commit moves the descriptor function indices into the *descriptor*
for a closure/function pointer. This way we can learn about these values
at `wasm-bindgen` time instead of only knowing them at runtime.

This should have no semantic change on users of `wasm-bindgen`, although
some closure invocations may be slightly speedier because there's less
arguments being transferred over the boundary. Overall though this will
help rustwasm#1002 as the closure shims that the Rust compiler generates may not
be the exact ones we hand out to JS, but rather wrappers around them
which do `anyref` business things.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This was intended earlier, but fogotten! Found during work on rustwasm#1002
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
Currently closure shims are communicated to JS at runtime, although at
runtime the same constant value is always passed to JS! More pressing,
however, work in rustwasm#1002 requires knowledge of closure descriptor indices
at `wasm-bindgen` time which is not currently known.

Since the closure descriptor shims and such are already constant values,
this commit moves the descriptor function indices into the *descriptor*
for a closure/function pointer. This way we can learn about these values
at `wasm-bindgen` time instead of only knowing them at runtime.

This should have no semantic change on users of `wasm-bindgen`, although
some closure invocations may be slightly speedier because there's less
arguments being transferred over the boundary. Overall though this will
help rustwasm#1002 as the closure shims that the Rust compiler generates may not
be the exact ones we hand out to JS, but rather wrappers around them
which do `anyref` business things.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This commit fixes a case in `wasm2es6js` where if an imported function
was reexported it wasn't handled correctly. This doesn't have a direct
test but came up during the development of rustwasm#1002
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This is split out from rustwasm#1002 and is intended to fix the tool's handling
of the `start` function. For the most accurate emulation of the wasm ESM
spec I believe we need to defer execution of the start function until
all our exports are wired up which should allow valid cyclical
references during instantiation.

The fix here is to remove the start function, if one is present, and
inject an invocation of it at the end of initialization (after our
exports are wired up). This fixes tests on rustwasm#1002, but doesn't have any
direct analogue for tests here just yet.

Along the way because multiple files now come out of `wasm2es6js` by
default I've added an `--out-dir` argument as well as `-o` to ensure
that a folder for all outputs can be specified.
@alexcrichton
Copy link
Contributor Author

Ok this is getting to a good spot! I've split out a bunch of PRs from this which aren't directly related to this change, so only the last commit should be the most relevant for this PR itself.

I'm able to get a number of toy "hello world" style programs working in Firefox with this PR. Small DOM interactions and such work and I can indeed start to see the glacier that is our glue starting to melt! (aka glue is being removed).

Further work I believe will be blocked on https://bugzilla.mozilla.org/show_bug.cgi?id=1505768. I'd still prefer to not land this until there's a fully tested proof-of-concept (aka a larger test suite passing against an implementation). Firefox is the only implementor of anyref as of this red-hot-second (AFAIK), and that bug means that we can't actually store all JS values in anyref tables (undefined flat out doesn't work, true gets converted to a Boolean, etc). Once that bug is fixed we'll be able to keep testing on this end and make sure all this support is ship shape!

In the meantime having the current implementation in Firefox (even if not complete) was mega helpful. I've found tons of little issues here and there already!

alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
Currently closure shims are communicated to JS at runtime, although at
runtime the same constant value is always passed to JS! More pressing,
however, work in rustwasm#1002 requires knowledge of closure descriptor indices
at `wasm-bindgen` time which is not currently known.

Since the closure descriptor shims and such are already constant values,
this commit moves the descriptor function indices into the *descriptor*
for a closure/function pointer. This way we can learn about these values
at `wasm-bindgen` time instead of only knowing them at runtime.

This should have no semantic change on users of `wasm-bindgen`, although
some closure invocations may be slightly speedier because there's less
arguments being transferred over the boundary. Overall though this will
help rustwasm#1002 as the closure shims that the Rust compiler generates may not
be the exact ones we hand out to JS, but rather wrappers around them
which do `anyref` business things.
alexcrichton added a commit that referenced this pull request Nov 29, 2018
A few minor CLI tweaks during work on #1002
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 29, 2018
This commit fixes a case in `wasm2es6js` where if an imported function
was reexported it wasn't handled correctly. This doesn't have a direct
test but came up during the development of rustwasm#1002
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 30, 2018
This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in rustwasm#1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of borrowed values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 30, 2018
This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in rustwasm#1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of borrowed values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 30, 2018
This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in rustwasm#1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of borrowed values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 30, 2018
This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in rustwasm#1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of borrowed values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
alexcrichton added a commit to alexcrichton/wasm-bindgen that referenced this pull request Nov 30, 2018
This commit switches strategies for storing `JsValue` from a heap/stack
to just one heap. This mirrors the new strategy for `JsValue` storage
in rustwasm#1002 and should make multiplexing those strategies at
`wasm-bindgen`-time much easier.

Instead of having one array which acts as a stack for borrowed values
and one array for a heap of borrowed values, only one JS array is used
for storage of JS values now. This makes `getObject` far simpler by
simply being an array access, but it means that cloning an object now
reserves a new slot instead of reference counting it. If the old
reference counting behavior is needed it's thought that `Rc<JsValue>`
can be used in Rust.

The new "heap" has an initial stack pointer which grows downwards, and a
heap which grows upwards. The heap is a singly-linked-list which is
allocated/deallocated from. The stack grows downwards to zero and
presumably starts generating errors once it underflows. An initial stack
size of 32 is chosen as that should encompass all use cases today, but
we can eventually probably add configuration for this!

Note that the heap is initialized to all `null` for the stack and then
the initial JS values (`undefined`, `null`, `true`, `false`) are pushed
onto the heap in reserved locations.
@alexcrichton
Copy link
Contributor Author

Ok I've rebased this now! This now is built all on top of walrus and I've tested to have everything work locally as well (even caught a bug in the old implementation).

@fitzgen mind doing a once-over of this? I think that reference types are getting final enough that we should be good to go with an implementation on our end.

@alexcrichton alexcrichton force-pushed the reference-types branch 2 times, most recently from 43dab46 to 715545c Compare February 19, 2019 21:15
let ty = module.types.add(&[ValType::I32], &[ValType::I32]);
let clone_ref = builder.finish(ty, vec![arg], vec![set_table, get_local], module);
let name = "__wbindgen_object_clone_ref".to_string();
module.funcs.get_mut(clone_ref).name = Some(name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a method for names to the builder. I also vaguely wanted this before.

Filed rustwasm/walrus#61

This commit adds experimental support to `wasm-bindgen` to emit and
leverage the `anyref` native wasm type. This native type is still in a
proposal status (the reference-types proposal). The intention of
`anyref` is to be able to directly hold JS values in wasm and pass the
to imported functions, namely to empower eventual host bindings (now
renamed WebIDL bindings) integration where we can skip JS shims
altogether for many imports.

This commit doesn't actually affect wasm-bindgen's behavior at all
as-is, but rather this support requires an opt-in env var to be
configured. Once the support is stable in browsers it's intended that
this will add a CLI switch for turning on this support, eventually
defaulting it to `true` in the far future.

The basic strategy here is to take the `stack` and `slab` globals in the
generated JS glue and move them into wasm using a table. This new table
in wasm is managed at the fringes via injected shims. At
`wasm-bindgen`-time the CLI will rewrite exports and imports with shims
that actually use `anyref` if needed, performing loads/stores inside the
wasm module instead of externally in the wasm module.

This should provide a boost over what we have today, but it's not a
fantastic strategy long term. We have a more grand vision for `anyref`
being a first-class type in the language, but that's on a much longer
horizon and this is currently thought to be the best we can do in terms
of integration in the near future.

The stack/heap JS tables are combined into one wasm table. The stack
starts at the end of the table and grows down with a stack pointer (also
injected). The heap starts at the end and grows up (state managed in
linear memory). The anyref transformation here will hook up various
intrinsics in wasm-bindgen to the runtime functionality if the anyref
supoprt is enabled.

The main tricky treatment here was applied to closures, where we need JS
to use a different function pointer than the one Rust gives it to use a
JS function pointer empowered with anyref. This works by switching up a
bit how descriptors work, embedding the shims to call inside descriptors
rather than communicated at runtime. This means that we're accessing
constant values in the generated JS and we can just update the constant
value accessed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants