This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

How much would "anyref" as a new value type help? #9

Closed
titzer opened this issue Nov 21, 2017 · 10 comments


@titzer

titzer commented Nov 21, 2017

We have a pretty uncontroversial plan to introduce an anyref type to WASM in the future. Instead of restricting this type to tables only, what if we also allowed anyref as a WASM value type (not storable in linear memory) and added table_get and table_set operations? If anyref remains an opaque type for now (i.e. no new operators), I think this has minimal impact on engine complexity. (For non-JS embedded engines there are implementation strategies that don't require pointer maps, see below*)

This could remove a lot of table/slot management complexity, especially for references that never need to be stored into memory, where table/slot management is just overhead. Instead, user code can implement its own WASM wrappers around imports and exports to manage tables and slots so that slot indexes can be stored into the linear memory.
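
To make the wrapper idea concrete, here is a minimal sketch of user-level slot management. Everything in it is an assumption for illustration: `SlotTable`, `put`, `get`, and `release` are hypothetical names, a Python list stands in for a wasm table, and a `bytearray` stands in for linear memory. Only the slot index is written to memory; the reference itself stays in the table.

```python
import struct

_EMPTY = object()  # sentinel marking a free slot (hypothetical)


class SlotTable:
    """Hypothetical user-level table mapping i32 slot indexes to opaque references."""

    def __init__(self):
        self._slots = []  # slot index -> reference, or _EMPTY
        self._free = []   # recycled slot indexes

    def put(self, ref):
        """Store ref in a free slot and return its index (stand-in for table_set)."""
        if self._free:
            index = self._free.pop()
            self._slots[index] = ref
        else:
            index = len(self._slots)
            self._slots.append(ref)
        return index

    def get(self, index):
        """Return the reference held in a slot (stand-in for table_get)."""
        ref = self._slots[index]
        if ref is _EMPTY:
            raise KeyError(f"slot {index} is empty")
        return ref

    def release(self, index):
        """Drop the reference and make the slot reusable."""
        self._slots[index] = _EMPTY
        self._free.append(index)


memory = bytearray(64)   # stand-in for linear memory
table = SlotTable()
host_object = object()   # stand-in for an opaque host reference

slot = table.put(host_object)
struct.pack_into("<I", memory, 8, slot)          # store only the index in memory

(loaded,) = struct.unpack_from("<I", memory, 8)  # load the index back
assert table.get(loaded) is host_object
```

References that never need to live in memory would skip this table entirely and simply flow through locals and parameters.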

I think this would be more forward-compatible, since I can imagine a world where managed data is dominant and JS/WASM interop is much smoother. In that future world the table bindings would be legacy complexity that we'd probably want to remove.

* I specifically imagine native WASM engines (i.e., not browser engines) that have no support for GC, no pointer maps for stack walking, etc. In such engines, it would be less work to introduce their own stack-allocated handle tables and not have to track pointer maps internally, e.g. in JITed code.
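
One possible reading of "stack-allocated handle tables" is sketched below. It is an interpretation, not a description of any real engine: `HandleStack`, `frame`, `new_handle`, and `deref` are hypothetical names. Code holds small integer handles instead of raw references, and every handle created inside a frame is dropped when the frame exits, so the engine never has to find references in machine frames.

```python
from contextlib import contextmanager


class HandleStack:
    """Hypothetical per-thread handle table that grows and shrinks with call frames."""

    def __init__(self):
        self._handles = []  # references kept alive by the active frames

    @contextmanager
    def frame(self):
        """Open a frame; all handles created inside it are released on exit."""
        mark = len(self._handles)
        try:
            yield self
        finally:
            del self._handles[mark:]

    def new_handle(self, ref):
        """Register a reference and return its handle (an index)."""
        self._handles.append(ref)
        return len(self._handles) - 1

    def deref(self, handle):
        """Look up the reference behind a handle."""
        return self._handles[handle]

    def live_count(self):
        return len(self._handles)


stack = HandleStack()
with stack.frame():
    outer = stack.new_handle("host object")
    with stack.frame():
        inner = stack.new_handle("temporary")
        assert stack.deref(inner) == "temporary"
    assert stack.live_count() == 1   # the inner frame's handle is gone
    assert stack.deref(outer) == "host object"
assert stack.live_count() == 0
```
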
@lukewagner
Member

That's a really good point: in the absence of GC data types (or with them turned off), anyref could be implemented with, e.g., ObjC ARC, maintaining the ability of wasm to operate in a GC-free mode. For browsers, this does still effectively make some amount of wasm GC integration block the Host Binding feature, but OTOH it removes a growing swath of complexity that is essentially working around the lack of an anyref value type.

If we only have anyref, though, we would lose the static type info that allows efficient calls into WebIDL bindings (avoiding the type check). So presumably we'd also allow references to type imports as value types?

Another question is whether, in the web embedding, anyref would be the same as the WebIDL any which can hold any JS value. This would be necessary to bind to calls that take/return a WebIDL any.

But with that, it seems like we would avoid:

  • copy_elem (assuming we add a cast operator on anyref and type imports)
  • set_elem_null (assuming we add a constant anyref.null operator)
  • OBJECT_HANDLE binding
  • JSON and STRING bindings if we can work out an efficient way to get data into and out of linear memory without making an extra copy (maybe some new overloads of JSON.parse/stringify and TextDecoder.prototype.decode that take typed array views)

The remaining uses of the binding section would be:

  • the function "mode" (CALL_THIS etc)
  • U32
  • STRING_IMMEDIATE (allows a number of optimizations when calling a WebIDL method)
  • ARRAY_BUFFER / TYPED_ARRAY (since, when calling a WebIDL method, this can avoid creating the view/buffer object at all)
  • something for dictionaries and probably sequences (same reason)

@magcius

magcius commented Nov 21, 2017

A few questions:

  • Is anyref only about opaque host reference types, or about any reference type? e.g. could I make my own anyref and point it at linear memory, or is it just about host things?
  • Is the type tracked with the value on the stack itself, or is the implication that type would be static, inside the code? Can I dynamically check the type of an anyref? Does client code need to explicitly cast, or would that be implicit?
  • Can an anyref be null? undefined? false? true? If yes to a value, how can I create an anyref with these values, or check if it is one?

@titzer
Author

titzer commented Nov 21, 2017

@lukewagner That's a good argument about strong types for WebIDL. Worth considering whether imported host types (of representation anyref) would be substantially different than anyref alone.

Looking back at WebAssembly/gc#27, I noticed that @DLehenbauer basically proposed this 21 days ago.

@titzer
Author

titzer commented Nov 21, 2017

@lukewagner I also think we definitely want anyref to mean reference types only, with a separate any to be added later only as needed. I imagine that the V8 implementation of any would use NaN boxing while anyref would continue to use our current strategy of a SMI tag. Also, any with a SIMD value might require a heap allocation...back to madness...

@lukewagner
Member

@titzer Yeah, NaN-boxing vs. pointer-to-GC-cell is the same thing I was wondering about. While it seems we'll want any in the limit, the Host Bindings feature may be viable without it.

@magcius

  • I think initially anyref would only hold host references; if we keep any a separate feature as mentioned above, then you wouldn't be able to stick an i32 in anyref.
  • The type would be tracked with the value and observable via a cast operator (to a specific type import) and probably an is_null operator. You make a good point in "Instanceof / Upcast equivalent" (#4) that a non-trapping "would this cast succeed?" operator would make sense too.
  • I think with all these designs there's a special "null" value for anyref and type imports, if for no other reason than we assume that values have a default value in multiple places (grow_table, locals, etc).
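
The three operators described in these answers could behave roughly as in the sketch below. The names `is_null`, `would_cast`, and `cast` are hypothetical, as are the `Node` and `Canvas` host types; `Trap` stands in for a wasm trap. Whether casting a null reference should trap is not settled in this thread; the sketch assumes it does.

```python
class Trap(Exception):
    """Stand-in for a wasm trap."""


NULL_REF = None  # stand-in for the special null value of anyref


class Node:      # hypothetical host type exposed as a type import
    pass


class Canvas:    # a second hypothetical type import
    pass


def is_null(ref):
    """True if the anyref holds the null value."""
    return ref is NULL_REF


def would_cast(ref, type_import):
    """Non-trapping check: would cast(ref, type_import) succeed?"""
    return not is_null(ref) and isinstance(ref, type_import)


def cast(ref, type_import):
    """Trapping cast from anyref to a specific type import."""
    if not would_cast(ref, type_import):
        raise Trap(f"cast to {type_import.__name__} failed")
    return ref


node = Node()
assert would_cast(node, Node)
assert not would_cast(node, Canvas)
assert cast(node, Node) is node
assert is_null(NULL_REF)
```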

@rossberg
Member

Agreed that supporting anyref would be much better than the ongoing explosion of features for host bindings.

Regarding anyref vs any: I had assumed that the GC extension would eventually allow types like (ref i32) or similar. Semantically, they are references to ints, but an engine could optimise them into tagged ints where applicable. Under this approach, they would naturally inhabit anyref, and so would all JS values. A separate "any" type wouldn't be needed (nor make much sense).
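
As a rough illustration of the tagged-int optimisation, the sketch below packs an i32 into a 64-bit word with the low bit clear and marks pointers with the low bit set. The layout and the names `tag_i32`, `untag_i32`, and `tag_pointer` are assumptions for illustration, not the representation used by any particular engine.

```python
MASK64 = (1 << 64) - 1


def tag_i32(value):
    """Encode an i32 as a 64-bit word with low bit 0 (no allocation needed)."""
    assert -(1 << 31) <= value < (1 << 31)
    return (value << 1) & MASK64


def tag_pointer(address):
    """Encode an even (aligned) address as a 64-bit word with low bit 1."""
    assert address % 2 == 0
    return address | 1


def is_tagged_int(word):
    return word & 1 == 0


def untag_i32(word):
    """Recover the i32 with an arithmetic shift right on the 64-bit word."""
    if word & (1 << 63):
        word -= 1 << 64  # reinterpret as a signed 64-bit value
    return word >> 1


assert untag_i32(tag_i32(-5)) == -5
assert is_tagged_int(tag_i32(42))
assert not is_tagged_int(tag_pointer(0x1000))
```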

I guess I'm missing something regarding the representation of a separate 'any' type. How could a Wasm engine itself benefit from NaN-boxing when all the bits in a NaN are already observable by Wasm programs? AFAICS an engine would need to box all float values injected into type 'any' anyways.
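
The concern can be shown with a small sketch. Assume a hypothetical NaN-boxing scheme in which `REF_TAG` marks a boxed reference. A program that can reinterpret integer bits as a float can produce an f64 with exactly that bit pattern, so an unboxed float is indistinguishable from a boxed reference. The sketch assumes an IEEE 754 platform on which Python's `struct` preserves quiet-NaN payload bits.

```python
import math
import struct

REF_TAG = 0xFFF8_0001_0000_0000  # hypothetical tag for "boxed reference"


def f64_from_bits(bits):
    """Reinterpret a 64-bit integer as a float (like f64.reinterpret_i64)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bits_from_f64(x):
    """Reinterpret a float as a 64-bit integer (like i64.reinterpret_f64)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


boxed_ref = REF_TAG | 0x1234        # what the engine would use for slot 0x1234
forged = f64_from_bits(boxed_ref)   # an ordinary float built by the program

assert math.isnan(forged)                  # it is a legitimate NaN value
assert bits_from_f64(forged) == boxed_ref  # and collides with the boxed reference
```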

@titzer
Author

titzer commented Nov 22, 2017

I think we want to avoid boxing floating point values that are injected into any, so that would probably mean that a float -> any conversion would have to be explicit and do NaN pattern canonicalization.
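
A sketch of what such an explicit conversion might look like, under assumed names and an assumed bit layout (`REF_TAG`, `TAG_MASK`, `any_from_f64`, and `any_from_ref` are all hypothetical): every NaN is rewritten to one canonical pattern on the way in, which frees the remaining NaN patterns for tagging references without boxing floats.

```python
import math
import struct

CANONICAL_NAN = 0x7FF8_0000_0000_0000  # the single NaN pattern allowed inside any
REF_TAG = 0xFFF8_0001_0000_0000        # hypothetical tag in the upper 32 bits
TAG_MASK = 0xFFFF_FFFF_0000_0000


def bits_from_f64(x):
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def f64_from_bits(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def any_from_f64(x):
    """Explicit float -> any conversion; NaN payloads are canonicalized."""
    return CANONICAL_NAN if math.isnan(x) else bits_from_f64(x)


def any_from_ref(slot):
    """Box a 32-bit reference slot into an otherwise unused NaN pattern."""
    assert 0 <= slot < (1 << 32)
    return REF_TAG | slot


def any_is_ref(word):
    return (word & TAG_MASK) == REF_TAG


forged = f64_from_bits(REF_TAG | 0x1234)  # a NaN that looks like a reference
assert any_from_f64(forged) == CANONICAL_NAN
assert not any_is_ref(any_from_f64(forged))
assert any_is_ref(any_from_ref(0x1234))
assert f64_from_bits(any_from_f64(1.5)) == 1.5
```

The cost is that NaN payloads do not survive a round trip through any, which is why the conversion would need to be explicit.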

@eholk

eholk commented Dec 4, 2017

We considered something similar for exception handling. See some discussion at WebAssembly/exception-handling#30 (comment)

There we abandoned this idea because it introduced some thorny lifetime issues. If there's more interest in this kind of feature now, maybe we could also simplify the exception handling proposal.

@rossberg
Member

Proposal for basic reference types is here.

@pchickey
Collaborator

Closing as out-of-date: these concepts don't map to the current proposal, which has evolved a lot since this issue was opened.
