Suggestion: Combine opaque external references with explicit wasm-extern casts #307

Closed
manoskouk opened this issue Jun 10, 2022 · 88 comments


@manoskouk
Contributor

In issue #293, we discussed the possibility that funcref not be a subtype of anyref, to avoid representation-change costs at the JS/Wasm boundary. It is reasonable to assume that other types might also require similar transformations in the future: i31ref will most probably require some form of range check, and it is not unlikely that structs and arrays will require some wrapping to function optimally in both worlds.
This creates a situation where passing a value to wasm as anyref carries significant cost at the boundary, since every object has to be checked against every Wasm type, in case a representation change has to be applied. The idea to apply these checks during downcasts from anyref will not work, as we would need to apply the reverse transformation when upcasting to anyref, breaking the fundamental assumption that upcasts are no-ops.
Therefore I suggest the following (assuming funcref <: anyref, but this does not matter for this discussion):

  • Reintroduce an externref type which represents host references in the host's representation. This will be separate from the type hierarchy, i.e. not a subtype of anyref.
  • Specify that anyref is not just a supertype of funcref, i31ref, and dataref, but in fact contains no values that do not classify as one of these three types. In other words, anyref is the type of all references that can be generated by a wasm module.
  • Introduce explicit casts from anyref to externref and vice versa. These will not be free, and will perform the same representation change as at the Wasm/JS boundary. For performance, we will also introduce additional casts from externref to more specific types. The specifics of the representation change will of course depend on the module's host.

With this design

  • an API that wants to pass opaque references to Wasm can pass them as externref and have the guarantee that there is no cost at the boundary. Such references might even be Wasm references (albeit in the host's representation); this way, another Wasm module can create objects internally and pass them to an externref interface.
  • an API that wants to pass references of specific Wasm types can do so, accepting the cost of representation change. Note that this is likely to be quite minimal for more concrete types than anyref.
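To make the pay-as-you-go idea concrete, here is a minimal toy model in Python. It is only a sketch: `Boundary`, `WasmFunc`, and the method names are hypothetical and do not correspond to any engine or spec API, and it assumes (as in the V8 situation mentioned in this thread) that functions are the only values whose representation differs across the boundary.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class WasmFunc:
    """Hypothetical Wasm-side representation of a function."""
    code: Callable[..., Any]


class Boundary:
    """Toy host/Wasm boundary that counts how many values it had to inspect."""

    def __init__(self) -> None:
        self.inspections = 0

    def pass_externref(self, host_value: Any) -> Any:
        # Opaque pass-through: the host representation is kept untouched.
        return host_value

    def extern_to_any(self, host_value: Any) -> Any:
        # Explicit cast: inspect the value and change representation if needed.
        self.inspections += 1
        if callable(host_value):
            return WasmFunc(host_value)
        return host_value

    def any_to_extern(self, wasm_value: Any) -> Any:
        # Reverse cast: undo the representation change performed above.
        self.inspections += 1
        if isinstance(wasm_value, WasmFunc):
            return wasm_value.code
        return wasm_value
```

In this model a value that only ever travels as externref is never inspected, while each explicit cast pays for one inspection, which is the cost the module opts into.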

I understand that the premise of this post is not a given. However, I also do not see the downside: even if we decide that funcref </: anyref, we allow for more flexibility in the implementation of other (current and future) reference types, and allow Wasm modules to specify more cleanly whether they expose an opaque interface.

@manoskouk manoskouk changed the title Suggestion: Combine opaque external references with explicit wasm-JS casts Suggestion: Combine opaque external references with explicit wasm-extern casts Jun 10, 2022
@jakobkummerow
Contributor

What I like about this idea is that it's a general solution that addresses several issues that came up in previous discussions (such as i31-vs-host-tagging-schemes, costs created by funcref <: anyref) by separating concerns and making certain costs pay-as-you-go (instead of imposing them on every case). Thereby it lets us define the core Wasm type hierarchy however we see fit without worrying too much about costs incurred when anyref is used at the module boundary. And I think it avoids all the pitfalls that plagued earlier suggestions (including my own) around possible relationships between anyref, externref, "opaqueref" etc.

@titzer
Contributor

titzer commented Jun 10, 2022

+1 to reintroducing externref; I think we removed it prematurely.

From what I glean above, the anyref would in fact be a wasmref and the two types are not related, thus requiring explicit casts both directions. Did I get that right?

One concern that @rossberg has brought up in that context is a Wasm module should be virtualizable as a host module and vice versa. We'd need to be able to specify a type above these two, or have type imports, for that to work.

@fgmccabe

Perhaps a different approach is to have type imports. An externref could then be a type import provided by the host.

@manoskouk
Contributor Author

From what I glean above, the anyref would in fact be a wasmref and the two types are not related, thus requiring explicit casts both directions. Did I get that right?

Correct. These would be the same "casts" that are happening in the boundary.

One concern that @rossberg has brought up in that context is a Wasm module should be virtualizable as a host module and vice versa. We'd need to be able to specify a type above these two, or have type imports, for that to work.

Would that not be achieved by creating references in the host, then casting them to externref (if needed) and passing them to the hosted module?

@titzer
Contributor

titzer commented Jun 10, 2022

Correct. These would be the same "casts" that are happening in the boundary.

I agree that these casts are essentially hidden in the JS embedding, so making them explicit is attractive. That does have a downside in that they impose a cost on cross-module wasm -> wasm calls that use externref to communicate.

@lukewagner
Member

I also like the idea of externref </: anyref. E.g., this would allow SM to, if it wanted, represent externref with its existing 64-bit NaN-boxed JS value representation while representing anyref with a SMI/pointer word. So, along with funcref </: anyref, this would give us 3 "root" supertypes with 3 distinct representations which seems pragmatic.

Incidentally, I don't think we can emulate externref with type imports: the idea with type imports is that you import a subtype of some fixed supertype. Thus, you could only type-import a subtype of any, func, data or any other heaptype. This "can only import subtypes of a fixed heaptype" restriction is key in allowing us to compile type imports as a fixed blob of machine code without monomorphization (using the "universal representation" of the import's supertype).

@manoskouk
Contributor Author

That does have a downside in that they impose a cost on cross-module wasm -> wasm calls that use externref to communicate.

Right, but the downside compared to the current state would exist only in the case where they communicate directly through externref and these values never cross the boundary. Modules could expose both an anyref and an externref API if this starts becoming measurable.

@dcodeIO

dcodeIO commented Jun 11, 2022

IIUC (please correct me if I'm wrong), in the presence of funcref <: anyref, one way to phrase the suggested hierarchy is

      may have casting cost
               |
     anyref   <->   externref
     /    \
dataref  funcref
   |
  ...

where anyref is kinda a typedref that can perhaps one day be obtained directly through type imports. Yet, this doesn't seem to solve the representation problem on the anyref/funcref edge - or does it?

If it does not, a likely (here: consolidated) alternative hierarchy with funcref </: anyref could become

      may have casting cost or may fail
             |               |
  dataref   <->   funcref   <->   externref
     |
    ...

yet this one seems odd given that a) from a producer's point of view (regardless of representation issues), the most viable/convenient/expected hierarchy would be the initially envisioned

     anyref
      /  \
dataref  funcref
   |
  ...

and b) given the former dataref/funcref/externref separation, if there is a JS import that accepts any JS/Wasm value (and would hence prefer to use the original anyref), the import needs to be duplicated twice to accept a dataref, funcref, or externref respectively, offloading the representation change to the Wasm<->JS stub. The alternative is to permit (some of) these conversions implicitly to avoid the duplication, but either approach seems to lead back to square one at the boundary anyway.

What am I missing? :)

@manoskouk
Contributor Author

Casts from dataref or funcref to externref should always succeed, thus the import could accept externref.

@rossberg
Member

A few observations:

  • We have identified 3 possible universes of representation:

    (1) external: references in host-prescribed representation
    (2) internal: references in native Wasm representation
    (3) function: function references in Wasm representation

  • Internal includes both i31 and data refs – these must be in the same universe and have a shared representation and a common supertype, otherwise core use cases of i31ref would be broken. (But as long as extern is separate, there is no problem with that.)

  • All external references ought to be convertible into internal references somehow (at least at the boundary), otherwise virtualisation would be broken in one direction. Furthermore, externref, if not fully internalisable, would always enforce extra boxing, even on platforms where it wouldn't be needed. Consequently, we need to allow that internal references can be formed from all external references, including those that were formed from functions (which does not imply that they are castable to functions or have the same representation).

Putting this together, there is the following conceptual hierarchy:

            ___________ any? _________          
           /             |            \
     extern            intern          func
    /      \          /  |   \
(host)  (exported) i31  data  (imported)
        /   |    \             /      \
 (e_i31) (e_data) (e_func)  (i_host) (i_func)

where the parenthesised nodes are types that we may not necessarily name in Wasm. Here, exported are the native Wasm values converted/wrapped to be represented at the host side, and dually, imported are host values converted/wrapped to be represented uniformly with native Wasm values (this may include wrapped Wasm functions).

If any does not exist, then explicit conversions will be required between these types (these are conversions, as opposed to casts, since they can involve representation changes):

extern.convert_intern : [internref] -> [externref]
intern.convert_extern : [externref] -> [internref]
extern.convert_func : [funcref] -> [externref]
func.convert_extern : [externref] -> [funcref]  ;; may trap

and for completeness, maybe also

intern.convert_func : [funcref] -> [internref]
func.convert_intern : [internref] -> [funcref]  ;; may trap
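As a rough illustration of how such conversions could behave, the following Python sketch models the first four instructions with wrapper objects. Everything here is an assumption made for illustration: the class names, the wrapping strategy, and the `Trap` exception are hypothetical, and a real engine would be free to pick very different representations.

```python
from dataclasses import dataclass
from typing import Any


class Trap(Exception):
    """Models a Wasm trap."""


@dataclass(frozen=True)
class InternRef:
    payload: Any  # a native Wasm value, or a wrapped ExternRef ("imported")


@dataclass(frozen=True)
class FuncRef:
    name: str


@dataclass(frozen=True)
class ExternRef:
    payload: Any  # a host value, or a wrapped InternRef/FuncRef ("exported")


def extern_convert_intern(r: InternRef) -> ExternRef:
    # Imported host values are unwrapped; native values are wrapped.
    if isinstance(r.payload, ExternRef):
        return r.payload
    return ExternRef(r)


def intern_convert_extern(r: ExternRef) -> InternRef:
    # Exported native values are unwrapped; everything else is wrapped,
    # including externalized functions (which stay opaque on this side).
    if isinstance(r.payload, InternRef):
        return r.payload
    return InternRef(r)


def extern_convert_func(f: FuncRef) -> ExternRef:
    return ExternRef(f)


def func_convert_extern(r: ExternRef) -> FuncRef:
    # May trap: not every external reference originated from a function.
    if isinstance(r.payload, FuncRef):
        return r.payload
    raise Trap("externref does not hold a function")
```

In this model the intern/extern conversions round-trip in both directions, and an externalized function can be internalized without becoming castable to a function.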

If any exists, then the diagram collapses imported = extern - func and exported = intern + func, necessarily (and e_* = * and i_* = *).

But if any does not exist (at least to unify intern and extern), then every module will have to pick a preferred representation between intern and extern. We need to be aware that this reduces composability. The same problem is created for type imports, as @lukewagner points out.

That is, every module will have to choose whether it is optimising for linking against the host or linking against another Wasm module. Note that this does not just affect exports, but internal choices like type definitions for data structures, so having two versions for every export function does not solve the problem (nor would it scale). Furthermore, the problem extends to exports like mutable tables and globals, which you cannot duplicate anyway.

So, while I think separating the types is possible, there is no free lunch, and this certainly creates its own set of complexities and problems (some of which may be difficult to predict).

@tlively
Member

tlively commented Jun 23, 2022

At the last subgroup meeting we all agreed that figuring out the type hierarchy was a high priority, so it would be good to move this discussion forward as much as we can.

I was thinking about the top any type in @rossberg's diagram above and wondering whether we could get the benefit of having externref as a "lazy" internref even if we had that single top type. I think we can, but it would require representation changes on downcasts. Consider a smi that flows from JS into Wasm where an extern reference is expected. Since extern is supposed to be lazy, the engine does not check what kind of reference it is at this point. Later, the value flows to a location with type any and is then downcast to i31. Since it is a smi, the cast succeeds without representation changes. However, if a boxed number followed the same path, then this downcast would require a representation change to i31. (The end result in both cases needs to be the same since a smi and boxed number should not be distinguishable.)
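The smi versus boxed-number scenario can be sketched in Python as follows. This is a toy model under stated assumptions: `Smi` and `BoxedNumber` are hypothetical stand-ins for the two host number representations, a smi is assumed to share its representation with i31, the i31 range is taken to be signed 31-bit, and edge cases such as negative zero are ignored.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Assumed signed 31-bit range for i31 values.
I31_MIN, I31_MAX = -(2 ** 30), 2 ** 30 - 1


@dataclass(frozen=True)
class Smi:
    """Small integer; assumed to share its representation with i31."""
    value: int


@dataclass(frozen=True)
class BoxedNumber:
    """Heap-allocated host number."""
    value: float


def downcast_to_i31(v: Any) -> Optional[Smi]:
    """Return the i31 for v, or None if the cast fails."""
    if isinstance(v, Smi):
        # Already in i31 shape: no representation change.
        return v if I31_MIN <= v.value <= I31_MAX else None
    if isinstance(v, BoxedNumber) and v.value.is_integer():
        n = int(v.value)
        if I31_MIN <= n <= I31_MAX:
            # Representation change: the boxed number is normalised.
            return Smi(n)
    return None
```

Both `Smi(7)` and `BoxedNumber(7.0)` yield an equal result, so the two host representations stay indistinguishable after the cast; only the second path changes representation.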

In the OP, @manoskouk wrote:

The idea to apply these checks during downcasts from anyref will not work, as we would need to apply the reverse transformation when upcasting to anyref, breaking the fundamental assumption that upcasts are no-ops.

But I don't see why we would need to apply reverse transformations on upcasts to any. In the example above the i31 you get after the downcast can flow anywhere an any is expected with no further modifications. Is there another case where upcasts require representation changes that I'm missing?

Also, are representation changes on downcasts acceptable? I could imagine this could get hairy if the engine moved code around so that the value to be cast was materialized in multiple locations (e.g. in two different registers). Then the cast would have to update the representation in multiple locations.

@titzer
Contributor

titzer commented Jun 24, 2022

The only case I can think of where a representation change on downcast would be a problem is if both the source and target types are under eqref. IIUC the intention is that ref.eq is always a word compare.

As for multiple places to update representations, a cast produces a new value and leaves the old value unchanged.

@manoskouk
Contributor Author

Is there another case where upcasts require representation changes that I'm missing?

Other types may have different representations in JS and Wasm. In V8, these are currently functions, but it may expand in the future.
This means that if an upcast from funcref to anyref does not do the reverse transformation, you end up in a spot where an anyref value may have either representation. This complicates further downcasts, as well as crossing the boundary to JS.

@askeksa-google

I very much like the framing of this as making boundary conversions optional and explicit. With this in mind, it makes sense to extend the conversions to any Wasm type, i.e. not just reference types. We can convert to/from any Wasm type at the boundary, and the explicit conversions would mirror that capability.

We would have two instructions (names not meant as suggestions):

  • to_extern: any type -> externref
  • from_extern *type*: externref -> type

This positions the feature as independent of the funcref <: anyref question. It also takes seriously the observation (CMIIW) that, even for reference types, the engine can often do a better job of converting a value from externref to a known final type than converting to internref and then downcasting.

In this model, we might want to rename the externref type to reflect that it's not necessarily a reference type in the same sense as the other reference types (i.e. it's not part of the hierarchy and is not composed of a heap type and a nullability).

@jakobkummerow
Contributor

@tlively : I think having that top type or not doesn't make much of a practical difference (so I don't have a strong opinion either way).
With a top type (I'll call it topref just to not overload any existing names): casting from externref to internref would be a (free, implicit) externref -> topref upcast plus an (explicit, costly, possibly-representation-changing) topref -> internref downcast. Analogously for the reverse direction: free implicit upcast internref -> topref, expensive explicit downcast topref -> externref. topref itself would have "undefined" representation (because it could be inhabited by lazy values originating in either world); I don't think the spec needs to worry about this, only engines do.
Without a top type, casting from externref to internref would be an explicit "sideways cast" that does the same work (and at the same cost) as the topref -> internref downcast does in the other model. Again, same thing in the reverse direction.

A minor argument against having topref would be one of clarity: people might get confused about what topref's role is, and whether they should be using that (instead of externref) on their module boundary. Maybe this could be mitigated by making topref a spec-only thing (like bottom used to be), and disallowing ref top as a type that shows up in modules.
I suppose an argument in favor of having topref might be that it avoids the "we need three null types" problem?

@tlively
Member

tlively commented Jun 24, 2022

Is there another case where upcasts require representation changes that I'm missing?

Other types may have different representations in JS and Wasm. In V8, these are currently functions, but it may expand in the future. This means that if an upcast from funcref to anyref does not do the reverse transformation, you end up in a spot where an anyref value may have either representation. This complicates further downcasts, as well as crossing the boundary to JS.

Right, with lazily-converted externref + a top type + representation-preserving upcasts, you are already in the position where top values may have either representation, so all the conversion complexity necessarily moves to the downcast from top, as @jakobkummerow described. If those more complicated, representation-changing downcasts are acceptable, then we could actually just have top = extern and still benefit from explicit lazy conversions to Wasm representations. Those conversions would be done on downcasts from top.

These converting downcasts would only be acceptable if they could be avoided where any is currently used for polymorphism over internal reference types. Having intern available to fill this role would suffice. If top had to be used for polymorphism over internal reference types, then its potentially-converting downcast would be unnecessarily expensive for those use cases.

So it makes sense to have a top/extern/any type above a usable intern type or to have extern and intern as separate hierarchy roots, but it would not make sense to have a single root top/any type without a usable intern type.

With a top type (A):

top/any/extern
|
intern
| \
eq func
|
...
Simplified from
top/any
|     \
intern extern
| \
eq func
|
...

Or if we wanted func <: eq (B):

top/any/extern
|
eq/intern
|  \
... func
Simplified from
top/any/extern
|
intern
|
eq
|  \
... func

Without a top type we have either (C):

intern extern
| \
eq func
|
...

Or, if we want to continue allowing func to have a different representation than intern (D):

intern/eq extern func
|
...
Simplified from
intern extern func
|
eq
|
...

@askeksa-google, I don't think trying to generalize extern beyond reference types would work out because different external non-reference types may have different representations (and in particular different bit widths). For example we wouldn't want extern_i32 and extern_f64 and extern_v128 to be able to flow into a module at the same extern-typed location.

@jakobkummerow
Contributor

@tlively : The key consequence of merging extern into top is that it makes intern -> extern casts implicit upcasts, whereas previously they were explicit, which is appropriate when they perform type checks and representation changes under the hood. So I don't think this merging solves the same problems as Manos' thread-starting suggestion; it's not just a benefits-preserving simplification.

With today's state of the world, we specifically have a cost when converting between funcref and extern (in either direction). So I'm in favor of solutions that make these conversions explicit, so that modules can control whether they want to pay these costs. From your list, that leaves (C) and (D), though other options exist as well (such as the original suggestion, or simply keeping funcref outside the hierarchy).

(It's hard to say whether structs/arrays might one day also have explicit conversions at the intern<->extern boundary; that depends in part on how the spec evolves.)

@tlively
Member

tlively commented Jun 24, 2022

In my mental model intern -> extern never performs a representation change, so I must be missing something. Certainly for i31 it is a no-op because extern must be able to hold both smis and boxed numbers. Are you saying that you would expect extern functions would always have the JS representation and never the Wasm representation (in V8 specifically)? That would certainly be reasonable if externref were completely separate. If we have intern <: extern, though, then an extern function could have either the JS or the Wasm representation. Are there downsides to that?

@dcodeIO

dcodeIO commented Jun 24, 2022

FWIW, even though it differs from the initial idea proposed in the OP, option (A)

               anyref
                  |--------may have a cost when anyref->wasmref
               wasmref
              /   |   \
         eqref funcref stringref
         /   \
     dataref i31ref
    /      \
arrayref structref

(names according to my mental model) seems attractive both ergonomically and given the aspects that a) as long as anyrefs remain anyrefs, when passing them along there is no overhead, and b) that potential overhead is limited to anyref->wasmref, whose existence kinda implies what to expect, while giving producers options. Unless, of course, two function representations in anyref turn out to be a problem for engines.

@jakobkummerow
Contributor

@tlively

In my mental model intern -> extern never performs a representation change, so I must be missing something.

When an engine has different representations for the same thing depending on whether it's inside Wasm or outside (as V8 currently does for functions; I think a NaN-boxing engine would have a similar situation for small integers, at least if we end up specifying that JS numbers must be converted to Wasm i31refs when possible; on non-pointer-compressed 64-bit configurations, even in V8 there's a conversion-requiring difference between Smi and i31ref), then conversions must be performed when a value leaves Wasm (e.g. is being passed as parameter to an imported function, or returned as result from an exported function, or read by the outside world from an exported global/table, etc). For example, in V8 a wasm-function must be converted back to a js-function, in a NaN-boxing engine an i31ref has to be turned back into a float64.
We can do that implicitly (and with the current design, we have to); one of the key benefits of the proposal that started this thread is that it made this conversion explicit (and hence under module producer control). By merging extern into any/top, we lose this explicitness.

Another way of phrasing it: if conversions to extern are necessarily explicit, then we can guarantee that after such an explicit conversion, passing an extern value across the boundary is a super-cheap pass-by-reference; and the cost of converting to extern depends on the original type (e.g. data->extern might be no-op, funcref->extern might have a cost). If extern is a supertype and hence accessible by implicit upcast, this predictability is lost, and it's likely (AFAICS) that all values would at least pay the cost of checking: consider that a cascade like if (is_func(value)) { return convert_funcref(value); } else if (is_i31(value)) { return convert_i31(value); } else { return value; } has a price even when the value is neither func nor i31.
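The cost argument can be sketched with a small Python model that counts type checks. The names (`WasmFunc`, `I31`, `Struct`, `Counter`, and both conversion functions) are hypothetical, and the assumption that data -> extern is a no-op is taken from the example in the paragraph above rather than from any engine.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class WasmFunc:
    name: str


@dataclass(frozen=True)
class I31:
    value: int


@dataclass(frozen=True)
class Struct:
    fields: tuple


class Counter:
    """Counts the dynamic type checks performed at the boundary."""

    def __init__(self) -> None:
        self.checks = 0


def implicit_to_extern(value: Any, counter: Counter) -> Any:
    # Implicit upcast: the static type is unknown, so every value runs the cascade.
    counter.checks += 1
    if isinstance(value, WasmFunc):
        return ("js-function", value.name)  # stand-in for a converted function
    counter.checks += 1
    if isinstance(value, I31):
        return float(value.value)  # stand-in for a host number
    return value


def explicit_data_to_extern(value: Struct) -> Struct:
    # Explicit conversion from a statically known data type: assumed to be a no-op.
    return value
```

A struct passed through `implicit_to_extern` still pays for two checks even though nothing is converted, whereas the explicit, statically typed conversion pays for none.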

@tlively
Member

tlively commented Jun 25, 2022

Aha, so having extern = top allows values to lazily be converted when they go from JS to Wasm, but forces them to eagerly be converted on the boundary when they go from Wasm to JS. If we never allow JS to hold an intern reference, then this outbound implicit conversion might not be so bad because it’s the latest point at which a conversion could happen, except that more values than necessary would have to pay the cost of conversion.

If we had a top type with extern as a distinct subtype, it would be tempting to make the explicit conversions the same as the downcasts from top to any, but that would suffer from the same problem. Since we need explicit conversion instructions both to and from extern anyhow, I don’t see any benefit to having a top type.

Edit: The benefit of a top type would be for polymorphic modules that never downcast or convert and want to be generic over all reference types.

@tlively
Member

tlively commented Jun 25, 2022

Oh, also having top and a distinct extern subtype means that you should be able to not just downcast to extern, but also do ref.is_extern. But that would either force a representation difference between all intern and extern (which we don’t want) or would return true while e.g. ref.is_data would also return true. But we wouldn’t want to expose which internal types have the same representation as external types, so ref.is_extern would have to return true for everything and be useless. So that’s more evidence that there probably shouldn’t be a top type above extern.

@manoskouk
Contributor Author

The way I see it, having a wasmref and a separate topref does not add anything to the current design, other than being able to cast from topref to wasmref exactly. But that does not seem very useful, since there are no operations defined on wasmref. You still have the issue that topref could be both a wasm-shaped and a host-shaped reference, which adds strain to the boundary.
An unrelated comment: @rossberg mentioned at some point that each separate type hierarchy would require both a null value and a null type. However, IIUC a nullref type is useful as the bottom of a type hierarchy. Since it appears funcref and externref would be isolated, I think we can include the null values in these types, much like we do today. We already have ref.null with type immediate to generate the null values for these types.

@askeksa-google

@tlively

@askeksa-google, I don't think trying to generalize extern beyond reference types would work out because different external non-reference types may have different representations (and in particular different bit widths). For example we wouldn't want extern_i32 and extern_f64 and extern_v128 to be able to flow into a module at the same extern-typed location.

I didn't mean to suggest that we extend the extern type to cover more kinds of values.

My proposal was perhaps a bit confusingly phrased, since I was conflating two issues which should be considered separately:

  1. Whether the conversion instructions introduced in this proposal can convert extern to/from any Wasm type, rather than just reference types. At the boundary, all Wasm types are supported, so as a way to make boundary conversions explicit, it seems natural to also support all types in the instructions. I imagine that avoiding/postponing checks/conversions can also be useful for non-reference types.
  2. Whether extern is a heap type. Even though this discussion originated from discussions of the heap type hierarchy, it seems to me that extern as discussed here is more of an opaque "any host value" type than something that belongs in the Wasm type hierarchy. Having it as a heap type gives rise to some situations where we might want to treat it differently from other heap types. Can an externref be declared non-nullable? Does this mean that a host value flowing into such a parameter must be checked for null? This seems at odds with the purpose of extern being that no checks are performed. Can extern be used in casts? The explicit separation proposed here suggests not. On the other hand, having extern as a heap type provides a natural way to explicitly construct the default value for externref via ref.null extern.

@rossberg
Member

@manoskouk:

The way I see it, having a wasmref and a separate topref does not add anything to the current design, other than being able to cast from topref to wasmref exactly. But that does not seem very useful, since there are no operations defined on wasmref. You still have the issue that topref could be both a wasm-shaped and a host-shaped reference, which adds strain to the boundary.

What a top type adds to the design is avoiding the need for a module to choose between intern or extern representations for handling unknown outside values. If a module does not care itself about that representation (because it's "polymorphic"), then neither choice is ideal, since either would impose extra cost on some of the clients.

With a "lazy" top type, the module would not impose a cost on either, because no conversion is ever actually performed in the common case (unless different clients are mixed somehow).

An unrelated comment: @rossberg mentioned at some point that each separate type hierarchy would require both a null value and a null type. However, IIUC a nullref type is useful as the bottom of a type hierarchy. Since it appears funcref and externref would be isolated, I think we can include the null values in these types, much like we do today. We already have ref.null with type immediate to generate the null values for these types.

No, that does not work. Their representations could be incompatible, which is the point of separating the types. For example, a data ref might be one word, a func ref two words. Then, certainly, a nullref array cannot be a subtype of both a dataref array and a funcref array. You have to keep the hierarchies completely disjoint.

@manoskouk
Contributor Author

This is not what I meant: nullref should as you say be subtype only of the anyref hierarchy, but we do not need additional nullrefs for the isolated types externref and funcref.

@rossberg
Member

Oh, okay. But what would be the type of ref.null func, other than (some way of writing) funcnullref?

It's a moot point: since we'll already need distinct bottom types for the three hierarchies, the three nullref types exist automatically, by way of being (ref null bot1/2/3). You cannot rule those out without breaking the type algebra.

@manoskouk
Contributor Author

Right, funcref will need a bottom type. For externref (if it ends up separate and with no subtypes), are we able to type ref.null extern simply as (ref null extern) (as we do now)?

@rossberg
Member

Right, another way to say this is that extern is the bottom type. That works, as long as we do not introduce subtypes of extern, like the parenthesised types in my diagram above. I would not be surprised if it turns out that we will want some of them eventually.

@conrad-watt
Contributor

@jakobkummerow thanks! This does cause me to have a stronger opinion that we should defer any. The split between intern and extern seems very well-motivated to me because it makes explicit a general truth that the host and Wasm-land may want to use different representations (and all the implications at the boundary that come with that). This approach seems to provide a lot of implementation freedom, and I think we should see how this shakes out before potentially closing options off by introducing any (which implies restrictions on how the host and Wasm representations can overlap).

In practical terms, the proposed intern functions like the current V8 prototype's any (given an appropriate JS API), so it shouldn't present an expressivity hazard for producers. We can introduce "lazy any" later on if there's a good performance motivation, and extern is enough to solve the pass-through problem (which I guess is what's relevant for DOM nodes etc.?).

@lukewagner
Member

I might just be misunderstanding the diagram but my assumption is that (1) if there is an edge between two types, the "higher" type is the supertype of the "lower" type and (2) because subtyping is non-coercive, an engine cannot rely on being able to perform a representation change on upcast and thus this diagram implies that all types shown must share the same universal representation. The prose below the diagram seems to suggest that the intention is to let the engine have different representations for different types in the tree. If so, could we have a diagram that only shows subtyping (as opposed to coercibility)?

@tlively
Member

tlively commented Jul 11, 2022

@lukewagner, your assumptions are right, but the interesting part is that the same value (e.g. an i31 or a func) can have two different representations (an intern representation and a different extern representation) and that any can hold the value using either representation, so there is never a representation change on an upcast. As @jakobkummerow pointed out, this only works when there is no representation that corresponds to two different values depending on whether it is interpreted as an intern or an extern representation. All representations must either be classifiable as exclusively intern or exclusively extern, or be valid with the same value for both.

@rossberg
Member

@conrad-watt, regardless of any, extern and intern need to be in a bijective relation. Hence, there is no difference in the assumptions we have to make and basic work an engine has to do when going from extern to intern: with any, the work occurs when casting down from any to intern; without any, the same work has to occur when converting from extern to intern directly. In particular, wrt i31, it will need to normalise JS number representations in both cases.
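One way to picture the bijection and the i31 normalisation is the Python sketch below. It is only an illustration under stated assumptions: host numbers are modelled as Python floats (standing in for JS Numbers), the names `internalize` and `externalize` are hypothetical, and the choice to keep negative zero and out-of-range numbers as opaque host values is an assumption, not something settled in this thread.

```python
import math
from dataclasses import dataclass

I31_MIN, I31_MAX = -(1 << 30), (1 << 30) - 1


@dataclass(frozen=True)
class I31:
    """Stand-in for an unboxed 31-bit integer reference inside Wasm."""
    value: int


def internalize(host_value):
    """extern -> intern. Host numbers that are integers in i31 range are
    normalised to I31; every other host value stays an opaque reference."""
    if isinstance(host_value, float) and host_value.is_integer():
        # Assumption: -0.0 is not treated as the integer 0.
        not_negative_zero = host_value != 0 or math.copysign(1.0, host_value) > 0
        if not_negative_zero and I31_MIN <= host_value <= I31_MAX:
            return I31(int(host_value))
    return host_value


def externalize(intern_value):
    """intern -> extern. The inverse mapping: I31 becomes a host number."""
    if isinstance(intern_value, I31):
        return float(intern_value.value)
    return intern_value
```

The same normalisation runs whether it is triggered by a downcast from any or by a direct extern-to-intern conversion, which is the point being made above.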

@tlively, the diagram would be fine by me. Plus I think:

  • we'd want to add data_or_func (perhaps under some better name, i.e., the type of Wasm things that are boxed), in order to factor casts optimally,

  • we should include struct for completeness,

  • FWIW, we should rename data, to avoid confusion with the data section (now that we have GC instructions like array.new_from_data).

I'm indifferent about a wasm type. That would seem to equate the union data_or_func | i31, but I'm not sure what the use case would be, in-between data_or_func and intern.

@conrad-watt
Contributor

conrad-watt commented Jul 12, 2022

@conrad-watt, regardless of any, extern and intern need to be in a bijective relation. Hence, there is no difference in the assumptions we have to make and basic work an engine has to do when going from extern to intern: with any, the work occurs when casting down from any to intern; without any, the same work has to occur when converting from extern to intern directly. In particular, wrt i31, it will need to normalise JS number representations in both cases.

I don't think this gets to the issue discussed above. If I have an extern, convert it to any, and then convert it to intern, at which steps do we accept the possibility of a representation change (and for the inverse direction)? It seems that any is at its most valuable if the extern/intern to any upcast step involves no checks or representation changes - but this seems to run into issues of the kind @jakobkummerow describes - the worst-case work on downcast may be the same, but some implementations can't work out whether to carry it out or not just from the reference's representation. Alternatively, if we accept that one of extern/intern to any involves an eager check and potential representation change, then what is any's value over just directly converting between extern <--> intern?

@rossberg
Member

Sure, the only representation changes must happen on downcasts.

I'm not sure the described scenario can even come up. If a host engine was using 30 bit unboxed integers, then the only rational reason would be that its GC uses two tagging bits. In that case, it cannot possibly have 31 bit unboxed integers in the same GC to boot. Regardless of the hierarchy, such an engine either would have to change its implementation, or box i31refs that overflow 30 bits and suck up the bad performance – which also resolves the above problem. In reality, our design is essentially assuming a single tag bit on 32 bit architectures.
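The arithmetic behind the tag-bit argument can be sketched as follows. This is an illustrative Python model assuming a 32-bit word whose low bits are GC tag bits (all zero meaning "unboxed integer"); the function names are hypothetical and no real engine's layout is being described.

```python
WORD_BITS = 32


def fits_unboxed(value, tag_bits):
    """True if a signed integer fits in the payload left after the tag bits."""
    payload_bits = WORD_BITS - tag_bits
    lo = -(1 << (payload_bits - 1))
    hi = (1 << (payload_bits - 1)) - 1
    return lo <= value <= hi


def encode(value, tag_bits):
    """Return ("unboxed", word) when the value fits, else ("boxed", value).
    Boxing stands in for a heap allocation with its performance cost."""
    if fits_unboxed(value, tag_bits):
        return ("unboxed", (value << tag_bits) & 0xFFFFFFFF)
    return ("boxed", value)


MAX_I31 = (1 << 30) - 1  # largest signed 31-bit value
```

With one tag bit, `encode(MAX_I31, 1)` stays unboxed. With two tag bits only 30 payload bits remain, so `encode(MAX_I31, 2)` has to box, which is the "suck up the bad performance" case.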

@conrad-watt
Contributor

I'm not sure the described scenario can even come up.

I'm worried that we're not going to be able to anticipate all of the consequences of this form of any without broader implementer experience - it just seems very committal wrt reftype representations in comparison to the high-level motivation behind the extern/intern split.

I appreciate that this concern isn't necessarily actionable beyond "defer any just in case".

@rossberg
Member

To refine what I stated above, there indeed is an additional assumption in the presence of any, which I believe is what you're getting at. It's what @tlively mentioned above: any representation that exists as both intern and extern value, must mean the same for both. But I can't think of a plausible scenario where that cannot be achieved – unboxed values are the only critical part where overlap is hard to avoid.

Of course, the more fundamental constraint is that, given any, extern and intern need to have a uniform representation inside Wasm(*). Whereas if both were separate types, as well as func, the engine could use completely different representations for extern (e.g., keep a NaN-boxed representation) and func (e.g., two-word fat closure) internally. I think the relevance of these scenarios ought to be the primary focus of our discussion. So far, they have been hypothetical, as @titzer has criticised repeatedly.

(*) It may still be different from the representation outside Wasm, e.g., there could be a uniform conversion for extern at the boundary (uniform as in not involving arbitrary case analysis).

@titzer
Contributor

titzer commented Jul 12, 2022

@rossberg I agree with the need for data_or_func, we had called this pointer at various times. As for renaming data, we could call it record or object--though further bikeshedding is probably necessary.

Overall, "upcasts"[1] along the subtype hierarchy cannot imply a representation change. Otherwise, every point where a value of a subtype can be passed as a supertype, anywhere in the Wasm program or even at the boundary (local.set, calls in and across Wasm, conditionals, branches, fall-through of a block, table write, field write, etc.), would imply runtime work. That'd be more like the implicit value conversions that occur in some source languages. It would be a very disruptive change; all tiers would then have to track types and emit additional code where it occurs. (interpreter sidetable entries for local.set because subsumption? oh my!)

[1] Normally this is referred to as "subsumption." I think we should prefer that term because "cast" has the connotation of a runtime operation of some kind, and there is never a runtime operation for subsumption.

@titzer
Contributor

titzer commented Jul 12, 2022

As for any, a point I made earlier that might have been lost.

Yes, we can add any later. However, doing so leaves design warts along the way, since in the short term we'd need two bottom types and thus two nulls[1], both of which become redundant when any is introduced later. In that sense, any is the endpoint of repeatedly adding higher and higher supertypes to the heaptype hierarchy. If we add any now, we'd be "done".

I don't agree that we should conservatively leave any out unless there is a really good justification, because that leaves the aforementioned design warts. I do think the type system is simpler with any, rather than without.

Thus I think the performance argument for any needs to be evaluated.

[1] On the other hand, two nulls might allow V8 to use a null object (valid reference) for JavaScript and the null pointer (0) for Wasm nulls. But given how hard it was to get V8 to use signals for trapping on Wasm OOB memory accesses, it might be difficult to make GC nullchecks rely on hardware checks in V8, so that might be moot.

@rossberg
Member

@titzer, yes, I believe we all agree that subtyping must be non-coercive. I also agree that "upcast" is a misleading term.

Yes, we called it "pointer" in some of the discussions. Though as a final name that may be odd: pointer and reference are often synonymous in other contexts, so what's a "pointer reference"?

FWIW, I don't think an implementation would need to represent the different null values differently. If the type hierarchies are disjoint, they can never be mixed anyway. But it would have to deal with different null types.

@titzer
Contributor

titzer commented Jul 12, 2022

FWIW, I don't think an implementation would need to represent the different null values differently. If the type hierarchies are disjoint, they can never be mixed anyway. But it would have to deal with different null types.

I know, but disjoint implies the engine could have different null values if it wanted to. In V8, I was thinking of the case where they are not disjoint and a JS null object was subsumed to any and then passed to Wasm. Then there is a null object value and a null pointer (e.g. address 0), i.e. two nulls again. I think this turns out to be fine because we have carefully restricted any to have essentially no operations, not even comparison. So it would have to be downcast to intern, at which point the null object could be converted to the null pointer (address 0).

Thinking it through more, it's still possible to have different null values for intern and extern, even with any. We could still allow ref.null extern and engines would generate a different constant for each. (interpreter sidetable would have a bit).
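A minimal Python sketch of the two-nulls idea, under the assumption that the host null is a distinguished object and the Wasm null is address 0. The names `HOST_NULL`, `WASM_NULL`, `ref_null`, and `downcast_any_to_intern` are hypothetical and only illustrate where the conversion would sit.

```python
HOST_NULL = object()  # stands in for the host's null object (a valid reference)
WASM_NULL = 0         # stands in for the Wasm null pointer (address 0)


def ref_null(hierarchy):
    """Constant an engine might emit for ref.null, chosen per hierarchy."""
    return HOST_NULL if hierarchy == "extern" else WASM_NULL


def downcast_any_to_intern(value):
    """any has essentially no operations, so a host null can travel through
    it untouched; it is only normalised at the downcast to intern."""
    if value is HOST_NULL:
        return WASM_NULL
    return value
```

Because the normalisation happens only in the downcast, subsumption into any stays a no-op in this model.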

@tlively
Member

tlively commented Jul 12, 2022

At the meeting today we had consensus to move forward with the three-pronged approach, to be revisited only in the presence of significant new user/implementer feedback. I know that this is not everyone's preferred solution, but I'm glad that we have made a decision and can now move forward.

We should now focus on whether we should have @titzer's proposed wasm type as well as bikeshedding the names of the types.

@rossberg
Member

rossberg commented Jul 12, 2022

Pondering more over the additional instructions that we'll need to support the 3 separate hierarchies, I realised the following:

  • Because Wasm functions can be exported to JS, they will have a representation as extern values.

  • Because all extern values can be converted to intern values, Wasm functions also have a representation as intern values!

  • Consequently, functions do not have 2 but 3 different conceptual representations in our design! In particular, there are two competing representations inside Wasm itself.

  • Because Wasm code can compose func->extern and extern->intern conversions, and likewise the opposite direction, the third representation can easily be constructed inside Wasm. Conceptually, this is the func subtype of intern that we wanted to remove!

  • However, we don't name this subtype, nor do we make it usable directly. Instead, we require converting it back to func first. But we could equally well introduce proper structural types for it and have a version of call_ref operating on it directly.

  • Which raises the question why we bother with the separate func representation when the one under intern is unavoidable anyway. Effectively, we don't get rid of it, we merely make it artificially unusable, and introduce a redundant representation instead.

Maybe it's too late over here, but at this point, the design we chose looks increasingly silly, and I wonder if we should reconsider the decision to have a separate representation for functions in the MVP. Note that we can also add it later, as the separate type of "raw" code pointers that it is.

Thoughts?

@tlively
Member

tlively commented Jul 12, 2022

Good points, and I don't think they've been raised before. I don't think it would entirely violate the spirit of our consensus to consider changing the three-pronged approach to a two-pronged approach, since fundamentally the question we answered was whether we should move forward with a single top type.

That being said, I don't think these points invalidate the reasons we had for keeping func separate, namely that we would like to keep the door open for engines to have a different representation for func. In an engine that has a different func representation, both the intern and extern representations of functions would be boxed in some way, so the explicit conversion back to func before calling would make a hidden conversion cost explicit.

Unless anyone feels strongly that these new considerations change the picture, we should stay the course with the three-pronged design.

@lukewagner
Member

Could an even more minimal option be to not attempt to provide any coercions between funcref and anything else? This would allow funcref to be its own separate type, just like i64 (which can also be exported to JS and would also have the same issue if we attempted to, e.g., provide i64-to-externref coercions).

@titzer
Contributor

titzer commented Jul 13, 2022

Apologies for missing the meeting, I have a recurring conflict at exactly that time and have to juggle.

I think we should go with the three-pronged consensus for now, in the interest of having a narrow enough cone of uncertainty to iron out the binary format and have tests for it. I love the part where I have some code to write and some tests that tell me if that code is working. Let's get to that part.

There still seems to be disagreement on where the burden of proof lies w.r.t. the any design question, though. We should move forward now, but if we want to revisit the decision, then what new information or data is required?

@jakobkummerow
Contributor

@rossberg: I think I see your point, but I don't think this creates a silly situation: conceptually, I think of Wasm funcrefs exported to JS as "JS wrapper objects that (internally) reference a Wasm function" (whether they're actually implemented as wrapper objects is irrelevant). So if such a JS object is passed to Wasm as an externref, then it's just another opaque host object. Converting it to internref doesn't change its representation, and doesn't make it usable in any way other than as an opaque pointer -- not because we're explicitly barring "the third kind of funcref" from being useful, but because opaque host objects simply aren't usable in general. In that sense, I agree with @lukewagner's i64 example, which applies to all numeric types: e.g. a WebAssembly f64 can be exported to JS, where it'll be a JS Number, which can be passed back to Wasm as externref, but that doesn't really mean that "there's a silly unusable f64 representation that's a subtype of anyref"; it's just another opaque host object.

@rossberg
Member

rossberg commented Jul 13, 2022

@lukewagner, since func refs can already escape into the JS universe, and the conversion happens implicitly at the boundary, we cannot truly remove that conversion. But you are right that we can probably make it unobservable from within Wasm, analogously to number types, as you suggested. That makes sense to me.

Okay, another question.

In the past, we repeatedly talked about the possibility of Wasm reference types that are not allowed to be passed to JS (similar to v128, e.g., consider continuation objects from the stack switching proposal). What would this mean for the intern->extern conversion lowered into Wasm?

  • A non-option is to make the conversion trap or otherwise fail. That is not viable, because whether something is allowed to escape to the host is fundamentally a policy decision that may differ between hosts, and we don't want the Wasm conversion semantics to depend on the host. (As an example, we may not want to allow continuation values to escape to JS, while it would be undesirable to make them inaccessible through the C API.)

  • Then the only other option is that there are checks at the boundary, in environments that restrict the values. That would be fine by me, but it implies that the use of externref is still not check-free: in a host with such restrictions, it will require a case analysis at the boundary, but this time on the way out. Are we okay with that?

  • Edit: a third option is to never reject any references at the boundary either. Instead, just make them sufficiently inaccessible in JS. But I don't know if that's enough to cover all potential cases. In particular, that would still allow them to roundtrip back into Wasm in potentially undesirable ways. I guess we lack a good picture of the possible scenarios.

@conrad-watt
Contributor

conrad-watt commented Jul 13, 2022

In the past, we repeatedly talked about the possibility of Wasm reference types that are not allowed to be passed to JS (similar to v128, e.g., consider continuation objects from the stack switching proposal). What would this mean for the intern->extern conversion lowered into Wasm?

Could such values be lossily converted at the intern->extern step to a completely opaque "sentinel/invalid" externref that can't be downcast/converted back to the original Wasm value? This would capture that such values have no real lossless extern representation while keeping the overhead of handling this confined to the conversion points.

@rossberg
Member

rossberg commented Jul 13, 2022

@conrad-watt, that would fall under the "otherwise fail" clause I mentioned. It still means that the semantics is host-dependent. In particular, you can't know whether intern->extern followed by extern->intern succeeds and returns the original value. So, no.

@titzer
Contributor

titzer commented Jul 13, 2022

@rossberg I was under the impression that extern serves the role of representing "already legal" external host values. If some intern references are not legal as extern, depending on the host, it would seem like the intern -> extern explicit conversions are, in fact, host-dependent. Maybe that shouldn't be an instruction then, but an imported function from the host?

@lukewagner
Member

I think the "never fail" option makes a lot of sense. Just like externval lets us box all foreign host values so they can roundtrip host→wasm→host, it seems reasonable for the host to provide the inverse box, allowing all wasm values to roundtrip wasm→host→wasm. In a JS setting, I imagine this taking the form of some JS API-defined exotic object that stores the wasm value in an internal slot but provides no way to get at it, so the only thing you can do with it is pass it back to the JS API. I could even imagine this being a quite useful feature for linker/loader-type JS glue code scenarios.

@rossberg
Member

rossberg commented Jul 14, 2022

@lukewagner, I think I agree. Though one reason to prevent values from crossing is not to expose identity/equality to JS, when they are not eqrefs in Wasm, e.g., to avoid exposing implementation details. OTOH, we already have that issue with funcrefs, and they are exposed.

@titzer, I agree that host-dependent behaviour would be fine if it was an import. But I'm not sure that we have a good model yet for such a "system import", or that it would meet the expectations of the group.

@lukewagner
Member

@rossberg Yeah, that's a good concern. I wonder if maybe JS exotic object semantics give us hooks for making object identity unobservable? Otherwise, it seems like we could specify a "new object every time" semantics which would probably be fine.
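The "inverse box" with "new object every time" semantics might look roughly like the Python sketch below. It is a model of the idea only, not a proposed JS API: `OpaqueWasmRef`, `to_host`, and `to_wasm` are hypothetical names, and the module-private dictionary stands in for an internal slot that host code cannot read.

```python
import weakref


class OpaqueWasmRef:
    """Host-visible wrapper that exposes no fields or methods of its own."""
    __slots__ = ("__weakref__",)


# Stand-in for an internal slot: only the conversion functions below use it.
_internal_slot = weakref.WeakKeyDictionary()


def to_host(wasm_value):
    """wasm -> host: never fails, and allocates a fresh box on every call,
    so host-side identity reveals nothing about the underlying Wasm value."""
    box = OpaqueWasmRef()
    _internal_slot[box] = wasm_value
    return box


def to_wasm(host_value):
    """host -> wasm: unwraps a box; any other host value passes through."""
    if isinstance(host_value, OpaqueWasmRef):
        return _internal_slot[host_value]
    return host_value
```

Two calls to `to_host` with the same value yield distinct boxes, yet both unwrap to the original value, so the wasm→host→wasm roundtrip is preserved without exposing identity to the host.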

@titzer
Contributor

titzer commented Jul 26, 2022

As per today's meeting, I think we don't need a wasm type, since we've split the hierarchy into the three prongs.

@tlively
Member

tlively commented Jul 26, 2022

I'll go ahead and close this issue. Any follow-up discussions can be new issues.
