Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i31 as a ref type dimension like null #130

Closed
titzer opened this issue Aug 31, 2020 · 56 comments
Closed

i31 as a ref type dimension like null #130

titzer opened this issue Aug 31, 2020 · 56 comments

Comments

@titzer
Copy link
Contributor

titzer commented Aug 31, 2020

In the course of discussion #100, @gasche proposed making i31ref not a type, but a dimension in the ref type constructor. I'd like to pull this idea out for a dedicated discussion because it is not entirely clear to me if we can add this later or must add this up front. It also seems like it can make a difference in code generation, so we should probably think it through.

Concretely, instead of (or perhaps, in addition to) a single i31ref type which is in the middle of the hierarchy between anyref and (ref $T), we have:

For each type $T, then:
(ref $T) which is a subtype of (ref i31 $T)

It is thus another independent dimension much the same way as null is a dimension for this type constructor.

(ref i31 $T) subtype of (ref null i31 $T)

When thinking about the code that would be generated by an engine for various reference operations, some engines may (already) need to generate these null checks:

  • null checks for struct.get, struct.set and corresponding array operations
  • null checks for casts between struct and array types

Of course, most engines will hopefully be able to do these implicit null checks via virtual memory tricks, but not all can.

We've been discussing the additional checks that would be needed for i31 tag checks in the context of downcasts from i31ref to a more specific reference type. In particular, they would be implicitly needed because i31ref is proposed as a subtype of anyref above all declared heap types (AFAICT).

I'd like to point out that tag checks can be avoided if we think of i31 values as just a big group of nulls: engines can elide the checks based on static information, which is exactly the same information provided in the null ref type dimension.

It also gives more expressive power to programs, because it is a very limited union type. In particular, using (ref i31 $T), a module could perform a (struct.get $T i) that has an implicit tag check that would trap on an i31 value in much the same way as it would trap on null. Without this, a program would have to use i31ref types everywhere and downcast everywhere explicitly; a potentially more expensive check than just a tag check.

@sabine
Copy link

sabine commented Aug 31, 2020

I see that WASM already has 64-bit integers, which suggests, that, eventually, WASM might be usable as a 64-bit compilation target. We are very interested in a spec that allows 63-bit unboxed scalars in addition to 31-bit unboxed scalars.

With ref i63 $T not being a subtype of anyref, it does not matter which pointer size the WASM engine uses internally.

Do you think it will be possible to import/export ref i31 $T explicitly from a WASM module (i.e. not as a private type or anything, just a plain import from one module and export from another)?

@jakobkummerow
Copy link
Contributor

I'm generally supportive of this suggestion. In particular because it allows modules that don't need/use i31ref to avoid some i31 checks (they may be cheap, but not doing them at all is even cheaper).

I'm not sure how many checks would be avoidable in practice. I assume that the type lattice will have to be:

eqref == (ref null i31 eq) <: anyref == (ref null i31 any)
(ref null i31 $T) <: eqref  (for $T being a struct or array type)
(ref $T <: (ref null $T) <: (ref null i31 $T)
(ref $T <: (ref i31 $T) <: (ref null i31 $T)

So in particular, downcasts from anyref/eqref to (ref $T)/(ref null $T) will still have to include i31 checks; only downcasts from (ref any) or (ref null any) or (ref eq) or (ref null eq) can skip the i31 checks.

With ref i63 $T not being a subtype of anyref

@sabine : I would have assumed that all (ref ...) types will continue to be subtypes of anyref, in which case (ref i63 ...) would force all pointers to be 64 bit wide (and would rule out NaN boxing as an implementation strategy). At any rate, I think i63 is a separate discussion, and fairly orthogonal to this issue.

@sabine
Copy link

sabine commented Aug 31, 2020

Ah, sorry about derailing. If all (ref ...) types must still be subtypes of anyref, then (ref i63 ...) is obviously not a reasonable thing to have.

@jakobkummerow
Copy link
Contributor

I just realized that #76 creates an interesting wrinkle here. If we have to disallow (ref i31 extern), then I think we have to disallow (ref i31 any) too, but if anyref means (non-i31!) (ref null any), then all the (ref i31 $T) can't be subtypes of it, and the eqref shorthand can't have the i31 bit either. Modules could still define (ref i31 eq), but the type would be incompatible with anyref or eqref.

So on the one hand, that would imply that (contrary to my previous post) downcasts from anyref would not have to include i31 checks; but on the other hand, it would also mean that there's no way to define a type for a global/local/parameter/field that can hold a struct/array ref, or an i31, or a funcref (which would have been nice in order to close a functionality gap compared with the "two-types" proposal).

One solution would be to introduce more general union types.

One solution would be to introduce another built-in type that sits between anyref and its non-externref subtypes, i.e. would be the new supertype of funcref and eqref.

Am I overlooking anything?

@RossTate
Copy link
Contributor

RossTate commented Sep 2, 2020

Nice catch, @jakobkummerow! To rephrase, no GC proposal with i31ref can have a common supertype between i31ref and externref, which in particular means no top type. Currently the designs of the Type Imports and GC proposals (and, to a lesser extent, Typed Function References) are premised around having a top type. So either we redesign around not having a top type and instead letting each related family of reference types specify whether it needs boxed scalars (and possibly what kind of boxed scalars), or it seems we need to remove i31ref (likely forever).

@tlively
Copy link
Member

tlively commented Sep 2, 2020

It seems like this simplest and least intrusive change would be to split externref off of the type hierarchy. Applications that need to use subtyping and casting could just use anyref in its place. But just to check my understanding, #76 means that there would be some conversion overhead under this solution because JS numbers as anyrefs would need to be boxed to be differentiated from i31refs?

@RossTate
Copy link
Contributor

RossTate commented Sep 2, 2020

You would also have to either disallow imported types from representing externref or also split imported types off of the type hierarchy.

@jakobkummerow
Copy link
Contributor

#76 means that there would be some conversion overhead under this solution because JS numbers as anyrefs would need to be boxed to be differentiated from i31refs?

That boxing would have to happen only if externref and i31ref overlap. It's a specific instance of a more general problem though: an externref pointer, being host-controlled, can have any arbitrary bit pattern. If we specify i31ref in Wasm with the expectation that it is encoded by giving one or more bits in a pointer particular meaning, then that means that in general, externref pointers could accidentally "look like an i31", which might lead to weird/unspecified behavior. Making sure that there is no overlap between externref and i31ref (or i31 type dimensions) means that externref pointers (including JS numbers) never have to go out of their way to accommodate the existence of i31 in Wasm.

@titzer
Copy link
Contributor Author

titzer commented Sep 25, 2020

@rossberg any thoughts?

If externref is a completely foreign reference type and split off the hierarchy, then an easy way to incorporate this into the binary format would be instead of the type construcotrs (ref null T) and (ref T) being 0x6C and 0x6B respectively (if encoded as a single byte), we could do:

(ref T) : 0x68 (0b01101000)
(ref null T) : 0x69 (0b01101001)
(ref i31 T) : 0x6A (0b01101010)
(ref i31 null T) : 0x6B (0b01101011)

Basically, the low order two bits indicate the two orthogonal dimensions of the type: whether null or i31 values are allowed.

Also, currently anyref has its own type constructor, but we could just as easily specify that heap type index -1 (i.e. invalid) means any, so anyref would be encoded by one of the four above type constructors, followed by a -1 for the heap type index.

@dcodeIO
Copy link

dcodeIO commented Sep 25, 2020

What would be the non-canonical type of i31ref? (ref i31 i31)?

        any               i31
   ┌─────┼────┬─────┐
extern  func  eq   exn
         │    │
   signature* │
              │
          ┌───┴────┐
       struct*   array*

Dimensions: null? i31?

Given the i31 and extern overlap, I guess what's proposed above is

          any        i31     extern
    ┌──────┼─────┐
  func     eq   exn
    │      │
signature* │
           │
       ┌───┴────┐
    struct*   array*

Dimensions: null? (i31 | extern)?

disallowing mixing (ref i31 extern T)?

Makes me wonder, how does (ref null extern) work if the underlying extern pointer can have any bit pattern, as mentioned above, and is the exact pattern indicating null? Edit: Thinking further, that's probably covered by the type system, and while a host can theoretically pass in the bit pattern for null into an (ref extern), that's not a problem in practice?

@titzer
Copy link
Contributor Author

titzer commented Sep 25, 2020

I'm not sure what is meant by the "non-canonical type" of i31ref. Under this idea, i31 is not a type, in the same way that null is not a type.

I am also skeptical about the claim that externref values can have any bit pattern. If that were the case, then it could not have a common supertype with any other wasm value, heap or not, regardless of i31, since it would then be impossible to distinguish it (e.g. for a downcast) based on its bits alone. That's part of what that other issue was getting at: externrefs would have to be tagged by either being boxed or the top type being a fat pointer. The mapping of JavaScript numbers onto Wasm numbers is a semantic concern, and we're talking about representations here.

I think it's closer to the truth that a Wasm engine generally knows something about the encoding of externrefs of its host environment (e.g. they must be valid pointers into a common address space) and perhaps can work around those encodings if necessary. They generally aren't arbitrary bits.

@dcodeIO
Copy link

dcodeIO commented Sep 25, 2020

I'm not sure what is meant by the "non-canonical type" of i31ref. Under this idea, i31 is not a type, in the same way that null is not a type.

Hmm, interesting, so the closest type is (ref i31 eq) if non-nullable respectively (ref null i31 eq) if nullable, with initializers (i31.new (i31.const 0)) or (ref.null eq).

Makes me wonder about solving the i31 and extern overlap using a dimension then, since that'd imply removing externref as well which is already in phase 4 as part of reference types.

I am also skeptical about the claim that externref values can have any bit pattern. If that were the case, then it could not have a common supertype with any other wasm value

I am skeptical about externref in general. Once there's anyref, there isn't really much use for it anymore, and the last minute change of removing anyref and introducing externref made to reference types was already quite surprising to me. If there is indeed an overlap introduced here, which I'm not sure about anymore given your comment, then I think re-evaluating whether we really need the distinction might make sense. For instance, a clean approach to solve the overlap might be to rename externref to anyref in reference types (text format only change), and then build GC on top of that anyref, as initially planned. Or what am I missing?

@rossberg
Copy link
Member

This is not a bad idea. It is one step further in the direction of a union type -- really, (ref null i31 $t) is (ref (union null i31 $t)). That raises the question, though, why stop there? Why shouldn't we also enable (ref (union null i31 $t $u $v)), which allows even more refined tracking of possible sets of references, and likewise allows eliding more checks.

In principle, any such refinement is useful. However, union types have rather problematic properties in terms of type-checking complexity and worst cases, which can explode quickly, in particular when also combined with intersection type, forms of which have also been suggested. And it's always possible to fall back to anyref as the largest union type, for the price of losing some precision. So it's a slippery slope, and I might be a bit hesitant to go there too quickly.

OTOH, maybe i31 is enough of a common case to justify tracking it specially (like null).

@jakobkummerow
Copy link
Contributor

In the meantime, I've heard specific feedback that making i31 a ref type dimension would be more efficient/useful than the current proposal for surface languages that have primitive integers: they would typically represent such source-level integers as i31|$BoxedInt (for some custom $BoxedInt type). In the current proposal, this conceptual union type cannot be expressed more specifically than eqref, which has the downside that code processing such values must perform two checks: an i31 check, and then a checked cast to $BoxedInt. If there were a way to express this union type more accurately (either by making i31 a dimensional attribute (ref i31 $BoxedInt), or by introducing more general union types), then a single i31 check would be enough; at least if we type br_on_i31 properly.

@rossberg
Copy link
Member

Yes, that's one obvious use case. Although you can easily find similar use cases for other kinds of small unions, so that alone isn't necessarily enough to justify special casing i31.

One problem I realised with making i31 a dimension unlike other reftypes is that that creates a problem for type imports/exports (and possibly generics): naturally, you want to be able to instantiate a type import with i31, but that wouldn't be possible if it was segregated into a separate dimension.

Proper union types would provide a coherent solution, but suffer from complexity (in particular, quadratic subtyping). Maybe we need to look for some middle ground.

@taralx
Copy link

taralx commented Nov 12, 2020

Is it still quadratic if we don't support width subtyping of unions? The use cases I've gathered don't need it.

@rossberg
Copy link
Member

@taralx, do you mean that A or A | B would not be subtypes of A | B | C? Wouldn't that defeat the purpose?

@taralx
Copy link

taralx commented Nov 12, 2020

A|B wouldn't be a subtype of A|B|C, but A and B would be subtypes of both.

@rossberg
Copy link
Member

That wouldn't compose, though. Imagine you have A|T, where T is an import, and that is latter instantiated with B|C. You'd need to either disallow unions over imports or disallow instantiating imports with unions, both of which seem like rather severe restrictions.

@taralx
Copy link

taralx commented Nov 12, 2020

Let me think a bit more about it, but that might be ok. It probably requires coercive subtyping, though, which we've been avoiding so far.

@RossTate
Copy link
Contributor

Subtyping for arbitrary union types is quadratic without recursive types. Subtyping for equi-recursive types is quadratic without arbitrary union types. The combination of these features is not quadratic and in fact may not even be polynomial. At least for the subtyping rules that would enable the standard automata-based subtyping algorithms, subtyping is PSPACE-complete. And that's without taking into account the contravariance that would be introduced by function types.

@taralx
Copy link

taralx commented Nov 17, 2020

I'm left feeling like the correct thing is just to bound the "size" of types to make the problem tractable rather than restricting functionality to linear or near-linear complexity.

@RossTate
Copy link
Contributor

Unfortunately low-level structural heap types tend to be quite large. If you think of a language like Java, a class's structural type includes not only all of its fields but also all of its overridable methods (as they are part of the v-table) as well as its meta-data (e.g. the data-structures used for interface-method dispatch, casting, and reflection) and then recursively the same components of any classes referenced by those fields or methods.

rossberg pushed a commit that referenced this issue Feb 24, 2021
* Fix outdated stuff

* Update Overview.md

* Update Overview.md

* Update Overview.md

* Clarify zero table index

* Address review comments
rossberg added a commit that referenced this issue Feb 24, 2021
@rossberg
Copy link
Member

@askeksa-google, a type export applies to a type definition from the type section. That contains func, struct, or array definitions – and the aforementioned TODO would add i31 as a definition. This is orthogonal to dimensions in a reftype. So turning i31 into a dimension would move it to the wrong category, making it incompatible with exports.

Minimal toy example of a client module:

(module $A
  (type $t (import "B" "t"))
  (func $make (import "B" "make") (param i32) (result (ref $t)))
  (func $get (import "B" "get") (param (ref $t)) (result i32))

  (func (export "run") (param $x i32) (result i32)
    (call $get (call $make (local.get $x)))
  )
)

Now, by basic principles of abstraction, we need both the following implementations for B to be possible (among others):

(module $B_boxed
  (type $t (export "t") (struct (field $n i32)))
  (func (export "make") (param $x i32) (result (ref $t))
    (struct.new (local.get $x) (rtt.canon $t))
  )
  (func (export "get") (param $r (ref $t)) (result i32)
    (struct.get $n (local.get $r))
  )
)

or

(module $B_unboxed
  (type $t (export "t") i31)  ;; note TODO in MVP.md
  (func (export "make") (param $x i32) (result (ref $t))
    (i31.new (local.get $x))
  )
  (func (export "get") (param $r (ref $t)) (result i32)
    (i31.get_s (local.get $r))
  )
)

(For the sake of simple example, I'm assuming that the module's contract implies that the parameter has only, say, 16 significant bits.)

@titzer
Copy link
Contributor Author

titzer commented Jan 21, 2022

@rossberg The above reason is why I have always thought that type imports and exports should be types, and not heap type definitions. Then the constraint on a type import is just a subtype bound, which can specify any type, including our simple union type mechanism.

@tlively
Copy link
Member

tlively commented Jan 21, 2022

It would be nice if we could put off figuring out how to make this work with type imports and exports. Without considering that proposal, it seems that we would all agree that adding a bottom type and making i31 a type dimension would be a good solution for the GC proposal.

@rossberg
Copy link
Member

rossberg commented Jan 22, 2022

@titzer, I know you do :), but that is a way more complex and much less obvious feature, and one that requires fundamental changes to Wasm's compilation model, as you know (also, to its type indexing model). To be honest, I still would not know how to design all that concretely.

It also is no substitute for exporting heap types. When a module exports some data representation, then clients may need to be able to, for example, form both nullable and non-nullable references to it (e.g., to define locals/globals/tables). Hence, requiring that choice to be made on the exporting side typically is the wrong factorisation (even if there may be use cases for that a well). Also note that null is very different from i31 in that regard.

@tlively, we can't put it off, since the suggestion would paint us into the wrong corner. What the above shows is that making i31 a property of the pointer instead of a representation type is a category mistake, even though the unboxing optimisation for i31 refs can easily mislead us into thinking otherwise. Or to put it differently: there is a fine line between designing semantics that enables certain optimisations and making optimisations the semantics itself. :)

The motivating argument of wanting to express a certain kind of type union to save some checks applies equally to other type unions. That suggests that we should approach this from a principled angle.

@rossberg
Copy link
Member

rossberg commented Feb 2, 2022

While thinking more about how to pull off union types, I reread the OP of this issue and realised that some of the assumptions are actually off. Sorry that I had not noticed this earlier!

Concretely, instead of (or perhaps, in addition to) a single i31ref type which is in the middle of the hierarchy between anyref and (ref $T)

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types. That makes a significant difference regarding your conclusions further on. Specifically:

We've been discussing the additional checks [similar to null checks?] that would be needed for i31 tag checks in the context of downcasts from i31ref to a more specific reference type. In particular, they would be implicitly needed because i31ref is proposed as a subtype of anyref above all declared heap types (AFAICT).

I don't think there are nearly as many redundant checks as you seem to assume.

Let's walk through a few representative examples.

  • Example A: a representation that is the union of multiple struct types – for simplicity's sake, let's assume just two:

    type $i32  =  struct i32
    type $str  =  array i8
    

    In the MVP, a value with such a representation can by typed as dataref. That rules out i31 altogether, i.e., there is no need to ever check for it, neither implicitly nor explicitly. Code processing a value $x with that representation will typically consist of the following sequence of type casts:

    (local.get $x)   ;; stack has (ref data)
    (br_on_cast $on_i32 (type $i32))
    (br_on_cast $on_str (type $str))
    (unreachable)
    

    Nothing would change here with the addition of an i31 dimension. You'd use type (ref data) as before, and the generated code stays exactly the same.

  • Example B: same as A, but the representation also includes an i31 case. To allow that, the type of $x has to be generalised to eqref, yielding the following instruction sequence for processing it in the MVP:

    (local.get $x)   ;; stack has (ref eq)
    (br_on_i31 $on_i31)
    (ref.as_data)    ;; now stack has (ref data)
    (br_on_cast $on_i32 (type $i32))
    (br_on_cast $on_str (type $str))
    (unreachable)
    

    The i31 dimension would allow us to use the slightly more precise type (ref i31 data) instead of eqref, which makes the downcast with ref.as_data unnecessary. Nothing else changes, once we know it's pure data, the i31 case does not matter anymore in the MVP either.

    Furthermore, in the MVP we actually have eq = i31 xor data! That means that an engine can already deduce that the reference is (ref data) after the test for i31, and a simple peep hole optimiser can eliminate the ref.as_data as well.

  • Example C: same as B, but with only one struct case, say $i32. Again, the MVP would have to use eqref:

    (local.get $x)   ;; stack has (ref eq)
    (br_on_i31 $on_i31)
    (ref.as_data)    ;; now stack has (ref data)
    (br_on_cast $on_i32 (type $i32))
    (unreachable)
    

    This time, the i31 would help, because we could use type (ref i31 $i32) for $x. That simplifies the above to:

    (local.get $x)   ;; stack has (ref i31 $i32)
    (br_on_i31 $on_i31)  ;; after this, we have (ref $i32)
    (br $on_i32)  ;; unconditional branch
    

So, the only case where an i31 dimension has a real benefit is a binary union like Example C, where there is only a single other case besides the i31. For all other examples, the MVP type hierarchy can already express essentially the same information. (Moreover, it would be straightforward to extend the hierarchy such that this remains so even if we extended eq in the future, i.e., by adding an abstract type for i31-or-data.) So overall, I think the win of an i31 dimension would be relatively small.

The biggest cost in the examples is actually the unnecessary br_on_str cast in the end, plus the seemingly redundant unreachable instruction. To eliminate it in more than just the special case of Example C, proper union types would be required. Then the sequence for Example B could be:

(local.get $x)   ;; stack has (ref (i31 or $i32 or $str))
(br_on_i32 $on_i31)  ;; after this, stack has (ref ($i32 or $str))
(br_on_cast $on_i32 (type $i32))  ;; after this, stack has (ref $str)
(br $on_str)  ;; this branch is unconditional

Obviously, this is not essential for the MVP. But if we wanted to optimise union representations better eventually, then that's the more useful route. And that requires no change to the MVP.

Does that make sense?

@titzer
Copy link
Contributor Author

titzer commented Feb 2, 2022

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types.

Ok, I did misunderstand this. But this means that if i31ref is not a supertype of (ref $T), then an explicit operation is required to turn a (ref $T) into an i31ref. This would just be subsumption otherwise. At the representation level, this is a nop.

If i31ref is indeed a completely disjoint (basically singleton) type, then AFAICT it already represents the union of dataref and i31 (but not below it, to the side). That has limited utility except for the very specific use case of implementing parametric types by erasure. It's a far more dynamically typed mechanism that what I expected we were designing.

As for the examples above, I think example C will be far more common than examples A or B, and I do think it's an important win. Otherwise, we are introducing a requirement that engines represent i31 efficiently (by not boxing and being able to unify them with references) but intentionally leaving out program's ability to specify more precise types, forcing them to do more case analyses (via casts) and redundant operations that would normally just be subsumption. I would take this even further than case C and say that accesses of struct fields of that have i31 in the reference type dimension should be allowed, and it's the engine's responsibility to dynamically test (and trap) on i31, like it does with null. We'll get the best code that way (in particular, a smart engine can unmap the low 2-8GiB of the engine's address space and all i31 values will cause a hardware trap, thus making the i31 check completely free, like checking for null).

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

@rossberg
Copy link
Member

rossberg commented Feb 2, 2022

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types.

Ok, I did misunderstand this. But this means that if i31ref is not a supertype of (ref $T), then an explicit operation is required to turn a (ref $T) into an i31ref. This would just be subsumption otherwise. At the representation level, this is a nop.

I'm not sure I follow. Currently, there is no type with i31 representation that inhabits any (ref $t). Once I add the aforementioned todo to be able to define $t = i31, then type (ref $t) would in fact be equivalent to i31ref, so subtyping isn't even needed.

If i31ref is indeed a completely disjoint (basically singleton) type, then AFAICT it already represents the union of dataref and i31 (but not below it, to the side).

Hm, I'm still lost. The union of data and i31 is eq in the MVP. And when lifted to value types as references, eqref is the union of i31ref and dataref (modulo cleaning up nullability via #266). Nothing much else is needed for the time being, AFAICS, it all composes as one would expect.dataref and i31 are not in the same category (one is a value type, the other a heap type), so they are just incomparable.

That has limited utility except for the very specific use case of implementing parametric types by erasure. It's a far more dynamically typed mechanism that what I expected we were designing.

As for the examples above, I think example C will be far more common than examples A or B, and I do think it's an important win.

I have to respectfully disagree. AFAIAA, example C has come up in none of the compilers that have been prototyped so far. They all needed either A or B. I would expect B to be the most common case by far, basically every managed language will use it in some form or the other. I can imagine very few scenarios where C would come up and not be a subset of some larger B. Do you have a specific scenario in mind?

@titzer
Copy link
Contributor Author

titzer commented Feb 2, 2022

I can imagine very few scenarios where C would come up and not be a subset of some larger B. Do you have a specific scenario in mind?

Yes, it comes up when, e.g. packing algebraic data types. For example, if some cases can be packed into 31 bits and others need to be boxed. I will use this in my Virgil compiler, first for native targets (including Wasm sans GC), and then when I retool it to target Wasm iso-recursive types. It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing. The inclusion of i31 at all seems to have only been motivated by that one use case. i31 is a considerably more general mechanism at the engine level and affords many opportunities like the ones I just described. It's an implementation-level union and IMHO is analagous to the null value.

Forgive my slight misunderstanding above. I did actually implement all of this in Wizard, but that was a year ago.

In the last part I wrote:

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

AFAICT that still holds.

@rossberg
Copy link
Member

rossberg commented Feb 3, 2022

Yes, it comes up when, e.g. packing algebraic data types. For example, if some cases can be packed into 31 bits and others need to be boxed.

Yes, that's the one example I could think of. But it only applies to the edge case where you have (a) exactly one non-nullary constructor, and (b) more than one nullary constructor (otherwise you could presumably use null). Not exactly the most common case. Why is optimising that case more pressing than any other?

It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

Hm, any JS implementation I have seen requires more than two cases – there are boxed numbers, strings, null, etc., none of which are JS objects. So JS rather is an example of a language that would not benefit.

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing.

Well, as I said, the B case comes up in practically all managed languages, and is far more frequent than the C case. C hardly ever comes up without B occurring as well. Even your ADT example is a case in point. I remain unconvinced that special semantics is justified for optimising such edge cases.

If we did special-case that, we'd essentially be designing an inherent performance cliff into the type system. That seems undesirable.

In the last part I wrote:

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

AFAICT that still holds.

I didn't respond to that earlier, because I don't understand what that would achieve. As mentioned above, the type (ref i31 data) is effectively indistinguishable from (ref eq) at the moment, so rather redundant. Example C would require (ref i31 $t) specifically. So what would you gain under this restriction?

Also, any path forward should be generalisable to proper unions somehow. It's not clear to me how this would. AFAICS, it would be a design dead end. And let's not forget, it likewise remains completely unclear how to reconcile it with type exports.

@askeksa-google
Copy link

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing. The inclusion of i31 at all seems to have only been motivated by that one use case. i31 is a considerably more general mechanism at the engine level and affords many opportunities like the ones I just described. It's an implementation-level union and IMHO is analagous to the null value.

I think this is very well put. A union between i31 and a struct type is the canonical use case for i31. It's just one use case, true, but it can be rather prolific in languages where integers are considered objects.

For instance, in Dart, the redundant cast (that would be avoided by the union) occurs in mainly the following situations:

  • Calling a method on any supertype of int, i.e. Object, num or Comparable.
  • Using a nullable int? that was promoted by a null check.

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

@rossberg
Copy link
Member

rossberg commented Feb 3, 2022

@askeksa-google:

I think this is very well put. A union between i31 and a struct type is the canonical use case for i31.

The canonical use case is the union between i31 and a set of structs. The single struct case is no more special than the 2 structs case is. ;)

Does Dart not use a larger union? Of course, there will be places where you have narrowed it down to a binary union, but arent't there equally places where you have narrowed it down to a ternary union? Or a binary union not involving i31?

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

Now I'm confused. Wouldn't it be more significant if it was not cheap?

@askeksa-google
Copy link

Does Dart not use a larger union? Of course, there will be places where you have narrowed it down to a binary union, but arent't there equally places where you have narrowed it down to a ternary union? Or a binary union not involving i31?

I suppose num could be represented as i31 | _BoxedInt | _BoxedDouble (or _BoxedInt | _BoxedDouble if not using i31), which could save a cast in a few places, mostly when doing math directly on num values.

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

Now I'm confused. Wouldn't it be more significant if it was not cheap?

What I mean is that having an i31 check instead of a cast (which the union type enables, for instance in the int supertype method call case) is a significant saving because an i31 check is cheaper than a cast.

To be clear: having i31 as a type dimension or being able to express it using more general union types is all the same for my use cases. I just find i31 of dubious utility without any of the two (at least it becomes more of an unclear trade-off).

@rossberg
Copy link
Member

rossberg commented Feb 3, 2022

@askeksa-google, the primary motivation for i31 is saving space, allocations, and garbage, which is quite significant. Cheaper checks are more like a nice by-product. Example C is the only one for which an i31 dimension would allow a cheaper check.

@askeksa-google
Copy link

Yes, that's one side of the trade-off. The other side is that using i31 for integers comes with additional overhead on integer boxing and unboxing (independently of this issue), plus, in the absence of i31/struct unions, additional overhead on accessing objects that might be integers.

@rossberg
Copy link
Member

rossberg commented Feb 3, 2022

The other side is that using i31 for integers comes with additional overhead on integer boxing and unboxing (independently of this issue), plus, in the absence of i31/struct unions, additional overhead on accessing objects that might be integers.

Not sure I understand what you are referring to. Compared to what? If we did not have i31, then these cases would be even more expensive, right?

@askeksa-google
Copy link

I'll illustrate with some concrete examples of what various situations would look like in dart2wasm.

Boxing an int

Without i31:

(i32.const intClassId)
(local.get value)
(struct.new $BoxedInt)

With i31:

(local.get value)
(i64.const 33)
(i64.shl)
(i64.const 33)
(i64.shr_s)
(local.get value)
(i64.eq)
(if
  (local.get value)
  (i32.wrap_i64)
  (i31.new)
else
  (i32.const intClassId)
  (local.get value)
  (struct.new $BoxedInt)
end)

(There are several ways to perform the range check. They all seem to be similarly involved.)

Unboxing an int

Without i31, from an Object:

(ref.cast $BoxedInt)
(struct.get $BoxedInt 1)

Without i31, from an int?:

(struct.get $BoxedInt 1)

With i31, without union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    (ref.as_data)
    (ref.cast $BoxedInt)
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

With i31, with union type, from an Object:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data saved here
    (ref.cast $BoxedInt)
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

With i31, with union type, from an int?:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data and ref.cast saved here
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

Method call on int supertype:

Without i31:

(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

With i31, without union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    (ref.as_data)
    (ref.cast $Top)
    (struct.get $Top 0)
    (i32.const selectorOffset)
    (i32.add)
    (call_indirect)
    (br $done)
  end)
  ;; Perform operation on i31
end)

With i31, with union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data and ref.cast saved here
    (struct.get $Top 0)
    (i32.const selectorOffset)
    (i32.add)
    (call_indirect)
    (br $done)
  end)
  ;; Perform operation on i31
end)

With i31, without union type, known non-i31 receiver:

(ref.as_data)
(ref.cast $Top)
(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

With i31, with union type, known non-i31 receiver:

(ref.as_non_i31)
(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

Conclusion

With or without an i31/struct union type, using i31 for integers comes with a definite code size overhead. Whether using i31 gives better performance depends on how the saved allocations stacks up against the overhead from the extra instructions executed. Without the union type, the odds are stacked more heavily towards the overhead outweighing the savings from the allocations.

@titzer
Copy link
Contributor Author

titzer commented Feb 3, 2022

@rossberg The term "edge case" is inherently subjective so we might get stuck in another holding pattern debating it, and that gets long with technical details, so I'll have to be brief.

It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

Hm, any JS implementation I have seen requires more than two cases – there are boxed numbers, strings, null, etc., none of which are JS objects. So JS rather is an example of a language that would not benefit.

Of course I mean implementation-level object; boxed numbers, strings, null, undefined, which are all on-heap constructs in V8, and all on-heap constructs have a supertype which has a map. There's a complex hierarchy underneath that supertype of all heap objects, as we've both slogged through over the years. But checking the map word is extraordinarily common.

But it only applies to the edge case where you have (a) exactly one non-nullary constructor, and (b) more than one nullary constructor (otherwise you could presumably use null). Not exactly the most common case. Why is optimising that case more pressing than any other?

Just because a constructor isn't nullary doesn't mean its arguments can't be packed into 31 bits, e.g. if they are small primitives. A good example is the ADT I use to represent operands in the backend of the Virgil compiler. I specifically chose a limit of, e.g. 1 million vregs so that the VREG number, plus a tag, plus a hint, fits into a small number of bits, so it can be packed. But other operand cases might have a reference to something, so they are big and boxed.

ADTs are the example of the source-level construct that benefits most from having i31 as a representation choice for the source compiler to pack them. Since they are available to user programs then I don't think it's warranted to think of any user program's choice as an edge case.

@tlively
Copy link
Member

tlively commented Feb 4, 2022

Thanks for the concrete examples, @askeksa-google! What would it take to do the benchmarking to figure out the relative costs you mention in your conclusion?

@rossberg
Copy link
Member

rossberg commented Feb 7, 2022

@askeksa-google, thanks for the concrete examples. They match what I've seen as well. So the main effect of an i31 dimension would be one or two extra casts saved in places where you are only expecting an int64, as an instance of my Example C.

As @tlively says, it would be great to know how these affect the overall performance in order to justify special features enabling these casts' elimination in the MVP. Because, of course, the MVP requires many undesirable casts.

@titzer, okay, let's call it special case, that's perhaps more neutral. Objectively, it is just one of many similar cases. I think we can agree that it does not differ from the others qualitatively, the main question is how much it differs quantitatively. Do you agree?

Of course I mean implementation-level object; boxed numbers, strings, null, undefined, which are all on-heap constructs in V8

Well, in the proposal, the analogue to a map is an RTT. The analogue to the type of heap objects with a map is dataref. A language targeting the MVP can readily lower to that. Distinguishing between i31ref and dataref is the case I mentioned above, which presumably can already be optimised by an engine.

(I don't think a VM targetting Wasm would want to introduce custom maps and a custom supertype on top of RTTs and dataref. The only reason to do so would be if you wanted to do some form of table dispatch on those maps. AFAIK, not even V8 does that. And the space overhead would be so substantial that I doubt it is a viable implementation strategy. If such dispatch was needed, then Wasm ought to introduce suitable primitives for it.)

Just because a constructor isn't nullary doesn't mean its arguments can't be packed into 31 bits, e.g. if they are small primitives.

True, but still this is only a (rather incidental) fragment of a general set of cases of algebraic data types. Why do you think it deserves special attention in the MVP, where we have already accepted redundant casts in many other places?

Our criteria for putting some special handling into the MVP should be that (a) it can be generalised to the general case in the future, (b) it is forward-compatible with other anticipated features, and (c) the win is significant enough that missing out on it would put the MVP's success at risk. But AFAICS, the dimension idea is quite contrary to (a) and (b) as I argued above, and so far we lack evidence for (c).

@conrad-watt
Copy link
Contributor

conrad-watt commented Feb 7, 2022

If we were to punt this to post-MVP, would this end up looking like a proposal for tagged unions? (edit: discussed above)

Echoing @rossberg, my feeling is also that special-casing unions specifically with i31 in order to save a cast or two in the MVP is somewhat undesirably ad-hoc.

@askeksa-google
Copy link

As a first step towards some quantitative data on this, I did an experiment on dart2wasm where the type translation uses eqref for all int supertypes and the code generator inserts all necessary casts due to the less precise types.

One hurdle I ran into is that ref.as_data, as currently specified, does not propagate nullness, like ref.cast, but rather has an implicit ref.as_non_null built in. This makes nullable casts from eqref quite involved.

What is the reasoning behind the ref.as_data / ref.cast split anyway? It seems ref.cast could just as well implicitly check for data, just like it (and other instructions) implicitly check for null.

Anyway, it turned out that even though extra casts were inserted in many places, the overall code size increase from these was very small (around 0.1%), maybe because the change of the types also causes casts to be removed in other places.

In terms of performance, I tried various benchmarks, but did not observe any significant change.

Next step would be adding the actual i31 checks.

@titzer
Copy link
Contributor Author

titzer commented Feb 8, 2022

Based on the discussion today, my position is now that I think we can accomplish most of the goals of this proposal by adding a pointer heap type.

@conrad-watt
Copy link
Contributor

Also, just to explicitly record here, I'm supportive of a type constructor for special-cased unions with i31, rather than a new type dimension, if experiments suggest that it would be beneficial (either now or perhaps more likely in post-MVP).

@tlively
Copy link
Member

tlively commented Nov 1, 2022

After a lot of discussion of this topic, we've settled on keeping i31 as it is (although there are still questions about whether it is well-motivated enough to include in the MVP at all). In post-MVP, we expect to have some sort of general variant type that can contain i31 as well. Since we will not have i31 as a type dimension, I'll close this issue.

@tlively tlively closed this as completed Nov 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants