Skip to content
This repository has been archived by the owner on Nov 3, 2021. It is now read-only.

Should local.tee / br_if preserve subtypes? #55

Closed
lukewagner opened this issue Sep 18, 2019 · 19 comments
Closed

Should local.tee / br_if preserve subtypes? #55

lukewagner opened this issue Sep 18, 2019 · 19 comments

Comments

@lukewagner
Copy link
Member

Currently, local.tee validation requires C.locals[x] = t and br_if validation requires C.labels[l] = [t?]. This causes the following two examples to fail validation in the spec interpreter when intuitively they seem valid:

(module
  (func (param $p funcref)
    (local $x anyref)
    (local $y funcref)
    local.get $p
    local.tee $x
    local.set $y
  )
)
(module
  (func (param $p funcref)
    (block $b (result anyref)
      (block $c (result funcref)
        local.get $p
        (br_if $b (i32.const 1))
        br $c
      )
    )
    drop
  )
)

If instead the = was replaced with <: in the abovementioned validation rules, then I think these examples would validate.

Trying to think if this actually matters, you could imagine that it would be useful to have the property that:

  (local.set $local1 (local.get $x))
  (local.set $local2 (local.get $x))

was always equivalent to:

  (local.tee $local1 (local.get $x))
  local.set $local2

so that a size-optimizer could simply recognize and replace this pattern.

@rossberg
Copy link
Member

This probably is a duplicate of #48, see the discussion there for some rationale, especially this comment. The short answer is that such a "covariant" relaxation cannot be done in the general case (we'd be back to needing lubs), and it would be a bit inconsistent to only have it in simple cases. It also means less canonical typing rules, although we may not care about that.

@lukewagner
Copy link
Member Author

With the changes to select, though, it seems like there are two cases: (1) MVP select, in which there is no interesting subtyping to consider, just 4 number types, none the subtype of the other, (2) the new explicitly-typed select, where the result type is exactly the type parameter. IIUC, any future multiple-use-of-t instructions would only have the explicitly-parameterized form, making MVP select an oddity. Moreover, select is unlike tee.local and br_if in that it does not check type equality with some field of the context, which is what the proposal seems based on. Altogether, it seems like tee.local, br_if, and any future single-use-of-t instructions aren't really like select, and so their typing rules could be relaxed without breaking symmetry.

@rossberg
Copy link
Member

So, in all relevant cases (new select, tee, br_if), there are some operands, some type t that they have to match, and something to return which is supposed to be the most precise type the operands support (i.e., the lub).

Is the difference you are getting at that in the case of select, the type t is a local annotation, whereas in the others it is imposed by some separate definition? So in the select case, one could just locally adjust t to be the most precise type if desired.

That makes sense, but that difference is somewhat incidental. What if there was an instruction like, for the sake of argument, select_and_set $x that takes operands like select, but instead of just returning the selected value it also stores the other value in the local $x. This instruction would not have a local type annotation, because t is determined via $x. So the nicely aligned situation from above would not hold, and you'd again need a lub to remain symmetric with the relaxed local.tee.

@lukewagner
Copy link
Member Author

I see your point, but if we adopt the general design constraint that lubs are categorically avoided by explicit type parameters (as we've done with new select), then select_and_set would take an explicit t (which the validation rule requiring t to be a subtype of $x).

@RossTate
Copy link

Wouldn't select_and_set just check that the types of both arguments are subtypes of the local's declared type? There's no need for a lub.

It makes sense for select to be an odd case because it's essentially a shorthand. Note that the construct that select is a shorthand for requires an explicit type annotation, so it makes that select needs a type annotation.

In #48, you argue that br_if would need lub's, but that's not the case. Nor is it the case for local.tee.

In my experience, a good conservative design (which is the stage y'all are at right now) would both avoid the need for lubs and ensure variance with respect to subtyping. The latter because lubs might be costly to compute and constraining to require the existence of - the former because it makes code more robust to changes (e.g. optimizations or refinements to types of imported components).

@rossberg
Copy link
Member

rossberg commented Oct 1, 2019

@RossTate, you may be misreading this thread. In particular, we are all for avoiding lubs, and I proposed some earlier changes specifically to ensure that, see issue #42 and respective PR #43 (also presented at the last f2f meeting in June).

I'm not saying that br_if and local.tee would need lubs, but that adopting the relaxation suggested for them, if applied consistently throughout the language, would reintroduce lubs in other cases, like select or hypothetical select_and_set, unless we add extra type annotations for something like the latter.

Wouldn't select_and_set just check that the types of both arguments are subtypes of the local's declared type? There's no need for a lub.

To type it consistently with a relaxed br_if and local.tee, that lub would be needed. Or a secondary type annotation.

It makes sense for select to be an odd case because it's essentially a shorthand. Note that the construct that select is a shorthand for requires an explicit type annotation, so it makes that select needs a type annotation.

I don't think that is a meaningful distinction, especially since both br_if and local.tee can be viewed as shorthands in the same way, but do not require an annotation under the proposed change. (Although strictly speaking, of the three only br_if is actually macro-expandable without some let-like or stack-shuffling constructs.)

@lukewagner
Copy link
Member Author

@rossberg But what's wrong with saying that, hypothetically, select_and_set would have an explicit type parameter? Given the goal of "no lubs", this would be what you'd expect for any instruction with a single t used multiple times.

@rossberg
Copy link
Member

rossberg commented Oct 1, 2019

@lukewagner, nothing principally wrong, just a bit weird (given that the local implies a type already, which almost always is the same) and would cost an extra subtype check.

And for br_if and local.tee would still be the odd ones out in that they'd be the only rules with an explicit subtype premise, when for everything else the implicit subsumption rule suffices.

But neither is a big deal, I agree.

I'll look into making the change, working through the bulk proposal right now, ref-types are next.

@lars-t-hansen
Copy link
Contributor

@rossberg, where are we with this? As a process matter, is this not a compatible change that we could choose to make later, if we wish? It would be nice to start moving ref-types toward Phase 4.

@rossberg
Copy link
Member

Yeah, I've gotten hold back by work on the bulk proposal and its missing pieces, since I intended to rebase this one on that before continuing. I'd also be fine with deferring the relaxation, depends on how strongly @lukewagner feels about it.

@aheejin
Copy link
Member

aheejin commented Dec 4, 2019

Where are we with this? Currently local.tee's type is the type of local and not the value, right? I think it should stay that way.

Correct me if I'm mistaken, but I think local.tee is supposed to be a shorthand for local.set followed by local.get, and if we make local.tees type the type of value (not the local), they are not equivalent anymore. For example, when $x is a local with type anyref,

(local.set $x (ref.null))
(local.get $x)

This leaves anyref on the stack.

(local.tee $x (ref.null))

If local.tee's type is a local type, this expression's type also becomes anyref and the two are equivalent, but if it is a value's type (here nullref), the two expressions are not equivalent anymore. There are many optimization passes in the toolchain that changes one expression to the other as well.

@lukewagner
Copy link
Member Author

I'm not sure why it's "supposed" to be a local.set followed by a local.set; if you look at the word "tee", it means taking a single value and sending it to two places, which would naturally imply the value keeps its same type on both branches. E.g., if you tee to a file, it doesn't write to the file and then read out of the file...

This isn't especially important, but I initially filed the issue b/c it was surprising (given the name) which seems like a reason for improvement given that the general rule would be sound and it doesn't seem like it would be any harder to implement.

@kripken
Copy link
Member

kripken commented Dec 5, 2019

@lukewagner

Personally my intuition matches @aheejin's, that a tee is mentally equivalent to a set and a get, but I definitely see your point as well.

More importantly, about that last example,

(local.set $local-of-general-type
 (produce-specific-type)
)
(local.get $local-of-general-type)

=>

(local.tee $local-of-general-type
 (produce-specific-type)
)

That's something optimizers currently do, and it's very useful (as is the reverse, for other reasons). If we make those two things not 100% equivalent then that will make things more complex.

At the same time I get your point from earlier,

(use-specific-type
 (local.tee $local-of-general-type
  (produce-specific-type)
 )
)

That is a nice compact pattern that your rule would allow, so there are definitely arguments on both sides here. But the former pattern, that @aheejin mentioned, is extremely common in practice, so I think it matters more, and I would support keeping things as they are.

@lukewagner
Copy link
Member Author

@kripken IIUC, the former rewrite pattern would continue to be allowed in the direction you indicate. Rewriting in the opposite (expanding) direction would also be possible if you make the local type more specific or (if that's not possible b/c of other uses) introduce a separate local. In general the more general rules should only introduce more opportunities to shrink to the optimizer.

@kripken
Copy link
Member

kripken commented Dec 5, 2019

@lukewagner

Yes, that direction would still work, but the type changed - that's why I wrote that things get "more complex". It makes it a little more complex to think about, and tools may need adjustment. Losing the simplicity of 100% equivalence seems a little unfortunate to me.

aheejin added a commit to aheejin/binaryen that referenced this issue Dec 6, 2019
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)

But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in WebAssembly#2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.

This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.

This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
aheejin added a commit to aheejin/binaryen that referenced this issue Dec 6, 2019
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)

But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in WebAssembly#2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.

This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.

This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
aheejin added a commit to aheejin/binaryen that referenced this issue Dec 8, 2019
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)

But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in WebAssembly#2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.

This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.

This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
aheejin added a commit to aheejin/binaryen that referenced this issue Dec 10, 2019
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)

But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in WebAssembly#2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.

This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.

This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
aheejin added a commit to WebAssembly/binaryen that referenced this issue Dec 13, 2019
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)

But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in #2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.

This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.

This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
@rossberg
Copy link
Member

There doesn't seem to be consensus for making this relaxation, and the current semantics is at least conservative. @lukewagner, are you fine with closing for the time being?

@lars-t-hansen
Copy link
Contributor

@lukewagner, ping.

@lukewagner
Copy link
Member Author

lukewagner commented Feb 1, 2020

(Sorry, I missed the first question.) Yes, I suppose it's fine for now.

@kripken
Copy link
Member

kripken commented Mar 19, 2021

(see some later discussion on this in WebAssembly/gc#201 ; as this is closed, I guess it makes sense to continue there)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants