Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad interaction with tuples and local.tee using the value type #3704

Open
kripken opened this issue Mar 18, 2021 · 10 comments
Open

Bad interaction with tuples and local.tee using the value type #3704

kripken opened this issue Mar 18, 2021 · 10 comments

Comments

@kripken
Copy link
Member

kripken commented Mar 18, 2021

Investigating the feasibility of using the type of local.tee's value, instead of the local type, for the local.tee (which most of us prefer I think, but we were worried it might not be easy), I ran into this issue:

(module
 (type $N=>N (func))
 (func $0 (result f32 (ref null $N=>N))
  (local $temp (f32 funcref)) ;; tee through a supertype (funcref instead of the specific type)
  (local.tee $temp
   (tuple.make
    (f32.const 0)
    (ref.null $N=>N)
   )
  )
 )
)

Roundtripping that fails. What happens is that the tuple $temp has a funcref, a supertype of the specific one. If the tee has the type of the value, and not of the local, then this module validates. But when we roundtrip our binary writer first emits a local for each part of a tuple local, so we end up using funcref and not the specialized type, but then we fail when we try to create a tuple for the function return:

(module
 (type $none_=>_none (func))
 (type $none_=>_f32_ref?|none_->_none| (func (result f32 (ref null $none_=>_none))))
 (func $0 (result f32 (ref null $none_=>_none))
  (local $0 f32)
  (local $1 funcref)
  (local $2 f32)
  (tuple.make
   (local.tee $0
    (block (result f32)
     (local.set $2
      (f32.const 0)
     )
     (local.set $1
      (ref.null $none_=>_none)
     )
     (local.get $2)
    )
   )
   (local.get $1) ;; this has type funcref, but we need the specialized function type
  )
 )
)

It seems like a simple mapping of tuple locals to locals won't work, and the mapping needs to take into account the actual types assigned to the locals? Sounds more complex.

cc @tlively

@tlively
Copy link
Member

tlively commented Mar 18, 2021

Yep, and the same tuple local may participate in multiple tees with different specific types. The code currently maps tuple index => scratch local, but it would need to be updated to map tuple index, type => scratch local. The patch probably wouldn't be too bad, but it's definitely more complex.

@kripken
Copy link
Member Author

kripken commented Mar 18, 2021

Thanks for confirming @tlively !

Ok, I think I have a general idea of the situation now. Aside from this issue, the only other problem I can find in binaryen when switching tee's type from the local to the value is in the Flatten pass, which does the pattern of "replace a tee with a set and a get", which is no longer valid. Flatten is pretty easy to fix (at the cost of more code, but later opts should remove it), and the rest of the changes and refactoring as well. Can see that here: https://github.com/WebAssembly/binaryen/compare/spectee?expand=1

So the one tricky thing seems to be the tuple problem mentioned here. That would indeed be more complex, and also it seems like it would require us to scan the function twice (first to find those types). But that should probably not block things. However, it might not be a quick change we can do to speed up testing as was hoped in a discussion today with @jakobkummerow @manoskouk

@kripken
Copy link
Member Author

kripken commented Mar 18, 2021

cc @aheejin - I think I remember you had thoughts on this the last time it came up?

(context here is potentially switching the type of local.tee from the type of the local to the value, which could be more specific.)

@aheejin
Copy link
Member

aheejin commented Mar 19, 2021

I think the spec disallows this?

Binaryen used to use the value's type for local.tee's type, and it didn't matter in the MVP, but when I first introduced subtyping, it started to matter because local.tee's type was not equivalent to the combination of local.set and local.get, and there were passes that did this transformation and vice versa. So I changed local.tee's type to its local's type in #2511.

The example code I used in #2511 was this:

(func $test (result funcref) (local $0 anyref)
  (local.tee $0
    (ref.func $test)
  )
)

This code, and your code in this issue, should not validate by the spec. And last time I checked at the end of 2019 when I was working on subtyping, this code was (correctly) rejected by the spec interpreter and wabt. (Since then wabt has removed support for anyref after we changed the reference types spec so we can't test it anymore in wabt.) But it was somehow accepted by v8. I just checked now and it is still accepted in v8. Anyway according to the current spec this should not validate.

Luke once suggested to change the spec to make local.tee's type the value's type, but we disagreed on that.

@kripken
Copy link
Member Author

kripken commented Mar 19, 2021

@aheejin

Sorry, I should have given more context. Yes, I think you're right, this should not validate atm. There has been some confusion about it, including some GC prototypers (including v8 as you noticed) getting this wrong. It actually looks like most people implemented the other way, rather than what the spec says. In a conversation today we wondered if it was maybe easier to just go with that, since it does include more type information which seems better. So I spent some time to see how hard it would be to implement in Binaryen (and my conclusion is that it's not that hard, with the largest downside being the issue of tee'd tuples).

I'm wondering if I've missed something, and whether given the above it makes sense to try to do this now. My hope was it was easier, so now I wonder if maybe it can wait for later.

@jakobkummerow
Copy link
Contributor

I'm struggling to understand if there is any case where changing local.tee to maintain the type of the value on the stack would break anything. I understand that local.tee $x would then not be equivalent to a local.set $x; local.get $x sequence any more, but are there any cases where this lack of strict equivalence would be bad? We have implicit upcasting everywhere, so I would think that keeping the possibly-more-specific type around a little longer can't hurt -- worst case, some future instruction will rely on implicit upcasting to the possibly-more-general type that local.tee would produce given the current spec.

As far as V8's implementation is concerned: we have recently fixed that, see: https://chromium-review.googlesource.com/c/v8/v8/+/2756210/8/src/wasm/function-body-decoder-impl.h#3085

(FWIW, I think part of the confusion comes from the fact that the current spec reads local.tee x : [t] → [t], which looks like it preserves the topmost stack value's type, and you have to read the separate line Let t be the value type C.locals[x] to realize that when the actual type on the stack is a subtype of t, then local.tee itself relies on the implicit upcasting I mentioned above.)

@tlively
Copy link
Member

tlively commented Mar 19, 2021

@jakobkummerow I think you're right that the refined semantics can't possibly be worse as far as the final binary goes. The problem is that Binaryen performs transformations that are valid under the unrefined semantics but incorrect under the refined semantics. Specifically, it replaces a tee with a set+get of a local with the same type as the originally teed local. This is simply wrong under the refined semantics, so Binaryen would have to be updated to use the teed value's type in this situation instead.

@kripken
Copy link
Member Author

kripken commented Mar 19, 2021

Yes to both the last 2 comments.

And it looks like the 2 things we need to fix for that are Flatten and tuple local serialization. (But there could be more that I and the fuzzer have missed somehow.) I think both are feasible. It's more a question of whether it's worth the work right now.

@aheejin
Copy link
Member

aheejin commented Mar 19, 2021

So are you trying to change the spec? I don't have a specific opinion on that then.

By the way, should we also fix SimplifyLocals? I don't remember exactly but my work log says when I was working on subtyping SimplifyLocals and Flatten were the two passes which had problems with the Binaryen local.tee typing rule then, which used the value's type.

@kripken
Copy link
Member Author

kripken commented Mar 19, 2021

Yes, we thought perhaps it made sense to change the spec. Since most people prefer the other way, and some implementations assume it.

I have not seen issues in SimplifyLocals. That does the opposite rule (set+get into tee) but maybe there is some part of the pass I'm forgetting. It's possible it was a combination of SimplifyLocals + some other bug that was fixed meanwhile? (Or maybe I and the fuzzer are just missing it now...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants