Move RTTs to Post-MVP #306

rossberg · 2022-06-08T15:24:47Z

This PR removes explicit RTTs and moves them to Post-MVP. Updates design docs, interpreter, and tests.

The rtt $t type and the rtt.canon instruction are removed and all RTT-consuming instructions (allocations and casts) are replaced with "canonical" versions that embed the semantics of rtt.canon. RTT-dependent cast instructions now need a static type as an additional immediate.

Obviously, this is a breaking change across the board, so I did not attempt to maintain backwards compat in details such as opcode assignments this time.

tlively

Thanks for doing this!

My one big comment is that I think it would be better for the MVP instructions not to contain _canon in their names. Understanding what _canon means would require knowing some of the context of the development of the proposal, but ideally the proposal would not require such context to fully understand. Also, having _canon in the name seems to imply that the instructions will always work for any implicit rtt.canon, but I think it would make sense to leave ourselves space to consider whether the sugar should be restricted to only a subset of possible rtt.canon cases. I would prefer to differentiate the post-MVP instructions by adding _rtt to their names instead.

tlively · 2022-06-08T18:23:41Z

interpreter/README.md

  br_on_non_null <var>
  br_on_non_<castop> <var>
-  br_on_cast_fail <var>
+  br_on_cast_canon_fail <var> (type <var>)


What is (type <var>) here? Why isn't it added to other instructions like struct.new below?

This is the type to cast to, as it can no longer be derived from the RTT operand. Most other instructions already had a (previously redundant) explicit type (cf the discussion regarding type annotations).

The syntax is chosen to mirror other cases like call_indirect, and to avoid confusion with the branch label.

I'd be fine with switching to (type _) syntax everywhere, it would be more consistent – but also more verbose. I justified the difference to myself by the prevalent choice that indices associated with the primary object kind of an instruction do not have a sort prefix (e.g., the global index in global.get, the table index in call_indirect, the label index in branches) whereas secondary ones do (e.g., the type index in call_indirect and br_on_cast, the memory index in data). And one can argue that the type in GC instructions represents the primary object kind, if that makes sense. :)

tlively · 2022-06-08T18:29:46Z

interpreter/syntax/ast.ml

@@ -127,7 +127,7 @@ type vec_storeop = (vec_type, unit) memop
 type vec_laneop = (vec_type, pack_size) memop * int

 type initop = Explicit | Implicit
-type castop = NullOp | I31Op | DataOp | ArrayOp | FuncOp | RttOp
+type castop = NullOp | I31Op | DataOp | ArrayOp | FuncOp | RttOp of int32 Source.phrase


What is this new int32 for?

See above, this is the now necessary type immediate.

tlively · 2022-06-08T18:36:57Z

proposals/gc/Post-MVP.md

+* `rtt <typeidx>` is a new heap type that is a runtime representation of the static type `<typeidx>`
+  - `heaptype ::= ... | rtt <typeidx>`
+  - `rtt t ok` iff `t ok`
+  - the reference type `(rtt $t)` is a shorthand for `(ref (rtt $t))`


Would this be a text-only shorthand? It seems slightly inconsistent that this shorthand corresponds to a non-nullable reference while our other shorthands correspond to nullable references, but I don't think that's a big deal.

It's what we previously had, including in the binary format. I agree it's a minor inconsistency, but you rarely ever want nullable RTTs.

tlively · 2022-06-08T18:39:40Z

proposals/gc/Post-MVP.md

+```
+With extensions like [type parameters](#type-parameters), this instruction will become more nuanced and will involve additional RTT operands if `$t` has type paramters.
+
+Correspondingly, all allocation and cast instructions will get counter parts taking an explicit RTT operand. The MVP's canonical versions can be reinterpreted as the combination of the explicit ones whose RTT operand is created with `rtt.canon`:


In the presence of type parameters, would the MVP instructions also gain additional parameters for types where rtt.canon would have additional parameters?

I hope not, as that would create a lot of extra complexity in the instruction set and also be somewhat inconsistent (the primary RTT operand is implicit but the others are not?).

But it is this kind of potential extra complexity and non-trivial warts that makes me so uneasy. ;)

rossberg

The idea is that the generic mnemonics ought to be reserved for the instructions that can actually handle the general case. I’m not wed to the particular choice of suffix (better suggestions welcome!), but given that we cannot change them later, I do think that the mnemonics need to reflect that these instructions are limited to a special case in the future.

rossberg · 2022-06-09T08:03:32Z

interpreter/README.md

  br_on_non_null <var>
  br_on_non_<castop> <var>
-  br_on_cast_fail <var>
+  br_on_cast_canon_fail <var> (type <var>)


This is the type to cast to, as it can no longer be derived from the RTT operand. Most other instructions already had a (previously redundant) explicit type (cf the discussion regarding type annotations).

The syntax is chosen to mirror other cases like call_indirect, and to avoid confusion with the branch label.

I'd be fine with switching to (type _) syntax everywhere, it would be more consistent – but also more verbose. I justified the difference to myself by the prevalent choice that indices associated with the primary object kind of an instruction do not have a sort prefix (e.g., the global index in global.get, the table index in call_indirect, the label index in branches) whereas secondary ones do (e.g., the type index in call_indirect and br_on_cast, the memory index in data). And one can argue that the type in GC instructions represents the primary object kind, if that makes sense. :)

rossberg · 2022-06-09T08:04:35Z

interpreter/syntax/ast.ml

@@ -127,7 +127,7 @@ type vec_storeop = (vec_type, unit) memop
 type vec_laneop = (vec_type, pack_size) memop * int

 type initop = Explicit | Implicit
-type castop = NullOp | I31Op | DataOp | ArrayOp | FuncOp | RttOp
+type castop = NullOp | I31Op | DataOp | ArrayOp | FuncOp | RttOp of int32 Source.phrase


See above, this is the now necessary type immediate.

rossberg · 2022-06-09T08:07:04Z

proposals/gc/Post-MVP.md

+* `rtt <typeidx>` is a new heap type that is a runtime representation of the static type `<typeidx>`
+  - `heaptype ::= ... | rtt <typeidx>`
+  - `rtt t ok` iff `t ok`
+  - the reference type `(rtt $t)` is a shorthand for `(ref (rtt $t))`


It's what we previously had, including in the binary format. I agree it's a minor inconsistency, but you rarely ever want nullable RTTs.

rossberg · 2022-06-09T08:13:47Z

proposals/gc/Post-MVP.md

+```
+With extensions like [type parameters](#type-parameters), this instruction will become more nuanced and will involve additional RTT operands if `$t` has type paramters.
+
+Correspondingly, all allocation and cast instructions will get counter parts taking an explicit RTT operand. The MVP's canonical versions can be reinterpreted as the combination of the explicit ones whose RTT operand is created with `rtt.canon`:


I hope not, as that would create a lot of extra complexity in the instruction set and also be somewhat inconsistent (the primary RTT operand is implicit but the others are not?).

But it is this kind of potential extra complexity and non-trivial warts that makes me so uneasy. ;)

jakobkummerow · 2022-06-09T18:48:35Z

Regarding opcodes, since we've had the RTT-free instructions implemented in both V8 and Binaryen for a long time now, we also have an existing set of opcode assignments (that don't overlap with the other opcodes), see here. It would be nice to maintain them for now? Not a big issue though, we'll need to make some backwards-incompatible changes soon anyway and could batch up the opcode changes with those.

The state in that doc is in part due to this proposal's history: you may recall that in days long gone we actually had both struct.new and struct.new_with_rtt, then at some point dropped the former, then (in V8/Binaryen at least) reintroduced it, now we're dropping the latter. This history also explains the textual names used there: for the allocating opcodes, .new is with implicit RTT and .new_with_rtt is with explicit RTT. We've never renamed them to preserve backwards compatibility of the text format. For the cast/test instructions, there is no such history, so we picked the _static (as in: "statically typed") suffix when we added the implicit-RTT versions. (I don't care too much about renaming some or all of the instructions, and it would certainly be nice to make them consistent; I'm just explaining how we arrived at what that doc is saying today.)

A possible (follow-up?) simplification could be to merge ref.cast_canon, ref.test_canon, br_on_cast_canon and br_on_cast_canon_fail into ref.as <ht>, ref.is <ht>, br_on <ht> and br_on_not <ht>, respectively, as issue #274 sketches out. I don't see a reason to keep them separate, and merging them might help assuage concerns people have about the overall size of the instruction set once we re-introduce explicit-RTT versions later. That said, I don't feel strongly about it, and if there are reasons to keep them separate, we can certainly do that too.
(The fact that the *_canon group currently takes a dataref rather than an anyref as input is not providing any benefits, as discussed at length on that issue, so that's not a reason to keep the distinction. Effectively, to cast to type $T, you need to use ref.as iff $T is generic and ref.cast_canon iff $T is non-generic, and I don't see this distinction providing any practical benefits. There is no $T for which ref.as $T and ref.cast_canon $T are both valid, so no expressiveness is lost if we merge them.)

rossberg · 2022-06-13T14:58:49Z

Yes, refactoring the cast operators is a follow-up. But it's gonna be affected by the anyref/funcref/externref subtyping questions, so it makes more sense to wait with it until we have resolves that.

This commit: * Adds new instructions that have been added to the proposal * Removes RTT producing instructions * Modifies instructions that consumed RTT values to no longer take the RTT The last point is challenging because there are no agreed upon names or encodings for the modified instructions yet [1]. I tried to choose something reasonable, and will come back to fix them once this is figured out. [1] WebAssembly/gc#306

* wast: Add explicit recursion groups for type definitions The GC proposal adds the ability to define multiple types inside of a recursion group [1]. There is no text syntax for this yet, so I tried to pick a reasonable starting point. e.g. (rec (type (struct)) (type (array))) [1] https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md#type-definitions * wast: Add declared subtyping to type definitions The GC proposal type system has declared subtyping where a definition indicates the parent type(s) it is a subtype of [1]. For now a type is restricted to just one parent type. There is no text format for this yet, so I tried to pick a reasonable starting point. e.g. (sub $parent (type (struct))) [1] https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md#type-definitions * wast: Remove rtt value type * wast: Update GC instructions This commit: * Adds new instructions that have been added to the proposal * Removes RTT producing instructions * Modifies instructions that consumed RTT values to no longer take the RTT The last point is challenging because there are no agreed upon names or encodings for the modified instructions yet [1]. I tried to choose something reasonable, and will come back to fix them once this is figured out. [1] WebAssembly/gc#306 * wast: Add arrayref heap type This heap type was added upstream in the GC proposal. * wast: Update GC tests * wast: Update array instructions. * array.init is now array.new_fixed * array.init_from_data is now array.new_from_data * array.new_from_elem was added for creating an array from an elem segment

rossberg · 2022-07-14T11:21:54Z

To avoid conflicts with followup PRs, taking the liberty to land this. We can bikeshed the mnemonics later.

https://bugs.webkit.org/show_bug.cgi?id=243079 Reviewed by Justin Michaud. This patch removes RTT support from the Wasm GC implementation. This part of the spec was postponed to later post-MVP phases in recent decisions. PR for RTT removal in spec: WebAssembly/gc#306 Discussion issue: WebAssembly/gc#275 Instead of using RTTs, MVP instructions will use the type index for the information needed to allocate values or do casts. For example, array.new_canon $t : [t' i32] -> [(ref $t)] instead of array.new $t : [t' i32 (rtt $t)] -> [(ref $t)] * JSTests/wasm/gc/rtt.js: Removed. * JSTests/wasm/wasm.json: * Source/JavaScriptCore/bytecode/BytecodeList.rb: * Source/JavaScriptCore/llint/WebAssembly.asm: * Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp: (JSC::Wasm::AirIRGenerator::gRef): (JSC::Wasm::AirIRGenerator::tmpForType): (JSC::Wasm::AirIRGenerator::emitCCall): (JSC::Wasm::AirIRGenerator::AirIRGenerator): (JSC::Wasm::AirIRGenerator::gRtt): Deleted. (JSC::Wasm::AirIRGenerator::addRttCanon): Deleted. * Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp: (JSC::Wasm::B3IRGenerator::addRttCanon): Deleted. * Source/JavaScriptCore/wasm/WasmCallingConvention.h: (JSC::Wasm::WasmCallingConvention::marshallLocation const): * Source/JavaScriptCore/wasm/WasmFormat.h: (JSC::Wasm::isValueType): (JSC::Wasm::isValidHeapTypeKind): * Source/JavaScriptCore/wasm/WasmFunctionParser.h: (JSC::Wasm::FunctionParser<Context>::parseExpression): (JSC::Wasm::FunctionParser<Context>::parseUnreachableExpression): * Source/JavaScriptCore/wasm/WasmLLIntGenerator.cpp: (JSC::Wasm::LLIntGenerator::callInformationForCaller): (JSC::Wasm::LLIntGenerator::callInformationForCallee): (JSC::Wasm::LLIntGenerator::addArguments): (JSC::Wasm::LLIntGenerator::addRttCanon): Deleted. * Source/JavaScriptCore/wasm/WasmOperations.cpp: * Source/JavaScriptCore/wasm/WasmOperations.h: * Source/JavaScriptCore/wasm/WasmParser.h: (JSC::Wasm::Parser<SuccessType>::parseValueType): * Source/JavaScriptCore/wasm/WasmSlowPaths.cpp: * Source/JavaScriptCore/wasm/WasmSlowPaths.h: * Source/JavaScriptCore/wasm/js/WasmToJS.cpp: (JSC::Wasm::wasmToJS): * Source/JavaScriptCore/wasm/wasm.json: Canonical link: https://commits.webkit.org/252788@main

* wast: Add explicit recursion groups for type definitions The GC proposal adds the ability to define multiple types inside of a recursion group [1]. There is no text syntax for this yet, so I tried to pick a reasonable starting point. e.g. (rec (type (struct)) (type (array))) [1] https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md#type-definitions * wast: Add declared subtyping to type definitions The GC proposal type system has declared subtyping where a definition indicates the parent type(s) it is a subtype of [1]. For now a type is restricted to just one parent type. There is no text format for this yet, so I tried to pick a reasonable starting point. e.g. (sub $parent (type (struct))) [1] https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md#type-definitions * wast: Remove rtt value type * wast: Update GC instructions This commit: * Adds new instructions that have been added to the proposal * Removes RTT producing instructions * Modifies instructions that consumed RTT values to no longer take the RTT The last point is challenging because there are no agreed upon names or encodings for the modified instructions yet [1]. I tried to choose something reasonable, and will come back to fix them once this is figured out. [1] WebAssembly/gc#306 * wast: Add arrayref heap type This heap type was added upstream in the GC proposal. * wast: Update GC tests * wast: Update array instructions. * array.init is now array.new_fixed * array.init_from_data is now array.new_from_data * array.new_from_elem was added for creating an array from an elem segment

rossberg added 2 commits June 8, 2022 15:53

Defer explicit RTTs

d7526bf

Simplify type annotation syntax

7bc041b

rossberg requested a review from tlively June 8, 2022 15:24

Remove obsolete RTT value rep

2194ca2

tlively reviewed Jun 8, 2022

View reviewed changes

rossberg commented Jun 9, 2022

View reviewed changes

rossberg merged commit 79ae4ce into main Jul 14, 2022

tlively mentioned this pull request Jul 21, 2022

Pushing RTTs to post-MVP #275

Closed

takikawa mentioned this pull request Jul 22, 2022

[Wasm-GC] Remove RTT support WebKit/WebKit#2645

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move RTTs to Post-MVP #306

Move RTTs to Post-MVP #306

rossberg commented Jun 8, 2022

tlively left a comment

tlively Jun 8, 2022

rossberg Jun 9, 2022

tlively Jun 8, 2022

rossberg Jun 9, 2022

tlively Jun 8, 2022

rossberg Jun 9, 2022

tlively Jun 8, 2022

rossberg Jun 9, 2022

rossberg left a comment •

edited

rossberg Jun 9, 2022

rossberg Jun 9, 2022

rossberg Jun 9, 2022

rossberg Jun 9, 2022

jakobkummerow commented Jun 9, 2022

rossberg commented Jun 13, 2022

rossberg commented Jul 14, 2022

Move RTTs to Post-MVP #306

Move RTTs to Post-MVP #306

Conversation

rossberg commented Jun 8, 2022

tlively left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rossberg left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jakobkummerow commented Jun 9, 2022

rossberg commented Jun 13, 2022

rossberg commented Jul 14, 2022

rossberg left a comment •

edited