Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm-parser: should typed select parse with result arity ≠ 1? #870

Closed
nagisa opened this issue Jan 4, 2023 · 3 comments · Fixed by #872
Closed

wasm-parser: should typed select parse with result arity ≠ 1? #870

nagisa opened this issue Jan 4, 2023 · 3 comments · Fixed by #872

Comments

@nagisa
Copy link
Contributor

nagisa commented Jan 4, 2023

In the webassembly specification select with any number of result types is allowed by the binary representation, and is rejected in validation rather than the binary encoding.

wasmparser::VisitOperator instead models this kind of instruction as visit_typed_select with a single result argument, and therefore is forced to reject this encoding earlier, while decoding the binary representation:

0x1c => {
let results = self.read_var_u32()?;
if results != 1 {
return Err(BinaryReaderError::new(
"invalid result arity",
self.position,
));
}
visitor.visit_typed_select(self.read()?)
}

In particular WebAssembly/spec@ef01de5 has added tests that do successfully parse (and, I think, validate? Not sure.) which triggers this condition in wasmparser.

Should wasmparser pass this test? Should we allow decoding (and encoding) typed selects with result arity ≠ 1?

@alexcrichton
Copy link
Member

From the commit referenced it looks like the binary representation of all of the select instructions is still just either untyped or typed as one result? Is there a particular statement you think should be parsed but isn't?

I suspect there may be an issue parsing (result) (result i32) and the other way around, but that's more of a text issue than a binary issue.

@nagisa
Copy link
Contributor Author

nagisa commented Jan 4, 2023

Nope, parsing the text actually works okay (or at least wast does an okay job at it).

The way I have this set up is that I take wast::Module or whatever call encode on it and then throw the bytes at wasmparser. wasmparser is what fails to handle these 2 specific functions.

  (func (result i32) unreachable select (result i32) (result))
  (func (result i32) unreachable select (result i32) (result) select)

I haven’t looked into this but I believe in either case the first of the selects ends up being encoded as 0x1C 0x00 (that is, a typed_select with 0 result types) and then the if condition in the parsing code fails.

I should actually look into what makes the interpreter stomach these functions fine. I’m still not sure whether this passes the interpreter’s validator (and if so – whether than has anything to do with the unreachable) or if it just so happens that the interpreter does not validate this code at all for some reason. I’ll get back to this some time soon.

It might also be the case that encode works weird, but then there are test cases with results specified in either order.

@alexcrichton
Copy link
Member

You can inspect a bit with wasm-tools dump like so:

$ wasm-tools dump <<< "(func (result i32) unreachable select (result i32) (result))"
  0x0 | 00 61 73 6d | version 1 (Module)
      | 01 00 00 00
  0x8 | 01 05       | type section
  0xa | 01          | 1 count
  0xb | 60 00 01 7f | [type 0] Func(FuncType { params: [], returns: [I32] })
  0xf | 03 02       | func section
 0x11 | 01          | 1 count
 0x12 | 00          | [func 0] type 0
 0x13 | 0a 07       | code section
 0x15 | 01          | 1 count
============== func 0 ====================
 0x16 | 05          | size of function
 0x17 | 00          | 0 local blocks
 0x18 | 00          | unreachable
 0x19 | 1c 00       | ??
 0x1b | 0b          | end

so you're right it's popping out as 0x1c 0x00 which might be a bug in the AST-to-binary translation since that should probably be 1 with the i32 result type.

alexcrichton added a commit to alexcrichton/wasm-tools that referenced this issue Jan 5, 2023
Previously multiple `(result)` blocks would erroneously discard the
previously parsed result blocks, so this commit fixes the textual
parsing of this instruction to collect all of the `(result)` blocks
instead of only the final one.

Closes bytecodealliance#870
alexcrichton added a commit to alexcrichton/wasm-tools that referenced this issue Jan 5, 2023
Previously multiple `(result)` blocks would erroneously discard the
previously parsed result blocks, so this commit fixes the textual
parsing of this instruction to collect all of the `(result)` blocks
instead of only the final one.

Closes bytecodealliance#870
alexcrichton added a commit that referenced this issue Jan 5, 2023
Previously multiple `(result)` blocks would erroneously discard the
previously parsed result blocks, so this commit fixes the textual
parsing of this instruction to collect all of the `(result)` blocks
instead of only the final one.

Closes #870
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants