Relax the block check in OperatorsReader<'_>
#2202
Conversation
This commit relaxes a check added in bytecodealliance#2134 which maintains a stack of frame kinds in the operators reader, in addition to the validator. The goal of bytecodealliance#2134 was to ensure that modules that are syntactically invalid per the spec are caught in the parser without needing the validator, but investigation in bytecodealliance#2180 has shown that this check is the source of at least some of a performance regression. The change here relaxes the check so that the spec tests still pass while making the infrastructure cheaper.

The reader now maintains just a small `depth: u32` counter instead of a stack of kinds. This means that the reader can still catch invalid modules such as instructions-after-`end`, but the validator is required to handle situations such as `else` outside of an `if` block. This required some adjustments to tests as well as some workarounds for the upstream spec tests that assert legacy exception-handling instructions are malformed, not invalid.
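Roughly, the relaxed check amounts to something like the following sketch. This is an illustration of the idea only, not the actual `OperatorsReader` code; the `FrameCheck` name and `Op` enum here are made up for the example.

```rust
// Simplified sketch of the relaxed check described above; this is an
// illustration of the idea, not the actual `OperatorsReader` code.
struct FrameCheck {
    // One counter replaces the previous stack of frame kinds. It starts at 1
    // for the function's implicit outermost frame.
    depth: u32,
}

enum Op { Block, Loop, If, Else, End, Nop }

impl FrameCheck {
    fn visit(&mut self, op: &Op) -> Result<(), &'static str> {
        // Once the final `end` has dropped the depth to zero, any further
        // instruction is still rejected by the reader itself.
        if self.depth == 0 {
            return Err("operators remaining after end of function");
        }
        match op {
            Op::Block | Op::Loop | Op::If => self.depth += 1,
            Op::End => self.depth -= 1,
            // With no stack of kinds, the reader cannot tell whether this
            // `else` actually belongs to an `if`; that check is now the
            // validator's job.
            Op::Else | Op::Nop => {}
        }
        Ok(())
    }
}
```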
@alexcrichton Thank you for trying to come up with a solution for this! :)
Hmm... Honestly, I've really tried to replicate a slowdown in parsing individual large modules, and to date my measurements (#2180 (comment)) were that parsing alone doesn't seem to have gotten slower when we're talking about parsing a single huge module, e.g. https://github.com/turbolent/w2c2/blob/main/examples/clang/clang.wasm, vs. something like the testsuite, which is about parsing thousands of tiny modules. So my preference would be to find and be able to replicate the problem first, because I think that will help inform how to fix it, and, e.g., where it's worth complicating the story around spec adherence.

My view is that the spec defines a notion of binary well-formedness in chapter 5, and validity in chapter 3, and it would be nice if the parser enforced chapter 5 by itself and the validator enforced chapter 3, rather than just saying that the parser+validator is going to jointly enforce the requirements of chapters 5+3. This is particularly relevant to me because we're building a development environment that enforces syntactic well-formedness on entry (and shows validation errors but doesn't prevent them), so the distinction matters to us for that reason.

All things being equal, chapter 5 does seem pretty clear that the binary module corresponding to …
To be a little more concrete: as far as I can tell so far, the slowdowns seem to be in "parsing and validating thousands of tiny modules" but not in "parsing one gigantic module." But I'm not sure if we're all on the same page about this? That's certainly what I was trying to figure out in #2180. :-)

If we are on the same page about where the slowdown is, then I guess my thinking would be that it's worth it to reduce the startup overhead of the Parser (maybe by being able to share its heap allocations across runs, like the Validator can do?), but I'd be less eager to use startup overhead as a reason to reclassify a module like …
@alexcrichton Did you also use …? I ran the benchmarks on an …
I also wondered about this. It was just a hypothesis so far, since I could not prove that it really behaves like this. If true, then indeed we should see nice speed-ups by introducing …
I fully agree with you. That would certainly be better, and we should ideally find a set of changes that allows for this while keeping good performance.
@alexcrichton has shown in a prior post which parts of the parser cause which slowdowns: #2180 (comment). I am not saying that the mentioned fixes should be implemented, but they provide a clue as to what is actually causing the regressions, and at least in that comment it doesn't look like the regressions stem entirely from the startup of the parser.
FWIW I do agree with @keithw that I'm not really certain what regression/benchmark is being optimized for here. I don't dispute that criterion is printing regressions in wasmparser/wasmi, but the numbers currently don't make much sense. The main number I at least personally care about is the time it takes to parse + validate a module. In wasmparser, using a variety of modules, the regression is in the 5% range, while wasmi shows it's in the 30% range. I don't think any of us know at this time where such a difference comes from.

I'm the same as @keithw in that I've been unable to show anything more than a 5% regression when simply validating a module (or within that ballpark at least). I would also agree that the benchmark of just iterating over all tests in this repository isn't necessarily representative, so while it can be useful to track, I don't think it's worthwhile to live and die by that number.

@keithw after thinking about this for a while I'm now at the point where I think I have a more crisp articulation of my thinking. Right now we've effectively got two competing goals:
My understanding, though, is that this distinction of whether an invalid module is syntactically invalid or semantically invalid is basically an artifact of the way the spec is authored and/or of the spec interpreter. I'm also under the impression that web engines (and maybe no other engine either?) do not worry about distinguishing between these two kinds of modules. To me this creates a tension, because it means that there's not much recourse for putting pressure on the specification itself to handle a concern such as this.

This puts wasmparser in a bit of an unfortunate spot: ideally I'd like it to follow the spec exactly, but that's pretty unmaintainable from a 100%-compliance perspective (insofar as exactly matching whether invalid modules are semantically invalid or syntactically invalid). We already have some affordances for where wasmparser differs from the spec, and this PR could arguably be classified as such a change too.

Ok, that's all a bunch of background which leads me to my articulation: I'm not sure where to draw the line of where wasmparser can/should be different from the spec. Is the current state of "just a little different" ok? Is this PR's state of "ok, a bit more significantly different" ok? Personally I don't really feel equipped to answer these questions myself, as it feels like we're stuck between a rock and a hard place with the way the spec is developed and tested.

Do you think that it would be worthwhile to bring these concerns to the CG perhaps? For example, these are two possible (and radical) ways the development of the spec could hypothetically shift:
In any case, those are at least some more thoughts (sorry for so many words on this somewhat niche topic). I'll clarify again that I personally have no predetermined outcome I'd like to reach at this time, so if everyone's willing this is definitely a topic I'd like to dig into and explore rather than rush to any particular solution. Basically I'm not trying to land this PR ASAP or anything like that.
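(For concreteness, the "time it takes to parse + validate a module" measurement mentioned above might be sketched roughly as below, assuming wasmparser's current public `Parser`/`Validator` API; this is an ad-hoc illustration, not the actual benchmark harness from #2180.)

```rust
use std::time::Instant;
use wasmparser::{Parser, Payload, Validator};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // E.g. the w2c2 clang.wasm module mentioned above.
    let wasm = std::fs::read("clang.wasm")?;

    // Parse only: walk every payload and decode every operator,
    // without validating anything.
    let start = Instant::now();
    for payload in Parser::new(0).parse_all(&wasm) {
        if let Payload::CodeSectionEntry(body) = payload? {
            let mut ops = body.get_operators_reader()?;
            while !ops.eof() {
                ops.read()?;
            }
        }
    }
    println!("parse only:       {:?}", start.elapsed());

    // Parse + validate: the number end users mostly care about.
    let start = Instant::now();
    Validator::new().validate_all(&wasm)?;
    println!("parse + validate: {:?}", start.elapsed());
    Ok(())
}
```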
Indeed, that is very confusing to me as well. I tried to keep the refactorings for the … minimal. One area where Wasmi uses …

I will re-benchmark Wasmi and take a deeper dive to see what is going on as soon as time and capacity permit. I will inform you about my findings (if any) in #2180.
Hi @alexcrichton, I don't want to derail the practical questions of dealing with the performance issues in the code, but I feel like I owe you a response on these more policy-oriented questions; I'm sorry I missed this the first time.

One high-level response would be to ask if you think it would be a good idea to start a #wasm-tools and #git-wasm-tools topic on the Bytecode Alliance Zulip. That might be a useful place to have longer-horizon discussions vs. GitHub PR/issue threads, especially among stakeholders from different projects.
I think if we're looking for a semi-principled position, let me shop around a proposal for an intermediate place to "draw the line," which happens to match exactly where wasm-tools is right now. Namely: I think it's fine if wasmparser (without validation) over-accepts some malformed binary modules, as long as those modules still have a plausible representation in the abstract syntax defined by Chapter 2. That means these binaries can be translated to well-formed text, back to binary, and back to the same text as before. In other words, the exceptions are only relevant to the binary format.

So the current affordances (e.g. missing DataCount section, malformed mutability flags, integer too large) are all fine under that rubric, because those modules can still be represented in the abstract syntax, and they don't cause wasmprinter to output malformed text. By contrast, if wasmparser allowed …

(For context, my group is writing a structured editor to teach freshman CS where the editor enforces that the input buffer is always syntactically well-formed WAT, so when I picture myself in front of a classroom of 300 angry freshmen trying to defend these distinctions to people who have never programmed before, I'm hoping for a certain moral clarity. I realize this isn't a typical use-case... I think you're totally right that a browser doesn't have to worry about these distinctions, but the browser engines aren't trying to be general Wasm toolkits; they don't have text-mode parsers, and I don't think they worry about roundtripping, or about performing manipulations on well-formed Wasm modules that might take them through an invalid state.)
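(Mechanically, that roundtrip rubric could be checked with something like the following sketch, using the `wasmprinter` and `wat` crates and assuming an `anyhow` dependency; the `roundtrips` helper is hypothetical, not an existing wasm-tools API.)

```rust
// Sketch of the roundtrip rubric above, using the `wasmprinter` and `wat`
// crates; the `roundtrips` helper is hypothetical, not an existing API.
fn roundtrips(wasm: &[u8]) -> anyhow::Result<bool> {
    // binary -> text: this only succeeds if the module has a plausible
    // representation in the Chapter 2 abstract syntax for the printer to
    // render as well-formed text.
    let text1 = wasmprinter::print_bytes(wasm)?;
    // text -> binary -> text: the second rendering should match the first.
    let binary2 = wat::parse_str(&text1)?;
    let text2 = wasmprinter::print_bytes(&binary2)?;
    Ok(text1 == text2)
}
```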
I think a concern these would have to overcome is that most of the spec (including the validation and execution rules) is specified in terms of the abstract syntax of chapter 2. So if we wanted to allow stuff that isn't currently representable in the abstract syntax, that syntax would have to be widened to allow it, and then the validation rules would have to be adapted to account for it. E.g. looking at 2.4.8 (https://webassembly.github.io/spec/core/syntax/instructions.html#control-instructions), the abstract syntax for …
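(For reference, a Rust-flavored paraphrase of that control-instruction abstract syntax, illustrative only and not a wasmparser type, might look like the sketch below; the point is that `else` is folded into the `if` node, so a free-standing `else` has no representation in the abstract syntax at all.)

```rust
// Rust-flavored paraphrase of the chapter 2.4.8 abstract syntax for control
// instructions; an illustration, not a wasmparser type. Note that `else` is
// not an instruction of its own: it is part of the `If` node, so a
// free-standing `else` simply cannot be expressed here.
enum BlockType {
    Empty,
    // ... value-type and type-index forms elided
}

enum Instr {
    Nop,
    Unreachable,
    Block { ty: BlockType, body: Vec<Instr> },
    Loop { ty: BlockType, body: Vec<Instr> },
    If { ty: BlockType, then: Vec<Instr>, els: Vec<Instr> },
    // ... remaining instructions elided
}
```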
I think this would be an excellent idea (working on making that web-public).
Ok, that all sounds pretty reasonable to me. I agree with you that changing …

Overall I'm personally warming up to #2228, so my current plan is to close this when that PR merges, but I'll leave this open in the meantime. (Thanks again for all the discussion on this!)
Closing in favor of #2228.