[wasm] Implement partial backward branch support in the Jiterpreter #82756
… looping
Remove unnecessary slow write barrier from ldelem_ref
Update heuristic for backward branches
Add back branch success rate statistic
Tagging subscribers to 'arch-wasm': @lewing
Don't generate a loop for trace if all the back branch offsets are before its start offset
This PR adds partial support for backward branches to the jiterpreter.
Due to WASM's quirky approach to constrained control flow and loops, the code this generates is far from optimal, but it still produces a measurable speed-up (for example, ~22 sec/iter -> ~20 sec/iter for one of the benchmarks that regressed) and is a decent starting point to improve on.
This PR also removes a write barrier from ldelem_ref that didn't need to be there and was very expensive. That takes the regressed benchmark down further from ~20sec/iter to ~3sec/iter.
Additional statistics and a runtime option are added to go with the new backward branch support.
More detail on how it works:
- The trace is wrapped in a `loop` instruction so that we can transfer control back to the top.
- When a backward branch is taken, `eip` is updated and control is sent back to the top of the trace by branching to the `loop`. After that, execution will skip over blocks until it reaches the branch target.

Attentive readers will note that this algorithm is inefficient because we have to scan over blocks until we find the branch target: there's only one loop for the entire trace. I tried creating a separate loop for each backward branch target (which would allow us to jump directly to it), but its interaction with forward branches (which require the ability to jump forward to the end of a control region) made it too hard to get working, at least initially.
A better implementation of control flow would probably allow direct branching for both forwards and backwards control flow, or at least reduce the cost of the branches from what it is currently. But that implementation would likely require a second pass in the trace compiler and building some sort of CFG on the fly so I haven't started on it yet.
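As a rough illustration of the single-loop strategy described above, here is a minimal TypeScript model of the control flow. It is not the jiterpreter's code: the real implementation emits WASM `loop` and branch instructions, while this sketch simulates the same idea with a labeled `while` loop. The names `Block`, `State`, `runTrace`, and `demoBlocks` are hypothetical, and `eip` is modeled as a plain resume offset. The sketch assumes the trace has at least one block.

```typescript
// Hypothetical model: a trace is a list of blocks ordered by starting offset.
// A block returns a branch target offset, or undefined to fall through.
type State = { counter: number };
type Block = { offset: number; run: (state: State) => number | undefined };

// Runs the trace inside one outer loop (standing in for the WASM `loop`).
// Returns how many blocks were scanned past while looking for a resume point.
function runTrace(blocks: Block[], state: State): number {
    let eip = blocks[0].offset; // offset at which execution should resume
    let skipped = 0;
    traceLoop: while (true) {
        for (const block of blocks) {
            if (block.offset < eip) {
                // Not at the resume point yet, so skip over this block.
                skipped++;
                continue;
            }
            const target = block.run(state);
            if (target === undefined) continue; // fall through to next block
            eip = target;
            if (target <= block.offset) {
                // Backward branch: go back to the top of the single loop.
                continue traceLoop;
            }
            // Forward branch: keep scanning; blocks before eip get skipped.
        }
        return skipped; // fell off the end of the trace
    }
}

// Demo: block 10 increments a counter, block 20 branches back to 10
// until the counter reaches 3.
const demoBlocks: Block[] = [
    { offset: 0, run: () => undefined },
    { offset: 10, run: (s) => { s.counter++; return undefined; } },
    { offset: 20, run: (s) => (s.counter < 3 ? 10 : undefined) },
];

const demoState: State = { counter: 0 };
const skippedCount = runTrace(demoBlocks, demoState);
console.log(demoState.counter, skippedCount); // prints "3 2"
```

Each of the two backward branches in the demo restarts at the top and scans past the block at offset 0 before reaching the target, which is the per-branch scanning cost noted above. With more blocks ahead of the target, that cost grows accordingly.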