Replace try/catch/rethrow in unwind with unwind_guard #309

wravery · 2022-03-18T18:57:22Z

I noticed while investigating microsoft/cppgraphqlgen#222 that parse_tree implements unwind on the make_control<>::state_handler<> struct, and this results in creating a try/catch/rethrow block around every rule match in match_control_unwind. Every try/catch/rethrow seems to be consuming significant stack space, so if you raise/throw an exception from a relatively shallow (< 20) nested rule, it often exhausts the stack.
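
A minimal sketch of the pattern described above, assuming a hypothetical `demo_control` and `match_nested` (these are illustrative names, not PEGTL's actual code). It shows how each nesting level ends up with its own try/catch/rethrow block:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a control class that implements unwind().
struct demo_control
{
   static void unwind( std::vector< int >& stack ) noexcept
   {
      stack.pop_back();
   }
};

// Each nested "rule" pushes a frame and then matches its sub-rule inside a
// try/catch that calls unwind() and rethrows. One such block is active per
// nesting level while the exception propagates.
inline bool match_nested( std::vector< int >& stack, const int depth )
{
   stack.push_back( depth );
   try {
      if( depth == 0 ) {
         throw std::runtime_error( "parse error" );
      }
      const bool result = match_nested( stack, depth - 1 );
      stack.pop_back();
      return result;
   }
   catch( ... ) {
      demo_control::unwind( stack );
      throw;  // rethrow to the enclosing rule's catch block
   }
}

// Runs the nested match and reports how many frames remain after the throw.
inline std::size_t frames_left_after_throw( const int depth )
{
   std::vector< int > stack;
   try {
      match_nested( stack, depth );
   }
   catch( const std::runtime_error& ) {
      // swallowed here only so the helper can return the frame count
   }
   return stack.size();
}
```

In this sketch the exception is caught and rethrown once per level, so a throw from depth N passes through N + 1 catch blocks before it reaches the caller.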

This change makes it so parse_tree will only implement unwind on the Control type wrapper if either the Control or the Node type implements it. When parsing into a tree that doesn't implement either of them, the exception will be thrown once all the way from the initial raise to the caller of parse_tree::parse.

Since the unit tests use the default node which inherits basic_node, they still implement unwind. The unit tests also wrap the rule in a try_catch_type handler, though, so the exception doesn't bubble all the way up. Before I nailed down the constexpr bool definitions, I accidentally suppressed unwind for basic_node types in the unit tests as well, and the lack of stack.pop_back() calls resulted in the assert(stack.size() == 1) firing for the unit test. So that tells me this is incompatible with try_catch_type. That's the only thing I think might block it.
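
The compile-time check mentioned above could be sketched with a detection idiom along these lines. The names `has_unwind` and `needs_unwind` and the zero-argument `unwind()` signature are assumptions made for illustration, not the definitions used in this PR:

```cpp
#include <type_traits>
#include <utility>

// Primary template: assume T has no unwind() member.
template< typename T, typename = void >
inline constexpr bool has_unwind = false;

// Selected when the expression t.unwind() is well-formed.
// The zero-argument signature is an assumption made for brevity.
template< typename T >
inline constexpr bool has_unwind< T, std::void_t< decltype( std::declval< T& >().unwind() ) > > = true;

// Wrap with unwind only if the control or the node asks for it.
template< typename Control, typename Node >
inline constexpr bool needs_unwind = has_unwind< Control > || has_unwind< Node >;

struct plain_control {};
struct plain_node {};
struct unwinding_node { void unwind() noexcept {} };
struct derived_node : unwinding_node {};  // inherits unwind(), still detected

static_assert( !needs_unwind< plain_control, plain_node > );
static_assert( needs_unwind< plain_control, unwinding_node > );
static_assert( needs_unwind< plain_control, derived_node > );
```

The `derived_node` case mirrors the situation described for the unit tests: a node type that only inherits `unwind()` from a base class still counts as implementing it.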

d-frey · 2022-03-18T20:18:35Z

As you noticed, skipping the unwind() makes the assert(stack.size() == 1) fire. Conceptually, the state.pop_back() is necessary for the parse tree to work correctly; it can't just be skipped. In its current form, we therefore cannot accept this PR.

However, you do have a point here, all those try/catch-blocks are something to look into and they do provide an optimization opportunity. So, what could be done? I can see some potential by checking if a grammar rule (including all reachable sub-rules) can throw an exception. This could be used to skip unwind properly. I'll see if I can come up with something.

wravery · 2022-03-18T21:42:28Z

Yeah, partly this change is predicated on letting the exception out to the caller, in which case you don't need to unwind because the stack will just be deleted and the assert is never reached. But to rely on that, you need to know that there won't be any other catch blocks that turn an exception into a regular return, like try_catch_type.

What I'm probably going to do for my own project is copy enough of the parse_tree implementation to implement a non-unwinding version. My grammar and custom ast_node implementation don't require any try_catch_type handling or per-frame unwinding, and it made a pretty dramatic difference in how far I could get without overflowing. Originally it overflowed at a depth of 18 or so, only when throwing an exception from an if_must rule. Now it can successfully throw an exception without overflowing the stack from a depth of at least 100 (I stopped at that point; I'm also going to add a depth limit, since it can't remain completely unbounded).
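
One way such a depth limit could look is a small RAII counter, sketched below. `depth_guard_sketch`, `descend`, and `exceeds_limit` are hypothetical names for illustration and are not taken from PEGTL or cppgraphqlgen:

```cpp
#include <cstddef>
#include <stdexcept>

// Increments a shared depth counter on construction, decrements it on
// destruction, and throws if the configured limit would be exceeded.
class depth_guard_sketch
{
public:
   depth_guard_sketch( std::size_t& depth, const std::size_t limit )
      : m_depth( depth )
   {
      if( ++m_depth > limit ) {
         --m_depth;  // the destructor does not run if the constructor throws
         throw std::runtime_error( "nesting depth limit exceeded" );
      }
   }

   depth_guard_sketch( const depth_guard_sketch& ) = delete;
   depth_guard_sketch& operator=( const depth_guard_sketch& ) = delete;

   ~depth_guard_sketch()
   {
      --m_depth;
   }

private:
   std::size_t& m_depth;
};

// Simulates `levels` nested rules and returns the deepest level reached.
inline std::size_t descend( std::size_t& depth, const std::size_t limit, const std::size_t levels )
{
   const depth_guard_sketch guard( depth, limit );
   return ( levels > 1 ) ? descend( depth, limit, levels - 1 ) : depth;
}

// Reports whether nesting `levels` deep trips the limit.
inline bool exceeds_limit( const std::size_t limit, const std::size_t levels )
{
   std::size_t depth = 0;
   try {
      descend( depth, limit, levels );
      return false;
   }
   catch( const std::runtime_error& ) {
      return true;
   }
}
```

With a limit of 5 in this sketch, five nested levels succeed and the sixth throws before recursing any further.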

wravery · 2022-03-18T22:18:37Z

Maybe it would be better to define an RAII unwind_guard which will pop the stack in its destructor. You'd then need to disarm it in case an exception is not thrown and you don't want to pop the stack, but then you wouldn't need an explicit try/catch block, it could just perform regular stack unwinding.

Edit: This would actually apply to the core match_control_unwind method and not to anything in parse_tree specifically.
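
A minimal sketch of that idea, assuming a hypothetical `unwind_guard_sketch` with an explicit `disarm()` (this is an illustration, not necessarily the shape of the code in this PR):

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Runs the stored callable in the destructor unless disarm() was called.
template< typename F >
class unwind_guard_sketch
{
public:
   explicit unwind_guard_sketch( F f )
      : m_f( std::move( f ) )
   {}

   unwind_guard_sketch( const unwind_guard_sketch& ) = delete;
   unwind_guard_sketch& operator=( const unwind_guard_sketch& ) = delete;

   ~unwind_guard_sketch()
   {
      if( m_armed ) {
         m_f();  // must not throw, destructors are noexcept by default
      }
   }

   void disarm() noexcept
   {
      m_armed = false;
   }

private:
   F m_f;
   bool m_armed = true;
};

// Hypothetical rule match: pushes a frame, and relies on ordinary stack
// unwinding (no try/catch) to pop it again if an exception escapes.
inline bool match_guarded( std::vector< int >& stack, const bool fail )
{
   stack.push_back( 1 );
   unwind_guard_sketch guard( [ & ] { stack.pop_back(); } );
   if( fail ) {
      throw std::runtime_error( "parse error" );
   }
   guard.disarm();  // normal return: keep the frame
   return true;
}

// Reports how many frames remain after a successful or failing match.
inline std::size_t frames_after( const bool fail )
{
   std::vector< int > stack;
   try {
      match_guarded( stack, fail );
   }
   catch( const std::runtime_error& ) {
   }
   return stack.size();
}
```

The cleanup still runs once per frame while the exception propagates, but the function itself contains no catch block and no rethrow.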

wravery · 2022-03-19T04:50:13Z

Here you go, how about something like this?

codecov-commenter · 2022-03-19T05:08:14Z

Codecov Report

Merging #309 (a2e3cc3) into main (e3c8cb4) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #309   +/-   ##
=======================================
  Coverage   99.68%   99.68%           
=======================================
  Files         253      254    +1     
  Lines        5083     5088    +5     
=======================================
+ Hits         5067     5072    +5     
  Misses         16       16

| Impacted Files | Coverage Δ |
|---|---|
| include/tao/pegtl/internal/unwind_guard.hpp | `100.00% <100.00%> (ø)` |
| include/tao/pegtl/match.hpp | `100.00% <100.00%> (ø)` |


d-frey · 2022-03-20T08:10:43Z

I like the idea about the unwind_guard, but it seems independent of the rest, so I'd like to see this as a separate PR.

Also, I would like to get some idea of how it improves code on all major compilers (GCC, Clang, and MSVC), as we don't want to improve code on one compiler at the expense of others. I ran some small checks with GCC so far and sometimes the binaries became larger, sometimes they became smaller. With Clang, there was a lot more overhead and most binaries grew in size.

Minor details: The capture list can be shortened to just [&] instead of [&in, &st...] and the <functional> include should be <utility>.
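
For illustration, the two capture styles behave the same when the lambda only uses the function's reference parameters. The function names below are hypothetical and the bodies are a sketch, not the lambda from this PR:

```cpp
#include <cstddef>
#include <string>

// Explicit capture list, including a pack expansion for the states.
template< typename Input, typename... States >
std::size_t with_explicit_captures( Input& in, States&... st )
{
   const auto total = [ &in, &st... ] { return ( in.size() + ... + st.size() ); };
   return total();
}

// Default by-reference capture: only the variables actually used in the
// body are captured, so the result is the same with less to maintain.
template< typename Input, typename... States >
std::size_t with_default_capture( Input& in, States&... st )
{
   const auto total = [ & ] { return ( in.size() + ... + st.size() ); };
   return total();
}

// Checks that both variants compute the same sum of sizes.
inline bool captures_agree()
{
   std::string in = "ab";
   std::string s1 = "cde";
   std::string s2 = "f";
   return ( with_explicit_captures( in, s1, s2 ) == 6 ) && ( with_default_capture( in, s1, s2 ) == 6 );
}
```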

d-frey · 2022-03-20T08:32:46Z

One more thing that I should probably mention: We don't want to check the code with Debug mode, as the PEGTL is heavily templated and relies on the optimizer to generate reasonable code. Anything compiled without optimizations is best effort, at best. We should probably also put this in the documentation in the "Requirements" section. @ColinH Would you like to formulate something?

wravery · 2022-04-21T17:05:16Z

> One more thing that I should probably mention: We don't want to check the code with Debug mode, as the PEGTL is heavily templated and relies on the optimizer to generate reasonable code. Anything compiled without optimizations is best effort, at best. We should probably also put this in the documentation in the "Requirements" section. @ColinH Would you like to formulate something?

FWIW, the same stack overflow occurs in release builds using MSVC (VS2022) on my machine. I don't think the release optimizations can save it from this particular issue.

wravery · 2022-04-21T17:08:18Z

> I like the idea about the unwind_guard, but it seems independent of the rest, so I'd like to see this as a separate PR.

OK, will do.

> Also, I would like to get some idea of how it improves code on all major compilers (GCC, Clang, and MSVC), as we don't want to improve code on one compiler at the expense of others. I ran some small checks with GCC so far and sometimes the binaries became larger, sometimes they became smaller. With Clang, there was a lot more overhead and most binaries grew in size.

Would you prefer an #ifdef for MSVC? I think the stack overflow I'm trying to mitigate is likely specific to the code that compiler generates for catching and rethrowing exceptions.

> Minor details: The capture list can be shortened to just [&] instead of [&in, &st...] and the <functional> include should be <utility>.

OK. 👍🏼

wravery · 2022-04-21T17:13:10Z

> What I'm probably going to do for my own project is copy enough of the parse_tree implementation to implement a non-unwinding version. My grammar and custom ast_node implementation don't require any try_catch_type handling or per-frame unwinding, and it made a pretty dramatic difference in how far I could get without overflowing. Originally it overflowed at a depth of 18 or so, only when throwing an exception from an if_must rule. Now it can successfully throw an exception without overflowing the stack from a depth of at least 100 (I stopped at that point; I'm also going to add a depth limit, since it can't remain completely unbounded).

I have a configurable depth limit and a prototype of a custom parse_tree fork now, but I'm going to abandon the custom implementation of parse_tree in favor of this PR. I'd rather not diverge like that from PEGTL.

ColinH · 2022-05-14T13:50:49Z

@d-frey Can we merge this as is?

d-frey · 2022-05-14T14:48:26Z

I think we already talked about this, especially benchmarks. Are we sure this PR would not introduce drawbacks for GCC/Clang?

wravery · 2022-05-14T19:20:56Z

> I think we already talked about this, especially benchmarks. Are we sure this PR would not introduce drawbacks for GCC/Clang?

I've been playing around with it on Compiler Explorer, and with -O2 it seems to slightly shrink the code for GCC (12 lines of assembly vs. 10 lines), and it grows the code by about 58% for Clang (31 lines of assembly vs. 49 lines). It's definitely an improvement for MSVC at runtime just by not causing a stack overflow, no matter what the size difference might be. Overall, I'd argue it's a wash between GCC and Clang (which seems to generate a lot more code than GCC to begin with).

N.B. A side benefit of #313 is that it'll remove this code in parse_tree entirely in cases where a Node::unwind declaration is omitted.

ColinH · 2022-06-28T19:21:28Z

Here's one finding after finally playing around with this for a bit, on a Mac with M1 Max running Monterey.

With a slightly modified json_parse example that adds a couple of unwind() functions to the control, each compiler produces identical binaries for the try-catch and unwind_guard variants.

So, at least for Clang and GCC, it doesn't seem to matter, since, at least in this simple case, the optimisers see right through the supposed "difference".

PS: On this platform Clang consistently produced much smaller binaries than GCC 11.

d-frey · 2022-06-28T20:04:52Z

Thanks Colin for testing. @wravery Can you update this PR so I can merge it? Thanks!

d-frey · 2022-06-29T16:35:43Z

I merged it manually, I'll do a release of the 3.x branch shortly.

Avoid try/catch/rethrow if unwind is unimplemented

0797d3e

d-frey self-assigned this Mar 18, 2022

d-frey added the enhancement label Mar 18, 2022

wravery added 3 commits March 18, 2022 19:22

Make unwind unconditional but keep it optional on Node

b09ea77

Replace try/catch/throw with unwind_guard

edcbc99

Fix variable shadowing

d0c5101

wravery added 2 commits March 18, 2022 22:22

Remove the default move/copy constructor/assignment

4ac97b8

Simplify unwind_guard and put it in a separate internal header

3fafe87

wravery mentioned this pull request Apr 21, 2022

Make parse_tree Node::unwind optional #313

Merged

wravery added 2 commits April 21, 2022 11:04

Remove changes from separate PR taocpp#313

2043287

Simplify capture list in unwind_guard lambda

ae66aae

wravery changed the title ~~Avoid try/catch/rethrow if unwind is unimplemented~~ Replace try/catch/rethrow in unwind with unwind_guard Apr 21, 2022

wravery added 2 commits April 21, 2022 11:15

Cleanup unused includes

6e8bd6d

Add #include <utility> for std::move

a2e3cc3

wravery mentioned this pull request May 17, 2022

Customize the PEGTL parse_tree implementation microsoft/cppgraphqlgen#252

Merged

d-frey closed this in ad71c24 Jun 29, 2022

d-frey added a commit that referenced this pull request Jun 29, 2022

Manually merge PR from wravery, closes #309

ffc08ad
