Store terminal failures as TerminalParseFailures #26

sds · 2015-06-07T22:43:42Z

When attempting to print out a failure message using failure_reason
for my tiny personal project propose[1], I was encountering the following error:

NoMethodError: undefined method `unexpected' for [0, "[a-z]", false]:Array> with backtrace

This started happening after upgrading to treetop 1.6.0+.

It looks like the problem here is that terminal failures are stored in
@terminal_failures as tuples, and when you call terminal_failures it
does a check to see if the first item is a TerminalParseFailure, and
converts the list if it isn't. However, if you call terminal_failures
multiple times, it's possible other errors were added, and so you'll
have a mixture of Array tuples and TerminalParseFailures.

The fix is to store them as TerminalParseFailures when the failure is
first encountered in terminal_parse_failure. It's not overly expensive
and allows us to avoid this unnecessary complexity.

[1] https://github.com/sds/propose

When attempting to print out a failure message using `failure_reason` for my project `propose`[1], I was encountering the following error: NoMethodError: undefined method `unexpected' for [0, "[a-z]", false]:Array> with backtrace It looks like the problem here is that terminal failures are stored in `@terminal_failures` as tuples, and when you call `terminal_failures` it does a check to see if the first item is a `TerminalParseFailures`, and converts the list if it isn't. However, if you call `terminal_failures` multiple times, it's possible other errors were added, and so you'll have a mixture of `Array` tuples and `TerminalParseFailure`s. The fix is to store them as `TerminalParseFailure`s when the failure is first encountered in `terminal_parse_failure`. It's not overly expensive and allows us to avoid this unnecessary complexity. [1] https://github.com/sds/propose

sds · 2015-06-09T05:33:02Z

Looks like @terminal_failures is already cleared on each run (

treetop/lib/treetop/runtime/compiled_parser.rb

Line 74 in e30bd48

@terminal_failures = []

).

I'm honestly not sure what I'm doing to cause this behavior. FWIW, I think this change makes sense even if this weren't an issue, since it seems a bit odd to store parse errors this way, and this pull request actually reduces complexity/LOC IMHO.

If you're still opposed to merging, I'll try to find some time later next week to dig deeper.

cjheath · 2015-06-09T06:06:35Z

Not opposed, just want to understand so as to avoid fixing what's not broken.
In the early days, TerminalParseFailures were being created for every failed terminal,
now we only create them when we're at the right-most point we've reached in the input,
and that saved a huge amount of time and memory. Now it's creating the Array tuple and
converting it later, which probably isn't really a saving; except perhaps that Arrays are
optimised in the Ruby runtime and might be quicker to create than the TPF. I don't know,
as AFAIK there's been no comparative performance testing done in some years and a lot
of things have changed in the Ruby runtimes.

I hope that explains the (historical) rationale anyhow. Whatever makes sense now will
be acceptable, as long as we understand why the change is needed. So yes, if you
could find time to investigate, I'd like to know.

sds · 2015-07-12T22:06:30Z

Sorry for the delay in following up on this issue.

I've dug around and it's clear that there are multiple invocations of terminal_failures happening within the code generated by Treetop during the course of the parse phase. I only call Treetop::Runtime::CompiledParser#failure_reason once (confirmed in the debugger) and yet the call to terminal_failures which converts the array of tuples is happening before that call (and thus the array ends up being a mixture of TerminalParseFailure instances and Array tuples, as described above).

If you look at Treetop::Compiler::Predicate or Treetop::Compiler::Repetition, you can clearly see terminal_failures.pop calls being generated in the code. These calls trigger the premature conversion of the array of tuples into an array of TerminalParseFailure instances, resulting in the problem described above.

Does that make sense? It seems like we can avoid the confusion by always storing an actual TerminalParseFailure instance instead of the current lazy conversion procedure, which this pull request addresses.

Interested in hearing your thoughts.

cjheath · 2015-07-13T00:07:04Z

Thanks for investigating. There is no need to call a method in order to pop terminal_parse_failures (which is needed in semantic predicates and in repetition node types). I changed the code generation to just pop @terminal_parse_failures directly to restore the original side-effect-free intention.

sds · 2015-07-13T03:58:45Z

Great, that solved the problem. Thanks!

cjheath closed this Jul 13, 2015

sds deleted the fix-failure_reasure branch July 13, 2015 03:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Store terminal failures as TerminalParseFailures #26

Store terminal failures as TerminalParseFailures #26

sds commented Jun 7, 2015

sds commented Jun 9, 2015

cjheath commented Jun 9, 2015

sds commented Jul 12, 2015

cjheath commented Jul 13, 2015

sds commented Jul 13, 2015

Store terminal failures as TerminalParseFailures #26

Store terminal failures as TerminalParseFailures #26

Conversation

sds commented Jun 7, 2015

sds commented Jun 9, 2015

cjheath commented Jun 9, 2015

sds commented Jul 12, 2015

cjheath commented Jul 13, 2015

sds commented Jul 13, 2015