Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store terminal failures as TerminalParseFailures #26

Closed
wants to merge 1 commit into from
Closed

Store terminal failures as TerminalParseFailures #26

wants to merge 1 commit into from

Conversation

sds
Copy link

@sds sds commented Jun 7, 2015

When attempting to print out a failure message using failure_reason
for my tiny personal project propose[1], I was encountering the following error:

NoMethodError: undefined method `unexpected' for [0, "[a-z]", false]:Array> with backtrace

This started happening after upgrading to treetop 1.6.0+.

It looks like the problem here is that terminal failures are stored in
@terminal_failures as tuples, and when you call terminal_failures it
does a check to see if the first item is a TerminalParseFailure, and
converts the list if it isn't. However, if you call terminal_failures
multiple times, it's possible other errors were added, and so you'll
have a mixture of Array tuples and TerminalParseFailures.

The fix is to store them as TerminalParseFailures when the failure is
first encountered in terminal_parse_failure. It's not overly expensive
and allows us to avoid this unnecessary complexity.

[1] https://github.com/sds/propose

When attempting to print out a failure message using `failure_reason`
for my project `propose`[1], I was encountering the following error:

    NoMethodError: undefined method `unexpected' for [0, "[a-z]",
                   false]:Array> with backtrace

It looks like the problem here is that terminal failures are stored in
`@terminal_failures` as tuples, and when you call `terminal_failures` it
does a check to see if the first item is a `TerminalParseFailures`, and
converts the list if it isn't. However, if you call `terminal_failures`
multiple times, it's possible other errors were added, and so you'll
have a mixture of `Array` tuples and `TerminalParseFailure`s.

The fix is to store them as `TerminalParseFailure`s when the failure is
first encountered in `terminal_parse_failure`. It's not overly expensive
and allows us to avoid this unnecessary complexity.

[1] https://github.com/sds/propose
@sds
Copy link
Author

sds commented Jun 9, 2015

Looks like @terminal_failures is already cleared on each run (

@terminal_failures = []
).

I'm honestly not sure what I'm doing to cause this behavior. FWIW, I think this change makes sense even if this weren't an issue, since it seems a bit odd to store parse errors this way, and this pull request actually reduces complexity/LOC IMHO.

If you're still opposed to merging, I'll try to find some time later next week to dig deeper.

@cjheath
Copy link
Owner

cjheath commented Jun 9, 2015

Not opposed, just want to understand so as to avoid fixing what's not broken.
In the early days, TerminalParseFailures were being created for every failed terminal,
now we only create them when we're at the right-most point we've reached in the input,
and that saved a huge amount of time and memory. Now it's creating the Array tuple and
converting it later, which probably isn't really a saving; except perhaps that Arrays are
optimised in the Ruby runtime and might be quicker to create than the TPF. I don't know,
as AFAIK there's been no comparative performance testing done in some years and a lot
of things have changed in the Ruby runtimes.

I hope that explains the (historical) rationale anyhow. Whatever makes sense now will
be acceptable, as long as we understand why the change is needed. So yes, if you
could find time to investigate, I'd like to know.

@sds
Copy link
Author

sds commented Jul 12, 2015

Sorry for the delay in following up on this issue.

I've dug around and it's clear that there are multiple invocations of terminal_failures happening within the code generated by Treetop during the course of the parse phase. I only call Treetop::Runtime::CompiledParser#failure_reason once (confirmed in the debugger) and yet the call to terminal_failures which converts the array of tuples is happening before that call (and thus the array ends up being a mixture of TerminalParseFailure instances and Array tuples, as described above).

If you look at Treetop::Compiler::Predicate or Treetop::Compiler::Repetition, you can clearly see terminal_failures.pop calls being generated in the code. These calls trigger the premature conversion of the array of tuples into an array of TerminalParseFailure instances, resulting in the problem described above.

Does that make sense? It seems like we can avoid the confusion by always storing an actual TerminalParseFailure instance instead of the current lazy conversion procedure, which this pull request addresses.

Interested in hearing your thoughts.

@cjheath
Copy link
Owner

cjheath commented Jul 13, 2015

Thanks for investigating. There is no need to call a method in order to pop terminal_parse_failures (which is needed in semantic predicates and in repetition node types). I changed the code generation to just pop @terminal_parse_failures directly to restore the original side-effect-free intention.

@cjheath cjheath closed this Jul 13, 2015
@sds
Copy link
Author

sds commented Jul 13, 2015

Great, that solved the problem. Thanks!

@sds sds deleted the fix-failure_reasure branch July 13, 2015 03:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants