Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collision in LALR grammars #10

Closed
kyouko-taiga opened this issue May 21, 2017 · 5 comments
Closed

Collision in LALR grammars #10

kyouko-taiga opened this issue May 21, 2017 · 5 comments

Comments

@kyouko-taiga
Copy link

I may very well be a bit rusty on LALR grammars, but I can't find what I'm doing wrong with that one:

_expr          : or_test

?or_test       : and_test
               | or_test  "or"  and_test    -> binary_expr
?and_test      : _primary
               | and_test "and" _primary    -> binary_expr

_primary       : call_expr
               | ident
               | "(" _expr ")"

call_expr      : _expr "(" _expr ")"

ident          : NAME
NAME           : /[^\W\d][\w]*/
NUMBER         : /(0|([1-9][0-9]*))(\.[0-9]+)?([Ee][+-]?[0-9]+)?/
STRING         : /'[^']*'/

%ignore /[\t \f]+/

(This is a simplified version of a my complete grammar, which can be seen here: light.g).

I initialize my parser like that:

parser = Lark(f, parser='lalr', start='_expr')

This grammar gets me a lark.common.GrammarError: Collision .... The remainder of the error message doesn't seem to be deterministic, as I always get a different one. It seems like I can't handle recursion of the _expr rule with the call_expr. Looks pretty much like a left-recursion issue, which is quite surprising for a LALR parser.

Everything works fine with the Earley parser.

@erezsh
Copy link
Member

erezsh commented May 22, 2017

Hi, you are right! This grammar can and should be accepted by a LALR parser.
But, it's not a left-recursion issue. The reason that it doesn't work for lark, is that Lark is currently missing a reduce-reduce resolution feature, which I never really needed so far. For example, I managed to write a complete grammar for Python without running into such a collision.

However, reduce-reduce resolution is on my TODO list, and it shouldn't be difficult to add. So if Earley isn't performant enough for your needs, let me know and I'll move it up the list.

@kyouko-taiga
Copy link
Author

Hey, thanks for the quick answer!

Earley seems to be doing the job so far. I'll have to work on code generation before parsing can be an issue, so I can move to the LALR later.

@erezsh
Copy link
Member

erezsh commented May 22, 2017

Great. Let me know if it ever becomes an issue.

@erezsh
Copy link
Member

erezsh commented May 27, 2017

Upon further inspection, this reduce-reduce collision is actually not resolvable within LALR(1).

PLY can handle some simplistic cases like this of reduce-reduce collisions, I presume using mechanisms specifically targeting those cases.

Alternatively, It might be possible to solve this collision, along with a wider range of collisions, using LALR(2).

However, both these non-general solutions take some non-trivial work, so I'm reluctant to implement them, unless there's a real need for them.

I'm angling towards a much wider solution, which is a hybrid Earley+LALR engine, that should give near-LALR performance even for non-LALR grammars, and should have no problem solving reduce-reduce collisions such as this. It will take a lot of work, but it should solve the problem in the general case, so that's where I prefer to strive.

I'm closing this issue. Please feel free to re-open it (or a new one) if you feel that it's necessary.

@RevanthRameshkumar
Copy link

Hi, I ran into this issue and I am also going to use earley for now. I would love to use lalr though for the determinism

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants