Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling rules that are both left- and right-recursive #55

Closed
josephg opened this issue Jan 29, 2016 · 5 comments
Closed

Handling rules that are both left- and right-recursive #55

josephg opened this issue Jan 29, 2016 · 5 comments

Comments

@josephg
Copy link

josephg commented Jan 29, 2016

I'm reading the literature on PEGs. This paper talks about a bug in @alexwarth's approach to left recursion which is present in ohm:

This grammar:

MyLang {
  Expr = Expr "-" Num --a
       | Num --b  
  Num = digit+
}

... Correctly parses "1-2-3" to (((1)-2)-3). But when I change the grammar to this:

MyLang {
  Expr = Expr "-" Expr --a
       | Num --b
  Num = digit+
}

The tree is flipped to (1-(2-(3))).

Maybe this isn't an issue in practice? Its hard to say. But I can certainly see it tripping some people up.

@pdubroy
Copy link
Contributor

pdubroy commented Jan 29, 2016

Interesting that you bring this up, as Alex and I were just talking about it last week. In my experience, it doesn't come up that often -- but yes, when it does, it can be a bit unexpected.

Thanks for filing this -- I think it's something we should think about before 1.0.

@pdubroy pdubroy added the 1.0 label Jan 29, 2016
@pdubroy pdubroy changed the title Left recursion sometimes generates wrong parse tree Handling rules that are both left- and right-recursive Jan 29, 2016
@josephg
Copy link
Author

josephg commented Jan 29, 2016

Do you have some other examples of this problem? I have an idea on a way to solve this, but I want to make sure it generalises beyond expr = expr - expr | num.

@alexwarth
Copy link
Contributor

Hi Joseph,

I've known about Laurie's paper for a while, and (to put it mildly) I violently disagree with it :)

Who's to say that the result of applying a rule like

expr = expr "-" expr | num

should be left-associative? Or right-associative, for that matter? PEGs don't support left-recursion! The only "correct" result of applying this rule is non-termination. Of course that isn't terribly useful, so researchers have proposed extensions to PEGs that enable left-recursive rules to produce useful results. The semantics of these extensions isn't prescribed by Bryan Ford's definition of PEGs -- it's really up to the people who created the extensions.

Ohm's approach to left recursion happens to produce right-associative parse trees for your example. This makes sense if you think of left recursion as a kind of loop: the second application of expr in expr "-" expr is like a nested loop, which will consume as much input as it can before the "outer loop" gets a chance to consume more input. I can imagine a different semantics in which the result would be left-associative. That would be fine, too, but certainly no more correct or valid than our current semantics.

You asked if this is an issue in practice. It isn't. I'll go a step further and say that this kind of "ambiguous" rule definition is code smell. Do you really want people to wonder whether expr produces a left- or right-associative result? No! In a well-written grammar, it's obvious. In this case, you want the - operator to be left-associative so you're better off defining expr like this:

expr = expr "-" num | num

Now there's no question about it.

We have plans to release a bunch of exciting new features and tools for Ohm. I want to prioritize that work over this, which I don't consider to be a real issue. So I'm closing it.

Thanks!

@josephg
Copy link
Author

josephg commented Jan 30, 2016

Great! Thanks for taking the time to explain your thoughts. I didn't want to spend a bunch of time figuring out realtime parsing only for the recursion algorithm to change.

There's no rush for this, but it might make sense at some point to add a warning for left- and right- recursive rules, warning people that the associativity of the resultant parse tree is arbitrary.

@alexwarth
Copy link
Contributor

Good idea! See #56.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants