Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A | B != B | A #188

Closed
timzaak opened this issue Mar 27, 2021 · 12 comments
Closed

A | B != B | A #188

timzaak opened this issue Mar 27, 2021 · 12 comments

Comments

@timzaak
Copy link

timzaak commented Mar 27, 2021

maybe, remove this | method. just use orElse

@johnynek
Copy link
Collaborator

While it is true that | is not commutative (a.backtrack | b.backtrack would be), having an infix operator is very nice.

Do you have any other suggestions?

@timzaak
Copy link
Author

timzaak commented Mar 28, 2021

maybe | would act like oneOf.

I just write graphql schema parser.
Parser2.scala.
there are some user experience to share:

  1. need insert lots of whitespaces, maybe it's a good idea to create ParserContext, just like fastparse to handle whitespaces.
  2. so much backtrack, and ? always need backtrack. maybe it's my code fault, there would have nice way to avoid using so much backtrack.
  3. P[A] ~ (P[B].surround(whitespaces).backtrack.? ~ P[C].surround(whitespaces).backtrack.?) will read whitespaces twice when P[B] fails. but I don't know how to avoid.

@ritschwumm
Copy link

ritschwumm commented Mar 28, 2021

PEGs use / for their non-commutative "or" for this reason - would that fit here?

@johnynek
Copy link
Collaborator

Sorry I'm slow to reply on the weekend.

You definitely don't always need backtrack with ?. I'll rewrite some of your examples without backtrack and it might give you a feel for how to do it.

A big design property of this library is to encourage you to write parsers avoiding backtracking since that can easily result in accidental exponential complexity and also it generally results in poor descriptions of what went wrong.

That said, we could probably do with better educational materials to help people get started.

@johnynek
Copy link
Collaborator

Also note | does act like oneOf. It is a synonym. oneOf is also not commutative, it matches left to right.

@timzaak
Copy link
Author

timzaak commented Mar 29, 2021

@johnynek thanks. And I'm sorry that I misunderstand the usage of |. It's good.

@ritschwumm
Copy link

intuitively, i'd expect | to be commutative, too and therefore rename it to /. or is this a problem concerning operator precedence?

@johnynek
Copy link
Collaborator

johnynek commented Apr 1, 2021

related to #184

@johnynek
Copy link
Collaborator

johnynek commented Apr 1, 2021

While / is a good suggestion, the cost of breaking compatibility is significant... I can imagine adding / as another synonym of orElse but then we have three equivalent methods: orElse, / and |.

I'd like to take it a bit slow.

@johnynek
Copy link
Collaborator

johnynek commented Apr 1, 2021

Hey @timzaak you have quite a bit of nice parsing code. One thing you don't seem to be using is "soft products", A normal product a ~ b will have an arresting failure if a parses and b has any failure at all... So, once a parses you expect b MUST come next. A soft product only fails if at least one has an arresting failure. So, a.soft ~ b will not be an arresting failure if a parses but b has an epsilon error (it fails to parse without a partial parse).

So, for instance:

pa ~ (pb.surround(whitespaces).backtrack.? ~ pc.surround(whitespaces).backtrack.?)

we can write instead as:

val hasB = pb ~ (whitespaces.soft *> pc).?
pa ~ (whitespaces.soft *> (hasB | pc)).?

So, now, we are very precise in the backtracking: only over the whitespace part. We are not erasing an arresting error as we did above: see pb.surround(whitespaces).backtrack what if you hit something in pb that could not parse: an arresting failure indicates a partial parse which we generally strive to be a real syntax error. By backtracking there, we are hiding from the user the real cause of the problem and presenting them with a confusing error when instead we wind up hitting the None branches in ?. This is the key goal of avoiding backtracking: it very easily hides the true syntax error from the user.

Does this help @timzaak ?

@timzaak
Copy link
Author

timzaak commented Apr 1, 2021

@johnynek thanks, I will rewrite my code. It's a very useful advice.

@timzaak
Copy link
Author

timzaak commented Apr 2, 2021

@johnynek thanks again, I rewrite the code, and it looks good to me now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants