Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try later alternatives when earlier success can't complete parsing #21

Closed
drhagen opened this issue Apr 26, 2021 · 1 comment
Closed

Comments

@drhagen
Copy link
Owner

drhagen commented Apr 26, 2021

Under no circumstances will Parsita attempt later alternatives if an early alternative succeeds. This behavior, which is typical in parser combinators, can be unexpected.

from parsita import *

class FunctionParsers(TextParsers):
    id = reg('[a-z]+')
    function = id & '(' >> repsep(id, ',') << ')'
    expression_good = function | id
    expression_bad = id | function

FunctionParsers.expression_good.parse('a(b)').or_die()
# ['a', ['b']]

FunctionParsers.expression_bad.parse('a(b)').or_die()
# parsita.state.ParseError: Expected end of source but found '('
# Line 1, character 2
# a(b)
#  ^  

That is because consume returns a single result--in the case of the AlternativeParser, the first result found among the alternatives. The typical workaround is to always put the longest alternative first if multiple may match. This is annoying, counterintuitive, and for some complex grammars, hard to achieve. It would be better if Parsita would keep trying alternatives if it could not succeed.

The consume method would have to be rewritten into a generator where each alternative of a parsers argument is tried before failing. This is probably doable, but would be a pretty big change to the internals of Parsita.

Implementing this could lead to very bad performance in cases where the text could not be parsed. Parsita would have to explore every avenue before giving up, which may take a while. It is possible that this feature would require a packrat parser to be usable.

@drhagen
Copy link
Owner Author

drhagen commented Aug 4, 2021

An alternative to the generator design is to make a LongestAlternativeParser that tries all alternatives and returns the longest one. This should also play much nicer with the packrat parser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant