Parsing invalid input #2

Closed
darichey opened this issue Jun 12, 2022 · 6 comments

@darichey

One attractive feature of rnix-parser is that, through its use of rowan, it can recover from errors and provide partial parses of invalid input. This is especially useful for editor tooling, where code is frequently invalid while it is being edited. How does nixel/santiago stack up in this respect?

@kamadorueda
Owner

We don't support that at the moment.

Do you know of any literature explaining how to extend the Earley parser to that use case?

@darichey
Author

Unfortunately I'm not very familiar with the Earley algorithm (or really parsing in general beyond recursive descent). I had a quick look, but I can't immediately find any explanation on how rowan accomplishes this beyond "a special effort of continuing the parsing if an error is detected" here.

In that document, rowan refers to the desired properties as "lossless" and "resilient":

  • Parsing is lossless (even if the input is invalid, the tree produced by the parser represents it exactly).
  • Parsing is resilient (even if the input is invalid, parser tries to see as much syntax tree fragments in the input as it can).

But I couldn't immediately find any literature explaining the technique. I'll keep looking :)
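
To make the two quoted properties concrete, here is a minimal Rust sketch. It is an illustration only, not how rowan, rnix-parser, or rust-analyzer work internally: the toy `name = digits;` language and the names `Node`, `is_binding`, `parse`, and `to_source` are all made up for this example. Every node keeps the exact text it covers (lossless), and a malformed chunk becomes an `Error` node without stopping the parse of its neighbours (resilient).

```rust
/// One node of a toy "lossless" tree: every node keeps the exact source text it covers.
/// (Hypothetical type for illustration; not rowan's API.)
#[derive(Debug, PartialEq)]
enum Node {
    /// A well-formed `name = digits;` binding.
    Binding(String),
    /// Any stretch of input that could not be parsed as a binding.
    Error(String),
}

impl Node {
    /// The exact source text covered by this node.
    fn text(&self) -> &str {
        match self {
            Node::Binding(t) | Node::Error(t) => t,
        }
    }
}

/// Returns true if `chunk` looks like `name = digits;` (surrounding whitespace allowed).
fn is_binding(chunk: &str) -> bool {
    let Some(body) = chunk.trim().strip_suffix(';') else {
        return false;
    };
    let Some((name, value)) = body.split_once('=') else {
        return false;
    };
    let (name, value) = (name.trim(), value.trim());
    !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphabetic() || c == '_')
        && !value.is_empty()
        && value.chars().all(|c| c.is_ascii_digit())
}

/// Resilient: a bad chunk becomes an `Error` node and parsing simply continues.
fn parse(input: &str) -> Vec<Node> {
    let mut nodes = Vec::new();
    // `split_inclusive` keeps the ';' attached to each chunk, so no input is dropped.
    for chunk in input.split_inclusive(';') {
        let text = chunk.to_string();
        if is_binding(chunk) {
            nodes.push(Node::Binding(text));
        } else {
            nodes.push(Node::Error(text));
        }
    }
    nodes
}

/// Lossless: concatenating the text of every node reproduces the input exactly.
fn to_source(nodes: &[Node]) -> String {
    nodes.iter().map(Node::text).collect()
}
```

For example, `parse("a = 1; b = ; c = 3;")` yields a `Binding`, an `Error` for the incomplete middle binding, and another `Binding`, and `to_source` on that result returns the original string unchanged. A real parser would build a nested tree rather than a flat list, but the two invariants are the same.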

@darichey
Author

darichey commented Jun 13, 2022

Small update: I misunderstood the role rowan plays here. It is responsible not for producing the lossless syntax trees, but for representing them. Therefore, the techniques we're looking for (not for Earley, but just to get a general idea of what's going on) are actually found in the users of rowan, like rnix-parser and rust-analyzer.

@darichey
Author

I was reminded of this in the rnix matrix channel today. I'd like to just dump some links I found in case anyone wants to continue investigating:

Feel free to close if this isn't something you're interested in pursuing. It seems like it would be a lot of work.

@kamadorueda
Owner

Actually, it's not much work now that we use GNU Bison as a parser generator (since nixel 5.0). It would just be a matter of using Bison's error recovery: https://www.gnu.org/software/bison/manual/bison.html#Error-Recovery

I'm happy to receive contributions for it. I think the sensible places to recover from errors would be between containers, like '(' error ')', or in unfinished bindings, like binds attrpath '=' error, and so on.
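
As a rough analogy for what a rule like '(' error ')' buys you, here is a hand-written Rust sketch of the same idea: when something goes wrong inside parentheses, discard tokens up to the closing ')' and carry on. This is not Bison and not nixel's grammar. The toy language (whitespace-separated integers and parentheses) and the names `Expr`, `parse_expr`, and `parse_all` are assumptions made for the example, and tracking nesting depth while skipping is a choice of this sketch rather than something Bison does for you.

```rust
/// Toy syntax tree. `Error` stands in for input that was skipped during recovery.
/// (Hypothetical type for illustration; not nixel's AST.)
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Paren(Box<Expr>),
    Error,
}

/// Parses `INT` or `'(' expr ')'` from whitespace-separated tokens.
/// Returns `None` only when the current token cannot start an expression.
fn parse_expr(tokens: &[&str], pos: &mut usize) -> Option<Expr> {
    let tok = *tokens.get(*pos)?;
    if let Ok(n) = tok.parse::<i64>() {
        *pos += 1;
        return Some(Expr::Int(n));
    }
    if tok != "(" {
        return None;
    }
    *pos += 1; // consume "("
    let start = *pos;
    if let Some(inner) = parse_expr(tokens, pos) {
        if tokens.get(*pos) == Some(&")") {
            *pos += 1; // consume ")"
            return Some(Expr::Paren(Box::new(inner)));
        }
    }
    // Recovery, in the spirit of `'(' error ')'`: rewind to just after "(",
    // then discard tokens up to the matching ")" (or the end of input).
    *pos = start;
    let mut depth = 1;
    while let Some(&t) = tokens.get(*pos) {
        *pos += 1;
        match t {
            "(" => depth += 1,
            ")" => {
                depth -= 1;
                if depth == 0 {
                    break;
                }
            }
            _ => {}
        }
    }
    Some(Expr::Paren(Box::new(Expr::Error)))
}

/// Parses a whole sequence of expressions, never giving up on the first error.
fn parse_all(input: &str) -> Vec<Expr> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut pos = 0;
    let mut out = Vec::new();
    while pos < tokens.len() {
        match parse_expr(&tokens, &mut pos) {
            Some(expr) => out.push(expr),
            None => {
                // A stray token outside any parentheses: skip just that token.
                pos += 1;
                out.push(Expr::Error);
            }
        }
    }
    out
}
```

With this sketch, `parse_all("( 1 ) ( + ) ( 2 )")` produces three `Paren` nodes, the middle one wrapping `Expr::Error`, so the broken group does not prevent the groups around it from being parsed. The unfinished-binding case would follow the same pattern with a different synchronization token.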

@kamadorueda
Owner

Closing in favor of #3
