repeated lets keywords recognize as idents instead (prioritization of or breaks?)
#196
Comments
Chumsky, as with all PEG parsers, deals with ambiguity by simply using the first correctly parsing pattern. If you want to ensure that This is discussed in a little more depth towards the end of the 'parsing Alternatively, a solution that circumvents this problem entirely is to separate lexing and parsing into distinct phases (for an example of this, see the
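To illustrate the "separate lexing and parsing" suggestion, here is a minimal sketch in plain Rust (no chumsky involved). The Token type and the lex function are hypothetical names made up for this example. The point is that the keyword-or-identifier decision is made once, on the whole word, before the parser ever runs, so an application rule can no longer pick up let or in as a variable.

```rust
/// Tokens produced by a separate lexing phase (hypothetical names).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Let,
    In,
    Eq,
    Ident(String),
}

/// Split the input into tokens. Whether a word is a keyword or an
/// identifier is decided here, on the complete word.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '=' {
            chars.next();
            tokens.push(Token::Eq);
        } else if c.is_alphabetic() || c == '_' {
            // Collect the whole word before classifying it.
            let mut word = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    word.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(match word.as_str() {
                "let" => Token::Let,
                "in" => Token::In,
                other => Token::Ident(other.to_string()),
            });
        } else {
            return Err(format!("unexpected character {:?}", c));
        }
    }
    Ok(tokens)
}
```

With this in place, lex("let x = y in x") yields Let, Ident, Eq, Ident, In, Ident, while lex("letter") yields a single Ident, so a parser over tokens never has to distinguish keywords from identifiers itself.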
Thanks for the answer.
Yeah, that's what I had thought of, and how I've done it as well (at least as far as I understood it). To sum it up quickly (this is also a minimal reproducible form):
let ident = text::ident().padded();
recursive(|expr| {
// let <ident> = <expr> in <expr>
let let_in = text::keyword("let")/* ... shortened for readability */;
let var = ident.map(Expr::Var);
// some other rules...
let atom = let_in.or(var);
// <expr> <expr>
let app = atom.repeated()
.at_least(1)
.map(|v| v.into_iter().reduce(|e1, e2| Expr::App(e1.into(), e2.into())).unwrap());
app
})
This should first try the I'd like to avoid an extra lexer, to keep this simple, but I think I'm trying this now anyway.
This sounds to me like that
As an aside, rather than doing
let app = atom.repeated()
    .at_least(1)
    .map(|v| v.into_iter().reduce(|e1, e2| Expr::App(e1.into(), e2.into())).unwrap());
I'd recommend instead doing
let app = atom.clone().then(atom.repeated())
    .foldl(|e1, e2| Expr::App(e1.into(), e2.into()));
It does the same thing but is quite a bit easier to read and avoids the unwrap().
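The shape of that fold can be sketched in plain Rust, independent of chumsky. The Expr type below is a cut-down stand-in for the one in the thread, and fold_app is a hypothetical helper name. Because the head expression is passed separately from the remaining arguments, there is always a starting value and nothing needs to be unwrapped.

```rust
/// Cut-down stand-in for the expression type discussed in the thread.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    App(Box<Expr>, Box<Expr>),
}

/// Left-fold a head expression and its arguments into nested applications,
/// so that `f a b` becomes App(App(f, a), b).
fn fold_app(head: Expr, args: Vec<Expr>) -> Expr {
    args.into_iter()
        .fold(head, |f, arg| Expr::App(Box::new(f), Box::new(arg)))
}
```

With no arguments, fold_app returns the head unchanged, which corresponds to a lone atom parsing as itself.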
Thanks for the tip, that certainly is more readable. As I said the
// let <ident> = <expr> in <expr>
let let_in = text::keyword("let")
.padded()
.ignore_then(ident)
.then_ignore(just('='))
.then(expr.clone())
.then_ignore(text::keyword("in").padded())
.then(expr.clone())
.map(|((name, let_body), in_body)| {
Expr::Let(name, Box::new(let_body), Box::new(in_body))
});
That's definitely strange. Do you have a minimal example you can send that reproduces this?
I've just tested it with the
Thanks, I'll try to find the time to look at this tomorrow.
Sorry it took me so long to get to this. The
No problem, thanks for investigating. Now that you say it, it definitely makes sense. Like this:
let let_kw = text::keyword("let").padded();
let in_kw = text::keyword("in").padded();
let ident = text::ident().padded().none_of([let_kw, in_kw]);
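The idea behind that snippet, an identifier rule that refuses reserved words, can be sketched in plain Rust. This is not chumsky's API: KEYWORDS and ident are hypothetical names for this example, and the keyword list is assumed to contain only let and in, as in the thread.

```rust
/// Reserved words that must never be accepted as identifiers (assumed list).
const KEYWORDS: [&str; 2] = ["let", "in"];

/// Try to read an identifier at the start of `input`, skipping leading
/// whitespace. Returns the identifier and the remaining input, or None if
/// no identifier starts here or the word is a reserved keyword.
fn ident(input: &str) -> Option<(&str, &str)> {
    let input = input.trim_start();
    let first = input.chars().next()?;
    if !(first.is_alphabetic() || first == '_') {
        return None;
    }
    // The identifier ends at the first character that cannot be part of it.
    let end = input
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    let (word, rest) = input.split_at(end);
    if KEYWORDS.contains(&word) {
        None
    } else {
        Some((word, rest))
    }
}
```

Because the check is made on the complete word, ident rejects "let" and "in" but still accepts names such as "letter" that merely start with a keyword.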
Yep, lexerless parsers end up requiring a lot of annoying hacks like that to work reliably. Glad to hear that the problem is fixed!
OK, I'm not sure what is going wrong here, or whether this is a bug. I'm trying to port https://github.com/kritzcreek/fby19 to Rust to learn type inference.
This is where the issue appears: https://github.com/Philipp-M/simple-hindley-milner-typeinference/blob/0c49d2c289efb498d23eacd81acae543a7aa4a97/src/parser.rs#L40
Apparently the repeated rule (Expr::App) lets the keyword let parse into an Expr::Var instead of applying the let rule (it fails?).
Output of the relevant test:
If I return atom directly (uncomment line 37), let_in parses correctly.
As far as I understood, the ors for the atom rule are tried one after the other, so let_in should be prioritized vs Expr::Var?
(Can be tested with cargo test)
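The "tried one after the other" behaviour can be sketched with a hand-written ordered choice in plain Rust. This is not chumsky's implementation: or, keyword_let and any_word are hypothetical helpers for this example. It shows that the second alternative runs whenever the first one fails, for whatever reason, which is how a failing let rule can silently fall through to a variable rule.

```rust
/// A parser here is any function from input to an optional (value, rest) pair.
/// `or` tries `first` and falls back to `second` only if `first` fails.
fn or<'a, T>(
    first: impl Fn(&'a str) -> Option<(T, &'a str)>,
    second: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> impl Fn(&'a str) -> Option<(T, &'a str)> {
    move |input| first(input).or_else(|| second(input))
}

/// Matches the keyword `let`, but only at a word boundary.
fn keyword_let(input: &str) -> Option<(&'static str, &str)> {
    let rest = input.strip_prefix("let")?;
    match rest.chars().next() {
        // "letter" must not be read as "let" followed by "ter".
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(("keyword", rest)),
    }
}

/// Matches any run of alphanumeric characters, keywords included.
fn any_word(input: &str) -> Option<(&'static str, &str)> {
    let end = input
        .find(|c: char| !c.is_alphanumeric())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some(("word", &input[end..]))
    }
}
```

With let p = or(keyword_let, any_word), the input "let x" is reported as a keyword, while "letter" falls through to the second alternative and is reported as a word.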