Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[zero-copy] Implement E::Context and Parser::Config - parse-time control of parser parameters #269

Merged
merged 6 commits into from
Feb 13, 2023

Conversation

CraftSpider
Copy link
Collaborator

Now with a more readable diff and less intrusive changes, thanks to ParserExtra

Only two configurations are provided so far - more may be added on-demand, but these two provide enough for useful examples and the most common use-case.

@CraftSpider CraftSpider changed the title zero-copy] Implement E::Context and Parser::Config - parse-time control of parser parameters [zero-copy] Implement E::Context and Parser::Config - parse-time control of parser parameters Feb 8, 2023
@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

Nice :)

Regarding the previous PR... Part of me regrets that the Parser trait has become more complex to use (even if easier to implement). I wonder whether it would be possible to keep the public API the same as before but have the utility of a secondary trait that hides the 'cruft' behind associated types, with a blanket impl. That said, I think the worst outcome would be for both traits to 'leak' into the public API, creating even more confusion.

fn configure<F>(self, cfg: F) -> Configure<Self, F>
where
Self: Sized,
F: Fn(Self::Config, &E::Context) -> Self::Config,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps Fn(&mut Self::Config, &E::Context)? LLVM sometimes has trouble optimising moves into references, even when inlining, and I don't see the value of passing by value really, especially if the config is unsized or something.

src/zero_copy/primitive.rs Outdated Show resolved Hide resolved
src/zero_copy/mod.rs Outdated Show resolved Hide resolved
@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

So I've been experimenting with this a little locally and I've hit a few issues.

#[derive(Clone, Debug)]
enum Stmt {
    Expr,
    Loop(Vec<Stmt>),
}

fn parser<'a>() -> impl Parser<'a, str, Vec<Stmt>> {
    let expr = just("expr"); // TODO

    let block = recursive(|block| {
        let indent = any().filter(|c| *c == ' ')
            .repeated()
            .configure(|cfg, parent_indent| cfg.exactly(*parent_indent));

        let expr_stmt = expr
            .then_ignore(text::newline())
            .to(Stmt::Expr);
        let control_flow = just("loop:")
            .then(text::newline())
            .ignore_then(block).map(Stmt::Loop);
        let stmt = expr_stmt.or(control_flow);

        text::whitespace().count()
            .then_with_ctx(stmt
                .separated_by(indent)
                .collect::<Vec<_>>(), |a, b| a)
    });

    // empty().then_with_ctx(block, |_: (), _: &()| 0)
    Parser::<_, _>::then_with_ctx(empty(), block, |_, _| 0)
}

First off: this works!

let stmts = parser()
    .padded()
    .then_ignore(end())
        .parse(r#"
expr
expr
loop:
    expr
    loop:
        expr
        expr
    expr
expr
"#);
println!("{:#?}", stmts.output());

yields

Some(
    [
        Expr,
        Expr,
        Loop(
            [
                Expr,
                Loop(
                    [
                        Expr,
                        Expr,
                    ],
                ),
                Expr,
            ],
        ),
        Expr,
    ],
)

See here for the full example.

Very cool stuff.

However, if I swap the last two lines (they're equivalent) I get the following compilation error:

error[E0284]: type annotations needed
  --> examples/indent.rs:34:13
   |
34 |     empty().then_with_ctx(block, |_: (), _: &()| 0)
   |             ^^^^^^^^^^^^^
   |
   = note: cannot satisfy `<_ as ParserExtra<'_, str>>::Error == EmptyErr`
help: try using a fully qualified path to specify the expected types
   |
34 |     <chumsky::zero_copy::primitive::Empty<str> as chumsky::zero_copy::Parser<'_, str, (), E>>::then_with_ctx::<Vec<Stmt>, usize, chumsky::zero_copy::recursive::Recursive<dyn chumsky::zero_copy::Parser<'_, str, Vec<Stmt>, chumsky::zero_copy::extra::Full<EmptyErr, (), usize>, Config = ()>>, [closure@examples/indent.rs:34:34: 34:49]>(empty(), block, |_: (), _: &()| 0)
   |     +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++       ~

This seems to happen because the call to .then_with_ctx needs to use an implementation of Parser, but to do that it needs to figure out what E: ParserExtra is in use for empty(): but because we immediately override the context with a usize (0) there's nothing constraining this E so Rust can't infer anything.

I'm not sure how to resolve this. Perhaps it doesn't need solving? Ideally we'd have a .with_ctx(0) combinator that Avoids the need for the empty().then_with_ctx hack, but I'm fairly sure that it'll still end up having this unconstrained type variable floating around causing problems.

Perhaps we should do something like forcing the context of the LHS of then_with_ctx to be () or something? That seems a bit radical, but I think it would allow Rust to infer the type here.

@CraftSpider
Copy link
Collaborator Author

Forcing the context might work, but that does make issues with nested context-sensitive code. I couldn't do a .then_with_context(array) where array = foo.then_with_context(value). Once there's map_ctx and possibly just with_ctx, they should help, but we may still need explicit ctx in some places. OTOH - this kind of inference failure is already an issue with state, and can be fairly easily fixed by adding extra::Default or extra::Err<E> somewhere on the preceding parser.

@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

Agreed. Perhaps we should make E a type parameter of fn empty rather than just inferring it based on context.

@CraftSpider
Copy link
Collaborator Author

CraftSpider commented Feb 8, 2023

You should now have access to map_ctx and with_ctx, which should help the issue. I also added E to empty, since personally I find being able to specify extras on primitives is helpful - sometimes inference just isn't quite strong enough.

@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

Hmmm, actually, I've reversed my stance on &mut Cfg vs Cfg -> Cfg: with the former, it's difficult to specify configuration inline. Did you want me to change this in my example PR?

@CraftSpider
Copy link
Collaborator Author

I can swap it back. That was kind of my idea behind doing it via passing ownership originally, it made the inline code nicer. I can swap it back, it's all of a three line change.

@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

I can swap it back. That was kind of my idea behind doing it via passing ownership originally, it made the inline code nicer. I can swap it back, it's all of a three line change.

I've already made the change in CraftSpider#1, feel free to merge it into your branch.

@zesterer
Copy link
Owner

zesterer commented Feb 8, 2023

Oh, one other thing! This effectively replaces then_with, so I think it's worth getting rid of it given that it's opaque and potentially slow.

@CraftSpider
Copy link
Collaborator Author

I wasn't sure whether we wanted to remove it yet, since a lot of parsers aren't yet configurable. OTOH, hopefully most of them will be by the time we release zero-copy, so I'm fine removing it now if you want.

@CraftSpider
Copy link
Collaborator Author

Considering whether to split configuration into its own subtrait like IterParser - CfgParser or CfgIterParser which should help with the expanding Parser generics. OTOH, you still need Context on the extra, and it could lead to a bit of a trait explosion if we add more possible combinations :P

@zesterer
Copy link
Owner

Considering whether to split configuration into its own subtrait like IterParser - CfgParser or CfgIterParser which should help with the expanding Parser generics. OTOH, you still need Context on the extra, and it could lead to a bit of a trait explosion if we add more possible combinations :P

I think this seems sensible. I see Context as being sort of like State, something that 'moves through' parsers and hence makes sense to be on every parser. Config, however, is specific to each parser and independent so I think your idea of having a dedicated ConfigParser trait is a good one and should reduce overall complexity.

@CraftSpider
Copy link
Collaborator Author

Update: Been busy the past couple days, will get this rebased soon. I'm most of the way through rewriting it into a new trait.

@zesterer
Copy link
Owner

Ready to merge?

@CraftSpider CraftSpider marked this pull request as ready for review February 13, 2023 19:29
@CraftSpider
Copy link
Collaborator Author

I would say so, if you think it looks good

@zesterer zesterer merged commit 50aba26 into zesterer:zero-copy Feb 13, 2023
@CraftSpider CraftSpider deleted the zc-context branch March 2, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants