Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The state type is required to implement Default even if client code never calls new or new_from_iter #54

Closed
alan-j-hu opened this issue Aug 11, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@alan-j-hu
Copy link
Contributor

alan-j-hu commented Aug 11, 2022

The README claims the following:

lexgen/README.md

Lines 302 to 308 in 6be10c9

- `fn new(input: &str) -> Self`: Used when the lexer does not have user state,
or user state implements `Default`.
- `fn new_with_state(input: &str, user_state: S) -> Self`: Used when the lexer
has user state that does not implement `Default`, or you want to initialize
the state with something other than the default. `S` is the user state type
specified in lexer definition. See stateful lexer example below.

However, as the library currently stands, the state type is required to implement Default even if the new and new_from_iter functions are never used. The reason is that the new and new_from_iter do not list $StateType: Default as a constraint, yet assume an implementation of Default in their bodies, and therefore if $StateType does not implement Default, the compilation of the entire program fails, even if the client never uses those functions.

A partial solution, which I implemented in the process of making #53, is to add $StateType: Default as an explicit constraint. However, this only bypasses the problem if$StateType contains generic parameters. So, if $StateType does not implement Default, $StateType = Foo<'input> and $StateType = Foo<'a> are fine, but $StateType = Foo<'static> and $StateType = Foo are not. The fact that Rust rejects Foo<'static>: Default means that I had to do 'input: static and Foo<'input>, which solves the issue but creates warnings that the generic parameter 'input should be replaced with 'static. This is also only a partial solution because it does not allow a state type that does not implement Default if the state has no generic parameters.

(In my opinion, the rejection of a function with constraint Foo: Default, where Foo does not implement Default, is an illogical design choice on the part of Rust. If the constraint Foo: Trait is not satisfied, that is no different in principle than if Foo<T>: Trait is not satisfied for some choice of T. If Foo<T>: Trait is not satisfied for some choice of T, that simply means that the function is not usable for such T. If Foo: Trait is not satisfied where Foo has no generic parameters, this should simply mean that the function is not usable under any circumstances, not that the compiler should reject the function definition. This illogical design choice unfortunately makes my PR messy.)

@alan-j-hu
Copy link
Contributor Author

To be honest, I think the simplest solution would be to make a breaking change and remove the new and new_from_iter functions. It would be slightly inconvenient for people whose types implement Default, who would have to explicitly pass default() to the *_with_state functions. However, the current situation is much more inconvenient for the people whose types do not implement Default (e.g. because all their constructors take parameters). If the state type is required to implement Default, that restricts what programs may be written. On the other hand, removing the two convenience functions would mean that users whose types do implement Default would have to type slightly more than they otherwise would, but would not fundamentally change what programs are possible.

@osa1
Copy link
Owner

osa1 commented Aug 11, 2022

Thanks for reporting this.

As can be seen from my wording in the README ("Used when the lexer has user state that does not implement Default"), my intention was to support user states that do not implement Default, but apparently I didn't do it right.

IIRC, originally we only supported state that implements Default. The constructors ..._with_state were added later.

Just to be concrete, here's the relevant parts of the generated code for a lexer Lexer(LexerState) -> ():

struct Lexer<'input, I: Iterator<Item = char> + Clone>(
    ::lexgen_util::Lexer<'input, I, (), LexerState, ::std::convert::Infallible, Lexer<'input, I>>,
);

impl<'input> Lexer<'input, ::std::str::Chars<'input>> {
    fn new(input: &'input str) -> Self {
        Lexer(::lexgen_util::Lexer::new(input)) // error: `LexerState: Default` is not satisfied
    }

    fn new_with_state(input: &'input str, user_state: LexerState) -> Self {
        Lexer(::lexgen_util::Lexer::new_with_state(input, user_state))
    }
}

impl<I: Iterator<Item = char> + Clone> Lexer<'static, I> {
    fn new_from_iter(iter: I) -> Self {
        Lexer(::lexgen_util::Lexer::new_from_iter(iter)) // error: `LexerState: Default` is not satisfied
    }

    fn new_from_iter_with_state(iter: I, user_state: LexerState) -> Self {
        Lexer(::lexgen_util::Lexer::new_from_iter_with_state(
            iter, user_state,
        ))
    }
}

lexgen_util::Lexer::new_from_iter and lexgen_util::Lexer::new require the state to implement Default.

I'm not sure if we can fix this without making the generated lexer structs (struct Lexer above) parametric on user state type, which would allow us to do something like:

struct Lexer<'input, I: Iterator<Item = char> + Clone, S>(
    ::lexgen_util::Lexer<'input, I, (), S, ::std::convert::Infallible, Lexer<'input, I>>,
);

impl<'input, S: Default> Lexer<'input, ::std::str::Chars<'input>, S> {
    fn new(input: &'input str) -> Self {
        Lexer(::lexgen_util::Lexer::new(input))
    }
}

Now Lexer::new would only type check if S: Default is satisfied.

To avoid typing the more complicated type MyLexer<MyLexerState> instead of MyLexer, we could generate a new type and a type synonym:

struct Lexer_<'input, I: Iterator<Item = char> + Clone, S>(
    ::lexgen_util::Lexer<'input, I, (), S, ::std::convert::Infallible, Lexer_<'input, I, S>>,
);

type Lexer<'input, I> = Lexer_<'input, I, LexerState>;

impl<'input, S: Default> Lexer_<'input, ::std::str::Chars<'input>, S> {
    fn new(input: &'input str) -> Self {
        Lexer(::lexgen_util::Lexer::new(input))
    }
}

impl<'input, S> Lexer_<'input, ::std::str::Chars<'input>, S> {
    fn new_with_state(input: &'input str, user_state: S) -> Self {
        Lexer(::lexgen_util::Lexer::new_with_state(input, user_state))
    }
}

Now the code should be backwards compatible and it should no longer require UserState: Default unless new or new_with_input are used.

I wanted to demonstrate the problem in detail but on the way came up with a solution..

WDYT about this solution @alan-j-hu? Would using a type synonym (instead of a struct) for the generated lexer type be a problem?

That said, I think it should be OK to just make new and new_from_iter take a state argument, just to keep the implementation simpler. I'm not sure if it would worth generating an extra type and making the code generator more complicated just so that some users won't have to pass Default::default() to lexer constructor in some cases. Passing Default::default() to constructors doesn't sound too bad.

@osa1 osa1 added the bug Something isn't working label Aug 11, 2022
@alan-j-hu
Copy link
Contributor Author

I don't see any issues with creating a type alias. I'm fine with either the type alias or removing the new and new_from_iter functions, whichever you think is better design for your library.

@osa1 osa1 closed this as completed in 8410f49 Aug 12, 2022
@osa1
Copy link
Owner

osa1 commented Aug 12, 2022

I wasn't sure how much complexity the solution described above would add to the code generator so wanted to give it a try. It turns out it's not too bad, so I just implemented it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants