New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More flexible input lifetime #177
Conversation
It does not. The minimal test case you want to use is something like: lrlex_mod!("my_lexer_module");
lrpar_mod!("my_parser_module");
pub fn parse_my_data<'a>(input: &'a str) -> Option<crate::my_parser_module::ParseResult<'a>> {
let lexer_def = crate::my_lexer_module::lexerdef();
let l = lexer_def.lexer(input);
match crate::my_parser_module::parser(&l) {
(Option::Some(Ok(x)),_) => Some(x),
_ => None
}
} I roughly implemented that here. It gives me: The critical issue I believe is here. The resulting trait needs to track 2 separate lifetimes, the input source, and the lexer itself need to have separate lifetimes. When they're identical, this is when problems happen. This is why the lifetimes tend to "explode" when you get to internal types as you're borrowing 2 lifetimes explicitly, so you need a 3rd lifetime. Then later in |
OK, I can see what your example is doing now. Clearly I haven't yet fully understood #174 yet! I'll keep at it -- this is fascinating stuff! |
@valarauca c815524 now allows your example to compile. I did engage in one cheat, changing pub fn lexer(&self, s: &str) -> impl Lexer<StorageT> {
LRLexer::new(self, s)
} to: pub fn lexer<'lexer, 'input: 'lexer>(
&'lexer self,
s: &'input str
) -> LRLexer<'lexer, 'input, StorageT> {
LRLexer::new(self, s) AFAICS the only reason to do this is so that Do you think this commit is flexible enough or are there other things that it doesn't allow to compile? |
I depended on I guess I was wrong about the internal lifetimes. |
@valarauca Cool, thanks so much for trying this -- I really can't emphasise enough how much of an eye-opener your PR has been for me! I hate doing this (because I would dislike it being done to me), but I think this PR, as it's simpler, is probably the better candidate for merger relative to #174. Are you OK with that? And do you have any comments on any of the code? If we do end up merging this PR, I'd like to make sure it's as good as it can be. |
Of course. I can close #174 it is nbd.
I took a look across it; LGTM. Props on getting the change smaller, and more readable. Thanks for your hard work and help getting this ready to go 👍 |
Thanks for being a good sport @valarauca ! I'm going to ask @ptersilie to do a lightweight review of the PR, but since you're happy with it, I expect that will be a simple task for him. @ptersilie I will need to squash at least one commit away before merging. |
Typo in commit 8d41382? |
Agreed. |
Just one minor typo which you can fix during squashing. Another thought I had was, since this PR is based on @valarauca's work, maybe we could add him as a co-author to some of the commits where that makes sense. Thoughts? |
Excellent idea! Let me squash and do that at the same time. |
Ok, please go ahead. |
…type. Co-authored-by: valarauca <codylaeder@gmail.com>
This made clear some subtle inconsistencies in lifetimes between the various uses of the explicit version of this type. Co-authored-by: valarauca <codylaeder@gmail.com>
The actual 'input lifetime is going to be related to, but not the same as, 'b. Note that ctbuilder.rs remains unrenamed, because in that file there is a mix of "wrong" and "correct" uses of 'input, and it will probably be easier to change it one go later.
I found it a bit confusing that sometimes we had <'a, 'b: 'a> and sometimes <'a, 'b>; since the former is more restrictive I've settled on that.
With this the following program now successfully compiles: fn main() { let lexerdef = t_l::lexerdef(); let input = "a"; let t = { let lexer = lexerdef.lexer(&input); let lx = lexer.iter().next().unwrap().unwrap(); lexer.span_str(lx.span()) }; } A mildly modified version of this is now included as a test in lrpar/cttests/src/lexer_lifetime.test.
ab0d4c0
to
d28d511
Compare
OK, done: I've added @valarauca to the first two commits as those are clearly based off his work (I feel I should shoulder the blame for the latter commits!) and squashed. |
bors r+ |
177: More flexible input lifetime r=ptersilie a=ltratt This is based on #174: it tries to decouple the lifetime of a lexer from its input. That PR is a work of near genius: I simply would not have imagined that the outcome it achieves is possible without the PR as a proof-of-existence. The only minor problem was that I couldn't work out how it achieved its effect. I therefore tried to simplify the PR, but didn't get very far. I then tried reimplementing it, and didn't get very far with that either. This PR is the result of me taking a different approach. First I simplified and unified the existing lifetimes in grmtools, because there were several inconsistencies, which I thought might be responsible for some of the pain in #174. Once that was done I could then add an explicit `'input` lifetime in 91dab3f which I *think* achieves the same effect as #174. Certainly it is enough to allow this program to now compile: ```rust fn main() { let lexerdef = t_l::lexerdef(); let input = "a"; let t = { let lexer = lexerdef.lexer(&input); let lx = lexer.iter().next().unwrap().unwrap(); lexer.span_str(lx.span()) }; } ``` where previously rustc would complain: ``` error[E0597]: `lexer` does not live long enough --> src/main.rs:12:9 | 9 | let t = { | - borrow later stored here ... 12 | lexer.span_str(lx.span()) | ^^^^^ borrowed value does not live long enough 13 | }; | - `lexer` dropped here while still borrowed error: aborting due to previous error ``` However, I am not sure if this PR is able to handle all the same cases as #174. I'm hoping that @valarauca will be able to let me know if this solves the problem that led him to create #174. Co-authored-by: Laurence Tratt <laurie@tratt.net>
Build failed: |
Oops! That last commit will need squashing. bors try |
tryBuild succeeded: |
For the program: pub fn parse_my_data<'a>(input: &'a str) -> Option<&'a str> { let lexer_def = crate::t_l::lexerdef(); let l = lexer_def.lexer(input); match crate::t_y::parse(&l) { (Option::Some(x),_) => Some(x), _ => None } } this previously failed to compile with: error[E0515]: cannot return value referencing local variable `l` --> src/main.rs:26:32 | 25 | match crate::t_y::parse(&l) { | -- `l` is borrowed here 26 | (Option::Some(x),_) => Some(x), | ^^^^^^^ returns a value referencing data owned by the current function error[E0515]: cannot return value referencing local variable `lexer_def` --> src/main.rs:26:32 | 24 | let l = lexer_def.lexer(input); | --------- `lexer_def` is borrowed here 25 | match crate::t_y::parse(&l) { 26 | (Option::Some(x),_) => Some(x), | ^^^^^^^ returns a value referencing data owned by the current function error: aborting due to 2 previous errors; 1 warning emitted This commit fixes that problem, mostly through introducing a new lifetime 'lexer. There is one very tiny API change in lrlex::LexerDef from: pub fn lexer(&self, s: &str) -> impl Lexer<StorageT> { LRLexer::new(self, s) } to: pub fn lexer<'lexer, 'input: 'lexer>( &'lexer self, s: &'input str ) -> LRLexer<'lexer, 'input, StorageT> { LRLexer::new(self, s) } This is a safe change because, by definition, lrlex can only return LRLexer instances. The alternative to this would have been to add a second lifetime 'lexer to the Lexer trait, which seems overkill.
Please squash. |
64a04e3
to
e577a52
Compare
Squashed. |
bors r+ |
Build succeeded: |
This is based on #174: it tries to decouple the lifetime of a lexer from its input. That PR is a work of near genius: I simply would not have imagined that the outcome it achieves is possible without the PR as a proof-of-existence. The only minor problem was that I couldn't work out how it achieved its effect. I therefore tried to simplify the PR, but didn't get very far. I then tried reimplementing it, and didn't get very far with that either.
This PR is the result of me taking a different approach. First I simplified and unified the existing lifetimes in grmtools, because there were several inconsistencies, which I thought might be responsible for some of the pain in #174. Once that was done I could then add an explicit
'input
lifetime in 91dab3f which I think achieves the same effect as #174. Certainly it is enough to allow this program to now compile:where previously rustc would complain:
However, I am not sure if this PR is able to handle all the same cases as #174. I'm hoping that @valarauca will be able to let me know if this solves the problem that led him to create #174.