Previous Token Information on Errors. #613

christianvoigt · 2017-12-07T14:11:19Z

Hi, for my VS Code linter I need the exact token locations for every parser error. If the error is an EarlyExitException and the token is of type chevrotain.EOF, the location properties of the token are all set to null. As a workaround I can check for this case in my linter and replace the location info in the error with the endLine and endColumn of the last token. But it would be nice if Chevrotain could do this for me.

christianvoigt · 2017-12-07T16:05:19Z

For anyone having the same issue, the workaround requires to create a new token, as the token location properties are read-only. Use:

chevrotain.createTokenInstance(chevrotain.EOF, "", startOffset, endOffset, startLine, endLine, startColumn, endColumn);

bd82 · 2017-12-07T16:40:14Z

the EOF does not really have a location.
But what can be done is add another property to the EarlyExitError.
The previous (real) Token

https://github.com/SAP/chevrotain/blob/master/src/parse/parser_public.ts#L3083-L3085

      // old code
      throw this.SAVE_ERROR(new exceptions.EarlyExitException(msg, this.LA(1))

      // new code, LA(0) will return the previously successfully parsed Token.
      throw this.SAVE_ERROR(new exceptions.EarlyExitException(msg, this.LA(1), this.LA(0))

Of course the class of the EarlyExitError will also have to modified to add the new property.
https://github.com/SAP/chevrotain/blob/master/src/parse/errors_public.ts#L62-L82

bd82 · 2017-12-07T16:43:19Z

The one edge case is what if the error is for a completely empty text input.
In that case there would be no previously successfully parsed token...

So that edge case would have to be tested.

bd82 · 2017-12-07T16:46:38Z

This seems fairly trivial to implement, So I'll happily accept pull requests for this 😄
Or implement this myself when I get around to it (after 1.0.0 release and new web-page creation)

christianvoigt · 2017-12-08T07:55:35Z

That is a good solution. I will add the changes today and send you a pull request.

The edge case will be no problem for the linter, as VS Code will put the error marker at the beginning of the file if the location is missing.

bd82 · 2017-12-08T10:11:08Z

The edge case will be no problem for the linter

We still need to check the edge case does not cause some weird runtime exception for Chevrotain.
I guess in the edge case both token properties should be EOF, so anyone using a Chevrotain parser
could easily identify it and handle it in their code.

christianvoigt · 2017-12-08T10:15:33Z

ok, so I can simply check if LA(0) is null, right? If that is the case, I will use LA(1) twice. Or should I additionally check with the tokenMatcher if LA(1) is EOF?

bd82 · 2017-12-08T10:29:54Z

The cleanest solution would be up upgrade LA to handle this.

    protected LA(howMuch: number): IToken {
        // Maybe add another check here that: 
        // currIdx + HowMuch > 0
        if (this.tokVectorLength <= this.currIdx + howMuch) {
            return END_OF_FILE
        } else {
            return this.tokVector[this.currIdx + howMuch]
        }
    }

The problem is that it might actually cause a slight performance regression as LA(x) is used a-lot.
There are performance benchmarks here to inspect that
https://github.com/SAP/chevrotain/tree/master/benchmark_web

But they are not working 100% as i've made a lot of breaking changes in the upcoming 1.0.0 version.
(The benchmark compares dev version vs latest version, but that means both must have identical APIs)
So just start with the clean solution and I will hack it later if needed for performance reasons.

christianvoigt · 2017-12-08T11:07:02Z

Sorry, I am not sure I understand your last comment. Our edge case is that this.tokVector is empty because the file was empty, right? In this case this.tokVectorLength is 0 and if I ask for LA(0) this.currIdx + howMuch will also be 0, so LA will return END_OF_FILE. This is already the behavior you wanted, isn't it? It seems to me I can simply use LA(0) for saving the previously parsed token without any further checks.

bd82 · 2017-12-08T11:39:57Z

In this case this.tokVectorLength is 0 and if I ask for LA(0) this.currIdx + howMuch will also be 0

this.currIdx starts at -1
Because normally you look at the next token so you call LA(1).
Which means on the first token you would need to access

this.tokVector[this.currIdx + howMuch] // -1 + 1 === 0

and the If condition would be:

this.tokVectorLength <= this.currIdx + howMuch // 0 <= -1 + 0

I know its confusing 😿

christianvoigt · 2017-12-08T11:45:46Z

Ah ok, now I get it. Thank you for the explanation.

christianvoigt · 2017-12-08T11:53:35Z

How about this:

        if (this.tokVectorLength === 0 || this.tokVectorLength <= this.currIdx + howMuch) {
            return END_OF_FILE
        } else {
            return this.tokVector[this.currIdx + howMuch]
        }

christianvoigt · 2017-12-08T11:58:11Z

Mhm, while this would work in the edge case it might still cause problems if one calls LA(0) and you probably want a general solution. So I will simply do what you proposed. Sorry for thinking out loud.

christianvoigt · 2017-12-08T12:07:35Z

ok, last question: What should happen if tokVectorLength > 0and this.currIdx + howMuch < 0? It would be misleading to return EOF in this case.

bd82 · 2017-12-08T12:22:06Z

I think it is still correct that it is EOF, just that the EOF is also the Start Of File, for an "empty" file.
Also consider that if LA(x) is EOF LA(x+1) is also EOF.
So it is kinda of a "null"/"undefined" for tokens.

I don't see a good alternative, creating a SOF virtual seems redundant, and returning null/undefined
seems even less informative.

christianvoigt · 2018-02-19T12:32:44Z

Today I finally managed to update Chevrotain in my parser to 2.0.1 and remove the workaround I posted above. I made some tests and realized that the same issue can occur in MismatchedTokenException and NoViableAltException as well. I previously thought that error.token can only be an instance of EOF if the error is an EarlyExitException, but this is not the case. MismatchedTokenException and NoViableAltException can also have EOF as their token and will not contain any location information in these cases. I have not been able to produce such a case with a NotAllInputParsedException but maybe the same issue can occur there.

So I guess it would make sense to add error.previousToken for all types of exceptions, right?

bd82 · 2018-02-23T00:29:09Z

Sorry for the delay, I knew I missed replying somewhere 😄

NotAllInputParsedException can never have EOF as its token as the condition to throw it requires that
the next token is not EOF.

Anyhow it makes sense to expend whatever solution you implemented to those other Exceptions types as well...

christianvoigt · 2018-05-10T21:51:21Z

Hi, have you added the previousToken information for MismatchedTokenException and NoViableAltException?

Or did you expect a pull request from me and closed the issue because I did not send you one (to be honest, I forgot this issue)?

bd82 · 2018-05-10T21:56:48Z

Actually I was pruning the backlog and thought this issue was closed as I did not re-read the whole conversation. I will re-open it until once of us will get around to finishing it. 😄

…MismatchedTokenException

bd82 added Help Wanted Good First Issue labels Dec 7, 2017

bd82 mentioned this issue Dec 8, 2017

previousToken information for EarlyExitException #616

Merged

bd82 closed this as completed Dec 8, 2017

bd82 reopened this Feb 19, 2018

bd82 closed this as completed May 10, 2018

bd82 reopened this May 10, 2018

christianvoigt mentioned this issue Jun 6, 2018

typescript: CustomPatternMatcherFunc return type should be "RegExpExecArray | null" #734

Closed

christianvoigt added a commit to christianvoigt/chevrotain that referenced this issue Jun 6, 2018

fixes Chevrotain#613 previousToken info for NoViableAltException and …

7a9be2e

…MismatchedTokenException

christianvoigt added a commit to christianvoigt/chevrotain that referenced this issue Jun 6, 2018

fixes Chevrotain#613 previousToken info for NoViableAltException and …

d2ec403

…MismatchedTokenException

bd82 closed this as completed in c4e721f Jun 6, 2018

bd82 changed the title ~~Missing token location in EarlyExitException if token is EOF~~ Previous Token Information on Errors. Jun 7, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Previous Token Information on Errors. #613

Previous Token Information on Errors. #613

christianvoigt commented Dec 7, 2017

christianvoigt commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 7, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017 •

edited

Loading

christianvoigt commented Dec 8, 2017 •

edited

Loading

bd82 commented Dec 8, 2017 •

edited

Loading

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017

christianvoigt commented Feb 19, 2018

bd82 commented Feb 23, 2018 •

edited

Loading

christianvoigt commented May 10, 2018

bd82 commented May 10, 2018

Previous Token Information on Errors. #613

Previous Token Information on Errors. #613

Comments

christianvoigt commented Dec 7, 2017

christianvoigt commented Dec 7, 2017 • edited Loading

bd82 commented Dec 7, 2017 • edited Loading

bd82 commented Dec 7, 2017 • edited Loading

bd82 commented Dec 7, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017 • edited Loading

christianvoigt commented Dec 8, 2017 • edited Loading

bd82 commented Dec 8, 2017 • edited Loading

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

christianvoigt commented Dec 8, 2017

bd82 commented Dec 8, 2017

christianvoigt commented Feb 19, 2018

bd82 commented Feb 23, 2018 • edited Loading

christianvoigt commented May 10, 2018

bd82 commented May 10, 2018

christianvoigt commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 7, 2017 •

edited

Loading

bd82 commented Dec 8, 2017 •

edited

Loading

christianvoigt commented Dec 8, 2017 •

edited

Loading

bd82 commented Dec 8, 2017 •

edited

Loading

bd82 commented Feb 23, 2018 •

edited

Loading