Generate RuleStack #309

maxime645 · 2018-10-04T13:13:38Z

In v4.6.1-beta001, you changed the following lines in DefaultErrorStrategy.Sync(Parser recognizer):

ITokenStream tokens = ((ITokenStream)recognizer.InputStream);
int la = tokens.La(1);
// try cheaper subset first; might get lucky. seems to shave a wee bit off
if (recognizer.Atn.NextTokens(s).Contains(la) || la == TokenConstants.Eof)
{
    return;
}
// Return but don't end recovery. only do that upon valid token match
if (recognizer.IsExpectedToken(la))
{
    return;
}

into:

ITokenStream tokens = ((ITokenStream)recognizer.InputStream);
int la = tokens.La(1);
// try cheaper subset first; might get lucky. seems to shave a wee bit off
IntervalSet nextTokens = recognizer.Atn.NextTokens(s);
if (nextTokens.Contains(TokenConstants.Epsilon) || nextTokens.Contains(la))
{
    return;
}

This changes how one of our errors is detected in our extension of DefaultErrorStrategy: instead of being detected in the top rule, it is now detected in the bottom one. As a result, the RuleStack is incomplete, and I need all of the information in the RuleStack.

Behavior in v4.5.3.

Behavior in v4.6.1.

To work around this, I have overridden the Sync() method in my class extending DefaultErrorStrategy with the code from v4.5.3. My question is: is there a better and safer way to solve this issue? What should I look for?

sharwell · 2018-10-04T13:55:33Z

The original bug report was antlr/antlr4#1545, which was fixed in antlr/antlr4#1546. Follow-up bugs antlr/antlr4#1931 and antlr/antlr4#1922 contain similar information to what you present here.

The original behavior was not correct in certain cases. It's likely that in your scenario, a valid path existed through Rule2 and Rule3, and therefore the error reported in 4.5.3 was objectively incorrect (failure should not be reported when the rule can parse successfully on some path). My observation in antlr/antlr4#1922 (comment) is particularly relevant for these situations.

maxime645 · 2018-10-08T09:57:13Z

Thanks for the quick answer and for the links. I will then keep the new version.

I still have an underlying issue, though. When the parser is reading a line and encounters an unwanted token, it reports the following message:
Syntax error : extraneous input 'USING' expecting {separator, numeric literal, symbol, statement starting keyword, keyword}
It then stops reading the line and moves on to the next one.

Consider this example:
CALL USING x
The parser reads “CALL USING”, reports an error on “USING”, and never reads or processes the token “x”. Is this the expected behavior?
I read about a similar problem in antlr/antlr4/issues/1922#issuecomment-314544003. Was a solution ever found?
