Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single-token deletion regression between 4.5.3 and 4.7 #1931

Closed
bancron opened this issue Jun 27, 2017 · 3 comments
Closed

Single-token deletion regression between 4.5.3 and 4.7 #1931

bancron opened this issue Jun 27, 2017 · 3 comments

Comments

@bancron
Copy link

bancron commented Jun 27, 2017

Hi,

Mike Lischke suggested I file this as a bug based on my stackoverflow question.

I have been using ANTLR 4.5.3, and it has the behavior described in the Definitive ANTLR 4 Reference. Specifically, single-token deletion would occur. Now that I've upgraded to 4.7, the single-token deletion doesn't always occur. This new behavior shows up both in code and in the IntelliJ ANTLR4 plugin.

Here is my grammar:

grammar Viz2;

comparisonsRange: comparisonGroup+;
comparisonGroup: comparison INT | INT comparison;
comparison: '<';
unusedRule: 'this' | 'that';

INT: '-'?[0-9]+;

WS: [ \t\r\n]+ -> channel(HIDDEN);

OTHER: . -> channel(HIDDEN);

Here is the behavior in 4.7.

This one correctly deletes the token "that" which is in the grammar.
< that 100 < 1000
image

This one just stops parsing after it hits "that".
< 100 that < 1000
image

Here is the behavior for 4.5.3
image
image

TL;DR

Given input with an unmatched token which is part of the grammar.

Expected behavior: performs single-token deletion and continues with the rest of the parse.

Actual behavior: sometimes doesn't perform single-token deletion and gives up on the rest of the parse.

@sharwell
Copy link
Member

The behavior you describe appears to be correct. comparisonsRange does end with an explicit EOF, and therefore allows a valid subset of the file to match without error. In the case of the input < 100 that < 1000, the prefix < 100 represents a valid tree for comparisonsRange, so there is no need to attempt to consume more data if it would lead to an error.

📝 Note that not all invalid inputs will successfully handle this case (per #118), so it's recommended to always ensure your start rule ends with an EOF.

@sharwell
Copy link
Member

📝 To be clear, the behavior described for 4.5.3 was actually a bug in the prediction algorithm, and was fixed as part of #1546.

@bancron
Copy link
Author

bancron commented Jul 12, 2017

Thank you for your response. That makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants