-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix indirect left recursive rule operator precedence via delegated operator precedence #4480
base: dev
Are you sure you want to change the base?
Conversation
Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
… operator precedence Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
Signed-off-by: Alex Lüpges <luepges@se-rwth.de>
Signed-off-by: Alex Lüpges <git.rwth@luepg.es>
any update on when this will be merged ? |
Some aspects of this PR offer up some discussion (mainly my decicion on the RuleWalker and DOTGenerator), For my usages, I'm either working around this issue by preprocessing the input grammars or by patching these changes into antlr.
The -atn CLI option outputs the ATN in graphviz/dot format (which you can render using a graphviz or some websites) |
thanks for the answer, |
Resolves #4460
Operator precedence is currently (incorrectly) reset to zero, when a left-recursive rule contains other (non-left)recursive elements.
The ATN for rule e is shown below:
In case no indirect recursion was present (
e : '-' e; | e '*' e * ID;
), the ATN would look like the following:Notably, without indirectness,
<e[2]>
signals a precedence used.For completeness, the ATN of rule m is shown below too:
The
<e[0]>
(or in Java:e(0);
) in rulem
resets the operator precedence to zero, which is not correct (see issue #4460 or the tests added by this PR).Proposed solution:
In case we detect a rule reference resulting in a binary or prefix recursion, we introduce an optional parameter
int _dp
(similar to thep
parameter of left recursive rules) with a default of0
to the targeted rule.(This would replace the old
<e[0]>
by a<e[_dp]>
.)This PR does not touch the ATN behavior of ANTLR (well.. almost not. The possible precedence value
_dp
is now excluded during the ATN creation, but where this value is now set, any precedence was 0 before - yay).This MR also introduces tests for additional attributes for rules (but not for left recursive ones, because ANTLR outputs an error message for these:
rule e is left recursive but doesn't conform to a pattern ANTLR can handle
)The precedence option for indirect left-recursive rules is now set to
"_dp"
, which is a new parameter introduced (similar to the existing_p
).Breaking changes:
This PR would prevent optional parameters from being added to parsers for the Dart target via the grammar.
Normal parameters are not affected, thus I would argue that change to be not of importance.
The reason I found this bug at all was that with large grammars,
I wanted to use the same rules (such as formulas) in multiple other rules.
The expected token position of the TestSymbolIssues#testUndefinedLabel test had to be modified by 5 tokens (
(= dp 1)
).Thanks to Ken (@kaby76) for the initial directions over on SO
(By accident this also fixes #4477 :) )
P.S.: I've run a bunch of tests on internal grammars generated using the changes proposed, which so far has not shown additional errors.
P.P.S.: I am not sure if a RuleWalker similar to the
LeftRecursiveRuleWalker
instead of myLeftRecursiveRuleTransformer#isBinaryOrPrefix
method would have been the better option - (but my version is at least easier to step through while debugging)P.P.P.S.: The DOTGenerator is not aware of the presence of the
_dp
value for rererences - if desired, I can add this functionality (but wanted to refrain from doing so before the other changes were discussed.)