-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-optional rules with fully-optional children prevent additional rules from being collected #23
Comments
I was wondering about the same issue. As a fix, I suggest to continue to interpret the remaining rule when returning from a referenced rule. This can be easily done by replacing ll. 266-268 with ruleStack.push(ruleTransition.target.ruleIndex);
this.collectFollowSets(transition.target, stopState, followSets, seen, ruleStack);
ruleStack.pop();
this.collectFollowSets(ruleTransition.followState, stopState, followSets, seen, ruleStack); Two limitations still remain:
|
I'm working on the fix for this (and some related issues such as handling for an empty rule body at, below, or after the rule transitioned to at the caret position). But @mike-lischke can you provide some clarity on whether we intend to return the Epsilon token as a candidate when parsing is successful terminated at the caret (without any subsequent tokens being required)? The epsilon is returned in some cases (when a rule can be parsed at the caret) but not in others. For example the test case "Most simple setup" does not take this into account - it expects Epsilon is not a candidate token but |
The The fact that it is returned sometimes is probably just an oversight and if you can fix that, you are welcome to do so! |
Understood. But to explain, the idea would be that if we return a list of tokens including the epsilon (or via a separate boolean property) we could tell the caller if one of the returned tokens is required - or if they are all optional. That said, they should already have that information from parsing in advance. And the situation becomes complicated taking into account rules and ignored tokens. |
Currently, collectCandidates does not return all possible tokens when it encounters rules that do not necessarily consume tokens. This is due to two reasons: 1) collectFollowSets does not proceed to states beyond the end of a rule transitioned to. This means that subsequent tokens within the rule are not collected even if the rule transition is to a rule with fully-optional tokens or no tokens at all. 2) processRule always returns an empty-set RuleEndStatus when at the caret position. This means that tokens from subsequent transitions are not collected, even if the processed rule could reach the rule end state without consuming tokens. This PR fixes both issues by adding property isExhaustive to the follow sets object to represent whether the collected follow sets represent all possible tokens that can follow, or if subsequent transitions need to be followed to further collect tokens, and using that to further collect follow sets within the processed rule after the rule transition and after the processed rule. Also, it removes the use of the epsilon token when collecting follow sets and refactors code to use isExhaustive property instead, ensuring that the epsilon token is never returned in the candidate tokens list. Fixes mike-lischke#23
Hmm, what if there's a mix of mandatory and optional tokens? It would be more useful if that information is available for each candidate. If you return a flag or the EPSILON token then the caller can only assume that all candidates are optional. |
I am working on a grammar that has a handful of optional top-level rules. If I attempt to group a few of these optional rules together, for the convenience of listening/visiting, it changes the candidates collected by antlr4-c3.
Working Example:
core.collectCandidates('get', 1);
returns tokens['foo', 'bar', 'baz', 'with qux']
. This is the result I am expecting.Non-working Example:
core.collectCandidates('get', 1);
returns['foo', 'bar', 'baz']
but is unexpectedly missingwith qux
.If I make
fooBarBaz
itself optional,fooBarBaz?
, the compilation of the grammar throws a warning:rule 'expression' contains an optional block with at least one alternative that can match an empty string
, which is expected given the creation of an optional rule with optional children.As far as I can tell, the grammars are syntactically the same and I would expect them to return the same candidates.
The text was updated successfully, but these errors were encountered: