r|!r does not always match everything #567
Slightly reduced test case:

%%
%standalone
%debug
%%
([^]* [^a]) | !([^]* [^a]) { /* all */ }
aba { /* should not match */ }

Input
(should match first action instead) Input
(should match all of the input in one match) Variations of the expression (e.g. |
This means the issue is in the automaton generation, not the engine. Most likely in the NFA/negation part.
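To make the expected behaviour concrete, here is a minimal sketch of why `r | !r` should accept every input for `r = [^]*[^a]`. It is a hand-written three-state DFA, not the automaton JFlex generates; the class and method names are made up for illustration. Complementing a total DFA only flips the accepting set, so each input is accepted by exactly one of the two.

```java
public class NegationInvariant {
    // Hand-written DFA for r = [^]*[^a] (illustrative, not JFlex output).
    // States: 0 = nothing read yet, 1 = last char was not 'a', 2 = last char was 'a'.
    static int step(int state, char c) {
        return c == 'a' ? 2 : 1;
    }

    static int run(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            state = step(state, input.charAt(i));
        }
        return state;
    }

    // r accepts exactly the non-empty inputs that do not end in 'a'.
    static boolean matchesR(String input) {
        return run(input) == 1;
    }

    // !r uses the same transitions with the accepting set flipped.
    // This is only valid because the DFA above is total (defined for every char).
    static boolean matchesNotR(String input) {
        return run(input) != 1;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "b", "ab", "aba"}) {
            System.out.println("\"" + s + "\" r=" + matchesR(s) + " !r=" + matchesNotR(s));
        }
    }
}
```

Under this sketch, `aba` ends in `a`, so it is rejected by `r` and accepted by `!r`, and the alternation covers the whole input.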
The NFA looks at least strange: nfa.pdf
Further reduced:

%%
%int
%debug
%%
!([^]*[^a]) { }

produces on
but should match the entire input, because |
NFA for that: nfa.pdf
Confusingly |
And the spec that should be equivalent to

%%
%int
%debug
%%
!([^]*[^\u{0}-`b-\u{10FFFF}]) { }

With input
i.e. there is something funny going on with (negated?) char classes and negation, even though the input NFA to the negation looks identical (nfa4.pdf), and the big negated class does come out to just The expression |
(reproduced back to jflex-1.3, i.e. not a recent regression)
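As a sanity check on the large class `[\u{0}-`b-\u{10FFFF}]` used above, the following sketch enumerates every Unicode code point and collects those outside the class. It is plain Java arithmetic on code points and does not call any JFlex code; the class and method names are hypothetical. Since the backtick is U+0060 and `b` is U+0062, the only code point left out is U+0061.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassComplement {
    // True if cp lies in [\u{0}-`b-\u{10FFFF}], i.e. in one of the two ranges
    // U+0000..U+0060 (backtick) or U+0062 ('b')..U+10FFFF.
    static boolean inBigClass(int cp) {
        return (cp >= 0x0 && cp <= '`') || (cp >= 'b' && cp <= 0x10FFFF);
    }

    // Code points matched by the negated class [^\u{0}-`b-\u{10FFFF}].
    static List<Integer> negatedClass() {
        List<Integer> result = new ArrayList<>();
        for (int cp = 0; cp <= 0x10FFFF; cp++) {
            if (!inBigClass(cp)) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int cp : negatedClass()) {
            System.out.println(String.format("U+%04X", cp));
        }
    }
}
```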
The internal order of character class partitions seems to play a role. The error-example

%%
%int
%debug
%%
!([^]*[^a]) { }

produces

%%
%int
%debug
DUMMY=[\u{0}-`b-\u{10FFFF}]
%%
!([^]*[^a]) { }

and get the partitions |
The bug appears to be in the |
And indeed the |
For posterity: the implementation of the negation operator uses a pass to remove states that can never reach a final state from the negated sub-NFA (needed to conform with assumptions of the scanning engine). This function This was triggered rarely (e.g. in none of the examples, regression tests, or other specs we test JFlex on). You can determine whether a lexer spec was affected by comparing the number of DFA states before minimisation in JFlex 1.7.0 and (the upcoming) JFlex 1.8.0. If the numbers differ, the spec may have been affected by the bug; if they are equal, it was not.
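For readers unfamiliar with that kind of pass, the sketch below shows one common way to find the states that can still reach a final state: a backward reachability search from the final states over reversed edges. This is an illustration under simplifying assumptions (transition labels are ignored, states are numbered 0..n-1), not the JFlex implementation; `DeadStateRemoval` and `liveStates` are made-up names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;
import java.util.Set;

public class DeadStateRemoval {
    /**
     * Returns the set of states from which some final state is reachable.
     * edges.get(s) holds the successors of state s; labels are ignored
     * because only reachability matters here.
     */
    static BitSet liveStates(List<Set<Integer>> edges, Set<Integer> finals) {
        int n = edges.size();

        // Build reversed edges so we can walk backwards from the final states.
        List<List<Integer>> reversed = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            reversed.add(new ArrayList<>());
        }
        for (int s = 0; s < n; s++) {
            for (int t : edges.get(s)) {
                reversed.get(t).add(s);
            }
        }

        // Worklist search: every predecessor of a live state is live.
        BitSet live = new BitSet(n);
        Deque<Integer> work = new ArrayDeque<>();
        for (int f : finals) {
            live.set(f);
            work.push(f);
        }
        while (!work.isEmpty()) {
            int t = work.pop();
            for (int s : reversed.get(t)) {
                if (!live.get(s)) {
                    live.set(s);
                    work.push(s);
                }
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // 0 -> 1 -> 2 (final), 0 -> 3, 3 -> 3 (a loop that never reaches a final state)
        List<Set<Integer>> edges = List.of(Set.of(1, 3), Set.of(2), Set.of(), Set.of(3));
        System.out.println(liveStates(edges, Set.of(2)));
    }
}
```

Any state not in the returned set can be dropped together with its incoming and outgoing transitions without changing the accepted language.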
Original bug report:
hello,
I found an issue with the negation operator "!"
With the following specification, the string "baba" is not matched
by either EXP or !EXP.
Pascal Hennequin