Description
A pattern that combines a lookbehind and a lookahead assertion never matches: find returns false even for inputs that clearly should match.
Reproduction
ReggieMatcher m = Reggie.compile("(?<=\\[)[^\\]]+(?=\\])");
m.find("[value]"); // returns false, which is WRONG: should be true (the match is "value")
m.find("value"); // returns false, which is correct (no surrounding brackets)
// Another example, this time with a negative lookahead:
Reggie.compile("(?<=\\d)[a-z]{1,4}(?!\\d)").find("3abc"); // should be true (the match is "abc")
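For reference, the expected results can be checked against java.util.regex, which is the engine the fallback delegates to. This is a minimal sketch: the class name SandwichReference and the helper firstMatch are illustrative only and not part of Reggie.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SandwichReference {
    // Returns the text of the first match, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String brackets = "(?<=\\[)[^\\]]+(?=\\])";
        System.out.println(firstMatch(brackets, "[value]")); // value
        System.out.println(firstMatch(brackets, "value"));   // null
        System.out.println(firstMatch("(?<=\\d)[a-z]{1,4}(?!\\d)", "3abc")); // abc
    }
}
```

A regression test for the fix could compare Reggie's result with these values for each reproduction input.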
Root cause
The DFA_UNROLLED_WITH_ASSERTIONS strategy does not correctly compose lookbehind and lookahead assertions in the same pattern. After the lookbehind succeeds and the body matches, the lookahead check is evaluated against an incorrect position.
Current mitigation
FallbackPatternDetector detects any pattern containing both a lookbehind and a lookahead assertion and falls back to java.util.regex.
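The kind of check described here can be sketched as a textual scan of the pattern. This is a hypothetical illustration, not the actual FallbackPatternDetector code: the class and method names are assumptions, and the scan deliberately ignores \Q...\E quoting and nested character classes.

```java
class AssertionMixSketch {
    // True when the pattern text contains at least one lookbehind
    // ((?<= or (?<!) and at least one lookahead ((?= or (?!).
    static boolean hasLookbehindAndLookahead(String regex) {
        boolean behind = false;
        boolean ahead = false;
        boolean inClass = false;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') { i++; continue; }           // skip the escaped character
            if (inClass) {                               // inside [...], '(' is a literal
                if (c == ']') inClass = false;
                continue;
            }
            if (c == '[') { inClass = true; continue; }
            if (regex.startsWith("(?<=", i) || regex.startsWith("(?<!", i)) {
                behind = true;
            } else if (regex.startsWith("(?=", i) || regex.startsWith("(?!", i)) {
                ahead = true;
            }
        }
        return behind && ahead;
    }
}
```

Both reproduction patterns above would be flagged by this check, while a pattern with only one kind of assertion would not.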
Fix direction
In the assertion-aware DFA/NFA path: after the body match, the lookahead must be evaluated at the position where the body match ended, not at the match start or any other stale offset. This likely requires fixing position tracking between the lookbehind, body, and lookahead evaluation phases.
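The intended position handling can be illustrated with a hand-written matcher for the first reproduction pattern, (?<=\[)[^\]]+(?=\]). This is only a sketch of the three phases under that one hard-coded pattern; the class name is made up and it does not reflect how Reggie generates code.

```java
class SandwichMatcherSketch {
    // Returns {start, end} (end exclusive) of the first match, or null.
    static int[] find(CharSequence in) {
        int n = in.length();
        for (int start = 0; start < n; start++) {
            // Phase 1: the lookbehind inspects the character before start
            // and consumes nothing.
            if (start == 0 || in.charAt(start - 1) != '[') continue;

            // Phase 2: the body [^\]]+ consumes characters greedily from start.
            int end = start;
            while (end < n && in.charAt(end) != ']') end++;
            if (end == start) continue; // '+' needs at least one character

            // Phase 3: the lookahead is evaluated at end, the position where
            // the body stopped. No backtracking is needed for this pattern
            // because every shorter body ends before a non-']' character.
            if (end < n && in.charAt(end) == ']') {
                return new int[] {start, end};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] span = find("[value]");
        System.out.println("[value]".substring(span[0], span[1])); // value
    }
}
```

The point of the sketch is that the offset handed to the lookahead is the body's end offset; passing start (or the lookbehind's offset) instead would make the check fail on "[value]", which matches the symptom reported here.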
Impact
High. Sandwich patterns (extracting content between delimiters) are very common, and without the fallback they do not work at all.