[perl #79152] super-linear cache can prevent a valid match
The super-linear cache in regexec.c can prevent a valid match
from being detected. For example:

print "yay\n" if 'xayxay' =~ /(q1|.)*(q2|.)*(x(a|bc)*y){2,}/;

This should match, but it doesn't because the cache fails to
distinguish between matching the final xay to x(a|bc)*y as the
first instance of the {2,} and matching it in the same position
as the second instance.

This seems to do the trick.
ncleaton authored and Father Chrysostomos committed Nov 30, 2010
1 parent f603091 commit 779bcb7
Showing 2 changed files with 15 additions and 7 deletions.
17 changes: 10 additions & 7 deletions regcomp.c
@@ -3246,13 +3246,16 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
     f |= SCF_DO_STCLASS_AND;
     f &= ~SCF_DO_STCLASS_OR;
 }
-/* These are the cases when once a subexpression
-   fails at a particular position, it cannot succeed
-   even after backtracking at the enclosing scope.
-   XXXX what if minimal match and we are at the
-   initial run of {n,m}? */
-if ((mincount != maxcount - 1) && (maxcount != REG_INFTY))
+/* Exclude from super-linear cache processing any {n,m}
+   regops for which the combination of input pos and regex
+   pos is not enough information to determine if a match
+   will be possible.
+   For example, in the regex /foo(bar\s*){4,8}baz/ with the
+   regex pos at the \s*, the prospects for a match depend not
+   only on the input position but also on how many (bar\s*)
+   repeats into the {4,8} we are. */
+if ((mincount > 1) || (maxcount > 1 && maxcount != REG_INFTY))
     f &= ~SCF_WHILEM_VISITED_POS;
 
 /* This will finish on WHILEM, setting scan, or on NULL: */
5 changes: 5 additions & 0 deletions t/re/re_tests
@@ -1482,5 +1482,10 @@ abc\N{def - c - \\N{NAME} must be resolved by the lexer
 [\0005] 5\000 y $& 5
 [\_] _ y $& _
 
+# RT #79152
+(q1|.)*(q2|.)*(x(a|bc)*y){2,} xayxay y $& xayxay
+(q1|.)*(q2|.)*(x(a|bc)*y){2,3} xayxay y $& xayxay
+(q1|z)*(q2|z)*z{15}-.*?(x(a|bc)*y){2,3}Z zzzzzzzzzzzzzzzz-xayxayxayxayZ y $& zzzzzzzzzzzzzzzz-xayxayxayxayZ
+
 (?:(?:)foo|bar|zot|rt78356) foo y $& foo
 # vim: softtabstop=0 noexpandtab
