Fix anchoring bug in conditionals with only one branch.
ph10 committed Sep 2, 2018
1 parent 8b9f137 commit a4498fc
Showing 4 changed files with 66 additions and 5 deletions.
5 changes: 5 additions & 0 deletions ChangeLog
@@ -174,6 +174,11 @@ not used in PCRE2.
from distribution tarballs, owing to a typo in Makefile.am which had
testoutput8-16-3 twice. Now fixed.

39. If the only branch in a conditional subpattern was anchored, the whole
subpattern was treated as anchored, when it should not have been, since the
assumed empty second branch cannot be anchored. Demonstrated by test patterns
such as /(?(1)^())b/ or /(?(?=^))b/.


Version 10.31 12-February-2018
------------------------------
11 changes: 6 additions & 5 deletions src/pcre2_compile.c
@@ -1454,8 +1454,8 @@ else if ((i = escapes[c - ESCAPES_FIRST]) != 0)
/* \N{U+ can be handled by the \x{ code. However, this construction is
not valid in EBCDIC environments because it specifies a Unicode
character, not a codepoint in the local code. For example \N{U+0041}
-must be "A" in all environments. Also, in Perl, \N{U+ forces Unicode
-casing semantics for the entire pattern, so allow it only in UTF (i.e.
+must be "A" in all environments. Also, in Perl, \N{U+ forces Unicode
+casing semantics for the entire pattern, so allow it only in UTF (i.e.
Unicode) mode. */

if (ptrend - p > 1 && *p == CHAR_U && p[1] == CHAR_PLUS)
@@ -1464,12 +1464,12 @@ else if ((i = escapes[c - ESCAPES_FIRST]) != 0)
*errorcodeptr = ERR93;
#else
if (utf)
-{
+{
ptr = p + 1;
escape = 0; /* Not a fancy escape after all */
goto COME_FROM_NU;
}
-else *errorcodeptr = ERR93;
+else *errorcodeptr = ERR93;
#endif
}

@@ -7864,10 +7864,11 @@ do {
if (!is_anchored(scode, bracket_map, cb, atomcount, TRUE)) return FALSE;
}

-/* Condition */
+/* Condition. If there is no second branch, it can't be anchored. */

else if (op == OP_COND)
{
+if (scode[GET(scode,1)] != OP_ALT) return FALSE;
if (!is_anchored(scode, bracket_map, cb, atomcount, inassert))
return FALSE;
}
15 changes: 15 additions & 0 deletions testdata/testinput2
@@ -5459,4 +5459,19 @@ a)"xI

/(?x-i-i)/

/(?(?=^))b/I
abc

/(?(?=^)|)b/I
abc

/(?(?=^)|^)b/I
bbc
\= Expect no match
abc

/(?(1)^|^())/I

/(?(1)^())b/I

# End of testinput2
40 changes: 40 additions & 0 deletions testdata/testoutput2
@@ -16631,6 +16631,46 @@ Failed: error 194 at offset 3: invalid hyphen in option setting
/(?x-i-i)/
Failed: error 194 at offset 5: invalid hyphen in option setting

/(?(?=^))b/I
Capturing subpattern count = 0
Last code unit = 'b'
Subject length lower bound = 1
abc
0: b

/(?(?=^)|)b/I
Capturing subpattern count = 0
First code unit = 'b'
Subject length lower bound = 1
abc
0: b

/(?(?=^)|^)b/I
Capturing subpattern count = 0
Compile options: <none>
Overall options: anchored
First code unit = 'b'
Subject length lower bound = 1
bbc
0: b
\= Expect no match
abc
No match

/(?(1)^|^())/I
Capturing subpattern count = 1
Max back reference = 1
May match empty string
Compile options: <none>
Overall options: anchored
Subject length lower bound = 0

/(?(1)^())b/I
Capturing subpattern count = 1
Max back reference = 1
Last code unit = 'b'
Subject length lower bound = 1

# End of testinput2
Error -70: PCRE2_ERROR_BADDATA (unknown error number)
Error -62: bad serialized data
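
The rule behind this fix can be sketched outside PCRE2 as a small toy model. The code below is only an illustration under stated assumptions: the cond_group struct and the cond_is_anchored function are hypothetical names invented for this sketch, and they do not reflect PCRE2's compiled-code layout or its is_anchored() function. The model records how many branches were written in a conditional group and whether each written branch starts with an anchor, and it treats a missing second branch as an implied empty branch that cannot be anchored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not PCRE2's internal representation: a conditional
   group is described only by the number of branches written in the pattern
   and by whether each written branch begins with an anchor such as ^. */
typedef struct {
    size_t branch_count;       /* 1 for (?(c)yes), 2 for (?(c)yes|no) */
    bool branch_anchored[2];   /* only the first branch_count entries are used */
} cond_group;

/* A conditional group is anchored only if every path through it is anchored.
   With a single written branch, the "no" path is an implied empty branch,
   which cannot be anchored, so the group as a whole is not anchored. */
static bool cond_is_anchored(const cond_group *g)
{
    assert(g != NULL);
    assert(g->branch_count >= 1 && g->branch_count <= 2);

    if (g->branch_count < 2) return false;   /* implied empty second branch */

    for (size_t i = 0; i < g->branch_count; i++) {
        if (!g->branch_anchored[i]) return false;
    }
    return true;
}
```

In this model, a group shaped like the one in /(?(1)^())b/ has a single anchored branch and is reported as not anchored, while a group shaped like the one in /(?(1)^|^())/ has two anchored branches and is reported as anchored. That agrees with the presence or absence of "Overall options: anchored" in the expected test output above.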