Also set RXf_RTRIM if the pattern is /\s*$/ (without the //u flag).
Without the //u flag, \s and [[:space:]] are compiled to POSIXD ops.
The POSIXD op behaves differently depending on whether the target string has
SVf_UTF8 set.

One could in theory continue and handle the case of the //a flag (POSIXA
ops), but this doesn't seem worth it for an optimisation, as it is unlikely
to be common on regexes that are intended to remove "generic" whitespace from
the end of a string.
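
To illustrate the distinction described above, here is a minimal standalone sketch of a backward scan for trailing whitespace, not the code from this commit. The names is_ascii_space, is_latin1_space and rtrim_start are hypothetical and exist only for this example. It assumes the target is stored as octets (SVf_UTF8 off), where without //u only ASCII whitespace counts, while under //u the Latin-1 characters 0x85 (NEL) and 0xA0 (NBSP) count as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ASCII-only test used for POSIXD on an
 * octet string: space, tab, newline, vertical tab, form feed, CR. */
static int is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n'
        || c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical stand-in for the POSIXU test on a Latin-1 octet string:
 * ASCII whitespace plus NEL (0x85) and NBSP (0xA0). */
static int is_latin1_space(unsigned char c)
{
    return is_ascii_space(c) || c == 0x85 || c == 0xA0;
}

/* Scan backwards from strend and return a pointer to the first octet of
 * the trailing whitespace run (strend itself if there is none). */
static const char *rtrim_start(const char *strpos, const char *strend,
                               int (*is_space)(unsigned char))
{
    const char *s = strend;

    assert(strpos <= strend);
    while (s > strpos && is_space((unsigned char)s[-1]))
        --s;
    return s;
}
```

For the five octets "abc", 0xA0, " ", calling rtrim_start with is_ascii_space stops just after the 0xA0 octet, whereas is_latin1_space also skips over it and stops just after "c". That difference is why the POSIXD case needs its own branch.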
nwc10 authored and khwilliamson committed Jun 6, 2021
1 parent 9da0bd4 commit 1737272
Showing 2 changed files with 19 additions and 1 deletion.
4 changes: 3 additions & 1 deletion regcomp.c
@@ -8489,14 +8489,16 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
                 && OP(regnext(first)) == END )
             RExC_rx->extflags |= (RXf_SKIPWHITE|RXf_WHITE);
         else if ((fop == PLUS || fop == STAR)
-                 && nop == POSIXU && FLAGS(next) == _CC_SPACE) {
+                 && (nop == POSIXU || nop == POSIXD)
+                 && FLAGS(next) == _CC_SPACE) {
             regnode *second = regnext(first);
             regnode *third = (OP(second) == EOS || OP(second) == SEOL)
                 ? regnext(second) : NULL;
             if (third && OP(third) == END) {
                 /* /[[:space:]]+\z/u
                  * /[[:space:]]+$/u
                  * /[[:space:]]*$/u
+                 * /\s*$/
                  * etc */
                 RExC_rx->extflags |= RXf_RTRIM | RXf_CHECK_ALL;
             }
16 changes: 16 additions & 0 deletions regexec.c
@@ -953,7 +953,23 @@ Perl_re_intuit_start(pTHX_
                 }
             }
         }
+        else if (OP(NEXTOPER(progi->program + 1)) == POSIXD) {
+            /* Without //u \x{A0} mustn't match \s when stored as octets. */
+            DEBUG_EXECUTE_r(Perl_re_printf( aTHX_
+                "  rtrim intuit Legacy ...\n"));
+            while (1) {
+                const char *was_s = s;
+                if (s == strpos)
+                    break;
+                --s;
+                if (s < strpos || !isSPACE(*s)) {
+                    s = was_s;
+                    break;
+                }
+            }
+        }
         else {
+            /* flag //u present - the op will be POSIXU */
             DEBUG_EXECUTE_r(Perl_re_printf( aTHX_
                 "  rtrim intuit Latin1 ...\n"));
             while (1) {