Skip to content

Commit

Permalink
Fix infinite loop with $tied =~ s/non-utf8/utf8/
Browse files Browse the repository at this point in the history
Commit 3e462cd fixed bug #41530 (s/non-utf8/utf8/ was not working
properly at all) by upgrading the target and redoing the substitution
if the replacement was utf8 and the target was not.

Commit c95ca9b fixed one problem with it calling get-magic too
many times, by checking whether the upgrade caused a string realloca-
tion and only then redoing the substitution.  But it only fixed it
when magic returns a pure ASCII string.

Redoing the substitution meant going back to where the target was
initially stringified and starting again.  That meant calling get-
magic again.

So in those cases where magic returned something other than a UTF8 or
pure ASCII string the substitution restarted and magic would be trig-
gered again, possibly resulting in infinite loops (because it would
have to be upgraded again, resulting a reallocation, and a restart).

This happens with:

• Latin-1 strings
• Copy-on-write non-UTF8 strings
• References that stringify without UTF8

c95ca9b also added SvPVX without checking first that it is SvPVX-
able, so a typeglob causes an assertion failure.

It turned out that there were also two other places in pp_subst that
were calling FETCH a second time (the tests I added for the looping/
assertion bugs found this), so I changed them, too.
  • Loading branch information
Father Chrysostomos committed Oct 7, 2012
1 parent 0efd047 commit 5c1648b
Show file tree
Hide file tree
Showing 2 changed files with 38 additions and 5 deletions.
10 changes: 6 additions & 4 deletions pp_hot.c
Original file line number Diff line number Diff line change
Expand Up @@ -2101,8 +2101,8 @@ PP(pp_subst)
Perl_croak_no_modify(aTHX);
PUTBACK;

setup_match:
s = SvPV_mutable(TARG, len);
setup_match:
if (!SvPOKp(TARG) || SvTYPE(TARG) == SVt_PVGV || SvVOK(TARG))
force_on_match = 1;

Expand Down Expand Up @@ -2179,13 +2179,15 @@ PP(pp_subst)
* http://www.nntp.perl.org/group/perl.perl5.porters/2010/04/msg158809.html
*/
if (DO_UTF8(dstr) && ! DO_UTF8(TARG)) {
char * const orig_pvx = SvPVX(TARG);
char * const orig_pvx = SvPOKp(TARG) ? SvPVX(TARG) : NULL;
const STRLEN new_len = sv_utf8_upgrade_nomg(TARG);

/* If the lengths are the same, the pattern contains only
* invariants, can keep going; otherwise, various internal markers
* could be off, so redo */
if (new_len != len || orig_pvx != SvPVX(TARG)) {
/* Do this here, to avoid multiple FETCHes. */
s = SvPV_nomg(TARG, len);
goto setup_match;
}
}
Expand Down Expand Up @@ -2231,7 +2233,7 @@ PP(pp_subst)
#endif
if (force_on_match) {
force_on_match = 0;
s = SvPV_force(TARG, len);
s = SvPV_force_nomg(TARG, len);
goto force_it;
}
d = s;
Expand Down Expand Up @@ -2315,7 +2317,7 @@ PP(pp_subst)
cases where it would be viable to drop into the copy code. */
TARG = sv_2mortal(newSVsv(TARG));
}
s = SvPV_force(TARG, len);
s = SvPV_force_nomg(TARG, len);
goto force_it;
}
#ifdef PERL_OLD_COPY_ON_WRITE
Expand Down
33 changes: 32 additions & 1 deletion t/re/subst.t
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ BEGIN {
}

require './test.pl';
plan( tests => 190 );
plan( tests => 200 );

$_ = 'david';
$a = s/david/rules/r;
Expand Down Expand Up @@ -746,6 +746,8 @@ fresh_perl_is( '$_="abcef"; s/bc|(.)\G(.)/$1 ? "[$1-$2]" : "XX"/ge; print' => 'a
# when substituted with a UTF8 replacement string, due to
# magic getting called multiple times, and pointers now pointing
# to stale/freed strings
# The original fix for this caused infinite loops for non- or cow-
# strings, so we test those, too.
package FOO;
my $fc;
sub TIESCALAR { bless [ "abcdefgh" ] }
Expand All @@ -757,6 +759,35 @@ fresh_perl_is( '$_="abcef"; s/bc|(.)\G(.)/$1 ? "[$1-$2]" : "XX"/ge; print' => 'a
$s =~ s/..../\x{101}/;
::is($fc, 1, "tied UTF8 stuff FETCH count");
::is("$s", "\x{101}efgh", "tied UTF8 stuff");

::watchdog(300);
$fc = 0;
$s = *foo;
$s =~ s/..../\x{101}/;
::is($fc, 1, '$tied_glob =~ s/non-utf8/utf8/ fetch count');
::is("$s", "\x{101}::foo", '$tied_glob =~ s/non-utf8/utf8/ result');
$fc = 0;
$s = *foo;
$s =~ s/(....)/\x{101}/g;
::is($fc, 1, '$tied_glob =~ s/(non-utf8)/utf8/g fetch count');
::is("$s", "\x{101}\x{101}o",
'$tied_glob =~ s/(non-utf8)/utf8/g result');
$fc = 0;
$s = "\xff\xff\xff\xff\xff";
$s =~ s/..../\x{101}/;
::is($fc, 1, '$tied_latin1 =~ s/non-utf8/utf8/ fetch count');
::is("$s", "\x{101}\xff", '$tied_latin1 =~ s/non-utf8/utf8/ result');
$fc = 0;
{ package package_name; tied($s)->[0] = __PACKAGE__ };
$s =~ s/..../\x{101}/;
::is($fc, 1, '$tied_cow =~ s/non-utf8/utf8/ fetch count');
::is("$s", "\x{101}age_name", '$tied_cow =~ s/non-utf8/utf8/ result');
$fc = 0;
$s = \1;
$s =~ s/..../\x{101}/;
::is($fc, 1, '$tied_ref =~ s/non-utf8/utf8/ fetch count');
::like("$s", qr/^\x{101}AR\(0x.*\)\z/,
'$tied_ref =~ s/non-utf8/utf8/ result');
}

# RT #97954
Expand Down

0 comments on commit 5c1648b

Please sign in to comment.