UTF8-ness no longer propagating through assignments within regexes #14211
From damian@conway.org
From 5.20, matching and capturing against a utf8 string no longer
The behaviour was first reported against Regexp::Grammars, but is not
The attached tests pass (as expected) under 5.10, 5.12, 5.14,
Damian |
From damian@conway.org
use utf8; our $result; my $parser = qr{ (?(DEFINE) 'za?��??_g??l?_ja??' =~ $parser; '���������������� ��������������������' =~ $parser; done_testing(); |
From bbkr@post.pl
Looks like the test file broke during upload. |
From bbkr@post.pl
use utf8; our $result; my $parser = qr{ (?(DEFINE) 'za�������� g����l�� ja����' =~ $parser; '�������� ����������' =~ $parser; done_testing(); |
The RT System itself - Status changed from 'new' to 'open' |
From @tonycoz
On Wed Nov 05 12:50:31 2014, thoughtstream wrote:
Bisects to:
bad - non-zero exit from ./perl -Ilib -e "\x{101c}" =~ /((?&TOP))(?{ $result = $^N })(?(DEFINE)(?<TOP>.))/; $result eq "\x{101c}" or die
Rationalise RX_MATCH_UTF8_set()
:100644 100644 bcf2b80cce11b32936a84c2b317ac6bbbd294beb 4aa6de05ab79798138d010fc679e88644b066e00 M regexec.c |
From @cpansprout
On Wed Nov 05 18:32:18 2014, tonyc wrote:
Argh! You beat me to it. My test case is simpler: -e 'chr(256) =~ /(.)(?{ die unless $^N eq chr 256})/'
That logic is clearly wrong, though I would not have known any better without this failing test case. The commit should simply be reverted, and tests should be added. I plan to do both shortly. -- Father Chrysostomos |
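For anyone reproducing this locally, the check in the one-liner above can be wrapped in a small script. This is only a minimal sketch and not part of the ticket or the core test suite; the helper name `caret_n_matches` is hypothetical, and the three code points are arbitrary choices (one ASCII control case and two characters above U+00FF).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the ticket): match one character, copy $^N
# from inside a (?{ ... }) block, and compare it with the original string.
sub caret_n_matches {
    my ($char) = @_;
    my $seen;
    $char =~ /(.)(?{ $seen = $^N })/;
    return defined $seen && $seen eq $char ? 1 : 0;
}

# 0x41 is an ASCII control case; the other two force a utf8 target string.
for my $code (0x41, 0x100, 0x101c) {
    my $status = caret_n_matches(chr $code) ? "ok" : "MISMATCH";
    printf "U+%04X: %s\n", $code, $status;
}
```

On a perl affected by this ticket, the two wide-character lines would be expected to report a mismatch, while the ASCII line should be unaffected because no utf8 flag is involved.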
From @tonycoz
On Wed Nov 05 20:06:13 2014, sprout wrote:
Oops, I had a test and manual revert done before I noticed this. Tony |
From @tonycoz
0001-perl-123135-ensure-utf8-ness-propagated-to-N.patch
From 7cc4874d20d36acc8db4f0ce73dff62c1d044394 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 6 Nov 2014 15:17:08 +1100
Subject: [PATCH] [perl #123135] ensure utf8-ness propagated to $^N
Reverts 0254aed965cd47adab9025a192546e4a5e63873a and adds a test
---
regexec.c | 6 ++++--
t/re/pat.t | 9 ++++++++-
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/regexec.c b/regexec.c
index 37018ac..9f6291b 100644
--- a/regexec.c
+++ b/regexec.c
@@ -670,6 +670,8 @@ Perl_re_intuit_start(pTHX_
DEBUG_EXECUTE_r(PerlIO_printf(Perl_debug_log,
"Intuit: trying to determine minimum start position...\n"));
+ RX_MATCH_UTF8_set(rx,utf8_target);
+
/* for now, assume that all substr offsets are positive. If at some point
* in the future someone wants to do clever things with look-behind and
* -ve offsets, they'll need to fix up any code in this function
@@ -2630,6 +2632,8 @@ Perl_regexec_flags(pTHX_ REGEXP * const rx, char *stringarg, char *strend,
/* see how far we have to get to not match where we matched before */
reginfo->till = stringarg + minend;
+ RX_MATCH_UTF8_set(rx,utf8_target);
+
if (prog->extflags & RXf_EVAL_SEEN && SvPADTMP(sv)) {
/* SAVEFREESV, not sv_mortalcopy, as this SV must last until after
S_cleanup_regmatch_info_aux has executed (registered by
@@ -3137,8 +3141,6 @@ got_it:
if (RXp_PAREN_NAMES(prog))
(void)hv_iterinit(RXp_PAREN_NAMES(prog));
- RX_MATCH_UTF8_set(rx, utf8_target);
-
/* make sure $`, $&, $', and $digit will work later */
if ( !(flags & REXEC_NOT_FIRST) )
S_reg_set_capture_string(aTHX_ rx,
diff --git a/t/re/pat.t b/t/re/pat.t
index e532054..b3806f2 100644
--- a/t/re/pat.t
+++ b/t/re/pat.t
@@ -22,7 +22,7 @@ BEGIN {
skip_all_without_unicode_tables();
}
-plan tests => 755; # Update this when adding/deleting tests.
+plan tests => 756; # Update this when adding/deleting tests.
run_tests() unless caller;
@@ -1620,6 +1620,13 @@ EOP
use utf8;
like("ÿ", qr/[ÿ-ÿ]/, "\"ÿ\" should match [ÿ-ÿ]");
}
+
+ {
+ # [perl #123135]
+ my $result;
+ "\x{101c}" =~ /(.)(?{ $result = $^N })/;
+ is($result, "\x{101c}", 'check utf8 propagated to $^N');
+ }
} # End of sub run_tests
1;
--
1.7.10.4
|
From @cpansprout
Fixed in ab4e48c. Tests added in 9c6c921. (Oops. I just noticed a mistake in the commit message. I typed %^N instead of $^N.) This should be a candidate for maint-5.20. -- Father Chrysostomos |
@cpansprout - Status changed from 'open' to 'pending release' |
@cpansprout - Status changed from 'pending release' to 'open' |
From @khwilliamson
On 11/06/2014 03:36 PM, Tony Cook via RT wrote:
and me too |
From @cpansprout
On Thu Nov 06 15:09:56 2014, public@khwilliamson.com wrote:
I have backported it in commit 63100fa. -- Father Chrysostomos |
@cpansprout - Status changed from 'open' to 'pending release' |
From @khwilliamson
Thanks for submitting this ticket. The issue should be resolved with the release today of Perl v5.22, available at http://www.perl.org/get.html -- |
@khwilliamson - Status changed from 'pending release' to 'resolved' |
Migrated from rt.perl.org#123135 (status was 'resolved')
Searchable as RT123135$