UTF8-ness no longer propagating through assignments within regexes #14211
Comments
From damian@conway.org
From 5.20, matching and capturing against a utf8 string no longer The behaviour was first reported against Regexp::Grammars, but is not The attached tests pass (as expected) under 5.10, 5.12, 5.14, Damian |
From damian@conway.orguse utf8; our $result; my $parser = qr{ (?(DEFINE) 'za?��??_g??l?_ja??' =~ $parser; '���������������� ��������������������' =~ $parser; done_testing(); |
From bbkr@post.pl
Looks like the test file broke during upload. |
From bbkr@post.pl
use utf8; our $result; my $parser = qr{ (?(DEFINE) 'zażółć gęślą jaźń' =~ $parser; '�������� ����������' =~ $parser; done_testing(); |
The RT System itself - Status changed from 'new' to 'open' |
From @tonycoz
On Wed Nov 05 12:50:31 2014, thoughtstream wrote:
Bisects to: bad - non-zero exit from ./perl -Ilib -e "\x{101c}" =~ /((?&TOP))(?{ Rationalise RX_MATCH_UTF8_set() :100644 100644 bcf2b80cce11b32936a84c2b317ac6bbbd294beb 4aa6de05ab79798138d010fc679e88644b066e00 M regexec.c |
From @cpansprout
On Wed Nov 05 18:32:18 2014, tonyc wrote:
Argh! You beat me to it. My test case is simpler: -e 'chr(256) =~ /(.)(?{ die unless $^N eq chr 256})/'
That logic is clearly wrong, though I would not have known any better without this failing test case. The commit should simply be reverted, and tests should be added. I plan to do both shortly. -- Father Chrysostomos |
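For anyone wanting to check their own perl, the one-liner above can be expanded into a small standalone script. This is only a sketch and not part of the ticket: the variable names `$target`, `$seen` and `$seen_is_utf8` are made up for illustration, and the only built-ins relied on are `$^N` and `utf8::is_utf8`.

```perl
use strict;
use warnings;

# chr(256) is a single character above U+00FF, so perl stores the
# string with its internal UTF-8 flag turned on.
my $target = chr(256);

# Illustrative names (not from the ticket): what the code block saw.
my ($seen, $seen_is_utf8);

# $^N is the text of the most recently closed capture group. The
# (?{ ... }) block runs during the match, which is where the
# regression showed up.
$target =~ /(.)(?{ $seen = $^N; $seen_is_utf8 = utf8::is_utf8($^N) ? 1 : 0 })/
    or die "pattern unexpectedly failed to match";

printf "length of value copied from \$^N: %d\n", length $seen;
printf "utf8 flag on \$^N inside the block: %d\n", $seen_is_utf8;
printf "copied value equals the target:  %s\n",
    ($seen eq $target ? "yes" : "no");
```

On a perl that includes the fix, the copied value should have length 1 and compare equal to the target. On an affected 5.20 build the expectation (an assumption drawn from the failing one-liner, not something re-verified here) is that the comparison fails because `$^N` is seen as bytes rather than characters.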
From @tonycoz
On Wed Nov 05 20:06:13 2014, sprout wrote:
Oops, I had a test and manual revert done before I noticed this. Tony |
From @tonycoz
0001-perl-123135-ensure-utf8-ness-propagated-to-N.patch
From 7cc4874d20d36acc8db4f0ce73dff62c1d044394 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 6 Nov 2014 15:17:08 +1100
Subject: [PATCH] [perl #123135] ensure utf8-ness propagated to $^N
Reverts 0254aed965cd47adab9025a192546e4a5e63873a and adds a test
---
regexec.c | 6 ++++--
t/re/pat.t | 9 ++++++++-
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/regexec.c b/regexec.c
index 37018ac..9f6291b 100644
--- a/regexec.c
+++ b/regexec.c
@@ -670,6 +670,8 @@ Perl_re_intuit_start(pTHX_
DEBUG_EXECUTE_r(PerlIO_printf(Perl_debug_log,
"Intuit: trying to determine minimum start position...\n"));
+ RX_MATCH_UTF8_set(rx,utf8_target);
+
/* for now, assume that all substr offsets are positive. If at some point
* in the future someone wants to do clever things with look-behind and
* -ve offsets, they'll need to fix up any code in this function
@@ -2630,6 +2632,8 @@ Perl_regexec_flags(pTHX_ REGEXP * const rx, char *stringarg, char *strend,
/* see how far we have to get to not match where we matched before */
reginfo->till = stringarg + minend;
+ RX_MATCH_UTF8_set(rx,utf8_target);
+
if (prog->extflags & RXf_EVAL_SEEN && SvPADTMP(sv)) {
/* SAVEFREESV, not sv_mortalcopy, as this SV must last until after
S_cleanup_regmatch_info_aux has executed (registered by
@@ -3137,8 +3141,6 @@ got_it:
if (RXp_PAREN_NAMES(prog))
(void)hv_iterinit(RXp_PAREN_NAMES(prog));
- RX_MATCH_UTF8_set(rx, utf8_target);
-
/* make sure $`, $&, $', and $digit will work later */
if ( !(flags & REXEC_NOT_FIRST) )
S_reg_set_capture_string(aTHX_ rx,
diff --git a/t/re/pat.t b/t/re/pat.t
index e532054..b3806f2 100644
--- a/t/re/pat.t
+++ b/t/re/pat.t
@@ -22,7 +22,7 @@ BEGIN {
skip_all_without_unicode_tables();
}
-plan tests => 755; # Update this when adding/deleting tests.
+plan tests => 756; # Update this when adding/deleting tests.
run_tests() unless caller;
@@ -1620,6 +1620,13 @@ EOP
use utf8;
like("ÿ", qr/[ÿ-ÿ]/, "\"ÿ\" should match [ÿ-ÿ]");
}
+
+ {
+ # [perl #123135]
+ my $result;
+ "\x{101c}" =~ /(.)(?{ $result = $^N })/;
+ is($result, "\x{101c}", 'check utf8 propagated to $^N');
+ }
} # End of sub run_tests
1;
--
1.7.10.4
|
From @cpansprout
Fixed in ab4e48c. Tests added in 9c6c921. (Oops. I just noticed a mistake in the commit message. I typed %^N instead of $^N.) This should be a candidate for maint-5.20. -- Father Chrysostomos |
@cpansprout - Status changed from 'open' to 'pending release' |
@cpansprout - Status changed from 'pending release' to 'open' |
From @khwilliamson
On 11/06/2014 03:36 PM, Tony Cook via RT wrote:
and me too |
From @cpansprout
On Thu Nov 06 15:09:56 2014, public@khwilliamson.com wrote:
I have backported it in commit 63100fa. -- Father Chrysostomos |
@cpansprout - Status changed from 'open' to 'pending release' |
From @khwilliamson
Thanks for submitting this ticket. The issue should be resolved with the release today of Perl v5.22, available at http://www.perl.org/get.html -- |
@khwilliamson - Status changed from 'pending release' to 'resolved' |
Migrated from rt.perl.org#123135 (status was 'resolved')
Searchable as RT123135$