[regex] backref problem with quantified groups #8267
Comments
From edi@agharta.de
Created by edi@agharta.de
edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /((a)*)*/; print Dumper $1, $2'
Obviously, $2 should either be undef or 'a' in _both_ cases. I think <http://groups.google.com/groups?selm=87zns15gal.fsf%40bird.agharta.de&rnum=7>
Perl Info
|
From @andk
> # New Ticket Created by edi@agharta.de
> This is a bug report for perl from edi@agharta.de,
> -----------------------------------------------------------------
> edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /((a)*)*/; print Dumper $1, $2'
Archaeological findings about this bug... It was introduced to the trunk with patch 6373. The bug was also integrated into 5.6.1 with patch 7772. (Note: 7772 Simply undoing the regexec.c part of that patch fixes the bug but also
not ok 860 () ^(a(b)?)+$:aba:y:-$1-$2-:-a-- => `-a-b-', match=1
The patch I tried was:
#### DO NOT APPLY ####
Inline Patch
--- perl-5.8.0@18217/regexec.c Fri Nov 29 21:38:04 2002
+++ perl-5.8.0@18217-ak/regexec.c Sun Dec 1 18:31:08 2002
@@ -293,8 +293,6 @@
PL_regstartp[paren] = HOPc(input, -1) - PL_bostr; \
PL_regendp[paren] = input - PL_bostr; \
} \
- else \
- PL_regendp[paren] = -1; \
} \
if (regmatch(next)) \
sayYES; \
|
From @hvds
andreas.koenig@anima.de (Andreas J. Koenig) wrote:
Digging a bit further, the actual patch was submitted (by me) in the
The difference between the two test cases is that for /((a)*)*/, the
Hugo
|
From eric.niebler@gmail.com
Created by eric.niebler@gmail.com
This is a bug report for perl from eric.niebler@gmail.com,
-----------------------------------------------------------------
$str = 'aaA';
This prints "not defined," and I think that's right. I can't think of a reason why changing group 3 from
Perl Info
|
From @hvds
Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
I agree the inconsistency smells like a bug, though it isn't clear
-Dr output shows that the two regexps are optimised differently:
The results may be reasonable nonetheless - we could in principle
Hugo
|
The RT System itself - Status changed from 'new' to 'open' |
From @Abigail
On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
The following program suggests that in both regexes, the outer set of
#!/usr/bin/perl
use strict;
$_ = 'aaA';
1: a
Abigail
|
From @ysth
On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
To me, it seems clear that on the last iteration of the +, the ? |
From @Abigail
On Tue, Jan 03, 2006 at 03:59:19PM -0800, Yitzchak Scott-Thoennes wrote:
That I don't understand. Since the ?: controls whether or not there's Abigail |
From @ysth
On Wed, Jan 04, 2006 at 09:48:14AM +0100, Abigail wrote:
Sorry, I was somehow assigning numbers from the inside out instead of |
From eric.niebler@gmail.com
Yitzchak Scott-Thoennes wrote:
There appears to be general agreement that this is a bug. But will it Eric |
From @hvds
Eric Niebler <eric.niebler@gmail.com> wrote:
Now it waits until someone simultaneously acquires the time, ability and
If a fix is developed it will go into the "bleeding edge" codebase first,
But there are few people with the knowledge to debug problems in the
Hugo
|
From @cpansprout
perl -MData::Dumper -le '"aba" =~ /^(a(b)?)+$/; print Dumper $1, $2;'
This is the case because the outer + makes the subexpression
But if I change (b) to (b+) or ((b)), the behaviour changes:
perl -MData::Dumper -le '"aba" =~ /^(a(b+)?)+$/; print Dumper $1, $2;'
perl -MData::Dumper -le '"aba" =~ /^(a((b))?)+$/; print Dumper $1, $2;'
(Though this probably makes no difference, if this is to be made
This is the case both with 5.8.8 and 5.9.5 #31441.
$s = "Juusstt aannootthheerr Peerrll hhaacckkeerr,\n";
Flags:
Site configuration information for perl v5.8.8:
Configured by neo at Tue Jan 9 16:06:53 PST 2007.
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Locally applied patches:
@INC for perl v5.8.8:
Environment for perl v5.8.8:
|
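The three one-liners in the comment above can be run side by side with a short script. This is only a minimal sketch, not part of the original report: the helper name `second_capture` is hypothetical, and no expected output is shown because the value of `$2` for these patterns depends on the perl version in use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the report): match $str against $re and
# return the second capture group, or undef if the match fails.
sub second_capture {
    my ($str, $re) = @_;
    my @caps = $str =~ $re;    # list context yields ($1, $2, ...)
    return @caps ? $caps[1] : undef;
}

# The three inner-group variants mentioned in the comment above.
my @variants = (
    [ '(b)',   qr/^(a(b)?)+$/   ],
    [ '(b+)',  qr/^(a(b+)?)+$/  ],
    [ '((b))', qr/^(a((b))?)+$/ ],
);

for my $variant (@variants) {
    my ($label, $re) = @$variant;
    my $two = second_capture('aba', $re);
    printf "%-6s \$2 = %s\n", $label, defined $two ? "'$two'" : 'undef';
}
```

If the three lines printed by this script disagree, the perl being used still shows the inconsistency described in this ticket.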
From p5p@spam.wizbit.be
Attached is a patch with a todo test for this bug report.
Summary of the report:
#!/usr/bin/perl -l
if ("A" =~ /(((?:A))?)+/) {
if ("A" =~ /(((A))?)+/) {
$1 = , $2 = , $3 =
The value of the second capture group depends on whether or not there is a
The value should be the same in both cases. (For more info look at RT)
|
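The comparison summarised above can be sketched as a standalone script. This is an illustration under stated assumptions rather than the reporter's original program (which is truncated here): the helpers `captures`, `same_capture` and `show` are hypothetical names, and the script reports whether the two patterns agree without assuming which answer a given perl will produce.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the list of capture groups for a match
# (empty list if the match fails).
sub captures {
    my ($str, $re) = @_;
    my @caps = $str =~ $re;
    return @caps;
}

# Hypothetical helper: compare two capture values, treating undef as
# equal only to undef.
sub same_capture {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;
    return 0 if !defined $x || !defined $y;
    return $x eq $y ? 1 : 0;
}

# Hypothetical helper: render a capture value for printing.
sub show { my ($v) = @_; return defined $v ? "'$v'" : 'undef' }

# Same outer structure; only the innermost group differs
# (non-capturing in the first pattern, capturing in the second).
my @plain  = captures('A', qr/(((?:A))?)+/);
my @nested = captures('A', qr/(((A))?)+/);

print "non-capturing inner group: \$2 = ", show($plain[1]),  "\n";
print "capturing inner group:     \$2 = ", show($nested[1]), "\n";
print same_capture($plain[1], $nested[1])
    ? "consistent\n"
    : "inconsistent (the behaviour reported in this ticket)\n";
```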
From p5p@spam.wizbit.be
Inline Patch
--- old/t/op/pat.t 2008-05-24 23:15:39.000000000 +0200
+++ new/t/op/pat.t 2008-05-24 23:16:15.000000000 +0200
@@ -4642,6 +4642,17 @@
iseq( join('', @isPunctLatin1), '',
'IsPunct agrees with [:punct:] with explicit Latin1');
}
+{
+ local $TODO = "[perl #38133]";
+
+ "A" =~ /(((?:A))?)+/;
+ my $first = $2;
+
+ "A" =~ /(((A))?)+/;
+ my $second = $2;
+
+ iseq($first, $second);
+}
# Test counter is at bottom of file. Put new tests above here.
@@ -4705,7 +4716,7 @@
# Don't forget to update this!
BEGIN {
- $::TestCount = 4035;
+ $::TestCount = 4036;
print "1..$::TestCount\n";
}
|
From @khwilliamson
Commit 72aa120
|
Migrated from rt.perl.org#38133 (status was 'open')
Searchable as RT38133$