Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failed: (rx->sublen >= (s - rx->subbeg) + i), function Perl_reg_numbered_buff_fetch #10488

Closed
p5pRT opened this issue Jul 16, 2010 · 20 comments

Comments

@p5pRT
Copy link
Collaborator

@p5pRT p5pRT commented Jul 16, 2010

Migrated from rt.perl.org#76538 (status was 'resolved')

Searchable as RT76538$

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jul 16, 2010

From @moonlibs

Created by @moonlibs

##### sample.pl #####
use utf8;
my @​make;
while (<DATA>) {
  chomp;
  push @​make, '';
  s{\\\s*$}{};
  $make[-1] .= $_;
}
for (@​make) {
  m{^([^=]+?)\s*=\s*(.+)$}o;
}
__DATA__
A=
AAAA=\
### end sample.pl ###

or

##### sample.pl #####
my @​x = ("A=B","AAAA=/");
utf8​::upgrade $_ for @​x;
$x[1] =~ s{/\s*$}{};
for (@​x) {
  m{^([^=]+?)\s*=.+$};
}
### end sample.pl ###

Assertion failed​: (rx->sublen >= (s - rx->subbeg) + i), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 5199.
Abort trap​: 6 (core dumped)

Repeatable under​:
  - v5.10.1 (*) built for i686-linux-thread-multi
  - v5.10.0 built for amd64-freebsd
  - v5.10.1 (*) built for amd64-freebsd
  - v5.12.0 built for amd64-freebsd
  - v5.13.2 built for amd64-freebsd

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.12.0:

Configured by mons at Fri May  7 14:44:18 MSD 2010.

Summary of my perl5 (revision 5 version 12 subversion 0) configuration:
   
  Platform:
    osname=freebsd, osvers=7.1-release-p2, archname=amd64-freebsd
    uname='freebsd veda.park.rambler.ru 7.1-release-p2 freebsd 7.1-release-p2 #0: thu feb 12 22:34:21 msk 2009 root@veda.park.rambler.ru:usrobjusrsrcsysdevel amd64 '
    config_args='-des -Dprefix=/home/mons -Duse64bitint -DDEBUG_LEAKING_SCALARS -DDEBUGGING -Dinc_version_list=none -Dccflags=-O2 -march=athlon64 -fomit-frame-pointer -pipe -ggdb -g3 -Doptimize=-O2 -g3'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-O2 -march=athlon64 -fomit-frame-pointer -pipe -ggdb -g3 -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -fstack-protector -I/usr/local/include',
    optimize='-O2 -g3',
    cppflags='-O2 -march=athlon64 -fomit-frame-pointer -pipe -ggdb -g3 -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 20070719  [FreeBSD]', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E  -fstack-protector -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib
    libs=-lgdbm -lm -lcrypt -lutil -lc
    perllibs=-lm -lcrypt -lutil -lc
    libc=, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fPIC', lddlflags='-shared  -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.12.0:
    /home/mons/perl/perl-ex
    /home/mons/lib/perl5/site_perl/5.12.0/amd64-freebsd
    /home/mons/lib/perl5/site_perl/5.12.0
    /home/mons/lib/perl5/5.12.0/amd64-freebsd
    /home/mons/lib/perl5/5.12.0
    .


Environment for perl 5.12.0:
    HOME=/home/mons
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/mons/home-bin:/home/mons/bin:/home/mons/flex/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/home/mons/bin
    PERL5LIB=/home/mons/perl/perl-ex
    PERL_BADLANG (unset)
    SHELL=/usr/local/bin/bash

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Dec 2, 2010

From holger.lehmann@catworkx.de

Same assertion fails when running the latest Ocsinventory Agent und
newer Perl versions. E.g. Perl v5.12.1 on openSuSE 11.3 x86_64
See here for details (different reporter, same problem)​:
http​://forums.ocsinventory-ng.org/viewtopic.php?id=7215

On Fr. 16. Jul. 2010, 05​:47​:19, mons wrote​:

This is a bug report for perl from mons@​cpan.org,
generated with the help of perlbug 1.39 running under perl 5.12.0.

-----------------------------------------------------------------
[Please describe your issue here]

##### sample.pl #####
use utf8;
my @​make;
while (<DATA>) {
chomp;
push @​make, '';
s{\\\s*$}{};
$make[-1] .= $_;
}
for (@​make) {
m{^([^=]+?)\s*=\s*(.+)$}o;
}
__DATA__
A=
AAAA=\
### end sample.pl ###

or

##### sample.pl #####
my @​x = ("A=B","AAAA=/");
utf8​::upgrade $_ for @​x;
$x[1] =~ s{/\s*$}{};
for (@​x) {
m{^([^=]+?)\s*=.+$};
}
### end sample.pl ###

Assertion failed​: (rx->sublen >= (s - rx->subbeg) + i), function
Perl_reg_numbered_buff_fetch, file regcomp.c, line 5199.
Abort trap​: 6 (core dumped)

Repeatable under​:
- v5.10.1 (*) built for i686-linux-thread-multi
- v5.10.0 built for amd64-freebsd
- v5.10.1 (*) built for amd64-freebsd
- v5.12.0 built for amd64-freebsd
- v5.13.2 built for amd64-freebsd

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags​:
category=core
severity=medium
---
Site configuration information for perl 5.12.0​:

Configured by mons at Fri May 7 14​:44​:18 MSD 2010.

Summary of my perl5 (revision 5 version 12 subversion 0)
configuration​:

Platform​:
osname=freebsd, osvers=7.1-release-p2, archname=amd64-freebsd
uname='freebsd veda.park.rambler.ru 7.1-release-p2 freebsd 7.1-
release-p2 #0​: thu feb 12 22​:34​:21 msk 2009
root@​veda.park.rambler.ru​:usrobjusrsrcsysdevel amd64 '
config_args='-des -Dprefix=/home/mons -Duse64bitint
-DDEBUG_LEAKING_SCALARS -DDEBUGGING -Dinc_version_list=none
-Dccflags=-O2 -march=athlon64 -fomit-frame-pointer -pipe -ggdb -g3
-Doptimize=-O2 -g3'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define,
usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler​:
cc='cc', ccflags ='-O2 -march=athlon64 -fomit-frame-pointer -pipe
-ggdb -g3 -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING
-fno-strict-aliasing -fstack-protector -I/usr/local/include',
optimize='-O2 -g3',
cppflags='-O2 -march=athlon64 -fomit-frame-pointer -pipe -ggdb -g3
-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DDEBUGGING -fno-strict-aliasing
-fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.2.1 20070719 [FreeBSD]',
gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define,
longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries​:
ld='cc', ldflags ='-Wl,-E -fstack-protector -L/usr/local/lib'
libpth=/usr/lib /usr/local/lib
libs=-lgdbm -lm -lcrypt -lutil -lc
perllibs=-lm -lcrypt -lutil -lc
libc=, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking​:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib
-fstack-protector'

Locally applied patches​:

---
@​INC for perl 5.12.0​:
/home/mons/perl/perl-ex
/home/mons/lib/perl5/site_perl/5.12.0/amd64-freebsd
/home/mons/lib/perl5/site_perl/5.12.0
/home/mons/lib/perl5/5.12.0/amd64-freebsd
/home/mons/lib/perl5/5.12.0
.

---
Environment for perl 5.12.0​:
HOME=/home/mons
LANG=en_US.UTF-8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/mons/home-

bin​:/home/mons/bin​:/home/mons/flex/bin​:/sbin​:/bin​:/usr/sbin​:/usr/bin​:/usr/games​:/usr/local/sbin​:/usr/local/bin​:/home/mons/bin

PERL5LIB=/home/mons/perl/perl\-ex
PERL\_BADLANG \(unset\)
SHELL=/usr/local/bin/bash
@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 25, 2011

From @iabyn

On Fri, Jul 16, 2010 at 05​:47​:19AM -0700, mons@​cpan.org wrote​:

##### sample.pl #####
use utf8;
my @​make;
while (<DATA>) {
chomp;
push @​make, '';
s{\\\s*$}{};
$make[-1] .= $_;
}
for (@​make) {
m{^([^=]+?)\s*=\s*(.+)$}o;
}
__DATA__
A=
AAAA=\
### end sample.pl ###

or

##### sample.pl #####
my @​x = ("A=B","AAAA=/");
utf8​::upgrade $_ for @​x;
$x[1] =~ s{/\s*$}{};
for (@​x) {
m{^([^=]+?)\s*=.+$};
}
### end sample.pl ###

Assertion failed​: (rx->sublen >= (s - rx->subbeg) + i), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 5199.
Abort trap​: 6 (core dumped)

Repeatable under​:
- v5.10.1 (*) built for i686-linux-thread-multi
- v5.10.0 built for amd64-freebsd
- v5.10.1 (*) built for amd64-freebsd
- v5.12.0 built for amd64-freebsd
- v5.13.2 built for amd64-freebsd

I can't reproduce the issue for the first sample code, but can with the
second, and its still present in 5.1.39 and blead.

--
You never really learn to swear until you learn to drive.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 25, 2011

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 25, 2011

From @khwilliamson

This appears to be fixed, as it no longer happens in blead, but does for
me on earlier 5.13 releases. I'm guessing that it is either my changes
to buffering look-up or DaveM's fixes to s///.

--Karl Williamson

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 26, 2011

From @iabyn

On Fri, Mar 25, 2011 at 01​:45​:11PM -0700, Karl Williamson via RT wrote​:

This appears to be fixed, as it no longer happens in blead, but does for
me on earlier 5.13 releases. I'm guessing that it is either my changes
to buffering look-up or DaveM's fixes to s///.

Bisect shows it to be the following. Does that look plausible to you?

commit 137165a
Author​: Karl Williamson <public@​khwilliamson.com>
Date​: Fri Feb 25 20​:10​:47 2011 -0700

  Free up bit in ANYOF flags

  This is the foundation for fixing the regression RT #82610. My analysis
  was wrong that two bits could be shared, at least not without further
  work. This changes to use a different mechanism to pass needed
  information to regexec.c so that another bit can be freed up and, in a
  later commit, the two bits can become unshared again.

  The bit that is freed up is ANYOF_UTF8, which basically said there is
  something that is matched outside the ANYOF bitmap, and requires the
  target string to be in utf8. This changes things so the existence of
  something besides the bitmap indicates this, and so no flag is needed.
  The flag bit ANYOF_NONBITMAP_NON_UTF8 remains to indicate that there is
  something that should be matched outside the bitmap even if the target
  string isn't in utf8.

--
Now is the discount of our winter tent
  -- sign seen outside camping shop

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 26, 2011

From @khwilliamson

On 03/26/2011 11​:25 AM, Dave Mitchell wrote​:

On Fri, Mar 25, 2011 at 01​:45​:11PM -0700, Karl Williamson via RT wrote​:

This appears to be fixed, as it no longer happens in blead, but does for
me on earlier 5.13 releases. I'm guessing that it is either my changes
to buffering look-up or DaveM's fixes to s///.

Bisect shows it to be the following. Does that look plausible to you?

commit 137165a
Author​: Karl Williamson<public@​khwilliamson.com>
Date​: Fri Feb 25 20​:10​:47 2011 -0700

 Free up bit in ANYOF flags

 This is the foundation for fixing the regression RT \#82610\.  My analysis
 was wrong that two bits could be shared\, at least not without further
 work\.  This changes to use a different mechanism to pass needed
 information to regexec\.c so that another bit can be freed up and\, in a
 later commit\, the two bits can become unshared again\.

 The bit that is freed up is ANYOF\_UTF8\, which basically said there is
 something that is matched outside the ANYOF bitmap\, and requires the
 target string to be in utf8\.  This changes things so the existence of
 something besides the bitmap indicates this\, and so no flag is needed\.
 The flag bit ANYOF\_NONBITMAP\_NON\_UTF8 remains to indicate that there is
 something that should be matched outside the bitmap even if the target
 string isn't in utf8\.

It's semi plausible. There have been a number of bugs in the optimizer
that that commit hid (and which have hopefully subsequently been fixed).
  The way to tell for sure, I suppose, is to look at the compiled and
executed regex before and after that commit.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 16, 2011

@iabyn - Status changed from 'open' to 'resolved'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 17, 2011

From @iabyn

I'm just forwarding this message with a [perl #...] added in the subject
line so that it gets picked up by RT.

----- Forwarded message from Michael Schroeder <mls@​suse.de> -----

Date​: Thu, 28 Apr 2011 12​:45​:40 +0200
From​: Michael Schroeder <mls@​suse.de>
To​: perl5-porters@​perl.org
Subject​: 5.14.0 assertion failure (RT#76538)
Message-ID​: <20110428104540.GA31563@​suse.de>

Hi Porters,

This is about RT#76538, "Assertion failed​: (rx->sublen >= (s
- rx->subbeg) + i), function Perl_reg_numbered_buff_fetch"

The following little test program still crashes for me​:

  my @​x = ("AX=B","AAAAAAX=");
  utf8​::upgrade($x[1]);
  for (@​x) {
  m{^([^=]+?)X\s*=.+$};
  print "-> $1\n";
  }

What happens is that $1 is already set when "AAAAAAX=" is
matched against the ^([^=]+?) part of the regexp, then
the swash needs to be created and Perl_save_re_context() is
called. save_re_context tries to save $1, but $1 is not
usable at the moment - it contains the new offsets (0-6), but
subbeg and sublen still point to the old match, as they
only get set at the end of the match.

I'm not sure about the best way to fix this. A quick and dirty
fix is to modify Perl_save_re_context() to not use save_scalar(),
but to use a variant that doesn't call SvGETMAGIC(). (The
current content of $1 needs to be saved, see #18107.)

Inline Patch
--- ./regcomp.c.orig	2011-04-27 14:19:37.000000000 +0000
+++ ./regcomp.c	2011-04-27 14:21:58.000000000 +0000
@@ -9912,8 +9912,23 @@ Perl_save_re_context(pTHX)
 
 		if (gvp) {
 		    GV * const gv = *gvp;
-		    if (SvTYPE(gv) == SVt_PVGV && GvSV(gv))
-			save_scalar(gv);
+		    if (SvTYPE(gv) == SVt_PVGV && GvSV(gv)) {
+			/* this is a copy of save_scalar() without the GETMAGIC call, RT#76538 */
+			SV ** const sptr = &GvSVn(gv);
+			SV * osv = *sptr;
+			SV * nsv = newSV(0);
+			save_pushptrptr(SvREFCNT_inc_simple(gv), SvREFCNT_inc(osv), SAVEt_SV);
+			if (SvTYPE(osv) >= SVt_PVMG && SvMAGIC(osv) && SvTYPE(osv) != SVt_PVGV) {
+			    if (SvGMAGICAL(osv)) {
+				const bool oldtainted = PL_tainted;
+				SvFLAGS(osv) |= (SvFLAGS(osv) &
+				    (SVp_IOK|SVp_NOK|SVp_POK)) >> PRIVSHIFT;
+				PL_tainted = oldtainted;
+			    }
+			    mg_localize(osv, nsv, 1);
+			}
+			*sptr = nsv;
+		    }
 		}
 	    }
 	}


A saner way would probably be to avoid the inconsistency between offs and subbeg/sublen\.

Cheers,
  Michael.

--
Michael Schroeder mls@​suse.de
SUSE LINUX Products GmbH, GF Markus Rex, HRB 16746 AG Nuernberg
main(_){while(_=getchar())putchar(_-1/(~(_|32)/13*2-11)*13);}

----- End forwarded message -----

--
You live and learn (although usually you just live).

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 17, 2011

@iabyn - Status changed from 'resolved' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 27, 2013

From @khwilliamson

On Tue May 17 03​:38​:38 2011, davem wrote​:

I'm just forwarding this message with a [perl #...] added in the
subject
line so that it gets picked up by RT.

----- Forwarded message from Michael Schroeder <mls@​suse.de> -----

Date​: Thu, 28 Apr 2011 12​:45​:40 +0200
From​: Michael Schroeder <mls@​suse.de>
To​: perl5-porters@​perl.org
Subject​: 5.14.0 assertion failure (RT#76538)
Message-ID​: <20110428104540.GA31563@​suse.de>

Hi Porters,

This is about RT#76538, "Assertion failed​: (rx->sublen >= (s
- rx->subbeg) + i), function Perl_reg_numbered_buff_fetch"

The following little test program still crashes for me​:

my @&#8203;x = \("AX=B"\,"AAAAAAX="\);
utf8&#8203;::upgrade\($x\[1\]\);
for \(@&#8203;x\) \{
  m\{^\(\[^=\]\+?\)X\\s\*=\.\+$\};
  print "\-> $1\\n";
\}

What happens is that $1 is already set when "AAAAAAX=" is
matched against the ^([^=]+?) part of the regexp, then
the swash needs to be created and Perl_save_re_context() is
called. save_re_context tries to save $1, but $1 is not
usable at the moment - it contains the new offsets (0-6), but
subbeg and sublen still point to the old match, as they
only get set at the end of the match.

I'm not sure about the best way to fix this. A quick and dirty
fix is to modify Perl_save_re_context() to not use save_scalar(),
but to use a variant that doesn't call SvGETMAGIC(). (The
current content of $1 needs to be saved, see #18107.)

--- ./regcomp.c.orig 2011-04-27 14​:19​:37.000000000 +0000
+++ ./regcomp.c 2011-04-27 14​:21​:58.000000000 +0000
@​@​ -9912,8 +9912,23 @​@​ Perl_save_re_context(pTHX)

     if \(gvp\) \{
         GV \* const gv = \*gvp;

- if (SvTYPE(gv) == SVt_PVGV && GvSV(gv))
- save_scalar(gv);
+ if (SvTYPE(gv) == SVt_PVGV && GvSV(gv)) {
+ /* this is a copy of save_scalar() without the GETMAGIC call,
RT#76538 */
+ SV ** const sptr = &GvSVn(gv);
+ SV * osv = *sptr;
+ SV * nsv = newSV(0);
+ save_pushptrptr(SvREFCNT_inc_simple(gv), SvREFCNT_inc(osv),
SAVEt_SV);
+ if (SvTYPE(osv) >= SVt_PVMG && SvMAGIC(osv) && SvTYPE(osv) !=
SVt_PVGV) {
+ if (SvGMAGICAL(osv)) {
+ const bool oldtainted = PL_tainted;
+ SvFLAGS(osv) |= (SvFLAGS(osv) &
+ (SVp_IOK|SVp_NOK|SVp_POK)) >> PRIVSHIFT;
+ PL_tainted = oldtainted;
+ }
+ mg_localize(osv, nsv, 1);
+ }
+ *sptr = nsv;
+ }
}
}
}

A saner way would probably be to avoid the inconsistency between
offs and subbeg/sublen.

Cheers,
Michael.

This test program no longer panics. The reason is that \s no longer
loads a swash, and so save_re_context() is not called in this example.
Other programs that fail could still be generated. \w and \d still load
swashes, though that may change in the future. Likely to "always" load
swashes are [​:alnum​:] and similar.
--
Karl Williamson

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 27, 2013

From @khwilliamson

Looking into this further makes me wonder if something I've done is
problematic.

I tried to construct a test case that would panic with current blead,
and couldn't. First off, things have been restructured so that swashes
aren't loaded unless absolutely necessary. That requires something
above the Latin1 range. So just changing the \s to a \d, for example,
doesn't load a swash. Changing the '=' in the pattern to something
above Latin1 (along with the \d change) does load a swash. But I've
changed regexec.c so that it no longer calls save_re_context() in this
situation; it just loads the swash. And that makes me wonder if I'm wrong.

Coming into a project, as I've done, and making patches without grokking
the bigger picture is problematic. Unfortunately, there is little
bigger picture documentation for Perl, and none that I know of that
explains the need for save_re_context(). What I've done is
cargo-cult-follow current practice, until something breaks and I have to
learn why. This may be one of those instances.

The code I first read in regexec.c that loads swashes is in
regclass_swash(). It doesn't do a save_re_context(). It never occurred
to me to wonder why that was ok, but in other places, save_re_context()
was required. Looking now, the commit message from Larry, from 1998
adding regclass_swash, is unhelpful to me. The comments say basically,
only that it is a total kludge. If someone would enlighten me as to why
this is needed in some places, and not needed when in regclass_swash(),
I would be grateful.

My best guess is that it was because of utf8_heavy.pl potentially trying
to load a swash while itself being called from loading a swash. If so,
then it probably can be gotten rid of, as the restructuring causes no
swashes to be needed in processing utf8_heavy.pl. I can, and probably
should, put a 'use re "/aa"' at the beginning of it, and all tests pass.

(Any pattern that has the /aa modifiers will never load a swash, and
will compile and execute faster than those sans those modifiers.)

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 27, 2013

From @greerga

On Sun, 27 Jan 2013, Karl Williamson via RT wrote​:

Looking into this further makes me wonder if something I've done is
problematic.

I tried to construct a test case that would panic with current blead,
and couldn't. [...]

RT #60508 and e9105d3 fixed an occurrence
of this error starting 5.12 so the report of that error under 5.10.x isn't
surprising. I wouldn't be surprised if another code path had the same
sort of "swash loaded while already capturing values" result.

I can't get the example programs to fail with blead
(v5.17.7.0-155-g0eb30ae) either but of course make sure you have a
-DDEBUGGING build or the assert won't be there.

--
George Greer

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 29, 2013

From @iabyn

On Sun, Jan 27, 2013 at 10​:11​:20AM -0800, Karl Williamson via RT wrote​:

The code I first read in regexec.c that loads swashes is in
regclass_swash(). It doesn't do a save_re_context(). It never occurred
to me to wonder why that was ok, but in other places, save_re_context()
was required. Looking now, the commit message from Larry, from 1998
adding regclass_swash, is unhelpful to me. The comments say basically,
only that it is a total kludge. If someone would enlighten me as to why
this is needed in some places, and not needed when in regclass_swash(),
I would be grateful.

I can't answer your direct question, but I'll just make a general
observation.

Originally the regex engine stored all its internal state in global
variables. Over the years as the engine got more complex, and things like
(??{...}) and dynamically loadable swashes were added, it needed to become
re-entrant, and assorted ad-hoc 'save current regex state' bits of code
were added, in a hacky and inconsistent way. It's my long term goal to make
all state local, and so eliminate the need to save state; but until then
we're stuck with it.

Cargo-culting what's already there will probably be buggy, but no more
buggy than anything else.

--
"There's something wrong with our bloody ships today, Chatfield."
  -- Admiral Beatty at the Battle of Jutland, 31st May 1916.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 31, 2013

From @khwilliamson

On 01/29/2013 10​:14 AM, Dave Mitchell wrote​:

On Sun, Jan 27, 2013 at 10​:11​:20AM -0800, Karl Williamson via RT wrote​:

The code I first read in regexec.c that loads swashes is in
regclass_swash(). It doesn't do a save_re_context(). It never occurred
to me to wonder why that was ok, but in other places, save_re_context()
was required. Looking now, the commit message from Larry, from 1998
adding regclass_swash, is unhelpful to me. The comments say basically,
only that it is a total kludge. If someone would enlighten me as to why
this is needed in some places, and not needed when in regclass_swash(),
I would be grateful.

I can't answer your direct question, but I'll just make a general
observation.

Originally the regex engine stored all its internal state in global
variables. Over the years as the engine got more complex, and things like
(??{...}) and dynamically loadable swashes were added, it needed to become
re-entrant, and assorted ad-hoc 'save current regex state' bits of code
were added, in a hacky and inconsistent way. It's my long term goal to make
all state local, and so eliminate the need to save state; but until then
we're stuck with it.

Cargo-culting what's already there will probably be buggy, but no more
buggy than anything else.

This explains a lot. It turns out however, that the code in utf8.c that
loads in a swash does a save_re_context(). So the ones in regexec.c are
extraneous; perhaps doing it twice has negative consequences, and could
explain some bugs.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jul 25, 2013

From @khwilliamson

On Wed Jan 30 19​:43​:51 2013, public@​khwilliamson.com wrote​:

On 01/29/2013 10​:14 AM, Dave Mitchell wrote​:

On Sun, Jan 27, 2013 at 10​:11​:20AM -0800, Karl Williamson via RT
wrote​:

The code I first read in regexec.c that loads swashes is in
regclass_swash(). It doesn't do a save_re_context(). It never
occurred
to me to wonder why that was ok, but in other places,
save_re_context()
was required. Looking now, the commit message from Larry, from
1998
adding regclass_swash, is unhelpful to me. The comments say
basically,
only that it is a total kludge. If someone would enlighten me as
to why
this is needed in some places, and not needed when in
regclass_swash(),
I would be grateful.

I can't answer your direct question, but I'll just make a general
observation.

Originally the regex engine stored all its internal state in global
variables. Over the years as the engine got more complex, and things
like
(??{...}) and dynamically loadable swashes were added, it needed to
become
re-entrant, and assorted ad-hoc 'save current regex state' bits of
code
were added, in a hacky and inconsistent way. It's my long term goal
to make
all state local, and so eliminate the need to save state; but until
then
we're stuck with it.

Cargo-culting what's already there will probably be buggy, but no
more
buggy than anything else.

This explains a lot. It turns out however, that the code in utf8.c
that
loads in a swash does a save_re_context(). So the ones in regexec.c
are
extraneous; perhaps doing it twice has negative consequences, and
could
explain some bugs.

Now that save_re_context() is gone, can we close this ticket?
--
Karl Williamson

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jul 26, 2013

From @jkeenan

On Wed Jul 24 18​:37​:48 2013, khw wrote​:

Now that save_re_context() is gone, can we close this ticket?

With Perl 5.18.0 (Darwin/PPC), I get no failure or crash with either of
the OP's programs. On that basis, +1 to closing.

Thank you very much.
Jim Keenan

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jul 27, 2013

From @iabyn

On Wed, Jul 24, 2013 at 06​:37​:49PM -0700, Karl Williamson via RT wrote​:

Now that save_re_context() is gone, can we close this ticket?

Note that save_re_context() hasn't actually gone. All the parts of it
associated with saving and restoring global regex state in PL_ vars
has been stripped out, but it's still responsible for doing the equivalent
of local ($1, $2, ...) which is probably wrong and needs looking at at
some point, and is on my Big List of Things to Look At Sometime.

So I'm not sure whether the bug is fully resolved yet

--
Diplomacy is telling someone to go to hell in such a way that they'll
look forward to the trip

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 2, 2015

From @khwilliamson

It turns out that this has been fixed since 5.14. The first part of the script bisects to
ef87b81
and the second to
356123f

--
Karl Williamson

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Mar 2, 2015

@khwilliamson - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.