Perl segfault in regex engine in integer overflow situation #14858

Closed
p5pRT opened this issue Aug 17, 2015 · 10 comments

@p5pRT p5pRT commented Aug 17, 2015

Migrated from rt.perl.org#125826 (status was 'resolved')

Searchable as RT125826$

@p5pRT p5pRT commented Aug 17, 2015

From @dcollinsn

Greetings,

The afl fuzzer has identified the following testcase, which causes a segmentation fault in the regular expression engine in perl and miniperl (oddly, only when run WITHOUT -Ilib). The testcase is the 17-character file:

/\x{E000000000}|/

GDB output places the segfault within malloc, which isn't very helpful, but (hopefully) successfully isolates the crash to Perl_regexec_flags. Valgrind appears to be describing a buffer overflow. Old versions of perl throw "integer overflow in hexadecimal number" but do not segfault. Git bisect identifies a commit which appears to be a significant overhaul of part of the regular expression engine.
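
For scale: the code point in the testcase, 0xE000000000, needs 40 bits, so it can only be represented where the integer type is wider than 32 bits. The sketch below shows one way a width-checked hex parse could accept it for a 64-bit integer and reject it for a 32-bit one. `parse_hex_checked` is a hypothetical helper written for this illustration, not Perl's own parser, and this report does not establish that this is the reason older perls printed the overflow message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not Perl's parser): parse the hex digits of a
 * \x{...} escape into *out, failing if the value would not fit in an
 * unsigned integer of width_bits bits. */
static bool parse_hex_checked(const char *digits, unsigned width_bits,
                              uint64_t *out)
{
    assert(width_bits >= 4 && width_bits <= 64);

    const uint64_t max = (width_bits == 64)
                             ? UINT64_MAX
                             : ((UINT64_C(1) << width_bits) - 1);
    uint64_t value = 0;

    for (const char *p = digits; *p != '\0'; p++) {
        unsigned nibble;
        if (*p >= '0' && *p <= '9')
            nibble = (unsigned)(*p - '0');
        else if (*p >= 'a' && *p <= 'f')
            nibble = (unsigned)(*p - 'a' + 10);
        else if (*p >= 'A' && *p <= 'F')
            nibble = (unsigned)(*p - 'A' + 10);
        else
            return false;           /* not a hex digit */

        /* Shifting in another nibble would exceed the width: overflow. */
        if (value > (max >> 4))
            return false;
        value = (value << 4) | nibble;
    }

    *out = value;                   /* only written on success */
    return true;
}
```

Calling `parse_hex_checked("E000000000", 32, &v)` fails on the ninth digit, while the same call with a width of 64 succeeds and stores 0xE000000000.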

**GDB**

GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/perl-afl/bin/perl...done.
(gdb) run
Starting program: /usr/local/perl-afl/bin/perl allcrash/f4i000012
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0xb7e6321e in malloc_consolidate (av=<value optimized out>) at malloc.c:5153
5153 malloc.c: No such file or directory.
  in malloc.c
(gdb) bt
#0 0xb7e6321e in malloc_consolidate (av=<value optimized out>)
  at malloc.c:5153
#1 0xb7e654dd in _int_malloc (av=<value optimized out>,
  bytes=<value optimized out>) at malloc.c:4373
#2 0xb7e6797c in *__GI___libc_malloc (bytes=4056) at malloc.c:3661
#3 0x082c4248 in Perl_safesysmalloc (size=4056) at util.c:149
#4 0x085cce42 in Perl_regexec_flags (rx=0x873c3d8, stringarg=0x86df082 "",
  strend=0x86df082 "", strbeg=0x86df082 "", minend=0, sv=0x872b268,
  data=0x0, flags=97) at regexec.c:2979
#5 0x08366221 in Perl_pp_match () at pp_hot.c:1486
#6 0x0835d66b in Perl_runops_standard () at run.c:41
#7 0x08106879 in S_run_body (my_perl=0x8729008) at perl.c:2448
#8 perl_run (my_perl=0x8729008) at perl.c:2371
#9 0x08065d7e in main (argc=2, argv=0xbffff4e4, env=0xbffff4f0)
  at perlmain.c:116

**VALGRIND**

==9388== Memcheck, a memory error detector
==9388== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==9388== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==9388== Command: ../bin/perl allcrash/f4i000012
==9388==
==9388== Invalid write of size 1
==9388==    at 0x85D2ED6: Perl_uvoffuni_to_utf8_flags (utf8.c:231)
==9388==    by 0x8213DC9: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==  Address 0x4237342 is 0 bytes after a block of size 10 alloc'd
==9388==    at 0x4023F50: malloc (vg_replace_malloc.c:236)
==9388==    by 0x82C4247: Perl_safesysmalloc (util.c:149)
==9388==    by 0x83C4803: Perl_sv_grow (sv.c:1624)
==9388==    by 0x83C5B72: Perl_newSV (sv.c:5531)
==9388==    by 0x8213D97: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==
==9388== Invalid write of size 1
==9388==    at 0x85D2EF4: Perl_uvoffuni_to_utf8_flags (utf8.c:232)
==9388==    by 0x8213DC9: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==  Address 0x4237343 is 1 bytes after a block of size 10 alloc'd
==9388==    at 0x4023F50: malloc (vg_replace_malloc.c:236)
==9388==    by 0x82C4247: Perl_safesysmalloc (util.c:149)
==9388==    by 0x83C4803: Perl_sv_grow (sv.c:1624)
==9388==    by 0x83C5B72: Perl_newSV (sv.c:5531)
==9388==    by 0x8213D97: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==
==9388== Invalid write of size 1
==9388==    at 0x85D2F04: Perl_uvoffuni_to_utf8_flags (utf8.c:233)
==9388==    by 0x8213DC9: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==  Address 0x4237344 is 2 bytes after a block of size 10 alloc'd
==9388==    at 0x4023F50: malloc (vg_replace_malloc.c:236)
==9388==    by 0x82C4247: Perl_safesysmalloc (util.c:149)
==9388==    by 0x83C4803: Perl_sv_grow (sv.c:1624)
==9388==    by 0x83C5B72: Perl_newSV (sv.c:5531)
==9388==    by 0x8213D97: S_make_trie (regcomp.c:2374)
==9388==    by 0x823F2FD: T.1957 (regcomp.c:4316)
==9388==    by 0x828B5CE: Perl_re_op_compile (regcomp.c:7232)
==9388==    by 0x80D50A7: Perl_pmruntime (op.c:5579)
==9388==    by 0x81CE567: Perl_yyparse (perly.y:1038)
==9388==    by 0x810F4AE: S_parse_body (perl.c:2296)
==9388==    by 0x81128C8: perl_parse (perl.c:1626)
==9388==    by 0x8065B84: main (perlmain.c:114)
==9388==
==9388==
==9388== HEAP SUMMARY:
==9388==     in use at exit: 93,910 bytes in 591 blocks
==9388==   total heap usage: 793 allocs, 202 frees, 117,673 bytes allocated
==9388==
==9388== LEAK SUMMARY:
==9388==    definitely lost: 0 bytes in 0 blocks
==9388==    indirectly lost: 0 bytes in 0 blocks
==9388==      possibly lost: 12,727 bytes in 292 blocks
==9388==    still reachable: 81,183 bytes in 299 blocks
==9388==         suppressed: 0 bytes in 0 blocks
==9388== Rerun with --leak-check=full to see details of leaked memory
==9388==
==9388== For counts of detected and suppressed errors, rerun with: -v
==9388== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 25 from 8)
dcollins@nagios:/usr/local/perl-afl/out$
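
The three invalid writes in the Valgrind output land 0, 1 and 2 bytes past a 10-byte block, which is what a 13-byte write into that block would produce. A rough sketch of the arithmetic follows. It assumes a Perl-style extended UTF-8 in which code points at or above 2**36 take a 13-byte form; the thresholds and the function names are assumptions made for illustration and are not copied from utf8.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode cp, under the assumed extended-UTF-8 thresholds. */
static size_t ext_utf8_len(uint64_t cp)
{
    if (cp < UINT64_C(0x80))         return 1;
    if (cp < UINT64_C(0x800))        return 2;
    if (cp < UINT64_C(0x10000))      return 3;
    if (cp < UINT64_C(0x200000))     return 4;
    if (cp < UINT64_C(0x4000000))    return 5;
    if (cp < UINT64_C(0x80000000))   return 6;
    if (cp < UINT64_C(0x1000000000)) return 7;   /* below 2**36 */
    return 13;                                   /* assumed long form */
}

/* How many bytes of a write_len-byte write fall past a block's end. */
static size_t bytes_past_end(size_t write_len, size_t block_size)
{
    return write_len > block_size ? write_len - block_size : 0;
}

/* Self-check tying the numbers to the Valgrind report. */
static void check_against_report(void)
{
    const uint64_t cp = UINT64_C(0xE000000000);  /* from the testcase */

    assert(ext_utf8_len(cp) == 13);
    /* Block of size 10, so three bytes (offsets 10, 11, 12) overrun. */
    assert(bytes_past_end(ext_utf8_len(cp), 10) == 3);
}
```

Under these assumptions the testcase's code point is the smallest class of input that no longer fits a 7-byte encoding, which would explain why ordinary Unicode patterns never hit the overrun.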

**PERL -V**

Summary of my perl5 (revision 5 version 23 subversion 2) configuration:
  Derived from: 9728ed0
  Platform:
  osname=linux, osvers=2.6.32-5-686, archname=i686-linux-64int-ld
  uname='linux nagios 2.6.32-5-686 #1 smp tue may 13 16:33:32 utc 2014 i686 gnulinux '
  config_args=''
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=define, use64bitall=undef, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler:
  cc='afl-gcc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-g',
  cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.4.5', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678, doublekind=3
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12, longdblkind=3
  ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
  alignbytes=4, prototype=define
  Linker and Libraries:
  ld='afl-gcc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/i486-linux-gnu/4.4.5/include-fixed /usr/lib /lib/../lib /usr/lib/../lib /lib /usr/lib/i486-linux-gnu /usr/lib64
  libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.11.3.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.11.3'
  Dynamic Linking:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP
  PERL_PRESERVE_IVUV USE_64_BIT_INT USE_LARGE_FILES
  USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_LONG_DOUBLE
  USE_PERLIO USE_PERL_ATOF
  Locally applied patches:
  uncommitted-changes
  Built under linux
  Compiled at Aug 11 2015 16:38:21
  @INC:
  /usr/local/perl-afl/lib/site_perl/5.23.2/i686-linux-64int-ld
  /usr/local/perl-afl/lib/site_perl/5.23.2
  /usr/local/perl-afl/lib/5.23.2/i686-linux-64int-ld
  /usr/local/perl-afl/lib/5.23.2
  .

**BISECT**

cdd87c1 is the first bad commit
commit cdd87c1
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Sep 22 21:36:29 2013 -0600

  Teach regex optimizer to handle above-Latin1
 
  Until this commit, the regular expression optimizer has essentially
  punted on above-Latin1 code points. Under some circumstances, they
  would be taken into account, more or less, but often, the generated
  synthetic start class would end up matching all above-Latin1 code
  points. With the advent of inversion lists, it becomes feasible to
  actually fully handle such code points, as inversion lists are a
  convenient way to express arbitrary lists of code points and take their
  union, intersection, etc. This commit changes the optimizer to use
  inversion lists for operating on the code points the synthetic start
  class can match.
 
  I don't much understand the overall operation of the optimizer. I'm
  told that previous porters found that perturbing it caused unexpected
  behaviors. I had promised to get this change in 5.18, but didn't. I'm
  trying to get it in early enough into the 5.20 preliminary series that
  any problems will surface before 5.20 ships.
 
  This commit doesn't change the macro level logic, but does significantly
  change various micro level things. Thus the 'and' and 'or' subroutines
  have been rewritten to use inversion lists. I'm pretty confident that
  they do what their names suggest. I re-derived the equations for what
  these operations should do, getting the same results in some cases, but
  extending others where the previous code mostly punted. The derivations
  are given in comments in the respective routines.
 
  Some of the code is greatly simplified, as it no longer has to treat
  above-Latin1 specially.
 
  It is now feasible for /i matching of above-Latin1 code points to know
  explicitly the folds that should be in the synthetic start class. But
  more prepatory work needs to be done before putting that into place.
  ...

:100644 100644 ec203f9c1f3ea42c65324e632c746042c32954f1 3dd62f946eedd99f87489fecfeb1acd86e2d250b M embed.fnc
:100644 100644 fca8736feb457d00987232649f8b563aef4f24fa 5fc9171d233fd06b4d6cc08570cdb41f976d0b0b M embed.h
:100644 100644 568cdf733c7784cc3f5a059583d9ae9a262ae72a 33d00ef42cf01836e9365e4a9086cf64b678b74b M proto.h
:100644 100644 ec24583f1f7bec42e186a4241353b9688bf6f2de efefd0a17ba14144286dc55c149d6ee422b18249 M regcomp.c
:100644 100644 f0153fc12c842c26d928d3303feb519430b73a07 eccb46690a4d252fb107044116973e3baa0cbd2c M regcomp.h
bisect run success
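
The commit message above leans on inversion lists. As a rough illustration of the idea only: an inversion list is a sorted array of code points in which each element marks where membership flips, so elements at even indexes start a range that is in the set and elements at odd indexes start a range that is not. The sketch below covers membership testing; `invlist_contains` is a hypothetical name, not Perl's internal API, and union and intersection are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Membership test on an inversion list of n sorted code points.
 * Hypothetical helper for illustration. */
static bool invlist_contains(const uint64_t *list, size_t n, uint64_t cp)
{
    assert(list != NULL || n == 0);

    /* Binary search for the number of elements that are <= cp. */
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* An odd count means the last flip at or before cp turned
     * membership on, so cp lies inside an included range. */
    return (lo & 1) != 0;
}
```

For example, the three-element list `{0x41, 0x5B, 0x100}` describes "A through Z, plus everything from 0x100 upward": 0x5A is a member, 0x5B and 0xFF are not, and a very large code point such as 0xE000000000 is.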

@p5pRT p5pRT commented Aug 18, 2015

From @tonycoz

On Sun Aug 16 20:56:17 2015, dcollinsn@gmail.com wrote:

> Greetings,
>
> The afl fuzzer has identified the following testcase which causes a
> segmentation fault in the regular expression engine in perl and
> miniperl (oddly, only when run WITHOUT -Ilib). The testcase is the
> 17-character file:
>
> /\x{E000000000}|/

Here's a fix (attached).

Tony

@p5pRT p5pRT commented Aug 18, 2015

From @tonycoz

0001-perl-125826-make-the-buffer-large-enough-in-TRIE_STO.patch
From ab3f825e8c3d0b6f70faac9d6b3552923bd511d0 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Tue, 18 Aug 2015 12:11:12 +1000
Subject: [PATCH] [perl #125826] make the buffer large enough in
 TRIE_STORE_REVCHAR

---
 regcomp.c           | 2 +-
 t/re/pat_advanced.t | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/regcomp.c b/regcomp.c
index f08f08f..c052cc7 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -2001,7 +2001,7 @@ is the recommended Unicode-aware way of saying
 #define TRIE_STORE_REVCHAR(val)                                            \
     STMT_START {                                                           \
 	if (UTF) {							   \
-            SV *zlopp = newSV(7); /* XXX: optimize me */                   \
+            SV *zlopp = newSV(UTF8_MAXBYTES); /* XXX: optimize me */       \
 	    unsigned char *flrbbbbb = (unsigned char *) SvPVX(zlopp);	   \
             unsigned const char *const kapow = uvchr_to_utf8(flrbbbbb, val); \
 	    SvCUR_set(zlopp, kapow - flrbbbbb);				   \
diff --git a/t/re/pat_advanced.t b/t/re/pat_advanced.t
index 230fd89..33647f3 100644
--- a/t/re/pat_advanced.t
+++ b/t/re/pat_advanced.t
@@ -2419,6 +2419,15 @@ EOF
                         'No segfault on qr{(?&foo){0}abc(?<foo>)}');
     }
 
+    SKIP:
+    {   # [perl #125826] buffer overflow in TRIE_STORE_REVCHAR
+        # (during compilation, so use a fresh perl)
+        $Config{uvsize} == 8
+	  or skip("need large code-points for this test", 1);
+	fresh_perl_is('/\x{E000000000}|/ and print qq(ok\n)', "ok\n", {},
+		      "buffer overflow in TRIE_STORE_REVCHAR");
+    }
+
     # !!! NOTE that tests that aren't at all likely to crash perl should go
     # a ways above, above these last ones.
 
-- 
2.5.0
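
The patch sizes the scratch buffer for the worst case instead of a fixed 7. A minimal sketch of that pattern, outside Perl: an encoder that is told the buffer's capacity and refuses to write when the encoding would not fit. Everything here is an assumption made for illustration: `EXT_UTF8_MAXBYTES` stands in for the patch's `UTF8_MAXBYTES` (taken to be 13, consistent with the Valgrind offsets in the report), and `ext_utf8_encode` is a hypothetical function, not Perl's `uvchr_to_utf8`, which takes no capacity argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed worst-case length, standing in for the patch's UTF8_MAXBYTES. */
#define EXT_UTF8_MAXBYTES 13

/* Encoded length under the assumed extended-UTF-8 thresholds. */
static size_t ext_utf8_len(uint64_t cp)
{
    static const uint64_t limits[] = {
        UINT64_C(0x80),      UINT64_C(0x800),      UINT64_C(0x10000),
        UINT64_C(0x200000),  UINT64_C(0x4000000),  UINT64_C(0x80000000),
        UINT64_C(0x1000000000)
    };
    for (size_t i = 0; i < sizeof limits / sizeof limits[0]; i++)
        if (cp < limits[i])
            return i + 1;
    return EXT_UTF8_MAXBYTES;
}

/* Encode cp into buf (capacity cap). Returns a pointer one past the last
 * byte written, or NULL if the encoding would not fit. */
static unsigned char *ext_utf8_encode(unsigned char *buf, size_t cap,
                                      uint64_t cp)
{
    assert(buf != NULL);

    const size_t len = ext_utf8_len(cp);
    if (len > cap)
        return NULL;                 /* refuse instead of overrunning */

    if (len == 1) {
        buf[0] = (unsigned char)cp;
        return buf + 1;
    }

    /* Start byte: len leading 1 bits (0xFF for the long form), plus
     * whatever payload bits sit above the continuation bytes. */
    unsigned shift = 6u * (unsigned)(len - 1);
    const unsigned lead = (len > 7) ? 0xFFu : ((0xFFu << (8 - len)) & 0xFFu);
    const unsigned top = (shift < 64) ? (unsigned)(cp >> shift) : 0u;
    buf[0] = (unsigned char)(lead | top);

    /* Continuation bytes carry 6 payload bits each, high bits first. */
    for (size_t i = 1; i < len; i++) {
        shift -= 6;
        const unsigned bits =
            (shift < 64) ? (unsigned)((cp >> shift) & 0x3F) : 0u;
        buf[i] = (unsigned char)(0x80u | bits);
    }
    return buf + len;
}
```

With a 7-byte buffer, `ext_utf8_encode` returns NULL for 0xE000000000; with a buffer of `EXT_UTF8_MAXBYTES` bytes it writes 13 bytes, and the length can be recovered as `end - buf`, in the same way the macro computes `kapow - flrbbbbb`.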

@p5pRT p5pRT commented Aug 18, 2015

The RT System itself - Status changed from 'new' to 'open'

@p5pRT p5pRT commented Aug 18, 2015

From @dcollinsn

Hello p5p,

Just wondering if these are useful enough that you'd like me to continue
reporting them. I've had a passing interest in the perl ecosystem for a few
years now, and have been hoping to find enough time to learn the codebase
so that I can actually contribute, but in the mean time I was experimenting
with afl for fun. Is there any information I should provide to make these
reports more useful, and do they hold any value at all beyond corner cases
unlikely to ever be encountered in the real world?

Regards,
Dan Collins

@p5pRT p5pRT commented Aug 18, 2015

From PeterCMartini@GMail.com

On Mon, Aug 17, 2015 at 11:28 PM, Dan Collins <dcollinsn@gmail.com> wrote:

> Hello p5p,
>
> Just wondering if these are useful enough that you'd like me to continue
> reporting them. I've had a passing interest in the perl ecosystem for a few
> years now, and have been hoping to find enough time to learn the codebase
> so that I can actually contribute, but in the mean time I was experimenting
> with afl for fun. Is there any information I should provide to make these
> reports more useful, and do they hold any value at all beyond corner cases
> unlikely to ever be encountered in the real world?

I can only speak for myself, but yes, I find these very useful; those
corner cases are the edges we can investigate first during refactoring, and
also the places to be most careful about if any refactoring is being done.

The cases I've seen you provide have been solid enough that I wouldn't ask
for any more details, though I'd also be very happy to get some more
details on how you've set up AFL to find these :-)

@p5pRT p5pRT commented Aug 18, 2015

From @tonycoz

On Mon, Aug 17, 2015 at 11:28:03PM -0400, Dan Collins wrote:

> Hello p5p,
>
> Just wondering if these are useful enough that you'd like me to continue
> reporting them. I've had a passing interest in the perl ecosystem for a few
> years now, and have been hoping to find enough time to learn the codebase
> so that I can actually contribute, but in the mean time I was experimenting
> with afl for fun. Is there any information I should provide to make these
> reports more useful, and do they hold any value at all beyond corner cases
> unlikely to ever be encountered in the real world?

I think they're useful, even if we don't always have the development
tuits to deal with them immediately.

Tony

@p5pRT p5pRT commented Aug 19, 2015

From @tonycoz

On Mon Aug 17 19:18:42 2015, tonyc wrote:

> On Sun Aug 16 20:56:17 2015, dcollinsn@gmail.com wrote:
>
> > Greetings,
> >
> > The afl fuzzer has identified the following testcase which causes a
> > segmentation fault in the regular expression engine in perl and
> > miniperl (oddly, only when run WITHOUT -Ilib). The testcase is the
> > 17-character file:
> >
> > /\x{E000000000}|/
>
> Here's a fix (attached).

Applied with non-code changes as 668fcfe.

Tony

@p5pRT p5pRT closed this Aug 19, 2015

@p5pRT p5pRT commented Aug 19, 2015

@tonycoz - Status changed from 'open' to 'resolved'

@p5pRT p5pRT commented Aug 25, 2015

From @rurban

On Aug 18, 2015, at 5:29 AM, Dan Collins via RT <perlbug-followup@perl.org> wrote:

> Hello p5p,
>
> Just wondering if these are useful enough that you'd like me to continue
> reporting them. I've had a passing interest in the perl ecosystem for a few
> years now, and have been hoping to find enough time to learn the codebase
> so that I can actually contribute, but in the mean time I was experimenting
> with afl for fun. Is there any information I should provide to make these
> reports more useful, and do they hold any value at all beyond corner cases
> unlikely to ever be encountered in the real world?

I certainly do not speak for p5p, they usually tend to go into the opposite direction whenever I voice my opinion,
but I do value your afl reports a lot, and I try to fix all of them. I track them in a special ticket and most of them
are already fixed. They are not getting lost.
Even if some p5p committers reject them as unfixable or such, I fix them at least in our version of perl5 for the
benefit of some other perl5 users.

They are useful as is. No need to worry.
Thanks.

My list (maybe I missed some lately):

New crashes, not yet in the 5.20.3 blockers list:
  • perl #125341, duplicate of perl #121048 (Fixed only with bugfix/CM-819-rt125341-begin)
  • perl #125697 fixed in 5.22
  • perl #125540 S_scan_heredoc: Assertion `s' failed at toke.c:9314 (Todo)
  • perl #125534 Perl_sv_clear: Assertion (SvTYPE(sv) != (svtype)SVTYPEMASK) failed (sv.c:6395) with -e'map{%0=map{0}m 0 0}%0=map{0}0' (Todo)
  • perl #125350 null ptr deref -> S_clear_yystack (perly.c:218) (Todo)
  • perl #123878 Infinite recursion (+segfault) on die() after goto-ing (Todo, very relevant to us!) See `bugfix/CM-891-rt123878-goto-die`
  • perl #125840 -e'$x=*0; *x=$x' glob_assign_glob inner ref (Rejected by p5p. Fixed in merge-upstream, with bugfix/CM-891-glob_assign_glob-crash)

see http://perl5.git.perl.org/perl.git/blob/refs/heads/maint-5.20-votes:/Porting/cherry-pick-votes-maint-5.20.xml
or Tickets Listed in #123921: 5.20.3 blockers

  • perl #123554 avoid a crash from SvGROW(MEM_SIZE_MAX) (Fixed in 5.22)
  • Fix double free with const overload after errors (Fixed in 5.22 with 67c71cb)
  • perl #123617 Localise PL_lex_stuff (crash fix) (Fixed in 5.22 with eabab8b)
  • perl #123955 Fix assert fail with 0 s/// in quotes (Fixed in 5.22 with ce7c414)
  • perl #123737 Fix assertion failure with 0${ (Fixed in 5.22 with 488bc57)
  • perl #123737 Fix assertion failure with 0$#{ (Fixed in 5.22 with 310a0d0)
  • perl #123753 &\0foo parsing (Fixed in 5.22 with 3c47da3)
  • perl #123753 Assert fail with &{+foo} and errors (Fixed in 5.22 with eea8938)
  • perl #123677 Crash with token stack overflow (Fixed in 5.22 with 7aa8cb0)
  • perl #123759 always count on OPpTRANS_IDENTICAL (Fixed in 5.22 with a53bfda)
  • perl #123755 including unknown char in error requires care (Fixed in 5.22 with 8a6d8ec)
  • regcomp can read past end of string after parsing flags
  • perl #123554 catch a couple of other size overflows (Fixed in 5.22 with 4 commits)
  • perl #123554 fix threaded builds and prevent a warning (Fixed in 5.22)
  • perl #123816 fix stat stacking (Fixed in 5.22 with 87ebf1e)
  • perl #123870 fixup trie runtime debug output (Fixed in 5.22 with d0bec20)
  • perl #123874 fix argument underflow for pack() (Fixed in 5.22 with fc1bb3f)
  • perl #123801 Stop s##[}#e from crashing (Fixed in 5.22 with f4460c6)
  • perl #123802 Assertion failure with /$0{}/ (Fixed in 5.22 with 479ae48)
  • perl #123802 Assertion failure with qq[\L\L] (Fixed in 5.22 with 66edcf7)
  • perl #123763 Clear target on my $_; split (Fixed in 5.22 with 55b3980)
  • perl #123817 Assert fail with attr in anon hash (Fixed in 5.22 with 6b2b48a)
  • perl #123849 sv.c: Fix sv_clear -Do output (Fixed in 5.22 with 813d2eb)
  • perl #123960 sv.c: Fix gp_free -Do output (Fixed in 5.22 with 923ed58)
  • perl #123963 qq[@<fullwidth digit>] (Fixed in 5.22 with 9d58dbc)
  • perl #123847 crash with *foo::=*bar::=*with_hash (Fixed in 5.22 with 3d50185)
  • perl #123995 Assert fail with s;@{<<; (Fixed in 5.22 with b24768f)
  • perl #123790 Assert fail with *x=<y> (Fixed in 5.22 with aab1202 and 21639bf)
  • perl #124099 Wrong CvOUTSIDE in find_lexical_cv (Fixed in 5.22 with d655d9a)
  • perl #124187 don't call pad_findlex() on a NULL CV (Fixed in 5.22 with b12396a)
  • perl #124385 null ptr deref in Perl_cv_forget_slab (Fixed in 5.22 with de0885d)

more crashes:
  • perl #123711 Fix crash with 0-5x-l {0} (Fixed in 5.22 with 83a85f4)
  • perl #123712 Fix /$a[/ parsing (Fixed in 5.22 with e47d32d)
  • perl #123712 Don't check sub_inwhat (Fixed in 5.22 with d27f4b9)
  • perl #124156 death during unwinding causes crash (Fixed in 5.22 with 1956db7)
  • perl #123398 don't fatalize warnings during unwinding (Fixed in our perl with 46b27d2, reportedly merged into 5.22 lately)

@p5pRT p5pRT added the Severity Low label Oct 19, 2019