
_Bool Perl_isSCRIPT_RUN(const U8 *, const U8 *, const _Bool): Assertion `s' failed. #17372

Closed
dur-randir opened this issue Dec 18, 2019 · 6 comments

Comments

@dur-randir
Member

This is a bug report for perl from sergey.aleynikov@gmail.com,
generated with the help of perlbug 1.41 running under perl 5.31.7.


[Please describe your issue here]

While fuzzing perl v5.31.6-158-gdca9f615c2 built with afl and run
under libdislocator, I found the following program

q0=~/(?n)()(0)|()(*sr:)/

to cause an assertion failure on debugging builds

perl: regexec.c:10800: _Bool Perl_isSCRIPT_RUN(const U8 *, const U8 *, const _Bool): Assertion `s' failed.

The GDB stack trace is

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7c24535 in __GI_abort () at abort.c:79
#2 0x00007ffff7c2440f in __assert_fail_base (fmt=0x7ffff7d86ee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x555555b96ef5 "s",
file=0x555555b97288 "regexec.c", line=10800, function=) at assert.c:92
#3 0x00007ffff7c32102 in __GI___assert_fail (assertion=0x555555b96ef5 "s", file=0x555555b97288 "regexec.c", line=10800,
function=0x555555bdedc0 <PRETTY_FUNCTION.21450> "Perl_isSCRIPT_RUN") at assert.c:101
#4 0x00005555558db624 in Perl_isSCRIPT_RUN (s=0x0, send=0x555555c75d60 "q0", utf8_target=false) at regexec.c:10800
#5 0x00005555558ce212 in S_regmatch (reginfo=0x7fffffffdce0, startpos=0x555555c75d60 "q0", prog=0x555555c78e3c) at regexec.c:7862
#6 0x00005555558be926 in S_regtry (reginfo=0x7fffffffdce0, startposp=0x7fffffffdaa8) at regexec.c:4029
#7 0x00005555558be2a6 in Perl_regexec_flags (rx=0x555555c70d28, stringarg=0x555555c75d60 "q0", strend=0x555555c75d62 "", strbeg=0x555555c75d60 "q0",
minend=0, sv=0x555555c70d10, data=0x0, flags=97) at regexec.c:3892
#8 0x00005555557773bb in Perl_pp_match () at pp_hot.c:3014
#9 0x0000555555717cba in Perl_runops_debug () at dump.c:2571
#10 0x00005555555f0f79 in S_run_body (oldscope=1) at perl.c:2786
#11 0x00005555555f04f1 in perl_run (my_perl=0x555555c4a260) at perl.c:2709
#12 0x00005555555a1165 in main (argc=3, argv=0x7fffffffe1c8, env=0x7fffffffe1e8) at perlmain.c:134

Apparently this has been happening since the introduction of script runs.

[Please do not change anything below this line]


Flags:
category=core
severity=medium

Site configuration information for perl 5.31.7:

Configured by root at Tue Dec 17 21:38:32 MSK 2019.

Summary of my perl5 (revision 5 version 31 subversion 7) configuration:
Derived from: dca9f61
Platform:
osname=linux
osvers=4.19.0-6-amd64
archname=x86_64-linux
uname='linux dorothy 4.19.0-6-amd64 #1 smp debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 gnulinux '
config_args='-des -Dusedevel -DDEBUGGING -Dcc=afl-clang-fast -Doptimize=-std=c99 -O3 -funroll-loops -g'
hint=previous
useposix=true
d_sigaction=undef
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='afl-clang-fast'
ccflags ='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
optimize='-std=c99 -O3 -funroll-loops -g'
cppflags='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='4.2.1 Compatible Clang 6.0.1 (tags/RELEASE_601/final)'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='afl-clang-fast'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/llvm-6.0/lib/clang/6.0.1/lib /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/llvm-6.0/lib/clang/6.0.1/lib /usr/include/x86_64-linux-gnu /usr/lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.28.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.28'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -std=c99 -O3 -funroll-loops -g -L/usr/local/lib -fstack-protector-strong'

Locally applied patches:
uncommitted-changes


@INC for perl 5.31.7:
lib
/usr/local/lib/perl5/site_perl/5.31.7/x86_64-linux
/usr/local/lib/perl5/site_perl/5.31.7
/usr/local/lib/perl5/5.31.7/x86_64-linux
/usr/local/lib/perl5/5.31.7


Environment for perl 5.31.7:
HOME=/home/afl
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE=en_US.UTF-8
LC_TIME=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.20.2/bin:/opt/local/bin:/usr/texbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PERLBREW_BASHRC_VERSION=0.78
PERLBREW_HOME=/home/afl/.perlbrew
PERLBREW_MANPATH=/home/afl/perlbrew/perls/perl-5.20.2/man
PERLBREW_PATH=/home/afl/perlbrew/bin:/home/afl/perlbrew/perls/perl-5.20.2/bin
PERLBREW_PERL=perl-5.20.2
PERLBREW_ROOT=/home/afl/perlbrew
PERLBREW_VERSION=0.78
PERL_BADLANG (unset)
SHELL=/usr/bin/zsh

@khwilliamson
Contributor

What I've determined so far is that this is coming from the optimization stage of the regex compiler.
Program before optimization:
1: BRANCH (5)
2: NOTHING (3)
3: EXACT <0> (10)
5: BRANCH (FAIL)
6: NOTHING (7)
7: SROPEN (8)
8: NOTHING (9)
9: SRCLOSE (10)
10: END (0)
Final program:
1: TRIE-EXACT(JUMP)<S:1/2 W:2 L:0/1 C:1/1>[0] (10)
<0> (10)
<> (9)
9: SRCLOSE (10)

The SROPEN gets omitted, possibly because the alternation gets turned into a trie. Why that would happen is not obvious to me without single-stepping through the code. If I insert a \b like so:

q0=~/(?n)()(0)|()\b(*sr:)/
it kind of works:
Final program:
1: TRIE-EXACT(JUMP)<S:1/2 W:2 L:0/1 C:1/1>[0] (11)
<0> (11)
<> (8)
8: SROPEN (10)
9: NOTHING (10)
10: SRCLOSE (11)
11: END (0)

But the \b is gone. If I replace the script run with a second \b,

/(?n)()(0)|()\b\b/

I get
Final program:
1: TRIE-EXACT(JUMP)<S:1/2 W:2 L:0/1 C:1/1>[0] (9)
<0> (9)
<> (8)
8: BOUND (9)
9: END (0)

So the first thing after the trie is getting removed. @demerphq @hvds, any ideas off the top of your heads?
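
To make the link between the missing SROPEN and the failed assertion concrete, here is a toy model in C. It is only a sketch: the op names mirror the dumps above, but `ToyOp`, `run_toy`, and the premise that SROPEN records a start pointer which SRCLOSE later passes to the script-run check are assumptions made for illustration, not the actual regexec.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy ops; the names are borrowed from the regex dumps. */
typedef enum { TOY_SROPEN, TOY_NOTHING, TOY_SRCLOSE, TOY_END } ToyOp;

/* Runs a toy program (terminated by TOY_END) against `input`.
 * Returns 0 on success, or -1 if SRCLOSE is reached while the script-run
 * start pointer was never set, which is the situation the assertion in
 * the report guards against (s == NULL). */
static int run_toy(const ToyOp *prog, const char *input)
{
    const char *script_run_begin = NULL;  /* set only by TOY_SROPEN */
    const char *pos = input;

    assert(prog != NULL);

    for (size_t i = 0; prog[i] != TOY_END; i++) {
        switch (prog[i]) {
        case TOY_SROPEN:
            script_run_begin = pos;       /* remember where the run starts */
            break;
        case TOY_NOTHING:
            break;                        /* matches the empty string */
        case TOY_SRCLOSE:
            if (script_run_begin == NULL)
                return -1;                /* start was never recorded */
            break;
        case TOY_END:
            break;                        /* unreachable inside the loop */
        }
    }
    return 0;
}
```

In this model, a program of `TOY_SROPEN, TOY_NOTHING, TOY_SRCLOSE, TOY_END` returns 0, while one that jumps straight to `TOY_SRCLOSE` returns -1, which corresponds to frame #4 of the backtrace, where `s=0x0`.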

@hvds
Contributor

hvds commented Jan 8, 2020

I tried stepping through this some. I think the handling of NOTHING in the trie creation logic is suspect.

I'm far from understanding it, but when constructing the trie, we end up invoking TRIE_HANDLE_WORD(state) when noper points to the node that is to provide our trieable string. Usually this is the first node of each branch, but if the first node is NOTHING we may use the second node instead.

I think the problem may be that we set noper to the node following NOTHING even when that node is non-trieable. So I tried the patch below, and it does appear to fix this problem. My understanding is not sufficient to recommend it, though; I would really like @demerphq to take a look.

Hugo

diff --git a/regcomp.c b/regcomp.c
index a0de45497d..00e2f22a9e 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -2539,6 +2539,22 @@ is the recommended Unicode-aware way of saying
     *(d++) = uv;
 */
 
+#define TRIE_TYPE(X) ( ( NOTHING == (X) )                                   \
+                       ? NOTHING                                            \
+                       : ( EXACT == (X) || EXACT_REQ8 == (X) )             \
+                         ? EXACT                                            \
+                         : (     EXACTFU == (X)                             \
+                              || EXACTFU_REQ8 == (X)                       \
+                              || EXACTFUP == (X) )                          \
+                           ? EXACTFU                                        \
+                           : ( EXACTFAA == (X) )                            \
+                             ? EXACTFAA                                     \
+                             : ( EXACTL == (X) )                            \
+                               ? EXACTL                                     \
+                               : ( EXACTFLU8 == (X) )                       \
+                                 ? EXACTFLU8                                \
+                                 : 0 )
+
 #define TRIE_STORE_REVCHAR(val)                                            \
     STMT_START {                                                           \
        if (UTF) {                                                         \
@@ -2799,7 +2815,7 @@ S_make_trie(pTHX_ RExC_state_t *pRExC_state, regnode *startbranch,
              * eg, /(?:)a|(?:b)/ should be the same as /a|b/
              */
             regnode *noper_next= regnext(noper);
-            if (noper_next < tail)
+            if (noper_next < tail && TRIE_TYPE(OP(noper_next)) == flags)
                 noper= noper_next;
         }
 
@@ -3017,7 +3033,7 @@ S_make_trie(pTHX_ RExC_state_t *pRExC_state, regnode *startbranch,
 
             if (OP(noper) == NOTHING) {
                 regnode *noper_next= regnext(noper);
-                if (noper_next < tail)
+                if (noper_next < tail && TRIE_TYPE(OP(noper_next)) == flags)
                     noper= noper_next;
             }
 
@@ -3242,7 +3258,7 @@ S_make_trie(pTHX_ RExC_state_t *pRExC_state, regnode *startbranch,
 
             if (OP(noper) == NOTHING) {
                 regnode *noper_next= regnext(noper);
-                if (noper_next < tail)
+                if (noper_next < tail && TRIE_TYPE(OP(noper_next)) == flags)
                     noper= noper_next;
             }
 
@@ -4882,21 +4898,6 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
 
 
                         */
-#define TRIE_TYPE(X) ( ( NOTHING == (X) )                                   \
-                       ? NOTHING                                            \
-                       : ( EXACT == (X) || EXACT_REQ8 == (X) )             \
-                         ? EXACT                                            \
-                         : (     EXACTFU == (X)                             \
-                              || EXACTFU_REQ8 == (X)                       \
-                              || EXACTFUP == (X) )                          \
-                           ? EXACTFU                                        \
-                           : ( EXACTFAA == (X) )                            \
-                             ? EXACTFAA                                     \
-                             : ( EXACTL == (X) )                            \
-                               ? EXACTL                                     \
-                               : ( EXACTFLU8 == (X) )                       \
-                                 ? EXACTFLU8                                \
-                                 : 0 )
 
                         /* dont use tail as the end marker for this traverse */
                         for ( cur = startbranch ; cur != scan ; cur = regnext( cur ) ) {
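
The effect of the added `TRIE_TYPE(OP(noper_next)) == flags` condition can be illustrated with a small standalone model. This is a sketch under stated assumptions: `ToyType`, `ToyNode`, `toy_trie_type`, and `pick_noper` are hypothetical names, the node types are reduced to a handful, and `regnext` is modelled as an array index. It is not the code in `S_make_trie`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of node types for illustration only. */
typedef enum { T_NONE = 0, T_NOTHING, T_EXACT, T_SROPEN, T_BOUND } ToyType;

typedef struct {
    ToyType type;
    size_t  next;   /* index of the following node (stand-in for regnext) */
} ToyNode;

/* Simplified analogue of TRIE_TYPE: maps a node type to its trie class,
 * or T_NONE (0) when the node cannot take part in a trie. */
static ToyType toy_trie_type(ToyType t)
{
    switch (t) {
    case T_NOTHING: return T_NOTHING;
    case T_EXACT:   return T_EXACT;
    default:        return T_NONE;
    }
}

/* Chooses the node that supplies the trie word for a branch.
 * With `guarded` == 0 only the pre-patch condition (noper_next < tail)
 * is checked; with `guarded` != 0 the next node must also have the same
 * trie class as `flags` before NOTHING is skipped. */
static size_t pick_noper(const ToyNode *nodes, size_t noper, size_t tail,
                         ToyType flags, int guarded)
{
    assert(nodes != NULL);

    if (nodes[noper].type == T_NOTHING) {
        size_t noper_next = nodes[noper].next;
        if (noper_next < tail
            && (!guarded || toy_trie_type(nodes[noper_next].type) == flags))
            noper = noper_next;
    }
    return noper;
}
```

For a branch modelled as `NOTHING, SROPEN, ...` with `flags` set to `T_EXACT`, the unguarded call moves on to the SROPEN node, while the guarded call stays on the NOTHING node. A branch modelled as `NOTHING, EXACT` still advances to the EXACT node in both cases.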

@demerphq
Collaborator

demerphq commented Jan 9, 2020 via email

@hvds
Contributor

hvds commented Jan 9, 2020

I need to look into this, but normally if it is NOTHING {non-trieable} then it should produce a jump trie that can handle whatever comes after the NOTHING. So this works by bypassing the faulty logic, not by fixing it.

I will investigate more. I suspect there is some property clause somewhere, or a flag or something about SROPEN, a new regop, which is not properly configured. That makes me suspect it might break in other contexts, so we should get to the bottom of it.

Also, we want to be able to trie a branch starting with NOTHING. E.g., /foo||bar/ should produce a trie.

Yves

The patch does create a trie with a branch starting with NOTHING; my apologies, I should have shown this. Here are the original test case and Karl's corresponding version with \b\b, as compiled after that patch:

% ./miniperl -Dr -e 'qr/(?n)()(0)|()(*sr:)/' 2>&1 | grep '^ '
   1: TRIE-EXACT[0] (10)
      <0> (10)
      <> (7)
   7:   SROPEN (9)
   8:     NOTHING (9)
   9:   SRCLOSE (10)
  10: END (0)
% ./miniperl -Dr -e 'qr/(?n)()(0)|()\b\b/' 2>&1 | grep '^ '
   1: TRIE-EXACT[0] (9)
      <0> (9)
      <> (7)
   7:   BOUND (8)
   8:   BOUND (9)
   9: END (0)
% 

Hugo

@khwilliamson
Contributor

I suspect there is some property clause somewhere or flag
or something about SROPEN, a new regop, which is not properly configured,

This seems unlikely to me: as I said in my analysis, the \b is also dropped when it is used in place of the SROPEN. It seems more likely that the first zero-length-matching op after the trie is being dropped.

demerphq added a commit that referenced this issue Jan 9, 2020
We weren't handling NOTHING regops that were not followed
by a trieable type in the trie code.
@demerphq
Collaborator

demerphq commented Jan 9, 2020 via email

steve-m-hay pushed a commit that referenced this issue Feb 12, 2020
We weren't handling NOTHING regops that were not followed
by a trieable type in the trie code.

(cherry picked from commit ca902fb)