Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

global-buffer-overflow in Perl_foldEQ_utf8_flags (utf8.c:5278) #15796

Closed
p5pRT opened this issue Jan 6, 2017 · 20 comments
Closed

global-buffer-overflow in Perl_foldEQ_utf8_flags (utf8.c:5278) #15796

p5pRT opened this issue Jan 6, 2017 · 20 comments

Comments

@p5pRT
Copy link
Collaborator

@p5pRT p5pRT commented Jan 6, 2017

Migrated from rt.perl.org#130522 (status was 'resolved')

Searchable as RT130522$

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

From @geeknik

Triggered with Perl v5.25.8-132-gc10193e. If it matters​:
PERL_HASH_SEED=0x420

od -tx1 test136
0000000 26 30 3d 30 3d 7e 7e 2f 28 3f 3d 00 ff 29 28 7c
0000020 30 29 24 2f 69
0000025

./perl test136

==18801==ERROR​: AddressSanitizer​: global-buffer-overflow on address
0x000000dfcd41 at pc 0x000000806330 bp 0x7ffdb0686100 sp 0x7ffdb06860f8
READ of size 1 at 0x000000dfcd41 thread T0
  #0 0x80632f in Perl_foldEQ_latin1 /root/perl/util.c​:1081​:6
  #1 0xb72d79 in S_find_byclass /root/perl/regexec.c​:1985​:13
  #2 0xb61331 in Perl_re_intuit_start /root/perl/regexec.c​:1309​:13
  #3 0xb513b4 in Perl_regexec_flags /root/perl/regexec.c​:3007​:6
  #4 0x8cbe3e in Perl_pp_match /root/perl/pp_hot.c​:2050​:10
  #5 0x7fc7bb in Perl_runops_debug /root/perl/dump.c​:2260​:23
  #6 0x5a0d71 in S_run_body /root/perl/perl.c​:2528​:2
  #7 0x5a0d71 in perl_run /root/perl/perl.c​:2451
  #8 0x4de84d in main /root/perl/perlmain.c​:123​:9
  #9 0x7f326b488b44 in __libc_start_main
/build/glibc-daoqzt/glibc-2.19/csu/libc-start.c​:287
  #10 0x4de4bc in _start (/root/perl/perl+0x4de4bc)

0x000000dfcd41 is located 0 bytes to the right of global variable '<string
literal>' defined in 'sv.c​:3200​:17' (0xdfcd40) of size 1
  '<string literal>' is ascii string ''
SUMMARY​: AddressSanitizer​: global-buffer-overflow /root/perl/util.c​:1081
Perl_foldEQ_latin1
Shadow bytes around the buggy address​:
  0x0000801b7950​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000801b7960​: 00 00 00 00 00 00 00 00 00 00 02 f9 f9 f9 f9 f9
  0x0000801b7970​: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000801b7980​: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 02 f9 f9 f9
  0x0000801b7990​: f9 f9 f9 f9 00 00 00 00 00 00 00 00 02 f9 f9 f9
=>0x0000801b79a0​: f9 f9 f9 f9 00 00 00 00[01]f9 f9 f9 f9 f9 f9 f9
  0x0000801b79b0​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000801b79c0​: 00 07 f9 f9 f9 f9 f9 f9 00 00 00 00 00 07 f9 f9
  0x0000801b79d0​: f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9
  0x0000801b79e0​: 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9 f9
  0x0000801b79f0​: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes)​:
  Addressable​: 00
  Partially addressable​: 01 02 03 04 05 06 07
  Heap left redzone​: fa
  Heap right redzone​: fb
  Freed heap region​: fd
  Stack left redzone​: f1
  Stack mid redzone​: f2
  Stack right redzone​: f3
  Stack partial redzone​: f4
  Stack after return​: f5
  Stack use after scope​: f8
  Global redzone​: f9
  Global init order​: f6
  Poisoned by user​: f7
  Container overflow​: fc
  ASan internal​: fe
==18801==ABORTING

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

From @geeknik

Triggered with Perl v5.25.8-132-gc10193e. If it matters​:
PERL_HASH_SEED=0x420

./perl -e '/(?=00)0?$/il'

=================================================================
==17227==ERROR​: AddressSanitizer​: global-buffer-overflow on address
0x000000dfcd41 at pc 0x000000bfd348 bp 0x7ffe01708e70 sp 0x7ffe01708e68
READ of size 1 at 0x000000dfcd41 thread T0
  #0 0xbfd347 in Perl_foldEQ_utf8_flags /root/perl/utf8.c​:5278​:21
  #1 0xb6d183 in S_find_byclass /root/perl/regexec.c​:2037​:17
  #2 0xb61331 in Perl_re_intuit_start /root/perl/regexec.c​:1309​:13
  #3 0xb513b4 in Perl_regexec_flags /root/perl/regexec.c​:3007​:6
  #4 0x8cbe3e in Perl_pp_match /root/perl/pp_hot.c​:2050​:10
  #5 0x7fc7bb in Perl_runops_debug /root/perl/dump.c​:2260​:23
  #6 0x5a0d71 in S_run_body /root/perl/perl.c​:2528​:2
  #7 0x5a0d71 in perl_run /root/perl/perl.c​:2451
  #8 0x4de84d in main /root/perl/perlmain.c​:123​:9
  #9 0x7fcb82209b44 in __libc_start_main
/build/glibc-daoqzt/glibc-2.19/csu/libc-start.c​:287
  #10 0x4de4bc in _start (/root/perl/perl+0x4de4bc)

0x000000dfcd41 is located 0 bytes to the right of global variable '<string
literal>' defined in 'sv.c​:3200​:17' (0xdfcd40) of size 1
  '<string literal>' is ascii string ''
SUMMARY​: AddressSanitizer​: global-buffer-overflow /root/perl/utf8.c​:5278
Perl_foldEQ_utf8_flags
Shadow bytes around the buggy address​:
  0x0000801b7950​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000801b7960​: 00 00 00 00 00 00 00 00 00 00 02 f9 f9 f9 f9 f9
  0x0000801b7970​: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000801b7980​: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 02 f9 f9 f9
  0x0000801b7990​: f9 f9 f9 f9 00 00 00 00 00 00 00 00 02 f9 f9 f9
=>0x0000801b79a0​: f9 f9 f9 f9 00 00 00 00[01]f9 f9 f9 f9 f9 f9 f9
  0x0000801b79b0​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000801b79c0​: 00 07 f9 f9 f9 f9 f9 f9 00 00 00 00 00 07 f9 f9
  0x0000801b79d0​: f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9
  0x0000801b79e0​: 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9 f9
  0x0000801b79f0​: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes)​:
  Addressable​: 00
  Partially addressable​: 01 02 03 04 05 06 07
  Heap left redzone​: fa
  Heap right redzone​: fb
  Freed heap region​: fd
  Stack left redzone​: f1
  Stack mid redzone​: f2
  Stack right redzone​: f3
  Stack partial redzone​: f4
  Stack after return​: f5
  Stack use after scope​: f8
  Global redzone​: f9
  Global init order​: f6
  Poisoned by user​: f7
  Container overflow​: fc
  ASan internal​: fe
==17227==ABORTING

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

From @arc

Brian Carpenter <perl5-security-report@​perl.org> wrote​:

If it matters​:
PERL_HASH_SEED=0x420

It doesn't seem to.

od -tx1 test136
0000000 26 30 3d 30 3d 7e 7e 2f 28 3f 3d 00 ff 29 28 7c
0000020 30 29 24 2f 69
0000025

Reduction​: ./miniperl -e '/(?=\0a)b?$/iu'

#130521 looks like the same bug, in that it reduces in the same way
except without the /u flag, and crashes in foldEQ_utf8() rather than
foldEQ_latin1(). That is, I think the bug is higher up the stack.

My hunch is that the culprit is re_intuit_start(), but I'm afraid I'd
better get on with some work now.

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 6, 2017

From @hvds

On Fri, 06 Jan 2017 04​:22​:00 -0800, arc@​aaroncrane.co.uk wrote​:

Brian Carpenter <perl5-security-report@​perl.org> wrote​:

If it matters​:
PERL_HASH_SEED=0x420

It doesn't seem to.

od -tx1 test136
0000000 26 30 3d 30 3d 7e 7e 2f 28 3f 3d 00 ff 29 28 7c
0000020 30 29 24 2f 69
0000025

Reduction​: ./miniperl -e '/(?=\0a)b?$/iu'

#130521 looks like the same bug, in that it reduces in the same way
except without the /u flag, and crashes in foldEQ_utf8() rather than
foldEQ_latin1(). That is, I think the bug is higher up the stack.

My hunch is that the culprit is re_intuit_start(), but I'm afraid I'd
better get on with some work now.

I can't reproduce the problem with the original code, nor with your reduced form, nor with #130521. Could you give a perl -V?

When I trace through test136, the call from find_by_class() hits the EXACTFU case but then neither is_utf8_pat nor utf8_target, so it takes the do_exactf_non_utf8 branch rather than the do_exactf_utf8 from Brian's stack trace.

Cheers,

Hugo

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 7, 2017

From @arc

Hugo van der Sanden via RT <perl5-security-report@​perl.org> wrote​:

I can't reproduce the problem with the original code, nor with your reduced form, nor with #130521. Could you give a perl -V?

See below. It also happens under -Mre=debug too; the output looks like this​:

$ ./perl -Ilib -Mre=debug -e '/(?=\0a)b?$/iu'
Compiling REx "(?=\0a)b?$"
Final program​:
  1​: IFMATCH[0] (7)
  3​: EXACTFU <\0a> (5)
  5​: SUCCEED (0)
  6​: TAIL (7)
  7​: CURLY{0,1} (11)
  9​: EXACTFU <b> (0)
  11​: SEOL (12)
  12​: END (0)
floating ""$ at 0..1 (checking floating) stclass EXACTFU <\0a> minlen 0
Matching REx "(?=\0a)b?$" against ""
Intuit​: trying to determine minimum start position...
  doing 'check' fbm scan, [0..0] gave 0
  Found floating substr ""$ at offset 0 (rx_origin now 0)...
  (multiline anchor test skipped)
  looking for class​: start_shift​: 0 check_at​: 0 rx_origin​: 0 endpos​: 2

==70880==ERROR​: AddressSanitizer​: global-buffer-overflow on address
0x000100bdba01 at pc 0x0001007c6af0 bp 0x7fff5f675350 sp
0x7fff5f675348
READ of size 1 at 0x000100bdba01 thread T0
  #0 0x1007c6aef in Perl_foldEQ_latin1 util.c​:1081
  #1 0x1040558f4 in S_find_byclass re_exec.c​:1985
  #2 0x10404bfef in my_re_intuit_start re_exec.c​:1309
  #3 0x10405f9c0 in my_regexec re_exec.c​:3007
  #4 0x10084bb91 in Perl_pp_match pp_hot.c​:2050
  #5 0x1007c04b6 in Perl_runops_debug dump.c​:2260
  #6 0x100610bb5 in perl_run perl.c​:2528
  #7 0x100589e81 in main perlmain.c​:123
  #8 0x7fff81da35fc in start (libdyld.dylib+0x35fc)
  #9 0x4 (<unknown module>)

0x000100bdba01 is located 63 bytes to the left of global variable
'<string literal>' defined in 'sv.c​:3207​:2' (0x100bdba40) of size 70
  '<string literal>' is ascii string 'PL_valid_types_PVX[SvTYPE(sv) &
SVt_MASK] || SvTYPE(sv) == SVt_REGEXP'
0x000100bdba01 is located 0 bytes to the right of global variable
'<string literal>' defined in 'sv.c​:3200​:17' (0x100bdba00) of size 1
  '<string literal>' is ascii string ''
SUMMARY​: AddressSanitizer​: global-buffer-overflow util.c​:1081 in
Perl_foldEQ_latin1
Shadow bytes around the buggy address​:
  0x10002017b6f0​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002017b700​: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
  0x10002017b710​: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x10002017b720​: f9 f9 f9 f9 f9 f9 f9 f9 05 f9 f9 f9 f9 f9 f9 f9
  0x10002017b730​: 02 f9 f9 f9 f9 f9 f9 f9 00 06 f9 f9 f9 f9 f9 f9
=>0x10002017b740​:[01]f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x10002017b750​: 06 f9 f9 f9 f9 f9 f9 f9 00 07 f9 f9 f9 f9 f9 f9
  0x10002017b760​: 00 00 05 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x10002017b770​: 04 f9 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x10002017b780​: 00 00 f9 f9 f9 f9 f9 f9 00 00 04 f9 f9 f9 f9 f9
  0x10002017b790​: 00 00 00 00 f9 f9 f9 f9 00 00 05 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes)​:
  Addressable​: 00
  Partially addressable​: 01 02 03 04 05 06 07
  Heap left redzone​: fa
  Heap right redzone​: fb
  Freed heap region​: fd
  Stack left redzone​: f1
  Stack mid redzone​: f2
  Stack right redzone​: f3
  Stack partial redzone​: f4
  Stack after return​: f5
  Stack use after scope​: f8
  Global redzone​: f9
  Global init order​: f6
  Poisoned by user​: f7
  Container overflow​: fc
  Array cookie​: ac
  Intra object redzone​: bb
  ASan internal​: fe
  Left alloca redzone​: ca
  Right alloca redzone​: cb
==70880==ABORTING
Abort trap​: 6

The "endpos​: 2" in the "looking for class" line is the first highly
suspicious part, given that the target string has length 0, but
clamping endpos to strend doesn't on its own remove the error. Then
tracing through as far as

  e = HOP3c(strend, -((SSize_t)ln), s);

  if (reginfo->intuit && e < s) {
  e = s; /* Due to minlen logic of intuit() */
  }

we end up with e == s, both pointing at the \0 byte in the
empty-string literal near the end of sv_2pv_flags. Then
REXEC_FBC_EXACTISH_SCAN() does "while (s <= e)". So if I also add this
after the "if (reginfo->intuit && e < s)" condition​:

  if (e >= strend) /* REXEC_FBC_EXACTISH_SCAN does while
(s <= e) */
  e = (char *)strend - 1;

then the out-of-bounds read goes away, and the test suite still passes.

I'm not sure this is the right fix, though; I'm wondering whether in
fact there's a logic error in the intuit minlen logic.

I believe the reason why the floating substr needs to begin with \0 to
trigger the bug is because the byte in the target buffer that we're
looking is also \0, and the exactish scan checks *s==c1 before calling
the folder, where c1 is the first byte of the floating substr. (I was
wondering for a while whether it in fact suggested something was
looking for end-of-buffer by checking for \0 rather than comparing
length, but on reflection I think not.)

Output of "./perl -Ilib -V"​:

Summary of my perl5 (revision 5 version 25 subversion 9) configuration​:
  Commit id​: d6c970c
  Platform​:
  osname=darwin
  osvers=13.4.0
  archname=darwin-2level
  uname='darwin daybreak 13.4.0 darwin kernel version 13.4.0​: mon
jan 11 18​:17​:34 pst 2016; root​:xnu-2422.115.15~1release_x86_64 x86_64
i386 macbookpro11,3 darwin '
  config_args='-des -Dusedevel -DEBUGGING -Dprefix=/tmp/xblead
-Dcc=clang-mp-3.7 -fsanitize=address'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=undef
  usemultiplicity=undef
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=n
  bincompat5005=undef
  Compiler​:
  cc='clang-mp-3.7 -fsanitize=address'
  ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp
-mmacosx-version-min=10.9 -DDEBUGGING -fno-strict-aliasing -pipe
-fstack-protector-strong -I/usr/local/include -I/opt/local/include
-DPERL_USE_SAFE_PUTENV'
  optimize='-O3 -g'
  cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN
-no-cpp-precomp -mmacosx-version-min=10.9 -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector-strong
-I/usr/local/include -I/opt/local/include'
  ccversion=''
  gccversion='4.2.1 Compatible Clang 3.7.1 (tags/RELEASE_371/final)'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries​:
  ld='clang-mp-3.7 -fsanitize=address'
  ldflags =' -mmacosx-version-min=10.9 -fstack-protector-strong
-L/usr/local/lib -L/opt/local/lib'
  libpth=/usr/local/lib
/opt/local/libexec/llvm-3.7/bin/../lib/clang/3.7.1/lib /usr/lib
/opt/local/lib
  libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
  perllibs=-lpthread -ldl -lm -lutil -lc
  libc=
  so=dylib
  useshrplib=false
  libperl=libperl.a
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs
  dlext=bundle
  d_dlsymun=undef
  ccdlflags=' '
  cccdlflags=' '
  lddlflags=' -mmacosx-version-min=10.9 -bundle -undefined
dynamic_lookup -L/usr/local/lib -L/opt/local/lib
-fstack-protector-strong'

Characteristics of this binary (from libperl)​:
  Compile-time options​:
  DEBUGGING
  HAS_TIMES
  PERLIO_LAYERS
  PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_MALLOC_WRAP
  PERL_OP_PARENT
  PERL_PRESERVE_IVUV
  PERL_USE_DEVEL
  PERL_USE_SAFE_PUTENV
  USE_64_BIT_ALL
  USE_64_BIT_INT
  USE_LARGE_FILES
  USE_LOCALE
  USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC
  USE_LOCALE_TIME
  USE_PERLIO
  USE_PERL_ATOF
  Built under darwin
  Compiled at Jan 7 2017 13​:28​:02
  %ENV​:
  PERLBREW="command perlbrew"
  PERLBREW_BASHRC_VERSION="0.37"
  PERLBREW_HOME="/Users/aaron/.perlbrew"
  PERLBREW_MANPATH="/Users/aaron/perl5/perlbrew/perls/perl-5.22.0/man"
  PERLBREW_PATH="/Users/aaron/perl5/perlbrew/bin​:/Users/aaron/perl5/perlbrew/perls/perl-5.22.0/bin"
  PERLBREW_PERL="perl-5.22.0"
  PERLBREW_ROOT="/Users/aaron/perl5/perlbrew"
  PERLBREW_VERSION="0.37"
  @​INC​:
  lib
  /tmp/xblead/lib/perl5/site_perl/5.25.9/darwin-2level
  /tmp/xblead/lib/perl5/site_perl/5.25.9
  /tmp/xblead/lib/perl5/5.25.9/darwin-2level
  /tmp/xblead/lib/perl5/5.25.9
  .

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 8, 2017

From @hvds

On Sat, 07 Jan 2017 07​:31​:02 -0800, arc wrote​:

See below.

Thanks, I'd managed to convince myself the stacktrace was showing the foldEQ_utf8_flags() call from the other branch, and failed to spot the 3 levels of indirection reaching the same call in the branch we're actually in.

As you found, I think there are two separate bugs here, both at heart (I think) based on an assumption that the length of check substrs cannot exceed minlen - but, as here, we can get them from a lookahead.

Karl​: please check the second patch - I think when e < s we do not need to (and must not) check further; setting e = s causes us to try at least once, even if as in this case the string is empty. I suspect your comments in the removed chunks are intended to imply the erroneous assumption that the length of check substrs cannot exceed minlen, but I'm not sure why we additionally predicate that on reginfo->intuit.

I believe the reason why the floating substr needs to begin with \0 to
trigger the bug is because the byte in the target buffer that we're
looking is also \0, and the exactish scan checks *s==c1 before calling
the folder, where c1 is the first byte of the floating substr.

I think we just need a byte that folds to itself, so I've made the testcases using digits.

I was slightly worried that on a test run re/fold_grind timed out; I'm assuming that's because clang+sanitize=address is slow - disabling the watchdog allowed it to complete successfully in "386.33user 0.69system 6​:27.79elapsed 99%CPU". Still waiting for the rest of the test run, and will then verify that the first of the new tests actually fails without the fixes.

In any case, I don't see any significant security concern here - we'll read out of bounds, but I can't see that we'll write, nor that we'll give the wrong answer to the underlying pattern match - so I think this ticket can also be moved to the public queue.

I'd welcome comments on that, or on the 3 proposed patches below.

Hugo

commit 9de31938f2d75d7cef537e4981da943d1d116b54
Author​: Hugo van der Sanden <hv@​crypt.org>
Date​: Sun Jan 8 14​:54​:57 2017 +0000

  [perl #130522] do not allow endpos to exceed strend
 
  Check substrings can come from lookaheads, so their length can exceed
  minlen. Use a clamped variant of HOP3c to avoid a bad endpos in this case.

Inline Patch
diff --git a/regexec.c b/regexec.c
index 056a993..6a9da32 100644
--- a/regexec.c
+++ b/regexec.c
@@ -149,6 +149,7 @@ static const char* const non_utf8_target_but_utf8_required
 #define HOP3lim(pos,off,lim) (reginfo->is_utf8_target \
     ? reghop3((U8*)(pos), off, (U8*)(lim)) \
     : (U8*)((pos + off) > lim ? lim : (pos + off)))
+#define HOP3clim(pos,off,lim) ((char*)HOP3lim(pos,off,lim))
 
 #define HOP4(pos,off,llim, rlim) (reginfo->is_utf8_target \
     ? reghop4((U8*)(pos), off, (U8*)(llim), (U8*)(rlim)) \
@@ -1291,10 +1292,10 @@ Perl_re_intuit_start(pTHX_
          */
 
        if (prog->anchored_substr || prog->anchored_utf8 || ml_anch)
-            endpos= HOP3c(rx_origin, (prog->minlen ? cl_l : 0), strend);
+            endpos = HOP3clim(rx_origin, (prog->minlen ? cl_l : 0), strend);
         else if (prog->float_substr || prog->float_utf8) {
            rx_max_float = HOP3c(check_at, -start_shift, strbeg);
-           endpos= HOP3c(rx_max_float, cl_l, strend);
+           endpos = HOP3clim(rx_max_float, cl_l, strend);
         }
         else 
             endpos= strend;

commit 30407a4ff1a338b906ed3d5d883c00e976b6682d Author​: Hugo van der Sanden \ Date​: Sun Jan 8 14​:59​:36 2017 \+0000

  [perl #130522] don't try to find_byclass outside the string
 
  If the calculated latest start position to try is before our current
  start position, stop immediately​: else we can read out of bounds,
  and end up doing unnecessary work.

Inline Patch
diff --git a/regexec.c b/regexec.c
index 6a9da32..7d2a3ac 100644
--- a/regexec.c
+++ b/regexec.c
@@ -1974,10 +1974,8 @@ S_find_byclass(pTHX_ regexp * prog, const regnode *c, char *s,
          * trying that it will fail; so don't start a match past the
          * required minimum number from the far end */
         e = HOP3c(strend, -((SSize_t)ln), s);
-
-        if (reginfo->intuit && e < s) {
-            e = s;                     /* Due to minlen logic of intuit() */
-        }
+        if (e < s)
+            break;
 
         c1 = *pat_string;
         c2 = fold_array[c1];
@@ -2021,10 +2019,6 @@ S_find_byclass(pTHX_ regexp * prog, const regnode *c, char *s,
          */
         e = HOP3c(strend, -((SSize_t)lnc), s);
 
-        if (reginfo->intuit && e < s) {
-            e = s;                     /* Due to minlen logic of intuit() */
-        }
-
         /* XXX Note that we could recalculate e to stop the loop earlier,
          * as the worst case expansion above will rarely be met, and as we
          * go along we would usually find that e moves further to the left.

commit 8dcf2227576f22517e02daf49e6f3ab395bf1390 Author​: Hugo van der Sanden \

  [perl #130522] test cases for len(STCLASS) > minlen

Inline Patch
diff --git a/t/re/re_tests b/t/re/re_tests
index e8a7fa9..5491b49 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -1976,6 +1976,9 @@ AB\s+\x{100}      AB \x{100}X     y       -       -
 (.*?(a(a)|i(i))n)      riiaan  y       $2-$3-$4-$1     aa-a--riiaan            #  Jump trie capture buffer issue [perl #129897]
 (^(?:(\d)x)?\d$)       1       y       [$1-$2] [1-]            #  make sure that we reset capture buffers properly (from regtry)
 (X{2,}[-X]{1,4}){3,}X{2,}      XXX-XXX-XXX--   n       -       -       # [perl #130307]
+/(?=01)0?1?$/iu                n       -       -       # [perl #130522] should not read outside string
+/(?=01)0?1?$/iu        0       n       -       -       # [perl #130522]
+/(?=01)0?1?$/iu        01      y       -       -       # [perl #130522]
 
 # Keep these lines at the end of the file
 # vim: softtabstop=0 noexpandtab
@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 8, 2017

From @hvds

On Sun, 08 Jan 2017 08​:26​:20 -0800, hv wrote​:

Still waiting for the rest of the test run,
and will then verify that the first of the new tests actually fails
without the fixes.

Test run with address=sanitized passed ok; however the new tests don't trigger the problem on blead - the closest I've found that does (looking for one that actually would match at least one string) is​:
  perl -e '/(?=\0a)\0?a?$/'

However it only causes a problem if run against undef​: an empty string is already safe. So I'll look to wrap that in a pat.t test.

Hugo

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 9, 2017

From @arc

Hugo van der Sanden via RT <perl5-security-report@​perl.org> wrote​:

As you found, I think there are two separate bugs here, both at heart (I think) based on an assumption that the length of check substrs cannot exceed minlen - but, as here, we can get them from a lookahead.

Agreed. (Though I suspect we try to read a long way out of bounds for
a long lookahead.)

I believe the reason why the floating substr needs to begin with \0 to
trigger the bug is because the byte in the target buffer that we're
looking is also \0, and the exactish scan checks *s==c1 before calling
the folder, where c1 is the first byte of the floating substr.

I think we just need a byte that folds to itself, so I've made the testcases using digits.

As you note in your followup, making that change doesn't make asan
produce an error. I *think* that we still read beyond SvCUR when the
test uses digits; but the only way to get asan to complain is if
that's actually nothing allocated beyond that point. This is why
matching against undef triggers the asan error​: the buffer is
statically-allocated in that case, which means asan can detect every
out-of-bounds read against it.

I'm otherwise happy with your patch, though.

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 9, 2017

From @arc

Hugo van der Sanden via RT <perl5-security-report@​perl.org> wrote​:

Test run with address=sanitized passed ok; however the new tests don't trigger the problem on blead - the closest I've found that does (looking for one that actually would match at least one string) is​:
perl -e '/(?=\0a)\0?a?$/'

However it only causes a problem if run against undef​: an empty string is already safe. So I'll look to wrap that in a pat.t test.

Just for the record​: asan remains silent for me on that test case
unless I add the /i flag. (I'm pretty sure that's just a copy-paste-o,
but I thought it'd be best to be sure.)

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 9, 2017

From @hvds

On Mon, 09 Jan 2017 07​:01​:07 -0800, arc wrote​:

Hugo van der Sanden via RT <perl5-security-report@​perl.org> wrote​:

Test run with address=sanitized passed ok; however the new tests
don't trigger the problem on blead - the closest I've found that does
(looking for one that actually would match at least one string) is​:
perl -e '/(?=\0a)\0?a?$/'

However it only causes a problem if run against undef​: an empty
string is already safe. So I'll look to wrap that in a pat.t test.

Just for the record​: asan remains silent for me on that test case
unless I add the /i flag. (I'm pretty sure that's just a copy-paste-o,
but I thought it'd be best to be sure.)

Yes, copy/paste. I've just prepped the pat.t test, using qr{(?=\0z)\0?z?$}i.

If Karl can confirm he's happy with the second patch, I'll transqueue and push.

Hugo

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 10, 2017

From @khwilliamson

On Mon, 09 Jan 2017 08​:52​:25 -0800, hv wrote​:

On Mon, 09 Jan 2017 07​:01​:07 -0800, arc wrote​:

Hugo van der Sanden via RT <perl5-security-report@​perl.org> wrote​:

Test run with address=sanitized passed ok; however the new tests
don't trigger the problem on blead - the closest I've found that
does
(looking for one that actually would match at least one string) is​:
perl -e '/(?=\0a)\0?a?$/'

However it only causes a problem if run against undef​: an empty
string is already safe. So I'll look to wrap that in a pat.t test.

Just for the record​: asan remains silent for me on that test case
unless I add the /i flag. (I'm pretty sure that's just a copy-paste-
o,
but I thought it'd be best to be sure.)

Yes, copy/paste. I've just prepped the pat.t test, using
qr{(?=\0z)\0?z?$}i.

If Karl can confirm he's happy with the second patch, I'll transqueue
and push.

Hugo

It seems correct to break immediately if we go back further than the beginning of the string. This code, marked as mine, was essentially copied from elsewhere in the file. I was puzzled by the test for reginfo->intuit, and so did some excavating. That was added in
commit 3b0527f
Author​: Dave Mitchell <davem@​fdisolutions.com>
Date​: Thu Apr 13 18​:31​:54 2006 +0000

But even here, he was essentially just changing an existing variable name. I didn't look further back than that.
--
Karl Williamson

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 11, 2017

From @hvds

Now pushed​:

commit 6785390
  [perl #130522] do not allow endpos to exceed strend
 
  Check substrings can come from lookaheads, so their length can exceed
  minlen. Use a clamped variant of HOP3c to avoid a bad endpos in this case.

commit dda0191
  [perl #130522] don't try to find_by_class outside the string
 
  If the calculated latest start position to try is before our current
  start position, stop immediately​: else we can read out of bounds,
  and end up doing unnecessary work.

commit 871e132
  [perl #130522] test cases for len(STCLASS) > len(target)

Hugo

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 11, 2017

@hvds - Status changed from 'open' to 'pending release'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 11, 2017

From @hvds

Merging this to [perl #130522], as discussed there this is the same problem.

Hugo

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 11, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented Jan 11, 2017

@hvds - Status changed from 'open' to 'pending release'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 30, 2017

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release today of Perl 5.26.0, this and 210 other issues have been
resolved.

Perl 5.26.0 may be downloaded via​:
https://metacpan.org/release/XSAWYERX/perl-5.26.0

If you find that the problem persists, feel free to reopen this ticket.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

@p5pRT p5pRT commented May 30, 2017

@khwilliamson - Status changed from 'pending release' to 'resolved'

@p5pRT p5pRT closed this May 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.