Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NULL pointer dereference in S_pending_ident() #17397

Open
fcambus opened this issue Dec 28, 2019 · 13 comments
Open

NULL pointer dereference in S_pending_ident() #17397

fcambus opened this issue Dec 28, 2019 · 13 comments

Comments

@fcambus
Copy link

fcambus commented Dec 28, 2019

Hi,

While fuzzing Perl 5.30.1 with Honggfuzz, I found a NULL pointer dereference in the S_pending_ident() function, in toke.c.

Attaching a reproducer (gzipped so GitHub accepts it): test01.pl.gz

Issue can be reproduced by running:

perl test01.pl
AddressSanitizer:DEADLYSIGNAL
=================================================================
==13609==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x0000005857d1 bp 0x7ffd300e8150 sp 0x7ffd300e6f80 T0)
==13609==The signal is caused by a READ memory access.
==13609==Hint: address points to the zero page.
    #0 0x5857d0 in S_pending_ident /home/fcambus/perl-5.30.1/toke.c:9111:17
    #1 0x5857d0 in Perl_yylex /home/fcambus/perl-5.30.1/toke.c:4903:13
    #2 0x5d21cc in Perl_yyparse /home/fcambus/perl-5.30.1/perly.c:340:34
    #3 0x54cfa0 in S_parse_body /home/fcambus/perl-5.30.1/perl.c:2531:9
    #4 0x54cfa0 in perl_parse /home/fcambus/perl-5.30.1/perl.c:1822:2
    #5 0x4df38c in main /home/fcambus/perl-5.30.1/perlmain.c:126:10
    #6 0x7f1b520c11e2 in __libc_start_main /build/glibc-4WA41p/glibc-2.30/csu/../csu/libc-start.c:308:16
    #7 0x437bfd in _start (/home/fcambus/perl-5.30.1/perl+0x437bfd)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/fcambus/perl-5.30.1/toke.c:9111:17 in S_pending_ident
==13609==ABORTING
@jkeenan
Copy link
Contributor

jkeenan commented Dec 28, 2019

What happens if you rewrite the test program to include:

use strict;
use warnings;

... and then re-fuzz it?

Thank you very much.
Jim Keenan

@jkeenan
Copy link
Contributor

jkeenan commented Dec 28, 2019

FWIW, from at least perl-5.6.2 to perl-5.8.9, the test program fails to compile.

$ perlbrew use perl-5.6.2
$ perl ghi-17397-test01.pl 
syntax error at ghi-17397-test01.pl line 197, near "{]"
syntax error at ghi-17397-test01.pl line 197, near "{]"
syntax error at ghi-17397-test01.pl line 265, near "print"
syntax error at ghi-17397-test01.pl line 271, near "E1 "
syntax error at ghi-17397-test01.pl line 346, near "{ ^"
syntax error at ghi-17397-test01.pl line 352, near "{^"
syntax error at ghi-17397-test01.pl line 355, near "{ ^"
Execution of ghi-17397-test01.pl aborted due to compilation errors.

As of (at least) perl-5.10.1, it fails to compile and segfaults at the point where the program dies.

$ perlbrew use perl-5.10.1
$ perl ghi-17397-test01.pl 
Semicolon seems to be missing at ghi-17397-test01.pl line 270.
String found where operator expected at ghi-17397-test01.pl line 363, near "}" ne ""
	(Missing operator before " ne "?)
Bareword found where operator expected at ghi-17397-test01.pl line 363, near "" ne "bar"
	(Missing operator before bar?)
String found where operator expected at ghi-17397-test01.pl line 364, near "print ""
  (Might be a runaway multi-line "" string starting on line 363)
	(Missing semicolon on previous line?)
Bareword found where operator expected at ghi-17397-test01.pl line 364, near "print "ok"
	(Do you need to predeclare print?)
Backslash found where operator expected at ghi-17397-test01.pl line 364, near "$test\"
	(Missing operator before \?)
String found where operator expected at ghi-17397-test01.pl line 366, near "print ""
  (Might be a runaway multi-line "" string starting on line 364)
	(Missing semicolon on previous line?)
Scalar found where operator expected at ghi-17397-test01.pl line 366, near "" if "${^TEST}"
	(Missing operator before ${^TEST}?)
Bareword found where operator expected at ghi-17397-test01.pl line 366, near "" ne "splat"
	(Missing operator before splat?)
Bareword found where operator expected at ghi-17397-test01.pl line 367, near "print "ok"
  (Might be a runaway multi-line "" string starting on line 366)
	(Do you need to predeclare print?)
Backslash found where operator expected at ghi-17397-test01.pl line 367, near "$test\"
	(Missing operator before \?)
String found where operator expected at ghi-17397-test01.pl line 369, near "print ""
  (Might be a runaway multi-line "" string starting on line 367)
	(Missing semicolon on previous line?)
Scalar found where operator expected at ghi-17397-test01.pl line 369, near "" if "$"
	(Missing operator before $?)
Segmentation fault (core dumped)

As yet I see no evidence that the program ever ran to completion.

Thank you very much.
Jim Keenan

@jkeenan
Copy link
Contributor

jkeenan commented Dec 29, 2019

The test file appears to be a fork from an older version of t/base/lex.t. If, in the test file, you make this one correction:

$ diff -w test01.pl ghi-17397-test04.pl
363c363
<   print "not " if${ ^TEST [1] }" ne "bar";
---
>   print "not " if "${ ^TEST [1] }" ne "bar";

... the segfault goes away (although the file still does not compile).

Thank you very much.
Jim Keenan

@hvds
Copy link
Contributor

hvds commented Dec 29, 2019

It's probably not profitable to critique perl code generated by fuzzing. If we can reproduce it, the next step is to attempt to reduce it to a minimal test case and concentrate on why it triggers a coredump.

miniperl@blead happily reproduces it; it reduces at least to:

<<E1;
${sub{b{]]]{} @{[ <<E2 ]}
E2
E1

I hope someone with fresher knowledge of the lexer will look at this, so I don't have to.

Hugo

@khwilliamson
Copy link
Contributor

In 5.35.10, if I run either the fuzzed program or the reduction above, the dereference is gone. In the reduction, we get

syntax error at 17397.pl line 2, near "{]"
syntax error at 17397.pl line 1, near "<<E1"
panic: pad_swipe po=3, fill=1 at 17397.pl line 1.
panic: pad_swipe po=3, fill=1 at 17397.pl line 1.

In the unreduced version, we get lots of syntax errors.

But I believe this is now closable

@khwilliamson khwilliamson added the Closable? We might be able to close this ticket, but we need to check with the reporter label Apr 16, 2022
@hvds
Copy link
Contributor

hvds commented Apr 16, 2022

Let me see if I can bisect to find what fixed it: it would be useful to know at least if it was intentional.

@hvds
Copy link
Contributor

hvds commented Apr 17, 2022

In 5.35.10, if I run either the fuzzed program or the reduction above, the dereference is gone. In the reduction, we get

syntax error at 17397.pl line 2, near "{]"
syntax error at 17397.pl line 1, near "<<E1"
panic: pad_swipe po=3, fill=1 at 17397.pl line 1.
panic: pad_swipe po=3, fill=1 at 17397.pl line 1.

In the unreduced version, we get lots of syntax errors.

But I believe this is now closable

Huh, with blead@f5469426bb I get:

% ./Configure -des -Dcc=gcc -Dprefix=/opt/scratch -Doptimize='-g -O6' -Dusedevel -Uversiononly \
  && make miniperl
[...]
% ./miniperl
<<E1;
${sub{b{]]]{} @{[ <<E2 ]}
E2
E1
Segmentation fault (core dumped)
% 

So I don't think this is closable. @khwilliamson how did you build and test?

I'm pretty sure this is yet another case of foolishly attempting to continue after errors, which @iabyn may one day save us from.

@demerphq
Copy link
Collaborator

demerphq commented Apr 17, 2022 via email

@khwilliamson khwilliamson removed the Closable? We might be able to close this ticket, but we need to check with the reporter label Apr 17, 2022
@khwilliamson
Copy link
Contributor

It turns out that the crucial distinction between my Configure call and yours was -DDEBUGGING in mine

I fell prey to confirmation bias. I had made some changes in toke.c some releases ago that fixed a number of segfaults from malformed input, and I thought that that could very well have fixed this.

Bisecting should be done in order to close tickets like this. Is there some place to say that in the docs?

@jkeenan
Copy link
Contributor

jkeenan commented Apr 17, 2022

[snip]

Bisecting should be done in order to close tickets like this. Is there some place to say that in the docs?

I don't think that's necessary. Quite a few committers have gotten fluent in the use of Porting/bisect.pl and we have a good sense of the kind of bug tickets for which it is useful. The trick is to figure out the arguments for that program. As we've encountered new cases, we've added them to the EXAMPLES section of the documentation in Porting/bisect-runner.pl.

@jkeenan jkeenan closed this as completed Apr 17, 2022
@jkeenan jkeenan reopened this Apr 17, 2022
@jkeenan
Copy link
Contributor

jkeenan commented Apr 17, 2022

Closed by mistake; re-opened.

@demerphq
Copy link
Collaborator

demerphq commented Sep 8, 2022

This is fixed by eeaa145 (unmerged as yet).

demerphq added a commit that referenced this issue Sep 8, 2022
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like
undeclared variables) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful
most of the time as the internal state of the parser can get confused
and is not reliably restored in between attempts. This can produce
sometimes completely bizarre errors which just obscure the true error,
and has resulted in security tickets being filed in the past.

This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors, we can iterate on that in the future.

This fixes the segfault in Issue #17397 and likely fixes other "segfault
due to compiler continuation after syntax error" bugs that we have on
record which has been a recurring issue over the years.
@demerphq
Copy link
Collaborator

demerphq commented Sep 8, 2022

Currently in #20168

demerphq added a commit that referenced this issue Sep 8, 2022
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like
undeclared variables) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful
most of the time as the internal state of the parser can get confused
and is not reliably restored in between attempts. This can produce
sometimes completely bizarre errors which just obscure the true error,
and has resulted in security tickets being filed in the past.

This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors, we can iterate on that in the future.

This fixes the segfaults reported in Issue #17397, and #16944 and
likely fixes other "segfault due to compiler continuation after syntax
error" bugs that we have on record, which has been a recurring issue
over the years.
demerphq added a commit that referenced this issue Sep 9, 2022
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like
undeclared variables) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful
most of the time as the internal state of the parser can get confused
and is not reliably restored in between attempts. This can produce
sometimes completely bizarre errors which just obscure the true error,
and has resulted in security tickets being filed in the past.

This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors, we can iterate on that in the future.

This fixes the segfaults reported in Issue #17397, and #16944 and
likely fixes other "segfault due to compiler continuation after syntax
error" bugs that we have on record, which has been a recurring issue
over the years.
scottchiefbaker pushed a commit to scottchiefbaker/perl5 that referenced this issue Nov 3, 2022
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like
undeclared variables) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful
most of the time as the internal state of the parser can get confused
and is not reliably restored in between attempts. This can produce
sometimes completely bizarre errors which just obscure the true error,
and has resulted in security tickets being filed in the past.

This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors, we can iterate on that in the future.

This fixes the segfaults reported in Issue Perl#17397, and Perl#16944 and
likely fixes other "segfault due to compiler continuation after syntax
error" bugs that we have on record, which has been a recurring issue
over the years.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants