Skip to content

Commit

Permalink
Stop parsing on first syntax error.
Browse files Browse the repository at this point in the history
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like
undeclared variables) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful
most of the time as the internal state of the parser can get confused
and is not reliably restored in between attempts. This can produce
sometimes completely bizarre errors which just obscure the true error,
and has resulted in security tickets being filed in the past.

This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors, we can iterate on that in the future.

This fixes the segfaults reported in Issue #17397, and #16944 and
likely fixes other "segfault due to compiler continuation after syntax
error" bugs that we have on record, which has been a recurring issue
over the years.
  • Loading branch information
demerphq committed Sep 9, 2022
1 parent 14580ae commit eb54d46
Show file tree
Hide file tree
Showing 17 changed files with 111 additions and 53 deletions.
2 changes: 1 addition & 1 deletion embed.fnc
Original file line number Diff line number Diff line change
Expand Up @@ -2727,7 +2727,7 @@ p |void |write_to_stderr|NN SV* msv
: Used in op.c
p |int |yyerror |NN const char *const s
p |void |yyquit
pr |void |abort_execution|NN const char * const msg|NN const char * const name
pr |void |abort_execution|NULLOK SV *msg_sv|NN const char * const name
p |int |yyerror_pv |NN const char *const s|U32 flags
p |int |yyerror_pvn |NULLOK const char *const s|STRLEN len|U32 flags
: Used in perly.y, and by Data::Alias
Expand Down
2 changes: 1 addition & 1 deletion perl.c
Original file line number Diff line number Diff line change
Expand Up @@ -2596,7 +2596,7 @@ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)

SETERRNO(0,SS_NORMAL);
if (yyparse(GRAMPROG) || PL_parser->error_count) {
abort_execution("", PL_origfilename);
abort_execution(NULL, PL_origfilename);
}
CopLINE_set(PL_curcop, 0);
SET_CURSTASH(PL_defstash);
Expand Down
6 changes: 6 additions & 0 deletions perl.h
Original file line number Diff line number Diff line change
Expand Up @@ -8936,6 +8936,12 @@ END_EXTERN_C
#define PERL_DIAG_WARN_SYNTAX(x) PERL_DIAG_STR_(x)
#define PERL_DIAG_DIE_SYNTAX(x) PERL_DIAG_STR_(x)

#define PERL_STOP_PARSING_AFTER_N_ERRORS 10

#define PERL_PARSE_IS_SYNTAX_ERROR_FLAG 128
#define PERL_PARSE_IS_SYNTAX_ERROR(f) ((f) & PERL_PARSE_IS_SYNTAX_ERROR_FLAG)
#define PERL_PARSE_ERROR_COUNT(f) ((f) & (PERL_PARSE_IS_SYNTAX_ERROR_FLAG-1))

/*
(KEEP THIS LAST IN perl.h!)
Expand Down
18 changes: 18 additions & 0 deletions pod/perldelta.pod
Original file line number Diff line number Diff line change
Expand Up @@ -246,6 +246,24 @@ L<Locale '%s' is unsupported, and may crash the interpreter.message|perldiag/"Lo

=item *

The compiler will now stop parsing on the first syntax error it
encounters. Historically the compiler would attempt to "skip past" the
error and continue parsing so that it could list multiple errors. For
things like undeclared variables under strict this makes sense. For
syntax errors however it has been found that continuing tends to result
in a storm of unrelated or bizarre errors that mostly just obscure the
true error. In extreme cases it can even lead to segfaults and other
malbehavior.

Therefore we have reformed the continuation logic so that the parse will
stop after the first seen syntax error. Semantic errors like undeclared
variables will not stop the parse, so you may still see multiple errors
when compiling code. However if there is a syntax error it will be the
last error message reported by perl and all of the errors that you see
will be something that actually needs to be fixed.

=item *

Error messages that output class or package names have been modified to
output double quoted strings with various characters escaped so as to
make the exact value clear to a reader. The exact rules on which
Expand Down
4 changes: 4 additions & 0 deletions pod/perldiag.pod
Original file line number Diff line number Diff line change
Expand Up @@ -2235,6 +2235,10 @@ Catamount. See L<perlport>.

(F) The final summary message when a Perl compilation fails.

=item Execution of %s aborted due to compilation errors.

(F) The final summary message when a Perl compilation fails.

=item exists argument is not a HASH or ARRAY element or a subroutine

(F) The argument to C<exists> must be a hash or array element or a
Expand Down
10 changes: 9 additions & 1 deletion pp_ctl.c
Original file line number Diff line number Diff line change
Expand Up @@ -1669,8 +1669,16 @@ Perl_qerror(pTHX_ SV *err)
sv_catsv(PL_errors, err);
else
Perl_warn(aTHX_ "%" SVf, SVfARG(err));
if (PL_parser)

if (PL_parser) {
STRLEN len;
char *err_pv = SvPV(err,len);
++PL_parser->error_count;
if (memBEGINs(err_pv,len,"syntax error"))
{
PL_parser->error_count |= PERL_PARSE_IS_SYNTAX_ERROR_FLAG;
}
}
}


Expand Down
4 changes: 2 additions & 2 deletions proto.h
Original file line number Diff line number Diff line change
Expand Up @@ -231,11 +231,11 @@ PERL_CALLCONV UV Perl__utf8n_to_uvchr_msgs_helper(const U8 *s, STRLEN curlen, ST
assert(s)
PERL_CALLCONV void Perl__warn_problematic_locale(void);
#define PERL_ARGS_ASSERT__WARN_PROBLEMATIC_LOCALE
PERL_CALLCONV_NO_RET void Perl_abort_execution(pTHX_ const char * const msg, const char * const name)
PERL_CALLCONV_NO_RET void Perl_abort_execution(pTHX_ SV *msg_sv, const char * const name)
__attribute__noreturn__
__attribute__visibility__("hidden");
#define PERL_ARGS_ASSERT_ABORT_EXECUTION \
assert(msg); assert(name)
assert(name)

PERL_CALLCONV LOGOP* Perl_alloc_LOGOP(pTHX_ I32 type, OP *first, OP *other)
__attribute__visibility__("hidden");
Expand Down
8 changes: 0 additions & 8 deletions t/lib/croak/toke
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ EXPECT
Scalar found where operator expected at - line 1, near "0${"
(Missing operator before ${?)
syntax error at - line 1, near "0$"
Missing right curly or square bracket at - line 1, at end of line
Execution of - aborted due to compilation errors.
########
# NAME (Missing operator before $#{?) [perl #123737]
Expand All @@ -34,7 +33,6 @@ EXPECT
Array length found where operator expected at - line 1, near "0$#{"
(Missing operator before $#{?)
syntax error at - line 1, near "0$#"
Missing right curly or square bracket at - line 1, at end of line
Execution of - aborted due to compilation errors.
########
# NAME (Missing operator before @foo) [perl #123737]
Expand All @@ -52,7 +50,6 @@ EXPECT
Array found where operator expected at - line 1, near "0@{"
(Missing operator before @{?)
syntax error at - line 1, near "0@"
Missing right curly or square bracket at - line 1, at end of line
Execution of - aborted due to compilation errors.
########
# NAME Unterminated here-doc in string eval
Expand Down Expand Up @@ -483,8 +480,6 @@ Use of bare << to mean <<"" is forbidden at - line 1.
EXPECT
Bareword found where operator expected at - line 1, near "1e"
(Missing operator before e?)
Number found where operator expected at - line 1, near "--5"
(Missing operator before 5?)
syntax error at - line 1, near "1e"
Execution of - aborted due to compilation errors.
########
Expand Down Expand Up @@ -573,9 +568,6 @@ Execution of - aborted due to compilation errors.
EXPECT
Number found where operator expected at - line 1, near "0 0x"
(Missing operator before 0x?)
Array found where operator expected at - line 1, near "0x@
;"
(Missing operator before ;?)
No digits found for hexadecimal literal at - line 1, near "0 0x@"
syntax error at - line 1, near "0 0x"
Execution of - aborted due to compilation errors.
Expand Down
4 changes: 2 additions & 2 deletions t/lib/subs/subs
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ EXPECT
Number found where operator expected at - line 3, near "Fred 1"
(Do you need to predeclare Fred?)
syntax error at - line 3, near "Fred 1"
BEGIN not safe after errors--compilation aborted at - line 4.
Execution of - aborted due to compilation errors.
########

# AOK
Expand Down Expand Up @@ -104,4 +104,4 @@ EXPECT
Number found where operator expected at - line 5, near "ふれど 1"
(Do you need to predeclare ふれど?)
syntax error at - line 5, near "ふれど 1"
BEGIN not safe after errors--compilation aborted at - line 6.
Execution of - aborted due to compilation errors.
2 changes: 0 additions & 2 deletions t/lib/warnings/7fatal
Original file line number Diff line number Diff line change
Expand Up @@ -545,8 +545,6 @@ if (1 {
}
EXPECT
syntax error at - line 4, near "1 {"
"my" variable $x masks earlier declaration in same statement at - line 6.
syntax error at - line 7, near "}"
Execution of - aborted due to compilation errors.
########

Expand Down
11 changes: 2 additions & 9 deletions t/lib/warnings/toke
Original file line number Diff line number Diff line change
Expand Up @@ -261,13 +261,8 @@ Reversed *= operator at - line 5.
Reversed %= operator at - line 6.
Reversed &= operator at - line 7.
Reversed .= operator at - line 8.
Reversed ^= operator at - line 9.
Reversed |= operator at - line 10.
Reversed <= operator at - line 11.
syntax error at - line 8, near "=."
syntax error at - line 9, near "=^"
syntax error at - line 10, near "=|"
Unterminated <> operator at - line 11.
Execution of - aborted due to compilation errors.
########
# toke.c
no warnings 'syntax' ;
Expand All @@ -283,9 +278,7 @@ $a =< 2 ;
$a =/ 2 ;
EXPECT
syntax error at - line 8, near "=."
syntax error at - line 9, near "=^"
syntax error at - line 10, near "=|"
Unterminated <> operator at - line 11.
Execution of - aborted due to compilation errors.
########
# toke.c
use warnings 'syntax' ;
Expand Down
12 changes: 11 additions & 1 deletion t/op/heredoc.t
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ BEGIN {
}

use strict;
plan(tests => 137);
plan(tests => 138);

# heredoc without newline (#65838)
{
Expand Down Expand Up @@ -233,3 +233,13 @@ HEREDOC
);
}
}
fresh_perl_like(
q#<<E1;
${sub{b{]]]{} @{[ <<E2 ]}
E2
E1
#,
qr/^syntax error/,
{},
"GH Issue #17397 - Syntax error inside of here doc causes segfault"
);
4 changes: 2 additions & 2 deletions t/op/lex.t
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ SKIP: {
Bareword found where operator expected at - line 1, near ""ab}"ax"
(Missing operator before ax?)
syntax error at - line 1, near ""ab}"ax"
Unrecognized character \\x8A; marked by <-- HERE after ab}"ax;&\0z<-- HERE near column 12 at - line 1.
Execution of - aborted due to compilation errors.
gibberish
{ stderr => 1 },
'gibberish containing &\0z - used to crash [perl #123753]'
Expand All @@ -144,7 +144,7 @@ gibberish
Bareword found where operator expected at - line 1, near ""ab}"ax"
(Missing operator before ax?)
syntax error at - line 1, near ""ab}"ax"
Unrecognized character \\x8A; marked by <-- HERE after }"ax;&{+z}<-- HERE near column 14 at - line 1.
Execution of - aborted due to compilation errors.
gibberish
{ stderr => 1 },
'gibberish containing &{+z} - used to crash [perl #123753]'
Expand Down
9 changes: 8 additions & 1 deletion t/op/sub.t
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ BEGIN {
set_up_inc('../lib');
}

plan(tests => 62);
plan(tests => 63);

sub empty_sub {}

Expand Down Expand Up @@ -426,3 +426,10 @@ eval '
CORE::state sub b; sub d { sub b {} sub d }
';
eval '()=%e; sub e { sub e; eval q|$x| } e;';

fresh_perl_like(
q#<s,,$0[sub{m]]]],}>0,shift#,
qr/^syntax error/,
{},
"GH Issue #16944 - Syntax error with sub and shift causes segfault"
);
8 changes: 4 additions & 4 deletions t/op/tie.t
Original file line number Diff line number Diff line change
Expand Up @@ -731,10 +731,10 @@ $foo{'exit'};
print "overshot main\n"; # shouldn't reach here
EXPECT
eval: s0=EVAL-BD-BS-E1-S1-E2-S2-S2-S2-S2-S2-S2-S2-S2-S2-S2-R
eval: s1=EVAL-BD-BS-E1-S1-E2-S2-S2-S2-S2-S2-S2-S2-S2-S2-S2-R
eval: s2=EVAL-BD-BS-E1-S1-E2-S2-S2-S2-S2-S2-S2-S2-S2-S2-S2-R
eval: s3=EVAL-BD-BS-E1-S1-E2-S2-S2-S2-S2-S2-S2-S2-S2-S2-S2-R
eval: s0=EVAL-BD-BS-E1-S1-E2-S2-R
eval: s1=EVAL-BD-BS-E1-S1-E2-S2-R
eval: s2=EVAL-BD-BS-E1-S1-E2-S2-R
eval: s3=EVAL-BD-BS-E1-S1-E2-S2-R
require: s0=REQUIRE-0-ERQ-ENDRQ-1-ERQ-ENDRQ-2-ERQ-ENDRQ-3-ERQ-ENDRQ-R
require: s1=REQUIRE-0-RQ
require: s2=REQUIRE-0-ERQ-ENDRQ-1-ERQ-ENDRQ-2-ERQ-ENDRQ-3-ERQ-ENDRQ-R
Expand Down
4 changes: 0 additions & 4 deletions t/run/fresh_perl.t
Original file line number Diff line number Diff line change
Expand Up @@ -817,8 +817,6 @@ meow {
};
EXPECT
syntax error at - line 12, near "used"
syntax error at - line 12, near "used}"
Unmatched right curly bracket at - line 14, at end of line
Execution of - aborted due to compilation errors.
######## [perl #112312] crash on syntax error - another test
# SKIP: !defined &DynaLoader::boot_DynaLoader # miniperl
Expand Down Expand Up @@ -849,6 +847,4 @@ sub testo {
EXPECT
syntax error at - line 15, near "used"
syntax error at - line 15, near "used}"
Unmatched right curly bracket at - line 17, at end of line
Execution of - aborted due to compilation errors.
56 changes: 41 additions & 15 deletions toke.c
Original file line number Diff line number Diff line change
Expand Up @@ -12521,15 +12521,24 @@ S_yywarn(pTHX_ const char *const s, U32 flags)
}

void
Perl_abort_execution(pTHX_ const char * const msg, const char * const name)
Perl_abort_execution(pTHX_ SV* msg_sv, const char * const name)
{
PERL_ARGS_ASSERT_ABORT_EXECUTION;

if (PL_minus_c)
Perl_croak(aTHX_ "%s%s had compilation errors.\n", msg, name);
else {
Perl_croak(aTHX_
"%sExecution of %s aborted due to compilation errors.\n", msg, name);
if (msg_sv) {
if (PL_minus_c)
Perl_croak(aTHX_ "%" SVf "%s had compilation errors.\n", SVfARG(msg_sv), name);
else {
Perl_croak(aTHX_
"%" SVf "Execution of %s aborted due to compilation errors.\n", SVfARG(msg_sv), name);
}
} else {
if (PL_minus_c)
Perl_croak(aTHX_ "%s had compilation errors.\n", name);
else {
Perl_croak(aTHX_
"Execution of %s aborted due to compilation errors.\n", name);
}
}
NOT_REACHED; /* NOTREACHED */
}
Expand Down Expand Up @@ -12644,22 +12653,39 @@ Perl_yyerror_pvn(pTHX_ const char *const s, STRLEN len, U32 flags)
qerror(msg);
}
}
if (s == NULL || PL_error_count >= 10) {
const char * msg = "";
if ( s == NULL ||
PL_error_count >= PERL_STOP_PARSING_AFTER_N_ERRORS
) {
const char * const name = OutCopFILE(PL_curcop);
SV * errsv = NULL;
U8 raw_error_count = PERL_PARSE_ERROR_COUNT(PL_error_count);
bool syntax_error = PERL_PARSE_IS_SYNTAX_ERROR(PL_error_count);

if (PL_in_eval) {
SV * errsv = ERRSV;
if (SvCUR(errsv)) {
msg = Perl_form(aTHX_ "%" SVf, SVfARG(errsv));
}
errsv = ERRSV;
}

if (s == NULL) {
abort_execution(msg, name);
abort_execution(errsv, name);
}
else {
Perl_croak(aTHX_ "%s%s has too many errors.\n", msg, name);
else
if (raw_error_count >= PERL_STOP_PARSING_AFTER_N_ERRORS) {
if (errsv) {
Perl_croak(aTHX_ "%" SVf "%s has too many errors.\n",
SVfARG(errsv), name);
} else {
Perl_croak(aTHX_ "%s has too many errors.\n", name);
}
}
else
/* if (syntax_error) - implied */
{
assert(syntax_error);
if (errsv) {
Perl_croak_sv(aTHX_ errsv);
} else {
abort_execution(errsv, name);
}
}
}
PL_in_my = 0;
Expand Down

0 comments on commit eb54d46

Please sign in to comment.