-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CVE-2018-18314] Heap-based buffer overflow in S_regatom #16041
Comments
From @jwilkConsider the following Perl program: m/(?[(?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/; This is what happens when I run it: The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE (?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. Valgrind says it's a heap-based buffer overflow: Invalid write of size 1 $ perl -V Characteristics of this binary (from libperl): -- |
From @demerphqI have a patch for this in yves/fix_131649 The extended charclass parser is a bit unusual in that it uses quite My patch cleans up some of this logic and adds some trace, but it Personally I would expect the first pass to use the same parsing rules Cheers, On 25 June 2017 at 04:40, Jakub Wilk <perl5-security-report@perl.org> wrote:
-- |
The RT System itself - Status changed from 'new' to 'open' |
From @khwilliamsonOn 06/26/2017 05:42 AM, demerphq wrote:
Maybe we should bite the bullet and fold the passes together, like the
|
From @demerphqOn 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> wrote:
I think we should formally define the grammar before we touch the extended_regexp: '(?[' eclass_expr '])' Personally I think it is odd we only allow extended regexps to | extended_regexp to the eclass_expr production. In other words /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ should be legal. We shouldn't have to write /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ for instance, and if we allow /(?[ (?s:(?[ ... ])) ])/ we should allow /(?[ (?:(?[ ... ])) ])/ and /(?[ (?: (?[ ... ]) ) ])/ and etc... (note the whitespace) I think once we have a proper definition of the grammar allowed inside Yves -- |
From @iabynOn Sun, Jul 02, 2017 at 12:17:39PM +0200, demerphq wrote:
Has there been any progress on this issue? -- |
From @demerphqOn 29 November 2017 at 09:58, Dave Mitchell <davem@iabyn.com> wrote:
I am waiting on Karl to reply. As far as I know my patch does fix the issue. Yves -- |
From @khwilliamsonOn 11/29/2017 02:46 AM, demerphq wrote:
I think his patch solves the issue, but my writing a grammar comes after |
From @tonycozOn Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote:
If I understand the issue, compiling an attacker controlled regexp can overflow a heap allocated buffer, making it a security issue. I suspect, given it's a regexp compilation that an attacker has a lot of control over the data written beyond the end of the buffer.[1] Tony [1] which doesn't matter much, given attacks like the one you linked |
From @tonycozOn Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote:
Which patch? I don't see a yves/fix_131649 branch anywhere. Tony |
From @tonycozI plan to request a CVE ID for this issue in the next couple of days. If anyone has already requested an ID, please let me know. Thanks, |
From @tonycozOn Wed, 19 Sep 2018 17:37:48 -0700, tonyc wrote:
It turns out this was fixed in 19a498a, which is included in 5.28. This doesn't cherry-pick cleanly to 5.26, the attached is a merged cherry-pick of the fix. Tony |
From @tonycoz0001-5.26-fix-131649-extended-charclass-can-trigger-assert.patchFrom 926efad7572e351b649af3139c4ca53abe20f19b Mon Sep 17 00:00:00 2001
From: Yves Orton <demerphq@gmail.com>
Date: Mon, 26 Jun 2017 13:19:55 +0200
Subject: fix #131649 - extended charclass can trigger assert
The extended charclass parser makes some assumptions during the
first pass which are only true on well structured input, and it
does not properly catch various errors. later on the code assumes
that things the first pass will let through are valid, when in
fact they should trigger errors.
(cherry picked from commit 19a498a461d7c81ae3507c450953d1148efecf4f)
---
pod/perldiag.pod | 27 ++++++++++++++++++++++++++-
pod/perlrecharclass.pod | 4 ++--
regcomp.c | 28 ++++++++++++++++++----------
t/lib/warnings/regcomp | 6 +++---
t/re/reg_mesg.t | 29 ++++++++++++++++-------------
t/re/regex_sets.t | 6 +++---
6 files changed, 68 insertions(+), 32 deletions(-)
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 106fe41121..c29925a2a4 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -5904,7 +5904,7 @@ yourself.
a perl4 interpreter, especially if the next 2 tokens are "use strict"
or "my $var" or "our $var".
-=item Syntax error in (?[...]) in regex m/%s/
+=item Syntax error in (?[...]) in regex; marked by <-- HERE in m/%s/
(F) Perl could not figure out what you meant inside this construct; this
notifies you that it is giving up trying.
@@ -6402,6 +6402,31 @@ to find out why that isn't happening.
(F) The unexec() routine failed for some reason. See your local FSF
representative, who probably put it there in the first place.
+=item Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/%s/
+
+(F) While parsing an extended character class a ']' character was encountered
+at a point in the definition where the only legal use of ']' is to close the
+character class definition as part of a '])', you may have forgotten the close
+paren, or otherwise confused the parser.
+
+=item Expecting close paren for nested extended charclass in regex; marked by <-- HERE in m/%s/
+
+(F) While parsing a nested extended character class like:
+
+ (?[ ... (?flags:(?[ ... ])) ... ])
+ ^
+
+we expected to see a close paren ')' (marked by ^) but did not.
+
+=item Expecting close paren for wrapper for nested extended charclass in regex; marked by <-- HERE in m/%s/
+
+(F) While parsing a nested extended character class like:
+
+ (?[ ... (?flags:(?[ ... ])) ... ])
+ ^
+
+we expected to see a close paren ')' (marked by ^) but did not.
+
=item Unexpected binary operator '%c' with no preceding operand in regex;
marked by S<<-- HERE> in m/%s/
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index 79480e4131..8c008507d1 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -1128,8 +1128,8 @@ hence both of the following work:
Any contained POSIX character classes, including things like C<\w> and C<\D>
respect the C<E<sol>a> (and C<E<sol>aa>) modifiers.
-C<< (?[ ]) >> is a regex-compile-time construct. Any attempt to use
-something which isn't knowable at the time the containing regular
+Note that C<< (?[ ]) >> is a regex-compile-time construct. Any attempt
+to use something which isn't knowable at the time the containing regular
expression is compiled is a fatal error. In practice, this means
just three limitations:
diff --git a/regcomp.c b/regcomp.c
index 2bee9d4460..5583b0a7bd 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -14823,8 +14823,9 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist,
TRUE /* Force /x */ );
switch (*RExC_parse) {
- case '?':
- if (RExC_parse[1] == '[') depth++, RExC_parse++;
+ case '(':
+ if (RExC_parse[1] == '?' && RExC_parse[2] == '[')
+ depth++, RExC_parse+=2;
/* FALLTHROUGH */
default:
break;
@@ -14881,9 +14882,9 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist,
}
case ']':
- if (depth--) break;
- RExC_parse++;
- if (*RExC_parse == ')') {
+ if (RExC_parse[1] == ')') {
+ RExC_parse++;
+ if (depth--) break;
node = reganode(pRExC_state, ANYOF, 0);
RExC_size += ANYOF_SKIP;
nextchar(pRExC_state);
@@ -14895,20 +14896,25 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist,
return node;
}
- goto no_close;
+ /* We output the messages even if warnings are off, because we'll fail
+ * the very next thing, and these give a likely diagnosis for that */
+ if (posix_warnings && av_tindex_skip_len_mg(posix_warnings) >= 0) {
+ output_or_return_posix_warnings(pRExC_state, posix_warnings, NULL);
+ }
+ RExC_parse++;
+ vFAIL("Unexpected ']' with no following ')' in (?[...");
}
RExC_parse += UTF ? UTF8SKIP(RExC_parse) : 1;
}
- no_close:
/* We output the messages even if warnings are off, because we'll fail
* the very next thing, and these give a likely diagnosis for that */
if (posix_warnings && av_tindex_skip_len_mg(posix_warnings) >= 0) {
output_or_return_posix_warnings(pRExC_state, posix_warnings, NULL);
}
- FAIL("Syntax error in (?[...])");
+ vFAIL("Syntax error in (?[...])");
}
/* Pass 2 only after this. */
@@ -15088,12 +15094,14 @@ redo_curchar:
* inversion list, and RExC_parse points to the trailing
* ']'; the next character should be the ')' */
RExC_parse++;
- assert(UCHARAT(RExC_parse) == ')');
+ if (UCHARAT(RExC_parse) != ')')
+ vFAIL("Expecting close paren for nested extended charclass");
/* Then the ')' matching the original '(' handled by this
* case: statement */
RExC_parse++;
- assert(UCHARAT(RExC_parse) == ')');
+ if (UCHARAT(RExC_parse) != ')')
+ vFAIL("Expecting close paren for wrapper for nested extended charclass");
RExC_parse++;
RExC_flags = save_flags;
diff --git a/t/lib/warnings/regcomp b/t/lib/warnings/regcomp
index 2b084c59b0..51ad57ccbe 100644
--- a/t/lib/warnings/regcomp
+++ b/t/lib/warnings/regcomp
@@ -59,21 +59,21 @@ Unmatched [ in regex; marked by <-- HERE in m/abc[ <-- HERE ���[.00./ at - line
qr/(?[[[:word]]])/;
EXPECT
Assuming NOT a POSIX class since there is no terminating ':' in regex; marked by <-- HERE in m/(?[[[:word <-- HERE ]]])/ at - line 2.
-syntax error in (?[...]) in regex m/(?[[[:word]]])/ at - line 2.
+Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/(?[[[:word]] <-- HERE ])/ at - line 2.
########
# NAME qr/(?[ [[:digit: ])/
# OPTION fatal
qr/(?[[[:digit: ])/;
EXPECT
Assuming NOT a POSIX class since no blanks are allowed in one in regex; marked by <-- HERE in m/(?[[[:digit: ] <-- HERE )/ at - line 2.
-syntax error in (?[...]) in regex m/(?[[[:digit: ])/ at - line 2.
+syntax error in (?[...]) in regex; marked by <-- HERE in m/(?[[[:digit: ]) <-- HERE / at - line 2.
########
# NAME qr/(?[ [:digit: ])/
# OPTION fatal
qr/(?[[:digit: ])/
EXPECT
Assuming NOT a POSIX class since no blanks are allowed in one in regex; marked by <-- HERE in m/(?[[:digit: ] <-- HERE )/ at - line 2.
-syntax error in (?[...]) in regex m/(?[[:digit: ])/ at - line 2.
+syntax error in (?[...]) in regex; marked by <-- HERE in m/(?[[:digit: ]) <-- HERE / at - line 2.
########
# NAME [perl #126141]
# OPTION fatal
diff --git a/t/re/reg_mesg.t b/t/re/reg_mesg.t
index 39cfcf7df1..1b36d6df20 100644
--- a/t/re/reg_mesg.t
+++ b/t/re/reg_mesg.t
@@ -213,8 +213,9 @@ my @death =
'/\b{gc}/' => "'gc' is an unknown bound type {#} m/\\b{gc{#}}/",
'/\B{gc}/' => "'gc' is an unknown bound type {#} m/\\B{gc{#}}/",
- '/(?[[[::]]])/' => "Syntax error in (?[...]) in regex m/(?[[[::]]])/",
- '/(?[[[:w:]]])/' => "Syntax error in (?[...]) in regex m/(?[[[:w:]]])/",
+
+ '/(?[[[::]]])/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[[[::]]{#}])/",
+ '/(?[[[:w:]]])/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[[[:w:]]{#}])/",
'/(?[[:w:]])/' => "",
'/([.].*)[.]/' => "", # [perl #127582]
'/[.].*[.]/' => "", # [perl #127604]
@@ -237,11 +238,12 @@ my @death =
'/(?[ \p{foo} ])/' => 'Can\'t find Unicode property definition "foo" {#} m/(?[ \p{foo}{#} ])/',
'/(?[ \p{ foo = bar } ])/' => 'Can\'t find Unicode property definition "foo = bar" {#} m/(?[ \p{ foo = bar }{#} ])/',
'/(?[ \8 ])/' => 'Unrecognized escape \8 in character class {#} m/(?[ \8{#} ])/',
- '/(?[ \t ]/' => 'Syntax error in (?[...]) in regex m/(?[ \t ]/',
- '/(?[ [ \t ]/' => 'Syntax error in (?[...]) in regex m/(?[ [ \t ]/',
- '/(?[ \t ] ]/' => 'Syntax error in (?[...]) in regex m/(?[ \t ] ]/',
- '/(?[ [ ] ]/' => 'Syntax error in (?[...]) in regex m/(?[ [ ] ]/',
- '/(?[ \t + \e # This was supposed to be a comment ])/' => 'Syntax error in (?[...]) in regex m/(?[ \t + \e # This was supposed to be a comment ])/',
+ '/(?[ \t ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[ \\t ]{#}/",
+ '/(?[ [ \t ]/' => "Syntax error in (?[...]) {#} m/(?[ [ \\t ]{#}/",
+ '/(?[ \t ] ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[ \\t ]{#} ]/",
+ '/(?[ [ ] ]/' => "Syntax error in (?[...]) {#} m/(?[ [ ] ]{#}/",
+ '/(?[ \t + \e # This was supposed to be a comment ])/' =>
+ "Syntax error in (?[...]) {#} m/(?[ \\t + \\e # This was supposed to be a comment ]){#}/",
'/(?[ ])/' => 'Incomplete expression within \'(?[ ])\' {#} m/(?[ {#}])/',
'm/(?[[a-\d]])/' => 'False [] range "a-\d" {#} m/(?[[a-\d{#}]])/',
'm/(?[[\w-x]])/' => 'False [] range "\w-" {#} m/(?[[\w-{#}x]])/',
@@ -427,10 +429,10 @@ my @death_utf8 = mark_as_utf8(
'/���\p{}���/' => 'Empty \p{} {#} m/���\p{{#}}���/',
- '/���(?[[[:���]]])���/' => "Syntax error in (?[...]) in regex m/���(?[[[:���]]])���/",
- '/���(?[[[:���: ])���/' => "Syntax error in (?[...]) in regex m/���(?[[[:���: ])���/",
- '/���(?[[[::]]])���/' => "Syntax error in (?[...]) in regex m/���(?[[[::]]])���/",
- '/���(?[[[:���:]]])���/' => "Syntax error in (?[...]) in regex m/���(?[[[:���:]]])���/",
+ '/���(?[[[:���]]])���/' => "Unexpected ']' with no following ')' in (?[... {#} m/���(?[[[:���]]{#}])���/",
+ '/���(?[[[:���: ])���/' => "Syntax error in (?[...]) {#} m/���(?[[[:���: ])���{#}/",
+ '/���(?[[[::]]])���/' => "Unexpected ']' with no following ')' in (?[... {#} m/���(?[[[::]]{#}])���/",
+ '/���(?[[[:���:]]])���/' => "Unexpected ']' with no following ')' in (?[... {#} m/���(?[[[:���:]]{#}])���/",
'/���(?[[:���:]])���/' => "",
'/���(?[���])���/' => 'Unexpected character {#} m/���(?[���{#}])���/',
'/���(?[ + [���] ])/' => 'Unexpected binary operator \'+\' with no preceding operand {#} m/���(?[ +{#} [���] ])/',
@@ -443,8 +445,9 @@ my @death_utf8 = mark_as_utf8(
'/(?[ \x{���} ])���/' => 'Non-hex character {#} m/(?[ \x{���{#}} ])���/',
'/(?[ \p{���} ])/' => 'Can\'t find Unicode property definition "���" {#} m/(?[ \p{���}{#} ])/',
'/(?[ \p{ ��� = bar } ])/' => 'Can\'t find Unicode property definition "��� = bar" {#} m/(?[ \p{ ��� = bar }{#} ])/',
- '/���(?[ \t ]/' => 'Syntax error in (?[...]) in regex m/���(?[ \t ]/',
- '/(?[ \t + \e # ��� This was supposed to be a comment ])/' => 'Syntax error in (?[...]) in regex m/(?[ \t + \e # ��� This was supposed to be a comment ])/',
+ '/���(?[ \t ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/���(?[ \\t ]{#}/",
+ '/(?[ \t + \e # ��� This was supposed to be a comment ])/' =>
+ "Syntax error in (?[...]) {#} m/(?[ \\t + \\e # ��� This was supposed to be a comment ]){#}/",
'm/(*���)���/' => q<Unknown verb pattern '���' {#} m/(*���){#}���/>,
'/\c���/' => "Character following \"\\c\" must be printable ASCII",
'/\b{���}/' => "'���' is an unknown bound type {#} m/\\b{���{#}}/",
diff --git a/t/re/regex_sets.t b/t/re/regex_sets.t
index 6a79f9d692..e9644bd4e6 100644
--- a/t/re/regex_sets.t
+++ b/t/re/regex_sets.t
@@ -158,13 +158,13 @@ for my $char ("��", "��", "��") {
eval { $_ = '/(?[(\c]) /'; qr/$_/ };
like($@, qr/^Syntax error/, '/(?[(\c]) / should not panic');
eval { $_ = '(?[\c#]' . "\n])"; qr/$_/ };
- like($@, qr/^Syntax error/, '/(?[(\c]) / should not panic');
+ like($@, qr/^Unexpected/, '/(?[(\c]) / should not panic');
eval { $_ = '(?[(\c])'; qr/$_/ };
like($@, qr/^Syntax error/, '/(?[(\c])/ should be a syntax error');
eval { $_ = '(?[(\c]) ]\b'; qr/$_/ };
- like($@, qr/^Syntax error/, '/(?[(\c]) ]\b/ should be a syntax error');
+ like($@, qr/^Unexpected/, '/(?[(\c]) ]\b/ should be a syntax error');
eval { $_ = '(?[\c[]](])'; qr/$_/ };
- like($@, qr/^Syntax error/, '/(?[\c[]](])/ should be a syntax error');
+ like($@, qr/^Unexpected/, '/(?[\c[]](])/ should be a syntax error');
like("\c#", qr/(?[\c#])/, '\c# should match itself');
like("\c[", qr/(?[\c[])/, '\c[ should match itself');
like("\c\ ", qr/(?[\c\])/, '\c\ should match itself');
--
2.11.0
|
From @tonycozOn Sun, 23 Sep 2018 23:42:36 -0700, tonyc wrote:
This is CVE-2018-18314. Tony |
From @tonycozOn Mon, Oct 01, 2018 at 05:19:54PM -0700, Tony Cook via RT wrote:
And this was corrupted apparently at my end, turning all of the Attached a gzipped patch. Tony |
From @steve-m-hayMoved to public queue with the release of 5.26.3. |
From [Unknown Contact. See original ticket]Moved to public queue with the release of 5.26.3. |
From @khwilliamsonThis has been fixed in blead and all maintenance releases, so resolving A new ticket has been created to track creating a formal grammar for regex sets, mentioned originally in this ticket. The new ticket is https://rt-archive.perl.org/perl5/Ticket/Display.html?id=133707 -- |
@khwilliamson - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#131649 (status was 'resolved')
Searchable as RT131649$
The text was updated successfully, but these errors were encountered: