New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DoS risk: small regex gives: panicked at 'tried to unwrap expr from HirFrame, got: Concat' #465

Closed
PaulGrandperrin opened this Issue Apr 14, 2018 · 10 comments

Comments

Projects
None yet
3 participants
@PaulGrandperrin

PaulGrandperrin commented Apr 14, 2018

Hi @BurntSushi ,
Sorry to bother you :-) another one!

regex::Regex::new("(?i)(?i)*?");
thread 'main' panicked at 'tried to unwrap expr from HirFrame, got: Concat', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/regex-syntax-0.5.3/src/hir/translate.rs:203:18

playground

@robertswiecki : honggfuzz is really good! libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

@robertswiecki

This comment has been minimized.

robertswiecki commented Apr 14, 2018

Interesting! Seems like we (libfuzzer, afl, honggfuzz) might be using complementary strategies after all.

@PaulGrandperrin

This comment has been minimized.

PaulGrandperrin commented Apr 14, 2018

I guess so. I'll try to help on the rust front to make it easier to fuzz with all three.

@BurntSushi

This comment has been minimized.

Member

BurntSushi commented Apr 15, 2018

Thanks! This is another good one, and I thought I had a test case for this. I'll look into it, but won't be able to fix it until tomorrow.

Is it possible to batch these bugs? I wouldn't be surprised if you found more, and it'd make more sense to just throw as many as you can over the wall so I can fix them in one go.

libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

To be clear, the regex crate got a rewrite of its regex-syntax crate a few months ago. So if you're the first to run a fuzzer against that rewrite (I haven't), then you'll be the first to pick at the low hanging fruit. :-)

@BurntSushi BurntSushi added the bug label Apr 15, 2018

@PaulGrandperrin

This comment has been minimized.

PaulGrandperrin commented Apr 15, 2018

Is it possible to batch these bugs?

Yes, of course, it's just that I didn't know there was a recent rewrite so I didn't expect to find many bugs if any.

I ran the fuzzer for a couple of hours and it just found another one:

regex::Regex::new("(?i)?");
// thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:335:21

I think that's it for the most obvious ones. But I'll improve a bit the process and continue to fuzz for a small while.

Do you have any recommendation of APIs that might need more scrutiny than others?

@BurntSushi

This comment has been minimized.

Member

BurntSushi commented Apr 15, 2018

Do you have any recommendation of APIs that might need more scrutiny than others?

Traditionally, fuzzers have found bugs by comparing the outputs of the different execution engines, but it requires dealing with internal APIs. There are a few open issues detailing bugs in that space.

With respect to the parser rewrite though, no, I think Regex::new is exactly the thing you want to test. The other piece is printing the error message returned by Regex::new when given an invalid pattern, but it looks like your setup already accounts for this. If you test those two things, then that should cover most or all of the rewrite.

@BurntSushi

This comment has been minimized.

Member

BurntSushi commented Apr 15, 2018

I will let this issue sit for a few days before diving into bug fixing mode. :-)

@PaulGrandperrin

This comment has been minimized.

PaulGrandperrin commented Apr 15, 2018

Ok, I'll leave it to fuzz for a few days then.
Could you confirm that this code is enough to cover the code rewrite?

pub fn fuzz_regex(data: &[u8]) {
    if let Ok(data) = std::str::from_utf8(data) {
        let _ = regex::Regex::new(data);
    }
}

It seems to me that it's not mandatory to display the error (if any) because I didn't see any real logic in the Debug and Display implementation of regex::Error. I guess the new code printing the message is executed while constructing the regex::Error so if it panics, we'll see it.

I've also configured the fuzzer to generate input with a maximum size of 20bytes, does it looks like a good size to you?

@BurntSushi

This comment has been minimized.

Member

BurntSushi commented Apr 15, 2018

Yup, that code looks good to me. And yes, the formatter is indeed run when the regex::Error is built.

20 bytes is probably good. At least, I can't think of any specific reason to go bigger.

You might also consider adding (?x) to the beginning of patterns, if that's easy to do. Alternatively, enable ignore_whitespace, which should accomplish the same thing. I bet that will uncover more bugs. :-)

@PaulGrandperrin

This comment has been minimized.

PaulGrandperrin commented Apr 15, 2018

Great, I'll do that.

@PaulGrandperrin

This comment has been minimized.

PaulGrandperrin commented Apr 16, 2018

Ok, I think you can go ahead and go into bug fixing mode whenever you want :-)
After almost 1 billion iterations, honggfuzz stopped being able to find new coverage.
I then tried libFuzzer, but it was unusable because it cannot continue fuzzing once it finds one crash.
And AFL was just too slow and wasn't able to deduplicate similar crashes so its output would have been unusable.
So it didn't find anything new and I'm stopping fuzzing.

killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018

Add regex syntax target
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `(?i)?i�`
- `(?m)?`
- `CBh~62�Y((?i))??i�
- `(?m)?90`
- `(?i)?`

[1]: rust-lang/regex#465

killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018

Add regex syntax target
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`�
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`�
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465

killercup added a commit to killercup/targets that referenced this issue Apr 27, 2018

Add regex syntax target
Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465

bors bot added a commit to rust-fuzz/targets that referenced this issue Apr 27, 2018

Merge #112
112: Add regex syntax target r=PaulGrandperrin a=killercup

Add regex syntax target

Run it with `cargo run target regex_syntax`

Finds the following inputs that yield panics, some of which are also
mentioned in [1]:

- `b"(?m)?"`
- `b"(?i)?i\x0e"`
- `b"CBh~62\x17Y((?i))??i\x0e"`
- `b"(?m)?90"`
- `b"(?i)?"`

[1]: rust-lang/regex#465

Co-authored-by: Pascal Hertleif <killercup@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment