Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify fuzzer, but also fuzz all the (non-limits) options in RegexBuilder #821

Closed
wants to merge 1 commit into from

Conversation

5225225
Copy link
Contributor

@5225225 5225225 commented Nov 28, 2021

I'm not sure if the OSS-fuzz integration can handle data not being a &[u8], but cargo-fuzz can. (It looks like they just use cargo fuzz, so it should work.)

This does change the input format, which might confuse the fuzzer (but I'd expect it to pretty quickly how to pass strings). I could make this as a new fuzz target if you want.

One advantage taking &str over &[u8] has is that if the end of a &str is invalid UTF-8, then it will still be used, but it (the implementation of Arbitrary for &str) just cuts off the invalid part, which is better than not running at all.

I ran this for a bit locally and didn't find any issues. I can back out the commit to add the options and just have the one that removes the complex parsing code if you want.

@BurntSushi
Copy link
Member

@5225225 Thanks for this PR! This is a really nice simplification.

One thing that this seems to make harder though is converting failing fuzz inputs into regression tests. Today, I can do something like this:

#[test] 
fn regression_foo_bar() {
    let raw = include_bytes!("path/to/crash-blahblah");
    let _ = run(raw);
}

fn run(raw: &[u8]) -> Option<()> {
    // a copy of the fuzz target code
    Some(())
}

But with this simplification, it seems like it covers up the conversion of the fuzzer input of &[u8], which I think makes writing run harder. Is there a way to explicitly convert &[u8] into the more structure types such that I could copy that into the test code above?

(I'm also open to solving the problem of fuzz regression tests in a different way.)

@5225225
Copy link
Contributor Author

5225225 commented May 4, 2023

I did something that I'm not sure is terrible or very clever. But it's definitely easy. (Below is for an artificial error, i just put in an assert somewhere to panic on nontrivial input).

Failing input:

	fuzz/artifacts/fuzz_regex_match/crash-dfcf19f0e4fba283e45253fd6eabd3c77b557cdf

Output of `std::fmt::Debug`:

	
	let r = regex::RegexBuilder::new(",@*{000120}}}}}}}")
	    .case_insensitive(false)
	    .multi_line(false)
	    .dot_matches_new_line(false)
	    .swap_greed(false)
	    .ignore_whitespace(false)
	    .unicode(false)
	    .octal(false)
	    .build();
	
	if let Ok(re) = r {
	    re.is_match("");
	}
	        

Reproduce with:

	cargo fuzz run fuzz_regex_match fuzz/artifacts/fuzz_regex_match/crash-dfcf19f0e4fba283e45253fd6eabd3c77b557cdf

Minimize test case with:

	cargo fuzz tmin fuzz_regex_match fuzz/artifacts/fuzz_regex_match/crash-dfcf19f0e4fba283e45253fd6eabd3c77b557cdf

Basically, the debug print of whatever you're fuzzing is printed on failure, so if you just make that debug print actually just valid rust code, you can copy paste directly from the output to a #[test].

This relies on debug print on &str round tripping, but if it doesn't then you would just see either compilation errors, or the test wouldn't fail when it should.

@BurntSushi
Copy link
Member

Interesting! I think I like that. And I do really like how much simpler it is with the structured input.

BurntSushi pushed a commit that referenced this pull request May 4, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 5, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 14, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 18, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 18, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 21, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 22, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 22, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 24, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 24, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request May 25, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
BurntSushi pushed a commit that referenced this pull request Jun 13, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
@BurntSushi BurntSushi closed this in 01a8cf4 Jul 5, 2023
BurntSushi pushed a commit that referenced this pull request Jul 5, 2023
This makes a couple of the fuzzer targets a bit nicer by just asking for
structured data instead of trying to manifest it ourselves out of a
&[u8].

Closes #821
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants