Check min size needed with `Arbitrary::size_hint` and early exit if the data isn't long enough #59

fitzgen · 2020-02-25T19:10:26Z

@bnjbvr was reporting that starting fuzzing from scratch with a fuzz target that takes an Arbitrary impl was spending a lot of time on three-byte-long inputs, where the Arbitrary implementation required more input bytes than given. The fuzzer wasn't giving more input bytes fast enough to get to the meaty bits of the fuzz target.

In theory, we could use Arbitrary::size_hint to create a bunch of random seeds for the corpus of the appropriate size and/or control the maximum length of inputs that libfuzzer will generate.

I'm not sure exactly what this would look like, but it probably requires another env var dance between cargo fuzz and libfuzzer-sys like how the debug printing works.

As far as how this is exposed to users, maybe we should add `cargo fuzz seed <target>` as a new subcommand?

Or maybe we can just document that users can do something like `head -c 100 /dev/random > fuzz/corpus/my-target/my-random-seed` to add a random 100-byte seed to their fuzz target's corpus to get things off the ground...


Manishearth · 2020-02-26T20:23:50Z

it feels like libfuzzer should have an option for this? if not perhaps we should ask for one?

fitzgen · 2020-02-26T23:04:10Z

What are you imagining that libfuzzer would do? It doesn't know whether it provided enough bytes for Arbitrary or not. I'm not sure what you mean here.

rohanpadhye · 2020-02-27T00:44:29Z

Hello! I'm new to cargo-fuzz but I've encountered similar issues when implementing the Zest algorithm in JQF (a structured-input fuzzer for Java). The solution for this problem was to extend the input on-demand when the Quickcheck generators (analogous to Arbitrary) request more bytes, and then cap this extension to some max limit. The relevant source code is here: https://github.com/rohanpadhye/jqf/blob/aeff24a79e409f6ab2c1b8a5c987dbafb091461d/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/ei/ZestGuidance.java#L1068-L1099

I don't know much about libfuzzer internals, but it would be good if a test driver could actually modify and extend the byte array that libfuzzer provides during test execution, so that cargo-fuzz (or libfuzzer-sys) could just populate any extra bytes with freshly generated random values on demand. That would be such a useful feature for libfuzzer to have in many different scenarios other than powering Arbitrary...

Manishearth · 2020-02-27T00:44:44Z

@fitzgen ./fuzz --min-len=foo, and a way to pass that down. This would require executing the fuzz target with another env var to get the size hint.

Manishearth · 2020-02-27T00:47:46Z

@rohanpadhye Worth proposing to libFuzzer. I suspect that given the general fuzzer focus on fuzzer-driven mutation this may not actually work that well, but worth a shot either way.

fitzgen · 2020-02-27T17:48:50Z

Thanks for the insights @rohanpadhye!

> The solution for this problem was to extend the input on-demand when the Quickcheck generators (analogous to Arbitrary) request more bytes, and then cap this extension to some max limit. The relevant source code is here: https://github.com/rohanpadhye/jqf/blob/aeff24a79e409f6ab2c1b8a5c987dbafb091461d/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/ei/ZestGuidance.java#L1068-L1099

@Manishearth, this reminds me of how during the redesign of Unstructured, we debated whether to error early when running out of bytes or to artificially extend it with zero bytes. We thought that exiting early would avoid confusing the fuzzer, and keep it in control of exploring the input space. Maybe this was the wrong choice?

We could fairly easily start returning zeros after we've exhausted the input (and have a max limit on how many zeros we return just in case something gets in a loop).
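To make the zero-extension idea concrete, here is a minimal sketch of a byte source that hands out the fuzzer's bytes first and then pads with zeros up to a cap. `ZeroExtended`, `fill`, and `max_padding` are hypothetical names for illustration only; this is not how `Unstructured` is implemented.

```rust
/// Hypothetical byte source: yields the fuzzer-provided bytes first, then
/// zeros, up to `max_padding` extra bytes in total. Illustration only; this
/// is not the real `Unstructured` type.
pub struct ZeroExtended<'a> {
    data: &'a [u8],
    padding_left: usize,
}

impl<'a> ZeroExtended<'a> {
    pub fn new(data: &'a [u8], max_padding: usize) -> Self {
        ZeroExtended {
            data,
            padding_left: max_padding,
        }
    }

    /// Fill `buf` completely. Returns `None` without consuming anything if
    /// doing so would exceed the padding budget (for example, because a
    /// generator is stuck in a loop).
    pub fn fill(&mut self, buf: &mut [u8]) -> Option<()> {
        let from_input = buf.len().min(self.data.len());
        let missing = buf.len() - from_input;
        if missing > self.padding_left {
            return None;
        }
        let (head, rest) = self.data.split_at(from_input);
        buf[..from_input].copy_from_slice(head);
        // Everything past the real input is zero padding.
        for byte in &mut buf[from_input..] {
            *byte = 0;
        }
        self.data = rest;
        self.padding_left -= missing;
        Some(())
    }
}
```

With a three-byte input and a cap of four, a two-byte read returns real data, a following four-byte read returns one real byte plus three zeros, and any later read that needs more than the one remaining padding byte fails.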

But if we were to actually pursue this, first I would want to take a real world fuzz target (or multiple targets) and test how much code coverage we get after X period of time with and without the zero extending, when starting from an empty corpus. Unfortunately, I don't have the cycles to run the experiment...

Anyways, I'll file an issue for a -min_len flag for libfuzzer.

fitzgen · 2020-02-27T18:14:22Z

Oh another idea is we could do something like this:

if data.len() < <FuzzTargetType as Arbitrary>::size_hint(0).0 {
    return;
}

...

in the libfuzzer_sys::fuzz_target! macro. That would at least cut down on the reported code covered inside Arbitrary impls and also be a bit faster.
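A self-contained sketch of that early-exit check, for illustration. It does not use the real `arbitrary` crate: `SizeHint` is a hypothetical stand-in trait with a `size_hint` method shaped like the one discussed here, and `Point`, `too_short`, and `run_target` are made-up names. What the macro actually expands to may differ.

```rust
/// Hypothetical stand-in for the `Arbitrary` trait, reduced to the single
/// method this check needs. Returns (lower bound, optional upper bound) on
/// the number of bytes required to construct a value.
pub trait SizeHint {
    fn size_hint(depth: usize) -> (usize, Option<usize>);
}

/// Example input type: two little-endian `u32` fields, so exactly 8 bytes.
pub struct Point {
    pub x: u32,
    pub y: u32,
}

impl SizeHint for Point {
    fn size_hint(_depth: usize) -> (usize, Option<usize>) {
        (8, Some(8))
    }
}

/// Returns `true` when `data` is shorter than the lower bound for `T`, in
/// which case the harness should return before doing any work.
pub fn too_short<T: SizeHint>(data: &[u8]) -> bool {
    data.len() < T::size_hint(0).0
}

/// Sketch of a harness body: bail out early on short inputs, otherwise
/// decode the value and hand it to the code under test.
pub fn run_target(data: &[u8]) -> Option<Point> {
    if too_short::<Point>(data) {
        return None; // early exit: not enough bytes for a `Point`
    }
    let x = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
    let y = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
    Some(Point { x, y })
}
```

A three-byte input is rejected before any decoding happens, while an eight-byte input is decoded normally, which is the behavior the check is meant to give for the short inputs described at the top of this issue.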

fitzgen · 2020-02-27T18:18:10Z

In fact, I think that would be the best way to handle this.

Fixes rust-fuzz#59

fitzgen transferred this issue from rust-fuzz/cargo-fuzz Feb 27, 2020

fitzgen transferred this issue from rust-fuzz/arbitrary Feb 27, 2020

fitzgen changed the title ~~Somehow leverage Arbitrary::size_hint to create initial seeds for corpus or control -max_len and -len_control?~~ Check min size needed with Arbitrary::size_hint and early exit if the data isn't long enough Feb 27, 2020

fitzgen added a commit to fitzgen/libfuzzer that referenced this issue Feb 27, 2020

Early exit when given too few bytes for the Arbitrary impl

90c1140

Fixes rust-fuzz#59

fitzgen mentioned this issue Feb 27, 2020

Early exit when given too few bytes for the Arbitrary impl #60

Merged

fitzgen closed this as completed in #60 Feb 27, 2020
