
meta: forge fuzzer improvements #387

Closed
mds1 opened this issue Jan 6, 2022 · 12 comments
Labels
A-testing Area: testing C-forge Command: forge

Comments

@mds1
Collaborator

mds1 commented Jan 6, 2022

The current state of fuzzing is naive (we think, more on this below)—it uses proptest strategies to generate random, uniformly distributed inputs based on Solidity types. This is an ok start, but isn't great. There are many improvements we could make to the fuzzer to increase the chances of finding bugs.

This issue is to spec out (1) what fuzzing features and functionality we want to have, and (2) how to approach that, e.g. whether custom proptest strategies are sufficient, whether we should switch to a different crate, etc.

I'm certainly no fuzzing expert, but as a jumping-off point here's my assessment of the functionality we'd want to have, as well as a list of other Rust fuzzing crates to consider.

Fuzzing Features

These are just the ones that come to mind based on my limited knowledge of fuzzing and my use of foundry so far.

Bias random values towards extremes

Based on the proptest docs and this open proptest issue, it appears proptest does not bias towards extreme values and edge cases such as 0, max, min, etc.
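A minimal sketch of one way to add that bias on top of proptest (the real forge fuzzer generates ABI-level Solidity types; plain u64 is used here purely for illustration): weight a handful of edge values against the uniform generator.

```rust
use proptest::prelude::*;

// Sketch: a uint-like strategy that yields an edge case roughly a third of
// the time and a uniformly random value otherwise.
fn biased_u64() -> impl Strategy<Value = u64> {
    prop_oneof![
        1 => Just(0u64),
        1 => Just(1u64),
        1 => Just(u64::MAX),
        6 => any::<u64>(),
    ]
}

proptest! {
    #[test]
    fn boundaries_show_up_often(x in biased_u64(), y in biased_u64()) {
        // With the extra weight on 0, 1, and MAX, boundary conditions are hit
        // far more often than under uniform sampling.
        let _ = x.checked_add(y);
    }
}
```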

"Synchronize" the fuzzed inputs

Some bugs require multiple inputs to be in sync, e.g. all zeros or all empty arrays. If each input is generated independently and uniformly, this is unlikely to happen. I think this is the current behavior of proptest, but I may be wrong, because the fuzzer did find a failure case with a counterexample of [], [], [], 0x, 0x for inputs of types (address,uint256[],uint256[],uint256[],bytes,bytes). I believe this specific example came up on each fuzz run, suggesting proptest may be smarter than I and the docs give it credit for.
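If we do want to force that correlation explicitly, proptest's prop_flat_map can derive several inputs from one shared value; a sketch using Vec<u64> as a stand-in for the ABI array types:

```rust
use proptest::collection::vec;
use proptest::prelude::*;

// Sketch: draw a single length first, then make every array input share it,
// so cases like "all empty" or "all the same size" are generated directly
// rather than by coincidence.
fn synced_arrays() -> impl Strategy<Value = (Vec<u64>, Vec<u64>, Vec<u64>)> {
    (0usize..16).prop_flat_map(|len| {
        (
            vec(any::<u64>(), len),
            vec(any::<u64>(), len),
            vec(any::<u64>(), len),
        )
    })
}
```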

Some degree of control over array and bytes lengths

Currently the maximum sizes of array and bytes inputs are bounded, but some bugs may only surface with larger inputs. Always allowing very large inputs might slow down tests, so it'd be useful to either allow them to be unbounded (or have larger bounds) via a flag (e.g. for use in CI), or to bias the fuzzer so that large inputs are only used occasionally.
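In proptest terms that bound is just the size range handed to the collection strategy, so a flag could simply raise it; a sketch with a hypothetical max_len knob:

```rust
use proptest::collection::vec;
use proptest::prelude::*;

// Sketch: `max_len` is a hypothetical knob that a config flag (e.g. in CI)
// could raise above the default so that very large inputs get exercised.
fn bytes_strategy(max_len: usize) -> impl Strategy<Value = Vec<u8>> {
    vec(any::<u8>(), 0..=max_len)
}
```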

Constant mining

Details and motivation can be found here. The summary is that Echidna uses constant mining to extract return values from other methods and uses those as fuzzer inputs. As shown in the link, this helps find bugs that would not be found otherwise.

assume functionality

Allow inputs that don't meet certain conditions (e.g. the zero address, or values that violate x + y < z) to be discarded and a new set of inputs generated. Discarded fuzz runs are not counted toward the number of executed runs, so new inputs would be generated and the test re-run.
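proptest itself exposes a rejection mechanism along these lines via its prop_assume! macro, which a forge-level assume could plausibly build on; a minimal sketch:

```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn sum_stays_below_threshold(x in any::<u64>(), y in any::<u64>(), z in any::<u64>()) {
        // Discard this case and draw fresh inputs unless the precondition holds;
        // rejected cases do not count toward the configured number of runs.
        prop_assume!(x.checked_add(y).map_or(false, |sum| sum < z));
        assert!(x + y < z);
    }
}
```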

Bounding inputs to a range

It's debatable whether this should be part of the fuzzer, or if users should just manage it manually by bounding the fuzzer's inputs, but a way to control the range of inputs is often useful to avoid reverts due to overflows, etc.
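For reference, proptest already treats integer ranges as strategies, so bounding at generation time is expressible there today; whether forge should surface that or leave it to manual bounding is the question above. A minimal sketch:

```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn amount_is_always_in_range(amount in 1u128..=1_000_000u128) {
        // `amount` is only ever drawn from [1, 1_000_000], so the test body
        // never has to early-return on values that would overflow or revert.
        assert!(amount >= 1 && amount <= 1_000_000);
    }
}
```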

Symbolically execute to seed fuzzer inputs

See #387 (comment) below:

Consider using SMT solvers like z3 directly to generate random valid inputs, instead of hoping some generic fuzzer will support smart contract specific needs.

h/t @yaronvel
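As a rough illustration of the idea (foundry has no symbolic execution today, and this is not a description of anyone's actual design), the z3 crate can be asked for concrete inputs that satisfy a path condition, and those models could seed the fuzzer's corpus; the constraints below are made up:

```rust
use z3::{ast::Int, Config, Context, SatResult, Solver};

fn main() {
    let cfg = Config::new();
    let ctx = Context::new(&cfg);
    let solver = Solver::new(&ctx);

    // Made-up path condition: find x, y with x > y and x + y < 1000,
    // i.e. inputs that would reach a particular branch in the contract.
    let x = Int::new_const(&ctx, "x");
    let y = Int::new_const(&ctx, "y");
    solver.assert(&x.gt(&y));
    solver.assert(&(&x + &y).lt(&Int::from_i64(&ctx, 1000)));

    if matches!(solver.check(), SatResult::Sat) {
        // The model's assignments for x and y could be fed into the fuzzer
        // as seed inputs instead of (or alongside) random values.
        println!("{}", solver.get_model().unwrap());
    }
}
```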

Rust Fuzzing Crates

There are probably others, but here are a few I've come across so far:

  • proptest is what we currently use
  • quickcheck — a Rust implementation of QuickCheck; the original Haskell QuickCheck is what dapptools uses
  • afl.rs allows fuzzing with AFL
  • cargo-fuzz allows fuzzing with libFuzzer
  • arbitrary is intended to be combined with a fuzzer like libFuzzer and cargo-fuzz or AFL, and to help you turn the raw, untyped byte buffers that they produce into well-typed, valid, structured values (h/t @mattsse)
@mattsse
Member

mattsse commented Jan 6, 2022

there's also
https://github.com/rust-fuzz/arbitrary

@gakonst
Member

gakonst commented Jan 6, 2022

We could also explore making our own Strategy implementations for proptest.
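For instance (a sketch only, not an existing forge strategy), a domain-specific strategy can often be composed from existing ones with prop_map rather than implementing the Strategy trait from scratch:

```rust
use proptest::prelude::*;

// Sketch: generate token amounts as whole "ether" units scaled to 18 decimals,
// which exercises realistic magnitudes instead of arbitrary bit patterns.
fn wad_amount() -> impl Strategy<Value = u128> {
    (0u128..1_000_000_000).prop_map(|whole| whole * 1_000_000_000_000_000_000)
}
```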

@yaronvel

yaronvel commented Jan 6, 2022

Consider using SMT solvers like z3 directly to generate random valid inputs, instead of hoping some generic fuzzer will support smart contract specific needs.
Can elaborate more on this if you want, in my previous life I once built a randomized testing tool with z3 from scratch.

@wuestholz

@mds1 @gakonst Great to hear that there are other efforts to get more users hooked on fuzzing! :) You might want to take a look at our Harvey fuzzer; see https://mariachris.github.io/Pubs/FSE-2020-Harvey.pdf for ideas on what features make a big difference in practice. Harvey is used both in Diligence's MythX service and our dedicated Diligence fuzzing solution.

gakonst added the C-forge Command: forge label Jan 8, 2022
@mds1
Collaborator Author

mds1 commented Jan 13, 2022

Consider using SMT solvers like z3 directly to generate random valid inputs, instead of hoping some generic fuzzer will support smart contract specific needs.
Can elaborate more on this if you want, in my previous life I once built a randomized testing tool with z3 from scratch.

Hey @yaronvel—I'm not too familiar with how you'd implement this in practice, and foundry doesn't yet have any symbolic execution capabilities, so this probably won't be the first path we go down. However I've always thought seeding the fuzzer with solver outputs would be valuable, so would love if you could elaborate more!

@mds1
Collaborator Author

mds1 commented Jan 13, 2022

Great to hear that there are other efforts to get more users hooked on fuzzing! :) You might want to take a look at our Harvey fuzzer; see https://mariachris.github.io/Pubs/FSE-2020-Harvey.pdf for ideas on what features make a big difference in practice. Harvey is used both in Diligence's MythX service and our dedicated Diligence fuzzing solution.

Thanks @wuestholz this looks great, planning to read this soon. Only question until then is: looks like the paper is from November 2020, so is there anything you've learned since then that isn't covered in the paper, or anything you'd change if writing a similar paper today? It's only about a year old, but a year is a long time in the smart contract world! 😁

@wuestholz

Great to hear that there are other efforts to get more users hooked on fuzzing! :) You might want to take a look at our Harvey fuzzer; see https://mariachris.github.io/Pubs/FSE-2020-Harvey.pdf for ideas on what features make a big difference in practice. Harvey is used both in Diligence's MythX service and our dedicated Diligence fuzzing solution.

Thanks @wuestholz this looks great, planning to read this soon. Only question until then is: looks like the paper is from November 2020, so is there anything you've learned since then that isn't covered in the paper, or anything you'd change if writing a similar paper today? It's only about a year old, but a year is a long time in the smart contract world! 😁

😁 We're continuously improving Harvey based on feedback from auditors and clients. We're planning to write up one of the novel techniques we added soon (spoiler alert: up to 30% coverage increase for some of our benchmarks). I think all the lessons in the paper still apply today, but we've definitely been optimizing Harvey more for property checking (for instance, in combination with our Scribble specification language). We've also found several cool use-cases for the input prediction technique we presented in our FSE paper.

@gakonst
Member

gakonst commented Jan 13, 2022

How much of the specification/property writing do you think we can overload into Solidity syntax, vs having a separate language?

@wuestholz

@gakonst I'm hoping a lot! 😁 We're trying to keep Scribble as simple as possible. Ideally, a Solidity developer can understand most specifications "out-of-the-box"; for instance, we allow regular Solidity expressions whenever possible, and only extend the syntax slightly when needed (for instance, old-expressions in postconditions).

From my perspective, the Solidity team can "steal" as much of Scribble as possible. 😉 We partly started Scribble to explore the language design space and are more than happy to evolve the language with feedback from the community.

@transmissions11
Contributor

Here's a concrete example of the fuzzer missing multiple bugs that other fuzzers (dapptools) would detect easily:

```solidity
function testEquality(uint256 x, uint256 y) public {
    uint256 xy;

    unchecked {
        xy = x * y;
    }

    if (x != 0 && xy / x != y) return;

    assertEq(((xy - 1) / 1e18) + 1, (xy - 1) / (1e18 + 1));
}
```


@mds1
Collaborator Author

mds1 commented Feb 10, 2022

The primary fuzzing technique we should focus on right now is building the best dictionary we can, and then selecting inputs from either the dictionary or a randomly generated value, with a user-defined weight specifying the frequency of each.

  • Biasing towards extremes can be handled by adding values to the dictionary
  • Constant mining can be handled by dumping the EVM state into the dictionary
  • Can also add values by parsing solc output or using slither
  • Instead of coverage guidance, add a flag to output a coverage report, so the user knows what wasn't hit
  • A cheatcode to add values to the dictionary can help improve coverage, based on the coverage report

To implement this we need a weighted union of two strategies (a sketch follows this list):

  • Strategy 1: Randomly generated values, which we currently have
  • Strategy 2: The dictionary, using proptest's select, which creates a strategy that uniformly selects one value from the given values
  • Union: Use the new_weighted function to create a strategy that selects from the above two strategies with the specified weights
  • Echidna's default dictionary frequency is 40%, so we should use that value too. This means random values are used 1.5 times more often than dictionary values, and based on how new_weighted requires weights to be specified, this is equivalent to a weight of 1 for dictionary values and 1.5 for random values
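A sketch of what that weighted union could look like, using plain u64 values and a hypothetical dictionary vector (new_weighted takes integer weights, so the 40/60 split is written as 2:3):

```rust
use proptest::prelude::*;
use proptest::sample::select;
use proptest::strategy::Union;

fn dictionary_or_random(dictionary: Vec<u64>) -> impl Strategy<Value = u64> {
    // Strategy 1: uniformly pick one of the mined/extracted dictionary values.
    let from_dictionary = select(dictionary).boxed();
    // Strategy 2: the fully random values the fuzzer already generates today.
    let random = any::<u64>().boxed();
    // Weighted union: ~40% dictionary, ~60% random (Echidna's default split).
    Union::new_weighted(vec![(2, from_dictionary), (3, random)])
}
```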

@mds1
Collaborator Author

mds1 commented Feb 26, 2023

Closing this since the fuzzer has improved since this issue was created, so it's now stale. Will create a new issue to track future fuzzing improvements.
