Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface #2182

robertDurst · 2019-07-03T22:02:51Z

Some Background

Fuzzing is a testing strategy that repeatedly executes a binary with randomly generated or mutated inputs while monitoring for crashes. It is useful for increasing test coverage by discovering obscure, unanticipated, and interesting edge cases.

The basic process works as follows:

**Input**: an initial corpus of well formed data
1. Inputs fed into a program
2. Fuzzer derives more inputs (based on some gathered metrics like unique execution path)
3. Repeat
**Output**: inputs that caused crashes
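The loop above can be sketched in a few lines. This is only a toy illustration, not how AFL is implemented: `RunResult`, `mutate`, and `fuzzLoop` are hypothetical names, the "execution path" metric is whatever `pathId` the target reports, and a crash is modeled as a return value rather than a real process crash.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <set>
#include <vector>

using Input = std::vector<uint8_t>;

// Hypothetical result of one execution: did it "crash", and which path ran.
struct RunResult
{
    bool crashed;
    uint32_t pathId;
};

using Target = std::function<RunResult(Input const&)>;

// Derive a new input by flipping one random bit in a copy of `seed`.
Input
mutate(Input const& seed, std::mt19937& rng)
{
    Input out = seed;
    if (out.empty())
    {
        return out;
    }
    std::uniform_int_distribution<size_t> pos(0, out.size() - 1);
    std::uniform_int_distribution<int> bit(0, 7);
    out[pos(rng)] ^= static_cast<uint8_t>(1u << bit(rng));
    return out;
}

// Feed inputs to `target`, keep inputs that reach a new path, and collect
// the inputs that crashed.
std::vector<Input>
fuzzLoop(std::vector<Input> corpus, Target const& target, size_t iterations,
         uint32_t seed)
{
    std::vector<Input> crashes;
    if (corpus.empty())
    {
        return crashes;
    }
    std::mt19937 rng(seed);
    std::set<uint32_t> seenPaths;
    for (size_t i = 0; i < iterations; ++i)
    {
        std::uniform_int_distribution<size_t> pick(0, corpus.size() - 1);
        Input candidate = mutate(corpus[pick(rng)], rng);
        RunResult res = target(candidate);
        if (res.crashed)
        {
            crashes.push_back(candidate);
        }
        else if (seenPaths.insert(res.pathId).second)
        {
            // New path: this input is worth mutating further.
            corpus.push_back(candidate);
        }
    }
    return crashes;
}
```

A real fuzzer gathers its metrics through compile-time instrumentation and uses far richer mutation strategies; the sketch only shows the feedback cycle.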

There are many fuzz testing tools out there such as libFuzzer, AFL, and Peach Fuzzer. For the time being, I have implemented a test harness to work with AFL.

Description

This builds on #2155, introducing a transaction subsystem fuzzer.

A significant portion of the work here was integrating the new transaction fuzzer with AFL's persistent mode. Persistent mode allows multiple executions of the target in a single long-lived process, which significantly improves executions per second because the expensive state setup can be limited to every 1,000th, or even every 1,000,000th, execution. Because state now carries over between executions, each run has to be made deterministic. Areas where determinism was improved include:

- foregoing signature verification (we aren't breaking ED25519 here)
- clearing caches
- seeding randomness
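To illustrate why clearing caches and reseeding matter in a long-lived process, here is a minimal sketch. `ToyApp` and `runPersistent` are hypothetical stand-ins, not stellar-core code. Under afl-clang-fast the loop would typically be written as `while (__AFL_LOOP(1000))` with the input read from stdin; a plain loop over a vector stands in here so the sketch builds with any compiler.

```cpp
#include <cstdint>
#include <map>
#include <random>
#include <vector>

constexpr uint32_t TOY_SEED = 12345; // arbitrary fixed seed (assumption)

// Hypothetical stand-in for the application under test.
class ToyApp
{
  public:
    ToyApp()
    {
        ++sSetupCount; // pretend this constructor is the expensive setup
    }

    // Reset everything that could leak from one execution into the next.
    void
    resetForNextInput()
    {
        mCache.clear();
        mRng.seed(TOY_SEED);
    }

    // Result depends on the cache and on the RNG state.
    uint32_t
    execute(uint8_t input)
    {
        auto it = mCache.find(input);
        if (it != mCache.end())
        {
            return it->second;
        }
        uint32_t r = static_cast<uint32_t>(mRng()) ^ input;
        mCache[input] = r;
        return r;
    }

    static int sSetupCount;

  private:
    std::map<uint8_t, uint32_t> mCache;
    std::mt19937 mRng{TOY_SEED};
};

int ToyApp::sSetupCount = 0;

// One long-lived process: setup happens once, then many inputs are executed.
std::vector<uint32_t>
runPersistent(std::vector<uint8_t> const& inputs, bool resetBetweenRuns)
{
    ToyApp app; // expensive setup, done once
    std::vector<uint32_t> results;
    for (uint8_t in : inputs)
    {
        if (resetBetweenRuns)
        {
            app.resetForNextInput();
        }
        results.push_back(app.execute(in));
    }
    return results;
}
```

With the reset in place, the result for a given input no longer depends on which inputs ran before it, which is the property the fuzzer needs in order to attribute behavior to a single input.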

This PR also defines a common interface for fuzzing various subsystems, allowing one to utilize a CLI flag to switch between Overlay and Transaction fuzzing.

Finally, this PR includes an option to specify process_id, allowing for easy parallelization of fuzzing (as supported by AFL).

Further progress of #1376

Future Work

Generating transactions with more operations

To demonstrate functionality, I restricted the fuzzer to a subset of operations: ACCOUNT_MERGE and PAYMENT of native assets. Running against protocol 4, the fuzzer discovered the double merge bug described here. This set will be expanded as I extend the fuzz test harness methodically, focusing first on discovery of the rest of the historical bugs in chronological order.

Improving Fuzzer Efficiency

I also spent some time thinking about how to meaningfully and unobtrusively limit the fuzz space. For example, I use a small set of pre-generated accounts for operations. Because this creates a disconnect between what the fuzzer fuzzes and what is actually fed to stellar-core, CPU time spent incrementing and flipping bits in the AccountID property of XDRs is wasted. Thus, I feed the fuzzer an initial corpus of operations with all bytes of the AccountID zeroed out except for the 32nd byte, to nudge it in the right direction.
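As a rough sketch of the idea, assuming a 32-byte key and a fixed number of pre-generated accounts: seed keys carry information only in the last byte, and a fuzzed key is mapped back to an account using only that byte. The names `makeSeedKey` and `accountIndexFor`, the constant `NUM_ACCOUNTS`, and the modulo mapping are all hypothetical and chosen for illustration; the PR may restrict the range differently.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using PublicKeyBytes = std::array<uint8_t, 32>;

// Hypothetical size of the pre-generated account set.
constexpr size_t NUM_ACCOUNTS = 16;

// Build a seed-corpus key: every byte zero except the 32nd (index 31).
PublicKeyBytes
makeSeedKey(uint8_t index)
{
    PublicKeyBytes key{}; // value-initialized, so all bytes start at zero
    key[31] = index;
    return key;
}

// Map any fuzzed key onto one of the pre-generated accounts by looking only
// at the last byte. Mutations to the other 31 bytes change nothing, which is
// why the seed corpus leaves them zeroed.
size_t
accountIndexFor(PublicKeyBytes const& fuzzed)
{
    return fuzzed[31] % NUM_ACCOUNTS;
}
```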

Note for Reviewers

Dependent on and rebased against #2155.

Checklist

Rebased on top of master (no merge commits)
Ran clang-format v5.0.0 (via make format or the Visual Studio extension)
Compiles
Ran all tests

src/ledger/LedgerTxn.cpp

src/main/CommandLine.cpp

src/test/fuzz.cpp

src/test/fuzz/FuzzerImpl.cpp

src/test/fuzz/deserialize.cpp

src/test/fuzz/fuzz.cpp

src/xdr/Fuzz-test.x

robertDurst · 2019-07-09T04:37:24Z

More or less addressed everything above. Some of the hackiest parts here will benefit from the work being done in #2179. Specifically, I created a method called resetAndApplyForFuzzer that calls resetResults and applyOperations on a TransactionFrame in order to bypass fee processing. I also manually construct a transaction envelope, replicating some of the code in TestAccount, since I only have public keys and only care about the public keys of my pregenerated accounts.

Will move this to Ready for Review once the PR mentioned above and #2155 are complete.

src/ledger/LedgerTxn.cpp

src/test/fuzz.cpp

robertDurst · 2019-07-11T16:25:39Z

Rebased against master. With #2179 merged, going to make some changes here based on that PR and undraft later today.

robertDurst · 2019-07-12T00:07:12Z

This PR is now ready for review! It is rebased, commits logically organized, and has taken into consideration the comments above.

Areas of uncertainty:

- Currently, for initial setup, I pregenerate accounts. As implemented, I generate 16 public keys (reasons explained in comments) and add these to the ledger by creating LedgerEntry objects. Having played around a bit with order book initial state setup, I do not think this scales elegantly to more complex state setup. Originally I had a FuzzAccount (subset of TestAccount) object and utilized TxTests, which made this a bit easier, but that comes with some overhead, such as many methods having a REQUIRE macro that I do not want or cannot use. Do note, this is only an issue for initial setup -- now that I pregenerate accounts within a predictable range, I can directly use fuzzed XDRs as input.
- Not really commentary on the current PR, but related: I have discussed with Nicolas what makes a good initial corpus for fuzzing. Ideally this would mean running genFuzz many times (thousands or tens of thousands) and then using AFL's corpus minimization tool, afl-cmin. Thus, there is no reason to run this initial corpus generator every single time I want to fuzz; it may be worth having a repo or home for large, minimized initial corpora.

robertDurst · 2019-07-13T03:52:21Z

Appears I broke something. Not fuzzing correctly anymore. Will work on fixing tomorrow.

robertDurst · 2019-07-15T16:28:53Z

My apologies for not finding the above bug before opening for review. Tested and everything works again as expected. Should be good for review again.

src/test/fuzz.cpp

src/test/FuzzerImpl.cpp

robertDurst · 2019-07-18T04:04:15Z

Ok, should be good for another review.

src/test/Fuzzer.h

src/test/fuzz.h

src/test/fuzz.cpp

src/test/FuzzerImpl.cpp

src/test/fuzz.cpp

src/test/FuzzerImpl.cpp

src/test/FuzzerImpl.h

robertDurst · 2019-07-25T04:44:09Z

Ok @jonjove and @marta-lokhova, I believe I fixed up the code to adhere to your feedback. There are two places where I am still not 100% sure:

- For db clearPreparedCache, I had this in two places. I removed the first one since I believe it makes the most sense at the end, or at least as much sense as at the beginning -- I also could not quite figure out why it had an effect on stability when placed before XDRStreamInput like it did; placed at the end, it not only clears the statements between loops but also makes stability happy.
- For the --mode CLI flag, I have duplicated code for parsing the fuzzerMode input. I did this because I kept running into dangling reference errors from lambdas capturing by reference when I tried to extract the parsing into a function (this way of passing non-const references to multiple lambdas does not seem good, but it seems to be the way to do this based on the other command parsers).
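For readers unfamiliar with the pitfall, here is a minimal sketch of one way to share the parsing logic without a dangling reference. Everything in it is hypothetical: the `FuzzerMode` enum, the accepted strings `"overlay"` and `"tx"`, and the helper names are illustrative and not taken from the PR. The key point is that the lambda captures a reference to a variable owned by the caller, which must outlive the returned parser; capturing a local of the helper by reference would dangle as soon as the helper returned.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical mode enum and accepted strings, for illustration only.
enum class FuzzerMode
{
    OVERLAY,
    TRANSACTION
};

FuzzerMode
parseFuzzerMode(std::string const& arg)
{
    if (arg == "overlay")
    {
        return FuzzerMode::OVERLAY;
    }
    if (arg == "tx")
    {
        return FuzzerMode::TRANSACTION;
    }
    throw std::invalid_argument("unknown fuzzer mode: " + arg);
}

// Safe: `mode` refers to storage owned by the caller, so the reference held
// by the lambda stays valid for as long as the caller's variable does.
// Unsafe variant (not shown as code): declaring `FuzzerMode mode;` inside
// this function and capturing it with [&mode] would leave the returned
// lambda holding a reference to a destroyed local.
std::function<void(std::string const&)>
makeModeParser(FuzzerMode& mode)
{
    return [&mode](std::string const& arg) { mode = parseFuzzerMode(arg); };
}
```

Several command handlers could then call `makeModeParser` on their own mode variable instead of repeating the parsing code.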

I also cleaned up my sporadic commits. The first commit that addressed both y'all's feedback is (13be3d1f3fd2ff34f82fbe4ec252db3415bfa2d7) or the first one dated July 24.

src/main/CommandLine.cpp

src/test/FuzzerImpl.cpp

src/test/fuzz.cpp

src/test/FuzzerImpl.cpp

MonsieurNicolas · 2019-08-10T01:18:12Z

src/test/FuzzerImpl.cpp

+
+            // we need an extra db cache clear here since the db prepared
+            // statement cache may persist between tryRead loops
+            mApp->getDatabase().clearPreparedStatementCache();


resetTxInternalState should call clearPreparedStatementCache

MonsieurNicolas · 2019-08-10T01:20:18Z

src/test/FuzzerImpl.cpp

+
+            // we need an extra db cache clear here since the db prepared
+            // statement cache may persist between tryRead loops
+            mApp->getDatabase().clearPreparedStatementCache();


I suspect you need to call clearPreparedStatementCache after ltx goes out of scope just to be sure things are reset the way you want
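The ordering concern can be shown with a toy model. This is a sketch under a loud assumption: that destroying the ledger transaction object may itself touch the prepared statement cache (for example, through a rollback). `ToyDatabase` and `ToyLedgerTxn` are hypothetical stand-ins, not stellar-core types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical database with a prepared statement cache.
struct ToyDatabase
{
    std::vector<std::string> preparedStatements;

    void
    clearPreparedStatementCache()
    {
        preparedStatements.clear();
    }
};

// Hypothetical RAII transaction; ASSUMPTION: its destructor rolls back and,
// in doing so, adds a statement to the cache.
class ToyLedgerTxn
{
  public:
    explicit ToyLedgerTxn(ToyDatabase& db) : mDb(db)
    {
        mDb.preparedStatements.push_back("BEGIN");
    }

    ~ToyLedgerTxn()
    {
        mDb.preparedStatements.push_back("ROLLBACK");
    }

  private:
    ToyDatabase& mDb;
};

// Returns how many cached statements survive one fuzz iteration.
size_t
runOneInput(ToyDatabase& db, bool clearAfterScope)
{
    if (clearAfterScope)
    {
        {
            ToyLedgerTxn ltx(db);
            // ... apply the fuzzed transaction here ...
        } // ltx is destroyed here, before the cache is cleared
        db.clearPreparedStatementCache();
    }
    else
    {
        ToyLedgerTxn ltx(db);
        // Too early: the destructor of ltx still runs after this call.
        db.clearPreparedStatementCache();
    }
    return db.preparedStatements.size();
}
```

Under that assumption, only the variant that clears the cache after the transaction has gone out of scope leaves nothing behind for the next iteration.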

src/test/FuzzerImpl.cpp

MonsieurNicolas · 2019-08-22T23:25:49Z

r+ a9468013d6a4c955dba8580bba96789803698946

robertDurst · 2019-08-22T23:47:38Z

@MonsieurNicolas accidentally turned off format for a commit, fixed now

MonsieurNicolas · 2019-08-22T23:52:12Z

r+ 60b38b41535e1d31c092ecbe633a92bab03f9659

robertDurst · 2019-08-23T00:32:08Z

My apologies -- the fuzzer utilized a method behind FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION. Fixed.

MonsieurNicolas · 2019-08-23T16:46:37Z

r+ c2cd2fc

Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface Reviewed-by: MonsieurNicolas

MonsieurNicolas requested changes Jul 4, 2019

vogel reviewed Jul 9, 2019

src/ledger/LedgerTxn.cpp

vogel reviewed Jul 9, 2019

src/test/fuzz.cpp

robertDurst changed the title ~~[Dependent on #2155] Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface~~ Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface Jul 10, 2019

robertDurst marked this pull request as ready for review July 11, 2019 23:50

MonsieurNicolas requested changes Jul 17, 2019

eLmistiso approved these changes Jul 19, 2019

jonjove reviewed Jul 23, 2019

marta-lokhova reviewed Jul 24, 2019

robertDurst mentioned this pull request Jul 30, 2019

OrderBookIsNotCrossed invariant #2210

Closed

5 tasks

MonsieurNicolas requested changes Aug 10, 2019

MonsieurNicolas reviewed Aug 13, 2019

src/test/FuzzerImpl.cpp

robertDurst mentioned this pull request Aug 13, 2019

update fuzz for tx fuzzer stellar-deprecated/docker-stellar-core#64

Merged

robertDurst mentioned this pull request Aug 22, 2019

Fuzzing docs and makefile update #2246

Merged

robertDurst added 3 commits August 23, 2019 09:39

[fuzz general]: allow fuzzer to ignore signature checks

655718b

[fuzz general]: add resetForFuzzer method to LedgerTxnRoot

e788842

[fuzz general]: create base fuzzer class

1cfb09c

robertDurst added 7 commits August 23, 2019 09:39

[fuzz overlay]: implement overlay fuzzer

e6e4204

[fuzz tx]: create FuzzTransactionFrame and define a resetTxInternalState method for fuzzer

6257ece

[fuzz tx]: implement tx fuzzer

15350c0

[fuzz tx]: autocheck to restrict public key range

6b26071

[fuzz general]: add createFuzzer method and add process_id & fuzzer mode args to cli

02bbf90

[fuzz general]: simplify fuzz.cpp and utilize overlay and tx fuzzer impls

05f314e

[fuzz general]: update makefile for new fuzzer cli args

c2cd2fc

latobarita added a commit that referenced this pull request Aug 23, 2019

Merge pull request #2182 from robertDurst/transaction-fuzzing

e6e680c

Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface Reviewed-by: MonsieurNicolas

latobarita merged commit c2cd2fc into stellar:master Aug 23, 2019
