Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface #2182

Merged
merged 10 commits into from
Aug 23, 2019
Merged

Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface #2182

merged 10 commits into from
Aug 23, 2019

Conversation

robertDurst
Copy link
Contributor

Some Background

Fuzzing is a test strategy that performs automated binary executions with randomly generated/mutated inputs, monitoring for program crashes. It is useful for increasing test coverage by discovery of obscure, unthought-of, and interesting edge cases.

The basic process works as so:

**Input**: an initial corpus of well formed data
1. Inputs fed into a program
2. Fuzzer derives more inputs (based on some gathered metrics like unique execution path)
3. Repeat
**Output**: inputs that caused crashes

There are many fuzz testing tools out there such as libFuzzer, AFL, and Peach Fuzzer. For the time being, I have implemented a test harness to work with AFL.

Description

This builds on #2155, introducing a transaction subsystem fuzzer.

A significant portion of the work here was integrating the introduced transaction fuzzer with AFL's persistent mode. Persistent mode allows for multiple executions of the binary in a single long-lived process, significantly improving executions per second since the expensive state setup can be limited to every 1,000th, or even 1,000,000th execution. Areas where determinism improved include:

  • foregoing signature verification (we aren't breaking ED25519 here)
  • clearing caches
  • seeding randomness

This PR also defines a common interface for fuzzing various subsystems, allowing one to utilize a CLI flag to switch between Overlay and Transaction fuzzing.

Finally, this PR includes an option to specify process_id, allowing for easy parallelization of fuzzing (as supported by AFL).

Further progress of #1376

Future Work

Generating transactions with more operations

To demonstrate functionality, I restricted the fuzzer to a subset of operations: ACCOUNT_MERGE and PAYMENT of native assets. Running against protocol 4, the fuzzer discovered the double merge bug described here. This set will be expanded as I extend the fuzz test harness methodically, focusing first on discovery of the rest of the historical bugs in chronological order.

Improving Fuzzer Efficiency

I also spent some time thinking about how to meaningfully and unobstructively limit the fuzz space. For example, I use a small set of pre-generated accounts for operations. Since I do this, and cause a disconnect between what the fuzzer fuzzes and what is actually fed to Stellar-core, cpu time spent incrementing and flipping bits in the AccountID property of XDR's is wasted time. Thus, I feed the fuzzer an initial corpus of operations with all bytes of the AccountID zero'd out except for the 32nd byte to nudge it in the right direction.


Note for Reviewers

Dependent on and rebased against #2155.

Checklist

  • Rebased on top of master (no merge commits)
  • Ran clang-format v5.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests

src/ledger/LedgerTxn.cpp Outdated Show resolved Hide resolved
src/ledger/LedgerTxn.cpp Outdated Show resolved Hide resolved
src/main/CommandLine.cpp Outdated Show resolved Hide resolved
src/main/CommandLine.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Show resolved Hide resolved
src/test/fuzz/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/fuzz/deserialize.cpp Outdated Show resolved Hide resolved
src/test/fuzz/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz/fuzz.cpp Outdated Show resolved Hide resolved
src/xdr/Fuzz-test.x Outdated Show resolved Hide resolved
@robertDurst
Copy link
Contributor Author

More or less addressed everything above. Some of the hacky-est parts here will be benefited by the work being done in #2179. Specifically, I created a method called resetAndApplyForFuzzer that calls resetResults and applyOperations on a TransactionFrame, in order to bypass fee processing. I also manually construct a transaction envelope, replicating some of the code in TestAccount since I only have public keys and only care about the public keys of my pregenerated accounts.

Will move this to Ready for Review once the PR mentioned above and #2155 are complete.

src/ledger/LedgerTxn.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
@robertDurst robertDurst changed the title [Dependent on #2155] Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface Jul 10, 2019
@robertDurst
Copy link
Contributor Author

Rebased against master. With #2179 merged, going to make some changes here based on that PR and undraft later today.

@robertDurst robertDurst marked this pull request as ready for review July 11, 2019 23:50
@robertDurst
Copy link
Contributor Author

robertDurst commented Jul 12, 2019

This PR is now ready for review! It is rebased, commits logically organized, and has taken into consideration the comments above.

Areas of uncertainty:

  • Currently for initial setup, I pregenerate accounts. As it is currently implemented, I generate 16 public keys (reasons explained in comments) and add these to the ledger via creation of LedgerEntry objects. As I played around a bit with order book initial state setup, I do not think this elegantly scales to more complex state setup. Originally I had a FuzzAccount (subset of TestAccount) object and utilized TxTests which made this stuff a bit easier, however this comes with some overhead such as many methods having a REQUIRE macro that I do not want/can't use. Do note, this is only an issue for initial setup -- now that I pregenerate accounts within a predicable range, I can directly use fuzzed XDR's as input.
  • Not really commentary on the current PR, but related. I have discussed with Nicolas what makes a good initial corpus for fuzzing. Ideally this would be running genFuzz many, many (1000's/10000's) of times and then using AFL's corpus minimization tool afl-cmin. Thus, there is no reason to run this fuzz initial corpus generator every single time I want to fuzz; it may be worth having a repo or home for large, minimized initial corpuses.

@robertDurst
Copy link
Contributor Author

Appears I broke something. Not fuzzing correctly anymore. Will work on fixing tomorrow.

@robertDurst
Copy link
Contributor Author

My apologies for not finding the above bug before opening for review. Tested and everything works again as expected. Should be good for review again.

src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
@robertDurst
Copy link
Contributor Author

Ok, should be good for another review.

src/test/Fuzzer.h Outdated Show resolved Hide resolved
src/test/Fuzzer.h Outdated Show resolved Hide resolved
src/test/fuzz.h Outdated Show resolved Hide resolved
src/test/fuzz.h Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.h Outdated Show resolved Hide resolved
src/test/FuzzerImpl.h Outdated Show resolved Hide resolved
@robertDurst
Copy link
Contributor Author

robertDurst commented Jul 25, 2019

Ok @jonjove and @marta-lokhova, I believe I fixed up the code to adhere to your feedback. Two places I am still not 100% sure is:

  • for db clearPreparedCache, I had this in two places. I removed the first one since I believe it makes most sense at the end, or at least as much sense as the beginning -- I also could not quite figure out why it had an affect on stability when placed before XDRStreamInput like it did; placed at the end it not only clears the statements between loops, but also makes stability happy.
  • for the --mode cli flag, I have duplicated code for parsing the fuzzerMode input. However I did this because I kept running into lambda capture by reference dangling reference errors when I tried to extract these to a function (this way of passing non-const references to multiple lambdas does not seem good, but its seems to be the way to do this based on the other command parsers).

I also cleaned up my sporadic commits. The first commit that addressed both y'all's feedback is (13be3d1f3fd2ff34f82fbe4ec252db3415bfa2d7) or the first one dated July 24.

@robertDurst robertDurst mentioned this pull request Jul 30, 2019
5 tasks
src/main/CommandLine.cpp Outdated Show resolved Hide resolved
src/main/CommandLine.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/fuzz.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Show resolved Hide resolved
src/test/FuzzerImpl.cpp Outdated Show resolved Hide resolved

// we need an extra db cache clear here since the db prepared
// statement cache may persist between tryRead loops
mApp->getDatabase().clearPreparedStatementCache();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resetTxInternalState should call clearPreparedStatementCache


// we need an extra db cache clear here since the db prepared
// statement cache may persist between tryRead loops
mApp->getDatabase().clearPreparedStatementCache();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you need to call clearPreparedStatementCache after ltx goes out of scope just to be sure things are reset the way you want

@MonsieurNicolas
Copy link
Contributor

r+ a9468013d6a4c955dba8580bba96789803698946

@robertDurst
Copy link
Contributor Author

@MonsieurNicolas accidentally turned off format for a commit, fixed now

@MonsieurNicolas
Copy link
Contributor

r+ 60b38b41535e1d31c092ecbe633a92bab03f9659

@robertDurst
Copy link
Contributor Author

robertDurst commented Aug 23, 2019

My apologies -- the fuzzer utilized a method behind FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION. Fixed.

@MonsieurNicolas
Copy link
Contributor

r+ c2cd2fc

latobarita added a commit that referenced this pull request Aug 23, 2019
Introduce Transaction Subsystem Fuzzing and Common Fuzzer Interface

Reviewed-by: MonsieurNicolas
@latobarita latobarita merged commit c2cd2fc into stellar:master Aug 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants