
Add timeoffsets inputs folder #178

Merged

merged 1 commit into bitcoin-core:main from rename-timedata-timeoffsets on May 13, 2024

Conversation

@stickies-v (Contributor) commented Mar 18, 2024

Draft until bitcoin/bitcoin#29623 is merged.

This fuzz target is largely the same as the timedata target, but both are kept for now so that both can continue to be tested. The timedata input folder may be removed in the future.

@maflcko (Contributor) commented Mar 19, 2024

For git, renaming and creating a copy is the same cost (close to zero). So I think it would be better to create a copy, to allow the old fuzz target and fuzz inputs to continue to exist for a while, as there is only one branch in this repo.

Moreover, better than creating a copy of a single folder would be to pick fuzz inputs from all other folders with the coverage algorithm. This way the cost is still close to zero, but you'll likely get more coverage.

Finally, the best option would be to do the previous step and then, on top, let the fuzz engine run for a bit on an empty folder as well as on the populated one, and then merge both results into a fresh folder and submit that.

In any case, it probably doesn't matter for this fuzz target, so lgtm.
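Sketched as libFuzzer invocations, the suggested three-step workflow might look roughly like this (the paths, the `timeoffsets-empty` scratch folder, and the one-hour `-max_total_time` budget are all illustrative assumptions, not from this thread):

```shell
# Illustrative paths; adjust to your checkout layout.
FUZZ_BIN=src/test/fuzz/fuzz
CORPUS=../qa-assets/fuzz_seed_corpus

# Guarded so the sketch is a no-op where the fuzz binary isn't built.
if [ -x "$FUZZ_BIN" ]; then
  # 1. Seed the new folder from every existing corpus, keeping only
  #    inputs that add coverage for the timeoffsets target.
  FUZZ=timeoffsets "$FUZZ_BIN" -set_cover_merge=1 "$CORPUS/timeoffsets" "$CORPUS"/*

  # 2. Let the engine explore from an empty scratch folder for a while.
  mkdir -p "$CORPUS/timeoffsets-empty"
  FUZZ=timeoffsets "$FUZZ_BIN" -max_total_time=3600 "$CORPUS/timeoffsets-empty"

  # 3. Merge the scratch results back into the folder that gets submitted.
  FUZZ=timeoffsets "$FUZZ_BIN" -merge=1 "$CORPUS/timeoffsets" "$CORPUS/timeoffsets-empty"
fi
```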

@stickies-v stickies-v changed the title [WIP] Rename timedata folder to timeoffsets [WIP] Add timeoffsets inputs folder Mar 19, 2024
@stickies-v (Contributor, Author)

Rebased after #177 was merged.

Thanks for your comprehensive explanation, maflcko.

> So I think it would be better to create a copy

Updated my approach to create a copy instead.

> Moreover, better than creating a copy of a single folder would be to pick fuzz inputs from all other folders with the coverage algorithm. This way the cost is still close to zero, but you'll likely get more coverage.

I'd be happy to try this, but unfortunately I'm a bit out of my depth here and could not find relevant instructions in this repo's documentation or online. If you have any pointers for me to look at, that would be helpful, but I'm also happy to keep the current approach if it's not worth the hassle.

@maflcko (Contributor) commented Mar 19, 2024

lgtm

@murchandamus (Contributor)

> Moreover, better than creating a copy of a single folder would be to pick fuzz inputs from all other folders with the coverage algorithm. This way the cost is still close to zero, but you'll likely get more coverage.

> I'd be happy to try this but unfortunately am a bit out of my depth here and could not find relevant instructions on the documentation in this repo, or online. If you have any pointers for me to look at that would be helpful, but also I'm happy to keep it at the current approach if that's not worth the hassle.

It’s possible to provide multiple source directories to the fuzzer. The first directory you provide is both a source and also the destination of newly added seeds. When you create a new target, you can scrounge up the seeds from all targets to see if any of them improve the coverage of your new target, e.g. like this:

$ FUZZ=timeoffsets src/test/fuzz/fuzz -set_cover_merge=1 -shuffle=0 -prefer_small=1 -use_value_profile=0 ../qa-assets/fuzz_seed_corpus/timeoffsets ../qa-assets/fuzz_seed_corpus/*

@murchandamus (Contributor)

Having tried this, it might be better to go with -merge=1 instead of -set_cover_merge=1 when sourcing that many seeds for an initial corpus. set_cover_merge kept running out of memory for me towards the end, progressing at single-digit executions per second, whereas a regular merge seems to progress much quicker. The set could then be reduced further via a set_cover_merge afterwards, but that's probably not necessary for an initial corpus.
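Concretely, swapping the flag in the invocation quoted earlier would look something like this (same illustrative corpus paths; only -set_cover_merge=1 is replaced by -merge=1):

```shell
FUZZ=timeoffsets src/test/fuzz/fuzz -merge=1 -shuffle=0 -prefer_small=1 \
    -use_value_profile=0 ../qa-assets/fuzz_seed_corpus/timeoffsets \
    ../qa-assets/fuzz_seed_corpus/*
```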

@murchandamus (Contributor)

With set_cover_merge, I got: MERGE-OUTER: 65 new files with 1945 new features added; 62 new coverage edges

@stickies-v (Contributor, Author)

Thanks for the guidance @murchandamus, that was very helpful. I've updated my approach to start with seeds from all targets, then run the fuzzer on an empty dir, and then finally merge both together.

From the existing seeds, I got:
MERGE-OUTER: 109 new files with 5720 new features added; 726 new coverage edges

After merging with a fresh run, I got:
MERGE-OUTER: 37 new files with 277 new features added; 12 new coverage edges

I've summarized my actions (and outputs) below.

Starting with an empty timeoffsets dir:

$ FUZZ=timeoffsets src/test/fuzz/fuzz -set_cover_merge=1 -shuffle=0 -prefer_small=1 -use_value_profile=0 ../../bitcoin-core-qa-assets/fuzz_seed_corpus/timeoffsets ../../bitcoin-core-qa-assets/fuzz_seed_corpus/*
...
MERGE-OUTER: successful in 1 attempt(s)
MERGE-OUTER: the control file has 387517107 bytes
MERGE-OUTER: consumed 193Mb (830Mb rss) to parse the control file
MERGE-OUTER: 109 new files with 5720 new features added; 726 new coverage edges

Then, on a fresh timeoffsets-empty dir:

$ FUZZ=timeoffsets src/test/fuzz/fuzz -shuffle=0 -prefer_small=1 ../../bitcoin-core-qa-assets/fuzz_seed_corpus/timeoffsets-empty
...
#2324306	REDUCE cov: 928 ft: 5864 corp: 669/292Kb lim: 4096 exec/s: 543 rss: 4714Mb L: 201/4092 MS: 3 ChangeASCIIInt-CMP-EraseBytes- DE: "7s3\003\001\000\000\000"-
#2333777	REDUCE cov: 928 ft: 5864 corp: 669/291Kb lim: 4096 exec/s: 542 rss: 4714Mb L: 1768/4092 MS: 1 EraseBytes-
#2342448	REDUCE cov: 928 ft: 5864 corp: 669/291Kb lim: 4096 exec/s: 541 rss: 4714Mb L: 284/4092 MS: 1 EraseBytes-

And finally, merging the two together:

$ FUZZ=timeoffsets src/test/fuzz/fuzz -set_cover_merge=1 -shuffle=0 -prefer_small=1 -use_value_profile=0 ../../bitcoin-core-qa-assets/fuzz_seed_corpus/timeoffsets ../../bitcoin-core-qa-assets/fuzz_seed_corpus/timeoffsets-empty
...
MERGE-OUTER: successful in 1 attempt(s)
MERGE-OUTER: the control file has 4547882 bytes
MERGE-OUTER: consumed 2Mb (82Mb rss) to parse the control file
MERGE-OUTER: 37 new files with 277 new features added; 12 new coverage edges

@stickies-v (Contributor, Author)

One more force-push, thanks for bearing with me here.

Turns out the timeoffsets harness had a memory leak. That's now fixed, so I've recreated the inputs folder using the same process.

@murchandamus (Contributor)

Sounds like a good initial set to me.

@stickies-v (Contributor, Author) commented Mar 22, 2024

> Would you like me to schedule a cronjob to fuzz that for 12 hours with 12 processors early next week?

Thanks for offering, that'd be very helpful but I think it's probably best to wait until the underlying pull (bitcoin/bitcoin#29623) is merged?

I'll take this PR out of draft once that's done.

@murchandamus (Contributor)

Yeah, let’s first get the code in, then throw CPU at increasing the fuzz coverage. :)

@murchandamus (Contributor)

@stickies-v: I saw #29623 get merged, and I got a new computer that I’m itching to fuzz something with ;)

@stickies-v stickies-v marked this pull request as ready for review May 9, 2024 14:23
@stickies-v (Contributor, Author)

> I got a new computer that I’m itching to fuzz something with ;)

I wouldn't want to keep you from that pleasure - thank you very much 🥳

@murchandamus (Contributor)

Tested this locally, has a pretty good starting coverage. LGTM.

@stickies-v stickies-v changed the title [WIP] Add timeoffsets inputs folder Add timeoffsets inputs folder May 9, 2024
@dergoegge dergoegge merged commit f7e12aa into bitcoin-core:main May 13, 2024
4 checks passed
@stickies-v stickies-v deleted the rename-timedata-timeoffsets branch May 13, 2024 10:36
4 participants