Upload CI generated fuzz corpus coverage to codecov #4153
base: main
Conversation
👋 Thanks for assigning @TheBlueMatt as a reviewer!
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##               main    #4153      +/-   ##
==========================================
+ Coverage     88.82%   89.17%   +0.34%
==========================================
  Files           180      180
  Lines        137278   137271       -7
  Branches     137278   137271       -7
==========================================
+ Hits         121944   122406     +462
+ Misses        12522    12257     -265
+ Partials       2812     2608     -204
==========================================

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Force-pushed from dc493c2 to fdf6799 (compare)
contrib/generate_fuzz_coverage.sh
Outdated
for target_dir in hfuzz_workspace/*; do
    [ -d "$target_dir" ] || continue
    src_name="$(basename "$target_dir")"
    for dest in "$src_name" "${src_name%_target}"; do
I don't think you need to copy into $src_name.
contrib/generate_fuzz_coverage.sh
Outdated
        mkdir -p "test_cases/$dest"
        # Copy corpus files into the test_cases directory
        find "$target_dir" -maxdepth 2 -type f \
            \( -path "$target_dir/CORPUS/*" -o -path "$target_dir/INPUT/*" -o -path "$target_dir/NEW/*" -o -path "$target_dir/input/*" \) \
Because we're just looking in hfuzz_workspace, I believe we only need to look in input, not CORPUS, INPUT, or NEW.
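If both suggestions are taken, the copy loop collapses to something like the sketch below: only the honggfuzz `input/` directory is scanned, and files land only under the `_target`-stripped name. This is a sketch built from the quoted hunks rather than the final script; the extra `[ -d ... ]` guard is an assumption to keep `find` from failing on targets that produced no corpus.

```sh
for target_dir in hfuzz_workspace/*; do
    [ -d "$target_dir/input" ] || continue
    # hfuzz_workspace/<name>_target -> test_cases/<name>
    dest="$(basename "$target_dir")"
    dest="${dest%_target}"
    mkdir -p "test_cases/$dest"
    # Copy the live honggfuzz corpus into the replay directory, never
    # overwriting files that are already present.
    find "$target_dir/input" -maxdepth 1 -type f -print0 |
        xargs -0 -I{} cp -n {} "test_cases/$dest/"
done
```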
            cargo clean
      - name: Run fuzzers
        run: cd fuzz && ./ci-fuzz.sh && cd ..
      - name: Upload honggfuzz corpus
Rather than only uploading, is there a way to make this directory persistent so that we can keep it between fuzz jobs?
I'm not sure if we really need to persist the directory here. My understanding is that the fuzz job runs on the latest code changes on every PR, so the generated corpus is tailored to the code changes on that PR. If we persist the corpus from a previous run and use that on a new run, won't that produce incorrect/misleading coverage data?
I don't think the point of the fuzz job is only to generate coverage data, but rather test the code :). Having a bit more coverage data from fuzzing than we "deserve" is okay, at least now that we split the coverage data out so that codecov shows fuzzing separately, and having persistent fuzzing corpus means our fuzzing is much more likely to catch issues.
Right, how long do you think we can have this directory persisted? The upload-artifact action has a retention-days input that can be used to persist the artifact for a while. The default is 90 days but can be adjusted (https://github.com/actions/upload-artifact?tab=readme-ov-file#retention-period).
I believe the simple "upload-artifact" task just stores data for this CI run. What I was thinking is some kind of persistent directory that's shared across jobs so that each CI fuzz task picks up the latest directory, does some fuzzing, finds new test cases, then uploads a new copy with more tests in it.
What I was thinking is some kind of persistent directory that's shared across jobs so that each CI fuzz task picks up the latest directory, does some fuzzing, finds new test cases, then uploads a new copy with more tests in it.
Makes sense. I pushed eea2e4b to handle this using Github's cache action (https://github.com/actions/cache?tab=readme-ov-file).
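For reference, a cache step along these lines would cover that; the path, key, and fallback below are illustrative and may not match what eea2e4b actually uses:

```yaml
      - name: Cache honggfuzz corpus
        uses: actions/cache@v4
        with:
          path: fuzz/hfuzz_workspace
          # Per-commit key with a prefix fallback, so a cache miss still
          # restores the most recent corpus before fuzzing starts.
          key: fuzz-corpus-${{ github.sha }}
          restore-keys: |
            fuzz-corpus-
```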
contrib/generate_fuzz_coverage.sh
Outdated
        # Copy corpus files into the test_cases directory
        find "$target_dir" -maxdepth 2 -type f \
            \( -path "$target_dir/CORPUS/*" -o -path "$target_dir/INPUT/*" -o -path "$target_dir/NEW/*" -o -path "$target_dir/input/*" \) \
            -print0 | xargs -0 -I{} cp -n {} "test_cases/$dest/" 2>/dev/null || true
Suggested change:
-            -print0 | xargs -0 -I{} cp -n {} "test_cases/$dest/" 2>/dev/null || true
+            -print0 | xargs -0 -I{} cp -n {} "test_cases/$dest/"
Done. Thank you.
contrib/generate_fuzz_coverage.sh
Outdated
    done
    # Check if any files were actually imported
    if [ -n "$(find test_cases -type f -print -quit 2>/dev/null)" ]; then
        imported=1
Not sure it's worth the extra effort just to print differently.
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.
Thank you for the review. I've addressed all the feedback and pushed a fixup here: 1e4a7c5
Responded at #4153 (comment)
Force-pushed from 19c1495 to eea2e4b (compare)
I rebased on […]. EDIT: This seems to be blocking the build (and fuzzing as well).
We have some complexity in `ci-fuzz.sh` to limit each fuzzer to a rough runtime, but `honggfuzz` has a `--run-time` argument that we can simply use instead, which we do here.
This now slows us down as we run our fuzz job on a machine with more than one or two cores.
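As a rough sketch, the per-target loop in `ci-fuzz.sh` could then hand the time budget to honggfuzz directly. This assumes honggfuzz-rs forwards `HFUZZ_RUN_ARGS` to the fuzzer and that the session-length option is spelled `--run_time`; both the loop shape and the 1800-second budget are illustrative, not taken from the actual script.

```sh
# Let honggfuzz enforce the per-target time budget instead of computing
# iteration counts per core ourselves (verify flag spelling against the
# honggfuzz docs before relying on it).
export HFUZZ_RUN_ARGS="--run_time 1800"
for target in src/bin/*.rs; do
    filename="$(basename "$target")"
    cargo hfuzz run "${filename%.*}"
done
```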
Yea, sorry, CI is kinda a mess for three reasons all at once. Can you rebase on #4179? That should get at least the […]
honggfuzz builds appear to fail on 1.75 - 1.79, so we just pick an MSRV of the first toolchain that actually works. Sadly, github for some reason drops a trailing in the env entry, so we copy the rust version a few places.
Because each CI job runs on a fresh runner, jobs can't share data directly. We rely on the GitHub Actions upload-artifact and download-artifact actions to share the CI-generated fuzz corpus, then replay it in the `contrib/generate_fuzz_coverage.sh` script to generate the coverage report.
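Roughly, the hand-off between the two jobs would look like the sketch below; the job names, artifact name, and paths are hypothetical and only meant to illustrate the upload/download pairing described above.

```yaml
  fuzz:
    runs-on: ubuntu-latest
    steps:
      # ... checkout, build, and run the fuzzers ...
      - name: Upload honggfuzz corpus
        uses: actions/upload-artifact@v4
        with:
          name: fuzz-corpus            # hypothetical artifact name
          path: fuzz/hfuzz_workspace

  fuzz-coverage:
    needs: fuzz
    runs-on: ubuntu-latest
    steps:
      # ... checkout and toolchain setup ...
      - name: Download honggfuzz corpus
        uses: actions/download-artifact@v4
        with:
          name: fuzz-corpus
          path: fuzz/hfuzz_workspace
      - name: Generate fuzz corpus coverage
        run: ./contrib/generate_fuzz_coverage.sh
```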
Force-pushed from eea2e4b to de9e1fd (compare)
Yes, done! Thank you.
FWIW, remaining CI failures should be resolved shortly by #4180 |
        uses: actions/cache@v4
        with:
          path: fuzz/hfuzz_workspace
          key: fuzz-corpus-${{ github.ref }}-${{ github.sha }}
Isn't this going to be per-PR? We don't want it to be per-PR; we want it to be global.
Isn't this going to be per-PR? We don't want it to be per-PR; we want it to be global.
Addressed this by adding two-step logic to the workflow: a read-only per-PR step that seeds the fuzzer for a more effective run on PRs, and a main-branch step that does the same but also writes to the global cache.
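Something along these lines, using the split restore/save cache actions; the keys, path, and branch condition are illustrative and may not match the pushed workflow exactly.

```yaml
      # Every run (PRs included) seeds the fuzzers from the shared corpus,
      # read-only, so per-PR runs never write back to the global cache.
      - name: Restore fuzz corpus
        uses: actions/cache/restore@v4
        with:
          path: fuzz/hfuzz_workspace
          key: fuzz-corpus-${{ github.sha }}
          restore-keys: |
            fuzz-corpus-

      # ... run the fuzzers ...

      # Only pushes to main save the grown corpus back to the global cache.
      - name: Save fuzz corpus
        if: github.ref == 'refs/heads/main'
        uses: actions/cache/save@v4
        with:
          path: fuzz/hfuzz_workspace
          key: fuzz-corpus-${{ github.sha }}
```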
Force-pushed from de9e1fd to fcba095 (compare)
Force-pushed from fcba095 to 6cd3f8f (compare)
This follows the work in #3718 and #3925 that introduced uploading coverage from no-corpus fuzzing runs to codecov in CI. This PR focuses on uploading coverage from the CI-generated fuzz corpus to codecov as well.
Closes #3926