Automated fuzz target filtering #185

DonggeLiu · 2024-03-27T03:50:32Z

No description provided.

Three types of cases are handled:
- drivers that have no coverage increase
- false positives that crash at INITED or within the first few rounds
- false positives that crash at `LLVMFuzzerTestOneInput`
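For readers who want a concrete picture of the three cases, here is a minimal sketch of how a libFuzzer log could be sorted into them. It is not the code merged in this PR: `DriverVerdict`, `classify`, the `early_rounds` threshold, and the use of stack frame #0 as the crash location are all assumptions made for illustration. Only the libFuzzer status-line and sanitizer stack-frame shapes are standard.

```python
import re
from enum import Enum


class DriverVerdict(Enum):
    """Hypothetical verdicts; the names are illustrative only."""
    OK = "ok"
    NO_COV_INCREASE = "no_cov_increase"
    FP_CRASH_NEAR_INIT = "fp_crash_near_init"
    FP_CRASH_IN_ENTRY = "fp_crash_in_entry"


# libFuzzer status lines look like: "#2\tINITED cov: 12 ft: 13 corp: 1/1b ..."
STATUS_RE = re.compile(r"^#(\d+)\s+(INITED|NEW|REDUCE|pulse|DONE)\s+cov: (\d+)")
# Sanitizer stack frames look like: "    #0 0x55d1 in parse_string /src/x.c:10:3"
FRAME_RE = re.compile(r"^\s*#(\d+) 0x[0-9a-f]+ in (\S+)")
# Sanitizer reports start with: "==12345==ERROR: AddressSanitizer: ..."
CRASH_RE = re.compile(r"==\d+==\s*ERROR: ")


def classify(log, early_rounds=10):
    """Classify one fuzzing log. `early_rounds` is an assumed threshold."""
    last_round = 0
    covs = []
    crashed = False
    first_frame = None
    for line in log.splitlines():
        status = STATUS_RE.match(line)
        if status and not crashed:
            last_round = int(status.group(1))
            covs.append(int(status.group(3)))
            continue
        if CRASH_RE.search(line):
            crashed = True
            continue
        if crashed and first_frame is None:
            frame = FRAME_RE.match(line)
            if frame:
                first_frame = frame.group(2)
    if crashed:
        # Simplification: only frame #0 is inspected.
        if first_frame == "LLVMFuzzerTestOneInput":
            return DriverVerdict.FP_CRASH_IN_ENTRY
        if last_round <= early_rounds:
            return DriverVerdict.FP_CRASH_NEAR_INIT
        return DriverVerdict.OK
    # No crash: the driver is suspicious if coverage never moved.
    if len(set(covs)) <= 1:
        return DriverVerdict.NO_COV_INCREASE
    return DriverVerdict.OK
```

A real filter would likely need to skip sanitizer interceptor frames before deciding where the crash happened; this sketch leaves that out.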

DonggeLiu · 2024-03-27T23:18:46Z

JOB: https://console.cloud.google.com/kubernetes/job/us-central1-c/llm-experiment/default/ofg-pr-185-dg
REPORT: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-03-28-185-dg-comparison/index.html
BUCKET: https://console.cloud.google.com/storage/browser/oss-fuzz-gcb-experiment-run-logs/Result-reports/ofg-pr/2024-03-28-185-dg-comparison
BUCKET GS: gs://oss-fuzz-gcb-experiment-run-logs/Result-reports/ofg-pr/2024-03-28-185-dg-comparison

DonggeLiu · 2024-03-27T23:19:10Z

@happy-qop Started the experiment here.
The report above shows your filter already helped filter out many invalid crashes. Thanks!
(Please do let me know if you cannot access it.)

Several things I wanted to do before merging this to maximize the impact of your filters:

- Surface filter categories in the report. The report only shows crash status as a boolean value; it would be more reader-friendly to show why we filtered each crash (e.g., FP_CRASH_NEAR_INIT).
- Run a full experiment. Once the above is done, we are ready for a full experiment with benchmarks in the all/ benchmark set.
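As a rough illustration of the first item, the sketch below carries a filter category next to the boolean crash status and emits it in both a JSON summary and an HTML table row. The `RunSummary` class, its field layout, and the helper names are hypothetical; only `is_driver_fuzz_err` and `FP_CRASH_NEAR_INIT` come from this thread, and the actual report code may be structured quite differently.

```python
import html
import json
from dataclasses import asdict, dataclass


@dataclass
class RunSummary:
    """Hypothetical per-sample result record."""
    benchmark: str
    sample: str
    crashes: bool
    is_driver_fuzz_err: bool
    driver_fuzz_err: str  # e.g. "FP_CRASH_NEAR_INIT"; "" when not filtered


def to_json(rows):
    """Serialize all rows for a machine-readable summary."""
    return json.dumps([asdict(row) for row in rows], indent=2)


def to_html_row(row):
    """Render one table row, escaping every cell."""
    reason = row.driver_fuzz_err or "-"
    cells = [row.benchmark, row.sample, str(row.crashes), reason]
    return "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in cells) + "</tr>"
```

Keeping the category as a plain string in the record means the same value can be shown in the HTML report and consumed by scripts reading the JSON.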

I can work on both and let you know once the experiment starts.
The following four days are holidays in Australia, so I might be late in replies / commits, hope you won't mind :)

happy-qop · 2024-03-28T05:18:02Z

Glad to hear that it helps!

No, I cannot access these crash reports~

> Surface filter categories in the report.

I would like to do this, but I am not sure which implementation would best fit your workflow, for example how your experiment scripts would consume the new classification data. This classification information is also closely related to the fix prompt strategies we discussed.
If you can give me a hint on this, I can adapt my existing code (further classification and fix prompts) here in the next few days, perhaps in another PR.

> The following four days are holidays in Australia, so I might be late in replies / commits, hope you won't mind :)

Enjoy your holiday!

DonggeLiu · 2024-03-28T05:41:39Z

> No, I cannot access these crash reports~

What's your preferred email address to access them?
We can test adding you after returning from holiday (the coming Tuesday).

> I would like to do this, but I am not sure which implementation would best fit your workflow, for example how your experiment scripts would consume the new classification data.

I am thinking of starting with some very preliminary changes:

- Replace the current crash with your `is_driver_fuzz_err`.
- Show the FP reason in the report HTML.

Given you cannot see the report now, it might be quicker for me to do it.

> This classification information is also closely related to the fix prompt strategies we discussed. If you can give me a hint on this, I can adapt my existing code (further classification and fix prompts) in the next few days.

Are there any other more sophisticated changes you'd like to propose?
I'd love to hear more : )

happy-qop · 2024-03-28T05:54:49Z

> What's your preferred email address to access them?

The Gmail address used for the meeting should be fine.

> I am thinking of starting with some very preliminary changes:
>
> - Replace the current crash with your `is_driver_fuzz_err`.
> - Show the FP reason in the report HTML.

Cool! Please go ahead with this; I can learn from your changes for further implementation.

> Are there any other more sophisticated changes you'd like to propose?

The errors filtered here are driver runtime errors. Identifying them and proposing corresponding fix prompts is part of the workflow for implementing error-type-specific fix prompts (FIX_FUZZ_XXX here).

DonggeLiu · 2024-03-28T06:23:55Z

Pushed the first version here: #191

I may adjust it a bit and will update you once it is ready : )

> The errors filtered here are driver runtime errors. Identifying them and proposing corresponding fix prompts is part of the workflow for implementing error-type-specific fix prompts (FIX_FUZZ_XXX here).

I see, that seems to be a big change indeed.
Let's do that in a separate PR later, then!

Show crash type in HTML reports and JSON summary.

oliverchang · 2024-04-03T06:04:43Z

experiment/evaluator.py

+          break
+
+    else:
+      # Another error driver case: no cov increase.


note: I believe we have seen cases where the fuzz target is legitimate even when there is no cov increase. This is often when a new target fuzzes a very buggy function that has never been fuzzed before, with a very shallow bug that is instantly triggered.

@DonggeLiu to confirm if this is an issue.

We may want to think about some kind of confidence score here instead of a binary yes/no for filtering crashes/fuzz targets.
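To make the confidence-score idea concrete, here is a small sketch that combines a few signals into a percentage instead of a yes/no verdict. Everything in it is an assumption for illustration: the `CrashSignals` fields, the weights, and the `early_rounds` threshold are invented here and are not taken from this PR.

```python
from dataclasses import dataclass


@dataclass
class CrashSignals:
    """Hypothetical signals extracted from one fuzzing run."""
    cov_increased: bool
    rounds_before_crash: int
    top_frame_in_project: bool  # crash site is project code, not the fuzz target


def crash_confidence(signals, early_rounds=10):
    """Return a 0-100 confidence that the crash is a genuine project bug.

    The weights are arbitrary placeholders chosen for this sketch.
    """
    score = 50
    # Where the crash happened is treated as the strongest signal.
    score += 30 if signals.top_frame_in_project else -30
    # No coverage increase and very early crashes only lower confidence;
    # on their own they no longer force a rejection.
    if not signals.cov_increased:
        score -= 10
    if signals.rounds_before_crash <= early_rounds:
        score -= 10
    return max(0, min(100, score))
```

Under these assumed weights, a shallow bug that triggers instantly inside project code with no coverage increase scores 60, while the same pattern crashing in the fuzz target itself scores 0, so the two are no longer lumped together.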

DonggeLiu · 2024-04-03

Sure, I will check against old known bugs.

happy-qop · 2024-04-04

@oliverchang @DonggeLiu Would it be possible to share the example cases with me as well? I'm interested in them and wonder whether I can also contribute to improving this.

DonggeLiu · 2024-04-04

Certainly! Thanks soooo much for volunteering!
These are fuzz targets and the crash logs:

OOB access in parse_string DaveGamble/cJSON#800

OOB access in plist_from_memory libimobiledevice/libplist#244

Stack-buffer-overflow in AffixMgr::compound_check_morph hunspell/hunspell#996

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=67497

They are documented at our README.md : )

DonggeLiu · 2024-04-04

BTW, would it be possible to capture `LeakSanitizer: detected memory leaks`?
They are usually not security-related and are usually caused by the fuzz target, so it would be useful to filter them (or give them a low confidence score). E.g.,
https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-04-03-198-dg-comparison/sample/output-hiredis-rediscommand/01
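One possible way to capture these, sketched under the assumption that the raw sanitizer log is available as a string: collect the tool named in each `ERROR:` header and treat the run as leak-only when LeakSanitizer is the only one reported. The helper names `crash_kinds` and `is_leak_only` are hypothetical.

```python
import re

# Matches the tool name in headers such as
# "==123==ERROR: LeakSanitizer: detected memory leaks" or
# "==123==ERROR: AddressSanitizer: heap-buffer-overflow ...".
SANITIZER_ERROR_RE = re.compile(r"ERROR: (\w+): ")


def crash_kinds(log):
    """Return the tool names from every ERROR header, in order."""
    return SANITIZER_ERROR_RE.findall(log)


def is_leak_only(log):
    """True when the log reports leaks and no other kind of error."""
    kinds = crash_kinds(log)
    return bool(kinds) and all(kind == "LeakSanitizer" for kind in kinds)
```

A leak-only run could then be filtered outright or simply given a low confidence score, depending on which policy is preferred.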

happy-qop · 2024-04-04

Of course! I had that idea when implementing the initial version of the error driver filtering, but left it as future work since I was not sure how you expect LEAK cases to be handled (link). I'll handle that together.

DonggeLiu · 2024-04-04

That's fantastic, much appreciated!

Create dummy

1de802a

DonggeLiu added the Experiment-only label (a PR only to run experiments; do not merge it to main) Mar 27, 2024

detect and filter incorrect fuzz drivers (#187)

5028210

Three types of cases are handled:
- drivers that have no coverage increase
- false positives that crash at INITED or within the first few rounds
- false positives that crash at `LLVMFuzzerTestOneInput`

Delete dummy

580b717

Surface crash category (#191)

c8da576

Show crash type in HTML reports and JSON summary.

DonggeLiu changed the title ~~[DO NOT MERGE] Experiment with automated fuzz target filtering~~ Automated fuzz target filtering Mar 28, 2024

DonggeLiu merged commit b4928d2 into main Mar 28, 2024
3 checks passed

DonggeLiu deleted the DonggeLiu-patch-2 branch March 28, 2024 12:26

oliverchang reviewed Apr 3, 2024
