Hello, as part of some research we analyzed fuzzer performance degradation by looking at the reasons why fuzzing coverage decreases for C/C++ projects in OSS-Fuzz. We found several types of issues that are easier to detect by comparing the current report to past reports.
I would be happy to implement these metrics if you are interested.
- Detecting coverage drops would be a generic way to detect degradation. This is already discussed here: idea: treat a major coverage drop an issue! google/oss-fuzz#11398. A threshold would need to be decided, maybe as a percentage or as an absolute number of lines.
- A common reason for large coverage drops is the vendoring of third-party library code, though sometimes project-specific code is the cause as well. If you agree that library code should not be included in the coverage measurement, large changes in the amount of measured code should cause an alert so that the library code can be ignored. See grpc-httpjson-transcoding as an example: the project itself is a few hundred lines of code with close to 100% coverage, but it vendored 100k lines of library code.
- Compare the fuzz targets over time. It sometimes happens that a project starts to have a partial build failure that stops only one (or a few) fuzz targets from building, without necessarily causing a build failure issue to be created for the project. For example, this happened with curl: idea: treat a major coverage drop an issue! google/oss-fuzz#11398 (comment)
- The number of corpus entries is normally quite stable. But due to the way coverage is collected, it can fluctuate and drop to a fraction of the real size: Reported coverage results do not match corpus google/oss-fuzz#12986 and Understanding inconsistent coverage reports google/oss-fuzz#11935. So this could be detected by looking at past corpus sizes. Though, if I understand correctly, the seed corpus is combined across fuzz targets? Alternatively, an expected number of corpus entries per covered code branches/lines could be decided. For example, covering 10k lines with five corpus entries does not seem like effective fuzzing.
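To make the list above more concrete, here is a rough sketch of how the checks could be combined when comparing two reports. Everything in it is an assumption for illustration: the `Report` record, the `compare_reports` function, and all threshold values are hypothetical and do not correspond to any existing OSS-Fuzz or Fuzz Introspector API.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical summary of one coverage report for a project."""
    covered_lines: int
    total_lines: int
    fuzz_targets: set[str] = field(default_factory=set)
    corpus_entries: int = 0


def compare_reports(prev: Report, curr: Report,
                    max_coverage_drop: float = 0.10,
                    max_total_change: float = 0.50,
                    min_corpus_ratio: float = 0.5,
                    min_entries_per_1k_lines: float = 1.0) -> list[str]:
    """Return warning labels for curr relative to prev.

    All thresholds are placeholders that would need to be tuned.
    """
    warnings = []

    # 1. Generic coverage drop, measured in percentage points.
    prev_pct = prev.covered_lines / prev.total_lines if prev.total_lines else 0.0
    curr_pct = curr.covered_lines / curr.total_lines if curr.total_lines else 0.0
    if prev_pct - curr_pct > max_coverage_drop:
        warnings.append("coverage_drop")

    # 2. Large change in measured code size, e.g. newly vendored library code.
    if prev.total_lines:
        change = abs(curr.total_lines - prev.total_lines) / prev.total_lines
        if change > max_total_change:
            warnings.append("total_lines_changed")

    # 3. Fuzz targets that were present before but are missing now.
    missing = prev.fuzz_targets - curr.fuzz_targets
    if missing:
        warnings.append("missing_targets:" + ",".join(sorted(missing)))

    # 4a. Corpus shrank to a fraction of its previous size.
    if prev.corpus_entries and curr.corpus_entries < prev.corpus_entries * min_corpus_ratio:
        warnings.append("corpus_shrunk")

    # 4b. Very few corpus entries relative to the amount of covered code.
    if curr.covered_lines:
        per_1k = curr.corpus_entries / (curr.covered_lines / 1000)
        if per_1k < min_entries_per_1k_lines:
            warnings.append("few_corpus_entries")

    return warnings
```

With these placeholder thresholds, a report covering 10k lines with five corpus entries would get the `few_corpus_entries` label, and a project that suddenly measures 100k lines instead of a few hundred would get `total_lines_changed` in addition to `coverage_drop`.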
This is also related to diffing runs: #734
I can also provide more examples if you want; I just wanted to keep this short.