Absolute coverage numbers don't paint the whole picture: different fuzzers may reach the same total coverage but cover different parts of the program. Or fuzzer A may reach more coverage than fuzzer B, while B still reaches coverage that A doesn't, so B remains valuable despite the lower total.
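As a toy illustration (with hypothetical edge IDs), two fuzzers can tie on total coverage while covering different edges:

```python
cov_a = {1, 2, 3}  # edges covered by fuzzer A
cov_b = {2, 3, 4}  # edges covered by fuzzer B

print(len(cov_a) == len(cov_b))  # True: identical total coverage
print(cov_b - cov_a)             # {4}: B reaches an edge that A never does
```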
There are different ways of measuring differential coverage:
- Leonelli et al. introduced `relcov` in their paper on TwinFuzz[^1]. It allows comparing the relative performance of one fuzzer against another and provides values $0 \leq x \leq 1$.
- Alternatively, `relscore` was proposed in the SBFT'25 Competition Report[^2] by Crump et al. It provides a total ordering of a list of fuzzers.
`relcov` was defined as follows[^1]:
This can be used in various ways, such as:
- The reliability of a fuzzer: `relcov` values between each trial of the same fuzzer
- The performance of a fuzzer: `relcov` against the coverage of the input corpus
- The relative performance of a fuzzer: `relcov` against the coverage of a reference fuzzer
`relscore` was defined as follows[^2]:
The score of a fuzzer is then
This can be simplified to the following:
Disclaimer: This is subject to change.
Assume `<campaign_dir>` is a directory of subdirectories for each fuzzer, where each fuzzer subdirectory contains coverage files for each trial. Currently, the output format of afl-showmap is supported (lines with `edge_id:count`, where we only care if `count > 0`).
PRs for other data formats welcome :)
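For reference, a minimal sketch of reading one such coverage file into a set of covered edges (this mirrors the format described above; the function name is ours, not part of the package):

```python
from pathlib import Path


def read_showmap_file(path: Path) -> set[int]:
    """Parse afl-showmap output (one `edge_id:count` per line) into covered edges."""
    covered = set()
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        edge_id, count = line.split(":")
        if int(count) > 0:  # only whether the edge was hit matters
            covered.add(int(edge_id))
    return covered
```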
All commands take the same campaign directory layout: one subdirectory per fuzzer, each with afl-showmap coverage files. Use `-o csv` for machine-readable output.
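For illustration, a campaign directory might look like this (fuzzer and file names are hypothetical; only the nesting matters):

```
<campaign_dir>/
├── fuzzer_a/
│   ├── trial_0.cov
│   └── trial_1.cov
└── fuzzer_b/
    ├── trial_0.cov
    └── trial_1.cov
```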
```
$ differential-coverage --help
usage: differential-coverage [-h] [-x FUZZER] [--output FORMAT] command ...

Compute differential coverage relscore and relcov-based measures from
afl-showmap coverage. All commands take one campaign directory: one subdir
per fuzzer, each with coverage files.

options:
  -h, --help            show this help message and exit
  -x, --exclude-fuzzer FUZZER
                        Exclude a fuzzer (subdirectory name) from the
                        analysis. Can be specified multiple times.
  --output, -o FORMAT   Output format: csv for CSV (default: print to
                        stdout).

subcommands:
  command
    relscore            Compute relscore (SBFT'25) from a campaign directory
                        containing one subdirectory per fuzzer with
                        afl-showmap files.
    relcov-reliability  Compute relcov-based reliability for each fuzzer.
    relcov-performance-fuzzer
                        Compute relcov-based performance of each fuzzer
                        relative to reference fuzzers. By default prints a
                        table (all fuzzers × all fuzzers as reference). Use
                        --single to get scores for one reference only.
    relcov-performance-corpus
                        Compute relcov-based performance of each fuzzer
                        relative to input corpora. By default prints a table
                        (all fuzzers × all single-trial corpus fuzzers). Use
                        --single to get scores for one corpus only.
```
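A few sketched invocations (hypothetical names; exact argument placement follows each subcommand's own help):

```
$ differential-coverage relscore <campaign_dir>
$ differential-coverage -o csv relcov-reliability <campaign_dir>
$ differential-coverage -x fuzzer_b relcov-performance-fuzzer <campaign_dir>
```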
Load a campaign from disk and run the same computations programmatically. The API lives in `differential_coverage.api`:
```python
from pathlib import Path

from differential_coverage.api import (
    read_campaign,
    run_relscore,
    run_relcov_reliability,
    run_relcov_performance_fuzzer,
    run_relcov_performance_fuzzer_all,
    run_relcov_performance_corpus,
    run_relcov_performance_corpus_all,
)

path = Path("path/to/campaign_dir")
campaign = read_campaign(path)  # dict[fuzzer_name, dict[trial_id, dict[edge_id, count]]]

# Relscore (SBFT'25): fuzzer -> score
scores = run_relscore(campaign)

# Relcov reliability: fuzzer -> value in [0, 1]
reliability = run_relcov_reliability(campaign)

# Relcov performance vs a reference fuzzer (reference excluded from result)
perf = run_relcov_performance_fuzzer(campaign, against="fuzzer_c")

# Relcov performance vs every fuzzer as reference: (ref_fuzzers, table[row][col])
ref_fuzzers, table = run_relcov_performance_fuzzer_all(campaign)

# Relcov performance over input corpus (corpus fuzzer must have exactly one trial)
perf_corpus = run_relcov_performance_corpus(campaign, input_corpus_fuzzer="seeds")

# Relcov performance over every single-trial corpus: (corpus_fuzzers, table[row][col])
corpus_fuzzers, corpus_table = run_relcov_performance_corpus_all(campaign)
```

All `run_*` functions take a campaign (an in-memory map: fuzzer name → trial id → edge id → count); use `read_campaign(path)` to build one from a directory. The single-reference functions return `dict[str, float]` (fuzzer name → value). The `*_all` functions return `(list_of_refs, table)`, where `table[row_fuzzer][ref_fuzzer]` is the score (1.0 on the diagonal). `run_relcov_performance_fuzzer` and `run_relcov_performance_corpus` raise `ValueError` if the reference or input-corpus fuzzer is missing or (for the corpus variant) has more than one trial.
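For instance, the all-vs-all table can be flattened for further processing (this assumes the nested-mapping shape implied by the `table[row_fuzzer][ref_fuzzer]` indexing above):

```python
# Continuing from the example above: flatten the all-vs-all table.
for row_fuzzer, row in table.items():
    for ref in ref_fuzzers:
        print(f"{row_fuzzer} vs {ref}: {row[ref]:.3f}")
```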
Install the package:

```
pip install .
```

Install dev dependencies and set up the pre-commit hook (runs a couple of checks before committing):

```
pip install -e ".[dev]"
pre-commit install
```
[^1]: TWINFUZZ: Differential Testing of Video Hardware Acceleration Stacks, https://www.ndss-symposium.org/ndss-paper/twinfuzz-differential-testing-of-video-hardware-acceleration-stacks/
[^2]: SBFT’25 Competition Report — Fuzzing Track, https://ieeexplore.ieee.org/document/11086561