Support specifying which fuzzers to include in the report #79

Open
lszekeres opened this issue Mar 9, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@lszekeres
Member

One often wants to see a report that compares only two or just a few fuzzers. E.g., compare only

  • afl with libfuzzer, or
  • two different versions of the same fuzzer, or
  • only fuzzers with dynamic binary instrumentation, etc.

This can be supported by adding a --fuzzers flag to the report generator, where users can list the fuzzers they would like the report to compare. We should allow specifying fuzzers with different versions, by either tagging them with their version number or perhaps with an experiment name. E.g., generate_report --fuzzers afl:v2.5 afl:v2.4, or generate_report --fuzzers afl:experiment-2020-01-15 afl:experiment-2020-02-15.
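A minimal sketch of how such a flag could be parsed, assuming a `name[:tag]` syntax where the tag is either a version or an experiment name. The names `FuzzerSpec`, `parse_fuzzer_spec`, `build_parser`, and `matches` are hypothetical and not part of the existing report generator; treating an untagged spec as matching any tag is also an assumption.

```python
import argparse
from typing import NamedTuple, Optional


class FuzzerSpec(NamedTuple):
    """One entry of the proposed --fuzzers flag (hypothetical type)."""
    name: str
    tag: Optional[str]  # version or experiment name; None when untagged


def parse_fuzzer_spec(spec: str) -> FuzzerSpec:
    """Parses 'afl' or 'afl:v2.5' into a FuzzerSpec."""
    name, sep, tag = spec.partition(':')
    # Reject an empty name ('​:v2.5') and a dangling separator ('afl:').
    if not name or (sep and not tag):
        raise argparse.ArgumentTypeError(f'invalid fuzzer spec: {spec!r}')
    return FuzzerSpec(name, tag or None)


def build_parser() -> argparse.ArgumentParser:
    """Builds a parser with only the proposed flag (assumed CLI shape)."""
    parser = argparse.ArgumentParser(prog='generate_report')
    parser.add_argument(
        '--fuzzers',
        nargs='+',
        type=parse_fuzzer_spec,
        default=None,  # None means "include every fuzzer"
        help='fuzzers to compare, optionally tagged as name:tag')
    return parser


def matches(spec: FuzzerSpec, name: str, tag: str) -> bool:
    """Assumption: an untagged spec selects the fuzzer under any tag."""
    return spec.name == name and (spec.tag is None or spec.tag == tag)
```

With this sketch, `build_parser().parse_args(['--fuzzers', 'afl:v2.5', 'afl:v2.4'])` would yield two specs for the same fuzzer with different tags, which the report code could then use to filter its data.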

This is also useful because restricting the report to a smaller or different subset of fuzzers can change the result of the top-level statistical analysis (Friedman test, critical difference plot). Further, when we compare only two fuzzers, we can run more precise statistical tests with more specific visualizations. On the benchmark level, we don't need a pairwise comparison matrix of statistical significance, since there is only a single pair. On the experiment level, we don't need the Friedman/Nemenyi test (which is meant for comparing more than two fuzzers, and is rather conservative); we can instead use the Wilcoxon signed-rank test, which is specifically designed for comparing two things (i.e., matched samples for the different benchmarks).
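To make the two-fuzzer case concrete, here is a rough, dependency-free sketch of the Wilcoxon signed-rank test on matched per-benchmark scores. The function name `wilcoxon_signed_rank` and the input shape (one score per benchmark for each fuzzer, in the same benchmark order) are assumptions for illustration. It drops zero differences, uses average ranks for ties, and computes an exact two-sided p-value conditional on the observed ranks. In practice a library routine such as `scipy.stats.wilcoxon` would likely be the better choice.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(a: Sequence[float],
                         b: Sequence[float]) -> Tuple[float, float]:
    """Returns (W, exact two-sided p-value) for paired samples a and b.

    W is the smaller of the positive and negative signed-rank sums.
    """
    if len(a) != len(b):
        raise ValueError('samples must be paired')
    # Zero differences carry no sign information, so they are dropped.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError('all differences are zero')

    # Rank the absolute differences, giving ties their average rank.
    # Ranks are stored doubled so that they stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # 2 * average of ranks i+1..j+1
        i = j + 1

    total2 = sum(ranks2)
    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    stat2 = min(w_plus2, total2 - w_plus2)

    # Null distribution of the positive rank sum: under H0 each rank is
    # positive or negative with probability 1/2, so count subset sums.
    counts = {0: 1}
    for r in ranks2:
        nxt = dict(counts)
        for s, c in counts.items():
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt

    lower = sum(c for s, c in counts.items() if s <= stat2)
    upper = sum(c for s, c in counts.items() if s >= total2 - stat2)
    p_value = min(1.0, (lower + upper) / 2 ** n)
    return stat2 / 2, p_value
```

Here each pair would be the two fuzzers' scores on one benchmark (for example, median coverage), so the test treats benchmarks as the matched samples.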

@jonathanmetzman
Contributor

This issue is fixed.

@lszekeres
Member Author

> This issue is fixed.

Not everything is fixed yet. There's also:

  1. Support version/experiment tagging, e.g.,

--fuzzers afl:v2.5 afl:v2.4.

and

  2. Different report when we're comparing only two things:

> when we only compare two fuzzers, we can even do more precise statistical tests with more specific visualizations. On the benchmark level, we don't need a pairwise comparison matrix of statistical significance, since we only have a single pair. On the experiment level, we don't need to use Friedman/Nemenyi test (which compares more than two fuzzers, and is rather conservative), but we can use the Wilcoxon signed-rank test, which is specifically designed for comparing two things (i.e., matched samples for the different benchmarks).

Should we reopen this issue, or would you prefer creating separate issues for these?

@jonathanmetzman
Contributor

My bad, reopening is fine.
