
better heuristics for detecting un-instrumented code than raw edge count #2469

Closed
inferno-chromium opened this issue May 28, 2019 · 8 comments


@inferno-chromium
Collaborator

See WizardMac/ReadStat#181 (comment).
Also, we have a bunch of hacks accumulating in https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/test_all#L121

@evanmiller
Contributor

Can you use sancov to count the number of covered functions?

https://clang.llvm.org/docs/SanitizerCoverage.html#default-implementation
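
For context, that same SanitizerCoverage document also describes custom trace-pc-guard callbacks. Below is a minimal sketch of counting covered functions that way, assuming the target is built with -fsanitize-coverage=func,trace-pc-guard and run without libFuzzer (which supplies its own callbacks); the guard bookkeeping follows the example in the docs, and the "covered functions" framing relies on the func level placing one guard per function entry. This is an illustration, not something anyone in the thread has implemented.

// Counts distinct coverage guards hit. With -fsanitize-coverage=func,trace-pc-guard
// each instrumented function gets a single guard at its entry, so the final
// count approximates the number of covered functions.
#include <cstdint>
#include <cstdio>
#include <cstdlib>

static uint64_t NumGuards = 0;   // guards across all instrumented modules
static uint64_t NumCovered = 0;  // guards hit at least once

// Called once per instrumented module with its [start, stop) guard range.
extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                    uint32_t *stop) {
  static uint32_t N = 0;  // running guard id, as in the SanitizerCoverage docs
  if (start == stop || *start) return;  // empty range or already initialized
  if (N == 0)
    std::atexit([] {
      std::fprintf(stderr, "covered %llu of %llu instrumented functions\n",
                   (unsigned long long)NumCovered,
                   (unsigned long long)NumGuards);
    });
  for (uint32_t *x = start; x < stop; x++) {
    *x = ++N;
    ++NumGuards;
  }
}

// Called at every instrumented coverage point; zero the guard so each point
// is counted only once.
extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;
  *guard = 0;
  ++NumCovered;
}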

@jonathanmetzman
Contributor

Can you use sancov to count the number of covered functions?

Wouldn't this suffer from a similar problem? If we pick an arbitrary threshold, some legitimate targets will have fewer than that number. Maybe we could be smarter about the threshold by looking at the binary size?

@evanmiller
Contributor

I was thinking that, in the case of counting functions, the threshold would be 1 (or 2). But I don't understand coverage internals very well, and I'm grasping at straws here.

@Dor1s
Contributor

Dor1s commented May 29, 2019

The real problem here with small fuzz targets is resource allocation. To give you an example:

#include <cstddef>  // size_t
#include <cstdint>  // uint8_t
#include <string>
#include <vector>

// Do some computations with 'str', return the result.
// This function contains a bug. Can you spot it?
size_t DoStuff(const std::string &str) {
  std::vector<int> Vec({0, 1, 2, 3, 4});
  size_t Idx = 0;
  if (str.size() > 5)
    Idx++;
  if (str.find("foo") != std::string::npos)
    Idx++;
  if (str.find("bar") != std::string::npos)
    Idx++;
  if (str.find("ouch") != std::string::npos)
    Idx++;
  if (str.find("omg") != std::string::npos)
    Idx++;
  return Vec[Idx];
}

// Simple fuzz target for DoStuff().
// See http://libfuzzer.info for details.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  std::string str(reinterpret_cast<const char *>(data), size);
  DoStuff(str);  // Disregard the output.
  return 0;
}

This fuzz target has ~73 coverage edges (no matter whether we grep or obtain that number by any other means, as @jonathanmetzman pointed out). The current threshold is 100.
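
For reference, the instrumented edge count itself is easy to observe; this is an illustrative sketch rather than how test_all necessarily measures it. Assuming a build with -fsanitize-coverage=inline-8bit-counters and no fuzzing engine linked in (libFuzzer provides this callback itself), the documented __sanitizer_cov_8bit_counters_init hook receives exactly that many counters:

#include <cstdio>

// Documented SanitizerCoverage callback: called once per module built with
// -fsanitize-coverage=inline-8bit-counters, with that module's counter range.
// The size of the range is the number of instrumented edges.
extern "C" void __sanitizer_cov_8bit_counters_init(char *start, char *stop) {
  std::fprintf(stderr, "module instrumented with %ld edges\n",
               (long)(stop - start));
}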

I don't think we should allocate even one CPU-day for such a trivial fuzz target. If we could allocate fractional CPUs (e.g. run such a small fuzz target for just one hour a day), we might let them through, but even then it would be fuzzing done wrong.

Regarding the hacky whitelisting in https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/test_all#L121, I totally agree that we should get rid of it. I'm fine with removing or disabling those fuzz targets altogether. At the current scale of OSS-Fuzz, I don't think these minor, almost useless targets should distract us.

@evanmiller
Contributor

I agree that the marginal value of running a small fuzzing target for the 24th hour is very small. However, in terms of catching regressions, the marginal value of running the target on a new commit for even the first 60 seconds is quite high. These small targets may be sub-grammars buried deep in a complicated file format, and otherwise awkward to fuzz. Perhaps the scheduler should be more intelligent about prioritizing targets based on the marginal likelihood of finding a bug.

@Dor1s
Contributor

Dor1s commented May 29, 2019

Absolutely. That's one of the key elements of our Ideal Integration guidance. Regression testing should happen upstream and/or be embedded into the developers' workflow; from fuzzing, you just need a good corpus. For a trivial fuzz target, a good corpus can be generated very quickly, even without an infrastructure like ClusterFuzz.
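
To make the "corpus as a regression test" part concrete, here is a minimal sketch, not ClusterFuzz's or any project's actual harness: the fuzz target is compiled together with a small standalone driver (the file name here is hypothetical) that replays saved corpus files through LLVMFuzzerTestOneInput under the project's usual sanitizers, so no fuzzing engine is needed in CI.

// standalone_replay.cc (hypothetical name): feeds each file passed on the
// command line to the fuzz target entry point, so a saved corpus can run as a
// plain regression test.
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);

int main(int argc, char **argv) {
  for (int i = 1; i < argc; i++) {
    std::FILE *f = std::fopen(argv[i], "rb");
    if (!f) {
      std::perror(argv[i]);
      return 1;
    }
    std::fseek(f, 0, SEEK_END);
    long len = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    std::vector<uint8_t> buf(len > 0 ? (size_t)len : 0);
    if (!buf.empty() && std::fread(buf.data(), 1, buf.size(), f) != buf.size()) {
      std::fprintf(stderr, "failed to read %s\n", argv[i]);
      std::fclose(f);
      return 1;
    }
    std::fclose(f);
    LLVMFuzzerTestOneInput(buf.data(), buf.size());
    std::printf("ran %s (%zu bytes)\n", argv[i], buf.size());
  }
  return 0;
}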

@evanmiller
Contributor

That makes sense.

It would help me to have more clarity on what belongs in ClusterFuzz vs what belongs upstream. I was happy to include only the large, "full format" fuzzers until @Google-Autofuzz opened WizardMac/ReadStat#181 requesting the smaller fuzzers be included. Now that I have spent time on those smaller fuzzers, I am being told they are a potential waste of resources.

As a project owner, it would help me if Google-Autofuzz and OSS-Fuzz came up with a coherent policy on acceptable fuzzing targets before making these sorts of requests of project owners. I don't really care which fuzzers are ultimately included, but I do find the current lack of clarity (and the unexplained urgency for including more fuzzing targets) a bit frustrating.

@Dor1s
Contributor

Dor1s commented May 29, 2019

@evanmiller, I'm really sorry for that confusion. We're still adjusting some elements of collaboration with @Google-Autofuzz, so this situation we've put you in is definitely unfortunate.

Also, note that the point about a potential waste of resources is just my opinion; we still need to discuss it with the team, and maybe we'll end up accepting small fuzz targets and lowering the requirements. I'd say stay tuned :)
