cmd/go: uses of os/exec can trigger fuzz failures when stopping via ^C #50228
Comments
I tried to write a reproducer that fails ~100% of the time, but so far I've failed.
/cc @golang/fuzzing
When you hit ^C you're sending SIGINT to the foreground process group, including all the If instead of ^C you use
I don't think this eliminates the race. The fuzz failure can happen before To extend the idea, when the fuzzer is exiting due to a signal, it can suppress any failures that occurred during the last, say, 50ms. That should reduce the race window to ~0 but it seems really hacky. (In general using
I'm not sure I agree. It can be hard to estimate how long one should fuzz for. Sometimes I fuzz for five minutes, and the corpus stops expanding almost entirely. Sometimes I fuzz for twenty minutes, and the corpus keeps expanding. Being able to stop the fuzzing process interactively is helpful.
Good observation, I hadn't thought of that.
From the user's perspective, this solution seems fine. If I'm manually hitting ^C, I really don't care that much about a few hundred milliseconds being lost. That's usually what happens with "cleanup" until the process actually exits, anyway. It may not be a foolproof mechanism, but if it reduces the odds of false positive failures to near-zero, that seems like a big step forward.
Another possibility might just be to start the process in a new process group.
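To make the process-group idea concrete, here is a minimal sketch under one reading of it: the fuzz target starts its child via os/exec in a separate process group, so a terminal ^C (which signals only the foreground group) never reaches the child. The helper name `newGroupCommand` is hypothetical, and the sketch is Unix-only because it relies on `syscall.SysProcAttr.Setpgid`. It does not show what cmd/go does or would do.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newGroupCommand (hypothetical helper) builds a command whose child is
// placed in a new process group, so SIGINT sent to the terminal's
// foreground group is not delivered to it. Unix-only.
func newGroupCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	return cmd
}

func main() {
	cmd := newGroupCommand("/bin/sleep", "1")
	if err := cmd.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	// With Setpgid set and no explicit Pgid, the child becomes the
	// leader of a group whose ID equals its own PID.
	pgid, err := syscall.Getpgid(cmd.Process.Pid)
	if err != nil {
		fmt.Println("getpgid failed:", err)
	} else {
		fmt.Println("child leads its own group:", pgid == cmd.Process.Pid)
	}
	_ = cmd.Wait()
}
```

The trade-off is that the child also no longer receives ^C on its own, so the parent becomes responsible for stopping it when the run is interrupted.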
Which process? The
This is a bit complicated. I thought we explicitly recommended against using fuzzing in this way. We don't, but we probably should. With the way the fuzzing engine is designed, fuzzing things outside of the runtime is likely to have strange behaviors beyond this one anyway. Throwing away things that happen in some interval before exiting could work, but it would also end up masking some real failures. I also cannot think of an elegant implementation of this that doesn't simply rely on buffering findings, which would end up adding a not insignificant amount of complexity. Given that this is a pretty narrow issue, and that we don't have all that much time left, I think for 1.18 we should just document that this is not a recommended usage of the fuzzer and is likely to lead to such issues, and consider whether there is a larger structural change we could make for 1.19.
What exactly do "this way" and "fuzzing things outside of the runtime" mean? (IOW, what are you proposing to recommend against for Go 1.18?) Do you have examples of the "other strange behaviors"?
As long as the suppression window is a lot smaller than the fuzzing duration, this doesn't seem significant. For example, suppose you suppress failures from the last 50ms of the fuzzing run. In this scenario, running the fuzzer for 5 seconds would be equivalent to running the fuzzer for 4.95s without suppression. Also, you wouldn't suppress results when exiting due to
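The arithmetic in the comment above can be sketched as a small predicate. This is an illustration only, not the fuzzing engine's code: the function name `suppressed` and the choice to express times as offsets from the start of the run are assumptions made for the example, and the check is meant to apply only when the run ends because of a signal.

```go
package main

import (
	"fmt"
	"time"
)

// suppressed (hypothetical) reports whether a failure should be dropped
// because it happened inside the final window before an interrupt, or
// after the interrupt. All arguments are offsets from the start of the run.
func suppressed(failedAt, interruptedAt, window time.Duration) bool {
	return failedAt >= interruptedAt-window
}

func main() {
	const window = 50 * time.Millisecond
	interruptedAt := 5 * time.Second // ^C arrives five seconds into the run

	for _, failedAt := range []time.Duration{
		4940 * time.Millisecond, // before the window
		4960 * time.Millisecond, // inside the window
		5010 * time.Millisecond, // after the interrupt
	} {
		fmt.Printf("failure at %v: suppressed=%t\n",
			failedAt, suppressed(failedAt, interruptedAt, window))
	}
}
```

With a 50ms window and an interrupt at 5s, only failures before the 4.95s mark are reported, which matches the "equivalent to 4.95s without suppression" estimate.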
Given finding crashers is rare, rather than continually buffering, would it be reasonable to instead effectively pause when a crasher is found but before reporting it? In other words, the new behavior could kick in only when it would have been about to report a crasher? Or maybe that’s what you were already saying…
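One way to picture the "pause before reporting" idea is the sketch below. It assumes a hypothetical helper `confirmCrasher` that is called only at the moment a crasher would be reported: it waits a short grace period and discards the crasher if an interrupt shows up in that time. Nothing here reflects the actual fuzzing engine; the 50ms grace period is borrowed from the discussion above.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"time"
)

// confirmCrasher (hypothetical) returns true if no interrupt arrives
// within the grace period, meaning the crasher should be reported.
// It returns false if the interrupted channel is closed first.
func confirmCrasher(interrupted <-chan struct{}, grace time.Duration) bool {
	select {
	case <-interrupted:
		return false
	case <-time.After(grace):
		return true
	}
}

func main() {
	// ctx is cancelled when the process receives SIGINT (^C).
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	// Pretend a crasher was just found; pause briefly before reporting it.
	if confirmCrasher(ctx.Done(), 50*time.Millisecond) {
		fmt.Println("no interrupt within the grace period: report the crasher")
	} else {
		fmt.Println("interrupted: discard the crasher")
	}
}
```

Because the wait happens only when a crasher is about to be reported, the common path pays no cost and no findings need to be buffered continuously.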
And sorry for the double post, but just one more quick comment to agree with @mvdan that it would be nice to handle this gracefully eventually. It can be helpful to be able to execute an external program for comparative fuzzing, even if you don’t get coverage with the external binary. For example, from the go-fuzz-corpus: https://github.com/dvyukov/go-fuzz-corpus/blob/master/gotypes/main.go#L225 which found a bunch of real bugs comparing go/types, cmd/compile, and gccgo, where only go/types had coverage instrumentation in that example (I think). All that said, it seems reasonable to do whatever is expedient for 1.18, which might just be a note in the documentation or similar.
In this case, I hit ^C a.k.a. SIGINT after about one second. I can reproduce this kind of error about 20% of the time; I was able to reproduce with this minimal fuzz func on the fourth try, for instance.
This is a minimal reproducer, using /bin/sleep for the sake of simplicity. My real fuzz func calls a reasonable program that exits within milliseconds. But if I stop the fuzzer just at the right moment with ^C, I might get a failure that's not really a relevant failure. The code is here: https://github.com/mvdan/sh/blob/4a4c1600341f6413719f5e8a50105b8d13a96d6d/syntax/fuzz_test.go#L82-L94
I don't think my code is to blame here. I could possibly expand my err != nil check to ignore "signal: interrupt" errors, but that doesn't seem right either - any use of os/exec would need to also be adapted to handle such a case.
I think the solution should come from cmd/go's fuzzing - if the user hits ^C, any fuzz failures that happen afterwards should be ignored entirely.
cc @katiehockman
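For reference, the per-call-site workaround that the report argues should not be necessary might look roughly like this. It is a sketch, not code from the linked repository: `killedByInterrupt` is a hypothetical helper, and it is Unix-only because it inspects `syscall.WaitStatus`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// killedByInterrupt (hypothetical) reports whether err comes from a
// child process that was terminated by SIGINT, which os/exec surfaces
// as an error whose message is "signal: interrupt". Unix-only.
func killedByInterrupt(err error) bool {
	var exitErr *exec.ExitError
	if !errors.As(err, &exitErr) {
		return false
	}
	ws, ok := exitErr.Sys().(syscall.WaitStatus)
	return ok && ws.Signaled() && ws.Signal() == syscall.SIGINT
}

func main() {
	cmd := exec.Command("/bin/sleep", "5")
	if err := cmd.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	// Simulate the ^C reaching the child while it is still running.
	_ = cmd.Process.Signal(os.Interrupt)
	err := cmd.Wait()
	fmt.Println("wait error:", err, "interrupted:", killedByInterrupt(err))
}
```

A fuzz target would have to wrap every os/exec call in a check like this, which is the burden the report suggests cmd/go should remove by ignoring failures that occur after ^C.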