Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing: fuzzed function can corrupt internal state of fuzzer #48606

Open
stevenjohnstone opened this issue Sep 24, 2021 · 11 comments
Open

testing: fuzzed function can corrupt internal state of fuzzer #48606

stevenjohnstone opened this issue Sep 24, 2021 · 11 comments

Comments

@stevenjohnstone
Copy link

@stevenjohnstone stevenjohnstone commented Sep 24, 2021

What version of Go are you using (go version)?

$ go version
go version devel go1.18-d4139083204 Fri Sep 24 07:22:13 2021 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/stevie/.cache/go-build"
GOENV="/home/stevie/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/stevie/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/stevie/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/stevie/sdk/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/stevie/sdk/gotip/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel go1.18-d4139083204 Fri Sep 24 07:22:13 2021 +0000"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/stevie/code/corrupt/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1167654317=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran the following:

package corrupt

import "testing"

func FuzzCorrupt(f *testing.F) {
	f.Fuzz(func(t *testing.T, input []byte) {
		defer func() {
			for i := range input {
				input[i] = 0
			}
		}()

		if len(input) >= 2 && input[0] == 0 && input[1] == 1 {
			t.Fatalf("input %v", input)
		}

	})
}
$ gotip test -fuzz=FuzzCorrupt
warning: starting with empty corpus
fuzz: elapsed: 0s, execs: 0 (0/sec), interesting: 0
fuzz: minimizing 50-byte crash input...
FAIL
fuzz: elapsed: 0s, execs: 745 (16223/sec), interesting: 2
--- FAIL: FuzzCorrupt (0.05s)
        --- FAIL: FuzzCorrupt (0.00s)
            fuzz_test.go:14: input [0 1 0 84 84 84 0 174]
        --- FAIL: FuzzCorrupt (0.00s)
            fuzz_test.go:14: input [0 1 0 84 84 84 0 174]
    
    Crash written to testdata/fuzz/FuzzCorrupt/3be4f06197d4ae9580e884318bf125a616a7cd7eac35346c1e5c943218d8d00f
    To re-run:
    go test corrupt -run=FuzzCorrupt/3be4f06197d4ae9580e884318bf125a616a7cd7eac35346c1e5c943218d8d00f
FAIL
exit status 1
FAIL	corrupt	0.052s
$ cat testdata/fuzz/FuzzCorrupt/3be4f06197d4ae9580e884318bf125a616a7cd7eac35346c1e5c943218d8d00f 
go test fuzz v1
[]byte("\x00\x00\x00\x00\x00\x00\x00\x00")

The "crasher" in the testdata directory does not cause a crash.

Running with minimization disabled also reveals an issue:

$ gotip test -fuzz=FuzzCorrupt -fuzzminimizetime=0
warning: starting with empty corpus
fuzz: elapsed: 0s, execs: 0 (0/sec), interesting: 0
FAIL
FAIL
fuzz: elapsed: 0s, execs: 45 (1354/sec), interesting: 4
--- FAIL: FuzzCorrupt (0.03s)
        --- FAIL: FuzzCorrupt (0.00s)
            fuzz_test.go:14: input [0 1 0 0 0 0 0 0 0 0 0 0 208 230 47 47 47 47 47 47 47 208 208 208 128 0 0]
    
    Crash written to testdata/fuzz/FuzzCorrupt/c0e98f376324bb6d09edf7d1d9d054ca8bd1af158ba4ce46fafd5f96daa76e9f
    To re-run:
    go test corrupt -run=FuzzCorrupt/c0e98f376324bb6d09edf7d1d9d054ca8bd1af158ba4ce46fafd5f96daa76e9f
FAIL
exit status 1
FAIL	corrupt	0.040s
$ gotip test corrupt -run=FuzzCorrupt/c0e98f376324bb6d09edf7d1d9d054ca8bd1af158ba4ce46fafd5f96daa76e9f
ok  	corrupt	0.003s
$ cat testdata/fuzz/FuzzCorrupt/c0e98f376324bb6d09edf7d1d9d054ca8bd1af158ba4ce46fafd5f96daa76e9f 
go test fuzz v1
[]byte("\v\x01\x00\x00\x00G\xea\x10\x00\x01\v\xba\xd0\xe6///////\xd0\xd0Ѐ\x00\xe5")

What did you expect to see?

Crashers stored in testdata should trigger crashes.

What did you see instead?

Corpus entry was not a crasher.

@stevenjohnstone
Copy link
Author

@stevenjohnstone stevenjohnstone commented Sep 24, 2021

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Sep 24, 2021

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Sep 24, 2021

Allocating a new []byte on every iteration will slow down fuzzing by creating GC pressure. We could keep two slices though and make changes in both: one kept by the fuzzing engine, one passed to the function being fuzzed.

@katiehockman katiehockman self-assigned this Sep 29, 2021
@gopherbot
Copy link

@gopherbot gopherbot commented Sep 30, 2021

Change https://golang.org/cl/353429 mentions this issue: internal/fuzz: prevent the fuzz fn from corrupting data

@katiehockman
Copy link
Member

@katiehockman katiehockman commented Oct 5, 2021

@jayconrod brought up a good point during review of https://golang.org/cl/353429 that makes a case for why we shouldn't be allowing the fuzzed functions to alter the underlying data that is provided by the fuzzing engine. Namely, he brought up that this will be next to impossible when structs are supported in the future. He noted that we would need to allocate a new struct every time the fuzz function is called, which isn't tenable for performance (fuzzing has to be fast to be effective).

One potential option that I brought up is that we support this for []byte only, but not for structs, but special-casing like this makes the code more difficult to maintain and makes things like documentation less straightforward.

This may be something that can be supported in the future after more thought has gone into it, but for now, it sounds like the best course of action is to make it a requirement that the fuzz target is not allowed to alter the underlying data during fuzzing. It's easier to loosen the restriction in future releases then to design ourselves into a corner today.

If folks have good examples of cases where functions like this should be fuzzable, please comment here.

@katiehockman
Copy link
Member

@katiehockman katiehockman commented Oct 5, 2021

As far as how to make this a requirement, I would say that documenting it in (*testing.F).Fuzz is a reasonable step for now. We could certainly implement something that panics/crashes if the underlying data does change at any point, but I don't think that's strictly necessary for 1.18, though that's something we could add in the future.

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 5, 2021

Change https://golang.org/cl/353977 mentions this issue: testing: document f.Fuzz requirement to not change underlying data

@stevenjohnstone
Copy link
Author

@stevenjohnstone stevenjohnstone commented Oct 5, 2021

@katiehockman I wonder how we can enforce that only functions which don't modify their arguments can be fuzzed? How do I know I haven't done that by mistake? #47587 is an example in the fuzzing code. This bug itself could be considered to be another. I also have doubts that I'd be able to vouch for all the dependencies of a function under test. I've written a few bugs where I meant to copy data but didn't. It seems to be common enough to end up in coding guidelines: https://github.com/uber-go/guide/blob/master/style.md#copy-slices-and-maps-at-boundaries.

This is all fine though if there are tools to discover that arguments have been changed. I'd be really happy with auxiliary functions in the testing package which would give lightweight checksums of arguments which are friendly to the performance requirements of the fuzzer. In vague pseudo-code

c := testing.Checksum(arg)
func(arg)
if testing.Checksum(arg) != c {
    panic("argument modified")
}

I'd be okay with writing my own functions for that but I'm really keen to advocate fuzzing and making it as easy as possible would help there.

@katiehockman
Copy link
Member

@katiehockman katiehockman commented Oct 5, 2021

@stevenjohnstone I think the way you described would be a workable approach for this. @jayconrod had a similar suggestion: "We could potentially enforce this by hashing the data before and after each call and stopping if the hashes don't match. Hashing should be at least marginally faster than allocating and garbage collecting."

I agree that we should do something like this if we have the time to do so. At the very least, we can make this a strong nice-to-have. It's a bit unclear how much effort would be involved in implementing this, so I didn't want to make it a release-blocker for 1.18 yet, especially given the fact that there are still several other bug fixes and quality-of-life improvements we should make before the cut. I'll take a closer look though and see if it's reasonable to implement for 1.18.

Thanks for the feedback 👍

gopherbot pushed a commit that referenced this issue Oct 5, 2021
Updates #48606

Change-Id: I6d555fdefccd842fb65ec8d630b4808bcb54a825
Reviewed-on: https://go-review.googlesource.com/c/go/+/353977
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
@katiehockman
Copy link
Member

@katiehockman katiehockman commented Oct 5, 2021

I actually think this should be pretty straightforward. I'm working on this now, so hopefully can send a patch shortly.

@katiehockman
Copy link
Member

@katiehockman katiehockman commented Oct 5, 2021

Looking into this more, we should fix #48804 before we fix this, since we need that fixed for the error messages to be handled and reported correctly. But once that's done, I think this should be pretty straightforward.

My biggest concern with the hash approach is that we can't hash the []interface{} that's passed to the fuzz function without first marshaling it into a []byte. That means that we would need to marshal the data twice every time we call the fuzz function (once before we call it, and once after), which I suspect would slow things down. Ideally we can do some performance checks before we merge something like that to make sure it doesn't cause a notable regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants