Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dev.fuzz] Crash data written to corpus is incorrect #47587

Closed
stevenjohnstone opened this issue Aug 7, 2021 · 11 comments
Closed

[dev.fuzz] Crash data written to corpus is incorrect #47587

stevenjohnstone opened this issue Aug 7, 2021 · 11 comments
Labels
FrozenDueToAge fuzz Issues related to native fuzzing support NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@stevenjohnstone
Copy link
Contributor

What version of Go are you using (go version)?

$ go version
go version devel go1.17-2a0825d01f Tue Jul 20 00:06:06 2021 +0000 linux/amd64

Does this issue reproduce with the latest release?

n/a

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/stevie/.cache/go-build"
GOENV="/home/stevie/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/stevie/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/stevie/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/stevie/sdk/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/stevie/sdk/gotip/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel go1.17-2a0825d01f Tue Jul 20 00:06:06 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/stevie/unicode/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build426468616=/tmp/go-build -gno-record-gcc-switches"

What did you do?

// +build gofuzzbeta

package unicode

import (
	"fmt"
	"strings"
	"testing"
)

func FuzzBeta(f *testing.F) {
	f.Fuzz(func(t *testing.T, data []byte) {
		s := string(data)
		valid := strings.ToValidUTF8(s, "")
		if len(valid) == 0 {
			return
		}

		lower := strings.ToLower(valid)
		upper := strings.ToUpper(lower)

		roundtrip := strings.ToLower(upper)

		if roundtrip != lower {
			panic(fmt.Sprintf("%x: %v (%x): %v (%x) != %v (%x), upper = %v (%x)", data, valid, []byte(valid), roundtrip, []byte(roundtrip), lower, []byte(lower), upper, []byte(upper)))
		}
	})
}

Run the fuzzer and it finds (correctly) unicode examples which are modified in a strings.ToUpper strings.ToLower roundtrip

$ gotip test -fuzz=Fuzz
found a crash, minimizing...
fuzzing, elapsed: 0.1s, execs: 995 (7461/sec), workers: 8, interesting: 16
--- FAIL: FuzzBeta (0.13s)
        panic: c2b5: µ (c2b5): μ (cebc) != µ (c2b5), upper = Μ (ce9c)
        goroutine 264 [running]:
        runtime/debug.Stack()
        	/home/stevie/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
        testing.tRunner.func1.2({0x5a8b60, 0xc006707ab0})
        	/home/stevie/sdk/gotip/src/testing/testing.go:1281 +0x267
        testing.tRunner.func1()
        	/home/stevie/sdk/gotip/src/testing/testing.go:1288 +0x218
        panic({0x5a8b60, 0xc006707ab0})
        	/home/stevie/sdk/gotip/src/runtime/panic.go:1038 +0x215
        github.com/stevenjohnstone/unicode.FuzzBeta.func1(0x0, {0xc00013e4c0, 0x2, 0x40})
        	/home/stevie/unicode/fuzz_test.go:25 +0x3e8
        reflect.Value.call({0x5aa780, 0x5e36e0, 0x13}, {0x5d5c43, 0x4}, {0xc0067144b0, 0x2, 0x2})
        	/home/stevie/sdk/gotip/src/reflect/value.go:543 +0xf0d
        reflect.Value.Call({0x5aa780, 0x5e36e0, 0xc006703930}, {0xc0067144b0, 0x2, 0x2})
        	/home/stevie/sdk/gotip/src/reflect/value.go:339 +0x13e
        testing.(*F).Fuzz.func1.1(0xc0002080c0)
        	/home/stevie/sdk/gotip/src/testing/fuzz.go:377 +0x1c6
        testing.tRunner(0xc00673b6c0, 0xc006726a80)
        	/home/stevie/sdk/gotip/src/testing/testing.go:1335 +0x102
        created by testing.(*F).Fuzz.func1
        	/home/stevie/sdk/gotip/src/testing/fuzz.go:366 +0x505
        
        --- FAIL: FuzzBeta (0.00s)
    
    Crash written to testdata/corpus/FuzzBeta/e2422ad459420d437cced43ea44570336d2ccae8d700406a987c1b59cab89e23
    To re-run:
    go test github.com/stevenjohnstone/unicode -run=FuzzBeta/e2422ad459420d437cced43ea44570336d2ccae8d700406a987c1b59cab89e23
FAIL
exit status 1
FAIL	github.com/stevenjohnstone/unicode	0.139s

When I run the suggested command to re-run the crasher, the test passes:

$ gotip test github.com/stevenjohnstone/unicode -run=FuzzBeta/e2422ad459420d437cced43ea44570336d2ccae8d700406a987c1b59cab89e23
ok  	github.com/stevenjohnstone/unicode	0.003s

Looking at the contents of the corpus file I see

$ cat testdata/corpus/FuzzBeta/e2422ad459420d437cced43ea44570336d2ccae8d700406a987c1b59cab89e23
go test fuzz v1
[]byte("\xb5\xb5")

The panic message shows that the actual input was "\xc2\xb5". It appears that incorrect data is written to the corpus?

What did you expect to see?

I expect the re-run command to result in a test failure. I expected the corpus file to contain the input which causes a panic.

What did you see instead?

Incorrect data in the corpus

@seankhliao seankhliao added fuzz Issues related to native fuzzing support NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 7, 2021
@stevenjohnstone
Copy link
Contributor Author

When I disable crasher minimization with this patch to the toolchain

diff --git a/src/internal/fuzz/worker.go b/src/internal/fuzz/worker.go
index e3029bcd66..fca250077d 100644
--- a/src/internal/fuzz/worker.go
+++ b/src/internal/fuzz/worker.go
@@ -206,14 +206,8 @@ func (w *worker) coordinate(ctx context.Context) error {
 
                case crasher := <-w.coordinator.minimizeC:
                        // Received input to minimize from coordinator.
-                       minRes, err := w.minimize(ctx, crasher)
-                       if err != nil {
-                               // Failed to minimize. Send back the original crash.
-                               fmt.Fprintln(w.coordinator.opts.Log, err)
-                               minRes = crasher
-                               minRes.minimized = true
-                       }
-                       w.coordinator.resultC <- minRes
+                       crasher.minimized = true
+                       w.coordinator.resultC <- crasher
                }
        }
 }

the problem goes away.

Interestingly, it's always the penultimate byte of the corpus entry which is incorrect and is always equal to the last byte.

@stevenjohnstone
Copy link
Contributor Author

The bug seems to be caused by assigning a byte slice to the collection of minimized inputs which can be modified after assignment by minimizeBytes. This should be a copy of the data rather than a reference. This patch fixes the problem for me:

diff --git a/src/internal/fuzz/worker.go b/src/internal/fuzz/worker.go
index e3029bcd66..bab0f07036 100644
--- a/src/internal/fuzz/worker.go
+++ b/src/internal/fuzz/worker.go
@@ -787,7 +787,7 @@ func (ws *workerServer) minimizeInput(ctx context.Context, vals []interface{}, c
                case []byte:
                        switch prev.(type) {
                        case []byte:
-                               vals[valI] = c
+                               vals[valI] = append([]byte{}, c...)
                        case string:
                                vals[valI] = string(c)
                        default:

stevenjohnstone added a commit to stevenjohnstone/go that referenced this issue Aug 7, 2021
In workerServer.minimizeInput, the callback "tryMinimize" can receive
byte slices which will be modified after the callback returns. Similarly
to the case where the "candidate" is a string, make a copy of the byte
slice.

Fixes golang#47587
@gopherbot
Copy link

Change https://golang.org/cl/340630 mentions this issue: [dev.fuzz] internal/fuzz: avoid corruption of minimized crashers

@katiehockman katiehockman added this to the Go1.18 milestone Sep 7, 2021
@katiehockman katiehockman self-assigned this Sep 7, 2021
@katiehockman katiehockman added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 7, 2021
@katiehockman
Copy link
Contributor

/cc @golang/fuzzing

@katiehockman
Copy link
Contributor

I can reproduce this, and confirm that your fix stops the issue from happening, but I'm not actually seeing exactly why this bug is occurring in the first place. In theory, I can see how assigning vals[valI] the way that it does would allow it to be modified later on, but I also don't see anywhere in the code where it is actually modifying it.

I'd like to figure that out before we merge the fix.

@katiehockman
Copy link
Contributor

Actually I think I know what is going on. I think the fix is still a reasonable one, but here is my theory for how the memory is actually changing later on:

My suspicion is that https://github.com/golang/go/blob/dev.fuzz/src/internal/fuzz/minimize.go#L38 is the root of how this is happening. We use this same tmp slice every time, pointing to the same slots in memory. We set candidate to tmp every iteration of the loop. Let's say that one iteration finds a smaller input that works: vals[valI] will be set to candidate which is actually just pointing to tmp. Even though we create a new candidate in the next iteration of the loop, we are still setting candidate to the same tmp that we used before, thus the same memory that vals[valI] is currently pointing to. This next iteration, we change candidate, which changes tmp, which changes vals[valI], and there's our issue.

stevenjohnstone added a commit to stevenjohnstone/go that referenced this issue Sep 7, 2021
In workerServer.minimizeInput, the callback "tryMinimize" can receive
byte slices which will be modified after the callback returns. Similarly
to the case where the "candidate" is a string, make a copy of the byte
slice.

Fixes golang#47587
@stevenjohnstone
Copy link
Contributor Author

@katiehockman I think it's possible to corrupt candidates in the minimization process in the function under test even with the proposed fix. There are no guarantees that the function will not modify the contents of a byte slice. For example,

// +build gofuzzbeta

package unicode

import (
	"fmt"
	"strings"
	"testing"
)

func FuzzBeta(f *testing.F) {
	f.Fuzz(func(t *testing.T, data []byte) {
		s := string(data)
		// trash the input data
		for i := range data {
			data[i] = 'z'
		}
		valid := strings.ToValidUTF8(s, "")
		if len(valid) == 0 {
			return
		}

		lower := strings.ToLower(valid)
		upper := strings.ToUpper(lower)

		roundtrip := strings.ToLower(upper)

		if roundtrip != lower {
			panic(fmt.Sprintf("%x: %v (%x): %v (%x) != %v (%x), upper = %v (%x)", data, valid, []byte(valid), roundtrip, []byte(roundtrip), lower, []byte(lower), upper, []byte(upper)))
		}
	})
}

will crash but the input recorded in the corpus does not cause a crash due the write

		// trash the input data
		for i := range data {
			data[i] = 'z'
		}

I have a fix for this. Should I add it to #47591 / golang.org/cl/340630?

@katiehockman
Copy link
Contributor

@stevenjohnstone That's a good point. I tested it locally and it definitely had some strange behavior.

It might be worth arguing though that this isn't a case that we actually have to make work, at least not right now. It seems very reasonable that we can inform developers that changing the bytes that are sent into the f.Fuzz function can cause the fuzzing engine to misbehave. They shouldn't really be doing that anyway, because it can mess with the efficiency of the fuzzing engine even if it didn't cause other problems. I would hold off on that for now.

@gopherbot
Copy link

Change https://golang.org/cl/348610 mentions this issue: [dev.fuzz] internal/fuzz: avoid incorrect bytes modification during minimization

@stevenjohnstone
Copy link
Contributor Author

@katiehockman I'm inclined to agree that it's likely to be a niche problem. Unfortunately, we aren't always in control of the code we fuzz. Even having the source, it may not be straightforward to tell if a function modifies its arguments. My use case for fuzzing is often to do due diligence on third-party dependencies and I've seen bugs where the author forgets that a byte slice references shared data and not a copy like a string.

The first indication you'd get of this problem would be the fuzzer finding a crasher and then losing it after minimization fails. Perhaps a mitigation to this and any future minimization bugs is to keep the original crash and a minimized version?

PS: probably a good thing to add to a fuzzing guide that the test could do

f.Fuzz(t *testing.T, b []byte) {
   tmp := append([]byte, b...)
   functionUnderTest(tmp)
   if !bytes.Equal(tmp, b) {
        t.Fatalf("input modified %v != %v", tmp, b)
   }
})

gopherbot pushed a commit that referenced this issue Sep 9, 2021
…inimization

During minimization, the "canonical inputs" (vals) are updated
as viable minimized values are found. Previously, these bytes
could be changed later during minimization. This patch updates
the minimization code to revert the bytes back when a candidate
doesn't pass the minimization checks.

Another approach was in CL 340630 which would make a new allocation
each time a candidate was attempted. This will get very expensive
very quickly, as minimization can run several thousand times for every
new crash and every newly discovered interesting input.

Credit to Steven Johnstone (steven.james.johnstone@gmail.com) for the
"single_bytes" test which was added to minimize_test.go.

Fixes #47587

Change-Id: Ibd12f73458ed812bab7d3f1d4118854a54fc4d0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/348610
Trust: Katie Hockman <katie@golang.org>
Trust: Jay Conrod <jayconrod@google.com>
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
@katiehockman
Copy link
Contributor

This should be fixed now. LMK if you run into this again.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge fuzz Issues related to native fuzzing support NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
Status: No status
Development

No branches or pull requests

4 participants