Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing: racey access to t.failed #24438

Closed
nhooyr opened this issue Mar 18, 2018 · 11 comments
Closed

testing: racey access to t.failed #24438

nhooyr opened this issue Mar 18, 2018 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@nhooyr
Copy link
Contributor

nhooyr commented Mar 18, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/nhooyr/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/nhooyr/Programming/gopath"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.10/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.10/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/7b/218sfv615xxf_w9ttnpjt0_r0000gn/T/go-build699377598=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

package foo

import "testing"

func TestBar(t *testing.T) {
	go func() {
		t.Error("meow")
	}()
}

Run with the race detector.

What did you expect to see?

No race condition.

What did you see instead?

Race condition.

==================
WARNING: DATA RACE
Write at 0x00c4200ca131 by goroutine 7:
  testing.(*common).Fail()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:511 +0xb2
  testing.(*common).Error()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:579 +0x7c
  command-line-arguments.TestBar.func1()
      /Users/nhooyr/Library/Preferences/GoLand2017.3/scratches/scratch_59_test.go:7 +0x88

Previous read at 0x00c4200ca131 by goroutine 6:
  testing.tRunner()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:779 +0x17f

Goroutine 7 (running) created at:
  command-line-arguments.TestBar()
      /Users/nhooyr/Library/Preferences/GoLand2017.3/scratches/scratch_59_test.go:6 +0x4c
  testing.tRunner()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:777 +0x16d

Goroutine 6 (running) created at:
  testing.(*T).Run()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:824 +0x564
  testing.runTests.func1()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:1063 +0xa4
  testing.tRunner()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:777 +0x16d
  testing.runTests()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:1061 +0x4e1
  testing.(*M).Run()
      /usr/local/Cellar/go/1.10/libexec/src/testing/testing.go:978 +0x2cd
  main.main()
      _testmain.go:42 +0x22a
==================
--- FAIL: TestBar (0.00s)
	scratch_59_test.go:7: meow
FAIL
FAIL	command-line-arguments	0.012s

Reason is that the variable read at

if t.failed {
needs to be done while t.mu is rlocked.

@nhooyr
Copy link
Contributor Author

nhooyr commented Mar 18, 2018

Sent https://go-review.googlesource.com/c/go/+/101283 as the fix.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/101283 mentions this issue: testing: fix racey access to t.failed

@nhooyr
Copy link
Contributor Author

nhooyr commented Mar 18, 2018

Actually, not sure if this is an issue as you cannot fail the test after it exits.

In the above output for the code, I would have expected to see a race detected during execution of test message. Any idea why its not there?

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 19, 2018
@ALTree ALTree added this to the Go1.11 milestone Mar 19, 2018
@mark-rushakoff
Copy link
Contributor

I've run into the same race. This appears to be a Go 1.10 regression from Go 1.9.

Note: the docs are clear that t.Error is safe to call concurrently:

The other reporting methods, such as the variations of Log and Error, may be called simultaneously from multiple goroutines.

Here is my test case, which keeps the main test goroutine alive while the background goroutine calls t.Error:

package main

import (
	"testing"
	"time"
)

func TestGoroutineError(t *testing.T) {
	go func() {
		t.Error("error from goroutine")
	}()

	time.Sleep(time.Second)
}

I get this data race from running go test -race ./a_test.go, for go1.10.1 and go tip (go version devel +7b7affa56d Mon Apr 16 16:20:11 2018 +0000 darwin/amd64):

gotip test -race /tmp/a_test.go
==================
WARNING: DATA RACE
Read at 0x00c0000b8131 by goroutine 6:
  testing.tRunner()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:793 +0x16c

Previous write at 0x00c0000b8131 by goroutine 7:
  testing.(*common).Fail()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:514 +0xb9
  testing.(*common).Errorf()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:588 +0x94
  command-line-arguments.TestGoroutineError.func1()
      /tmp/a_test.go:10 +0x60

Goroutine 6 (running) created at:
  testing.(*T).Run()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:838 +0x568
  testing.runTests.func1()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:1081 +0xc9
  testing.tRunner()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:791 +0x155
  testing.runTests()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:1079 +0x4f8
  testing.(*M).Run()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:996 +0x2ee
  main.main()
      _testmain.go:42 +0x221

Goroutine 7 (finished) created at:
  command-line-arguments.TestGoroutineError()
      /tmp/a_test.go:9 +0x4c
  testing.tRunner()
      /Users/mr/gogo/src/github.com/golang/go/src/testing/testing.go:791 +0x155
==================
--- FAIL: TestGoroutineError (1.00s)
	a_test.go:10: error from goroutine
	testing.go:735: race detected during execution of test
FAIL
FAIL	command-line-arguments	1.016s

I do not get a data race with Go 1.9.5:

$ $GOPATH/bin/go1.9.5 test -race /tmp/a_test.go
--- FAIL: TestGoroutineError (1.00s)
	a_test.go:10: error from goroutine
FAIL
FAIL	command-line-arguments	1.023s

$ $GOPATH/bin/go1.9.5 version
go version go1.9.5 darwin/amd64

I used the Go docker images to separately confirm that there is a data race with go1.10.1 linux/amd64 but no data race with go1.9 linux/amd64.

@mark-rushakoff
Copy link
Contributor

Applying https://go-review.googlesource.com/c/go/+/101283 to my copy of the Go source does fix the regression. I don't know why the 32-bit trybots would have failed from that patch.

@mark-rushakoff
Copy link
Contributor

Ping @ianlancetaylor - am I correct in my above comment that this is a 1.9 -> 1.10 regression?

@ianlancetaylor
Copy link
Member

As far as I can tell this can only happen when there is a race condition between the call to t.Error and the main test goroutine returning from the test. I agree that in Go 1.9 this did not cause a warning from the race detector. But it still seems like a race. A time.Sleep, as in your example above, does not stop the race detector from reporting a race. If you use a sync.WaitGroup, on the other hand, you should not see a race report.

So while I think this is likely worth fixing, since the race report isn't useful, it's not obvious to me that this is a regression as such. The test really is racy, and in principle the test could get the "Fail in goroutine after has completed" panic.

If you have an example that is not racy, please let us know.

@mark-rushakoff
Copy link
Contributor

Thanks for the explanation, I'm now satisfied that this is not a 1.9 -> 1.10 regression.

If you use a sync.WaitGroup, on the other hand, you should not see a race report.

That's correct, this code does not trigger the race detector:

package main

import (
	"sync"
	"testing"
)

func TestGoroutineError(t *testing.T) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		t.Error("error from goroutine")
	}()

	wg.Wait()
}

@nhooyr
Copy link
Contributor Author

nhooyr commented May 2, 2018

@ianlancetaylor why do you think the race report is not useful? It lets them know the test is racey.

@ianlancetaylor
Copy link
Member

Yes, the test is racy, but not in a very interesting way.

@nhooyr
Copy link
Contributor Author

nhooyr commented May 2, 2018

They might get the failure in goroutine after test exit panic like you said. Having the race detector also working to report the race makes them aware something is wrong too in case they do not get the panic. Its more protection.

@golang golang locked and limited conversation to collaborators May 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants