Data race if c.Fail is called in a goroutine #43

Closed
davecheney opened this Issue Jun 10, 2015 · 14 comments

Comments

Projects
None yet
3 participants
Contributor

davecheney commented Jun 10, 2015

Consider this test case

package suite

import (
    "sync"
    "testing"

    gc "gopkg.in/check.v1"
)

type TestSuite struct{}

var _ = gc.Suite(&TestSuite{})

func TestTesting(t *testing.T) {
    gc.TestingT(t)
}

func (t *TestSuite) TestRace(c *gc.C) {
    var wg sync.WaitGroup
    start := make(chan bool)

    const n = 2

    wg.Add(n)
    for i := 0; i < n; i++ {
        go func() {
            defer wg.Done()

            <-start

            c.Error("an error occured")
        }()
    }
    close(start)
    wg.Wait()
}

When running the test with -race, the following data race results

==================
WARNING: DATA RACE
Write by goroutine 9:
  gopkg.in/check%2ev1.(*C).Error()
      /home/dfc/src/gopkg.in/check.v1/helpers.go:117 +0x246
  suite.(*TestSuite).TestRace.func1()
      /home/dfc/src/suite/suite_test.go:31 +0x152

Previous write by goroutine 10:
  gopkg.in/check%2ev1.(*C).Error()
      /home/dfc/src/gopkg.in/check.v1/helpers.go:117 +0x246
  suite.(*TestSuite).TestRace.func1()
      /home/dfc/src/suite/suite_test.go:31 +0x152

Goroutine 9 (running) created at:
  suite.(*TestSuite).TestRace()
      /home/dfc/src/suite/suite_test.go:32 +0xd0
  runtime.call32()
      /home/dfc/go/src/runtime/asm_amd64.s:437 +0x44
  reflect.Value.Call()
      /home/dfc/go/src/reflect/value.go:300 +0xd0
  gopkg.in/check%2ev1.(*suiteRunner).forkTest.func1()
      /home/dfc/src/gopkg.in/check.v1/check.go:763 +0x5e3
  gopkg.in/check%2ev1.(*suiteRunner).forkCall.func1()
      /home/dfc/src/gopkg.in/check.v1/check.go:657 +0x83

Goroutine 10 (finished) created at:
  suite.(*TestSuite).TestRace()
      /home/dfc/src/suite/suite_test.go:32 +0xd0
  runtime.call32()
      /home/dfc/go/src/runtime/asm_amd64.s:437 +0x44
  reflect.Value.Call()
      /home/dfc/go/src/reflect/value.go:300 +0xd0
  gopkg.in/check%2ev1.(*suiteRunner).forkTest.func1()
      /home/dfc/src/gopkg.in/check.v1/check.go:763 +0x5e3
  gopkg.in/check%2ev1.(*suiteRunner).forkCall.func1()
      /home/dfc/src/gopkg.in/check.v1/check.go:657 +0x83
==================

----------------------------------------------------------------------
FAIL: suite_test.go:18: TestSuite.TestRace

OOPS: 0 passed, 1 FAILED
suite_test.go:31:
    c.Error("an error occured")
... Error: an error occured

suite_test.go:31:
    c.Error("an error occured")
... Error: an error occured

--- FAIL: TestTesting (0.00s)
FAIL
exit status 1
FAIL    suite   0.018s
Contributor

niemeyer commented Jun 10, 2015

Don't ever call FailNow in gocheck or even in the testing package if you are inside a goroutine. These will call Go's runtime.Goexit, which will not do the right thing if you are inside a goroutine.

@niemeyer niemeyer closed this Jun 10, 2015

Contributor

niemeyer commented Jun 10, 2015

Sorry, I jumped to conclusions too soon. The race exists also without FailNow.

@niemeyer niemeyer reopened this Jun 10, 2015

Contributor

davecheney commented Jun 10, 2015

Please read the example I supplied, nobody is calling FailNow directly, but
the call occurs internally to go check.

On Thu, 11 Jun 2015 09:07 Gustavo Niemeyer notifications@github.com wrote:

Don't ever call FailNow in gocheck or even in the testing package if you
are inside a goroutine. These will call Go's runtime.Goexit, which will not
do the right thing if you are inside a goroutine.


Reply to this email directly or view it on GitHub
#43 (comment).

Contributor

niemeyer commented Jun 10, 2015

I win. ;-)

Contributor

davecheney commented Jun 10, 2015

Lol. Yes, sorry about the crap issue title, it totally buried the lead.

On Thu, 11 Jun 2015 09:10 Gustavo Niemeyer notifications@github.com wrote:

I win. ;-)


Reply to this email directly or view it on GitHub
#43 (comment).

@niemeyer niemeyer changed the title from Data race if c.Fail, or c.FailNow are called in a goroutine to Data race if c.Fail is called in a goroutine Jun 10, 2015

davecheney added a commit to juju/check that referenced this issue Jun 17, 2015

check: fix data race on C.status
Fixes #43
Updates #35

This PR removes a data race on C.status if c.Error or c.Assert fails in a goroutine spawned inside a test suite.

Renaming the variable to `C._status` is regretable, but made the change to the code accessing the variable via helpers smaller.

Tests were _not_ run as they do not pass on recent versions of Go. This fix was verified by asserting that LP 1463643 was resolved.

The gocheck docs should document exactly what concurrent behaviour
is allowed.

If it was up to me, I would explicitly allow FailNow within a goroutine,
but document that it will just end the goroutine in question without actually stopping
the whole test. That way concurrent pieces of the test can be written in the
usual "Assert" style. As far as I'm aware, that actually is the way it works now.

Contributor

niemeyer commented Jun 17, 2015

It feels a bit pointless to "allow" something to happen that has completely unusual and not useful behavior. "Yeah, you can call panic, but it won't work! Hah!"

If I've started a goroutine I don't expect it to be able to influence other
goroutines directly, so for me the the fact that Assert only aborts the
current goroutine is not unusual at all. It feels more unusual to me
that Assert cannot be used in goroutines where other methods on *C
can, despite the documentation not mentioning anything either way.

And of course it's useful for just the same reason that Assert is useful
in other circumstances - you don't need a succession of if statements
that test whether Check has failed.

Strictly speaking, any code that calls methods on a *C instance
concurrently is wrong, because the docs don't mention that it's OK
to call methods concurrently and the Go convention is that this means that
it's not OK. But we do have a bunch of code that does this,
and it's useful, so I'd very much like to see the situation clarified.
I personally think that it would be reasonable to define useful semantics
for all the methods when called concurrently rather than prohibiting
some.

Contributor

davecheney commented Jun 18, 2015

Please read the sample code Rog, I'm only calling c.Error.

I'm sorry I buried the lede by talking about c.FailNow

On 18 Jun 2015, at 17:52, Roger Peppe notifications@github.com wrote:

If I've started a goroutine I don't expect it to be able to influence other
goroutines directly, so for me the the fact that Assert only aborts the
current goroutine is not unusual at all. It feels more unusual to me
that Assert cannot be used in goroutines where other methods on *C
can, despite the documentation not mentioning anything either way.

And of course it's useful for just the same reason that Assert is useful
in other circumstances - you don't need a succession of if statements
that test whether Check has failed.

Strictly speaking, any code that calls methods on a *C instance
concurrently is wrong, because the docs don't mention that it's OK
to call methods concurrently and the Go convention is that this means that
it's not OK. But we do have a bunch of code that does this,
and it's useful, so I'd very much like to see the situation clarified.
I personally think that it would be reasonable to define useful semantics
for all the methods when called concurrently rather than prohibiting
some.


Reply to this email directly or view it on GitHub.

I'm sorry, this has indeed drifted because I was responding to Gustavo's
reply. But technically it's not OK even to call c.Error concurrently
because the documentation does not explicitly say that it's alright
to call methods on *C concurrently (like it's not OK to call methods
concurrently on bufio.Writer). I think that issue is intimately tied
up with this one, because the code was not written assuming concurrent
use, hence the race. If we can agree on the desired semantics, we'll
know what kind of fix to apply.

Contributor

davecheney commented Jun 18, 2015

The testing package handles this case fine. I argue that it is expected that c.Errorf would have the same implicit guarantee.

% cat t_test.go
package t

import (
        "sync"
        "testing"
)

func TestRace(t *testing.T) {
        var wg sync.WaitGroup
        start := make(chan bool)

        const n = 2

        wg.Add(n)
        for i := 0; i < n; i++ {
                go func() {
                        defer wg.Done()

                        <-start

                        t.Error("an error occured")
                }()
        }
        close(start)
        wg.Wait()
}
lucky(~/src/panic) % go test -race
--- FAIL: TestRace (0.00s)
        t_test.go:21: an error occured
        t_test.go:21: an error occured
FAIL
exit status 1
FAIL    panic   0.011s

The testing package should be better documented in this respect too.
Concurrent invocation is a privilege not a right in Go.
Mind you, at least the testing package does document something
with respect to goroutines (and that matches Gustavo's take on it),
for example:

FailNow must be called from the goroutine running the test or benchmark function,
not from other goroutines created during the test. Calling FailNow does not stop
those other goroutines.

I'd argue that's it is less of a burden in the testing package because all tests are explicit
anyway, so the assert style is not so usual.

Contributor

davecheney commented Jun 18, 2015

c.Error does not invoke c.FailNow.

I was trying to suggest that because the stdlib testing documentation explicitly states
FailNow cannot be called in a separate goroutine, that by implication
it's OK for Error to be called that way. That's not the case in gocheck,
and technically I think we're in the wrong by doing that regardless of
whether it invokes FailNow or not.

davecheney added a commit to davecheney/juju that referenced this issue Jun 22, 2015

util/ssh: fix data races in test
Updates LP 1467362

There is one race remaining, that is blocked on go-check/check#43

jujubot added a commit to juju/juju that referenced this issue Jun 22, 2015

Merge pull request #2617 from davecheney/fixedbugs/1467362
util/ssh: fix data races in test

Updates LP 1467362

There is one race remaining, that is blocked on go-check/check#43

(Review request: http://reviews.vapour.ws/r/1991/)

@niemeyer niemeyer closed this in #44 Jun 26, 2015

jujubot added a commit to juju/juju that referenced this issue Jun 28, 2015

Merge pull request #2665 from davecheney/fixedbugs/1467362
update to gopkg/check.v1 rev b3d343032 to pick up fix for go-check/check#43

update to gopkg/check.v1 rev b3d343032 to pick up fix for go-check/check#43

(Review request: http://reviews.vapour.ws/r/2041/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment