runtime/race: misses race in gRPC server because of timing #32682

@pohly

What version of Go are you using (go version)?

$ go version
go version go1.12.6 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/pohly/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/nvme/gopath"
GOPROXY=""
GORACE=""
GOROOT="/nvme/gopath/go"
GOTMPDIR=""
GOTOOLDIR="/nvme/gopath/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/work/grpc-locking/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build250390168=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I want to enable the Go race detector when testing a gRPC server. gRPC calls methods concurrently in different goroutines, so those methods have to protect access to shared data with locks. I got the race detector working (it reported an artificially introduced race), but another race, found via code review, wasn't reported even though the faulty code is invoked by different goroutines (verified via debug output).

I reduced it to a simpler example that works with just a single go test command:
https://github.com/pohly/grpc-locking

To reproduce (more details in the repo's README.md):

  • check out that repo
  • go test -v -race -run=TestGRPC . -args -delay=10ms

The delay is important. It configures the duration of a sleep between starting two goroutines which call the faulty code through gRPC. This simulates the random timing that occurs when different clients call the server remotely.

It's also relevant that gRPC is involved. The race detector always reports the race when the methods are called directly.

What did you expect to see?

I expect to get the race reported regardless of the delay.

What did you see instead?

The race is only reported when the delay is small, like 1ms.
