Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: goroutines that stop after calling runtime.RaceDisable break race detector #60934

Closed
jellevandenhooff opened this issue Jun 22, 2023 · 6 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. RaceDetector
Milestone

Comments

@jellevandenhooff
Copy link
Contributor

jellevandenhooff commented Jun 22, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.5 darwin/arm64

Does this issue reproduce with the latest release?

Yes, reproduces on tip.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/jelle/Library/Caches/go-build"
GOENV="/Users/jelle/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/jelle/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/jelle/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.20.5/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.20.5/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.20.5"
GCCGO="gccgo"
AR="ar"
CC="cc"
CXX="c++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/rr/r3bbgb153sq467ngl1m0q3z00000gn/T/go-build2978985759=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I am using the race detector in a rather tricky setting: I call runtime.RaceDisable and runtime.RaceEnable to have the race detector ignore certain happens-before edges. If a goroutine stops after calling runtime.RaceDisable, but before calling runtime.RaceEnable because it panics, the race detector is broken afterwards.

Calling runtime.RaceDisable causes problems because it tracks state in a per-g field g.raceignore that does not get reset when the g is reused by a later new goroutine.

This program reproduces the issue:

package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	// create several gs that have mismatched RaceDisable/RaceEnable
	for i := 0; i < 100; i++ {
		go func() {
			// make sure gs run in parallel by sleeping
			time.Sleep(10 * time.Millisecond)

			runtime.RaceDisable()
		}()
	}

	// wait for all gs above to finish
	time.Sleep(20 * time.Millisecond)

	// shared variable protected by reading/writing ch
	var x int
	ch := make(chan struct{}, 0)

	go func() {
		x = 1
		ch <- struct{}{}
	}()
	go func() {
		<-ch
		_ = x
	}()

	time.Sleep(20 * time.Millisecond)
	fmt.Println("done")
}

What did you expect to see?

$ go run -race .
done

What did you see instead?

$ go run -race .
==================
WARNING: DATA RACE
Read at 0x00c0000ba008 by goroutine 110:
  main.main.func3()
      /Users/jelle/hack/go-bugs/raceenable/main.go:33 +0x40

Previous write at 0x00c0000ba008 by goroutine 109:
  main.main.func2()
      /Users/jelle/hack/go-bugs/raceenable/main.go:28 +0x34

Goroutine 110 (running) created at:
  main.main()
      /Users/jelle/hack/go-bugs/raceenable/main.go:31 +0x1c4

Goroutine 109 (finished) created at:
  main.main()
      /Users/jelle/hack/go-bugs/raceenable/main.go:27 +0x124
==================
done
Found 1 data race(s)
exit status 66
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 22, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/505055 mentions this issue: runtime: set raceignore to zero when starting a new goroutine

@dmitshur
Copy link
Contributor

CC @golang/runtime.

@dmitshur dmitshur added RaceDetector NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 22, 2023
@dmitshur dmitshur added this to the Backlog milestone Jun 22, 2023
@mknyszek
Copy link
Contributor

Thanks for the investigation and for sending a fix! That does look like a real problem to me.

I think this should probably be backported too, because although I imagine runtime.RaceDisable to be rare, this does cause some pretty serious false positives without any workaround.

@gopherbot please open backport issues for Go 1.20 and Go 1.19.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #60948 (for 1.19), #60949 (for 1.20).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 22, 2023
@dmitshur dmitshur modified the milestones: Backlog, Go1.21 Jun 22, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/505675 mentions this issue: [release-branch.go1.19] runtime: set raceignore to zero when starting a new goroutine

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/505676 mentions this issue: [release-branch.go1.20] runtime: set raceignore to zero when starting a new goroutine

gopherbot pushed a commit that referenced this issue Jun 29, 2023
… a new goroutine

When reusing a g struct the runtime did not reset
g.raceignore. Initialize raceignore to zero when initially
setting racectx.

A goroutine can end with a non-zero raceignore if it exits
after calling runtime.RaceDisable without a matching
runtime.RaceEnable. If that goroutine's g is later reused
the race detector is in a weird state: the underlying
g.racectx is active, yet g.raceignore is non-zero, and
raceacquire/racerelease which check g.raceignore become
no-ops. This causes the race detector to report races when
there are none.

For #60934
Fixes #60948

Change-Id: Ib8e412f11badbaf69a480f03740da70891f4093f
Reviewed-on: https://go-review.googlesource.com/c/go/+/505055
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 48dbb62)
Reviewed-on: https://go-review.googlesource.com/c/go/+/505675
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Bypass: Carlos Amedee <carlos@golang.org>
gopherbot pushed a commit that referenced this issue Jun 29, 2023
… a new goroutine

When reusing a g struct the runtime did not reset
g.raceignore. Initialize raceignore to zero when initially
setting racectx.

A goroutine can end with a non-zero raceignore if it exits
after calling runtime.RaceDisable without a matching
runtime.RaceEnable. If that goroutine's g is later reused
the race detector is in a weird state: the underlying
g.racectx is active, yet g.raceignore is non-zero, and
raceacquire/racerelease which check g.raceignore become
no-ops. This causes the race detector to report races when
there are none.

For #60934
Fixes #60949

Change-Id: Ib8e412f11badbaf69a480f03740da70891f4093f
Reviewed-on: https://go-review.googlesource.com/c/go/+/505055
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 48dbb62)
Reviewed-on: https://go-review.googlesource.com/c/go/+/505676
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
bradfitz pushed a commit to tailscale/go that referenced this issue Jul 15, 2023
When reusing a g struct the runtime did not reset
g.raceignore. Initialize raceignore to zero when initially
setting racectx.

A goroutine can end with a non-zero raceignore if it exits
after calling runtime.RaceDisable without a matching
runtime.RaceEnable. If that goroutine's g is later reused
the race detector is in a weird state: the underlying
g.racectx is active, yet g.raceignore is non-zero, and
raceacquire/racerelease which check g.raceignore become
no-ops. This causes the race detector to report races when
there are none.

Fixes golang#60934

Change-Id: Ib8e412f11badbaf69a480f03740da70891f4093f
Reviewed-on: https://go-review.googlesource.com/c/go/+/505055
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. RaceDetector
Projects
None yet
Development

No branches or pull requests

4 participants