runtime/race: llvm-unstable changes break Go tests with -race: panic: release of handle with refcount 0 #70283

@stapelberg

Go version

go version devel go1.24-583d750fa1 Mon Nov 11 00:08:45 2024 +0000 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/stapelberg/.cache/go-build'
GOENV='/home/stapelberg/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/stapelberg/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/stapelberg/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/stapelberg/upstream-go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/stapelberg/upstream-go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.24-583d750fa1 Mon Nov 11 00:08:45 2024 +0000'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/stapelberg/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/tmp/rep/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4094750678=/tmp/go-build -gno-record-gcc-switches'

What did you do?

While working on a Google-internal software update, I found that tests run with -race started failing with Go toolchains built after Google-internal CL/693898707 (an LLVM import).

Further investigation revealed that tests start failing once the race syso contains LLVM upstream change llvm/llvm-project@4d374479bea4 (“[nfc][tsan] Replace some macros with templates (#114931)”). Building the race syso at commit 4d374479bea4^ (its parent commit) results in a passing test.

I reproduced this using:

~/llvm-project main % g rh 4d374479bea4
~/llvm-project main % (cd compiler-rt/lib/tsan/go && GOAMD64=v3 ./buildgo.sh)
~/go/src master % cp ~/llvm-project/compiler-rt/lib/tsan/go/race_linux_amd64.syso ./runtime/race/internal/amd64v3/race_linux.syso
~/go/src master % ./make.bash
~/go/src master % export PATH=$HOME/go/bin:$PATH
% mkdir /tmp/rep && cd /tmp/rep
/tmp/rep % go mod init rep
/tmp/rep % cat > rep_test.go <<'EOT'
package rep_test

import (
	"os"
	"os/exec"
	"testing"
)

func TestRepro(t *testing.T) {
	if os.Getenv("REEXEC") != "1" {
		cmd := &exec.Cmd{
			Path:   os.Args[0],
			Args:   os.Args,
			Stdin:  os.Stdin,
			Stdout: os.Stdout,
			Stderr: os.Stderr,
			Env:    append(os.Environ(), "REEXEC=1"),
		}
		if err := cmd.Start(); err != nil {
			panic(err)
		}
		cmd.Wait()
	}
}
EOT
/tmp/rep % GOAMD64=v3 go test -race -v

What did you see happen?

The test failed with the following error:

% GOAMD64=v3 go test -race -v
=== RUN   TestRepro
=== RUN   TestRepro
--- PASS: TestRepro (0.00s)
PASS
--- FAIL: TestRepro (1.01s)
panic: release of handle with refcount 0 [recovered]
	panic: release of handle with refcount 0

goroutine 19 [running]:
testing.tRunner.func1.2({0x5f0c40, 0x65e970})
	/home/stapelberg/go/src/testing/testing.go:1706 +0x3eb
testing.tRunner.func1()
	/home/stapelberg/go/src/testing/testing.go:1709 +0x696
panic({0x5f0c40?, 0x65e970?})
	/home/stapelberg/go/src/runtime/panic.go:787 +0x132
os.(*Process).handleTransientRelease(0xc000152040)
	/home/stapelberg/go/src/os/exec.go:168 +0x105
os.(*Process).pidfdWait(0xc000152040)
	/home/stapelberg/go/src/os/pidfd_linux.go:121 +0x612
os.(*Process).wait(0xc000152040)
	/home/stapelberg/go/src/os/exec_unix.go:27 +0x4e
os.(*Process).Wait(...)
	/home/stapelberg/go/src/os/exec.go:358
os/exec.(*Cmd).Wait(0xc000172000)
	/home/stapelberg/go/src/os/exec/exec.go:922 +0xb0
rep_test.TestRepro(0xc000102a80?)
	/tmp/rep/repro_test.go:22 +0x3bf
testing.tRunner(0xc000102a80, 0x62f070)
	/home/stapelberg/go/src/testing/testing.go:1764 +0x226
created by testing.(*T).Run in goroutine 1
	/home/stapelberg/go/src/testing/testing.go:1823 +0x8f3
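
For readers unfamiliar with the message: the panic text suggests a reference-count invariant, where a release is attempted while the counter reads zero. Below is a minimal sketch of that kind of check. The names `handle`, `acquire`, `release`, and `tryRelease` are hypothetical and chosen for illustration; this is a simplified model, not the os package's actual code, and it does not explain why the check fires in the reproduction above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// handle is a hypothetical, simplified stand-in for a reference-counted
// resource. It is NOT the os package's real implementation.
type handle struct {
	refs atomic.Int64
}

// acquire takes one reference on the handle.
func (h *handle) acquire() { h.refs.Add(1) }

// release drops one reference. It panics if the counter already reads
// zero, mirroring the wording of the panic in the trace above.
func (h *handle) release() {
	for {
		n := h.refs.Load()
		if n == 0 {
			panic("release of handle with refcount 0")
		}
		// Retry if another goroutine changed the counter in between.
		if h.refs.CompareAndSwap(n, n-1) {
			return
		}
	}
}

// tryRelease calls release and returns the panic message as a string
// instead of crashing, or "ok" if release succeeded.
func tryRelease(h *handle) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	h.release()
	return "ok"
}

func main() {
	var h handle
	h.acquire()
	fmt.Println(tryRelease(&h)) // balanced release
	fmt.Println(tryRelease(&h)) // release with no reference held
}
```

In this model the invariant depends entirely on the atomic load and compare-and-swap returning correct values, so a check like this can fail either because of an unbalanced release or because the counter is observed incorrectly.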

What did you expect to see?

I expected the test to keep succeeding.

I am not sure if the LLVM upstream change is to blame or if Go is doing something wrong (or a combination of both)…?

Labels: FrozenDueToAge; NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one); compiler/runtime (issues related to the Go compiler and/or runtime).
