What version of Go are you using (go version)?
$ go version
go version go1.21rc2 darwin/arm64
Does this issue reproduce with the latest release?
Yes, it reproduces since this change.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/joaks/Library/Caches/go-build'
GOENV='/Users/joaks/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/joaks/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/joaks/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/joaks/go/src/github.com/golang/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/joaks/go/src/github.com/golang/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21rc2'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/xj/2wbc4_xn293gkz7_6rzxsz5w0000gn/T/go-build3971151895=/tmp/go-build -gno-record-gcc-switches -fno-common'
What did you do?
We have a code generator at Uber that spawns many concurrent child processes that communicate via stdin and stdout. While testing internally with Go 1.21rc2, we noticed the code generator hanging. A minimal runnable repro can be found in this repository: https://github.com/JacobOaks/Go1.21rc2-syscall.forkExec-hanging-repro.
Essentially, we are spinning up a bunch of external processes with stdin & stdout pipes concurrently. Something like (see link above for full repro):
func spawn(binaryPath string, n int) []*client {
	clients := make([]*client, n)
	for i := 0; i < n; i++ {
		clients[i] = newClient(binaryPath)
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		client := clients[i]
		go func() {
			if err := client.start(); err != nil {
				panic("TODO")
			}
			wg.Done()
		}()
	}
	wg.Wait()
	return clients
}

type client struct {
	cmd    *exec.Cmd
	stdout io.ReadCloser
	stdin  io.WriteCloser
}

func newClient(binary string) *client {
	return &client{
		cmd: exec.Command(binary),
	}
}

func (c *client) start() error {
	var err error
	c.stdout, err = c.cmd.StdoutPipe()
	if err != nil {
		return fmt.Errorf("create stdout pipe: %w", err)
	}
	c.stdin, err = c.cmd.StdinPipe()
	if err != nil {
		return fmt.Errorf("create stdin pipe: %w", err)
	}
	if err = c.cmd.Start(); err != nil {
		return fmt.Errorf("run cmd: %w", err)
	}
	return nil
}
}

Attaching delve to the hanging process, we notice the issue occurs in cmd.Start, where syscall.forkExec seems to hang (it is blocked in syscall.readlen):
(dlv) grs
Goroutine 1 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/sema.go:62 sync.runtime_Semacquire (0x1026bf57c) [semacquire]
Goroutine 2 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/proc.go:399 runtime.gopark (0x102695198) [force gc (idle)]
Goroutine 3 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/proc.go:399 runtime.gopark (0x102695198) [GC sweep wait]
Goroutine 4 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/proc.go:399 runtime.gopark (0x102695198) [GC scavenge wait]
Goroutine 5 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/proc.go:399 runtime.gopark (0x102695198) [finalizer wait]
Goroutine 12 - User: /Users/joaks/go/src/github.com/golang/go/src/runtime/sys_darwin.go:24 syscall.syscall (0x1026bfaf8) (thread 18311682) [timer goroutine (idle)]
[6 goroutines]
(dlv) gr 12
Switched from 0 to 12 (thread 18311682)
(dlv) stack
0 0x000000018f884acc in ???
at ?:-1
1 0x00000001026c0b58 in runtime.systemstack_switch
at /Users/joaks/go/src/github.com/golang/go/src/runtime/asm_arm64.s:200
2 0x00000001026b19dc in runtime.libcCall
at /Users/joaks/go/src/github.com/golang/go/src/runtime/sys_libc.go:49
3 0x00000001026bfaf8 in syscall.syscall
at /Users/joaks/go/src/github.com/golang/go/src/runtime/sys_darwin.go:24
4 0x00000001026daa5c in syscall.readlen
at /Users/joaks/go/src/github.com/golang/go/src/syscall/syscall_darwin.go:242
5 0x00000001026d9c30 in syscall.forkExec
at /Users/joaks/go/src/github.com/golang/go/src/syscall/exec_unix.go:217
6 0x00000001026e9628 in syscall.StartProcess
at /Users/joaks/go/src/github.com/golang/go/src/syscall/exec_unix.go:334
7 0x00000001026e9628 in os.startProcess
at /Users/joaks/go/src/github.com/golang/go/src/os/exec_posix.go:54
8 0x00000001026e9340 in os.StartProcess
at /Users/joaks/go/src/github.com/golang/go/src/os/exec.go:111
9 0x00000001026fc534 in os/exec.(*Cmd).Start
at /Users/joaks/go/src/github.com/golang/go/src/os/exec/exec.go:693
10 0x00000001026ff368 in main.(*client).start
at ./server/main.go:105
11 0x00000001026fefa8 in main.spawn.func1
at ./server/main.go:46
12 0x00000001026c3024 in runtime.goexit
at /Users/joaks/go/src/github.com/golang/go/src/runtime/asm_arm64.s:1197
This behavior is flaky and, in our investigation, only appears with Go 1.21rc2 on darwin/arm64.
git bisect indicated this change to be the culprit.
What did you expect to see?
I would expect the program in the linked repro to not hang, as in Go 1.20.
What did you see instead?
It occasionally hangs; see above.