Skip to content

os/exec: parallel Command.Start() can be very slow #46388

@sb10

Description

@sb10

What version of Go are you using (go version)?

$ go version
go version go1.16.3 linux/amd64

Does this issue reproduce with the latest release?

Yes (1.16.4).

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/nfs/users/nfs_s/sb10/.cache/go-build"
GOENV="/nfs/users/nfs_s/sb10/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/nfs/users/nfs_s/sb10/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/nfs/users/nfs_s/sb10/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/software/vertres/installs/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/software/vertres/installs/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2872611298=/tmp/go-build -gno-record-gcc-switches"

What did you do?

$ go run main.go
Serial:
Start 0 took 252.173µs
Start 1 took 221.002µs
Start 2 took 231.627µs
Start 3 took 171.249µs
Start 4 took 223.431µs
Start 5 took 211.772µs
Start 6 took 169.167µs
Start 7 took 203.491µs
Start 8 took 198.665µs
Start 9 took 181.992µs

Parallel (8 cpus, GOMAXPROCS 8):
Start 0 took 282.599µs
Start 1 took 419.762µs
Start 2 took 423.152µs
Start 3 took 834.894µs
Start 9 took 1.081168ms
Start 6 took 2.307089ms
Start 4 took 3.174552ms
Start 8 took 3.222679ms
Start 5 took 7.438996ms
Start 7 took 14.173073ms

Where main.go is https://play.golang.org/p/VN5OqF23nDQ

What did you expect to see?

exec.Command.Start() times be consistent between serial and parallel runs.

What did you see instead?

Parallel exec.Command.Start() times can be around 70 times greater than serial ones.

Setting runtime.GOMAXPROCS() to greater than or equal to the number of concurrent Start()s doesn't help. Setting it to 1 does reduce the effect, but I still get a few starts in the multiple millisecond range.

A runtime trace seems to suggest this is caused by os.StartProcess taking a global mutex lock, I think as explained here:

// Lock synchronizing creation of new file descriptors with fork.
//
// We want the child in a fork/exec sequence to inherit only the
// file descriptors we intend. To do that, we mark all file
// descriptors close-on-exec and then, in the child, explicitly
// unmark the ones we want the exec'ed program to keep.
// Unix doesn't make this easy: there is, in general, no way to
// allocate a new file descriptor close-on-exec. Instead you
// have to allocate the descriptor and then mark it close-on-exec.
// If a fork happens between those two events, the child's exec
// will inherit an unwanted file descriptor.
//
// This lock solves that race: the create new fd/mark close-on-exec
// operation is done holding ForkLock for reading, and the fork itself
// is done holding ForkLock for writing. At least, that's the idea.
// There are some complications.
//
// Some system calls that create new file descriptors can block
// for arbitrarily long times: open on a hung NFS server or named
// pipe, accept on a socket, and so on. We can't reasonably grab
// the lock across those operations.
//
// It is worse to inherit some file descriptors than others.
// If a non-malicious child accidentally inherits an open ordinary file,
// that's not a big deal. On the other hand, if a long-lived child
// accidentally inherits the write end of a pipe, then the reader
// of that pipe will not see EOF until that child exits, potentially
// causing the parent program to hang. This is a common problem
// in threaded C programs that use popen.
//
// Luckily, the file descriptors that are most important not to
// inherit are not the ones that can take an arbitrarily long time
// to create: pipe returns instantly, and the net package uses
// non-blocking I/O to accept on a listening socket.
// The rules for which file descriptor-creating operations use the
// ForkLock are as follows:
//
// 1) Pipe. Does not block. Use the ForkLock.
// 2) Socket. Does not block. Use the ForkLock.
// 3) Accept. If using non-blocking mode, use the ForkLock.
// Otherwise, live with the race.
// 4) Open. Can block. Use O_CLOEXEC if available (Linux).
// Otherwise, live with the race.
// 5) Dup. Does not block. Use the ForkLock.
// On Linux, could use fcntl F_DUPFD_CLOEXEC
// instead of the ForkLock, but only for dup(fd, -1).

But I wondered if there's any other way around this for this particular use case?

Metadata

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAgeNeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.Performance

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions