syscall: StartProcess blocked at acquire lock #26836

Open
wushukai opened this Issue Aug 7, 2018 · 5 comments

wushukai commented Aug 7, 2018

What version of Go are you using (go version)?

go version go1.8.3 linux/amd64

Does this issue reproduce with the latest release?

We have not tested yet.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/workspace"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build598025583=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

Our service starts subprocesses periodically.

We found several goroutines hung while competing for ForkLock.Lock(), and none of them ever acquires it. The stack of one blocked goroutine is listed below:

goroutine 1526643 [semacquire, 8274 minutes]:
sync.runtime_SemacquireMutex(0x1a2aba4)
/usr/local/go/src/runtime/sema.go:62 +0x34
sync.(*Mutex).Lock(0x1a2aba0)
/usr/local/go/src/sync/mutex.go:87 +0x9d
sync.(*RWMutex).Lock(0x1a2aba0)
/usr/local/go/src/sync/rwmutex.go:86 +0x2d
syscall.forkExec(0xce1c2b, 0x7, 0xc426b93410, 0x3, 0x3, 0xc42019f9e8, 0x0, 0x0, 0x0)
/usr/local/go/src/syscall/exec_unix.go:185 +0x1fd
syscall.StartProcess(0xce1c2b, 0x7, 0xc426b93410, 0x3, 0x3, 0xc42019f9e8, 0x2, 0x4, 0xc4244941e0, 0xc42019f9b8)
/usr/local/go/src/syscall/exec_unix.go:240 +0x64
os.startProcess(0xce1c2b, 0x7, 0xc426b93410, 0x3, 0x3, 0xc42019fb90, 0xc426b93440, 0x3, 0x3)
/usr/local/go/src/os/exec_posix.go:45 +0x1a3
os.StartProcess(0xce1c2b, 0x7, 0xc426b93410, 0x3, 0x3, 0xc42019fb90, 0x0, 0x0, 0x28)
/usr/local/go/src/os/exec.go:94 +0x64
os/exec.(*Cmd).Start(0xc426e70580, 0xc42019fc01, 0xc424fca850)
/usr/local/go/src/os/exec/exec.go:359 +0x3d2
os/exec.(*Cmd).Run(0xc426e70580, 0xc424fca850, 0xc426e70580)
/usr/local/go/src/os/exec/exec.go:277 +0x2b

Apart from the blocked goroutines, no other stack contains the function "forkExec".

The problem occurs occasionally in our production services, but I have not found a way to reproduce it reliably.

Please help provide some clues for debugging this problem.
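
For anyone who wants to repeat the check described above (looking for stacks that mention a given function), here is a minimal sketch, not part of the original report. It dumps all goroutine stacks with `runtime.Stack` and counts the goroutines whose trace contains a substring. The helper name `goroutinesMentioning` is hypothetical, and the sketch assumes the dump separates goroutines with blank lines, which is how current Go releases print it.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// goroutinesMentioning (hypothetical helper) dumps the stacks of all
// goroutines and returns how many of them contain substr.
func goroutinesMentioning(substr string) int {
	buf := make([]byte, 1<<16)
	for {
		n := runtime.Stack(buf, true) // true: include every goroutine
		if n < len(buf) {
			buf = buf[:n]
			break
		}
		// The dump was truncated; retry with a larger buffer.
		buf = make([]byte, 2*len(buf))
	}

	count := 0
	// Assumption: each goroutine's trace is separated by a blank line.
	for _, trace := range strings.Split(string(buf), "\n\n") {
		if strings.Contains(trace, substr) {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("goroutines in forkExec:", goroutinesMentioning("syscall.forkExec"))
	fmt.Println("goroutines touching RWMutex:", goroutinesMentioning("sync.(*RWMutex)"))
}
```

Searching for `sync.(*RWMutex)` as well as `forkExec` may help spot a goroutine that took the lock through another path, although a holder that has already returned from its lock call will not show up in any stack.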

@tklauser tklauser changed the title from syscall.StartProcess blocked at acquire lock to syscall: StartProcess blocked at acquire lock Aug 7, 2018

Member

tklauser commented Aug 7, 2018

@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Aug 7, 2018

Contributor

ianlancetaylor commented Aug 7, 2018

What you are describing sounds like a clear bug, but I do not recall any similar reports. The fork lock is only held briefly and I do not know of any code path in which it could be left locked. Is there any way that we can reproduce the problem ourselves? Is it possible that the kernel is sometimes killing a single thread of your program?

wushukai commented Aug 8, 2018

Our servers use a customized kernel based on Linux 4.1. I checked the kernel messages and found nothing that seems related to this problem.

I wrote a simple test program that randomly starts child processes, but I have not reproduced the problem yet.

I have upgraded Go to 1.10.3 and deployed the new build to some of our production servers. I will check whether the problem reproduces with this version.
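
The test program itself is not shown in this thread. As a rough sketch of what such a stress test could look like (an assumption, not the reporter's code), the following launches children concurrently and counts any `Run` call that has not returned within a timeout as stuck. The names `stress` and `result` and the child command `true` are hypothetical choices for illustration.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"time"
)

// result counts how the child launches ended.
type result struct {
	ok, failed, stuck int
}

// stress launches n children concurrently and waits up to timeoutMillis
// milliseconds for each Run call to return. A Run that has not returned
// by then is counted as stuck (for example, blocked before the fork).
func stress(n, timeoutMillis int, name string, args ...string) result {
	var (
		mu  sync.Mutex
		res result
		wg  sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()

			// Buffered so the launcher can finish even if we stop waiting.
			done := make(chan error, 1)
			go func() { done <- exec.Command(name, args...).Run() }()

			var err error
			stuck := false
			select {
			case err = <-done:
			case <-time.After(time.Duration(timeoutMillis) * time.Millisecond):
				stuck = true
			}

			mu.Lock()
			defer mu.Unlock()
			switch {
			case stuck:
				res.stuck++
			case err != nil:
				res.failed++
			default:
				res.ok++
			}
		}()
	}
	wg.Wait()
	return res
}

func main() {
	// "true" is a placeholder child command; substitute the real one.
	fmt.Printf("%+v\n", stress(100, 5000, "true"))
}
```

A non-zero `stuck` count would be the signal to capture a goroutine dump right away, while the hang is still in progress.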

Contributor

crvv commented Aug 8, 2018

ForkLock is an exported variable, so any code can hold it.
Maybe this is not a bug in the standard library but in some other library.
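
To illustrate the point, here is a minimal sketch (not taken from any real library) of how code outside the standard library could produce the reported symptom. It holds `syscall.ForkLock.RLock()` and shows that a `ForkLock.Lock()` call, the same call `forkExec` makes in the stack trace above, cannot proceed until the read lock is released. The function name `writerBlockedWhileReadHeld` is hypothetical, and the goroutine calling `Lock` is a stand-in for `forkExec` rather than a real process start.

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// writerBlockedWhileReadHeld reports whether a ForkLock.Lock() call is
// still waiting after waitMillis milliseconds while this function holds
// ForkLock.RLock().
func writerBlockedWhileReadHeld(waitMillis int) bool {
	// Stand-in for third-party code that takes the read lock and,
	// in the buggy case, never calls RUnlock.
	syscall.ForkLock.RLock()

	acquired := make(chan struct{})
	go func() {
		// Stand-in for forkExec, which takes the write lock before forking.
		syscall.ForkLock.Lock()
		syscall.ForkLock.Unlock()
		close(acquired)
	}()

	blocked := false
	select {
	case <-acquired:
	case <-time.After(time.Duration(waitMillis) * time.Millisecond):
		blocked = true
	}

	// Release the read lock so the writer can finish and the program exits.
	syscall.ForkLock.RUnlock()
	<-acquired
	return blocked
}

func main() {
	fmt.Println("writer blocked while read lock held:", writerBlockedWhileReadHeld(100))
}
```

If some code path skipped the final `RUnlock`, every later `StartProcess` would wait in `sync.(*RWMutex).Lock` indefinitely, and the lock holder would not necessarily appear in any goroutine stack.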

wushukai commented Aug 9, 2018

@crvv
I checked all source code under GOPATH; no other libraries use this lock.
