testing: test with child process sometimes hangs on 1.10; -timeout not respected #24050

Open
psanford opened this Issue Feb 22, 2018 · 7 comments

psanford commented Feb 22, 2018

What version of Go are you using (go version)?

go version go1.10 linux/amd64

What operating system and processor architecture are you using (go env)?

$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/psanford/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/psanford/projects/go"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build953969108=/tmp/go-build -gno-record-gcc-switches"

What did you do?

After upgrading to 1.10 we had one test that started to hang intermittently. The test in question starts a child process which it kills by canceling a context object at the end of the test method. It does not do an explicit cmd.Wait().

Here is a minimal test case that demonstrates the problem:

https://play.golang.org/p/8rq41A5Khsm

I can get this to hang consistently by running it in a bash while loop:

$ while true; do go test -timeout 5s -v -count 1 .; sleep 0.1; done
=== RUN   TestOSExecNoWait
start
done
--- PASS: TestOSExecNoWait (0.01s)
PASS
ok      _/tmp   0.012s
=== RUN   TestOSExecNoWait
start
done
--- PASS: TestOSExecNoWait (0.01s)
PASS
ok      _/tmp   0.012s
=== RUN   TestOSExecNoWait
start
done
--- PASS: TestOSExecNoWait (0.01s)
PASS
ok      _/tmp   0.012s
=== RUN   TestOSExecNoWait
start
done
--- PASS: TestOSExecNoWait (0.01s)
PASS

<hangs here indefinitely>

If I explicitly call cmd.Wait() the test does not hang. If I don't attach the child process' Stdout and Stderr to os.Std{out,err} the test does not hang.
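
For illustration, here is a minimal sketch of the variant that does not hang: the child is started with exec.CommandContext, the context is canceled, and cmd.Wait() then reaps the process before the caller returns. This is not the playground code linked above. The function name startAndReap is hypothetical, and a `sleep` binary on PATH is assumed.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
)

// startAndReap (hypothetical name) starts a long-running child, cancels its
// context so os/exec kills it, and then calls Wait so that no child process
// is left behind when the function returns.
func startAndReap() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	cmd := exec.CommandContext(ctx, "sleep", "60")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		return err
	}

	cancel()          // asks os/exec to kill the child
	return cmd.Wait() // blocks until the child has actually exited
}

func main() {
	// Wait reports an error here because the child was killed, not because
	// the sketch itself failed.
	err := startAndReap()
	fmt.Println("child reaped, Wait returned an error:", err != nil)
}
```

In a test, the same pattern would be cancel() followed by cmd.Wait() at the end of the test function (or in a deferred call), ignoring the expected kill error.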

On 1.9.4 the test does not hang.

It's also interesting that even though I specified -timeout 5s, the test runner hangs forever.

Contributor

crvv commented Feb 23, 2018

> the test runner hangs forever.

It only hangs 60 seconds on my machine.

go test hangs at

done <- cmd.Wait()

And cmd.Wait() is waiting at

_, err := io.Copy(w, pr)

This io.Copy() is reading from the read end of a pipe. It won't return until the write end of the pipe is closed.
The write end is held by the child, so it is closed only when sleep 60 exits. That is why it hangs for 60 seconds.
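
The blocking behavior described here can be reproduced without any child process. The sketch below is only an illustration and is not code from os/exec. It uses os.Pipe directly, and the helper names pipeCopyDemo and finishedWithin are hypothetical. It checks whether io.Copy on the read end has returned before and after the write end is closed.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

// finishedWithin reports whether done was closed within d.
func finishedWithin(done <-chan struct{}, d time.Duration) bool {
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// pipeCopyDemo (hypothetical name) reports whether io.Copy on the read end of
// a pipe returned before the write end was closed, and whether it returned
// afterwards.
func pipeCopyDemo() (before, after bool, err error) {
	pr, pw, err := os.Pipe()
	if err != nil {
		return false, false, err
	}
	defer pr.Close()

	done := make(chan struct{})
	go func() {
		// Returns only at EOF, i.e. once every write end is closed.
		io.Copy(io.Discard, pr)
		close(done)
	}()

	before = finishedWithin(done, 100*time.Millisecond)
	pw.Close()
	after = finishedWithin(done, time.Second)
	return before, after, nil
}

func main() {
	before, after, err := pipeCopyDemo()
	if err != nil {
		fmt.Println("pipe:", err)
		return
	}
	fmt.Println("copy finished before close:", before)
	fmt.Println("copy finished after close:", after)
}
```

On a typical run the first line reports false and the second true. In the issue, the write end is held by the child (and by anything that inherits its stdout), so the copy ends only when all of them have exited.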

func TestOSExecNoWait(t *testing.T) {
...
	cancel()
}

After cancel() returns, the main function also returns.
The child process may or may not have been killed by that point, so the hang is intermittent.

psanford commented Feb 23, 2018

> It only hangs 60 seconds on my machine.

Yes, in the example provided it only hangs for 60 seconds, because the child process exits after that.

In the test where I first found this issue, the child process never exits, so it hangs forever.

Contributor

crvv commented Feb 24, 2018

I don't think this issue is a regression introduced in Go 1.10.
I can reproduce it with Go 1.9, but under a different condition.

This difference was introduced in bd95f88.
bd95f88#diff-acaf53a9cd478507ebbcf85037940b4dL1080

If the go command needs a bytes.Buffer to save the output, os/exec will open a pipe.
And go will hang until the pipe is closed.

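The -timeout flag doesn't help because it is enforced by the test binary, not by the go command, and the test binary has already exited by this point.
There is a testKillTimeout in the go command, but it also hangs at cmd.Wait().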

gopherbot commented Feb 28, 2018

Change https://golang.org/cl/97497 mentions this issue: cmd/go/internal/test: don't wait for pending I/O if child process has gone

Contributor

ianlancetaylor commented Feb 28, 2018

See also #23019.

Member

FiloSottile commented Apr 24, 2018

@gopherbot please open backport tracking issues. This might be a 1.10 regression, or also a 1.9 issue.

gopherbot commented Apr 24, 2018

Backport issue(s) opened: #25042 (for 1.10), #25043 (for 1.9).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

curoverse-bot pushed a commit to curoverse/arvados that referenced this issue Apr 26, 2018

Avoid dangling child procs in test suite.
In go 1.10.1, these seem to make "go test" hang sometimes.

golang/go#24050

No issue #

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tclegg@veritasgenetics.com>

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jun 29, 2018
