os/signal: timeout in TestAllThreadsSyscallSignals on linux-ppc64-buildlet #44193

Open
bcmills opened this issue Feb 9, 2021 · 7 comments

Member

@bcmills bcmills commented Feb 9, 2021

It's not clear to me whether this is a deadlock, a livelock, or just a slow test, but the similarity to #43149 is concerning (CC @AndrewGMorgan @ianlancetaylor).

panic: test timed out after 3m0s

goroutine 5 [running]:
testing.(*M).startAlarm.func1()
	/workdir/go/src/testing/testing.go:1701 +0xcc
created by time.goFunc
	/workdir/go/src/time/sleep.go:180 +0x44

goroutine 1 [chan receive]:
testing.(*T).Run(0xc00009c300, 0x17407a, 0x1c, 0x179b58, 0xf)
	/workdir/go/src/testing/testing.go:1240 +0x280
testing.runTests.func1(0xc00009c180)
	/workdir/go/src/testing/testing.go:1512 +0x78
testing.tRunner(0xc00009c180, 0xc0000a7d58)
	/workdir/go/src/testing/testing.go:1194 +0xd8
testing.runTests(0xc0000b4018, 0x259a60, 0x12, 0x12, 0xc000d45ed867509a, 0x29e8e2495d, 0x25d820, 0x1705ec)
	/workdir/go/src/testing/testing.go:1510 +0x2b4
testing.(*M).Run(0xc0000c6000, 0x0)
	/workdir/go/src/testing/testing.go:1418 +0x1a0
main.main()
	_testmain.go:81 +0x130

goroutine 18 [runnable]:
syscall.runtime_doAllThreadsSyscall(0xc00008a4d0)
	/workdir/go/src/runtime/proc.go:1669 +0x384
syscall.AllThreadsSyscall(0xab, 0x8, 0x0, 0x0, 0xdcc10, 0x229ed0, 0x7a32c)
	/workdir/go/src/syscall/syscall_linux.go:1064 +0xa8
os/signal.TestAllThreadsSyscallSignals(0xc00009c300)
	/workdir/go/src/os/signal/signal_linux_test.go:23 +0x40
testing.tRunner(0xc00009c300, 0x179b58)
	/workdir/go/src/testing/testing.go:1194 +0xd8
created by testing.(*T).Run
	/workdir/go/src/testing/testing.go:1239 +0x264
FAIL	os/signal	180.104s

2021-02-09T18:40:13-e9c9683/linux-ppc64-buildlet

@bcmills bcmills added this to the Go1.16 milestone Feb 9, 2021
Member Author

@bcmills bcmills commented Feb 9, 2021

Tentatively milestoned to Go 1.16 because the function under test is new (https://tip.golang.org/doc/go1.16#syscall).

Contributor

@ceseo ceseo commented Feb 10, 2021

@laboger FYI

Contributor

@AndrewGMorgan AndrewGMorgan commented Feb 10, 2021

Please assign this to me. I don't yet know what might have happened here, but I'll try to figure it out.

Looking at the build.golang.org status page, is this a one-off failure?

Contributor

@ianlancetaylor ianlancetaylor commented Feb 10, 2021

This does seem to have occurred only once.

Contributor

@AndrewGMorgan AndrewGMorgan commented Feb 10, 2021

From a quick look at the code (https://tip.golang.org/src/os/signal/signal_linux_test.go), it looks like the test was still evaluating whether or not it should be skipped when the 3-minute timeout fired.
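For readers following along, here is a minimal sketch of that kind of skip probe. It assumes the check is a single `syscall.AllThreadsSyscall` call that is treated as unsupported when it returns `ENOTSUP`; the `0xab, 0x8` arguments in the trace look consistent with a `prctl` call using `PR_SET_KEEPCAPS` on ppc64, but that reading is an assumption. The helper name `probeAllThreadsSyscall` is hypothetical and is not taken from the test file. Linux only, Go 1.16 or later.

```go
package main

import (
	"fmt"
	"syscall"
)

// prSetKeepCaps is the Linux prctl(2) option PR_SET_KEEPCAPS (value 8).
const prSetKeepCaps = 0x8

// probeAllThreadsSyscall is a hypothetical helper. It issues one harmless
// prctl call on every thread and reports whether AllThreadsSyscall is
// usable in this binary. ENOTSUP is treated as "not supported" rather
// than as a failure.
func probeAllThreadsSyscall() (supported bool, err error) {
	_, _, errno := syscall.AllThreadsSyscall(syscall.SYS_PRCTL, prSetKeepCaps, 0, 0)
	switch errno {
	case 0:
		return true, nil
	case syscall.ENOTSUP:
		return false, nil
	default:
		return false, errno
	}
}

func main() {
	supported, err := probeAllThreadsSyscall()
	fmt.Println("supported:", supported, "err:", err)
}
```

If the hang is in this first call, the body of the test never runs, which matches the trace showing `signal_linux_test.go:23` as the only test frame.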

Capturing the following just in case it otherwise gets lost at some point.

linux-ppc64-buildlet at e9c96835971044aa4ace37c7787de231bbde05d9

:: Running /workdir/go/src/make.bash with args ["/workdir/go/src/make.bash"] and env ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" "HOSTNAME=ppc64_01" "GO_BUILDER_ENV=host-linux-ppc64-osu" "DEBIAN_FRONTEND=noninteractive" "GOROOT_BOOTSTRAP=/workdir/go1.4" "GO_BUILD_KEY_DELETE_AFTER_READ=true" "GO_BUILD_KEY_PATH=/buildkey/gobuildkey" "HOME=/root" "USER=root" "GO_STAGE0_NET_DELAY=500ms" "GO_STAGE0_DL_DELAY=800ms" "WORKDIR=/workdir" "GO_BUILDER_NAME=linux-ppc64-buildlet" "GO_BUILDER_FLAKY_NET=1" "GOROOT_BOOTSTRAP=/usr/local/go-bootstrap" "GOBIN=" "TMPDIR=/workdir/tmp" "GOCACHE=/workdir/gocache" "GOROOT_BOOTSTRAP=/usr/local/go-bootstrap"] in dir /workdir/go/src

Building Go cmd/dist using /usr/local/go-bootstrap. (go1.8 linux/ppc64)
Building Go toolchain1 using /usr/local/go-bootstrap.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/ppc64.
---
Installed Go for linux/ppc64 in /workdir/go
Installed commands in /workdir/go/bin
[... see failure at top of bug for end of this log...]
@dmitshur dmitshur modified the milestones: Go1.16, Go1.17 Feb 11, 2021
Contributor

@AndrewGMorgan AndrewGMorgan commented Feb 12, 2021

#42178 (comment), concerning the ppc64 build, supports the detail that this code runs with CGO_ENABLED=0. As I recall from working on that issue, this architecture was noticeably slower overall than the systems I typically work with, which lends some support to a timeout being more likely here. However, the test should be nowhere near 3 minutes of runtime.

That being said, the crash trace does not appear to show a timeout inside the test's loop. It appears to have timed out before that loop even starts: after the syscall parts have run on all the threads, and just as the runtime is trying to restart the world (and re-enable GC).

I've not yet reproduced the failure, and the build servers appear not to have failed again. But I'm still investigating.

Member Author

@bcmills bcmills commented Feb 12, 2021

@AndrewGMorgan, another theory to consider: perhaps the actual slowdown was in some other os/signal test, there simply wasn't enough time remaining, and TestAllThreadsSyscallSignals just happened to be the remaining test that was running when the timeout expired.
