runtime: runtime:cpu124 test crash or stall when GO_GCFLAGS=-N -l #15853

Closed
quentinmit opened this issue May 26, 2016 · 4 comments

@quentinmit
Contributor

commented May 26, 2016

Since @aclements's CL 23391 ("runtime: pass gcWork to scanstack") landed yesterday, the runtime tests have been consistently timing out on the linux-amd64-noopt builder. See, e.g., https://build.golang.org/log/c70503b10af6b554372e2e11f257f7a0d8524678

On the other builders, these tests appear to take anywhere from 10s to 100s.

Austin, can you take a look at the traceback and see if this is an important regression for Go 1.7?

@aclements

Member

commented May 27, 2016

Repro:

GO_GCFLAGS="-N -l" ./make.bash
GOMAXPROCS=2 go test runtime -cpu=1,2,4 -short

It looks like it's usually not a timeout, but rather a segfault. (Which is good; that's probably easier. :)

@aclements aclements changed the title runtime: tests consistently timing out on linux-amd64-noopt runtime: runtime:cpu124 test crash or stall when GO_GCFLAGS=-N -l May 27, 2016

@aclements aclements modified the milestones: Go1.7Beta, Go1.7 May 27, 2016

@aclements

Member

commented May 27, 2016

One definite problem is that markrootFreeGStacks calls shrinkstack on a preemptible, growable user stack, which means we may corrupt the internal stack allocation structures. The fix for this is trivial, but I'm working on confirming that this is in fact the root cause of this failure.

@aclements

Member

commented May 27, 2016

I've confirmed that a stack growth happens in the middle of stackcacherelease when it calls lock, after it has already picked up the mcache. The stack growth also accesses the mcache, and interleaving the two operations corrupts the mcache. 3be48b4 triggered it because it grew the stack frame of markroot from 0x68 bytes to 0x70 bytes on the noopt builder, and markroot is on the path to stackcacherelease when the growth happens. (The added argument that grew the markroot stack frame isn't, but that's the noopt builder for you. :)
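
The interleaving described above can be illustrated with a toy model in ordinary Go. This is only a sketch, not runtime code: `toyCache`, `alloc`, `release`, and `owners` are hypothetical names invented for the illustration, and the `midpoint` callback stands in for the call to lock whose prologue may grow the stack. The point is that a release which has already picked up the cache contents goes on to use a stale view if something allocates from the same cache in between.

```go
package main

import "fmt"

// toyCache is a hypothetical stand-in for a per-P stack cache.
// It only holds the IDs of cached free stacks; it is not the real mcache.
type toyCache struct {
	free []int
}

// alloc pops one stack ID from the cache, as a stack growth would.
func (c *toyCache) alloc() int {
	id := c.free[len(c.free)-1]
	c.free = c.free[:len(c.free)-1]
	return id
}

// release hands every cached ID back to the global pool in two steps.
// midpoint runs between the steps and models a call whose prologue may
// grow the stack.
func (c *toyCache) release(global *[]int, midpoint func()) {
	snapshot := c.free // step 1: pick up the cache contents
	midpoint()
	*global = append(*global, snapshot...) // step 2: uses the stale snapshot
	c.free = nil
}

// owners reports how many times stack ID 3 ends up owned after a release.
// Exactly one owner is correct.
func owners(growInMiddle bool) int {
	c := &toyCache{free: []int{1, 2, 3}}
	var global, inUse []int
	c.release(&global, func() {
		if growInMiddle {
			inUse = append(inUse, c.alloc())
		}
	})
	n := 0
	for _, id := range global {
		if id == 3 {
			n++
		}
	}
	for _, id := range inUse {
		if id == 3 {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("owners of stack 3 without growth:", owners(false))
	fmt.Println("owners of stack 3 with growth:", owners(true))
}
```

With the growth in the middle, stack 3 is both in use and back in the global pool, which is the kind of double ownership that shows up later as a crash far from the cause.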

@gopherbot

commented May 27, 2016

CL https://golang.org/cl/23511 mentions this issue.

@gopherbot gopherbot closed this in 6a86dbe May 27, 2016

@golang golang locked and limited conversation to collaborators May 27, 2017
