runtime: fatal error: checkdead: runnable g [1.14 backport] #40398

Closed
gopherbot opened this issue Jul 24, 2020 · 3 comments

Comments

@gopherbot commented Jul 24, 2020

@prattmic requested issue #40368 to be considered for backport to the next 1.14 minor release.

@gopherbot backport to 1.14 please. This crash could affect any program running with GOMAXPROCS=1.
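
To see whether a given program runs in the configuration named above, the current setting can be read at startup. The following is a minimal sketch, not part of the original report: `singleP` is a hypothetical helper introduced here for illustration, and it only reports the setting; it does not detect the bug itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// singleP is a hypothetical helper: it reports whether the given
// GOMAXPROCS value is the single-P configuration mentioned in the comment.
func singleP(procs int) bool {
	return procs == 1
}

func main() {
	// Passing 0 queries the current GOMAXPROCS value without changing it.
	procs := runtime.GOMAXPROCS(0)
	fmt.Printf("%s GOMAXPROCS=%d singleP=%v\n", runtime.Version(), procs, singleP(procs))
}
```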

@gopherbot (Author) commented Jul 28, 2020

Change https://golang.org/cl/245297 mentions this issue: [release-branch.go1.14] runtime: ensure startm new M is consistently visible to checkdead

@toothrot (Contributor) commented Aug 4, 2020

Approving. This is a serious problem with no workaround.

@toothrot modified the milestones: Go1.14.7, Go1.14.8 Aug 6, 2020

@gopherbot (Author) commented Aug 22, 2020

Closed by merging 17fd967 to release-branch.go1.14.

@gopherbot closed this Aug 22, 2020
gopherbot pushed a commit that referenced this issue Aug 22, 2020
…visible to checkdead

If no M is available, startm first grabs an idle P, then drops
sched.lock and calls newm to start a new M to run that P.

Unfortunately, that leaves a window in which a G (e.g., returning from a
syscall) may find no idle P, add to the global runq, and then in stopm
discover that there are no running M's, a condition that should be
impossible with runnable G's.

To avoid this condition, we pre-allocate the new M ID in startm before
dropping sched.lock. This ensures that checkdead will see the M as
running, and since that new M must eventually run the scheduler, it will
handle any pending work as necessary.
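
The ordering problem and the fix described above can be illustrated with a toy model. This is a simplified sketch and not the runtime's code: `toySched`, `startM`, `checkDeadLocked`, and `simulate` are hypothetical names, a plain counter stands in for M ID allocation, and the "concurrent" G is replayed deterministically inside the unlocked window instead of racing for real.

```go
package main

import (
	"fmt"
	"sync"
)

// toySched is a hypothetical stand-in for the scheduler state.
// It is not the real runtime structure.
type toySched struct {
	mu       sync.Mutex
	runningM int // Ms that the deadlock check counts as running
	runq     int // runnable Gs on the global run queue
}

// checkDeadLocked reports the "impossible" state: runnable Gs exist but
// no M is counted as running. The caller must hold s.mu.
func (s *toySched) checkDeadLocked() bool {
	return s.runningM == 0 && s.runq > 0
}

// startM models starting a new M. With reserve set, the M is counted
// before the lock is dropped; otherwise it is counted only afterwards.
// window runs in the unlocked gap and stands in for another thread.
func (s *toySched) startM(reserve bool, window func()) {
	s.mu.Lock()
	if reserve {
		s.runningM++ // make the new M visible while still holding the lock
	}
	s.mu.Unlock()

	window() // the gap in which other threads can observe the state

	if !reserve {
		s.mu.Lock()
		s.runningM++ // late accounting: too late for the observer above
		s.mu.Unlock()
	}
}

// simulate replays the interleaving from the commit message and returns
// whether the observer saw the false deadlock.
func simulate(reserve bool) bool {
	s := &toySched{}
	sawDead := false
	s.startM(reserve, func() {
		s.mu.Lock()
		defer s.mu.Unlock()
		s.runq++ // a G finds no idle P and goes to the global queue
		sawDead = s.checkDeadLocked()
	})
	return sawDead
}

func main() {
	fmt.Println("late accounting, false deadlock seen:", simulate(false))
	fmt.Println("early reservation, false deadlock seen:", simulate(true))
}
```

In this model the only difference between the two runs is whether the new M is counted before or after the lock is released, which is the change the commit message describes.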

Outside of startm, most other calls to newm/allocm don't have a P at
all. The only exception is startTheWorldWithSema, which always has an M
if there is 1 P (i.e., the currently running M), and if there is >1 P
the findrunnable spinning dance ensures the problem never occurs.

This has been tested with strategically placed sleeps in the runtime to
help induce the correct race ordering, but the timing on this is too
narrow for a test that can be checked in.

For #40368
Fixes #40398

Change-Id: If5e0293a430cc85154b7ed55bc6dadf9b340abe2
Reviewed-on: https://go-review.googlesource.com/c/go/+/245018
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
(cherry picked from commit 85afa2e)
Reviewed-on: https://go-review.googlesource.com/c/go/+/245297
@dmitshur modified the milestones: Go1.14.8, Go1.14.9 Sep 1, 2020