runtime: fatal error: checkdead: runnable g [1.13 backport] #40503

Open
dmitshur opened this issue Jul 30, 2020 · 7 comments

Comments

Member

@dmitshur dmitshur commented Jul 30, 2020

In a Go release meeting, we noticed issue #40368 likely affects Go 1.13 as well. This needs to be confirmed, and if so, this is the tracking issue for #40368 to be considered for backport to the next 1.13 minor release.

Including a copy of the rationale to backport from the 1.14 tracking issue #40398:

This crash could affect any program running with GOMAXPROCS=1.

/cc @prattmic @aclements @cagedmantis @toothrot

@gopherbot gopherbot added this to the Go1.13.15 milestone Jul 30, 2020
Member

@prattmic prattmic commented Jul 30, 2020

I believe this should affect 1.13 as well. I'll see if I can reproduce and send a cherry-pick.

Member Author

@dmitshur dmitshur commented Jul 30, 2020

Thanks Michael!

Member

@prattmic prattmic commented Jul 31, 2020

So I've been unable to reproduce this on 1.13, though I believe it is still affected.

In general, the racing startm call must come from sysmon, as most other calls come from Ms that checkdead considers "running" and would thus not trigger the throw. In 1.14+, sysmon is partially responsible for handling timers, so the timer expiration startm is an easy place for this race to occur.

In 1.13, sysmon was not responsible for timers, so that startm doesn't exist, but there are two others: one for netpoll, and one for forcegc. Ironically, the netpoll case already handles this exact problem from #6070, which I hadn't noticed before now. I may be able to clean those up in 1.16 now that we have another approach to this.

The forcegc case should be subject to this problem, but it only runs every 2 minutes, so the race is extremely rare. I'll still send the backport patch for consideration for this case, though I'm not sure how important it is.

Member

@prattmic prattmic commented Jul 31, 2020

(I've filed #40518 to clean this up for 1.16).

@gopherbot gopherbot commented Jul 31, 2020

Change https://golang.org/cl/246199 mentions this issue: [release-branch.go1.13] runtime: ensure startm new M is consistently visible to checkdead

Member

@aclements aclements commented Jul 31, 2020

I agree with your assessment that the possibility of this bug in 1.13 seems so astronomically small that I'm not sure it's worth fixing, but I'll leave the final decision to the release team.

Contributor

@toothrot toothrot commented Aug 4, 2020

Approving. Even though the probability is astronomically small, we try to keep both versions at the same level of support for fixes.
