runtime: checkdead fires due to suspected race in the Go runtime when GOMAXPROCS=1 on AWS [1.19 backport] #60788

Closed
prattmic opened this issue Jun 14, 2023 · 3 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Milestone

Comments

@prattmic
Member

@prattmic requested issue #59600 to be considered for backport to the next 1.19 minor release.

@gopherbot Please open a backport issue to 1.20.

Quoting @SuperQ :

Is it possible to get this bugfix backported to 1.20? It is affecting users of Prometheus monitoring as we default GOMAXPROCS=1 for the node_exporter to reduce problems with data races in the Linux kernel.

Quoting @prattmic:

@gopherbot Please open a backport to 1.19. This issue is not new in 1.20, and potentially affects users with GOMAXPROCS=1.

@prattmic added the CherryPickCandidate and compiler/runtime labels Jun 14, 2023
@prattmic added this to the Go1.19.11 milestone Jun 14, 2023
@prattmic added the CherryPickApproved label Jun 14, 2023
@gopherbot removed the CherryPickCandidate label Jun 14, 2023
@gopherbot
Contributor

Change https://go.dev/cl/504396 mentions this issue: [release-branch.go1.19] runtime: resolve checkdead panic by refining startm lock handling in caller context

@prattmic
Member Author

On closer look, I think that this issue is new in 1.20. Specifically, the issue was due to a critical section in injectglist where we allocate a P, unlock sched.lock (allowing checkdead to run), and then later in startm lock sched.lock again to actually start the M to use the P.

This was introduced in https://go-review.googlesource.com/c/go/+/389014/8/src/runtime/proc.go#3186. Prior to that CL, the P was acquired in startm with sched.lock held the whole time.

Thus I believe we can retract this cherry-pick request.
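
The race described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the runtime's code: `toySched`, `startSplit`, `startSingle`, `observe`, and `observedConsistent` are hypothetical names, and the invariant checked here (every P is either idle or paired with a running M) is a simplification chosen for illustration rather than what `checkdead` actually verifies. The `gap` callback stands in for another thread that acquires the lock at the worst possible moment.

```go
package main

import (
	"fmt"
	"sync"
)

// toySched is a hypothetical stand-in for scheduler state guarded by one lock.
// It is NOT runtime.sched; the fields exist only for this illustration.
type toySched struct {
	mu       sync.Mutex
	totalPs  int
	idlePs   int // Ps on the idle list
	runningM int // Ms that have been started with a P
}

// observe plays the role of a checker such as checkdead: it takes the lock
// and reports whether every P is either idle or owned by a running M.
func (s *toySched) observe() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.idlePs+s.runningM == s.totalPs
}

// startSplit models the problematic shape: take a P under the lock, drop the
// lock, then reacquire it later to start the M. gap runs while the lock is free.
func (s *toySched) startSplit(gap func()) {
	s.mu.Lock()
	s.idlePs-- // P allocated
	s.mu.Unlock()

	gap() // an observer can run here and see a P with no M

	s.mu.Lock()
	s.runningM++ // M started
	s.mu.Unlock()
}

// startSingle models the older shape: both steps in one critical section, so
// an observer can only run before or after the whole transition.
func (s *toySched) startSingle(gap func()) {
	s.mu.Lock()
	s.idlePs--
	s.runningM++
	s.mu.Unlock()

	gap()
}

// observedConsistent reports what an observer sees when it runs at the
// earliest point where the lock is free.
func observedConsistent(split bool) bool {
	s := &toySched{totalPs: 1, idlePs: 1}
	var ok bool
	gap := func() { ok = s.observe() }
	if split {
		s.startSplit(gap)
	} else {
		s.startSingle(gap)
	}
	return ok
}

func main() {
	fmt.Println("split critical section, consistent:", observedConsistent(true))
	fmt.Println("single critical section, consistent:", observedConsistent(false))
}
```

In this model the split version lets the observer see a P that has left the idle list with no M running it yet, while the single critical section never exposes that intermediate state.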

@prattmic removed the CherryPickApproved label Jun 21, 2023
@prattmic closed this as not planned Jun 21, 2023
@SuperQ

SuperQ commented Jun 21, 2023

Yes, we only noticed the issue with 1.20.
