Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: non-empty mark queue after concurrent mark [1.23 backport] #70323

Closed
gopherbot opened this issue Nov 13, 2024 · 2 comments
Closed

runtime: non-empty mark queue after concurrent mark [1.23 backport] #70323

gopherbot opened this issue Nov 13, 2024 · 2 comments
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime.
Milestone

Comments

@gopherbot
Copy link
Contributor

@mknyszek requested issue #69803 to be considered for backport to the next 1.23 minor release.

@gopherbot Please open a backport issue for Go 1.23. This bug causes random crashes with no workaround for anything using the unique package, which includes applications using net/http.

@gopherbot gopherbot added the CherryPickCandidate Used during the release process for point releases label Nov 13, 2024
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Nov 13, 2024
@gopherbot gopherbot added this to the Go1.23.4 milestone Nov 13, 2024
@gopherbot
Copy link
Contributor Author

Change https://go.dev/cl/627615 mentions this issue: [release-branch.go1.23] runtime: prevent weak->strong conversions during mark termination

@dmitshur dmitshur added CherryPickApproved Used during the release process for point releases and removed CherryPickCandidate Used during the release process for point releases labels Nov 20, 2024
gopherbot pushed a commit that referenced this issue Nov 20, 2024
…ing mark termination

Currently it's possible for weak->strong conversions to create more GC
work during mark termination. When a weak->strong conversion happens
during the mark phase, we need to mark the newly-strong pointer, since
it may now be the only pointer to that object. In other words, the
object could be white.

But queueing new white objects creates GC work, and if this happens
during mark termination, we could end up violating mark termination
invariants. In the parlance of the mark termination algorithm, the
weak->strong conversion is a non-monotonic source of GC work, unlike the
write barriers (which will eventually only see black objects).

This change fixes the problem by forcing weak->strong conversions to
block during mark termination. We can do this efficiently by setting a
global flag before the ragged barrier that is checked at each
weak->strong conversion. If the flag is set, then the conversions block.
The ragged barrier ensures that all Ps have observed the flag and that
any weak->strong conversions which completed before the ragged barrier
have their newly-minted strong pointers visible in GC work queues if
necessary. We later unset the flag and wake all the blocked goroutines
during the mark termination STW.

There are a few subtleties that we need to account for. For one, it's
possible that a goroutine which blocked in a weak->strong conversion
wakes up only to find it's mark termination time again, so we need to
recheck the global flag on wake. We should also stay non-preemptible
while performing the check, so that if the check *does* appear as true,
it cannot switch back to false while we're actively trying to block. If
it switches to false while we try to block, then we'll be stuck in the
queue until the following GC.

All-in-all, this CL is more complicated than I would have liked, but
it's the only idea so far that is clearly correct to me at a high level.

This change adds a test which is somewhat invasive as it manipulates
mark termination, but hopefully that infrastructure will be useful for
debugging, fixing, and regression testing mark termination whenever we
do fix it.

For #69803.
Fixes #70323.

Change-Id: Ie314e6fd357c9e2a07a9be21f217f75f7aba8c4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/623615
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
(cherry picked from commit 80d306d)
Reviewed-on: https://go-review.googlesource.com/c/go/+/627615
TryBot-Bypass: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
@gopherbot
Copy link
Contributor Author

Closed by merging CL 627615 (commit d8adc6c) to release-branch.go1.23.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime.
Projects
None yet
Development

No branches or pull requests

2 participants