Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: long latency of sweep assists [1.19 backport] #58535

Closed
gopherbot opened this issue Feb 14, 2023 · 3 comments
Closed

runtime: long latency of sweep assists [1.19 backport] #58535

gopherbot opened this issue Feb 14, 2023 · 3 comments
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge
Milestone

Comments

@gopherbot
Copy link
Contributor

@mknyszek requested issue #57523 to be considered for backport to the next 1.19 minor release.

@gopherbot Please open up backport issues for Go 1.19 and Go 1.20. This issue has the potential to cause severe latency issues if runtime/debug.SetMemoryLimit or runtime/debug.SetGCPercent is called often. The fix is also small and safe (an overflow check on an unsigned subtraction).

Just about to land the fix. I think I'm in favor of backporting.

@gopherbot gopherbot added the CherryPickCandidate Used during the release process for point releases label Feb 14, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Feb 14, 2023
@gopherbot gopherbot added this to the Go1.19.7 milestone Feb 14, 2023
@gopherbot
Copy link
Contributor Author

Change https://go.dev/cl/468376 mentions this issue: [release-branch.go1.19] runtime: check for overflow in sweep assist

@mknyszek
Copy link
Contributor

I want to amend the backport reasoning slightly to point out that it's specifically a combination of runtime/debug.SetMemoryLimit or runtime/debug.SetGCPercent AND something that flushes mcaches. That's the end of every GC or something like a runtime.ReadMemStats. Therefore, this is somewhat more common than just someone calling one of the aforementioned APIs rapidly.

@prattmic prattmic added the CherryPickApproved Used during the release process for point releases label Feb 15, 2023
@gopherbot gopherbot removed the CherryPickCandidate Used during the release process for point releases label Feb 15, 2023
@gopherbot
Copy link
Contributor Author

Closed by merging 0fa9ba8 to release-branch.go1.19.

gopherbot pushed a commit that referenced this issue Feb 15, 2023
The sweep assist computation is intentionally racy for performance,
since the specifics of sweep assist aren't super sensitive to error.
However, if overflow occurs when computing the live heap delta, we can
end up with a massive sweep target that causes the sweep assist to sweep
until sweep termination, causing severe latency issues. In fact, because
heapLive doesn't always increase monotonically then anything that
flushes mcaches will cause _all_ allocating goroutines to inevitably get
stuck in sweeping.

Consider the following scenario:
1. SetGCPercent is called, updating sweepHeapLiveBasis to heapLive.
2. Very shortly after, ReadMemStats is called, flushing mcaches and
   decreasing heapLive below the value sweepHeapLiveBasis was set to.
3. Every allocating goroutine goes to refill its mcache, calls into
   deductSweepCredit for sweep assist, and gets stuck sweeping until
   the sweep phase ends.

Fix this by just checking for overflow in the delta live heap calculation
and if it would overflow, pick a small delta live heap. This probably
means that no sweeping will happen at all, but that's OK. This is a
transient state and the runtime will recover as soon as heapLive
increases again.

Note that deductSweepCredit doesn't check overflow on other operations
but that's OK: those operations are signed and extremely unlikely to
overflow. The subtraction targeted by this CL is only a problem because
it's unsigned. An alternative fix would be to make the operation signed,
but being explicit about the overflow situation seems worthwhile.

For #57523.
Fixes #58535.

Change-Id: Ib18f71f53468e913548aac6e5358830c72ef0215
Reviewed-on: https://go-review.googlesource.com/c/go/+/468376
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators Feb 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CherryPickApproved Used during the release process for point releases compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge
Projects
None yet
Development

No branches or pull requests

3 participants