
runtime: CPU-overprovisioned processes might fail to trigger the GC CPU limiter #52890

Closed
Labels: NeedsFix okay-after-beta1 release-blocker
Milestone: Go1.19

mknyszek (Contributor) commented May 13, 2022

I've observed that in cases where the live heap exceeds the max heap set by the memory limit, and the application is only using a small fraction of GOMAXPROCS, it might happen that the GC CPU limiter never notices that GC CPU utilization is particularly high, even though GCs are happening continuously. This is due to a combination of the idle mark workers and the existence of GOMAXPROCS. The correct fix is to remove idle mark time from the limiter's count of mutator time.
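A toy illustration of the failure mode (the numbers and the `gcFraction` helper below are hypothetical, not the runtime's actual accounting): if idle mark worker time is credited to the mutator, the GC's apparent share of CPU stays low even while GCs run back-to-back.

```go
package main

import "fmt"

// gcFraction is a hypothetical stand-in for the limiter's view of GC CPU
// utilization: GC time divided by total (GC + mutator) time.
func gcFraction(gcTime, mutatorTime float64) float64 {
	return gcTime / (gcTime + mutatorTime)
}

func main() {
	// Suppose GOMAXPROCS=8: over one window, 2 CPUs' worth of time goes to
	// GC work, 1 CPU's worth to real mutator work, and the remaining 5 are
	// idle mark workers soaking up otherwise-idle CPUs.
	const gc, mutator, idleMark = 2.0, 1.0, 5.0

	// Idle mark time miscounted as mutator time: GC looks cheap.
	fmt.Printf("%.2f\n", gcFraction(gc, mutator+idleMark)) // 0.25

	// Idle mark time excluded: GC's share of useful CPU is clearly high.
	fmt.Printf("%.2f\n", gcFraction(gc, mutator)) // 0.67
}
```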

@mknyszek mknyszek added this to the Go1.19 milestone May 13, 2022
@mknyszek mknyszek self-assigned this May 13, 2022
@mknyszek mknyszek added NeedsFix release-blocker labels May 13, 2022
gopherbot commented May 13, 2022

Change https://go.dev/cl/405898 mentions this issue: runtime: account for idle mark time in the GC CPU limiter

mknyszek (Contributor, Author) commented May 13, 2022

Just to be clear, subtracting out idle mark time is the right thing to do here because that's where all our idle time is going in thrashing situations. It's not the perfect solution, but it's an easy one that covers all cases that matter. The fix's commit message has more details and a clearer explanation. We may want to consider accounting for all idle time, but I don't think it's necessary (and actually might have some negative consequences).

@mknyszek mknyszek reopened this May 26, 2022
@mknyszek mknyszek added the okay-after-beta1 label May 26, 2022
mknyszek (Contributor, Author) commented May 26, 2022

Turns out that if the system is oversubscribed enough this can still happen, because non-GC idle time counts toward mutator time.

The fix I have is to remove all idle time from the GC CPU limiter's accounting. There's a slight danger in doing this: the limiter may turn on in oversubscribed scenarios where there really is no thrashing. However, in those cases the amount of GC CPU time spent just in dedicated workers (which we don't attenuate) so greatly outweighs the mutator's actual time that it shouldn't be an issue that no assists happen. Scavenge assists are also disabled, but that's probably fine too: the background scavenger basically always has CPU available to execute, and again, we don't actually attenuate the background scavenger. An unlucky allocation pattern may cause a transient pass over the memory limit, but it's unlikely, and I think well within the spirit of a "soft memory limit."
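A leaky-bucket sketch of what "turning on" means here (the `bucket` type, constants, and numbers are illustrative, not the runtime's implementation): GC CPU time fills the bucket and mutator CPU time drains it, so whether idle time is credited to the mutator decides whether the bucket can ever fill.

```go
package main

import "fmt"

// bucket is an illustrative leaky bucket: GC CPU time fills it, mutator
// CPU time drains it, and the limiter engages while it is full.
type bucket struct {
	fill, capacity float64
}

func (b *bucket) update(gcTime, mutatorTime float64) {
	b.fill += gcTime - mutatorTime
	if b.fill < 0 {
		b.fill = 0
	}
	if b.fill > b.capacity {
		b.fill = b.capacity
	}
}

func (b *bucket) limiting() bool { return b.fill >= b.capacity }

func main() {
	// Per window: 1 CPU-second of GC, 0.2 of real mutator work, 0.8 idle.
	const gc, mutator, idle = 1.0, 0.2, 0.8

	withIdle := &bucket{capacity: 5}
	withoutIdle := &bucket{capacity: 5}
	for i := 0; i < 100; i++ {
		withIdle.update(gc, mutator+idle) // idle credited to the mutator
		withoutIdle.update(gc, mutator)   // idle excluded (the fix)
	}
	fmt.Println(withIdle.limiting(), withoutIdle.limiting()) // false true
}
```

Counting idle as mutator time makes the fill rate zero, so the limiter never engages no matter how long the thrashing continues; excluding it, the bucket fills within a few windows.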

Fix incoming.

gopherbot commented May 26, 2022

Change https://go.dev/cl/408816 mentions this issue: runtime: cancel mark and scavenge assists if the limiter is enabled

gopherbot commented May 26, 2022

Change https://go.dev/cl/408817 mentions this issue: runtime: track total idle time for GC CPU limiter

gopherbot pushed a commit that referenced this issue May 27, 2022
This change forces mark and scavenge assists to be cancelled early if
the limiter is enabled. This avoids goroutines getting stuck in really
long assists if the limiter happens to be disabled when they first come
into the assist. This can get especially bad for mark assists, which, in
dire situations, can end up "owing" the GC a really significant debt.

For #52890.

Change-Id: I4bfaa76b8de3e167d49d2ffd8bc2127b87ea566a
Reviewed-on: https://go-review.googlesource.com/c/go/+/408816
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
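The early-cancellation idea in the commit above can be sketched as a check inside the assist loop (all names here are hypothetical; the runtime's real assist logic is far more involved):

```go
package main

import "fmt"

// gcAssistSketch is an illustrative assist loop: a goroutine works off its
// allocation debt in chunks, but bails out as soon as the limiter turns on,
// rather than finishing a potentially huge debt it incurred while the
// limiter was still off.
func gcAssistSketch(debt int, limiterEnabled func() bool, doWork func(int) int) int {
	const chunk = 64
	for debt > 0 {
		if limiterEnabled() {
			return debt // cancel early; remaining debt is forgiven
		}
		n := chunk
		if n > debt {
			n = debt
		}
		debt -= doWork(n)
	}
	return 0
}

func main() {
	enabled := false
	steps := 0
	left := gcAssistSketch(1000, func() bool { return enabled }, func(n int) int {
		steps++
		if steps == 3 {
			enabled = true // limiter turns on mid-assist
		}
		return n
	})
	fmt.Println(left) // 808: the assist stopped instead of paying off 1000
}
```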
gopherbot commented Jun 2, 2022

Change https://go.dev/cl/410120 mentions this issue: runtime: only use CPU time from the current window in the GC CPU limiter

gopherbot commented Jun 2, 2022

Change https://go.dev/cl/410122 mentions this issue: runtime: track total idle time for GC CPU limiter

gopherbot pushed a commit that referenced this issue Jun 3, 2022
Currently the GC CPU limiter consumes CPU time from a few pools, but
because the events that flush to those pools may overlap, rather than be
strictly contained within, the update window for the GC CPU limiter, the
limiter's accounting is ultimately sloppy.

This sloppiness complicates accounting for idle time more completely,
and makes reasoning about the transient behavior of the GC CPU limiter
much more difficult.

To remedy this, this CL adds a field to the P struct that tracks the
start time of any in-flight event the limiter might care about, along
with information about the nature of that event. This timestamp is
managed atomically so that the GC CPU limiter can come in and perform a
read of the partial CPU time consumed by a given event. The limiter also
updates the timestamp so that only what's left over is flushed by the
event itself when it completes.

The end result of this change is that, since the GC CPU limiter is aware
of all past completed events, and all in-flight events, it can much more
accurately collect the CPU time of events since the last update. There's
still the possibility for skew, but any leftover time will be captured
in the following update, and the magnitude of this leftover time is
effectively bounded by the update period of the GC CPU limiter, which is
much easier to consider.

One caveat of managing this timestamp-type combo atomically is that they
need to be packed in 64 bits. So, this CL gives up the top 3 bits of the
timestamp and places the type information there. What this means is we
effectively have only a 61-bit resolution timestamp. This is fine when
the top 3 bits are the same between calls to nanotime, but becomes a
problem on boundaries when those 3 bits change. These cases may cause
hiccups in the GC CPU limiter by not accounting for some source of CPU
time correctly, but with 61 bits of resolution this should be extremely
rare. The rate of update is on the order of milliseconds, so at worst
the runtime will be off of any given measurement by only a few
CPU-milliseconds (and this is directly bounded by the rate of update).
We're probably more inaccurate from the fact that we don't measure real
CPU time but only approximate it.

For #52890.

Change-Id: I347f30ac9e2ba6061806c21dfe0193ef2ab3bbe9
Reviewed-on: https://go-review.googlesource.com/c/go/+/410120
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
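The bit-packing scheme the commit describes can be sketched roughly as follows (names and helpers are illustrative, not the runtime's): the event type occupies the top 3 bits of one 64-bit word, leaving a 61-bit timestamp, so the whole combo can be read and updated with a single atomic operation.

```go
package main

import "fmt"

const (
	typeBits  = 3
	typeShift = 64 - typeBits                // type lives in the top 3 bits
	tsMask    = (uint64(1) << typeShift) - 1 // low 61 bits hold the timestamp
)

// pack stores an event type and a (truncated) nanotime timestamp in one
// 64-bit word, suitable for a single atomic load/store.
func pack(typ uint8, ts int64) uint64 {
	return uint64(typ)<<typeShift | uint64(ts)&tsMask
}

// unpack recovers the event type and the 61-bit timestamp. Two timestamps
// that differ only in their top 3 bits collide, which corresponds to the
// rare boundary "hiccup" caveat in the commit message.
func unpack(v uint64) (typ uint8, ts int64) {
	return uint8(v >> typeShift), int64(v & tsMask)
}

func main() {
	v := pack(5, 123456789012345)
	typ, ts := unpack(v)
	fmt.Println(typ, ts) // 5 123456789012345
}
```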
gopherbot commented Jun 8, 2022

Change https://go.dev/cl/411119 mentions this issue: runtime: use pidleget for faketime jump

gopherbot pushed a commit that referenced this issue Jun 8, 2022
In faketime mode, checkdead is responsible for jumping time forward to
the next timer expiration, and waking an M to handle the newly ready
timer.

Currently it pulls the exact P that owns the next timer off of the pidle
list. In theory this is efficient because that P is immediately eligible
to run the timer without stealing. Unfortunately it is also fraught with
peril because we are skipping all of the bookkeeping in pidleget:

* Skipped updates to timerpMask mean that our timers may not be eligible
  for stealing, as they should be.
* Skipped updates to idlepMask mean that our runq may not be eligible
  for stealing, as they should be.
* Skipped updates to sched.npidle may break tracking of spinning Ms,
  potentially resulting in lost work.
* Finally, as of CL 410122, skipped updates to p.limiterEvent may affect
  the GC limiter, or cause a fatal throw when another event occurs.

The last case has finally uncovered this issue, since it quickly
results in a hard crash.

We could add all of these updates into checkdead, but it is much more
maintainable to keep this logic in one place and use pidleget here like
everywhere else in the runtime. This means we probably won't wake the
P owning the timer, meaning that the P will need to steal the timer,
which is less efficient, but faketime is not a performance-sensitive
build mode. Note that the M will automatically make itself a spinning M
to make it eligible to steal since it is the only one running.

Fixes #53294
For #52890

Change-Id: I4acc3d259b9b4d7dc02608581c8b4fd259f272e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/411119
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>