Skip to content

Commit

Permalink
runtime: smooth cons/mark with a moving average and use actual trigger
Browse files Browse the repository at this point in the history
This change modifies the pacer in two ways:
* It replaces the PI controller used as a smoothing function with a
  simple two-cycle moving average.
* It makes the pacer use the actual GC trigger point for cons/mark (and
  other) calculations instead of the precomputed one.

The second part of this change was attempted in the Go 1.19 release
cycle, but could not be done because although it resulted in a
better-behaved pacer, it exploited the PI controller's sensitivity to
history in a way that was ultimately unfavorable for most applications.

This sensitivity is complex to reason about, and forces us into choices
that don't really make sense (like using the precomputed trigger over
the actual one -- that's really a bug fix).

The net effect of this change is intended to be:
* The pacer computes a more stable estimate of the actual cons/mark
  ratio, making it easier to understand.
* The pacer is much more accurate at hitting the heap goal, so the GC
  respects GOGC much more often and reliably.
* The pacer forces a more stable rate of GC assists in the
  steady-state overall.

See https://perf.golang.org/search?q=upload:20221106.10 for benchmark
results and #53892 for complete context. The benchmarks that regress
in memory use appear to be not worth worrying about. In all cases, it appears that the GC was triggering super early, resulting in a lack
of adherence to rule of GOGC.

The fogleman benchmarks just have a single final GC that triggers
early and also happens to be the peak heap size. The tile38
WithinCircle benchmark only has 4 GC cycles in the benchmarked region,
so it's very sensitive to pacing. In this case, the old smoothing
function is getting lucky by starting way too early, avoiding assists. Meanwhile the 2-cycle moving average is more accurate on the heap
goal, but the 1st and 2nd cycle after the initialization phase are operating on a cons/mark from the initialization phase which is much
less active, resulting in a cons/mark that's too low, causing the GC
to start too late, increasing assists, and therefore latency (but
only transiently, at this phase change). I really do think the PI
controller is just getting lucky here with a particular history,
because I've definitely observed it oscillating wildly in response to
a phase change.

This change also moves the PI controller out of mgcpacer.go, because
it's no longer used there. It now lives in mgcscavenge.go, where it's
actually used.

Fixes #53892.

Change-Id: I3f875a2e40f31f381920f91d8b090556b17a2b16
Reviewed-on: https://go-review.googlesource.com/c/go/+/417558
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
  • Loading branch information
mknyszek authored and pull[bot] committed Jul 20, 2023
1 parent 7213c36 commit 1073818
Show file tree
Hide file tree
Showing 4 changed files with 125 additions and 188 deletions.
154 changes: 11 additions & 143 deletions src/runtime/mgcpacer.go
Expand Up @@ -136,12 +136,10 @@ type gcControllerState struct {
// Updated at the end of each GC cycle, in endCycle.
consMark float64

// consMarkController holds the state for the mark-cons ratio
// estimation over time.
//
// Its purpose is to smooth out noisiness in the computation of
// consMark; see consMark for details.
consMarkController piController
// lastConsMark is the computed cons/mark value for the previous GC
// cycle. Note that this is *not* the last value of cons/mark, but the
// actual computed value. See endCycle for details.
lastConsMark float64

// gcPercentHeapGoal is the goal heapLive for when next GC ends derived
// from gcPercent.
Expand Down Expand Up @@ -372,28 +370,6 @@ type gcControllerState struct {
func (c *gcControllerState) init(gcPercent int32, memoryLimit int64) {
c.heapMinimum = defaultHeapMinimum
c.triggered = ^uint64(0)

c.consMarkController = piController{
// Tuned first via the Ziegler-Nichols process in simulation,
// then the integral time was manually tuned against real-world
// applications to deal with noisiness in the measured cons/mark
// ratio.
kp: 0.9,
ti: 4.0,

// Set a high reset time in GC cycles.
// This is inversely proportional to the rate at which we
// accumulate error from clipping. By making this very high
// we make the accumulation slow. In general, clipping is
// OK in our situation, hence the choice.
//
// Tune this if we get unintended effects from clipping for
// a long time.
tt: 1000,
min: -1000,
max: 1000,
}

c.setGCPercent(gcPercent)
c.setMemoryLimit(memoryLimit)
c.commit(true) // No sweep phase in the first GC cycle.
Expand All @@ -416,26 +392,7 @@ func (c *gcControllerState) startCycle(markStartTime int64, procs int, trigger g
c.fractionalMarkTime.Store(0)
c.idleMarkTime.Store(0)
c.markStartTime = markStartTime

// TODO(mknyszek): This is supposed to be the actual trigger point for the heap, but
// causes regressions in memory use. The cause is that the PI controller used to smooth
// the cons/mark ratio measurements tends to flail when using the less accurate precomputed
// trigger for the cons/mark calculation, and this results in the controller being more
// conservative about steady-states it tries to find in the future.
//
// This conservatism is transient, but these transient states tend to matter for short-lived
// programs, especially because the PI controller is overdamped, partially because it is
// configured with a relatively large time constant.
//
// Ultimately, I think this is just two mistakes piled on one another: the choice of a swingy
// smoothing function that recalls a fairly long history (due to its overdamped time constant)
// coupled with an inaccurate cons/mark calculation. It just so happens this works better
// today, and it makes it harder to change things in the future.
//
// This is described in #53738. Fix this for #53892 by changing back to the actual trigger
// point and simplifying the smoothing function.
heapTrigger, heapGoal := c.trigger()
c.triggered = heapTrigger
c.triggered = c.heapLive.Load()

// Compute the background mark utilization goal. In general,
// this may not come out exactly. We round the number of
Expand Down Expand Up @@ -498,6 +455,7 @@ func (c *gcControllerState) startCycle(markStartTime int64, procs int, trigger g
c.revise()

if debug.gcpacertrace > 0 {
heapGoal := c.heapGoal()
assistRatio := c.assistWorkPerByte.Load()
print("pacer: assist ratio=", assistRatio,
" (scan ", gcController.heapScan.Load()>>20, " MB in ",
Expand Down Expand Up @@ -700,31 +658,12 @@ func (c *gcControllerState) endCycle(now int64, procs int, userForced bool) {
currentConsMark := (float64(c.heapLive.Load()-c.triggered) * (utilization + idleUtilization)) /
(float64(scanWork) * (1 - utilization))

// Update cons/mark controller. The time period for this is 1 GC cycle.
//
// This use of a PI controller might seem strange. So, here's an explanation:
//
// currentConsMark represents the consMark we *should've* had to be perfectly
// on-target for this cycle. Given that we assume the next GC will be like this
// one in the steady-state, it stands to reason that we should just pick that
// as our next consMark. In practice, however, currentConsMark is too noisy:
// we're going to be wildly off-target in each GC cycle if we do that.
//
// What we do instead is make a long-term assumption: there is some steady-state
// consMark value, but it's obscured by noise. By constantly shooting for this
// noisy-but-perfect consMark value, the controller will bounce around a bit,
// but its average behavior, in aggregate, should be less noisy and closer to
// the true long-term consMark value, provided its tuned to be slightly overdamped.
var ok bool
// Update our cons/mark estimate. This is the raw value above, but averaged over 2 GC cycles
// because it tends to be jittery, even in the steady-state. The smoothing helps the GC to
// maintain much more stable cycle-by-cycle behavior.
oldConsMark := c.consMark
c.consMark, ok = c.consMarkController.next(c.consMark, currentConsMark, 1.0)
if !ok {
// The error spiraled out of control. This is incredibly unlikely seeing
// as this controller is essentially just a smoothing function, but it might
// mean that something went very wrong with how currentConsMark was calculated.
// Just reset consMark and keep going.
c.consMark = 0
}
c.consMark = (currentConsMark + c.lastConsMark) / 2
c.lastConsMark = currentConsMark

if debug.gcpacertrace > 0 {
printlock()
Expand All @@ -733,9 +672,6 @@ func (c *gcControllerState) endCycle(now int64, procs int, userForced bool) {
print(c.heapScanWork.Load(), "+", c.stackScanWork.Load(), "+", c.globalsScanWork.Load(), " B work (", c.lastHeapScan+c.lastStackScan.Load()+c.globalsScan.Load(), " B exp.) ")
live := c.heapLive.Load()
print("in ", c.triggered, " B -> ", live, " B (∆goal ", int64(live)-int64(c.lastHeapGoal), ", cons/mark ", oldConsMark, ")")
if !ok {
print("[controller reset]")
}
println()
printunlock()
}
Expand Down Expand Up @@ -1379,74 +1315,6 @@ func readGOMEMLIMIT() int64 {
return n
}

type piController struct {
kp float64 // Proportional constant.
ti float64 // Integral time constant.
tt float64 // Reset time.

min, max float64 // Output boundaries.

// PI controller state.

errIntegral float64 // Integral of the error from t=0 to now.

// Error flags.
errOverflow bool // Set if errIntegral ever overflowed.
inputOverflow bool // Set if an operation with the input overflowed.
}

// next provides a new sample to the controller.
//
// input is the sample, setpoint is the desired point, and period is how much
// time (in whatever unit makes the most sense) has passed since the last sample.
//
// Returns a new value for the variable it's controlling, and whether the operation
// completed successfully. One reason this might fail is if error has been growing
// in an unbounded manner, to the point of overflow.
//
// In the specific case of an error overflow occurs, the errOverflow field will be
// set and the rest of the controller's internal state will be fully reset.
func (c *piController) next(input, setpoint, period float64) (float64, bool) {
// Compute the raw output value.
prop := c.kp * (setpoint - input)
rawOutput := prop + c.errIntegral

// Clamp rawOutput into output.
output := rawOutput
if isInf(output) || isNaN(output) {
// The input had a large enough magnitude that either it was already
// overflowed, or some operation with it overflowed.
// Set a flag and reset. That's the safest thing to do.
c.reset()
c.inputOverflow = true
return c.min, false
}
if output < c.min {
output = c.min
} else if output > c.max {
output = c.max
}

// Update the controller's state.
if c.ti != 0 && c.tt != 0 {
c.errIntegral += (c.kp*period/c.ti)*(setpoint-input) + (period/c.tt)*(output-rawOutput)
if isInf(c.errIntegral) || isNaN(c.errIntegral) {
// So much error has accumulated that we managed to overflow.
// The assumptions around the controller have likely broken down.
// Set a flag and reset. That's the safest thing to do.
c.reset()
c.errOverflow = true
return c.min, false
}
}
return output, true
}

// reset resets the controller state, except for controller error flags.
func (c *piController) reset() {
c.errIntegral = 0
}

// addIdleMarkWorker attempts to add a new idle mark worker.
//
// If this returns true, the caller must become an idle mark worker unless
Expand Down
45 changes: 0 additions & 45 deletions src/runtime/mgcpacer_test.go
Expand Up @@ -1019,51 +1019,6 @@ func (f float64Stream) limit(min, max float64) float64Stream {
}
}

func FuzzPIController(f *testing.F) {
isNormal := func(x float64) bool {
return !math.IsInf(x, 0) && !math.IsNaN(x)
}
isPositive := func(x float64) bool {
return isNormal(x) && x > 0
}
// Seed with constants from controllers in the runtime.
// It's not critical that we keep these in sync, they're just
// reasonable seed inputs.
f.Add(0.3375, 3.2e6, 1e9, 0.001, 1000.0, 0.01)
f.Add(0.9, 4.0, 1000.0, -1000.0, 1000.0, 0.84)
f.Fuzz(func(t *testing.T, kp, ti, tt, min, max, setPoint float64) {
// Ignore uninteresting invalid parameters. These parameters
// are constant, so in practice surprising values will be documented
// or will be other otherwise immediately visible.
//
// We just want to make sure that given a non-Inf, non-NaN input,
// we always get a non-Inf, non-NaN output.
if !isPositive(kp) || !isPositive(ti) || !isPositive(tt) {
return
}
if !isNormal(min) || !isNormal(max) || min > max {
return
}
// Use a random source, but make it deterministic.
rs := rand.New(rand.NewSource(800))
randFloat64 := func() float64 {
return math.Float64frombits(rs.Uint64())
}
p := NewPIController(kp, ti, tt, min, max)
state := float64(0)
for i := 0; i < 100; i++ {
input := randFloat64()
// Ignore the "ok" parameter. We're just trying to break it.
// state is intentionally completely uncorrelated with the input.
var ok bool
state, ok = p.Next(input, setPoint, 1.0)
if !isNormal(state) {
t.Fatalf("got NaN or Inf result from controller: %f %v", state, ok)
}
}
})
}

func TestIdleMarkWorkerCount(t *testing.T) {
const workers = 10
c := NewGCController(100, math.MaxInt64)
Expand Down
68 changes: 68 additions & 0 deletions src/runtime/mgcscavenge.go
Expand Up @@ -1114,3 +1114,71 @@ func (s *scavengeIndex) mark(base, limit uintptr) {
func (s *scavengeIndex) clear(ci chunkIdx) {
s.chunks[ci/8].And(^uint8(1 << (ci % 8)))
}

type piController struct {
kp float64 // Proportional constant.
ti float64 // Integral time constant.
tt float64 // Reset time.

min, max float64 // Output boundaries.

// PI controller state.

errIntegral float64 // Integral of the error from t=0 to now.

// Error flags.
errOverflow bool // Set if errIntegral ever overflowed.
inputOverflow bool // Set if an operation with the input overflowed.
}

// next provides a new sample to the controller.
//
// input is the sample, setpoint is the desired point, and period is how much
// time (in whatever unit makes the most sense) has passed since the last sample.
//
// Returns a new value for the variable it's controlling, and whether the operation
// completed successfully. One reason this might fail is if error has been growing
// in an unbounded manner, to the point of overflow.
//
// In the specific case of an error overflow occurs, the errOverflow field will be
// set and the rest of the controller's internal state will be fully reset.
func (c *piController) next(input, setpoint, period float64) (float64, bool) {
// Compute the raw output value.
prop := c.kp * (setpoint - input)
rawOutput := prop + c.errIntegral

// Clamp rawOutput into output.
output := rawOutput
if isInf(output) || isNaN(output) {
// The input had a large enough magnitude that either it was already
// overflowed, or some operation with it overflowed.
// Set a flag and reset. That's the safest thing to do.
c.reset()
c.inputOverflow = true
return c.min, false
}
if output < c.min {
output = c.min
} else if output > c.max {
output = c.max
}

// Update the controller's state.
if c.ti != 0 && c.tt != 0 {
c.errIntegral += (c.kp*period/c.ti)*(setpoint-input) + (period/c.tt)*(output-rawOutput)
if isInf(c.errIntegral) || isNaN(c.errIntegral) {
// So much error has accumulated that we managed to overflow.
// The assumptions around the controller have likely broken down.
// Set a flag and reset. That's the safest thing to do.
c.reset()
c.errOverflow = true
return c.min, false
}
}
return output, true
}

// reset resets the controller state, except for controller error flags.
func (c *piController) reset() {
c.errIntegral = 0
}
46 changes: 46 additions & 0 deletions src/runtime/mgcscavenge_test.go
Expand Up @@ -7,6 +7,7 @@ package runtime_test
import (
"fmt"
"internal/goos"
"math"
"math/rand"
. "runtime"
"runtime/internal/atomic"
Expand Down Expand Up @@ -707,3 +708,48 @@ func TestScavengeIndex(t *testing.T) {
find(0, 0)
})
}

func FuzzPIController(f *testing.F) {
isNormal := func(x float64) bool {
return !math.IsInf(x, 0) && !math.IsNaN(x)
}
isPositive := func(x float64) bool {
return isNormal(x) && x > 0
}
// Seed with constants from controllers in the runtime.
// It's not critical that we keep these in sync, they're just
// reasonable seed inputs.
f.Add(0.3375, 3.2e6, 1e9, 0.001, 1000.0, 0.01)
f.Add(0.9, 4.0, 1000.0, -1000.0, 1000.0, 0.84)
f.Fuzz(func(t *testing.T, kp, ti, tt, min, max, setPoint float64) {
// Ignore uninteresting invalid parameters. These parameters
// are constant, so in practice surprising values will be documented
// or will be other otherwise immediately visible.
//
// We just want to make sure that given a non-Inf, non-NaN input,
// we always get a non-Inf, non-NaN output.
if !isPositive(kp) || !isPositive(ti) || !isPositive(tt) {
return
}
if !isNormal(min) || !isNormal(max) || min > max {
return
}
// Use a random source, but make it deterministic.
rs := rand.New(rand.NewSource(800))
randFloat64 := func() float64 {
return math.Float64frombits(rs.Uint64())
}
p := NewPIController(kp, ti, tt, min, max)
state := float64(0)
for i := 0; i < 100; i++ {
input := randFloat64()
// Ignore the "ok" parameter. We're just trying to break it.
// state is intentionally completely uncorrelated with the input.
var ok bool
state, ok = p.Next(input, setPoint, 1.0)
if !isNormal(state) {
t.Fatalf("got NaN or Inf result from controller: %f %v", state, ok)
}
}
})
}

0 comments on commit 1073818

Please sign in to comment.