@RLH and I have been discussing this problem for the past week. This issue is to track our thoughts.
The current GC pacer is conservative about scheduling mutator assists and this leads to a cycle where we over-assist (use too much CPU for GC) and under-shoot the heap goal.
Specifically, GC currently assumes it may have to scan the entire live heap before the live heap size reaches the heap goal, and it sets the assist ratio accordingly. If the reachable heap is growing, this conservative assumption is necessary to prevent the mutator from outpacing the garbage collector and spiraling the heap size up. However, in steady state (where the reachable heap size is roughly the same as it was in the last GC cycle), this means we'll finish garbage collection when we're less than 100/(GOGC+100) of the way from the trigger to the goal (1/2 of the way for GOGC=100; 1/4 of the way for GOGC=300; etc).
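To make the 100/(GOGC+100) figure concrete, here is a rough sketch of the arithmetic in Go. It is a simplified model, not the runtime's pacer code: it assumes the conservative scan-work estimate equals the heap goal, that the real scan work in steady state equals the reachable heap, and that only assists do the marking (background workers would finish the cycle even earlier, hence "less than"). The function and variable names are made up for illustration.

```go
package main

import "fmt"

// steadyStateFinish models one GC cycle in which mutator assists do all of
// the marking. live is the reachable heap in bytes, gogc is the GOGC setting,
// and trigger is the heap size at which the cycle starts. It returns the heap
// size when marking finishes and how far that is from trigger to goal (0..1).
//
// This is a hypothetical simplification, not the runtime's implementation.
func steadyStateFinish(live, gogc, trigger float64) (heapAtFinish, fraction float64) {
	goal := live * (1 + gogc/100)

	// Conservative assumption: everything up to the goal may need scanning
	// before the heap reaches the goal.
	assumedWork := goal

	// Scan work each allocated byte must pay for so that assumedWork is
	// done by the time the heap grows from trigger to goal.
	assistRatio := assumedWork / (goal - trigger)

	// In steady state only the reachable heap actually needs scanning, so
	// marking completes after this many bytes of allocation.
	allocated := live / assistRatio

	heapAtFinish = trigger + allocated
	fraction = allocated / (goal - trigger)
	return heapAtFinish, fraction
}

func main() {
	const live = 100e6 // 100 MB reachable heap, an arbitrary example value
	for _, gogc := range []float64{100, 300} {
		goal := live * (1 + gogc/100)
		trigger := 0.9 * goal // arbitrary example trigger
		finish, frac := steadyStateFinish(live, gogc, trigger)
		fmt.Printf("GOGC=%v goal=%.0f trigger=%.0f finish=%.0f fraction=%.2f\n",
			gogc, goal, trigger, finish, frac)
	}
}
```

In this model the fraction works out to live/goal, which is 100/(GOGC+100) wherever the trigger sits, so moving the trigger alone does not change how early the cycle finishes relative to the trigger-to-goal window.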
You might think the trigger controller would adapt by setting the GC trigger lower in response to the over-utilization, but in fact, this drives the trigger controller into a state of persistent CPU over-utilization and heap under-shoot. Currently, the trigger controller will leave the trigger where it is if we're anywhere on the white edge between CPU utilization and heap over-shoot shown below:
The under-shoot caused by the over-assist forces the trigger controller to the left of the ideal point on this curve. Since this persists, we eventually settle well to the left of the ideal, repeatedly setting the trigger too close to the goal, which leads to the next cycle setting a high assist ratio and again finishing early.
This is particularly visible in applications with a high allocation rate and/or high GOGC.

