Today, it's possible (though unlikely) for a goroutine to allocate beyond the hard heap goal.
Consider the following scenario:
A goroutine (G1) spins, making many large allocations in a loop.
The heap is small, or there isn't actually much mark work, so the mark phase is paced to start fairly late (let's say at 95 MiB for a 100 MiB heap goal).
G1 is forced to assist the GC, and in doing so accumulates a lot of credit relative to the heap size (let's say 30 MiB for a 100 MiB heap goal because it over-assists -- not unheard of).
Through background work and assists, nearly all the work in the GC is done, right on schedule.
Background GC goroutines try to terminate the mark phase, but something repeatedly prevents them from doing so (e.g. #40459, or more work is found); perhaps we just get unlucky.
G1, having accumulated a bunch of credit, continues allocating in parallel during this period, unimpeded.
Very little time passes, but G1 successfully uses 16 MiB of its credit to generate 16 MiB of allocations, leading to a heap of 111 MiB before sweep.
We've now exceeded the hard heap goal, which is currently 1.1 * the heap goal (so 110 MiB in this example).
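The arithmetic in this scenario can be checked with a small sketch. This is not runtime code: `hardGoal` and `heapAfterSpend` are hypothetical helpers made up for illustration, and the numbers are just the ones from the example above.

```go
package main

import "fmt"

const MiB = 1 << 20

// hardGoal is a hypothetical helper: the hard heap goal is taken to be
// 1.1x the heap goal, as stated in this issue.
func hardGoal(heapGoal int64) int64 {
	return heapGoal + heapGoal/10
}

// heapAfterSpend is a hypothetical helper: it models a goroutine that
// allocates `spent` bytes out of its assist credit with nothing pushing back.
func heapAfterSpend(heapAtMarkStart, spent int64) int64 {
	return heapAtMarkStart + spent
}

func main() {
	const heapGoal = 100 * MiB

	// Mark starts at 95 MiB; G1 then spends 16 MiB of its credit.
	heap := heapAfterSpend(95*MiB, 16*MiB)

	fmt.Printf("heap=%d MiB hard goal=%d MiB exceeded=%v\n",
		heap/MiB, hardGoal(heapGoal)/MiB, heap > hardGoal(heapGoal))
}
```

Note that G1 still has 14 MiB of its 30 MiB credit left at this point, so nothing in the credit system would stop it from going further.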
The fundamental problem here is that there is nothing to push back on G1 while the GC is nearly done, or after we've exceeded the soft goal. The assist credit system is the sole mechanism through which a goroutine may be blocked and forced to assist, and a goroutine that still has credit is never asked to assist at all. This credit mechanism is not necessarily wrong, since it's what allows us to keep the assist cost down by amortizing it. But once we're in this regime, where the most important thing is to finish the GC or we're in danger of exceeding the hard goal, it stands to reason that the goroutine should not be allowed to allocate. Blocking allocation directly may have significant negative consequences, so a real solution here needs more thought.
With larger heaps this is even less likely to happen (because allocating fast enough is hard to do), but still possible.
Here's a quick fix idea: what if a goroutine could never have more credit than the amount of heap runway left (basically the difference between the heap goal and the heap size at the point the assist finishes)? Then by construction a goroutine could never allocate past the heap goal without going into assist first and finishing off the GC cycle. The downside is you could have many goroutines try to end GC at the same time (go into gcMarkDone, I mean, and start trying to acquire markDoneSema), stalling the whole program a little bit as everything waits for GC to actually end. This situation should be exceedingly rare, though, and only potentially common when GOMAXPROCS=1 in which case there will only ever be 1 goroutine in gcMarkDone.
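A minimal sketch of that capping rule, to make the idea concrete. It is not the runtime's implementation: `capCredit` and its parameters are hypothetical names, and the figures in `main` assume the assist finishes with the heap still at about 95 MiB.

```go
package main

import "fmt"

const MiB = 1 << 20

// capCredit is a hypothetical helper: it limits a goroutine's assist credit
// to the heap runway left when its assist finishes, where runway is the
// heap goal minus the current heap size (never negative).
func capCredit(credit, heapGoal, heapLive int64) int64 {
	runway := heapGoal - heapLive
	if runway < 0 {
		runway = 0 // already at or past the goal: no credit is kept
	}
	if credit > runway {
		return runway
	}
	return credit
}

func main() {
	// Assumed figures based on the scenario in the issue: 30 MiB of
	// accumulated credit, a 100 MiB heap goal, heap at 95 MiB.
	capped := capCredit(30*MiB, 100*MiB, 95*MiB)
	fmt.Printf("credit after cap: %d MiB\n", capped/MiB)
}
```

Under these assumptions G1 keeps only 5 MiB of credit, so it has to go back into assist once the heap reaches the 100 MiB goal rather than running on to 111 MiB.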