Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
runtime: long pauses STW (sweep termination) on massive block allocation #31222
What version of Go are you using (
To be clear, the reproducer is making a 3.2 GiB allocation which is going to take a while for the OS to fulfill.
However, I dug into this a bit and I'm seeing ~1s pauses, sometimes with only ~50ms pauses.
The 50ms pauses represent a lower bound for how fast the OS can give us back pages. The ~1s pauses actually come from the fact that we're zeroing the entire 3.2 GiB allocation, even though it just came from the OS.
We have an optimization which avoids re-zeroing fresh pages, but unfortunately this optimization is conservative. For example, if a needzero span at the end of the heap is contiguous with the new 3.2 GiB allocation, then those two are going to coalesce, and the 3.2 GiB free space will be marked needzero, when really only the first N KiB of it actually needs to be zeroed.
One way we could fix this is by making needzero a number, and if we can guarantee that only the first N KiB need zeroing, then we only zero that. If it's some weird swiss cheese of non-zeroed memory then we just assume the whole thing needs zeroing, rather than trying to keep track of which parts need zeroing.
I'm not sure if this fix is worth it though, since future allocations would still have to go back and zero this memory anyway, causing the 1s delay to return.
In case others are wondering if they are running into this: