runtime: scavenger not freeing all possible memory on darwin #47656
Labels: compiler/runtime, NeedsInvestigation
This is effectively a follow-up to #29844. I am attempting to reduce our memory usage on iOS, where we are severely memory-constrained.
On darwin, `sysUnused` calls `madvise(v, n, _MADV_FREE_REUSABLE)`. This marks the pages as reclaimable by the OS. However, unexpectedly, it does not mark all the pages as reclaimable. I do not understand why, but here's a way to reproduce it.

The following program makes and then frees a single large byte slice. It pauses three times: before the allocation, after the allocation, and after the allocation has been freed.
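For readers who want to try the reproduction, here is a minimal sketch of a program with the shape described above. It is not necessarily the exact program used for the measurements in this issue. The slice size (`allocSize`), the helper names (`pause`, `allocate`, `free`), the use of stdin for pausing, and the call to `debug.FreeOSMemory` to force the scavenger to run are all assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime/debug"
)

// allocSize is an assumed size; the issue only says "a single large byte slice".
const allocSize = 4 << 20

// sink keeps the slice reachable so the allocation cannot be optimized away.
var sink []byte

// pause prints a label and waits for a newline (or EOF) on stdin, leaving
// time to run footprint and vmmap against the process.
func pause(in *bufio.Reader, label string) {
	fmt.Printf("%s (pid %d); press enter to continue\n", label, os.Getpid())
	_, _ = in.ReadString('\n')
}

// allocate makes the slice and writes one byte every 4 KiB so that the
// backing pages are actually dirtied.
func allocate(n int) {
	sink = make([]byte, n)
	for i := 0; i < len(sink); i += 4096 {
		sink[i] = 1
	}
}

// free drops the only reference and asks the runtime to return as much
// memory to the OS as it can (assumed way of triggering the scavenger).
func free() {
	sink = nil
	debug.FreeOSMemory()
}

func main() {
	in := bufio.NewReader(os.Stdin)
	pause(in, "before alloc")
	allocate(allocSize)
	pause(in, "after alloc")
	free()
	pause(in, "after free")
}
```

At each prompt, the process can be inspected from another terminal before pressing enter.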
Running this on macOS, I use `footprint` to measure the app's footprint and `vmmap` to get the memory usage details, at each pause point. Concretely, I run `go build -o jjj x.go && GODEBUG=allocfreetrace=1 ./jjj` to run it and `footprint jjj && vmmap -pages -interleaved -submap jjj` to measure it.

Before the alloc, for the Go heap, `footprint` reports:

After the alloc:

After the free:
Note that 4192KB-1344KB=2848KB, which is considerably higher than the 1168KB we began with.
(The exact numbers vary slightly from run to run.)
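The accounting behind that comparison can be spelled out as a small sketch. It assumes that 4192KB is the Go heap total after the free and 1344KB is the portion `footprint` marks as reclaimable; the function name `unreclaimableKB` is made up for illustration.

```go
package main

import "fmt"

// unreclaimableKB returns the part of a footprint total that is not
// marked as reclaimable.
func unreclaimableKB(totalKB, reclaimableKB int) int {
	return totalKB - reclaimableKB
}

func main() {
	const (
		baselineKB    = 1168 // Go heap before the alloc
		totalKB       = 4192 // Go heap after the free (assumed to be the total)
		reclaimableKB = 1344 // portion reported as reclaimable
	)
	left := unreclaimableKB(totalKB, reclaimableKB)
	fmt.Printf("unreclaimable after free: %dKB\n", left)        // 2848KB
	fmt.Printf("excess over baseline: %dKB\n", left-baselineKB) // 1680KB
}
```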
We can get a glimpse into the details of the accounting using `vmmap` (with flags listed above). For the Go heap, before the alloc:

After the alloc:

After the free:
This lines up with what tracealloc said:
and then
The large byte slice spans the 14000000000-14000400000 and the 14000400000-14000800000 regions. However, the free appears to have marked only the pages in the 14000400000-14000800000 region as reclaimable. (84 pages = 1344KB, which is exactly what `footprint` reported as reclaimable.) The pages in the 14000000000-14000400000 region are still marked as dirty.

As an experiment, I changed `sysUnused` to also call `mprotect(v, n, _PROT_NONE)` then `mprotect(v, n, _PROT_READ|_PROT_WRITE)`. See tailscale@38ab03e.

Running again with this change, the unreclaimable space reported by `footprint` disappears. At the three pause points:

We're not back down to 1168KB (I wish I knew why), but it's considerably better than 2848KB.
`vmmap` shows more or less the same pattern as before:

(Note that if you add the dirty and swapped pages together in the `mprotect` run, they match the `madvise` run dirty pages count exactly.)

I don't know how to interpret all of this. But it looks a bit like `madvise(v, n, _MADV_FREE_REUSABLE)` isn't sufficient to fully return memory to the OS, perhaps because of something having to do with allocation regions.

I'm out of ideas for what/how to investigate from here, but I'm happy to follow up on suggestions.
cc @mknyszek @bradfitz @randall77