This is effectively a follow-up to #29844. I am attempting to reduce our memory usage on iOS, where we are severely memory-constrained.
On darwin, sysUnused calls madvise(v, n, _MADV_FREE_REUSABLE). This marks the pages as reclaimable by the OS. However, unexpectedly, it does not mark all the pages as reclaimable. I do not understand why, but here's a way to reproduce it.
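For context, sysUnused on darwin is essentially just that one call. Paraphrased (not copied exactly) from runtime/mem_darwin.go:

// Rough shape of the runtime's sysUnused on darwin (paraphrased; see
// runtime/mem_darwin.go for the real thing).
func sysUnused(v unsafe.Pointer, n uintptr) {
	// MADV_FREE_REUSABLE is like MADV_FREE, except that it also
	// updates the process's footprint accounting.
	madvise(v, n, _MADV_FREE_REUSABLE)
}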
The following program makes and then frees a single large byte slice. It pauses three times: before the allocation, after the allocation, and after the allocation has been freed.
package main

import (
	"runtime/debug"
	"time"
)

var b []byte

func main() {
	// Call time.Sleep and debug.FreeOSMemory once up front,
	// so that all basic runtime structures get set up
	// and all relevant pages get dirtied.
	time.Sleep(time.Millisecond)
	debug.FreeOSMemory()

	println("start")
	time.Sleep(5 * time.Second)

	b = make([]byte, 4_000_000)
	for i := range b {
		b[i] = 1
	}
	println("allocated")
	time.Sleep(5 * time.Second)

	b = nil
	debug.FreeOSMemory()
	time.Sleep(3 * time.Second) // wait for the scavenger's effects to be visible
	println("freed")
	time.Sleep(3 * time.Hour)
}
Running this on macOS, I use footprint to measure the app's footprint and vmmap to get the memory usage details at each pause point. Concretely, I build and run it with go build -o jjj x.go && GODEBUG=allocfreetrace=1 ./jjj, and measure it with footprint jjj && vmmap -pages -interleaved -submap jjj.
Before the alloc, for the Go heap, footprint reports:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
1168 KB   0 B     0 B           38        untagged ("VM_ALLOCATE")
After the alloc:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
5344 KB   0 B     0 B           39        untagged ("VM_ALLOCATE")
After the free:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
4192 KB   0 B     1344 KB       40        untagged ("VM_ALLOCATE")
Note that 4192 KB - 1344 KB = 2848 KB, which is considerably higher than the 1168 KB we began with.
(The exact numbers vary slightly from run to run.)
We can get a glimpse into the details of the accounting using vmmap (with the flags listed above). For the Go heap, before the alloc:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 40 40 0 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14004000000 [ 3840 0 0 0 ] ---/rwx SM=NUL
After the alloc:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 200 200 0 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14000800000 [ 256 85 85 0 ] rw-/rwx SM=PRV
VM_ALLOCATE 14000800000-14004000000 [ 3584 0 0 0 ] ---/rwx SM=NUL
After the free:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 202 202 0 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14000800000 [ 256 85 1 0 ] rw-/rwx SM=PRV
VM_ALLOCATE 14000800000-14004000000 [ 3584 0 0 0 ] ---/rwx SM=NUL
This lines up with what tracealloc said:
tracealloc(0x14000180000, 0x3d2000, uint8)
and then
tracefree(0x14000180000, 0x3d2000)
The large byte slice spans the 14000000000-14000400000 and the 14000400000-14000800000 regions. However, the free appears only to have marked the pages in the 14000400000-14000800000 region as reclaimable: the dirty count there dropped from 85 to 1 (84 pages at 16 KB/page = 1344 KB, which is exactly what footprint reported as reclaimable). The pages in the 14000000000-14000400000 region are still marked as dirty.
As an experiment, I changed sysUnused to also call mprotect(v, n, _PROT_NONE) and then mprotect(v, n, _PROT_READ|_PROT_WRITE). See tailscale@38ab03e.
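For anyone who wants to poke at this without rebuilding the runtime, here is a rough standalone sketch of the same sequence. It is not the patch above: it uses golang.org/x/sys/unix (assuming the darwin MADV_FREE_REUSABLE constant there) rather than the runtime's internal wrappers, and the mapping size is arbitrary.

// Hypothetical standalone sketch of the experiment: dirty a private
// anonymous mapping, mark it MADV_FREE_REUSABLE as sysUnused does,
// then cycle the protections as in the patch above. Inspect the
// process with footprint/vmmap while it sleeps.
package main

import (
	"log"
	"time"

	"golang.org/x/sys/unix"
)

func main() {
	const size = 4 << 20 // 4 MB, arbitrary

	// Map and dirty a private anonymous region.
	mem, err := unix.Mmap(-1, 0, size,
		unix.PROT_READ|unix.PROT_WRITE,
		unix.MAP_ANON|unix.MAP_PRIVATE)
	if err != nil {
		log.Fatal(err)
	}
	for i := range mem {
		mem[i] = 1
	}

	// What sysUnused does today: mark the pages as reclaimable.
	if err := unix.Madvise(mem, unix.MADV_FREE_REUSABLE); err != nil {
		log.Fatal(err)
	}

	// The extra step from the experiment: drop and restore protections.
	if err := unix.Mprotect(mem, unix.PROT_NONE); err != nil {
		log.Fatal(err)
	}
	if err := unix.Mprotect(mem, unix.PROT_READ|unix.PROT_WRITE); err != nil {
		log.Fatal(err)
	}

	time.Sleep(3 * time.Hour) // leave time for footprint/vmmap
}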
Running again with this change, the unreclaimable space reported by footprint disappears. At the three pause points:
Before the alloc:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
1168 KB   0 B     0 B           37        untagged ("VM_ALLOCATE")

After the alloc:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
5328 KB   0 B     0 B           38        untagged ("VM_ALLOCATE")

After the free:
Dirty     Clean   Reclaimable   Regions   Category
-----     -----   -----------   -------   --------
1584 KB   0 B     0 B           39        untagged ("VM_ALLOCATE")
We're not back down to 1168 KB (I wish I knew why), but it's considerably better than 2848 KB. vmmap shows more or less the same pattern as before:
Before the alloc:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 40 40 0 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14004000000 [ 3840 0 0 0 ] ---/rwx SM=NUL

After the alloc:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 188 188 12 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14000800000 [ 256 85 85 0 ] rw-/rwx SM=PRV
VM_ALLOCATE 14000800000-14004000000 [ 3584 0 0 0 ] ---/rwx SM=NUL

After the free:
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
VM_ALLOCATE 14000000000-14000400000 [ 256 197 197 5 ] rw-/rwx SM=ZER
VM_ALLOCATE 14000400000-14000800000 [ 256 85 1 0 ] rw-/rwx SM=PRV
VM_ALLOCATE 14000800000-14004000000 [ 3584 0 0 0 ] ---/rwx SM=NUL
(Note that if you add the dirty and swapped pages together in the mprotect run, they match the dirty page counts in the madvise run exactly.)
I don't know how to interpret all of this. But it looks a bit like madvise(v, n, _MADV_FREE_REUSABLE) isn't sufficient to fully return memory to the OS, perhaps because of something having to do with allocation regions.
I'm out of ideas for what/how to investigate from here, but I'm happy to follow up on suggestions.