
runtime: scavenger not freeing all possible memory on darwin #47656

@josharian

This is effectively a follow-up to #29844. I am attempting to reduce our memory usage on iOS, where we are severely memory-constrained.

On darwin, sysUnused calls madvise(v, n, _MADV_FREE_REUSABLE). This marks the pages as reclaimable by the OS. However, unexpectedly, it does not mark all the pages as reclaimable. I do not understand why, but here's a way to reproduce it.
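For reference, sysUnused on darwin is roughly the following (paraphrased from runtime/mem_darwin.go; the comment is mine, not the runtime's):

// sysUnused marks the range as reusable by the kernel. _MADV_FREE_REUSABLE
// also updates the task's physical-footprint accounting, unlike plain
// _MADV_FREE, which is why the runtime uses it on darwin.
func sysUnused(v unsafe.Pointer, n uintptr) {
	madvise(v, n, _MADV_FREE_REUSABLE)
}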

The following program makes and then frees a single large byte slice. It pauses three times: before the allocation, after the allocation, and after the allocation has been freed.

package main

import (
	"runtime/debug"
	"time"
)

var b []byte

func main() {
	// Call time.Sleep and debug.FreeOSMemory once up front,
	// so that all basic runtime structures get set up
	// and all relevant pages get dirtied.
	time.Sleep(time.Millisecond)
	debug.FreeOSMemory()
	println("start")
	time.Sleep(5 * time.Second)

	b = make([]byte, 4_000_000)
	for i := range b {
		b[i] = 1
	}
	println("allocated")
	time.Sleep(5 * time.Second)

	b = nil
	debug.FreeOSMemory()
	time.Sleep(3 * time.Second) // wait for the scavenger's effects to be visible
	println("freed")
	time.Sleep(3 * time.Hour)
}

Running this on macOS, I use footprint and vmmap to measure memory usage at each pause point. Concretely, I build and run with go build -o jjj x.go && GODEBUG=allocfreetrace=1 ./jjj, and measure with footprint jjj && vmmap -pages -interleaved -submap jjj.

Before the alloc, for the Go heap, footprint reports:

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
1168 KB        0 B          0 B         38    untagged ("VM_ALLOCATE")

After the alloc:

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
5344 KB        0 B          0 B         39    untagged ("VM_ALLOCATE")

After the free:

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
4192 KB        0 B      1344 KB         40    untagged ("VM_ALLOCATE")

Note that 4192KB-1344KB=2848KB, which is considerably higher than the 1168KB we began with.

(The exact numbers vary slightly from run to run.)

We can get a glimpse into the details of the accounting using vmmap (with the flags listed above). For the Go heap, before the alloc:

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256       40       40        0   ] rw-/rwx SM=ZER  
VM_ALLOCATE               14000400000-14004000000  [ 3840        0        0        0   ] ---/rwx SM=NUL  

After the alloc:

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256      200      200        0   ] rw-/rwx SM=ZER  
VM_ALLOCATE               14000400000-14000800000  [  256       85       85        0   ] rw-/rwx SM=PRV  
VM_ALLOCATE               14000800000-14004000000  [ 3584        0        0        0   ] ---/rwx SM=NUL  

After the free:

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256      202      202        0   ] rw-/rwx SM=ZER  
VM_ALLOCATE               14000400000-14000800000  [  256       85        1        0   ] rw-/rwx SM=PRV  
VM_ALLOCATE               14000800000-14004000000  [ 3584        0        0        0   ] ---/rwx SM=NUL  

This lines up with what tracealloc said:

tracealloc(0x14000180000, 0x3d2000, uint8)

and then

tracefree(0x14000180000, 0x3d2000)

The large byte slice spans the 14000000000-14000400000 and 14000400000-14000800000 regions. However, the free appears to have marked only the pages in the 14000400000-14000800000 region as reclaimable: its dirty count dropped from 85 to 1, and 84 pages × 16 KB/page = 1344 KB, which is exactly what footprint reported as reclaimable. The pages in the 14000000000-14000400000 region are still marked as dirty.

As an experiment, I changed sysUnused to also call mprotect(v, n, _PROT_NONE) then mprotect(v, n, _PROT_READ|_PROT_WRITE). See tailscale@38ab03e.
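The modified sysUnused looks roughly like this (a sketch of the idea behind tailscale@38ab03e rather than the exact diff; it assumes an mprotect wrapper with the same calling convention as madvise):

func sysUnused(v unsafe.Pointer, n uintptr) {
	madvise(v, n, _MADV_FREE_REUSABLE)
	// Experiment: drop and then restore protection on the range.
	// Empirically, this makes the kernel discard the mappings and clears
	// the dirty accounting that madvise alone leaves behind.
	mprotect(v, n, _PROT_NONE)
	mprotect(v, n, _PROT_READ|_PROT_WRITE)
}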

Running again with this change, the unreclaimable space reported by footprint disappears. At the three pause points:

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
1168 KB        0 B          0 B         37    untagged ("VM_ALLOCATE")

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
5328 KB        0 B          0 B         38    untagged ("VM_ALLOCATE")

  Dirty      Clean  Reclaimable    Regions    Category
    ---        ---          ---        ---    ---
1584 KB        0 B          0 B         39    untagged ("VM_ALLOCATE")

We're not back down to 1168KB (I wish I knew why), but it's considerably better than 2848KB. vmmap shows more or less the same pattern as before:

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256       40       40        0   ] rw-/rwx SM=ZER
VM_ALLOCATE               14000400000-14004000000  [ 3840        0        0        0   ] ---/rwx SM=NUL

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256      188      188       12   ] rw-/rwx SM=ZER
VM_ALLOCATE               14000400000-14000800000  [  256       85       85        0   ] rw-/rwx SM=PRV
VM_ALLOCATE               14000800000-14004000000  [ 3584        0        0        0   ] ---/rwx SM=NUL

REGION TYPE                    START - END         [   VSIZE    RSDNT    DIRTY     SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
VM_ALLOCATE               14000000000-14000400000  [  256      197      197        5   ] rw-/rwx SM=ZER
VM_ALLOCATE               14000400000-14000800000  [  256       85        1        0   ] rw-/rwx SM=PRV
VM_ALLOCATE               14000800000-14004000000  [ 3584        0        0        0   ] ---/rwx SM=NUL

(Note that if you add the dirty and swapped pages together in the mprotect run, 188 + 12 = 200 and 197 + 5 = 202, they match the dirty page counts of the madvise run exactly.)

I don't know how to interpret all of this. But it looks a bit like madvise(v, n, _MADV_FREE_REUSABLE) isn't sufficient to fully return memory to the OS, perhaps because of something having to do with allocation regions.

I'm out of ideas for what/how to investigate from here, but I'm happy to follow up on suggestions.

cc @mknyszek @bradfitz @randall77
