
runtime: make the scavenger more prompt #16930

Open
aclements opened this Issue Aug 30, 2016 · 8 comments

@aclements
Member

aclements commented Aug 30, 2016

Currently, there's a lag of at least five minutes (and potentially much longer) between the Go heap size shrinking and the process's RSS shrinking. This is done to amortize the cost of releasing and re-acquiring system memory, but it has several downsides:

  1. It's bad for memory-constrained environments, whether that constraint is 10MB or 100GB, since it makes it basically impossible to respond to resource changes by shrinking the heap. This is particularly noticeable in server environments that allow temporarily exceeding a soft limit, since it turns this soft limit into a hard limit.
  2. It prevents using memory for other useful purposes. There's no such thing as "unused memory". Memory not used by an application will be used by the OS for the buffer cache or the zero page cache, and free memory gives the OS the flexibility to perform optimizations like transparent huge pages or NUMA migration. In the mobile sphere, this is particularly noticeable to the user, since it means less space for application and resource caching, which directly affects user experience.
  3. Finally, it's confusing. It's clear from questions on golang-nuts and elsewhere that people expect their RSS to track their heap size and, in particular, their heap profiles.

I believe we should make the scavenger release memory promptly and aggressively.

I believe we can do this without significant overhead by improving the design of the scavenger. Currently the scavenger is careful to release only spans that have been unused for more than five minutes. This is largely wasted effort because virtual memory is fungible: there's a slight TLB locality boost from retaining very recently used memory (on the order of a few megabytes), but beyond that it doesn't matter which unused pages we return to the OS. The inflexibility of the scavenger has several downsides. Primarily, it requires many system calls to release sparse unused regions of memory, and these system calls have a very high per-call cost because the OS needs to do remote TLB invalidation, a cost that grows with the number of CPUs. It can also needlessly delay freeing memory, because coalescing two neighboring free spans takes the more recent "used" time of the two.
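
To make that coalescing problem concrete, here is a minimal Go sketch. The span type, its fields, and the coalesce function are hypothetical stand-ins, not the runtime's actual mspan representation; only the timestamp logic matters:

```go
package main

import (
	"fmt"
	"time"
)

// span is a hypothetical free run of heap pages.
type span struct {
	pages  int
	unused time.Time // when this span was last in use
}

// coalesce merges two adjacent free spans. Because the merged span
// takes the *more recent* unused time of the two, an old span that
// was already eligible for release can be "refreshed" by a young
// neighbor and kept resident for another five minutes.
func coalesce(a, b span) span {
	t := a.unused
	if b.unused.After(t) {
		t = b.unused
	}
	return span{pages: a.pages + b.pages, unused: t}
}

func main() {
	old := span{pages: 64, unused: time.Now().Add(-10 * time.Minute)}
	young := span{pages: 1, unused: time.Now()}
	merged := coalesce(old, young)
	// 65 pages now look recently used, even though 64 of them
	// have been idle for 10 minutes.
	fmt.Println(merged.pages, time.Since(merged.unused))
}
```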

I propose separating the concerns of how many pages to release from which pages to release.

Which pages to release should be chosen to minimize the number of sysUnused calls (madvise on Linux). Roughly, releasing n pages should first attempt to release a single n-page span; if no span of n or more pages is on the free list, it should release as few smaller spans as possible. There are many algorithms that satisfy this outline, and I suspect the details don't matter. It would be reasonable to adopt the exact algorithm used by tcmalloc, since that has been field-tested. We should preferentially release spans from the end of the free lists, since those have been least recently used and have the least preference for being reused.
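
As an illustration only, here is one greedy selection policy that fits this outline. The flat, size-sorted free list and the function name are hypothetical simplifications of the runtime's actual size-bucketed free lists:

```go
package main

import "fmt"

// span is a hypothetical free run of heap pages; a flat slice
// sorted by ascending size is enough to show the policy.
type span struct{ pages int }

// pickSpansToRelease chooses spans to return to the OS so that
// releasing at least n pages takes as few sysUnused (madvise)
// calls as possible. free must be sorted by ascending size.
func pickSpansToRelease(free []span, n int) []span {
	// Best case: one span of n or more pages satisfies the request
	// with a single syscall. Scanning in ascending order picks the
	// smallest such span, avoiding needless over-release.
	for _, s := range free {
		if s.pages >= n {
			return []span{s}
		}
	}
	// Otherwise, greedily take the largest spans first so that the
	// number of spans released (and hence syscalls made) stays as
	// small as possible.
	var picked []span
	for i := len(free) - 1; i >= 0 && n > 0; i-- {
		picked = append(picked, free[i])
		n -= free[i].pages
	}
	return picked
}

func main() {
	free := []span{{1}, {2}, {4}, {8}}
	fmt.Println(pickSpansToRelease(free, 6))  // [{8}]: one call
	fmt.Println(pickSpansToRelease(free, 12)) // [{8} {4}]: two calls
}
```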

How many pages to release is a harder question, but separating these concerns gives us the flexibility to choose a good answer. For example, the current policy can be expressed as retaining max(HeapInUse over the past 5 minutes). The "right" answer depends on the costs of releasing and re-acquiring memory, the future size of the heap, and the relative CPU versus memory costs we're willing to incur. Because of the "sawtooth" behavior of GC, we do at least know something about the future size of the heap: it will grow to the heap goal in the near future, and often that heap goal is in a rough steady state. Hence, as a simple heuristic, I propose we retain the heap goal times a "steady state variance" factor, with some hysteresis to retain cost amortization in case garbage collections are happening rapidly:

retain = C * max(current heap goal, max({heap goals of GCs over the past T seconds}))
C = 1 + ε = 1.1
T = 10 seconds
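
A minimal sketch of this heuristic, assuming the runtime tracks the heap goals of recent GCs; the function and parameter names here are hypothetical:

```go
package main

import "fmt"

// c is the "steady state variance" factor: C = 1 + ε = 1.1.
const c = 1.1

// retainTarget computes retain = C * max(current heap goal,
// max of heap goals over the past T seconds). recentGoals is a
// hypothetical record of the goals of GCs that finished within
// the last T = 10 seconds; how it is tracked is left out here.
func retainTarget(currentGoal uint64, recentGoals []uint64) uint64 {
	maxGoal := currentGoal
	for _, g := range recentGoals {
		if g > maxGoal {
			maxGoal = g
		}
	}
	return uint64(c * float64(maxGoal))
}

func main() {
	// A GC 5s ago ran with a 512MB goal; the goal has since
	// dropped to 256MB. The hysteresis keeps ~563MB retained
	// rather than scavenging straight down to ~282MB.
	fmt.Println(retainTarget(256<<20, []uint64{512 << 20, 300 << 20}))
}
```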

/cc @matloob @RLH

@quentinmit quentinmit added this to the Go1.8 milestone Sep 6, 2016

@quentinmit quentinmit added the NeedsFix label Oct 11, 2016

@quentinmit

Contributor

quentinmit commented Nov 7, 2016

ping @aclements
Is there any chance we can get this into 1.8? If this is a simple change, it would make a huge difference. If it's not a simple change, it's probably too late for 1.8.

@aclements

Member

aclements commented Nov 7, 2016

Not a simple change, I'm afraid. Moving to 1.9.

@aclements aclements modified the milestones: Go1.9Early, Go1.8 Nov 7, 2016

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017

@aclements

Member

aclements commented Jun 23, 2017

Closely related: #14045 (comment)

@aclements

Member

aclements commented Aug 31, 2017

@gopherbot

gopherbot commented Oct 12, 2018

Change https://golang.org/cl/141937 mentions this issue: runtime: eagerly scavenge based on global budget

@gopherbot

gopherbot commented Oct 17, 2018

Change https://golang.org/cl/142959 mentions this issue: runtime: track the last 16 heap goals

@gopherbot

gopherbot commented Oct 17, 2018

Change https://golang.org/cl/142960 mentions this issue: runtime: add background scavenger

@gopherbot

gopherbot commented Oct 18, 2018

Change https://golang.org/cl/143157 mentions this issue: runtime: remove periodic scavenging

@bradfitz bradfitz removed the early-in-cycle label Nov 3, 2018
