proposal: runtime: add a mechanism for specifying a minimum target heap size #23044
I propose that we add a GC knob (either an environment variable or a function in runtime/debug) which allows users to set the minimum target heap size for the garbage collector.
For now I'll call this setting GOGCMIN pending a better name.
Right now, the one GC tunable available to users is GOGC. From the runtime documentation:
The idea is that when the target heap size is calculated based on live data and GOGC, if that target is less than GOGCMIN, then GOGCMIN is used instead.
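As a rough illustration of the rule described above, here is a minimal sketch. It assumes the simplified goal formula implied by the GOGC documentation (live heap plus GOGC percent of growth); `targetHeap` and its `minTarget` parameter are hypothetical names standing in for the proposed GOGCMIN setting, not anything in the runtime.

```go
package main

import "fmt"

// targetHeap returns the heap size at which the next collection would start
// under the proposed rule. liveBytes is the live heap after the previous
// cycle, gogc is the GOGC percentage, and minTarget is a hypothetical
// stand-in for GOGCMIN.
func targetHeap(liveBytes, gogc, minTarget uint64) uint64 {
	target := liveBytes + liveBytes*gogc/100
	if target < minTarget {
		return minTarget // the floor wins for small live heaps
	}
	return target
}

func main() {
	const mb = 1 << 20
	// 10 MB live at GOGC=100 would normally give a 20 MB goal.
	fmt.Println(targetHeap(10*mb, 100, 400*mb) / mb) // 400
	// 1 GB live already yields a goal above the floor.
	fmt.Println(targetHeap(1024*mb, 100, 400*mb) / mb) // 2048
}
```

Once the live heap is large enough, the floor has no effect and GOGC behaves as it does today.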
It has often been noted that programs which make a lot of allocations while maintaining a small live heap end up doing excessive garbage collections.
At my company, we've noticed this a number of times. It's typically a problem with data processing applications which might read and write messages from queues at a large rate, yet keep very little data live over the long term.
We've had to address CPU usage due to such excessive garbage collections for at least three separate applications. Here are two real situations we observed:
In all cases, we would prefer that the application use a lot more memory in order to do fewer collections.
The knob that's available for controlling this situation is GOGC, described above. When we've come across these issues in the past, we've set high GOGC values and that largely fixes the problem, at least in the short term.
Unfortunately, this is a fragile fix. We don't actually care about the GOGC ratio; we want to target a particular heap size. So if we have a 40 MB heap, we might back into a GOGC value like 1000 or even 10000 in order to target a 400 MB or 4 GB heap size, respectively. With large GOGC values, the application is extremely sensitive to small increases in the live heap size: if our 40 MB heap increases to 100 MB (not a large jump), then our 4 GB target becomes 10 GB.
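To make the sensitivity concrete, the arithmetic can be sketched as follows. This assumes the simplified goal of live heap times (1 + GOGC/100), so the figures come out slightly above the rounded 4 GB and 10 GB quoted above; `heapGoalMB` is a hypothetical helper.

```go
package main

import "fmt"

// heapGoalMB approximates the heap goal in MB for a given live heap and GOGC.
func heapGoalMB(liveMB, gogc uint64) uint64 {
	return liveMB + liveMB*gogc/100
}

func main() {
	for _, gogc := range []uint64{100, 10000} {
		fmt.Printf("GOGC=%d: 40 MB live -> %d MB, 100 MB live -> %d MB\n",
			gogc, heapGoalMB(40, gogc), heapGoalMB(100, gogc))
	}
	// Prints:
	// GOGC=100: 40 MB live -> 80 MB, 100 MB live -> 200 MB
	// GOGC=10000: 40 MB live -> 4040 MB, 100 MB live -> 10100 MB
}
```

The same 60 MB increase in live data moves the goal by 120 MB at GOGC=100 but by about 6 GB at GOGC=10000.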
In fact, we recently had crashes with one application where we set GOGC=1200 many months ago when its typical heap size was a few hundred MB. The live data size increased to several GB and then the service started OOMing.
One way to address the shortcomings of GOGC is to dynamically adjust its value. This is possible by using runtime/debug.SetGCPercent.
We tried a solution that involved a long-running goroutine in every application watching memory use (via runtime.ReadMemStats) and adjusting SetGCPercent.
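A minimal sketch of that kind of tuning loop might look like the following. It is not the code we ran: `gcPercentFor` and `tuneGC` are hypothetical names, the 400 MB target and one-second interval are arbitrary, and `HeapAlloc` is used only as a rough proxy for the live heap.

```go
package main

import (
	"runtime"
	"runtime/debug"
	"time"
)

// gcPercentFor picks a GOGC value whose heap goal is roughly targetBytes
// for the observed heap size. It never returns less than 100.
func gcPercentFor(heapBytes, targetBytes uint64) int {
	if heapBytes == 0 || targetBytes <= 2*heapBytes {
		return 100
	}
	return int((targetBytes - heapBytes) * 100 / heapBytes)
}

// tuneGC periodically re-derives GOGC from observed heap use until stop
// is closed.
func tuneGC(targetBytes uint64, interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			var ms runtime.MemStats
			runtime.ReadMemStats(&ms) // note: this briefly stops the world
			// HeapAlloc includes garbage not yet collected, so it
			// overestimates the live heap.
			debug.SetGCPercent(gcPercentFor(ms.HeapAlloc, targetBytes))
		}
	}
}

func main() {
	stop := make(chan struct{})
	go tuneGC(400<<20, time.Second, stop)
	// ... application work would go here ...
	close(stop)
}
```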
There are at least two problems with this approach:
We eventually settled on an awful workaround: we have a long-running goroutine manage a set of dummy allocations (ballast). When the heap is small, the ballast is large; if the live heap reaches the target size, the ballast shrinks to zero.
We can't pick the ballast as accurately as we would like because, again, we need the live heap size from the previous GC cycle. But by using the total non-ballast heap size as a conservative proxy, the solution works well enough. In particular, by keeping GOGC at a normal level (usually 100), we aren't subject to the heap size spike issue.
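The sizing rule for the ballast can be sketched like this. It is an illustration under assumptions rather than our production code: `ballastSize` is a hypothetical helper, and the 40 MB and 400 MB figures are just the example numbers used earlier.

```go
package main

import (
	"fmt"
	"runtime"
)

// ballastSize returns how many bytes of dummy allocation to hold so that
// the ballast plus the application's own heap adds up to roughly
// targetBytes. It returns zero once the application heap reaches the target.
func ballastSize(nonBallastHeap, targetBytes uint64) uint64 {
	if nonBallastHeap >= targetBytes {
		return 0
	}
	return targetBytes - nonBallastHeap
}

func main() {
	const mb = 1 << 20
	// A []byte holds no pointers, so the collector has nothing to trace
	// inside it; the ballast only inflates the heap size GOGC works from.
	ballast := make([]byte, ballastSize(40*mb, 400*mb))
	fmt.Println(len(ballast) / mb) // 360

	// ... application work would go here ...

	runtime.KeepAlive(ballast) // keep the ballast reachable until this point
}
```

A long-running goroutine would recompute `ballastSize` periodically and replace the slice when the result changes.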
Obviously this isn't a great solution for the long term since it wastes memory that could otherwise be used for something else (like disk cache).
A related idea is to have a mechanism for limiting the max heap size (see #16843 and other linked discussions). However, I believe that a min size is a much simpler problem to solve since it doesn't require application coordination (backpressure).
Also related to the old issue #9067.
I'm happy to give this the full proposal document treatment if that's useful. It seems like a simple idea that doesn't necessarily need it.
Based on my limited understanding of the garbage collector, this would be easy to implement and wouldn't add much complexity to the GC.
The root cause is that the GC does not know how much RAM is available, so it is understandably conservative. If the GC knows how much RAM it has to play with, then the problem could be transparently handled by the GC. As the related discussion notes, #16843 proposes a mechanism for specifying a maximum heap size. Once the GC knows the maximum heap available, it can use GC frequency to help determine when to start the next GC cycle. One idea is to increase heap size until the GC runs at a reasonable frequency or the maximum heap size is reached. While this proposal is much simpler, it does not address many of the issues #16843 addresses.
@RLH I agree that some solutions to #16843 might address this, but frankly the discussion around that problem has not looked promising from the outside. It seems like the theoretical and technical barriers to a good max-heap API are large.
Meanwhile, the GC frequency problems described in this issue are -- to us, at least -- more pressing than the max-heap issue, admit a much simpler solution, and have only bad workarounds today.
We are working our way through the issues related to #16843. We have a prototype implementation at https://go-review.googlesource.com/c/go/+/46751 that the community is welcome to comment on. Such public prototypes are intended to address the implementation barriers, while discussion of the theoretical barriers is part of the process. Adding knobs to the GC is a very heavyweight process.
That said, your problems are real. Perhaps a discussion concerning what the GC should use internally as the default minimum heap size, currently 4MB, is warranted. For example, would a default heap size derived from typical cloud virtual machine instances be a reasonable start? In December 2017 a standard 4 CPU instance (think GOMAXPROCS) comes with 16GB. A "high CPU" 4 CPU instance comes with 3.6 GB. #16854 could be used to limit heap size to something below the default so small heap functionality wouldn't be lost.
Thanks. It seems like CL 46751 doesn't address my issue, though. (At least I don't see how from the documentation; I haven't understood all of the code.)
Furthermore, I don't necessarily have any kind of pushback mechanism to react to the notifications and I would not like to replace the current fast-OOM behavior on excessively large heaps (good) with any kind of thrashing/death spiral (bad).
If the minimum were raised to, say, [40 MB × GOMAXPROCS] then that would be helpful to us.
I expect the people running Go on raspberry pis and such would have something to say about that though.
https://en.wikipedia.org/wiki/Raspberry_Pi indicates that Raspberry Pis have had 256MB / core since Gen 1. The latest one has 4 cores and 1GB RAM. I'm not that familiar with them but 40MB / core (GOMAXPROCS) doesn't seem unreasonable. The hope is that #16843 and the GC / contest implementations will provide the tools needed to avoid a death spiral. If an application doesn't provide any pushback then it will likely OOM. Nothing would prevent the GC from aborting if it is able to detect a death spiral.
I have a similar problem but a different use case. The program processes lots of data from disk and can run for a very long time. The memory usage of the Go program itself is not much of an issue, but the server it runs on also does other work. When the server is running other tasks, for example backups, it can run out of memory. This is not a problem for other applications, but the Go application crashes when it asks for a little more memory and is denied.
My preferred solution would be a setting on how to handle this situation.
@elvarb I believe you can already achieve the behavior you described on Linux by using cgroups and their OOM-related knobs: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
I ran into this again today with another service that was doing 15 collections/sec. This one is a fairly simple webserver that receives 10-15k requests per second and turns them into protobuf messages that are sent along the network to other systems for processing.
I think there are two basically orthogonal things going on here.
GC amortization failure
Assuming I understand the problem, I think what's actually going on here is a failure of the runtime to amortize GC costs. Currently, the GC goal (the payment) is in terms of heap size, but the actual GC cost is proportional not just to the heap size, but also to the size of the globals, the sizes of the stacks, and possibly some small fixed cost overhead. At large heaps, the relative contributions of the other factors tends toward zero, but at small heaps they can be a significant portion of the cost. (For the record, we've been here before: #19839 :)
@cespare, for the real situations where you've observed this, I'd love to know how much data those programs have in globals (add the sizes of the .data and .bss segments) and in stacks (roughly
Here's a trivial example program to demonstrate my thinking: https://play.golang.org/p/k69Zo0C7M1F. Here's the measured GC wall clock time on my laptop with GOMAXPROCS=1 (not necessary, but keeps things predictable) with two different sizes of globals as the heap grows:
Perhaps more to the point, we can look at the proportional GC cost:
The 4MB minimum is meant to truncate away the really, really bad proportional cost at the left, but even with just 10MB of globals that's clearly not enough.
Given this, I propose that we tweak the definition of GOGC to be proportional to the total cost of GC, which includes at least heap, globals, and stacks. For applications with larger heaps, the difference probably won't be noticeable. But I believe this may solve the "small heap problem" without the need for extra knobs or potentially-dangerous rate limiting.
For example, consider a program that has 100MB of live heap and 100MB of globals. In the current scheme, the footprint grows to 300MB total, but GC has to scan 200MB for 100MB of growth, making GC twice as expensive as an equivalent program with 0MB of globals. In the proposed scheme, the footprint would grow to 400MB total, but GC would scan 200MB for 200MB of growth, so the GC cost is identical to the program with 0MB of globals.
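The arithmetic in that example can be checked with a short sketch. `current` and `proposed` are hypothetical helpers that model only heap and globals (stacks are left out for brevity), with all sizes in MB.

```go
package main

import "fmt"

// current models today's rule: growth between cycles is GOGC percent of the
// live heap only. It returns the total footprint and the growth, in MB.
func current(heap, globals, gogc uint64) (footprint, growth uint64) {
	growth = heap * gogc / 100
	return heap + globals + growth, growth
}

// proposed models the tweaked rule: growth is GOGC percent of everything
// the GC has to scan (here heap plus globals).
func proposed(heap, globals, gogc uint64) (footprint, growth uint64) {
	growth = (heap + globals) * gogc / 100
	return heap + globals + growth, growth
}

func main() {
	const heap, globals, gogc = 100, 100, 100
	scan := uint64(heap + globals) // MB scanned per cycle in both schemes

	f, g := current(heap, globals, gogc)
	fmt.Printf("current:  footprint %d MB, scan %d MB per %d MB of growth\n", f, scan, g)
	f, g = proposed(heap, globals, gogc)
	fmt.Printf("proposed: footprint %d MB, scan %d MB per %d MB of growth\n", f, scan, g)
	// Prints:
	// current:  footprint 300 MB, scan 200 MB per 100 MB of growth
	// proposed: footprint 400 MB, scan 200 MB per 200 MB of growth
}
```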
Fixing GC amortization may or may not fix your problem (I'll have a better sense if you can measure the globals and stacks). However, based on some of the things you said, I think SetMaxHeap might be a reasonable solution, or I might be picking quotes too carefully. :)
This sounds like exactly what SetMaxHeap does. If you know how big you want the heap, you can set GOGC to ~infinity (in TeX tradition, say infinity=10000) and put the heap entirely under the control of SetMaxHeap.
I think the SetMaxHeap channel would let you do this sort of thing much more effectively. For example, if you're okay with the heap growing, you can use the channel to observe that the heap is under pressure and rather than reducing your application's heap usage (the normal use of the channel) you could raise the heap limit. This is cheap to do (doesn't trigger a GC). And it's okay if there's some lag because it's still a soft limit: in the worst case, the GC will expend some extra cycles trying to keep you under an unnecessary limit, but it's not going to OOM your process.
@aclements thanks very much for your detailed consideration.
I pulled some metrics for one of my example programs. Let me know if you need more. (These numbers are after removing my "ballast" workaround.)
Everything you're saying about SetMaxHeap sounds interesting. Should I be trying out CL 46751 and giving feedback for these use cases?
From the above, I'm assuming that the .data and .bss sizes are insignificant and that we can point the finger at the stack:heap proportion.
I looked at 3 other programs where we've noticed a high GC rate (not all of these were using enough CPU to warrant addressing) and I can confirm that they all have similar .data/.bss sizes.
This has been noted elsewhere (don't have the issues handy) but I also want to point out that this "small heap" problem is pretty common when running benchmarks (i.e., with
(How to write and interpret benchmarks in the face of GC is a more general -- and difficult -- problem, of course, but benchmarks are the other place where I end up fiddling with GOGC and it would be much easier to explain GOMINHEAP or SetMaxHeap than
This is unrelated to fixing the actual issue but just as a potential way to mitigate the downsides of the workaround: have you considered marking the space used by the ballast as
Hey @aclements, I tried out SetMaxHeap today and I think it will work for my purpose.
For example, I have a small test application which allocates several GB/sec but keeps almost no live data; without any mitigation it will run more than 1000 GCs/sec. I can set a 1 GB ballast and then it will use between 1-2 GB of memory and only GC 4x/sec. But instead, I can do
and it will use <1 GB of memory while still doing the same 4 GCs/sec.
In my case, I think that not reacting to the notification channel makes sense since I don't actually care about the heap limit.
Does this all sound like what you had in mind as far as using SetMaxHeap to address the problems I've outlined here?
This issue with managing GOGC definitely shows up in real prod environments. It isn't super common, but when it shows up, the workaround of dynamically adjusting GOGC tends to be pretty clunky.
@aclements to add to the discussion of GOGC_MIN (or GOGC_GB) vs SetMaxHeap, there are a few benefits I can think of for GOGC_GB from our internal experiences: