runtime: mechanism for monitoring heap size #16843

Open
bradfitz opened this Issue Aug 22, 2016 · 56 comments

@bradfitz
Member

bradfitz commented Aug 22, 2016

Tracking bug for some way for applications to monitor memory usage, apply backpressure, stay within limits, etc.

Related previous issues: #5049, #14162

@rgooch

rgooch commented Aug 23, 2016

Can you please expand on what you have in mind?

@bradfitz

Member

bradfitz commented Aug 23, 2016

I have nothing specific in mind. This bug was filed as part of a triage meeting with a bunch of us. One bug (#5049) was ancient with no activity and one bug (#14162) proposed a solution instead of discussing the problem.

This bug is recognition that there is a problem, and we've heard problem statements and potential solutions (and sometimes both) from a number of people.

The reality is that there are always memory limits, and it'd be nice for the Go runtime to help applications stay within them, through perhaps some combination of limiting itself, and/or helping the application apply backpressure when resources are getting tight. That might involve new runtime API surface to help applications know when things are getting tight.

/cc @nictuku also.

@bradfitz

Member

bradfitz commented Aug 23, 2016

Btw, there was lots of good conversation at #14162 and it wasn't our intention to kill it or devalue it. It just didn't fit the proposal process, and we also didn't want to decline it, nor close it as a dup of #5049.

Changing the language is out of scope, so all discussions of things like catching memory allocation failures, language additions like "trymake" or "tryappend", etc, are all not going to happen.

But we can add runtime APIs to help out. That's what this bug is tracking.

/cc @matloob @aclements

@rgooch

rgooch commented Aug 23, 2016

Agreed. "try*" isn't practical. It would require changing too many call sites and even then would not catch all allocations. Adding runtime.SetSoftMemoryLimit() still seems like the best approach.

@nictuku

Contributor

nictuku commented Aug 23, 2016

It would be nice to have the ability to set a limit to the memory usage.

After a limit is set, perhaps the runtime could provide a clear indication that we're under memory pressure and that the application should avoid creating new allocations. Example new runtime APIs that would help:

  • func InMemoryPushback() bool; or
  • func RegisterPushbackFunc(func(inPushback bool))

That would provide a clear signal to the application. How exactly that's decided should be an internal implementation decision and not part of the API. An example implementation, to illustrate: if we limit ourselves to the heap size specified by the user, we could trigger GC whenever the used heap is close to the limit. Then we could enter pushback whenever the GC performance (latency or CPU overhead) is outside certain bounds. Apply smoothing as needed.

The approach suggested by this API has limitations.

For example, it's still possible for an application that is behaving well to do one monstrous allocation after it has checked for the pushback state. This would be common for HTTP and RPC servers that do admittance control at the beginning of the request processing. If the monstrous allocation would bring the memory heap above the limit, Go should probably panic. Since we don't want to change the language to add memory allocation error checks, I think this is fine. And we have no other option :).

Another problem is that deciding what is the right time to pushback can be hard. Whatever the runtime implements, some folks may find it too aggressive (pushing back too much, leading to poor resource utilization) or too conservative (pushing back too late, leading to high latency due to excessive GC). I guess the go team could provide a knob similar to GOGC to control the pushbackiness of the runtime, if folks are really paranoid about it.

@RLH

Contributor

RLH commented Aug 23, 2016

The runtime could set up a channel and send a message whenever it completes
a GC. The application could have a heap monitor goroutine (HMG) watching
that channel. Whenever the HMG gets a message it inspects the state of the
heap. To determine the size of the heap the HMG would look at the live heap
size and GOGC. If need be it could adjust GOGC so that the total heap does
not exceed whatever limit the application finds appropriate. If things are
going badly for the application the HMG can start applying back pressure to
whatever part of the application is causing the increase in heap size. The
HMG would be part of the application so a wide variety of application
specific strategies could be implemented.

Trying to pick up the pieces after a failure does not seem doable. Likewise,
deciding what is "close to a failure" is very application specific and a
global metric that potentially involves external OS issues such as
co-tenancy as well as other issues well beyond the scope of the Go runtime.
Decisions and actions need to be made well ahead if one expects them to
reliably prevent an OOM.

I believe this is where we were headed in #14162, and this is a recap of
some of that discussion.

I would be interested in what useful policy could not be implemented using
the HMG mechanism and current runtime mechanisms.
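
As a rough illustration of the heap monitor goroutine described above, the sketch below reads runtime.MemStats and adjusts GOGC through debug.SetGCPercent so that the projected heap goal stays under an application-chosen limit. The runtime has no channel that fires after each GC, so a ticker stands in for it here. The names gcPercentFor and monitorHeap, the 1 GB limit, the 10..100 clamp, and the use of HeapAlloc as a stand-in for the live heap are all assumptions of this sketch, not existing APIs or recommended values.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// gcPercentFor is a hypothetical helper. It returns a GOGC value such that
// live*(1+GOGC/100) stays at or below limit, clamped to
// [minPercent, maxPercent]. maxPercent must not be negative.
func gcPercentFor(live, limit uint64, minPercent, maxPercent int) int {
	if live == 0 {
		return maxPercent
	}
	if limit <= live {
		return minPercent
	}
	p := (limit - live) * 100 / live
	if p > uint64(maxPercent) {
		return maxPercent
	}
	if int(p) < minPercent {
		return minPercent
	}
	return int(p)
}

// monitorHeap is the application-side heap monitor. A ticker replaces the
// per-GC notification discussed in the thread, which does not exist today.
func monitorHeap(limit uint64, interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	var ms runtime.MemStats
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			runtime.ReadMemStats(&ms)
			// HeapAlloc overestimates the live heap between GCs, which
			// errs on the conservative side for this purpose.
			debug.SetGCPercent(gcPercentFor(ms.HeapAlloc, limit, 10, 100))
		}
	}
}

func main() {
	stop := make(chan struct{})
	go monitorHeap(1<<30, 100*time.Millisecond, stop)
	time.Sleep(250 * time.Millisecond)
	close(stop)

	// 600 MB live under an 800 MB limit leaves room for about a third more.
	fmt.Println(gcPercentFor(600<<20, 800<<20, 10, 100))
}
```

Application-specific back pressure would hook in where the computed percentage hits the lower clamp, since at that point lowering GOGC further no longer buys headroom.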

@rgooch

rgooch commented Aug 23, 2016

I previously gave the reasoning why using a channel or a callback to receive memory exceeded events won't work: #14162
That same reasoning applies to a channel whenever a GC run is completed.

To robustly handle exceeding a memory limit, the check for the limit has to be part of the allocator, not done after a GC run. This is because you can't afford to wait. If you wait for the next GC run, it may be too late. Consider a single large slice allocation that would put you over the soft limit and would exceed the hard memory limit. You'll get an OOM panic. The same applies to a callback function.

You need to immediately stop the code which is doing the heavy allocating. To do that you need a check in the allocator and you need to send a panic(). It's up to the application to set the soft memory limit at which these optional, catchable panics are sent.

Please, before rehashing old suggestions or coming up with new variants, read through #14162 where I gave the reasoning why a panic and a check in the allocator is needed. Otherwise we keep covering the same old ground.

@quentinmit

Contributor

quentinmit commented Aug 23, 2016

@rgooch If you are allocating giant arrays, you probably know exactly where in your code that is happening, and you can add code there to first check if there is enough memory available. You can even do that using the GC information we're discussing passing down a channel.

I do think there is a race here, but in the opposite case - if code is sitting in a tight loop making many small allocations, your channel read/callback might not run in time to actually trigger a new GC soon enough without OOMing.

@rgooch

rgooch commented Aug 23, 2016

I discussed all this in #14162: you can be reading GOB-encoded data from a network connection. No way to know ahead of time how big it's going to be. Or it can be some other library you don't control where a lot of data are allocated, whether a single huge slice or a lot of small allocations. The point is, you don't know how much will be allocated before you enter the library code and you've got no way to reach in there and stop things if you hit some pre-defined limit. And, as you say, if you're in a loop watching allocations, even if you could stop things, you may not get there in time. Spinning in a loop watching the memory level is grossly expensive. This needs to be tied to the allocator.

@RLH

Contributor

RLH commented Aug 23, 2016

This does not propose a callback or channel for delivering a memory
exceeded message or a memory almost exceeded message. At that point it is
already too late. This proposes a mechanism for providing the application
timely information that it can use to avoid the OOM. The application knows
how best to predict memory usage and, if need be, throttle its memory usage.

One suggestion was
func runtime.ReserveOOMBuffer(size uint64)

The application's heap monitor goroutine, HMG, could initially allocate a
large object of the required size and retain a single reference to it. If
the HMG using information provided by the runtime determines that the
current GOGC and live heap size will not support the application's
predicted allocations then it can release that single reference confident
that the next GC will recover those spans and make them available. If the
HMG wants the GC to happen sooner than currently scheduled then it can
lower GOGC using SetGCPercent.

If ReserveOOMBuffer is the API that some Go application needs then this
provides it. The intent of this proposal is to provide the application with
the information it needs to create the abstractions that best fit its needs
while minimizing Go's runtime API surface.
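
A minimal sketch of that application-level reserve, assuming a hypothetical oomReserve type (nothing here is a runtime API). It only shows the "hold one reference, then drop it" mechanics; deciding when to release is left to the heap monitor.

```go
package main

import (
	"fmt"
	"sync"
)

// oomReserve is a hypothetical application-level stand-in for the suggested
// ReserveOOMBuffer: it keeps the only reference to one large allocation.
type oomReserve struct {
	mu  sync.Mutex
	buf []byte
}

// newOOMReserve allocates the reserve up front, typically at startup.
func newOOMReserve(size int) *oomReserve {
	return &oomReserve{buf: make([]byte, size)}
}

// Release drops the single reference so that a later GC can reclaim the
// memory. It returns the number of bytes given up (0 if already released).
func (r *oomReserve) Release() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	n := len(r.buf)
	r.buf = nil
	return n
}

func main() {
	reserve := newOOMReserve(64 << 20)
	fmt.Println("released bytes:", reserve.Release())
	fmt.Println("released bytes:", reserve.Release())
}
```

The sketch never touches the buffer's pages, so how much physical memory it actually pins is OS-dependent; it does count toward the heap size the GC sees.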

@dr2chase

Contributor

dr2chase commented Aug 25, 2016

As I read this, #14162 describes a workload where (analogy follows) sometimes the python attempts to swallow a rhino, and if the attempt is not halted ASAP it is guaranteed to end badly. Is it in fact the case that the rhino will never be successfully swallowed? (I can imagine DOS attacks on servers where this might be the case.)

I think that the periodic notification scheme is intended to deal with a python diet of a large number of smaller prey; if an application has the two constraints of m=memory < M and l=latency < L, and if m is affine in workload W (reasonable assumption) and l is also affine in workload W (semi-reasonable), then simply comparing observed m with limit M and observed l with limit L tells you how much more work can be admitted (W' = W * min(M/m, L/l)), with the usual handwaving around unlucky variations in the input and lag in the measurement. It's possible to adjust GOGC up or down if M/m and L/l are substantially different, so as to maximize the workload within constraints -- this however also requires knowledge of the actual GC overhead imposed on the actual application (supposed to be 25% during GC, but high allocation rates change this). One characteristic of this approach is that a newly started application might not snap online immediately at full load, but would increase its intake as it figured out what load it could handle.

But this is no help for intermittent rhino-swallowing.
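
To make the admission formula above concrete, here is a small worked sketch of W' = W * min(M/m, L/l). The function name and the sample figures (requests per second, GB, milliseconds) are made up for illustration, and it inherits the affine-in-workload assumption stated above.

```go
package main

import (
	"fmt"
	"math"
)

// admissibleWork returns W' = W * min(M/m, L/l): the workload that should
// fit within both the memory limit and the latency limit, assuming memory
// and latency scale with workload.
func admissibleWork(w, m, memLimit, l, latLimit float64) float64 {
	return w * math.Min(memLimit/m, latLimit/l)
}

func main() {
	// Hypothetical observation: 1000 req/s uses 6 GB (limit 8 GB) at
	// 40 ms latency (limit 50 ms). Latency is the tighter constraint.
	fmt.Println(admissibleWork(1000, 6, 8, 40, 50)) // 1250
}
```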

@quentinmit quentinmit added this to the Go1.8Maybe milestone Sep 6, 2016

@jessfraz

Contributor

jessfraz commented Oct 5, 2016

@bradfitz would you be open to me taking some of the ideas from #14162 and applying the Go proposal process so they are considered? As long as, of course, the proposed solution doesn't break the API or change the language.

@bradfitz

Member

bradfitz commented Oct 5, 2016

As long as the proposal isn't to "make it possible to catch failed memory allocations", which I'm pretty sure everybody agrees isn't going to happen.

But any proposal should address or at least consider the whole range of related issues in this space. (back pressure, runtime & applications being aware of limits & usage levels)

@jessfraz

Contributor

jessfraz commented Oct 5, 2016

I was thinking a couple additions to the runtime package to expose information that might be useful for applications like you said in #16843 (comment)

@rsc rsc modified the milestones: Go1.9, Go1.8Maybe Oct 21, 2016

@juliandroid

juliandroid commented Feb 20, 2017

Is there any decision about how this would be properly implemented?

Perl documents a notorious $^M global variable that user code can initialize to some lengthy string, which in case of an out-of-memory error can be used as an emergency memory pool after die()ing. However, I couldn't find a working example, and it seems that feature was never implemented.

Still, it seems like a logical approach. You are most probably in a multi-tenant environment, sharing memory with other Go and non-Go programs, so the only buffer you can rely on is the emergency one you allocated yourself. Having the Go runtime use that memory when memory is low, and immediately notify the subscribed process that it is running out of memory, seems like a good measure to prevent pure Go programs from panicking.

@nictuku

Contributor

nictuku commented Feb 20, 2017

My proposal is here: https://docs.google.com/document/d/1zn4f3-XWmoHNj702mCCNvHqaS7p9rzqQGa74uOwOBKM/edit

I hope to have an implementation open sourced soon. I don't know if it could be included in the standard libraries.

I would like to make it as robust as possible, so if you'd like to test it, please drop me an email (see my github profile) and I'll contact you later. Thanks!

@rgooch

rgooch commented Mar 8, 2017

This proposal looks interesting. I made a couple of comments in the document:

  1. Support the pattern of pre-allocating at startup (up to a percentage of the VM/container memory) and never give that memory back to the OS

  2. Have a hard memory limit and push back+GC harder as you get closer to the limit.

@CAFxX

Contributor

CAFxX commented Mar 8, 2017

Added feedback to optionally trigger orderly application shutdown when GC pacing fails to keep memory below the set maximum.

@tve

tve commented May 11, 2017

I'm dealing with an app that runs out of memory (on a 16GB box) and that eventually led me here. Some of the notes I took along the way are below; apologies if these fall into a "yeah, we know" category.

  • On 64-bit linux, I hit the out-of-memory panic in sysMap in mem_linux.go:216, but when I look up the call stack I see it passing through grow in mheap.go:774 and the code leads me to believe that if sysMap had returned an error instead of just panicking then grow could have tried a smaller allocation.
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x8a2de5, 0x16)
        /usr/local/go/src/runtime/panic.go:596 +0x95
runtime.sysMap(0xc437a10000, 0x5800000, 0xc420394800, 0xaebef8)
        /usr/local/go/src/runtime/mem_linux.go:216 +0x1d0
runtime.(*mheap).sysAlloc(0xad31a0, 0x5800000, 0x421b81)
        /usr/local/go/src/runtime/malloc.go:428 +0x374
runtime.(*mheap).grow(0xad31a0, 0x2c00, 0x0)
        /usr/local/go/src/runtime/mheap.go:774 +0x62
runtime.(*mheap).allocSpanLocked(0xad31a0, 0x2c00, 0xaceb30)
        /usr/local/go/src/runtime/mheap.go:678 +0x44f
  • I'm running in a container env where the container has a max memory set and I'm trying to understand what fraction of that can realistically be "in_use". It appears that I have to account for anywhere from 25% to 50% overhead. E.g., if the cgroup has memory=16GB then the actual in-use heap data structures may be in the 8GB..12GB range before I hit the out-of-memory panic. On the one hand, with GC that's perhaps in the reasonable ballpark; on the other hand, this does represent $$.
  • The amount of "unused heap overhead" seems to be tunable using the GOGC env variable; I didn't see a way to modify this at run-time. For example, while the process is far from its limit, using 100% reduces GC overhead, but when it reaches perhaps 60% of its limit I may want to change it to 20% to trade memory vs cpu. In my app I see it going from 1% to 6% of cpu overhead.
  • I'm very interested in being able to capture control when the process runs out of memory or is about to. I understand that in the absolute this is a difficult problem, but I'm looking at it from a troubleshooting perspective. I would first use it to output a memory profile or similar information so I can understand how much memory is allocated where, plus some info about GC (e.g. allocated but unused space). It would be OK for this to trigger before absolutely-out-of-memory occurs, e.g. the first time the runtime gets back-pressure from the OS (see first bullet point).
  • I do believe that many services can adjust their memory consumption by, broadly speaking, adjusting the concurrency. For example, an HTTP server can adjust the number of requests that are concurrently processed. I believe the runtime.MemStats info is sufficient for this purpose, but it could be enhanced by having some callback mechanism when a threshold is exceeded. E.g., a web server could block processing of new requests when 80% of available memory is used and only resume when it drops below 75%.
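
The 80%/75% idea in the last bullet is a hysteresis gate. A minimal sketch, assuming a hypothetical pushbackGate type fed with usage numbers the application obtains elsewhere (for example from runtime.MemStats):

```go
package main

import "fmt"

// pushbackGate is a hypothetical admission gate with hysteresis: it starts
// blocking at the high watermark and only resumes below the low one.
// It is not safe for concurrent use without external locking.
type pushbackGate struct {
	high, low float64 // fractions of the limit, e.g. 0.80 and 0.75
	blocked   bool
}

// Update records the current usage and reports whether new requests
// should be held back.
func (g *pushbackGate) Update(used, limit uint64) bool {
	frac := float64(used) / float64(limit)
	switch {
	case frac >= g.high:
		g.blocked = true
	case frac < g.low:
		g.blocked = false
	}
	// Between the two watermarks the previous decision is kept.
	return g.blocked
}

func main() {
	g := &pushbackGate{high: 0.80, low: 0.75}
	for _, used := range []uint64{70, 82, 78, 74} {
		fmt.Println(used, g.Update(used, 100))
	}
}
```

The gap between the two watermarks keeps the server from flapping between accepting and rejecting when usage hovers near the threshold.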

Overall I concur with the sentiment that most apps that run out of memory will run out of memory regardless of how fancy a mechanism is added to the current situation. For this reason if I had a vote I would vote for adding some additional simple hooks so one can do some tuning and foremost troubleshoot when an app does run out of memory.

@aclements

Member

aclements commented May 11, 2017

On 64-bit linux, I hit the out-of-memory panic in sysMap in mem_linux.go:216, but when I look up the call stack I see it passing through grow in mheap.go:774 and the code leads me to believe that if sysMap had returned an error instead of just panicking then grow could have tried a smaller allocation.

I'm not sure what you're suggesting, exactly. grow can reduce its request by at most 64 KB, which probably isn't going to help when a multi-gigabyte heap is running out of room.

I'm running in a container env where the container has a max memory set and I'm trying to understand what fraction of that can realistically be "in_use". It appears that I have to count for anywhere from 25% to 50% overhead.

Assuming you mean runtime.MemStats.HeapInUse (and friends), note that this can vary depending on where you are in a GC cycle. Perhaps more interesting is MemStats.NextGC, which tells you what heap size this GC cycle is trying to keep you below. This changes only once per GC cycle.

The amount of "unused heap overhead" seems to be tunable using the GOGC env variable, I didn't see a way to modify this at run-time.

runtime/debug.SetGCPercent lets you change this. Right now this triggers a full STW GC, but in Go 1.9 this operation will let you change GOGC on the fly without triggering a GC (unless you set it low enough that you have to immediately start a GC, of course :)
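
For reference, a short sketch of the two calls mentioned above. runtime.ReadMemStats, MemStats.NextGC and debug.SetGCPercent are existing APIs; the tightenGC helper and the value 20 are illustrative assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// tightenGC is a hypothetical helper: it sets GOGC to percent and returns a
// function that restores the previous setting. The restore function reports
// the value it replaced.
func tightenGC(percent int) (restore func() int) {
	prev := debug.SetGCPercent(percent)
	return func() int { return debug.SetGCPercent(prev) }
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// NextGC is the heap size the current GC cycle is aiming to stay below.
	fmt.Printf("HeapInuse=%d NextGC=%d\n", ms.HeapInuse, ms.NextGC)

	restore := tightenGC(20) // trade CPU for memory while under pressure
	runtime.ReadMemStats(&ms)
	fmt.Printf("NextGC with GOGC=20: %d\n", ms.NextGC)
	fmt.Println("GOGC replaced on restore:", restore())
}
```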

@tve

tve commented May 12, 2017

I'm not sure what you're suggesting, exactly. grow can reduce its request by at most 64 KB, which probably isn't going to help when a multi-gigabyte heap is running out of room.

Ah, I couldn't tell that, you're right then.

@tve

tve commented May 12, 2017

My proposal is here: https://docs.google.com/document/d/1zn4f3-XWmoHNj702mCCNvHqaS7p9rzqQGa74uOwOBKM/edit

Nice long proposal write-up :-). I'm trying to understand the tl;dr; ...

The proposal seems to come down to "periodically measure live data size and set GCPercent such that GC is triggered before the desired total heap size is reached". As mentioned in the proposal, this can be done/approximated today in the app itself using runtime.MemStats and debug.SetGCPercent.

As far as I can tell the following changes to the runtime would be desirable to improve this:

  • ensure that the necessary calls are efficient (some (all?) optimizations are in Go1.9 already)
  • provide a hook so GCPercent can be adjusted after each GC instead of relying on a periodic timer?

As a user I'm still left wondering a bit what a reasonable goal in all of this is. I'm imagining something like "for the vast majority of Go apps the tuning of GCPercent allows 80% of memory to be used for live data with moderate GC overhead and 90% with high to very high GC overhead". Maybe someone in the Go community has informed intuition about specific numbers.

The answer to requests to have some callback or rescue option when memory allocation fails would be that, instead, GC overhead exceeding N% or GCPercent falling below M% should be used to trigger said rescue action.

@tve

tve commented May 12, 2017

I did an experiment to use GCPercent to constrain heap size and while the principle works as expected, it does not look sufficient to me. I'm working on an app that digests some giant CSVs where memory consumption is an issue. I'm running with GCPercent=25 to try and contain the memory overhead. I'm running with gctrace=1 and the highest heap size number I see is 797MB:

gc 389 @209.888s 6%: 0.013+888+0.10 ms clock, 0.055+164/183/1068+0.40 ms cpu, 796->797->613 MB, 797 MB goal, 4 P

A little later after some memory has been freed I grab MemStats and get the following HeapXxx stats which show 1.2GB of heap (all gctrace outputs since the above were lower):

Heap stats: sys=1205MB inuse=488MB alloc=438, idle=717, released=0

Data grabbed from top at about that time seems to agree with the heap stats (code/stack size are not significant):

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
17746 tve       20   0 1272812 983920   6952 S 226.2 24.9   7:44.84 csv-digest

I was trying to keep the memory used by my process to 613MB*1.25=767MB using GCPercent but clearly that's not really working.
The point here is that tuning GCPercent is not sufficient if there is some hard limit one wants to stay under.
(I understand that my 25% goal may very well be unrealistic but I don't think this invalidates the point.)
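
The gap can be spelled out with a small sketch. The heapFootprint type and gcHeapGoal helper are hypothetical; gcHeapGoal applies the simple "live heap plus GOGC percent" rule with integer rounding, which ignores stacks, globals and fragmentation. The MemStats fields read at the end are the ones behind the "Heap stats" line above.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapFootprint holds the heap fields quoted above, in whatever unit the
// caller supplies (MB for the reported figures, bytes for MemStats).
type heapFootprint struct {
	Sys, Inuse, Idle, Released uint64
}

// Retained is heap memory obtained from the OS and not yet returned to it.
func (h heapFootprint) Retained() uint64 { return h.Sys - h.Released }

// gcHeapGoal applies the simple GOGC rule: live heap plus gcPercent of it.
func gcHeapGoal(live uint64, gcPercent int) uint64 {
	return live + live*uint64(gcPercent)/100
}

// readFootprint fills a heapFootprint from the running process, in bytes.
func readFootprint() heapFootprint {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return heapFootprint{
		Sys:      ms.HeapSys,
		Inuse:    ms.HeapInuse,
		Idle:     ms.HeapIdle,
		Released: ms.HeapReleased,
	}
}

func main() {
	// Figures from the experiment above, in MB.
	reported := heapFootprint{Sys: 1205, Inuse: 488, Idle: 717, Released: 0}
	goal := gcHeapGoal(613, 25)
	fmt.Printf("goal=%dMB retained=%dMB over=%dMB\n",
		goal, reported.Retained(), reported.Retained()-goal)

	// The same fields for the current process, in bytes.
	fmt.Printf("%+v\n", readFootprint())
}
```

GCPercent only steers the goal; idle spans that have not been released still count toward what the OS sees, which is where most of the difference in these figures comes from.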

@beorn7

beorn7 commented Nov 21, 2017

Just for the record, as Prometheus was mentioned early in the whole story as a signature use case: If you look at the current Prometheus code (v2.x), you'll find that that's not the case anymore, as most of the memory used is in mmap'd files. The RSS of Prometheus 2.x is tiny compared to Prometheus 1.x.
However, I do believe that using mmap in Go programs (and the subsequent management of raw data blocks of memory) will and should be limited to very specific scenarios and is certainly not a viable work-around in general. So please do keep up the good work here!

If you are interested how the problem was “solved” in the later 1.x Prometheus versions (calling ReadMemStats once per second), here are the relevant code references (including the sometimes desperate comments of the poor coder):

As you can see, this grew into something fairly involved, which is, however, still not able to make absolutely sure we won't let the heap grow too much. On the other hand, the RSS is anyway not closely correlated to the heap size (and, for some reason I don't know, increased slightly for the same heap sizes with Go1.9). In practice, this worked quite nicely. On our fairly large number of Prometheus servers at SoundCloud (~70 servers), we never had an OOM-kill again, until we compiled Prometheus with Go1.9 but kept the settings the same (and the ratio between RSS and heap size went up).

@rgooch

rgooch commented Nov 21, 2017

That requires that all my transitive dependencies support contexts. That does not seem likely to happen, or will take a loooooong time.

@aclements

Member

aclements commented Nov 21, 2017

Also curious about the status of the SetMaxHeap experiment by @aclements in https://go-review.googlesource.com/c/go/+/46751. I see it got +2'd by @RLH but not sure that means anything for next steps. Is some API like that planned to go in eventually or at least still under consideration? If so, is there anywhere else we should look for discussion about it?

The intent is to get some experience with that API and make sure it actually solves problems (and doesn't create new ones :). We're planning to roll it out as an experimental API within Google and I was also hoping to get some adventurous open source users to try it out (I should email golang-dev), but neither of these has happened yet.

So, looking at https://golang.org/cl/46751 there is still the problem of how to induce a panic for goroutines which have opted-in. ... @aclements: will your solution be including that feature as well?

Sorry, but no, that isn't part of my solution. As @RLH said, context cancellation is the "right" answer to this, though I understand that context isn't everywhere. I'm afraid I still don't really know what a panic-based solution would look like. What actually triggers the panics? Which goroutines actually get hit by the panic? A large part of the point of CL 46751 is that the back-pressure is gradual, graceful, and application-level, so the application can respond as a whole before things go too terribly wrong. And it's application level because the heap is application level. We don't have the ability to say "Goroutine X is using Y MB of memory" because it's not well-formed in general (how do you count things reachable from multiple goroutines?). "Y MB could be freed if goroutine X exited" is well-formed and could theoretically be useful for this, but I'm pretty sure it's very expensive to figure out the answer to that.

(and the ratio between RSS and heap size went up).

@beorn7, out of curiosity, what sort of ratio are you seeing in practice? In general it's hard to bound this, so I expect you'll always have to do some testing to establish this and it will change a bit between releases.

@rgooch

rgooch commented Nov 21, 2017

To use an enlightened quote: "the perfect is the enemy of the good". Whether contexts are the "right" solution is unclear, but it is clear that it will be a long time before they can help solve the problem generally. In the meantime, people have to deal with OOM panics.

Here is an approach that may work while preserving the solution you've implemented: add an API that allows a goroutine to receive an externally induced panic. That would then allow me to catch the event from your memory pressure channel and start sending events to goroutines to initiate panics. Ideally, if the screams from the garbage collector get louder, I'd start inducing panics to more and more goroutines, to bring the situation under control. This follows the basic "opt-in" philosophy that I've been advocating. I know which goroutines are vulnerable to triggering an OOM, so those are the ones I opt-in to being killed.

Suggested API:
func MakePanicChannel() chan<- error

When an error is sent on the channel, the calling goroutine will panic, with the provided error.
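
No such runtime API exists, so the closest thing that can be written today is a cooperative approximation. In the sketch below every name (panicPoint, Check, runGuarded, abortError) is hypothetical, and the key difference from the suggestion is stated plainly: the panic is raised only when the opted-in goroutine itself calls Check, so it cannot interrupt allocation-heavy library code that never checks.

```go
package main

import (
	"errors"
	"fmt"
)

// abortError wraps errors raised through a panicPoint so that runGuarded
// does not swallow unrelated panics.
type abortError struct{ err error }

// panicPoint is a user-level approximation of the suggested panic channel.
type panicPoint struct{ ch chan error }

func newPanicPoint() *panicPoint { return &panicPoint{ch: make(chan error, 1)} }

// Chan returns the send-only side, handed to whoever may abort the goroutine.
func (p *panicPoint) Chan() chan<- error { return p.ch }

// Check panics if an error is pending and returns immediately otherwise.
// The opted-in goroutine calls it at points where unwinding is safe.
func (p *panicPoint) Check() {
	select {
	case err := <-p.ch:
		panic(abortError{err})
	default:
	}
}

// runGuarded runs work and turns an abortError panic into a returned error.
// Any other panic value is re-raised unchanged.
func runGuarded(work func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			a, ok := r.(abortError)
			if !ok {
				panic(r)
			}
			err = a.err
		}
	}()
	work()
	return nil
}

var errMemoryPressure = errors.New("aborted: memory pressure")

func main() {
	p := newPanicPoint()
	p.Chan() <- errMemoryPressure // normally sent by a monitoring goroutine
	err := runGuarded(func() {
		for i := 0; i < 3; i++ {
			p.Check() // checkpoint between units of work
		}
	})
	fmt.Println(err)
}
```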

@beorn7

beorn7 commented Nov 21, 2017

@beorn7, out of curiosity, what sort of ratio are you seeing in practice? In general it's hard to bound this, so I expect you'll always have to do some testing to establish this and it will change a bit between releases.

Yes, totally aware of that. I didn't want to imply a need to clamp RSS (which would be close to impossible) but merely underline that clamping the heap size doesn't have to be perfect rocket science to have the effects desired in many scenarios.

To answer your question: Our rule of thumb for a reasonably safe heap size setting was 67% of available physical memory with Go1.8 compiled Prometheus 1.7, and 60% for Go1.9 compiled Prometheus 1.8. (Note the beautiful version number dance…)

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@RLH

Contributor

RLH commented Nov 22, 2017

@rgooch

rgooch commented Nov 22, 2017

Firstly, this is an opt-in mechanism. The goroutine must call MakePanicChannel and it must register that channel with whomever it wishes to give panic powers. Secondly, people should be using defer to manage their locks, which mitigates a lot of the problems with panic-as-abort. For the class of problems I've discussed up-thread, this approach will work well.

@RLH

Contributor

RLH commented Nov 22, 2017

@rgooch

rgooch commented Nov 27, 2017

How else would you recover from a too-large memory allocation (and unwind the calling stack) which is buried deep?

@RLH

Contributor

RLH commented Nov 28, 2017

@robaho

robaho commented Sep 7, 2018

In the meantime, something like Java's 'dump heap on OOM' would be very helpful, as long as there is a heap analyzer tool - but I assume it could just dump it in the memprof format and that should suffice.

@aclements

Member

aclements commented Sep 7, 2018

@robaho, if you're on Linux (or probably most BSD kernels) and enable core dumps, you should get a core dump. You can then analyze that core's heap using viewcore (which definitely still has rough edges, but can answer a lot of questions).

@aclements

Member

aclements commented Sep 7, 2018

> The intent is to get some experience with that API and make sure it actually solves problems (and doesn't create new ones :). We're planning to roll it out as an experimental API within Google and I was also hoping to get some adventurous open source users to try it out (I should email golang-dev), but neither of these has happened yet.

Just a quick update on this. We've had several projects trying out SetMaxHeap. It helps, but this experimentation has uncovered a few rough interactions with other parts of the runtime. The biggest problem is that large object fragmentation can cause the heap's RSS to grow significantly larger than the allocated size of the heap (#14045). As a result, for systems that suffer from large object fragmentation, there needs to be a large (and somewhat unpredictable) buffer between the reserved memory and the max heap. The other known problem is that stacks and globals don't currently count toward the GC trigger, but do count toward the RSS, and since stacks change dynamically it's hard to account for this overhead when setting the heap limit (#19839 and #23044). Both of these issues also cause problems for other reasons (wasting memory and failing to amortize GC costs). I've prioritized fixing both for Go 1.12.

@andreimatei

andreimatei commented Sep 7, 2018

@aclements thank you for your interest in this area. For what it's worth, all this is a pretty big deal for CockroachDB, which is trying to do memory accounting and would like to trust the runtime to stay within given limits.

@vitalyisaev2

vitalyisaev2 commented Sep 10, 2018

I have an urgent need for a tool that helps to understand what's actually going on in a Go process address space, why RSS keeps growing despite memory limits, and so on. I also need to use cgo, which makes the problem even more complicated. Currently I have to use a set of tools like pprof, valgrind --tool=massif, viewcore (introduced a few weeks ago), and some self-developed tools. But it looks like I can see only different aspects of the problem, not the entire problem.

For example, I see that the process has 5GB RSS. The Go runtime says that it takes 2GB (though only 5% of that is used, and the other 95% is idle). The cgo library says that it uses < 500MB in its caches and other internal data structures. And no one knows what consumed the remaining 2.5GB. I can only speculate whether it's Go runtime cost (for instance, because I have the profiler enabled as well), or a memory leak due to cgo (malloc without free).

@aclements

Member

aclements commented Sep 10, 2018

@vitalyisaev2, please open a new issue or send an email to golang-nuts@googlegroups.com. In it, please elaborate on what you mean by "Go runtime says that it takes 2GB". The runtime exports many different statistics, and it's important to know which one you're talking about. I would start by looking closely at all of the runtime.MemStats statistics, since that should tell you if it's on the Go side or the C side. If it's on the Go side, viewcore is probably the right tool to find the problem. Please also describe the time-scale of the problem, since some things (like returning memory to the OS) happen on the scale of minutes.

@rsc

Contributor

rsc commented Nov 14, 2018

@aclements, what do you think the decision is here that NeedsDecision refers to?
Is there a different bug for your memory pressure work?
Should this issue be closed?

@andybons andybons modified the milestones: Go1.12, Go1.13 Dec 5, 2018
