runtime: profile goroutine creation #16894

Open
rhysh opened this Issue Aug 26, 2016 · 9 comments

@rhysh
Contributor

rhysh commented Aug 26, 2016

Most unbounded Go memory leaks I've encountered have been due to goroutine leaks. When diagnosing these leaks, it would be helpful to know what function call stack led to the creation of the goroutines, so we can link the goroutines back to the non-stdlib code that created them: why the goroutine exists, in addition to its start PC. This is in the same vein as the current heap profiles, which include the stack that allocated the memory rather than just the allocation's PC and type.

Related proposal for file descriptor profiling: #16379


A popular way to leak goroutines is by making HTTP requests and not closing the response body. That problem could be targeted more specifically with a good static analysis tool, similar to vet's new -lostcancel check. Both approaches would be helpful.

@josharian

Contributor

josharian commented Aug 26, 2016

There's already a goroutine profile; search for "goroutine" in https://golang.org/pkg/runtime/pprof/. I think that covers this.

@rhysh

Contributor

rhysh commented Aug 26, 2016

That goroutine profile shows the current stack of the goroutines and a single address of the location of the go keyword. I'm interested in seeing the whole stack of the function that used the go keyword.

If this existed today, it would likely be called "goroutinecreate".

@quentinmit quentinmit added this to the Go1.8Maybe milestone Sep 6, 2016

@quentinmit quentinmit added the NeedsFix label Oct 11, 2016

@rsc

Contributor

rsc commented Oct 21, 2016

Like in #16379, I'm a bit worried about monitoring slowly taking up all the CPU we have available.

The nice thing about the current goroutine profile is that it only costs something when you ask for it; it doesn't slow the program down the rest of the time. I suppose we could sample the goroutine creations, but that might be too confusing.

@rsc rsc added NeedsDecision and removed NeedsFix labels Oct 21, 2016

@rsc

Contributor

rsc commented Oct 21, 2016

@prashantv

Contributor

prashantv commented Oct 22, 2016

@rsc Similar to memory profiling, I think having the user pick a profile rate (for both goroutine creation and file descriptor profiling) makes a lot of sense. It could be 0 by default, although I imagine even a 1% sample rate would be useful. As with memory leaks, if tens of thousands of goroutines are leaking, sampling 1% of the creations would be enough to find the source.
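As a rough illustration of what user code can approximate today (every name below is hypothetical; nothing like this exists in the runtime): spawn goroutines through a wrapper that records the caller's full stack for roughly 1 in N creations, the way runtime.MemProfileRate samples allocations. A rate of 0 disables it, matching the proposed default.

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"sync"
)

// sampleRate controls what fraction of creations record a stack,
// loosely mirroring runtime.MemProfileRate. 0 disables sampling.
var sampleRate = 100 // record ~1 in 100 creations

var (
	mu     sync.Mutex
	stacks = make(map[string]int) // creation stack -> sampled count
)

// Go spawns f like the go keyword, but samples the caller's full stack.
func Go(f func()) {
	if sampleRate > 0 && rand.Intn(sampleRate) == 0 {
		buf := make([]byte, 4096)
		n := runtime.Stack(buf, false) // current goroutine's stack only
		mu.Lock()
		stacks[string(buf[:n])]++
		mu.Unlock()
	}
	go f()
}

func main() {
	sampleRate = 1 // sample every creation, just for the demo
	done := make(chan struct{})
	Go(func() { close(done) })
	<-done
	mu.Lock()
	fmt.Println("sampled creation sites:", len(stacks))
	mu.Unlock()
}
```

The cost model matches @rsc's concern: with a low rate the stack capture happens rarely, and with rate 0 the wrapper is just a `go` statement.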

@rsc rsc modified the milestones: Go1.9, Go1.8Maybe Nov 2, 2016

@aclements aclements modified the milestones: Go1.10, Go1.9 Jun 9, 2017

@mandarjog

mandarjog commented Oct 25, 2017

+1
This is extremely useful.
I am actively debugging the following: if I let the program run, it eventually runs out of memory.
I have already checked that the number of goroutines does not change during a load test. However, as seen below, I end up with 640490 calls to runtime.malg to create new goroutines.
It would be great to see what code is using the go keyword.

jog@devinstance:~/mixer_leak_dumps$ go tool pprof --base 1_hd_initial.txt mixs.0.2.9 hd_resting.txt
Entering interactive mode (type "help" for commands)
(pprof) top
266.20MB of 267.18MB total (99.63%)
Dropped 616 nodes (cum <= 1.34MB)
Showing top 10 nodes out of 16 (cum >= 2.01MB)
      flat  flat%   sum%        cum   cum%
  254.10MB 95.10% 95.10%   254.10MB 95.10%  runtime.malg
   11.09MB  4.15% 99.26%    11.09MB  4.15%  runtime.allgadd
    1.01MB  0.38% 99.63%     2.01MB  0.75%  istio.io/api/mixer/v1.(*ReportRequest).Unmarshal
(pprof) inuse_objects
(pprof) top
676617 of 676603 total (  100%)
Dropped 611 nodes (cum <= 3383)
Showing top 10 nodes out of 21 (cum >= 32031)
      flat  flat%   sum%        cum   cum%
    640490 94.66% 94.66%     640490 94.66%  runtime.malg
     16427  2.43% 97.09%      32031  4.73%  istio.io/api/mixer/v1.(*ReportRequest).Unmarshal
     15604  2.31% 99.40%      15604  2.31%  istio.io/api/mixer/v1.(*Attributes).Unmarshal
@rsc

Contributor

rsc commented Oct 25, 2017

@mandarjog, did you try looking at the CPU profile or the memory profile in graphical form (the 'web' command)? I would expect goroutine creation to appear in both of those.

@mandarjog

mandarjog commented Oct 25, 2017

@rsc The graphical view follows. runtime.malg seems to be rooted at runtime.startTheWorldWithSema.

[image: mixer_heap — heap profile graph]

Should I be looking at something else?

@rsc

Contributor

rsc commented Oct 25, 2017

@mandarjog, sorry, I guess the profiles have changed since I last saw them. You're right that that's not helpful.

@aclements, we've had problems trying to do a backtrace off the system stack onto the main stack for CPU profiling, but maybe we can do it for memory allocation profiling, which doesn't happen at random program counters?

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018
