runtime: expose number of running/runnable goroutines #17089
I'd like to propose a way to expose the number of active (running + runnable) goroutines.
My primary use case for this metric is to estimate application load (
Currently the runtime package includes
The runtime package could be extended to include
It seems that such a function would need to acquire
It doesn't seem like a very interesting number, it'll always be less than
On Wed, 14 Sep 2016, 00:43 Quentin Smith email@example.com wrote:
@davecheney The suggestion counts runnable goroutines, so it can be larger than GOMAXPROCS.
My concern is that I don't see that this adds anything very useful over
That is correct.
Based on the POC that I made, it seems that at least some system goroutines always appear active. I could be wrong here, as they might be blocked in a syscall (the netpoller, say).
The other reason I think system goroutines should be excluded is because
That is correct maybe the
How can this be detected?
Here is the POC code I wrote: https://gist.github.com/fd/7136de67a56e174d8c06cb505f7278aa
Goroutines blocked in the Go runtime have Gwaiting state. You probably don't want to count those, they contribute nothing to CPU load (but do consume some memory).
It is not clear whether you should count goroutines in the Gsyscall state. Whether you want to count them depends on whether they are doing real work in the syscall (reading a large file, say) or waiting (read on an idle network socket). I don't think the runtime has the information needed to make that call, although we might be able to make some approximation. That's what makes this problem hard.
So, how about this:
So unless you are heavily using something like
Remember, it is not my goal to find an accurate estimate of CPU utilisation. My goal is a good-enough estimate of application utilisation. I included an excerpt from Site Reliability Engineering: How Google Runs Production Systems, which seems to suggest that Google uses a similar metric/approach.
As you say, you are looking for an approximation, and you care about load shedding. Unless you start a long-running goroutine for each incoming request, long-running goroutines should be a tiny fraction of the total number of goroutines and can therefore be ignored for approximation purposes.
I agree that proxy servers are a problem.
Since you have proof of concept code, do you have a way to see the difference between
I would be less concerned about adding
One possibility would be to return two numbers: the number of running/runnable goroutines and the number of goroutines waiting for a system call or C code. But that seems to me to be too tied to the current details of how system calls and cgo are implemented.
I assume you are looking for some sort of general framework here, because for any specific program that wants to do load shedding I would say just count the number of active requests.
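For a specific program, counting active requests could look roughly like the sketch below. The `Limiter` type and its methods are hypothetical names for this example, not an existing API; it only relies on `sync/atomic` from the standard library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Limiter is a hypothetical helper that tracks in-flight requests and
// rejects new ones once a fixed limit is reached.
type Limiter struct {
	active int64 // number of requests currently being served
	max    int64 // maximum number of concurrent requests allowed
}

func NewLimiter(max int64) *Limiter {
	return &Limiter{max: max}
}

// Acquire reports whether the request may proceed. On success the caller
// must call Release when the request finishes.
func (l *Limiter) Acquire() bool {
	if atomic.AddInt64(&l.active, 1) > l.max {
		atomic.AddInt64(&l.active, -1) // over the limit: undo and shed
		return false
	}
	return true
}

// Release marks one request as finished.
func (l *Limiter) Release() {
	atomic.AddInt64(&l.active, -1)
}

// Active returns the current number of in-flight requests.
func (l *Limiter) Active() int64 {
	return atomic.LoadInt64(&l.active)
}

func main() {
	l := NewLimiter(2)
	fmt.Println(l.Acquire(), l.Acquire(), l.Acquire()) // third request is shed
	l.Release()
	fmt.Println(l.Acquire(), l.Active())
}
```

In an HTTP server this would typically wrap the handler: call `Acquire` on entry, respond with an overload status if it returns false, and `defer` the `Release` otherwise.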
The problem NumActiveGoroutines is trying to solve is when to shed load. Wouldn't monitoring the latency of an application request be a more direct and ultimately more correct way to do this? If latency increases, shed load. If latency improves, increase load.
Is there a use case where this doesn't work but NumActiveGoroutines does?
Discussing the nuances of what _Gidle, _Grunnable, _Grunning, _Gsyscall, and _Gwaiting, plus _Gscanrunning, _Gscanrunnable, _Gscansyscall, and _Gscanidle, mean in this context is a very implementation-dependent discussion.
Even NumGoroutines does not capture all the work C is doing; the C code may have spawned threads that are independently doing work as well.
I think it's reasonable to say that goroutines in C are not active from the perspective of Go, regardless of what they're calling.