Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
proposal: runtime: add per-goroutine CPU stats #41554
Per-process CPU stats can currently be obtained via third-party packages like https://github.com/elastic/gosigar. However, I believe that there exists a need for a certain type of applications to be able to understand CPU usage at a finer granularity.
At a high level in CockroachDB, whenever an application sends a query to the database, we spawn one or more goroutines to handle the request. If more queries are sent to the database, they each get an independent set of goroutines. Currently, we have no way of showing the database operator how much CPU is used per query. This is useful for operators in order to understand which queries are using more CPU and measure that against their expectations in order to do things like cancel a query that's using too many resources (e.g. an accidental overly intensive analytical query). If we had per-goroutine CPU stats, we could implement this by aggregating CPU usage across all goroutines that were spawned for that query.
Fundamentally, I think this is similar to bringing up a task manager when you feel like your computer is slow, figuring out which process on your computer is using more resources than expected, and killing that process.
Add a function to the
From a correctness level, an alternative to offering these stats is to
Obtaining execution statistics during runtime at a fine-grained goroutine level is essential for an application like a database. I'd like to focus this conversation on CPU usage specifically, but the same idea applies to goroutine memory usage. We'd like to be able to tell how much live memory a single goroutine has allocated to be able to decide whether this goroutine should spill a memory-intensive computation to disk, for example. This is reminiscent of #29696 but at a finer-grained level without a feedback mechanism.
I think that offering per-goroutine stats like this is useful even if it's just from an obervability standpoint. Any application that divides work into independent sets of goroutines and would like to track resource usage of a single group should benefit.
A possible solution to showing high level usage of different query paths can be achieved by setting profiling labels on the goroutine:
And doing background profiling on the job:
Overall Go program usage can be queried from the enclosing container or process stats from the Operating system directly.
Yes, this is exactly what labels are for. A nice thing about labels is that they let you measure CPU or heap performance across a range of goroutines all cooperating on some shared task.
Please let us know if you need something that is not addressed by labels.
Thanks for the suggestion. My main concern with profiling is that there is a non-negligible performance overhead. For example, running a quick workload (95% reads and 5% writes against a CockroachDB SQL table) shows that throughput drops by at least 8% when profiling with a one second interval.
I'm hoping that this information can be gathered by the scheduler in a much cheaper way since the question to answer is not "where has this goroutine spent most of its time" but "how much CPU time has this goroutine used". Would this even be feasible?
Ah, I see. I would think that always collecting CPU statistics would be unreasonably expensive. But it does seem possible to collect them upon request in some way, at least when running on GNU/Linux. Every time a thread switched to a different goroutine it would call
I don't know how useful this would be for most programs. In Go it is very easy to start a new goroutine, and it is very easy to ask an existing goroutine to do work on your behalf. That means that it's very easy for goroutine based stats to accidentally become very misleading, for example when the program forgets to collect the stats of some newly created goroutine. That is why runtime/pprof uses the labels mechanism.
Perhaps it would also be possible for this mechanism to use the labels mechanism. But then it is hard to see where the data should be stored or how it should be retrieved.