Use Case 1: Extract CPU Profile
If an execution trace contains CPU sample events, it would be useful to extract a cpu/nanoseconds CPU profile similar to the one produced by pprof.StartCPUProfile.
This can be useful when building tools, e.g. a "trace to CPU profile" tool, or a tool that explains the _Grunning time of a goroutine using the CPU samples for that goroutine. A naive solution would give equal weight to each collected CPU sample and stretch the samples over the goroutine's total _Grunning time, but that could be misleading in the presence of scheduler latency; see below.
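To make the idea concrete, here is a rough sketch of how a tool might turn CPU samples into cpu/nanoseconds values. The `cpuSample` type and `cpuNanosByStack` function are hypothetical names made up for illustration, and the sketch assumes the events have already been parsed and that the sampling rate is known and constant:

```go
package main

import (
	"fmt"
	"time"
)

// cpuSample is a hypothetical, already-parsed traceEvCPUSample event.
type cpuSample struct {
	Goroutine uint64
	Stack     string // symbolized stack, flattened into one string for brevity
}

// cpuNanosByStack credits one sampling period of CPU time to the stack of
// every sample. It assumes a single, known sampling rate for all samples.
func cpuNanosByStack(samples []cpuSample, hz int) (map[string]int64, error) {
	if hz <= 0 {
		return nil, fmt.Errorf("invalid sampling rate %d Hz", hz)
	}
	period := time.Second / time.Duration(hz)
	out := make(map[string]int64)
	for _, s := range samples {
		out[s.Stack] += period.Nanoseconds()
	}
	return out, nil
}

func main() {
	samples := []cpuSample{
		{Goroutine: 1, Stack: "main.work;main.main"},
		{Goroutine: 1, Stack: "main.work;main.main"},
		{Goroutine: 2, Stack: "main.idle;main.main"},
	}
	profile, err := cpuNanosByStack(samples, 100) // 100 Hz is the default rate
	if err != nil {
		panic(err)
	}
	for stack, ns := range profile {
		fmt.Printf("%s %d\n", stack, ns)
	}
}
```

The `hz` argument is the weak spot: the tool has to guess it, which only works as long as nobody changed the rate.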
Use Case 2: Understand OS Scheduler Latency
A Go application might experience two types of scheduler latency: OS Scheduler Latency and Go Scheduler Latency. The latter can easily be analyzed using the execution tracer.
Detecting OS scheduler latency is a bit more tricky, but possible. Over a long enough time period, the cumulative time goroutines spend in the running state should converge to the cumulative number of traceEvCPUSample events multiplied by the sampling period (10ms by default). If there are significantly fewer traceEvCPUSample events than expected, that's a strong indicator that the application is not getting enough scheduling time from the OS. That's a common problem for some setups, so it'd be nice to use tracing data to detect it.
(There are some dragons here when it comes to CPU samples received during syscalls/cgo ... but I think that deserves a separate discussion)
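As a back-of-the-envelope sketch of that comparison, something like the following could work. `missingCPUFraction` is a hypothetical helper, and it assumes the cumulative running time and the sample count were already extracted from the trace and that the sampling period was constant:

```go
package main

import (
	"fmt"
	"time"
)

// missingCPUFraction returns the share of goroutine running time that is not
// backed by CPU samples. A value near 0 means the samples account for the
// running time; a clearly positive value hints at OS scheduler latency.
// Small negative values are possible due to sampling noise.
func missingCPUFraction(running time.Duration, samples int, period time.Duration) (float64, error) {
	if running <= 0 || period <= 0 || samples < 0 {
		return 0, fmt.Errorf("invalid input: running=%v samples=%d period=%v", running, samples, period)
	}
	sampled := time.Duration(samples) * period
	return 1 - float64(sampled)/float64(running), nil
}

func main() {
	// 10s of cumulative running time, but only 800 samples at 10ms each.
	frac, err := missingCPUFraction(10*time.Second, 800, 10*time.Millisecond)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.0f%% of running time has no matching CPU samples\n", frac*100)
}
```

In this example the samples cover 8s of the 10s, so roughly a fifth of the running time is unaccounted for. The sketch ignores the syscall/cgo caveat mentioned above.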
Problem:
The traceEvCPUSample event does not include a value indicating how much CPU time it represents:
    traceEvCPUSample = 49 // CPU profiling sample [timestamp, real timestamp, real P id (-1 when absent), goroutine id, stack]
(go/src/runtime/trace.go, line 74 in 39effbc)
One could assume that it's always 10ms, but that won't work if the user calls runtime.SetCPUProfileRate. Unfortunately the execution trace does not record this value, and it's not possible to get the currently active value from user land either. Unlike SetMutexProfileFraction, SetCPUProfileRate does not return a value, and there is no GetCPUProfileRate function either.
Additionally, it's currently not possible to calculate the expected number of traceEvCPUSample events if the CPU profiler is not enabled for the entire duration of the trace.
Suggestion:
Add a new traceEvCPUProfileRate event that is recorded in the following cases:
- The tracer starts while the CPU profiler is already running.
- StartCPUProfile is called.
- StopCPUProfile is called (record 0).
Alternatively we could also have a start/stop event for the CPU profiler.
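To illustrate how a trace consumer could use such an event, here is a rough sketch. Everything in it is hypothetical: the `event` type, the `kindRateChange` constant standing in for the proposed traceEvCPUProfileRate, and the assumption that the event carries the new rate in Hz, with 0 meaning the profiler stopped:

```go
package main

import (
	"fmt"
	"time"
)

// eventKind distinguishes the two event types this sketch cares about.
type eventKind int

const (
	kindCPUSample  eventKind = iota // stands in for traceEvCPUSample
	kindRateChange                  // stands in for the proposed traceEvCPUProfileRate
)

// event is a hypothetical, already-parsed trace event.
type event struct {
	Kind eventKind
	Hz   int // new sampling rate for kindRateChange; 0 means the profiler stopped
}

// sampledCPUTime walks the events in trace order and weights every CPU
// sample by the sampling period that was active when it was recorded.
func sampledCPUTime(events []event) (time.Duration, error) {
	var total time.Duration
	hz := 0 // no rate known until the first rate event
	for i, ev := range events {
		switch ev.Kind {
		case kindRateChange:
			if ev.Hz < 0 {
				return 0, fmt.Errorf("event %d: negative rate %d", i, ev.Hz)
			}
			hz = ev.Hz
		case kindCPUSample:
			if hz == 0 {
				// Assumption: a real tool may need to tolerate late samples.
				return 0, fmt.Errorf("event %d: CPU sample without an active rate", i)
			}
			total += time.Second / time.Duration(hz)
		}
	}
	return total, nil
}

func main() {
	events := []event{
		{Kind: kindRateChange, Hz: 100},
		{Kind: kindCPUSample},
		{Kind: kindCPUSample},
		{Kind: kindRateChange, Hz: 1000},
		{Kind: kindCPUSample},
		{Kind: kindRateChange, Hz: 0},
	}
	total, err := sampledCPUTime(events)
	if err != nil {
		panic(err)
	}
	fmt.Println(total) // two samples at 10ms plus one sample at 1ms
}
```

With a start/stop event pair instead, the same loop would work as long as the start event carries the rate.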
cc @mknyszek @prattmic @nsrip-dd @rhysh