Nowadays, administrators and support engineers work with a wide variety of platforms, ranging from bare-metal servers to virtual machines, cloud instances, containers, and edge and IoT devices. This means that when analysing or troubleshooting a system or an environment, an absolute figure such as 4 GB or 1 GB/s could be "almost nothing" or "a lot" depending on the platform, whereas a relative figure such as "98%" means "plenty" regardless of the platform.
It is already possible to provide that kind of per-process percentage for CPU and RAM utilization, but not for disk IO or network. The issue is that the per-process IO metrics and the combined kernel IO metrics are disjoint in a sense: the kernel may actually perform the IO at a later time than it shows up in the per-process metrics. Calculating such a percentage in a formula is thus currently not possible.
Since something like delta(sum(x)) is not supported, it looks like an extension to the existing derived metrics machinery would be needed to make the above possible: something like "deltasum()" or a similar derived function that would allow providing information such as "process N consumes X% of all IO". I haven't investigated how large a surgery this would require, or whether it would be feasible or even possible with reasonable effort, so I'm merely filing this idea/suggestion for the record here. Thanks.
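To make the requested calculation concrete, here is a minimal Python sketch of the arithmetic behind "process N consumes X% of all IO". It does not use any PCP API. The function name `io_share_percent`, the sample dictionaries and the byte counts are all hypothetical, and the per-process counters are assumed to be cumulative byte counts sampled at two points in time.

```python
def io_share_percent(prev, curr):
    """Return each process's share (in %) of the IO done between two samples.

    prev and curr map a process identifier to a cumulative IO byte counter.
    This mirrors the idea of delta(x) / delta(sum(x)); it is an illustration
    of the arithmetic only, not PCP code.
    """
    # delta(sum(x)): the change in the total across all processes.
    total_delta = sum(curr.values()) - sum(prev.values())
    if total_delta <= 0:
        # No IO in the interval (or processes exited and the total shrank).
        return {}
    # delta(x): per-process change, only for processes seen in both samples.
    return {
        pid: 100.0 * (curr[pid] - prev[pid]) / total_delta
        for pid in curr
        if pid in prev
    }


# Hypothetical cumulative byte counters at two consecutive samples.
prev_sample = {"pid 101": 1000, "pid 202": 4000}
curr_sample = {"pid 101": 1600, "pid 202": 4400}

shares = io_share_percent(prev_sample, curr_sample)
print(shares)  # {'pid 101': 60.0, 'pid 202': 40.0}
```

Note that processes which start or exit between the two samples change the total without appearing in the per-process deltas, so in practice the shares may not add up to exactly 100%.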
kmcdonell added a commit to kmcdonell/pcp that referenced this issue Jun 5, 2024
Most of this changes delta(x), rate(x) and instant(x). Formerly x was
restricted to a metric name, but now x may be an arbitrary expression.
The restriction was simply guilt by association: the other unary
functions like defined(x), sum(x), min(x), count(x), ... all make
sense only for a single metric (which may be set-valued), but the
restriction is not needed for delta(x), rate(x) and instant(x).
The code change was not difficult (yacc is a wonderful invention).
The man page changes and QA were more challenging.
Specifically this allows: delta(sum(x)) which fixes Marko's RFE at
performancecopilot#1991.
Also there was some flaky error reporting where some errors at
bind-time were not being reported at all, unless -Dderive was in play.
@myllynen We don't need deltasum(); it turns out delta(sum()) was easy to implement and works just fine, as per commit bab1ef8.
I'm going to close this issue, but please feel free to re-open with additional information if the change to allow delta(expr) and rate(expr) (and instant(expr) as it was in the same boat) does not address your needs.