[collectd 6] cpu plugin: Align metrics with OpenTelemetry recommendations. #4216
000d934 to 0d3380e
Solaris builds show quite a few warnings (not completely overlapping):
And the other of them fails to:
Thanks for pointing this out. Fixed.
Bug in `config_keys[]` needs to be fixed.
Also, does the CPU plugin need to handle run-time CPU hot-plugging? If not, why is `states` a dynamically resized 1D `[cpu*states]` array [1] instead of a 2D `[cpu][state]` array allocated at init?
[1] I'd rather avoid things like `cpu_num = u->states_num / STATE_MAX` and `active_index = (cpu * STATE_MAX) + STATE_ACTIVE` if hot-plug support is not needed, as they complicate the code significantly.
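To make the trade-off in this comment concrete, here is a minimal sketch of the two layouts. It is not the plugin's actual code: the `STATE_*` values, `cpu_states_t`, and all helper names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical state list; the real plugin defines its own STATE_* values. */
enum { STATE_USER, STATE_SYSTEM, STATE_IDLE, STATE_ACTIVE, STATE_MAX };

/* Flattened 1D layout: a single array holds cpu_num * STATE_MAX entries,
 * so every access needs index arithmetic. */
static size_t flat_index(size_t cpu, size_t state) {
  return (cpu * STATE_MAX) + state;
}

/* With the flattened layout, the CPU count is derived from the array length. */
static size_t flat_cpu_num(size_t states_num) {
  return states_num / STATE_MAX;
}

/* 2D layout: one fixed-size row per CPU, indexed as [cpu][state]. */
typedef struct {
  double rate[STATE_MAX];
} cpu_states_t;

static double *state_2d(cpu_states_t *cpus, size_t cpu, size_t state) {
  return &cpus[cpu].rate[state];
}
```

The flattened form grows easily when a new CPU shows up at run time, while the 2D form keeps the indexing obvious but assumes the CPU count is known when the array is allocated.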
Thanks for the review @eero-t. I think I have addressed all your points.
My review is fairly superficial, but as there are tests, I can approve once the invalid comment is fixed (not setting this as approved yet, because I do not trust the automerge bot).
Approved, but apparently the updated comment needs clang-formatting.
* Add metric description and unit.
* Update label names (e.g. "cpu" → "system.cpu.logical_number")
* Divide rates by the number of CPUs, so that the sum of all rates equals 1. (Previously the sum of all rates was equal to the number of logical CPUs.)
* Remove the "cpu=total" label when aggregating CPUs.
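The normalization in the third bullet can be illustrated with a small sketch. It assumes the per-CPU, per-state rates are available as a plain array of doubles; `scale_rates` and `sum_rates` are hypothetical helpers, not functions from the plugin.

```c
#include <stddef.h>

/* Divide every per-CPU, per-state rate by the number of CPUs.
 * Before scaling, the rates of each CPU sum to 1, so all rates together
 * sum to cpu_num. After scaling, all rates together sum to 1. */
static void scale_rates(double *rates, size_t rates_num, size_t cpu_num) {
  if (cpu_num == 0)
    return; /* nothing sensible to scale by */
  for (size_t i = 0; i < rates_num; i++)
    rates[i] /= (double)cpu_num;
}

/* Sum of all rates, used to check the invariant above. */
static double sum_rates(const double *rates, size_t rates_num) {
  double sum = 0.0;
  for (size_t i = 0; i < rates_num; i++)
    sum += rates[i];
  return sum;
}
```

For example, with two CPUs and two states, rates of `{0.25, 0.75}` and `{0.75, 0.25}` sum to 2 before scaling and to 1 afterwards.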
* The options `ReportUsage`, `ReportUtilization`, and `ReportNumCpu` control at a high level which metrics are emitted. Other options no longer influence *what* is being collected. This also makes it possible to report usage and utilization metrics simultaneously.
* The documentation has been updated to reflect that the plugin no longer emits a percentage, but a ratio for utilization metrics.
This one is done in a second aggregation loop, because we require a CPU-level aggregate rate to be available to properly scale the counter.
…o` and `usage_count`.
…ount`. The functionality is tested in the test cases for `usage_ratio` and `usage_count`, so there is no need to test these separately. On the contrary: the rest of the CPU plugin only uses `usage_ratio` and `usage_count`, so testing the global variants leaks the abstraction.
This way the use of the field is much easier to understand when reading the code.
This will just work transparently.
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Also sort `config_keys`.
Co-authored-by: Eero Tamminen <eero.t.tamminen@intel.com>
Fixed. Thanks @eero-t!
… `cpu=total` label when aggregating CPUs.
… `usage_t` and unit tests ensure that it functions as expected.
ChangeLog: CPU plugin: The metric schema has been aligned with OpenTelemetry semantic conventions.