normalize metric labels and structure #3600
ddanielr commented Jul 12, 2023
- Switched metric names to use dot notation.
- Refactored Scans metric to use a prefix.
- Fixed incorrect metric type in documentation.
accumulo/core/src/main/java/org/apache/accumulo/core/metrics/MetricsProducer.java Lines 609 to 610 in 0733d3a
The
However, Micrometer uses a flat metrics structure and requires dot notation for all metric names, because names are converted according to the registry type (see Micrometer's "Metric Naming" documentation). Because of this, the metric name ends up being multiple items in the Graphite hierarchy.
Compare that with the metric creation in a flat metrics system like Prometheus.
If we want to be fully compliant with Micrometer's conversion ability, then we have to use only dot notation in our metric names. We could add MeterFilters to our statsd sink to rewrite any metrics we want to have as a single metric. It looks like Micrometer supports some interesting conversions with labels for hierarchical systems (see Micrometer's HierarchicalNameMapper), which fully rewrites the metric name.
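To make the difference concrete, here is a minimal plain-Java sketch of how one dot-notation name could be rendered for a flat system versus a hierarchical one. It is a simplified illustration only, not Micrometer's actual naming code: the class `MetricNameSketch`, its methods, and the metric name `example.scans.queries` are all hypothetical, and real naming conventions also sanitize characters and may append suffixes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; this does not call or reproduce Micrometer.
public class MetricNameSketch {

    // Flat style (Prometheus-like): the whole dotted name collapses
    // into a single snake_case identifier.
    static String toFlatName(String dottedName) {
        return dottedName.replace('.', '_');
    }

    // Hierarchical style (Graphite-like): dots delimit path levels,
    // so each dot-separated segment becomes its own level in the tree.
    static List<String> toHierarchyLevels(String dottedName) {
        return Arrays.asList(dottedName.split("\\."));
    }

    public static void main(String[] args) {
        String name = "example.scans.queries"; // hypothetical metric name
        System.out.println(toFlatName(name));
        System.out.println(toHierarchyLevels(name));
    }
}
```

With a name written consistently in dot notation, both renderings are predictable. A name that mixes separators would split at unexpected places in the hierarchical case.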
What are the implications of this change in a bug fix release? What would the release notes for this change look like?
Keith asked:

I feel that the "bug" is that incorrect naming or an incorrect naming hierarchy could complicate or prevent metrics from being used correctly downstream. Getting the metrics correct in a 2.x release should be a priority because it is an LTM release and we expect that it will be used for quite a while.

The approach may hinge on whether metric names are considered part of the public API. Providing a mapping of previous -> new names in the release notes may be sufficient if the metrics are not a "hard" API requirement. If there is a stricter interpretation that metric names must be consistent across bug-fix versions, then we may want to consider releasing 2.1.2 as 2.2. But I am not sure what that implies as far as LTM designations go. We'd probably not want to maintain 2.1, 2.2, and 3.x.

I can see both viewpoints. Having metric names change between versions is irritating and could cause issues as monitors and alarms break or change with the version, and it could make trending across versions problematic. On the other hand, having a broken metric, so that I cannot monitor my system as intended and cannot get it fixed until the next major release, does not seem ideal either.
I don't have an opinion about making this change in 2.1 because I don't know enough. I am curious, though, and want to try to understand the implications. Reading the above point and earlier points, it seems like the current metric name is buggy. Is a potential problem with fixing it in 2.1 that someone may have adapted to the buggy name and still been able to use it?
The problem is two-fold.
I think targeting 2.1 to fix the possible meterRegistry failure is in scope for a bug fix change. Overall, I think we may need a better structure for metric generation to help simplify management and the generation of any corresponding documentation, similar to how the Property docs are generated.
LGTM. Metrics aren't public API, so I don't mind tweaking them, especially if they are being changed to make them stable and more capable of actually being used long-term. However, these should be called out in the release notes, in case anybody happened to be using the old names in an earlier 2.1 release.