New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spring metrics endpoint duplicate metric name handling #6404
Comments
I am not sure I follow. Spring Boot currently supports counters and gauges. That metric is from the latter category where we record one value. If we have two events from the same key, we'll record |
I was just wondering how a system should behave when if finds 2 metrics with the same name inside the publicMetrics collection. (I'm using a prometheus spring boot client lib that assumes that the metric names coming out of that collection are unique. Is that a correct assumption or should it remove the duplicates (by for example storing the collection in a map) I also wasn't aware this was a gauge. I assumed it was a counter (as shown here)
This particular metric is coming from Zuul, so a Spring Cloud library is responsible for converting this into a Spring metric. I noticed that the ServoMetricReader is converting this counter into a metric :
The ServoMetricReader converts this Into normalized.servo.zuul.filter-formbodywrapperfilter (it takes into account some tags, but not the ones specified here I was just wondering if
|
To answer your last 2 questions: The issue you have is with a particular Spring Cloud adapter (from servo to Spring Boot metrics), so it might be worth raising the issue there, and proposing a fix if you have one.
|
Can you please clarify what your intended semantics are? Is the status here part of the identity of the metric, or is it merely an annotation that should be ignored and thrown away? If it's part of the identity, why are metrics being incorrectly and silently thrown away? If it's an annotation, why do you permit users to have duplicate metrics and what is the correct algorithm to select the correct duplicate? |
You're asking the question in the wrong place (as already indicated above). If you want to include the servo status in the metric name, please open an issue or maybe a PR in spring cloud netflix. |
My question is about Spring itself - what do you expect providers to put here? How do you expect consumers to process this data? I'm the maintainer of the Prometheus Java client library, and this came up as a problem when integrating with Spring. Telling us to go talk to another team doesn't really help, as we don't know exactly what is going wrong here. We need to solve this generally and that means understanding the intended semantics of Spring metrics. |
The semantics of Spring Boot metrics are simple, and actually pretty obvious from the Java APIs - a metric has a name, a value and a timestamp. That's it. It's intentionally very simple. If people want to attach semantics to particular formats in the metric names, that's understandable, and normal (we copied graphite, basically, and there's a huge community of people doing the same over there), but it isn't contained in the Spring Boot API. You actually do need to talk to the people who designed individual metrics to understand what they intended (and it looks like there might be a bug in the servo metrics, but this isn't the right place to fix that). Does that help? |
So to paraphrase, that the servo people have these duplicate metrics is likely a misunderstanding on their end and not considered to be valid metrics by Spring? |
To clarify a bit here, the issue as hand is whether Spring is actually following the Graphite data model. Thus far we'd been presuming that you are. This issue is about a case which indicates that either a) this is not the graphite data model or b) there's a bug/design flaw somewhere. |
Anything with a name, value and timestamp is valid, so that's not completely accurate, is it? I think you get it. |
In the example in the original post we have two metrics with identical names and timestamps but different values. That can't be valid in a metrics-based instrumentation system. If Spring considers this okay because of the differing tags, then you're following the Prometheus data model. We'll need to update our code to take advantage of that and the spring boot JSON metrics output needs to be fixed as dropping this vital information on the floor would be a major bug. If Spring considers this okay, by not forbidding it though code or convention, then that's a fundamental flaw and we should try to turn it into one of the other cases. If Spring doesn't consider this okay, then this is a provider bug combined with a flaw that allows such invalid state to be represented (i.e. PublicMetrics uses a Collection rather than a Map). The docs/sanity checks need improving to avoid confusion in the future and the code we have on the Prometheus end is already correct. Where duplicate entries would be valid would be an event logging based system, as multiple distinct events with the same name can happen in the same second. I'm fairly sure you didn't intend to build such a system and it's plausible that providers have gotten confused about this, so better docs/sanity checks will also cover that. I'm guessing this is what happened, combined with the 3rd case. |
Just to further what @brian-brazil is saying... There is clearly a disconnect here between what the At least to the casual user data (myself) data is being overwritten, or omitted and/or then magically created. If that's how the actuator collector works, that's fine, but it should be documented that the inputs need to ensure proper namespace separation to avoid data corruption. Let's be clear here, we're basically talking about the potential for silent data corruption in our monitoring system. If this is a real possibility, then the documentation needs to be updated to clearly define the PreConditions for use. (And it would be good to log warnings as possible). Agreed that there might also need to be a spring-cloud-netflix update depending on the decision about semantics (and corresponding documentation/code updates). |
I still don't see why anything needs to change in Spring Boot (which doesn't rule out ever making changes, I'm just talking about the need with the current codebase). The metrics have a name, and they don't have tags or annotations or dimensions or anything fancy. Why would we need to document something that isn't there? |
@dsyer Thanks. I think you just said it. Might be worth putting that note somewhere in the docs. @brian-brazil : I think this means that metrics in spring boot aren't intended to be per-se dimensional and therefore uniqueness needs to be determined by name and not the name + tags. Perhaps then defining the actual name as something like: "counter_xxx{ image = "foo" }" is the appropriate semantics? |
There's an implicit assumption in the current code that isn't documented, and at least two providers are violating this assumption. Thus documentation is appropriate to help clarify this situation, so we can then go to these providers, point at the clarified docs and get the bugs fixed on their end. More specifically, we need to clearly define the constraints on what can go in a PublicMetrics. Right now there are no constraints that I can find documented on PublicMetrics.
That's a completely separate discussion. Given this is Graphite-like I'd be very wary of making it look Prometheus-like, as I can see that creating much breakage. Better to use the Prometheus client library directly if you want to go down that route, rather than trying to hack it into Spring. |
What kind of constraints would you like? I'm not really getting the point of this discussion yet. It is what it is, and the existing APIs are pretty simple to understand (although I admit maybe we could be more explicit about which ones we consider public). |
That the name must be unique within a PublicMetrics collection. |
Currently the metrics endpoint stores all metrics in a hashmap, thus avoiding duplicates but potentially overwriting important metric values.
For example the following Zuul Monitors expose the following monitors :
That will results in the following metrics:
Only one will end up in the metrics endpoint.
Not sure if this is really a spring boot issue, as I can imagine that the metrics provider should guarantee uniqueness in terms of names, but the metrics endpoint is silently ignoring (by overwriting) some potentially important metrics this way.
The text was updated successfully, but these errors were encountered: