Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vector/vector filter comparisons return match group labels rather than LHS labels #5326

Open
juliusv opened this Issue Mar 10, 2019 · 16 comments

Comments

Projects
None yet
3 participants
@juliusv
Copy link
Member

juliusv commented Mar 10, 2019

See http://demo.robustperception.io:9090/graph?g0.range_input=1h&g0.expr=node_cpu_seconds_total%20%3D%3D%20on(cpu%2C%20instance%2C%20job%2C%20mode)%20node_cpu_seconds_total&g0.tab=1&g1.range_input=1h&g1.expr=node_cpu_seconds_total%20%3D%3D%20node_cpu_seconds_total&g1.tab=1

When adding an on() modifier to a vector-to-vector filter, the metric name is dropped. Without the modifier, the metric name is retained. According to the docs at https://prometheus.io/docs/prometheus/latest/querying/operators/#comparison-binary-operators, the metric name should always be kept for filter operators (when not using the bool modifier).

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Mar 10, 2019

Btw., when including __name__ in the on() modifier, the metric name is retained.

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Mar 10, 2019

To generalize, when including an on() clause, only labels mentioned in that clause are retained. E.g. node_cpu_seconds_total == on(cpu, instance, mode, __name__) node_cpu_seconds_total will drop the job label from the result.

@roidelapluie

This comment has been minimized.

Copy link
Contributor

roidelapluie commented Mar 10, 2019

sounds like a breaking change.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 10, 2019

Yeah, that'd be breaking. I'd need to think on this one.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 10, 2019

node_cpu_seconds_total == bool node_cpu_seconds_total also leaves the metric name in place, that's definitely a bug.

@brian-brazil brian-brazil changed the title Filter operators drop metric name when using on() modifier Vector/vector filter comparisons return match group labels rather than LHS labels Mar 10, 2019

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Mar 10, 2019

So we've had bug fixes in the past that technically changed behavior, but since they were bug fixes they were deemed ok most of the time. Probably it depends on how much we expect people to rely on this behavior already and how breaking the change would likely be for those people. On the other hand if we don't fix it, we need to document that it's broken.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 10, 2019

Okay, we've no unittests covering this. The documented behaviour is what makes sense consistency wise, so the question is how breaking is this. __name__ I'm not worried about, however starting to return other labels could cause issues.

@roidelapluie

This comment has been minimized.

Copy link
Contributor

roidelapluie commented Mar 10, 2019

For the time being, users who are affected by this can get the desired behavior by adding group_left()

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 10, 2019

True, we could define this to be the correct behaviour and leave that as a workaround.

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Mar 10, 2019

I'd prefer calling it a bug and fixing it, but also not sure where the boundary for that should be. Officially defining PromQL to be this weirdly inconsistent would trigger me at least :)

@roidelapluie

This comment has been minimized.

Copy link
Contributor

roidelapluie commented Mar 10, 2019

I would propose to first see if that is a regression or an all time bug.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 10, 2019

If it's a regression, it's from one of the rewrites: either mine last year, or the one Fabian did a few years back. I'd suspect it's always been like this.

@roidelapluie

This comment has been minimized.

Copy link
Contributor

roidelapluie commented Mar 10, 2019

I tried prometheus-2.0.0-beta.1 and the bug was there

roidelapluie added a commit to roidelapluie/prometheus that referenced this issue Mar 10, 2019

Add test for vector filter comparisons
see prometheus#5326

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Mar 11, 2019

Thinking on it, if anyone is depending on this behaviour they can always wrap a sum() around the LHS as we know there's only one series on the LHS per group. Otherwise it'd be causing a many-to-one errror.

@juliusv

This comment has been minimized.

Copy link
Member Author

juliusv commented Apr 3, 2019

@brian-brazil Does that mean we should fix it or keep it as it is?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Apr 3, 2019

I think we should fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.