PromQL cleanup for 2.0 #3060

Closed
brian-brazil opened this Issue Aug 11, 2017 · 24 comments

@brian-brazil
Member

brian-brazil commented Aug 11, 2017

Considering we're allowed breaking changes to PromQL for 2.0, I'd like us to consider removing the following:

  1. delta. Discussed previously on #1772 with no strong conclusion. I have yet to see any valid use cases, and it remains an attractive nuisance, with users regularly using it inappropriately.

  2. keep_common. I don't think anyone is really using it, and it has semantic issues as you don't know what labels will come out the other end.

  3. drop_common_labels. This does seem to be used occasionally for display purposes. It has the same semantic issues as keep_common, so I wouldn't mind removing it.

  4. count_scalar. Use cases are better handled by absent() or correct propagation of labels in operations (see the sketch below).

What do others think?
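
For instance, the typical count_scalar pattern for detecting a missing series maps onto absent() roughly like this (just a sketch; the job name is made up):

# 1.x style: count_scalar() returned a scalar count of matching series,
# which you then compared against 0 to detect that nothing was there
count_scalar(up{job="myjob"})

# 2.0 style: absent() returns 1 only when no series match the selector
absent(up{job="myjob"})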

@fabxc

Member

fabxc commented Aug 11, 2017

I proposed dropping most of our log customization options as they can generally be solved outside of Prometheus and are just bloat for us: https://groups.google.com/forum/#!topic/prometheus-developers/Dc2wRMyhqlE

@brian-brazil brian-brazil added this to the v2.x milestone Aug 11, 2017

@fabxc

Member

fabxc commented Aug 11, 2017

  1. As before, no strong feelings on delta but @beorn7 has strong concerns which I believe still exist. If so, they are strong enough to err on the side of not changing anything.
  2. I'd be very in favour of getting rid of keep_common. It's a wart.
  3. Same.
  4. No strong opinion
@beorn7

Member

beorn7 commented Aug 11, 2017

@brian-brazil

Member Author

brian-brazil commented Aug 11, 2017

I found the conclusion in #1772 convincing and strong.

The conclusion in #1772 was that there was no conclusion.

In relation to your last set of points, for the third the behaviour is the same for both deriv and delta. For the fourth point I think the salient issue there is the range you choose. Personally I'd graph the raw queue size in the dashboards rather than trying to find a function that captures exactly what I want. If you want something really responsive, there's always idelta.

On the other 3 proposals, I take it you're in favour.

@fabxc

Member

fabxc commented Aug 11, 2017

I think the last discussion on delta was quite elaborate and both sides have read and carefully considered each other's arguments.
I doubt that we will change any minds at this point and there's virtually no downside to keeping delta around.

@brian-brazil

Member Author

brian-brazil commented Aug 11, 2017

there's virtually no downside to keeping delta around.

I'd disagree on this point. Users are pretty consistently abusing it, such as using it on counters.
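
For example, applied to a counter (sketched here with the conventional http_requests_total example metric), delta() ignores counter resets and can even go negative, while increase() and rate() are the counter-aware alternatives:

# delta() is intended for gauges; on a counter it ignores resets and can go negative
delta(http_requests_total[5m])

# counter-aware alternatives
increase(http_requests_total[5m])
rate(http_requests_total[5m])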

@fabxc

Member

fabxc commented Aug 11, 2017

I'd disagree on this point. Users are pretty consistently abusing it, such as using it on counters.

Not to the point where I find it enough to justify going against @beorn7's strong and certainly valid objections. I really think we have to agree to disagree on this one, just like we did with the job label thing, where we also opted for no change when in doubt.


Just realized this issue is only about PromQL. Just ignore my first comment on this :)

brian-brazil added a commit that referenced this issue Oct 5, 2017

brian-brazil added a commit that referenced this issue Oct 5, 2017

@flyingmutant


flyingmutant commented Oct 17, 2017

Can you please explain what should be used instead of drop_common_labels?

@flyingmutant


flyingmutant commented Oct 18, 2017

@brian-brazil should I open a separate issue to get an actual answer?

While drop_common_labels may not be the most "core" or semantically sound thing, it is extremely useful for interactive data exploration and analysis. Removing it right before the release with no actual discussion or justification seems strange to say the least.

@huangll


huangll commented Jan 24, 2018

I tried to use the histogram_quantile function in the console, but it always returns "no data".
Can anyone tell me if it is still supported in Prometheus version 2.0?

The query looks like this:
histogram_quantile(0.9,rate(go_goroutines_bucket{ip="127.0.0.1"}[5m]))

@civik


civik commented Jun 13, 2018

@brian-brazil Is there any best practice replacement for drop_common_labels()? Like @flyingmutant referenced above, there really isn't a great way for consumers of metrics to drop common garbage labels at query time. This is especially impactful for visualizations such as Grafana tables that rely totally on the query results to define the table structure. d_c_l() was a hack, but it was a very useful one, and sometimes that's enough to warrant keeping it around.

@brian-brazil

Member Author

brian-brazil commented Jun 13, 2018

The intended way to handle garbage labels is to remove them in either the target, service discovery, or relabelling that is introducing them.

@civik


civik commented Jun 13, 2018

Yes, that is the 'intended' way. The problem is that the labels might very much be needed, but are not relevant to the query output and thus become 'garbage', or maybe a better term, 'redundant'.

Take this fictional metric: restarts_pod_total with labels cluster, datacenter, app, namespace, and pod.

Now I want to make a table visualization of this metric with a query like restarts_pod_total{namespace='sooperfunapp', datacenter='east'}

It will return ALL the labels of the matching series, many of which are redundant now that the data has been filtered. If I'm making a table of "Pod Restarts For SooperFunApp in East DC", it's now redundant to have the namespace, datacenter, and app fields in my visualization. It's now incumbent on whatever consumes the data to provide that label filtering for me.

For better or worse d_c_l() was a pretty effective tool to deal with these situations.

@brian-brazil

Member Author

brian-brazil commented Jun 13, 2018

That sounds like a feature request for Grafana.

@civik


civik commented Jun 13, 2018

They'll tell me to make Prometheus fix it. Anyway, this has just been something my users have been telling me, so I thought I would push the feedback up. Thanks!

@juliusv

Member

juliusv commented Jun 13, 2018

@civik One problem with drop_common_labels() was that, depending on the day and the phase of the moon, labels could sometimes be common and sometimes not. For example, you might have labels other than namespace and datacenter on that restarts_pod_total metric that sometimes have just one value (e.g. because no other pod exists yet, or something like that) and sometimes more than one, and then those labels would also disappear unpredictably.

@flyingmutant


flyingmutant commented Jun 14, 2018

@juliusv can you please tell how to handle the (extremely common) situation @civik described?

@kfdm


kfdm commented Jun 14, 2018

I believe you should generally use the "Legend Format" in Grafana (http://docs.grafana.org/features/datasources/prometheus/#query-editor) to show only the labels that you are interested in seeing.

@juliusv

Member

juliusv commented Jun 14, 2018

Yep.

@civik


civik commented Jun 14, 2018

@juliusv I'm not implying that drop_common_labels() is the right solution. Maybe this moves to more of a feature request. It would be nice to have some mechanism to tailor the label output at query time, so as not to require this functionality in anything that consumes Prom data.

@kfdm Nope, that only controls the legend naming. The problem is that every Grafana panel is basically its own 'subapp', and they oftentimes behave very differently. In the case of a graph this problem doesn't really exist, because the graph panel is only interested in the values, and the labels just become part of the legend. However, other panels, like the table panel that I mentioned above, assume that every label should get its own column.

@flyingmutant I have discovered a slightly hacky way to drop labels by using aggregation operators, like sum by (pod, app) (restarts_pod_total{namespace='sooperfunapp', datacenter='east'}), but this only works when using an aggregation makes sense with the data you are querying.

@civik


civik commented Jun 14, 2018

A super simple solution would be to allow the without keyword with any query, not just aggregations.

@juliusv

Member

juliusv commented Jun 14, 2018

I see, yeah, for Grafana tables the legend format doesn't help.

sum without(all, the, common, labels) (x) should actually work for every case if you know what labels are common. The sum() then doesn't actually do anything because every summing group has only one element and is left unchanged, but the common labels are dropped. Example: http://demo.robustperception.io:9090/graph?g0.range_input=1h&g0.expr=sum%20without(job%2C%20instance)%20(node_cpu)&g0.tab=1&g1.range_input=1h&g1.expr=node_cpu&g1.tab=1

A clunkier variant of explicitly removing a label would be label_replace(x, "label_to_drop", "", "", ""), as empty labels are equivalent to missing ones in Prometheus (they will not be part of the output). Example: http://demo.robustperception.io:9090/graph?g0.range_input=1h&g0.expr=label_replace(label_replace(node_cpu%2C%20%22job%22%2C%20%22%22%2C%20%22%22%2C%20%22%22)%2C%20%22instance%22%2C%20%22%22%2C%22%22%2C%20%22%22)&g0.tab=1&g1.range_input=1h&g1.expr=node_cpu&g1.tab=1
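
For instance, both approaches applied to the hypothetical restarts_pod_total metric from the earlier comment (just a sketch; the metric and label names come from that example):

# No-op aggregation: each group contains a single series, so only the listed labels are dropped
sum without(namespace, datacenter, app) (restarts_pod_total{namespace='sooperfunapp', datacenter='east'})

# Clunkier per-label variant: blanking a label removes it, since empty labels equal missing ones
label_replace(restarts_pod_total{namespace='sooperfunapp', datacenter='east'}, "namespace", "", "", "")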

So that's not as automatic as drop_common_labels(), but it at least forces you to think about which labels are actually always common, not just on some days. I guess drop_common_labels is unlikely to be added back because it's too easy to get unexpected results... I personally don't have strong feelings about it (I actually added drop_common_labels in the early days of Prometheus), but I can see that we would want to be conservative about what functionality we allow in PromQL.

@civik


civik commented Jun 18, 2018

sum without(all, the, common, labels) (x) seems like a winner. Thanks, @juliusv!

@lock


lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
