Sub query support for range selection. #1227

Open
fabxc opened this Issue Nov 18, 2015 · 27 comments

@fabxc
Member

fabxc commented Nov 18, 2015

Currently one can only do range selection on vector selectors and has to use recording rules to range-select expressions.

@brian-brazil

Member

brian-brazil commented Nov 18, 2015

An interesting question is how we choose the interval for this; currently it's not needed, as everything is calculated at a point. We should also look at how to discourage or prevent use of this in rules and alerts on performance grounds.

@grobie

Member

grobie commented Dec 24, 2015

👍

@Andor

Andor commented Apr 29, 2016

Currently we can run a query like rate(hit[1m]) + rate(miss[1m]), which is not very convenient.
IMHO a better syntax would be something like rate(hit+miss [1m]) or rate((hit+miss) [1m]).

@brian-brazil

Member

brian-brazil commented Apr 29, 2016

That's incorrect usage: you need to do the rate separately first, or you'll get the wrong answer around counter resets. I'd also suggest that the exporter expose either total & hit or total & miss, as that's a bit easier to work with.
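
To illustrate the point about counter resets, here is a minimal Python sketch. It assumes a simplified `increase` helper (hypothetical, not Prometheus code) that treats any drop in value as a reset to zero and ignores the extrapolation that the real rate() performs. The sample values are made up for the illustration.

```python
def increase(samples):
    """Total increase of a counter, treating any drop as a reset to zero.

    Simplified stand-in for reset handling; real rate() also extrapolates.
    """
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        # After a reset the counter restarted from 0, so the increase is `cur`.
        total += (cur - prev) if cur >= prev else cur
        prev = cur
    return total


# Hypothetical samples: `hit` resets between the 3rd and 4th sample.
hit = [100, 110, 120, 5, 15]
miss = [50, 55, 60, 65, 70]

# Handling each counter on its own, then adding the results.
separate = increase(hit) + increase(miss)

# Adding the raw counters first, then looking for resets in the sum.
combined = increase([h + m for h, m in zip(hit, miss)])

print(separate, combined)
```

In this sketch the combined series drops when `hit` resets, so the reset logic wrongly counts the whole current value of `miss` as new increase. That is why the two results differ.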

@guoshimin

guoshimin commented Jun 3, 2016

Being able to run range queries on arbitrary expressions is valuable for ad-hoc explorations, either for sanity checking before making recorded rules, or for analyzing past data where you didn't already have recorded rules.

@roman-vynar

Contributor

roman-vynar commented Nov 24, 2016

Will leave one of the references with discussion on this topic https://groups.google.com/forum/#!topic/prometheus-users/JOVfqQsVRl8

@virusdave

virusdave commented Apr 6, 2017

Is there any ticket tracking this feature request (historical function evaluation)?

@brian-brazil

Member

brian-brazil commented Apr 6, 2017

This is the issue for it. I wouldn't be too hopeful of it ever getting implemented.

@virusdave

virusdave commented Apr 6, 2017

OK. Is that due to a philosophical issue, or an eng-resource-scarcity problem?

@brian-brazil

Member

brian-brazil commented Apr 6, 2017

It's a semantic issue, and an operational one. It's unclear how to do this correctly, and even then its use would have to be greatly limited for safety.

@pparth

pparth commented Sep 26, 2017

I do not understand why this is not considered important at all. The requirement is simple: we want to generate a range vector from a function that returns an instant vector. For example, I want to execute something like:

sum_over_time(((sum(rate(server_request_hit{service_name=~"[[serviceName]]",http_response_status=~"5.."}[1m]))) / (sum(rate(server_request_hit{service_name=~"[[serviceName]]"}[1m]))) * vector(100)) > bool vector(0.1))

The inner query returns a time series whose value is 1 when the underlying value is > 0.1 and 0 when it is <= 0.1. This is how I calculate failure incidents. Then I just want to know how many seconds these incidents lasted in the selected time period. To do that, I need to execute the sum_over_time function, which is not applicable because the inner query is not considered a range vector.

In fact, Netflix Atlas allows this kind of operation without any issues.
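
The calculation described in this comment can be sketched in Python, assuming the error ratio has already been evaluated at a fixed step. The function name, the sample values, and the 15-second step are all hypothetical. Multiplying the count of flagged samples by the step is only an approximation of incident duration.

```python
def seconds_above(samples, threshold, step_seconds):
    """Approximate how long a series stayed above `threshold`.

    Mirrors `> bool` (1 where the comparison holds, else 0) followed by a
    sum over the window; each flagged sample is assumed to cover one step.
    """
    flags = [1 if value > threshold else 0 for value in samples]
    return sum(flags) * step_seconds


# Hypothetical error percentages, one sample every 15 seconds.
error_pct = [0.05, 0.2, 0.3, 0.1, 0.0, 0.15]

incident_seconds = seconds_above(error_pct, threshold=0.1, step_seconds=15)
print(incident_seconds)
```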

@brian-brazil

Member

brian-brazil commented Sep 26, 2017

That would be an inappropriate usage of this feature, what you want is a recording rule.

@pparth

pparth commented Sep 26, 2017

The way this is supported is through a recording rule, but that is not what I want. I cannot have a recording rule for each step of my query where an instant vector is generated, and this is just for a single query. It is also too difficult to require everyone who wants to create a dashboard to create a recording rule first. Adding recording rules is currently an administration process, and it is definitely not suitable for people who just want to create more elaborate queries for their dashboards.

@jeeyoungk

Contributor

jeeyoungk commented May 2, 2018

What would it take to implement this? I'm willing to jump through the hoops to make this happen. This makes a lot of x_over_time() functions useless when you have a lot of counters, as you almost always want to apply rate() first.

@brian-brazil

Member

brian-brazil commented May 2, 2018

See https://www.robustperception.io/composing-range-vector-functions-in-promql/

Most of the over_time functions make no sense with counters.

@juliusv

Member

juliusv commented Jul 20, 2018

Yeah, the x_over_time() use cases come up often, and obviously it's not nice to have to use recording rules as an intermediary step every time during ad-hoc exploration.

I'm also undecided on adding this feature, but let's say we would want it, how would we determine the sub-query evaluation interval? I think we could go two routes here:

  • Introduce a new range syntax like [<range>:<resolution>], so rate(foo[1m])[10m:15s] would mean: evaluate rate(foo[1m]) from 10 minutes ago until now, at a 15s resolution.
  • Introduce a new function like subquery(<expression>, <resolution>), so subquery(rate(foo[1m]), 15) would do the same and return a range vector.
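
As a rough illustration of what either option would compute, here is a Python sketch that evaluates an instant expression at every resolution step across the range and collects the results as a range of samples. The `eval_subquery` name and its signature are hypothetical, and including both ends of the window is an assumption of this sketch, not something the proposal specifies.

```python
from typing import Callable, List, Tuple


def eval_subquery(
    expr: Callable[[float], float],
    now: float,
    range_s: float,
    resolution_s: float,
) -> List[Tuple[float, float]]:
    """Evaluate `expr` at each step from `now - range_s` up to `now`.

    `expr` stands in for an instant query such as rate(foo[1m]) evaluated
    at a given timestamp. Returns (timestamp, value) pairs, oldest first.
    """
    steps = int(range_s // resolution_s)
    timestamps = [now - k * resolution_s for k in range(steps, -1, -1)]
    return [(t, expr(t)) for t in timestamps]


# Stand-in for rate(foo[1m])[10m:15s] with a dummy expression.
points = eval_subquery(lambda t: t * 2, now=1000, range_s=600, resolution_s=15)
print(len(points), points[0], points[-1])
```

A range-vector function such as min_over_time() could then consume `points` in the same way it consumes raw samples from a selector.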
@brian-brazil

Member

brian-brazil commented Jul 20, 2018

I was pondering the 2nd, if we ever add this. At the very least we'd want more protections against abusive queries first.

@roidelapluie

Contributor

roidelapluie commented Aug 19, 2018

In https://docs.google.com/document/d/1-C5PycocOZEVIPrmM1hn8fBelShqtqiAmFptoG4yK70/edit# the consensus is to adopt option 1.

Some notes:

I think it is fine to default the subquery evaluation to the global evaluation, but I would rather still keep the ":" separator:

rate(sum(foo)[5m:]) rather than rate(sum(foo)[5m]); for 2 reasons:

  • the : would mean that you know what you are doing; most of the cases should not require subqueries.
  • rate(foo[5m]) and rate(foo[5m:1m]) would be different, as the latter would have a chance to "hit" the boundaries.

Also subqueries would solve #3806 if we can do: rate(min without() foo)[5m:1m]

@codesome

Contributor

codesome commented Sep 10, 2018

@brian-brazil I would like to work on this if we are keen on adding it now.

@brian-brazil

Member

brian-brazil commented Sep 11, 2018

Sure, one thing is that the generated data will need to be aligned independently of the query parameters.

@sumkincpp

sumkincpp commented Sep 25, 2018

This feature is really important for being able to work with empty values.

For example, this one is adjusted exporter availability :

avg_over_time(((absent(up)-1) or up)[365d])*100
@henridf

Contributor

henridf commented Sep 26, 2018

@juliusv catching up and trying to understand option 1. So for the use case from the above-linked discussion (https://groups.google.com/d/msg/prometheus-users/JOVfqQsVRl8/PLA0hGAMCgAJ), would the query in option 1 syntax be this? :

min_over_time( rate(mysql_global_status_questions[2s])[60s])

And returning to your example, is it correct that rate(foo[1m])[10m:15s] would give the same result as a request to the query_range endpoint with the following params? :

  • query='rate(foo[1m])'
  • start=<now - 10m>
  • end=< now >
  • step=15s
@juliusv

Member

juliusv commented Sep 26, 2018

@henridf I think yes, pretty much.

@brian-brazil

Member

brian-brazil commented Sep 26, 2018

Almost. To preserve the invariant that query_range is syntactic sugar on top of query, the start time should be aligned with the step.
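
One way to read "aligned with the step" is that evaluation timestamps fall on multiples of the step rather than being counted back from the query time. The Python sketch below works under that assumption. The `aligned_steps` helper is hypothetical and is not taken from the Prometheus code base. Timestamps are integer seconds.

```python
def aligned_steps(now, range_s, step_s):
    """Timestamps in [now - range_s, now] that are multiples of `step_s`."""
    start = now - range_s
    # Round the window start up to the next multiple of the step.
    first = -(-start // step_s) * step_s
    return list(range(first, now + 1, step_s))


# Two query times one second apart yield the same evaluation points,
# so the generated data does not shift with the query parameters.
at_1003 = aligned_steps(now=1003, range_s=60, step_s=15)
at_1004 = aligned_steps(now=1004, range_s=60, step_s=15)
print(at_1003, at_1004)
```

Without alignment, the same two queries would evaluate at timestamps offset by one second from each other, so the subquery would see different samples each time.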

@henridf

Contributor

henridf commented Sep 26, 2018

So in theory at least, this could eventually remove the need for query_range?
(Not sure about others, but I was confused by the existence and role of query_range when first discovering the HTTP API.)

@brian-brazil

Member

brian-brazil commented Sep 26, 2018

No, I'd expect instant queries to still not return range vectors, and subqueries don't support arbitrary start points.

@henridf

Contributor

henridf commented Sep 27, 2018

Instant queries can already return range vectors...
