Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subquery #4831

Merged
merged 37 commits into from Dec 22, 2018
Merged

Subquery #4831

merged 37 commits into from Dec 22, 2018

Conversation

codesome
Copy link
Member

@codesome codesome commented Nov 6, 2018

Fixes #1227

Based on this design doc: https://docs.google.com/document/d/1P_G87zN88YvmMr4iwLWygChMTZhai1L7S_c0awu1CAE/edit

Checklist

  • Lexer, with unit tests
  • Parser, with unit tests
  • Query engine
  • Unit tests for the working of query
  • Intensive real world tests

* Introduced a new item called 'itemColon' => ':'. The colon present in the metric identifier wont be considered as itemColon.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
promql/parse.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

codesome commented Nov 8, 2018

@brian-brazil I see 2 potential places to execute subqueries

  1. In Engine.populateSeries(...) we execute subquery and populate the series.
  2. In evaluator.eval(...) we execute the expression of subquery using another evaluator and return it.

Am I missing something here? Which one is preferred? (My choice is 1).

@brian-brazil
Copy link
Contributor

It should be in eval I think.

@codesome
Copy link
Member Author

While trying to device a solution in eval, I see 1 complication.

Consider this expression

rate(baz[1m]) +  rate( ( rate(foo[5m]) + rate(bar[10m]) )[15m:5s] )

As we are populating all the vector and matrix selectors beforehand, we would ideally like to load up to 25m back in time (15m of subquery and 10m of matrix selector inside it).

But as the maxOffset will be taken same for the entire expression, loading up to 25m back in time for baz is unnecessary, as it needs only up to 1m.

As there can be more nesting inside subqueries, this this added overhead fine?

@codesome
Copy link
Member Author

Or I can selectively load for subqueries.

@brian-brazil
Copy link
Contributor

As there can be more nesting inside subqueries, this this added overhead fine?

This is not a problem for local storage.

* The step of subquery is still not optional, need to be given.
* You cannot pass subquery as an argument to a function yet.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@brian-brazil I am stuck at choosing default step for expressions of the form expr[5m:], where step is optional.

For range queries I am able to take the interval given with it. But I am not sure what is to be done in the case of an instant query.

@brian-brazil
Copy link
Contributor

For range queries I am able to take the interval given with it. But I am not sure what is to be done in the case of an instant query.

In both cases, use the global evaluation interval.

@codesome
Copy link
Member Author

I was unable to find out where the global evaluation interval is

I believe it is not this

EvaluationInterval model.Duration `yaml:"evaluation_interval,omitempty"`

@brian-brazil
Copy link
Contributor

That is it.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome changed the title [WIP] Subquery Subquery Nov 12, 2018
@codesome
Copy link
Member Author

@brian-brazil I think this is good for a review now. I will be adding tests in the meanwhile.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
These tests are taken from selectors.test and functions.test. The range selector were just converted to subquery to generate these tests.
All the converted tests from selectors.test passed, but many from functions.test were failing due to small drift in the result, only the passing ones are in this commit.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will this interact with sample counting and sample slice pooling?

cmd/prometheus/main.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/lex_test.go Outdated Show resolved Hide resolved
promql/parse.go Outdated Show resolved Hide resolved
promql/testdata/subquery.test Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@brian-brazil I have added tests for edge data and offsets. The tests for queries that would return a matrix are in Go, rest all are in testdata directory.

What more do I need to add?

promql/testdata/subquery.test Outdated Show resolved Hide resolved
promql/parse.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@brian-brazil
Copy link
Contributor

You've got a test failure.

@brian-brazil
Copy link
Contributor

The docs also need updating.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

Would the docs go inside docs/querying/basics.md?

@brian-brazil
Copy link
Contributor

Yeah, I think that makes most sense. Just before Operators.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
promql/value.go Show resolved Hide resolved
docs/querying/basics.md Outdated Show resolved Hide resolved
docs/querying/basics.md Outdated Show resolved Hide resolved
@@ -169,6 +169,20 @@ The same works for range vectors. This returns the 5-minutes rate that

rate(http_requests_total[5m] offset 1w)

## Subquery

Subquery allows you to run an instant query for a given range and resolution. Like how range vector selector is for vector selector, subquery is for an instant query. The result of a subquery is a matrix.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comparison to range vector selectors is confusing. It'd be instant vector expressions.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

I have updated the docs and tests (#1227 (comment), #1227 (comment)).

I haven't verified the below test - the value is got when rate(metric[5m]) is run at t=1m.

eval instant at 20m min_over_time(rate(metric[5m])[20m:1m])
  {} 0.12119047619047618


This is an example of nested subquery. The subquery for the rate uses default resolution. Note that using subqueries unnecessarily is unwise.

max_over_time(rate(http_requests_total[30s:5s])[10m:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a good example, you should use a range vector selector with rate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought so, will update it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_over_time(rate(rate(distance_covered_total[5s])[30s:5s])[10m:]) - does this look sane? Looks like max acceleration over time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rate of a rate doesn't make sense, as rate can only be applied to a counter and the output of rate is a gauge. A deriv of a rate would make sense though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does max_over_time(deriv(rate(distance_covered_total[5s])[30s:5s])[10m:]) look fine for the doc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's fine.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

I feel that it is pretty much done. Anything missing @brian-brazil?

@brian-brazil brian-brazil merged commit dbe55c1 into prometheus:master Dec 22, 2018
@brian-brazil
Copy link
Contributor

Thanks!

@juliusv
Copy link
Member

juliusv commented Dec 22, 2018

Wow, congrats on getting this done!

@roidelapluie
Copy link
Member

this is huge!! congratz!!

@codesome
Copy link
Member Author

Thanks @brian-brazil for all the patient reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants