Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subquery #4831

Merged
merged 37 commits into from Dec 22, 2018
Merged

Subquery #4831

merged 37 commits into from Dec 22, 2018

Conversation

@codesome
Copy link
Member

@codesome codesome commented Nov 6, 2018

Fixes #1227

Based on this design doc: https://docs.google.com/document/d/1P_G87zN88YvmMr4iwLWygChMTZhai1L7S_c0awu1CAE/edit

Checklist

  • Lexer, with unit tests
  • Parser, with unit tests
  • Query engine
  • Unit tests for the working of query
  • Intensive real world tests
codesome added 3 commits Nov 3, 2018
* Introduced a new item called 'itemColon' => ':'. The colon present in the metric identifier wont be considered as itemColon.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome force-pushed the codesome:subqueries branch Nov 6, 2018
promql/parse.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome force-pushed the codesome:subqueries branch Nov 7, 2018
@codesome codesome force-pushed the codesome:subqueries branch to 4c74315 Nov 7, 2018
@codesome
Copy link
Member Author

@codesome codesome commented Nov 8, 2018

@brian-brazil I see 2 potential places to execute subqueries

  1. In Engine.populateSeries(...) we execute subquery and populate the series.
  2. In evaluator.eval(...) we execute the expression of subquery using another evaluator and return it.

Am I missing something here? Which one is preferred? (My choice is 1).

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Nov 8, 2018

It should be in eval I think.

@codesome
Copy link
Member Author

@codesome codesome commented Nov 10, 2018

While trying to device a solution in eval, I see 1 complication.

Consider this expression

rate(baz[1m]) +  rate( ( rate(foo[5m]) + rate(bar[10m]) )[15m:5s] )

As we are populating all the vector and matrix selectors beforehand, we would ideally like to load up to 25m back in time (15m of subquery and 10m of matrix selector inside it).

But as the maxOffset will be taken same for the entire expression, loading up to 25m back in time for baz is unnecessary, as it needs only up to 1m.

As there can be more nesting inside subqueries, this this added overhead fine?

@codesome
Copy link
Member Author

@codesome codesome commented Nov 10, 2018

Or I can selectively load for subqueries.

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Nov 10, 2018

As there can be more nesting inside subqueries, this this added overhead fine?

This is not a problem for local storage.

* The step of subquery is still not optional, need to be given.
* You cannot pass subquery as an argument to a function yet.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome force-pushed the codesome:subqueries branch Nov 12, 2018
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome force-pushed the codesome:subqueries branch to 7d07e42 Nov 12, 2018
@codesome
Copy link
Member Author

@codesome codesome commented Nov 12, 2018

@brian-brazil I am stuck at choosing default step for expressions of the form expr[5m:], where step is optional.

For range queries I am able to take the interval given with it. But I am not sure what is to be done in the case of an instant query.

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Nov 12, 2018

For range queries I am able to take the interval given with it. But I am not sure what is to be done in the case of an instant query.

In both cases, use the global evaluation interval.

@codesome
Copy link
Member Author

@codesome codesome commented Nov 12, 2018

I was unable to find out where the global evaluation interval is

I believe it is not this

EvaluationInterval model.Duration `yaml:"evaluation_interval,omitempty"`

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Nov 12, 2018

That is it.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome codesome changed the title [WIP] Subquery Subquery Nov 12, 2018
@codesome
Copy link
Member Author

@codesome codesome commented Nov 12, 2018

@brian-brazil I think this is good for a review now. I will be adding tests in the meanwhile.

codesome added 3 commits Nov 13, 2018
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
These tests are taken from selectors.test and functions.test. The range selector were just converted to subquery to generate these tests.
All the converted tests from selectors.test passed, but many from functions.test were failing due to small drift in the result, only the passing ones are in this commit.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Copy link
Contributor

@brian-brazil brian-brazil left a comment

How will this interact with sample counting and sample slice pooling?

cmd/prometheus/main.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/lex_test.go Outdated Show resolved Hide resolved
promql/parse.go Outdated Show resolved Hide resolved
promql/testdata/subquery.test Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@codesome codesome commented Dec 15, 2018

@brian-brazil I have added tests for edge data and offsets. The tests for queries that would return a matrix are in Go, rest all are in testdata directory.

What more do I need to add?

promql/testdata/subquery.test Outdated Show resolved Hide resolved
promql/parse.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Dec 18, 2018

You've got a test failure.

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Dec 18, 2018

The docs also need updating.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@codesome codesome commented Dec 18, 2018

Would the docs go inside docs/querying/basics.md?

@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Dec 18, 2018

Yeah, I think that makes most sense. Just before Operators.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
promql/value.go Show resolved Hide resolved
docs/querying/basics.md Outdated Show resolved Hide resolved
docs/querying/basics.md Outdated Show resolved Hide resolved
@@ -169,6 +169,20 @@ The same works for range vectors. This returns the 5-minutes rate that

rate(http_requests_total[5m] offset 1w)

## Subquery

Subquery allows you to run an instant query for a given range and resolution. Like how range vector selector is for vector selector, subquery is for an instant query. The result of a subquery is a matrix.

This comment has been minimized.

@brian-brazil

brian-brazil Dec 18, 2018
Contributor

This comparison to range vector selectors is confusing. It'd be instant vector expressions.

docs/querying/basics.md Outdated Show resolved Hide resolved
codesome added 2 commits Dec 19, 2018
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@codesome codesome commented Dec 20, 2018

I have updated the docs and tests (#1227 (comment), #1227 (comment)).

I haven't verified the below test - the value is got when rate(metric[5m]) is run at t=1m.

eval instant at 20m min_over_time(rate(metric[5m])[20m:1m])
  {} 0.12119047619047618

This is an example of nested subquery. The subquery for the rate uses default resolution. Note that using subqueries unnecessarily is unwise.

max_over_time(rate(http_requests_total[30s:5s])[10m:])

This comment has been minimized.

@brian-brazil

brian-brazil Dec 20, 2018
Contributor

This isn't a good example, you should use a range vector selector with rate.

This comment has been minimized.

@codesome

codesome Dec 21, 2018
Author Member

I thought so, will update it.

This comment has been minimized.

@codesome

codesome Dec 21, 2018
Author Member

max_over_time(rate(rate(distance_covered_total[5s])[30s:5s])[10m:]) - does this look sane? Looks like max acceleration over time.

This comment has been minimized.

@brian-brazil

brian-brazil Dec 21, 2018
Contributor

A rate of a rate doesn't make sense, as rate can only be applied to a counter and the output of rate is a gauge. A deriv of a rate would make sense though.

This comment has been minimized.

@codesome

codesome Dec 21, 2018
Author Member

So does max_over_time(deriv(rate(distance_covered_total[5s])[30s:5s])[10m:]) look fine for the doc?

This comment has been minimized.

@brian-brazil

brian-brazil Dec 21, 2018
Contributor

Yes, that's fine.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome
Copy link
Member Author

@codesome codesome commented Dec 22, 2018

I feel that it is pretty much done. Anything missing @brian-brazil?

@brian-brazil brian-brazil merged commit dbe55c1 into prometheus:master Dec 22, 2018
4 checks passed
4 checks passed
@dco
DCO DCO
Details
ci/circleci: build Your tests passed on CircleCI!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@brian-brazil
Copy link
Contributor

@brian-brazil brian-brazil commented Dec 22, 2018

Thanks!

@juliusv
Copy link
Member

@juliusv juliusv commented Dec 22, 2018

Wow, congrats on getting this done!

@roidelapluie
Copy link
Member

@roidelapluie roidelapluie commented Dec 22, 2018

this is huge!! congratz!!

@codesome
Copy link
Member Author

@codesome codesome commented Dec 22, 2018

Thanks @brian-brazil for all the patient reviews!

@codesome codesome deleted the codesome:subqueries branch Dec 28, 2018
@codesome codesome mentioned this pull request Dec 28, 2018
4 tasks done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants