querier: Execute Selects concurrently per query #2657

Merged
merged 5 commits into from Jun 17, 2020

Conversation

kakkoyun
Member

@kakkoyun kakkoyun commented May 25, 2020

Enable Querier to dissect a query into pieces and execute them concurrently.

xref: prometheus/prometheus#7251
depends: #2748

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

  • I added a CHANGELOG entry for this change.

Changes

  • Implement async select for Querier
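
As a rough illustration of what running Selects concurrently can look like, here is a minimal sketch. It is not the code from this PR: `selectFunc` and `runSelects` are hypothetical names, and the buffered-channel semaphore is just one common way to cap the number of in-flight calls.

```go
package main

import (
	"fmt"
	"sync"
)

// selectFunc stands in for one storage Select call (hypothetical type,
// not the Prometheus or Thanos interface).
type selectFunc func() []string

// runSelects executes every select in its own goroutine while allowing at
// most maxConcurrent of them to run at once. Results keep the input order.
func runSelects(selects []selectFunc, maxConcurrent int) [][]string {
	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	results := make([][]string, len(selects))
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup
	for i, sel := range selects {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent selects are in flight
		go func(i int, sel selectFunc) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			results[i] = sel()       // each goroutine writes its own index
		}(i, sel)
	}
	wg.Wait()
	return results
}

func main() {
	selects := []selectFunc{
		func() []string { return []string{`up{job="a"}`} },
		func() []string { return []string{`up{job="b"}`} },
		func() []string { return []string{`up{job="c"}`} },
	}
	for i, series := range runSelects(selects, 2) {
		fmt.Println(i, series)
	}
}
```

With `maxConcurrent` set to 1 the selects run one at a time, which matches the conservative default discussed in the review below.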

Verification

  • make test-local
  • make test-e2e

Tracing

Screenshot 2020-06-17 09 11 47

@kakkoyun kakkoyun changed the title [WIP] querier: Implement async select for Querier [WIP] querier: Enable concurrent .Select per query May 25, 2020
@kakkoyun kakkoyun changed the title [WIP] querier: Enable concurrent .Select per query [WIP] querier: Enable concurrent Select per query May 26, 2020
@kakkoyun kakkoyun force-pushed the concurrent_select branch 3 times, most recently from 5242e45 to 031b017 Compare May 28, 2020 09:44
@kakkoyun kakkoyun force-pushed the concurrent_select branch 4 times, most recently from c01bc9b to c566595 Compare June 5, 2020 17:08
cmd/thanos/query.go Outdated Show resolved Hide resolved
@kakkoyun kakkoyun force-pushed the concurrent_select branch 6 times, most recently from d6eeef0 to 3439ad6 Compare June 12, 2020 12:03
@kakkoyun kakkoyun changed the title [WIP] querier: Enable concurrent Select per query querier: Enable concurrent Select per query Jun 12, 2020
@kakkoyun kakkoyun changed the title querier: Enable concurrent Select per query querier: Enable concurrent Selects per query Jun 12, 2020
@kakkoyun kakkoyun changed the title querier: Enable concurrent Selects per query querier: Execute Selects concurrently per query Jun 12, 2020
@kakkoyun
Member Author

Just a flaky test. Needs a re-run.

@kakkoyun kakkoyun marked this pull request as ready for review June 12, 2020 14:31
@kakkoyun kakkoyun requested a review from bwplotka June 15, 2020 08:55
@kakkoyun
Member Author

A test re-run would be awesome.

@bwplotka
Member

done, reloader is really flaky nowadays

@kakkoyun kakkoyun force-pushed the concurrent_select branch 2 times, most recently from 13f4ed1 to c611490 Compare June 16, 2020 06:55
Member

@povilasv povilasv left a comment

one question, other than that LGTM

@@ -67,6 +67,9 @@ func registerQuery(m map[string]setupFunc, app *kingpin.Application) {
maxConcurrentQueries := cmd.Flag("query.max-concurrent", "Maximum number of queries processed concurrently by query node.").
Default("20").Int()

maxConcurrentSelects := cmd.Flag("query.max-concurrent-select", "Maximum number of select requests made concurrently per a query.").
Member

A default of 4 feels a bit arbitrary? Maybe default to 1? As the current default of maxConcurrentQueries is 20, this might significantly increase resource usage?

Maybe we can read CPU limits etc. and autoconfigure those values based on the available CPU or the CPU assigned to cgroups?
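
One way the autoconfiguration idea could be sketched, purely as an assumption and not something this PR implements: divide the CPUs visible to the Go scheduler by the number of concurrent queries, with a floor of 1. `defaultSelectConcurrency` is a hypothetical helper. Whether `runtime.GOMAXPROCS(0)` reflects a cgroup CPU quota depends on the Go version and the environment, so a real implementation may need to read the limits explicitly.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultSelectConcurrency is a hypothetical heuristic: split the CPUs
// available to the process across the configured number of concurrent
// queries, never going below 1 select per query.
func defaultSelectConcurrency(availableCPUs, maxConcurrentQueries int) int {
	if availableCPUs < 1 || maxConcurrentQueries < 1 {
		return 1
	}
	perQuery := availableCPUs / maxConcurrentQueries
	if perQuery < 1 {
		return 1
	}
	return perQuery
}

func main() {
	// Passing 0 reads the current GOMAXPROCS value without changing it.
	cpus := runtime.GOMAXPROCS(0)
	fmt.Println(defaultSelectConcurrency(cpus, 20))
}
```

A CPU-based heuristic like this would undercount useful parallelism if the workload turns out to be I/O bound, so it is a starting point for experiments rather than a recommendation.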

Member

I think 1 for now, and experimenting on our prods to figure out a better default, is the way to go.

Member Author

Yes, it's an arbitrary number. I've actually left a comment on the PR, probably got removed between force pushes :D That's actually a point I want to discuss.

Autoconfigure sounds good, we can try that.

However, I also suspect this will be I/O bound, and having 20x4 in the worst case shouldn't hurt that much. That being said, maybe we can try to craft a benchmark to find the magic number.

Member

@bwplotka bwplotka left a comment

It looks solid, LGTM! 💪

I just wish we had some micro benchmarks for this path.

I started some work on this in #2305, but that is actually for the proxy part.

What we need here is a benchmark for multiple Selects on a large dataset returned by the underlying Store API.

If you are bored @kakkoyun or @krasi-georgiev, this is some work that has to be done at some point, in a separate PR from this one I think (:

@kakkoyun kakkoyun marked this pull request as draft June 16, 2020 17:46
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
@kakkoyun kakkoyun marked this pull request as ready for review June 17, 2020 08:29
@brancz brancz merged commit c9c60c1 into thanos-io:master Jun 17, 2020

4 participants