
[Search Perf] Improve Query Frontend -> Querier Job Throughput #2464

Closed
joe-elliott opened this issue May 11, 2023 · 2 comments


joe-elliott commented May 11, 2023

The querier/query-frontend relationship uses code originally developed for Cortex many years ago, and it is likely showing its age. For larger queries Tempo will often create tens of thousands of jobs, which are piped from the query-frontend to the queriers one at a time. Currently, I believe, there is a bottleneck in delivering these jobs to the queriers at scale.

Code

In the querier you can control the number of jobs it will execute in parallel using max_concurrent_queries. For every concurrent query the querier starts a new goroutine and opens a gRPC connection to the query-frontend.

On the query-frontend side a goroutine is started for every Process call above, and they all block on a call to GetNextRequestForQuerier. The end result is often 10k+ goroutines in the query-frontend waiting on one mutex to deliver one job at a time downstream to a querier.
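To make the shape of this bottleneck concrete, here is a minimal, self-contained Go sketch of the pattern described in the two paragraphs above. This is not Tempo's actual code; apart from the reference to GetNextRequestForQuerier, all names and numbers are invented for illustration. Every querier connection gets its own goroutine on the frontend side, and all of those goroutines contend on a single mutex-protected queue that hands out one job per wakeup.

```go
// Illustrative model only; not Tempo's actual frontend/querier code.
package main

import (
	"fmt"
	"sync"
)

// job stands in for one search job produced by splitting a large query.
type job struct{ id int }

// requestQueue models the frontend's single queue: one slice of pending jobs
// guarded by one mutex, with a condition variable to wake blocked callers.
type requestQueue struct {
	mtx     sync.Mutex
	cond    *sync.Cond
	pending []job
}

func newRequestQueue() *requestQueue {
	q := &requestQueue{}
	q.cond = sync.NewCond(&q.mtx)
	return q
}

// enqueue is called as the frontend splits a query into jobs.
func (q *requestQueue) enqueue(j job) {
	q.mtx.Lock()
	q.pending = append(q.pending, j)
	q.mtx.Unlock()
	q.cond.Signal()
}

// getNextRequest plays the role of GetNextRequestForQuerier: every
// per-connection goroutine blocks here, all contending on the same mutex,
// and each wakeup hands exactly one job to one querier connection.
func (q *requestQueue) getNextRequest() job {
	q.mtx.Lock()
	defer q.mtx.Unlock()
	for len(q.pending) == 0 {
		q.cond.Wait()
	}
	j := q.pending[0]
	q.pending = q.pending[1:]
	return j
}

func main() {
	const (
		queriers             = 3 // the cluster described below runs ~100
		maxConcurrentQueries = 4 // and ~1000 per querier, i.e. ~100k goroutines
		totalJobs            = 24
	)

	q := newRequestQueue()
	var wg sync.WaitGroup

	// One goroutine per querier connection (one per Process call),
	// all parked in getNextRequest.
	for qr := 0; qr < queriers; qr++ {
		for c := 0; c < maxConcurrentQueries; c++ {
			wg.Add(1)
			go func(querier, conn int) {
				defer wg.Done()
				for i := 0; i < totalJobs/(queriers*maxConcurrentQueries); i++ {
					j := q.getNextRequest() // one mutex, one job per wakeup
					fmt.Printf("querier %d conn %d got job %d\n", querier, conn, j.id)
				}
			}(qr, c)
		}
	}

	// A large query fans out into many jobs pushed through the single queue.
	for i := 0; i < totalJobs; i++ {
		q.enqueue(job{id: i})
	}
	wg.Wait()
}
```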

Metrics

This is a graph of requests/second serviced by queriers during a longer TraceQL query. Notice how the least active querier starts 20-30 seconds later than the first queriers and how slow the ramp-up is over the course of the query:

[graph: per-querier requests/second during a long TraceQL query]

It should be noted that CPU or network saturation could also cause an effect like this.

Possible Solutions

1. It's possible that just reducing contention on the mutex linked above would yield improved querier performance. Perhaps we can find a way to efficiently shard that queue and spread the load across N mutexes (see the sketch after this list).

2. Rewrite the relationship between these two components. Perhaps, upon connection, the querier could pass the number of jobs it is willing to take, and the query-frontend could deliver a batch of jobs at once that the querier would respond to one at a time.
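As a purely illustrative sketch of option 1 above (not a proposed patch, and not Tempo's implementation; all names here are invented), the queue could be split into N shards, each with its own mutex, with a querier hashed to a home shard and falling back to the other shards when its own is empty. Connections would then contend on roughly 1/N of the lock traffic:

```go
// Illustrative sharded-queue sketch only; not Tempo's implementation.
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// job stands in for one search job produced by the query-frontend.
type job struct{ id int }

// shard is one independently locked slice of pending jobs.
type shard struct {
	mtx     sync.Mutex
	pending []job
}

// shardedQueue spreads lock contention across N mutexes instead of one.
type shardedQueue struct {
	shards []*shard
}

func newShardedQueue(n int) *shardedQueue {
	q := &shardedQueue{shards: make([]*shard, n)}
	for i := range q.shards {
		q.shards[i] = &shard{}
	}
	return q
}

// shardIndexFor hashes the querier ID so a given querier's connections
// always start on the same, smaller lock.
func (q *shardedQueue) shardIndexFor(querierID string) int {
	h := fnv.New32a()
	h.Write([]byte(querierID))
	return int(h.Sum32() % uint32(len(q.shards)))
}

// enqueue spreads jobs across shards (here simply by job ID; a real design
// would also need to respect per-tenant limits and fairness).
func (q *shardedQueue) enqueue(j job) {
	s := q.shards[j.id%len(q.shards)]
	s.mtx.Lock()
	s.pending = append(s.pending, j)
	s.mtx.Unlock()
}

// dequeue pops one job, starting at the querier's home shard and falling
// back to the other shards so work is never stranded on an idle shard.
func (q *shardedQueue) dequeue(querierID string) (job, bool) {
	start := q.shardIndexFor(querierID)
	for i := 0; i < len(q.shards); i++ {
		s := q.shards[(start+i)%len(q.shards)]
		s.mtx.Lock()
		if len(s.pending) > 0 {
			j := s.pending[0]
			s.pending = s.pending[1:]
			s.mtx.Unlock()
			return j, true
		}
		s.mtx.Unlock()
	}
	return job{}, false
}

func main() {
	q := newShardedQueue(4)
	for i := 0; i < 8; i++ {
		q.enqueue(job{id: i})
	}
	queriers := []string{"querier-0", "querier-1"}
	for n := 0; ; n++ {
		querier := queriers[n%len(queriers)]
		j, ok := q.dequeue(querier)
		if !ok {
			break
		}
		fmt.Printf("%s got job %d\n", querier, j.id)
	}
}
```

A real version would also have to preserve the per-tenant queueing the existing max_outstanding_per_tenant limit implies, which this sketch ignores.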

We are currently seeing a fair amount of querier imbalance and slow spin-up for larger queries. Removing this bottleneck would likely have a large positive impact on performance.

joe-elliott changed the title from "Improve Query Frontend -> Querier Job Throughput" to "[Search Perf] Improve Query Frontend -> Querier Job Throughput" on May 12, 2023
joe-elliott (Member Author) commented:

Some additional analysis on the rate at which queriers draw down the frontend queue.

Relevant config:

```yaml
query_frontend:
    max_outstanding_per_tenant: 100000
    search:
        concurrent_jobs: 75000
querier:
    max_concurrent_queries: 1000
```

With 100 queriers the cluster had capacity for 100,000 concurrent jobs (100 queriers × 1,000 max_concurrent_queries). I repeatedly executed an exhaustive query over 6 hours that created ~65k jobs.

Time spent in queue

```
histogram_quantile(.9, sum by (le) (rate(tempo_query_frontend_queue_duration_seconds_bucket{}[1m])))
```

[graph: p90 queue duration over the course of the test]
It appears that 10% of the jobs spent 4.5s or more in the queue waiting for a querier to service them.

Querier min/max/avg RPS
[graph: querier min/max/avg requests per second]
Still seeing an imbalance across queriers. Our busiest queriers are doing 4x the work of our slowest queriers.

joe-elliott (Member Author) commented:

A number of PRs have been merged to improve this situation. Closing this issue, as any future improvements would require a dedicated redesign of the relationship between the queriers and the frontend and should be tracked in their own issue.
