[Search Perf] Query Frontend Caching #2470

Closed
joe-elliott opened this issue May 12, 2023 · 3 comments
Labels
area/query keepalive Label to exempt Issues / PRs from stale workflow type/performance

Comments

@joe-elliott
Member

As more users embed TraceQL queries in dashboards, it has become apparent that we need to improve our caching to handle repeated queries with slightly adjusted time ranges (e.g. auto-refreshing dashboards). Currently we only cache parquet footers and bloom filters.

Let's add a cache at the query-frontend at the individual "job" level. After a query is broken into a stream of jobs, we will cache based on the individual job URL, which takes into account the query, block ID, row groups, etc. For a given job the results are immutable because the blocks don't change, so if we have previously executed a query we can expect the results to be the same.
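A minimal sketch of the job-level lookup described above, in Go. Everything here is an assumption made for illustration: `jobCache`, `keyForJob`, and `runJob` are hypothetical names, the in-memory map stands in for a real cache backend, and `exec` stands in for sending the job to a querier. It is not Tempo's actual implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sync"
)

// jobCache is a hypothetical in-memory stand-in for the cache backend
// the query-frontend would use.
type jobCache struct {
	mu      sync.RWMutex
	entries map[string][]byte
}

func newJobCache() *jobCache {
	return &jobCache{entries: make(map[string][]byte)}
}

// keyForJob hashes the full job URL, so the query, block ID, row groups,
// and any other URL parameters all contribute to the cache key.
func keyForJob(jobURL string) string {
	sum := sha256.Sum256([]byte(jobURL))
	return hex.EncodeToString(sum[:])
}

// runJob returns the cached result for jobURL when one exists. Otherwise it
// calls exec (a placeholder for a querier request) and stores the result.
// The boolean reports whether the result came from the cache.
func (c *jobCache) runJob(jobURL string, exec func(string) ([]byte, error)) ([]byte, bool, error) {
	key := keyForJob(jobURL)

	c.mu.RLock()
	cached, ok := c.entries[key]
	c.mu.RUnlock()
	if ok {
		return cached, true, nil
	}

	res, err := exec(jobURL)
	if err != nil {
		// Errors are not cached, so a failed job is retried next time.
		return nil, false, err
	}

	c.mu.Lock()
	c.entries[key] = res
	c.mu.Unlock()
	return res, false, nil
}
```

Because blocks are immutable, the sketch never invalidates entries. A real deployment would presumably still want a TTL or size bound on the cache.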

Caveats:

  1. We can only rely on the cache if the start/end time range completely encapsulates the block. Use the block metadata to determine this. If the start/end range only partially overlaps the block, we have to issue the job to the queriers because the cache can't be trusted.

  2. The start/end time range needs to be stripped from the URL before hashing for the cache. This way, as a dashboard slowly moves across a time range, we will generally pull from cache for most blocks and only issue requests to the queriers for blocks on the edges of the time range and for new blocks created by compactors.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.

@github-actions github-actions bot added the stale Used for stale issues / PRs label Jul 12, 2023
@joe-elliott joe-elliott added keepalive Label to exempt Issues / PRs from stale workflow and removed stale Used for stale issues / PRs labels Jul 25, 2023
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.

@github-actions github-actions bot added the stale Used for stale issues / PRs label Oct 12, 2023
@joe-elliott joe-elliott removed the stale Used for stale issues / PRs label Oct 12, 2023
@joe-elliott
Member Author

Completed with #3225

Projects
Status: Done