Conversation

@thetruecpaul
Contributor

Two changes:

  1. Add a caching layer so that we can avoid hitting the DB for hot groups.
  2. We don't need the sort in 95% of cases. Let's raise that into Python and only do it when necessary.
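
A minimal sketch of the first change, assuming Django's cache framework as the backing store and Sentry's GroupRedirect model (previous_group_id → group_id); the helper name, cache key format, and TTL here are illustrative, not the actual code in this PR:

from django.core.cache import cache

from sentry.models.groupredirect import GroupRedirect

REDIRECT_CACHE_TTL = 300  # illustrative TTL, in seconds


def resolve_group_redirects(group_ids):
    # Map each previous group id to its redirect target, caching results so
    # hot groups skip the DB entirely on subsequent lookups.
    resolved = {}
    missing = []
    for group_id in group_ids:
        cached = cache.get(f"group-redirect:{group_id}")
        if cached is not None:
            resolved[group_id] = cached
        else:
            missing.append(group_id)
    if missing:
        # A single DB round trip covers only the cache misses.
        rows = dict(
            GroupRedirect.objects.filter(previous_group_id__in=missing).values_list(
                "previous_group_id", "group_id"
            )
        )
        for group_id in missing:
            target = rows.get(group_id, group_id)
            resolved[group_id] = target
            cache.set(f"group-redirect:{group_id}", target, REDIRECT_CACHE_TTL)
    return resolved

For the second change, the idea is to drop the ORDER BY from the common path and, on the code paths that actually need ordering, apply Python's sorted() to the already-fetched rows.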

github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) on Nov 20, 2025
@codecov

codecov bot commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 96.96970% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines                                  Patch %    Lines
.../sentry/services/eventstore/query_preprocessing.py    96.96%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #103756    +/-   ##
=========================================
  Coverage   80.59%    80.60%            
=========================================
  Files        9274      9279     +5     
  Lines      395956    396164   +208     
  Branches    25250     25250            
=========================================
+ Hits       319138    319320   +182     
- Misses      76389     76415    +26     
  Partials      429       429            

Member
@wedamija left a comment

I have some broader concerns about how we're doing this. It looks like every snuba query that uses SnubaQueryParams is now potentially making multiple calls to postgres? The caching will help here, but we've still added thousands of queries a second to the database. Let's see how much the cache brings that down, and then we may need to revisit this solution in general.

I think it's surprising behaviour that just initializing SnubaQueryParams causes postgres queries.

Comment on lines +64 to +66
# Seed running_data with (group_id, expiry) pairs that expire one minute from now.
running_data = {
    (group_id, datetime.now(UTC) + timedelta(minutes=1)) for group_id in group_ids
}
Member

Should this just be a dict of <group_id>: <max_date>? So when we add rows in later, we just do running_data[group_id] = max(running_data[group_id], new_date)?
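
A quick sketch of the dict shape being suggested, with hypothetical group_ids and incoming rows standing in for the real data:

from datetime import UTC, datetime, timedelta

# Hypothetical inputs, purely for illustration.
group_ids = [1, 2, 3]
incoming_rows = [(2, datetime.now(UTC) + timedelta(minutes=5))]

# Keyed by group_id, holding the latest date seen so far.
running_data = {
    group_id: datetime.now(UTC) + timedelta(minutes=1) for group_id in group_ids
}

# Adding rows later is just a max per group; .get() covers ids that weren't seeded.
for group_id, new_date in incoming_rows:
    running_data[group_id] = max(running_data.get(group_id, new_date), new_date)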

Contributor Author

I find the set easier to reason about, and I don't think there'll be much duplication (we only expect one Redirect per group).

@thetruecpaul merged commit a305a6e into master Nov 21, 2025
66 of 67 checks passed
@thetruecpaul deleted the cpaul/112025/redirectcachee branch November 21, 2025 19:39