feat(issue-search): Check max_candidates to push Postgres fields to post-filtering
#74531
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##           master    #74531       +/-   ##
===========================================
+ Coverage   56.96%    78.18%   +21.22%
===========================================
  Files        6459      6691      +232
  Lines      285436    299601    +14165
  Branches    49078     51568     +2490
===========================================
+ Hits       162597    234258    +71661
+ Misses     118412     59019    -59393
- Partials     4427      6324     +1897
311fdec to e0738cc
vartec
left a comment
LGTM, I've added some suggestions to simplify the logic a bit.
src/sentry/search/snuba/executors.py
Outdated
  having = []
  # if we need to prefetch from postgres, we add filter by the group ids
- if group_ids_to_pass_to_snuba is not None:
+ if group_ids_to_pass_to_snuba is not None and not too_many_candidates:
Going back to the comment above: if you reset group_ids_to_pass_to_snuba to None when too_many_candidates is set, there's no need to change anything here.
src/sentry/search/snuba/executors.py
Outdated
if (
    group_ids_to_pass_to_snuba is not None
    and len(group_ids_to_pass_to_snuba) > max_candidates
):
    metrics.incr("snuba.search.too_many_candidates", skip_internal=False)
    too_many_candidates = True
    group_ids_to_pass_to_snuba = []
- if (
-     group_ids_to_pass_to_snuba is not None
-     and len(group_ids_to_pass_to_snuba) > max_candidates
- ):
-     metrics.incr("snuba.search.too_many_candidates", skip_internal=False)
-     too_many_candidates = True
-     group_ids_to_pass_to_snuba = []
+ if (too_many_candidates := (len(group_ids_to_pass_to_snuba) > max_candidates)):
+     metrics.incr("snuba.search.too_many_candidates", skip_internal=False)
+     group_ids_to_pass_to_snuba = None
You can keep this within the same if that starts on line 1663, because that's the only way you'll have any candidates.
Resetting to None when there are too many candidates will simplify some of the checks below.
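For reference, the suggestion could look roughly like the self-contained sketch below. The `resolve_candidates` helper and the `counters` dict (standing in for `metrics.incr`) are hypothetical names used only for illustration; this is not the code in executors.py.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-in for the metrics backend.
counters: Dict[str, int] = {}


def resolve_candidates(
    group_ids: Optional[Sequence[int]], max_candidates: int
) -> Tuple[Optional[Sequence[int]], bool]:
    """Return (ids to pass to Snuba, too_many_candidates)."""
    too_many_candidates = False
    if group_ids is not None:
        # The walrus assignment sets the flag and drives the branch at once.
        if too_many_candidates := (len(group_ids) > max_candidates):
            key = "snuba.search.too_many_candidates"
            counters[key] = counters.get(key, 0) + 1
            # Resetting to None lets later "is not None" checks stay unchanged.
            group_ids = None
    return group_ids, too_many_candidates
```

With the ids reset to None, a later `if group_ids is not None:` guard skips the pre-filter on its own, so it does not need to consult the flag.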
ed20a65 to 684b430
When applying query filters from Postgres, pre-filtering the results before passing the query to Snuba can break if it hits the max size that ClickHouse is able to process (sentry issue). This also manifests as a possible bug in the serializer, though that's more of a red herring since ClickHouse won't be able to handle the query either way.
To work around this, I'm applying the pattern used in the PostgresSnubaQueryExecutor to determine whether to pre- or post-filter the results based on the number of group_ids in the filter. If the number of groups in the pre-filter exceeds max_candidates, we'll apply the filter after the Snuba query in a post-filtering step right before pagination happens.
The tests confirm that the is:linked and is:unlinked queries work as a pre- and post-filter depending on the value configured in snuba.search.max-pre-snuba-candidates.
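As a rough illustration of the pre- versus post-filter decision described above, here is a minimal sketch. `search_groups`, `fake_snuba`, and the sample group ids are hypothetical and only model the control flow; the real executor builds Snuba queries and paginates the results.

```python
from typing import Callable, Iterable, List, Optional, Set

calls: List[Optional[Set[int]]] = []  # records what each fake query received


def fake_snuba(group_ids: Optional[Set[int]]) -> List[int]:
    """Hypothetical stand-in for the Snuba query; returns ordered group ids."""
    calls.append(group_ids)
    all_groups = [5, 3, 8, 1]
    if group_ids is None:
        return all_groups
    return [g for g in all_groups if g in group_ids]


def search_groups(
    candidate_ids: Set[int],
    max_candidates: int,
    run_query: Callable[[Optional[Set[int]]], Iterable[int]],
) -> List[int]:
    if len(candidate_ids) <= max_candidates:
        # Pre-filter: the Postgres candidates are sent along with the query.
        return list(run_query(candidate_ids))
    # Post-filter: query without the id constraint, then keep only the
    # Postgres candidates before the results would be paginated.
    return [gid for gid in run_query(None) if gid in candidate_ids]
```

Both paths are meant to return the same groups in the same order; only the place where the Postgres filter is applied changes.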