perf(issues): improve adjacent_events query #79365

JoshFerge · 2024-10-18T17:50:54Z

Right now the adjacent_events query does a very large scan to find the events adjacent to a given event ID. This happens because the old query does not specify an ORDER BY that can take advantage of the primary key:
ORDER BY (project_id, toStartOfDay(timestamp), primary_hash, cityHash64(event_id))

This PR adds a new function, get_adjacent_event_ids_snql, which uses SnQL and adds an ORDER BY clause. In the Snuba admin tool, the new query takes only 2 seconds and scans hundreds of MBs of data, whereas the old query scans 10+ GB on large issues.
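
To illustrate why an ORDER BY that matches the table's sort order helps, here is a toy Python model. It is not the Snuba or ClickHouse implementation: it assumes rows are kept sorted by (project_id, timestamp, event_id), a simplification of the real key, and `build_rows`, `next_event_full_scan`, and `next_event_ordered` are hypothetical names introduced only for this sketch.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def build_rows(n):
    """Build n rows sorted by (project_id, timestamp, event_id).

    Assumption: this tuple order stands in for a sorted primary key.
    """
    start = datetime(2024, 10, 1)
    rows = [(1, start + timedelta(minutes=i), f"{i:032x}") for i in range(n)]
    rows.sort()
    return rows


def next_event_full_scan(rows, current):
    """Look at every row, then pick the smallest row after `current`."""
    candidates = [
        r for r in rows if r[0] == current[0] and r[1:] > current[1:]
    ]
    rows_examined = len(rows)
    return (min(candidates) if candidates else None), rows_examined


def next_event_ordered(rows, current):
    """Seek to `current` in sort order and read only the row that follows."""
    i = bisect_right(rows, current)
    if i < len(rows) and rows[i][0] == current[0]:
        return rows[i], 1
    return None, 0


if __name__ == "__main__":
    rows = build_rows(1000)
    current = rows[499]
    print(next_event_full_scan(rows, current)[1], "rows examined (full scan)")
    print(next_event_ordered(rows, current)[1], "rows examined (ordered seek)")
```

Both functions return the same next event, but the full scan examines every row while the ordered lookup reads one. A real storage engine reads data in blocks, so the actual savings depend on the table layout and the query.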

get_adjacent_event_ids_snql is not general purpose, as it's only used in one place.

See the ticket below for more info:
Fixes https://github.com/getsentry/team-issues/issues/42

armenzg · 2024-10-18T17:55:11Z

src/sentry/eventstore/snuba/backend.py

+                Condition(
+                    Column(DATASETS[dataset][Columns.TIMESTAMP.value.alias]),
+                    Op.LT,
+                    event.datetime + timedelta(days=100),


Why 100 days?

sentry/src/sentry/eventstore/snuba/backend.py

Line 577 in 55415ea

prev_filter.start = event.datetime - timedelta(days=100)

OK carried forward 👍🏻
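
As a rough sketch of the 100-day window discussed in this thread: the snippet below computes the time bounds around an event using only the standard library. The helper names `adjacent_time_bounds` and `in_window` are hypothetical, and treating the lower bound as inclusive is an assumption; the diff above only shows the upper bound using `Op.LT`.

```python
from datetime import datetime, timedelta, timezone

# 100 days is the value carried forward from the existing code.
ADJACENT_WINDOW = timedelta(days=100)


def adjacent_time_bounds(event_datetime, window=ADJACENT_WINDOW):
    """Return (lower, upper) timestamps bounding the adjacent-event search."""
    return event_datetime - window, event_datetime + window


def in_window(candidate, event_datetime, window=ADJACENT_WINDOW):
    """True if `candidate` falls inside the search window.

    Upper bound is exclusive (mirrors Op.LT in the diff);
    the inclusive lower bound is an assumption of this sketch.
    """
    lower, upper = adjacent_time_bounds(event_datetime, window)
    return lower <= candidate < upper


if __name__ == "__main__":
    event_dt = datetime(2024, 10, 18, 17, 50, tzinfo=timezone.utc)
    lower, upper = adjacent_time_bounds(event_dt)
    print("search window:", lower.isoformat(), "to", upper.isoformat())
```

Bounding the timestamp on both sides keeps the search from considering events far from the current one, at the cost of not finding neighbors more than 100 days away.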




JoshFerge added 2 commits October 18, 2024 13:02

perf(issues): improve adjacent_events query

588d4ed

add event desc condition

55415ea

JoshFerge requested a review from a team as a code owner October 18, 2024 17:50

github-actions bot added the Scope: Backend label Oct 18, 2024

armenzg approved these changes Oct 18, 2024


add rollout option

476232a

JoshFerge enabled auto-merge (squash) October 18, 2024 20:46

vercel bot deployed to Preview October 18, 2024 20:48

Merge branch 'master' into jferg/efficient-adjacent-query

7588651

vercel bot deployed to Preview October 18, 2024 21:29

JoshFerge merged commit d8ff0cb into master Oct 18, 2024
48 of 49 checks passed

JoshFerge deleted the jferg/efficient-adjacent-query branch October 18, 2024 22:01

JoshFerge mentioned this pull request Oct 29, 2024

fix(issues): fix optimized adjacent queries #79904

Merged

JoshFerge added a commit that referenced this pull request Oct 29, 2024

fix(issues): fix optimized adjacent queries (#79904)

2398fde

fixes the adjacent queries introduced in #79365 -- needed to adjust the conditions a bit. adds a new test to ensure correct behavior.

github-actions bot locked and limited conversation to collaborators Nov 3, 2024


3 participants
