@JoshFerge JoshFerge commented Oct 18, 2024

Right now the adjacent_events query does a very large scan to find the events adjacent to a given event ID. This happens because the old query does not specify an ORDER BY that can take advantage of the table's primary key:
ORDER BY (project_id, toStartOfDay(timestamp), primary_hash, cityHash64(event_id))

This PR adds a new function, get_adjacent_event_ids_snql, which uses SnQL and adds an ORDER BY clause. In the Snuba admin tool, the new query takes only 2 seconds and scans hundreds of MBs of data, whereas the old query scans 10+ GB on large issues.

get_adjacent_event_ids_snql is not general purpose, as it's only used in one place.

See the ticket below for more info:
Fixes https://github.com/getsentry/team-issues/issues/42
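
A toy model of why the ORDER BY helps, assuming the storage engine can read rows in sort-key order and stop as soon as the LIMIT is satisfied. The table layout, `make_rows`, and both lookup functions are hypothetical stand-ins written for illustration, not Sentry or Snuba code, and the sort key is simplified to (project_id, timestamp); the real key also includes primary_hash and cityHash64(event_id).

```python
from bisect import bisect_left
from datetime import datetime, timedelta

START = datetime(2024, 10, 1)


def make_rows():
    """Build a toy table kept sorted by (project_id, timestamp), one row per minute."""
    rows = []
    for i in range(10_000):
        ts = START + timedelta(minutes=i)
        group_id = 1 if i % 50 == 0 else 2  # group 1 is the issue being paged through
        rows.append((1, ts, f"event-{i}", group_id))
    return rows  # already sorted because timestamps only increase


def next_event_full_scan(rows, project_id, group_id, after):
    """Without a usable ordering, every row is inspected to find the earliest match."""
    scanned = 0
    best = None
    for pid, ts, event_id, gid in rows:
        scanned += 1
        if pid == project_id and gid == group_id and ts > after:
            if best is None or ts < best[0]:
                best = (ts, event_id)
    return (best[1] if best else None), scanned


def next_event_ordered(rows, project_id, group_id, after):
    """With an ordering that matches storage order: seek, walk forward, stop at the first hit."""
    # A (project_id, after) probe sorts before any full row sharing that prefix,
    # so this lands on the first row whose timestamp is >= `after`.
    i = bisect_left(rows, (project_id, after))
    scanned = 0
    while i < len(rows) and rows[i][0] == project_id:
        _, ts, event_id, gid = rows[i]
        scanned += 1
        if gid == group_id and ts > after:
            return event_id, scanned
        i += 1
    return None, scanned


rows = make_rows()
after = START + timedelta(minutes=100)
print(next_event_full_scan(rows, 1, 1, after))
print(next_event_ordered(rows, 1, 1, after))
```

Both functions return the same event ID; only the number of rows inspected differs, which is the effect the scan-size numbers above describe.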

@JoshFerge JoshFerge requested a review from a team as a code owner October 18, 2024 17:50
@github-actions github-actions bot added the Scope: Backend label Oct 18, 2024
Condition(
Column(DATASETS[dataset][Columns.TIMESTAMP.value.alias]),
Op.LT,
event.datetime + timedelta(days=100),
Member

Why 100 days?

Member Author

prev_filter.start = event.datetime - timedelta(days=100)

Member

OK carried forward 👍🏻
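
The thread above implies symmetric 100-day bounds around the event: the existing code starts the "previous" search 100 days before it, and the new condition ends the "next" search 100 days after it. A minimal sketch of that arithmetic follows; `LOOKUP_WINDOW` and `adjacent_windows` are hypothetical names, and how the inner boundary at the event's own timestamp is treated (inclusive or exclusive) is an assumption not settled by the thread.

```python
from datetime import datetime, timedelta

LOOKUP_WINDOW = timedelta(days=100)  # value quoted in the review thread


def adjacent_windows(event_time):
    """Return (prev_window, next_window) as (start, end) pairs around the event."""
    prev_window = (event_time - LOOKUP_WINDOW, event_time)
    next_window = (event_time, event_time + LOOKUP_WINDOW)
    return prev_window, next_window


prev_window, next_window = adjacent_windows(datetime(2024, 10, 18))
print(prev_window[0].date(), next_window[1].date())
```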

@JoshFerge JoshFerge merged commit d8ff0cb into master Oct 18, 2024
48 of 49 checks passed
@JoshFerge JoshFerge deleted the jferg/efficient-adjacent-query branch October 18, 2024 22:01
harshithadurai pushed a commit that referenced this pull request Oct 19, 2024
jan-auer added a commit that referenced this pull request Oct 21, 2024
* master: (288 commits)
  feat(metrics): Register MRI for spans/count_per_root_project (#78992)
  feat(dynamic-sampling): Settings for sample rate (#79341)
  Revert "feat(sentry-sdk): Enable HTTP2 transport" (#79391)
  fix(feedback): keep oldest date_added for duplicate user reports (#79387)
  chore(issue-stream): Remove tooltip for Unhandled (#79385)
  chore(autofix): Show banner if gen AI consent is given, even if no feature flag (#79362)
  chore(autofix+copilot) Allow autofix without FF if gen AI consent given (#79361)
  Fixes VULN-50 by enforcing option (#79384)
  perf(issues): improve adjacent_events query (#79365)
  feat(issues): Add anchor links back to issue sections (#79333)
  fix(issue-views): Make tab bar take up entire row (#79383)
  chore(issues): Add additional metrics for ownership matching (#79302)
  feat(insights): create screen rendering module (#79192)
  fix(issues): Avoid streamline issue layout rerenders (#79327)
  ref(performance): Add missing types to performance widgets (#79301)
  chore(issue-views): Add translation wrapper to aria label (#79320)
  chore(issue-stream): Reduce font size of title and message (#79378)
  feat(insights): update headers and breadcrumbs on frontend domain view (#78945)
  feat(insights): add view trends button to ai overview (#78611)
  ref(rr6): Remove unused param (#79379)
  ...
cmanallen pushed a commit that referenced this pull request Oct 23, 2024
JoshFerge added a commit that referenced this pull request Oct 29, 2024
Fixes the adjacent-event queries introduced in #79365; the conditions needed a small adjustment. Adds a new test to ensure correct behavior.
@github-actions github-actions bot locked and limited conversation to collaborators Nov 3, 2024