perf(crons): Adjust specific environment monitor stats query #114277
Monitor stats already fetch monitor environment rows before querying check-ins. Use that mapping to leave monitor_id out of the aggregate when possible, so Postgres can stay on the monitor environment/date/status index. Keep the legacy fallback shape for rows without monitor environments. Co-Authored-By: OpenAI Codex <noreply@openai.com>
mypy inferred the first monitor stats group_by assignment as a fixed-width tuple. Make it a variable-length string tuple so both aggregate shapes type check. Co-Authored-By: OpenAI Codex <noreply@openai.com>
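A minimal sketch of the typing issue described in this commit, not the actual diff. The function name `build_group_by` and the exact field lists are assumptions for illustration. Without the explicit annotation, mypy infers a fixed-width three-element tuple from the first assignment and rejects the longer one.

```python
def build_group_by(has_monitor_environments: bool) -> tuple[str, ...]:
    # Annotating as a variable-length tuple lets both shapes type check.
    # Without it, mypy infers tuple[str, str, str] from this first assignment.
    group_by: tuple[str, ...] = ("bucket", "monitor_environment_id", "status")
    if not has_monitor_environments:
        # Hypothetical legacy shape: the fallback keeps monitor_id in the aggregate.
        group_by = ("bucket", "monitor_id", "monitor_environment_id", "status")
    return group_by


env_backed = build_group_by(True)
legacy = build_group_by(False)
```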
Keep the note next to the branch that switches the aggregate shape. That makes it clearer why the environment-backed path can omit monitor_id.
checking another one via EXPLAIN in redash
Did you run these for the same organization? If you ran the new query after the old, Postgres might have cached it. To be sure, it'd be helpful to find another candidate and run new and then old to see if it still improves
Is this the name of the actual index in your explains? I don't see it in the production table https://redash.getsentry.net/queries/11098/source. It'd be helpful to know if the actual index usage changed here
@wedamija I tried running the new one first and the old one first on some different queries. I've updated the comment with the actual index used: sentry_moni_monitor_1fb26c_idx
wedamija
left a comment
This seems like it should help based on the queries we tested on slack. I'd say try it out, and revert if you don't see the slow queries improve
Monitor stats has two query shapes. This optimizes the environment-backed shape, for example requests that include `?environment=...` or otherwise resolve to `MonitorEnvironment` rows before querying check-ins.

In that shape, the endpoint already has a mapping from `monitor_environment_id` to `monitor_id`. This leaves `monitor_id` out of the aggregate so Postgres can answer from the existing `(monitor_environment_id, date_added, status)` index, then recovers `monitor_id` from the `MonitorEnvironment` rows it already loaded.

The no-monitor-environment fallback keeps the old aggregate shape because it has no environment-to-monitor mapping.

The query plan change from an environment-backed stats request:

| Shape | Plan | Runtime |
|---|---|---|
| aggregate included `monitor_id` | index scan on the same composite index, with heap reads for `monitor_id` | about 5.5s |
| aggregate uses `bucket`, `monitor_environment_id`, `status` | index-only scan on the same composite index, with a small number of heap fetches | about 12.5ms |

attempt to fix SENTRY-3VHC

---------

Co-authored-by: OpenAI Codex <noreply@openai.com>
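The recovery step described above can be sketched in plain Python, assuming the aggregate returns `(bucket, monitor_environment_id, status, count)` rows. The function name `attach_monitor_ids`, the row layout, and the sample values are hypothetical and are not taken from the Sentry codebase; the real endpoint works on Django querysets.

```python
from collections import defaultdict


def attach_monitor_ids(rows, env_to_monitor):
    """Re-key aggregate rows by monitor_id using an already-loaded mapping.

    rows: iterable of (bucket, monitor_environment_id, status, count),
          i.e. the aggregate shape that omits monitor_id.
    env_to_monitor: dict of monitor_environment_id -> monitor_id, built
          from the MonitorEnvironment rows fetched before the aggregate.
    """
    totals = defaultdict(int)
    for bucket, env_id, status, count in rows:
        # monitor_id comes from the in-memory mapping, not from the query,
        # so the database never has to read it from the check-in table.
        monitor_id = env_to_monitor[env_id]
        totals[(bucket, monitor_id, status)] += count
    return dict(totals)


# Hypothetical data: environments 10 and 11 belong to monitor 1.
env_to_monitor = {10: 1, 11: 1, 20: 2}
rows = [
    (1700000000, 10, "ok", 3),
    (1700000000, 11, "ok", 2),
    (1700000000, 20, "error", 1),
]
result = attach_monitor_ids(rows, env_to_monitor)
```

Because every column the aggregate touches is in the composite index, the plan can become an index-only scan; the per-monitor grouping then happens in application code, where the mapping is already available.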