ref(seer): Refactor night shift into modules and use search backend#112635
| @pytest.fixture(autouse=True) | ||
| def initialize(self, reset_snuba, call_snuba): | ||
| def initialize(self, request, call_snuba): | ||
| if self.reset_snuba_data: |
This was driving me insane. Running tests should not reset your entire devbox's data.
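For readers unfamiliar with the pattern, here is a minimal sketch of how a class-level flag can gate a destructive reset step. It is plain Python rather than a real pytest fixture: `get_fixture` stands in for something like pytest's `request.getfixturevalue`, and `SnubaTestCaseSketch`, `ReadOnlyTest`, and `run_setup` are hypothetical names used for illustration, not the code in this PR.

```python
from typing import Callable, List


class SnubaTestCaseSketch:
    # Subclasses set this to False to keep existing local data between runs.
    reset_snuba_data = True

    def initialize(self, get_fixture: Callable[[str], object]) -> None:
        # Only resolve the destructive fixture when the class asks for it.
        if self.reset_snuba_data:
            get_fixture("reset_snuba")


class ReadOnlyTest(SnubaTestCaseSketch):
    # Opt out: setup never touches the reset fixture for this class.
    reset_snuba_data = False


def run_setup(case: SnubaTestCaseSketch) -> List[str]:
    """Return the names of the fixtures that setup resolved for `case`."""
    calls: List[str] = []
    case.initialize(lambda name: calls.append(name))
    return calls
```

Keeping the default at `True` would preserve the existing behavior for every test class that does not opt out.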
| search_filters=[ | ||
| SearchFilter(SearchKey("status"), "=", SearchValue([GroupStatus.UNRESOLVED])), | ||
| SearchFilter(SearchKey("issue.seer_last_run"), "=", SearchValue("")), | ||
| ], | ||
| referrer=Referrer.SEER_NIGHT_SHIFT_FIXABILITY_SCORE_STRATEGY.value, |
Bug: The new search filter for issue.seer_last_run only checks one of two timestamp fields, allowing previously processed issues to be re-triaged, unlike the old logic which checked both.
Severity: HIGH
Suggested Fix
Modify the search query to ensure it checks both seer_autofix_last_triggered and seer_explorer_autofix_last_triggered for null values, restoring the original behavior. This will prevent issues processed by either Seer pathway from being re-selected for triage.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/tasks/seer/night_shift/simple_triage.py#L59-L63
Potential issue: The refactored `fixability_score_strategy` function changes the logic
for filtering previously processed Seer issues. The new `SearchFilter` for
`issue.seer_last_run` resolves to checking only one of two timestamp fields
(`seer_autofix_last_triggered` or `seer_explorer_autofix_last_triggered`) based on a
feature flag. The previous implementation correctly checked for null values in both
fields. This regression means that an issue processed by one Seer pathway can be
incorrectly re-selected for triage by the other, leading to duplicated work and wasted
compute resources.
This is fine, we can re-run an explorer client run even if they had a legacy run.
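To make the disagreement concrete, the sketch below contrasts the two eligibility checks as the review comments describe them. The `Issue` dataclass and both function names are hypothetical, and which timestamp the flag selects is an assumption; this is not the search backend's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Issue:
    seer_autofix_last_triggered: Optional[datetime] = None
    seer_explorer_autofix_last_triggered: Optional[datetime] = None


def eligible_old(issue: Issue) -> bool:
    # Old ORM behavior: both timestamps must be unset.
    return (
        issue.seer_autofix_last_triggered is None
        and issue.seer_explorer_autofix_last_triggered is None
    )


def eligible_new(issue: Issue, autofix_on_explorer: bool) -> bool:
    # New behavior as described: only the field picked by the flag is checked.
    # Assumption: the flag being on selects the explorer timestamp.
    if autofix_on_explorer:
        return issue.seer_explorer_autofix_last_triggered is None
    return issue.seer_autofix_last_triggered is None


# An issue that only ever had a legacy run (the date is arbitrary).
legacy_only = Issue(seer_autofix_last_triggered=datetime(2024, 1, 1))
```

With the flag on, `legacy_only` passes `eligible_new` but not `eligible_old`, which is the re-run the author considers acceptable.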
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 7d9b2a5.
| limit=NIGHT_SHIFT_ISSUE_FETCH_LIMIT, | ||
| search_filters=[ | ||
| SearchFilter(SearchKey("status"), "=", SearchValue([GroupStatus.UNRESOLVED])), | ||
| SearchFilter(SearchKey("issue.seer_last_run"), "=", SearchValue("")), |
Seer last run filter checks only one field
Medium Severity
The old ORM query filtered on both seer_autofix_last_triggered__isnull=True AND seer_explorer_autofix_last_triggered__isnull=True, ensuring groups triggered by either mechanism were excluded. The new issue.seer_last_run search filter maps to a ScalarCondition that only checks one of those fields based on the organizations:autofix-on-explorer feature flag. This means groups previously processed via the other code path could be re-selected as candidates, leading to duplicate processing.
Same as above, this is fine and we can re-run an explorer client run even if they had a legacy run.
| referrer="night_shift.triage", | ||
| prompt=_build_triage_prompt(candidates), | ||
| system_prompt="", | ||
| temperature=0.0, |
temperature=0.0 on purpose here?


Refactors night shift from a single file into a package with separate modules for cron scheduling, simple triage, agentic triage, and shared models.
Switches `fixability_score_strategy` from a direct ORM query to using `search.backend.query()` with the `recommended` sort. This gives us the same ranking algorithm used in the issues list UI (recency, spike detection, severity, user impact, event volume) as a pre-filter, then re-ranks by fixability score in-memory.

Other changes:
- Adds a `reset_snuba_data` flag to `SnubaTestCase` so tests can opt out of dropping ClickHouse data between runs
- Adds a `bin/seer/trigger-night-shift` script for local testing
- Uses `model_validate_json` for LLM response parsing in agentic triage
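
The pre-filter-then-re-rank selection described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `Candidate`, `rerank_by_fixability`, the `fixability_score` field, and treating a missing score as lowest priority are all hypothetical, and the input list stands in for results already ordered by the `recommended` sort.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    group_id: int
    fixability_score: Optional[float]


def rerank_by_fixability(prefiltered: List[Candidate], limit: int) -> List[Candidate]:
    # sorted() is stable, so candidates with equal scores keep their
    # relative order from the pre-filter ranking.
    ranked = sorted(
        prefiltered,
        key=lambda c: c.fixability_score if c.fixability_score is not None else float("-inf"),
        reverse=True,
    )
    return ranked[:limit]
```

Because the sort is stable, the pre-filter order acts as a tiebreaker whenever two candidates share a fixability score.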