Skip to content

Conversation

mifu67
Copy link
Contributor

@mifu67 mifu67 commented Oct 6, 2025

We're seeing some errors with the detect anomalies request when a dynamic alert has been edited. Send these so that the data is keyed according to the IDs that are passed.

The problem: alerts created since July will not fire because their stored data is keyed using alert rule ID, which is not sent anywhere in the processing request. Alerts updated since then will have their source ID overwritten, so the same problem applies.

The solution: pass source ID in the store data request.

@mifu67 mifu67 requested a review from ceorourke October 6, 2025 22:49
@mifu67 mifu67 requested a review from a team as a code owner October 6, 2025 22:49
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Oct 6, 2025
cursor[bot]

This comment was marked as outdated.

Copy link
Member

@JoshFerge JoshFerge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we update/add a test as well?

Copy link

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/seer/anomaly_detection/store_data.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #101045      +/-   ##
===========================================
+ Coverage   76.52%    81.12%   +4.60%     
===========================================
  Files        8653      8661       +8     
  Lines      384048    384395     +347     
  Branches    24249     24249              
===========================================
+ Hits       293880    311835   +17955     
+ Misses      89824     72216   -17608     
  Partials      344       344              

@ram-senth
Copy link
Member

ram-senth commented Oct 6, 2025

So this fix is in the store_data call which is called when user is editing or creating an alert from the UI. However, the vast majority of the errors (about 700 at the top of every hour) are from the stream detection call. Are we assuming that we failed to save data for about 700 alerts and hence the stream detection is failing?

@mifu67
Copy link
Contributor Author

mifu67 commented Oct 6, 2025

So this fix is in the store_data call which is called when user is editing or creating an alert from the UI. However, the vast majority of the errors (about 700 at the top of every hour) are from the stream detection call. Are we assuming that we failed to save data for about 700 alerts and hence the stream detection is failing?

My assumption is that the alert was edited and the stored data call only had the alert rule ID and not the source ID, which would make all subsequent calls to anomaly detection fail because there is no historical data stored for this source ID. Let me know if this is not correct.

@ram-senth
Copy link
Member

My assumption is that the alert was edited and the stored data call only had the alert rule ID and not the source ID, which would make all subsequent calls to anomaly detection fail because there is no historical data stored for this source ID. Let me know if this is not correct.

Yes that is very possible. How do we fix the data for those 700 plus alerts?

@mifu67
Copy link
Contributor Author

mifu67 commented Oct 6, 2025

My assumption is that the alert was edited and the stored data call only had the alert rule ID and not the source ID, which would make all subsequent calls to anomaly detection fail because there is no historical data stored for this source ID. Let me know if this is not correct.

Yes that is very possible. How do we fix the data for those 700 plus alerts?

Could we do a new backfill? The alert history should be stored with the alert rule ID. We should fix it going forward first before doing the backfill, however.

@ram-senth
Copy link
Member

Could we do a new backfill? The alert history should be stored with the alert rule ID. We should fix it going forward first before doing the backfill, however.

Sure, if you can pull up a list like last time we can add a backfill script.

)
alert = AlertInSeer(id=alert_rule.id)
alert = AlertInSeer(
id=alert_rule.id,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you no longer need to pass the rule id if passing source_id and source_type - https://github.com/getsentry/seer/blob/main/src/seer/anomaly_detection/models/external.py#L48-L62

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending all three will automatically update the ids in seer for those alerts that are missing source_id and source_it. I would recommend keeping it until we fully move away from alert_rule.id.

@mifu67 mifu67 merged commit 9babdbd into master Oct 7, 2025
66 checks passed
@mifu67 mifu67 deleted the mifu67/anomaly-detection/data-storage-fix branch October 7, 2025 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants