Fix alert transition #19507

Merged
ktsaou merged 4 commits into netdata:master from stelfrag:fix_alert_trans_delay
Jan 28, 2025
Conversation

@stelfrag
Collaborator

@stelfrag stelfrag commented Jan 28, 2025

Summary
  • Fix the alert transition delay from warning or critical to removed: it is now 10 seconds (it was 10 minutes)
    • A removed state can be caused by a node reconnecting; the short delay may be enough to avoid transmitting two alert state changes to the cloud
  • Use the actual alert trigger time (rather than the current time) to compute the scheduled time for sending to the cloud
    • Rare as it may be, basing the schedule on the current time can add extra delay when processing takes longer than expected between the trigger time and the time the alert is added to the queue
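
The scheduling rule described above can be sketched roughly as follows. This is an illustrative Python model, not the Netdata C code: the names `Transition`, `transition_delay`, `date_scheduled`, and `date_scheduled_from_now` are hypothetical, and the zero delay for all other transitions is an assumption made only to keep the sketch small. The only values taken from this PR are the 10-second delay and the use of the trigger time as the base.

```python
from dataclasses import dataclass

# From the PR summary: raised -> REMOVED now waits 10 seconds (was 10 minutes).
REMOVED_DELAY_SECS = 10

# Statuses treated as "raised" in this sketch.
RAISED = {"WARNING", "CRITICAL"}


@dataclass
class Transition:
    """Hypothetical stand-in for an alert transition record."""
    old_status: str
    new_status: str
    when: int  # trigger time of the alert, in Unix seconds


def transition_delay(t: Transition) -> int:
    """Delay before sending the transition to the cloud.

    Assumption: every transition other than raised -> REMOVED gets no
    delay here; the PR does not state the other values.
    """
    if t.old_status in RAISED and t.new_status == "REMOVED":
        return REMOVED_DELAY_SECS
    return 0


def date_scheduled(t: Transition) -> int:
    """Schedule relative to the trigger time, as the PR describes."""
    return t.when + transition_delay(t)


def date_scheduled_from_now(t: Transition, now: int) -> int:
    """Previous approach: schedule relative to the enqueue time."""
    return now + transition_delay(t)


# An alert triggers at t=1000 but is only enqueued at t=1003.
removed = Transition("CRITICAL", "REMOVED", when=1000)
print(date_scheduled(removed))                 # 1010
print(date_scheduled_from_now(removed, 1003))  # 1013
```

In this model the 3 seconds spent between trigger and enqueue are not added on top of the delay when the trigger time is used as the base.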

@stelfrag stelfrag marked this pull request as ready for review January 28, 2025 12:02
Contributor

@thiagoftsm thiagoftsm left a comment

PR is working as expected, with alerts reaching cloud normally. LGTM!

@ktsaou ktsaou merged commit 3d49d32 into netdata:master Jan 28, 2025
101 checks passed
@stelfrag stelfrag deleted the fix_alert_trans_delay branch January 28, 2025 17:59
stelfrag added a commit to stelfrag/netdata that referenced this pull request Jan 30, 2025
* Fix alert transition from WARNING or CRITICAL to REMOVED to have a short delay (10 seconds) instead of 10 minutes

* When adding to the alert queue, date_scheduled will be based on the trigger time of the alert

* When adding newly generated transitions, use the AE "when" time as the trigger time

* When mass populating removed states, use the current time (as it is now)

(cherry picked from commit 3d49d32)
@stelfrag stelfrag mentioned this pull request Jan 30, 2025
Ferroin pushed a commit that referenced this pull request Jan 30, 2025

3 participants