Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make a copy of firing alerts with EndsAt=0 when flushing. #1686

Merged
merged 1 commit into from
Jan 4, 2019

Conversation

brian-brazil
Copy link
Contributor

If the original EndsAt is left in place, then as time moves forwards
past the EndsAt then firing alerts will be rendered and treated as
resolved alerts which can cause confusion and races. This is most
likely to happen on retries for a notification.

Mitigate race and fix data races in TestAggrGroup.

Signed-off-by: Brian Brazil brian.brazil@robustperception.io

Copy link
Contributor

@stuartnelson3 stuartnelson3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just investigating the current go module build failures

If the original EndsAt is left in place, then as time moves forwards
past the EndsAt then firing alerts will be rendered and treated as
resolved alerts which can cause confusion and races. This is most
likely to happen on retries for a notification.

Mitigate race and fix data races in TestAggrGroup.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
@stuartnelson3 stuartnelson3 merged commit 7078333 into master Jan 4, 2019
@stuartnelson3 stuartnelson3 deleted the resolve-race branch January 4, 2019 15:52
mxinden pushed a commit to mxinden/alertmanager that referenced this pull request Jan 14, 2019
…#1686)

If the original EndsAt is left in place, then as time moves forwards
past the EndsAt then firing alerts will be rendered and treated as
resolved alerts which can cause confusion and races. This is most
likely to happen on retries for a notification.

Mitigate race and fix data races in TestAggrGroup.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
grobinson-grafana added a commit to grobinson-grafana/alertmanager that referenced this pull request Apr 12, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

This commit takes a different approach where instead of removing
the EndsAt timestamp from the alert, it uses the ResolvedAt and
StatusAt methods with the timestamp in the context, not the current
time.

Signed-off-by: George Robinson <george.robinson@grafana.com>
grobinson-grafana added a commit to grobinson-grafana/alertmanager that referenced this pull request Apr 17, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

This commit takes a different approach where instead of removing
the EndsAt timestamp from the alert, it uses the ResolvedAt and
StatusAt methods with the timestamp in the context, not the current
time.

Signed-off-by: George Robinson <george.robinson@grafana.com>
zecke added a commit to zecke/alertmanager that referenced this pull request May 1, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.
zecke added a commit to zecke/alertmanager that referenced this pull request May 1, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request May 5, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request May 5, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request May 8, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request May 17, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request May 17, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
zecke added a commit to zecke/alertmanager that referenced this pull request Jun 17, 2024
This commit fixes a bug where firing alerts had zeroed EndsAt
timestamps in notifications. This bug was introduced in another
PR (prometheus#1686) to fix issues where alerts are resolved during a flush.

(Commit message from George Robinson)

Update the acceptance tests as the webhook now has a non-zero EndsAt
timestamp for firing alerts. The models.GettableAlert doesn't havea
`status` field which means the current test is unable to differentiate
between firing and resolved alerts.

Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants