New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release-4.5] Bug 1896168: Back port HAProxy reload failures alert fix #487
[release-4.5] Bug 1896168: Back port HAProxy reload failures alert fix #487
Conversation
openshift/router#209 reworks the HAProxy reload fails metric so that the HAProxyReloadFail alert can be improved. The new template_router_reload_failure metric in router openshift#209 that replaces the template_router_reload_fails metric is a simple boolean gauge metric, which allows the HAProxyReloadFail alert to fire for the duration of the HAProxy reload outage. Previously, the HAProxyReloadFail alert would fire for ~5 minutes regardless of whether or not reloads were still continuing to fail on the router. Also drops the HAProxyReloadFail alert to warning severity.
@sgreene570: This pull request references Bugzilla bug 1896168, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Miciah, sgreene570 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
3 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
Yes! |
@sgreene570: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
/remove-lifecycle stale |
/retest |
/bugzilla refresh |
@sgreene570: This pull request references Bugzilla bug 1896168, which is valid. The bug has been moved to the POST state. 6 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@sgreene570: Some pull requests linked via external trackers have merged: The following pull requests linked via external trackers have not merged:
These pull request must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with Bugzilla bug 1896168 has not been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This is the 4.5 backport of #481
openshift/router#209 reworks the HAProxy reload fails metric so that the HAProxyReloadFail alert can
be improved. The new template_router_reload_failure metric in router #209 that replaces the template_router_reload_fails metric is a simple boolean gauge metric, which allows the HAProxyReloadFail alert to fire for the duration of the HAProxy reload outage. Previously, the HAProxyReloadFail alert would fire for ~5 minutes regardless of whether or not reloads were still continuing to fail on the router.