Extend CVO alerts to cover update retrieval #357
a80d7eb to ca9a94a
/test e2e-aws
5010537 to 2d99330
/test e2e-aws
1487e3b to a9f575a
install/0000_90_cluster-version-operator_02_servicemonitor.yaml
a9f575a to 31b703e
1c9b7dd to 6aaa85a
/retest
/test e2e-aws
CI and nightly releases are not part of the official Red Hat Cincinnati graphs. This commit removes the channel property [1], which will result in the NoChannel condition [2], but will keep the CVO from attempting to find its current version in the official Cincinnati [3]. That in turn should keep the CVO from throwing a new, critical alert [4], which will keep it from running afoul of recent update e2e logic that forbids critical alerts [5].

[1]: https://github.com/openshift/installer/blob/4eca2efd615f8abd65f576721e2410b19f0d40d0/data/data/manifests/bootkube/cvo-overrides.yaml.template#L8
[2]: https://github.com/openshift/cluster-version-operator/blob/fa452c2d270f1f989f3868ef97ae8cf825713583/docs/user/status.md#nochannel
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1827378#c4
[4]: openshift/cluster-version-operator#357
[5]: openshift/origin#24786
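
The channel removal described in the commit message above can be illustrated with a minimal Python sketch. It is not the installer's or the CI step's actual code: the helper name drop_channel and the sample manifest values (the channel name and the cluster ID) are assumptions made for illustration. The sketch only shows the idea of dropping spec.channel from a ClusterVersion manifest while leaving the other fields alone.

```python
import copy


def drop_channel(cluster_version: dict) -> dict:
    """Return a copy of a ClusterVersion manifest without spec.channel.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    result = copy.deepcopy(cluster_version)
    # pop() with a default is a no-op when the key is already absent.
    result.get("spec", {}).pop("channel", None)
    return result


# Illustrative manifest; the channel and clusterID values are made up.
manifest = {
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterVersion",
    "metadata": {"name": "version"},
    "spec": {
        "channel": "stable-4.4",
        "clusterID": "00000000-0000-0000-0000-000000000000",
    },
}

without_channel = drop_channel(manifest)
```

With no channel set, the CVO is expected to report the NoChannel condition referenced in [2] rather than query the update service for the cluster's version.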
/test e2e-aws
install/0000_90_cluster-version-operator_02_servicemonitor.yaml
b81e6f0 to 3cc123d
/test images
/retest
Added the metric cluster_version_operator_update_retrieval_timestamp_seconds to track the time of the last known successful update retrieval, and used the metric cluster_operator_conditions to determine the reason for an update retrieval failure. Added the alert CannotRetrieveUpdates, which fires when the last successful retrieval recorded by cluster_version_operator_update_retrieval_timestamp_seconds is at least 3600 seconds old, unless the reason reported by cluster_operator_conditions is NoChannel. The alert reports the last known successful update retrieval time, the reason updates could not be retrieved, and a console URL for more information.
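
The alert condition in the commit message above can be restated as a minimal Python sketch. This is not the alert rule itself, which is a Prometheus expression in the ServiceMonitor manifest and is not reproduced here. The function name, the constant name, and the reading of the threshold as "seconds since the last successful retrieval" are assumptions for illustration.

```python
from typing import Optional

# Threshold named in the commit message, read here as an age in seconds.
STALE_AFTER_SECONDS = 3600


def cannot_retrieve_updates(
    now: float, last_retrieval: float, reason: Optional[str]
) -> bool:
    """Return True when the described alert condition would hold.

    now and last_retrieval are Unix timestamps in seconds; reason is the
    failure reason as reported via cluster_operator_conditions.
    Hypothetical helper for illustration only.
    """
    # Clusters without a channel are deliberately exempt from the alert.
    if reason == "NoChannel":
        return False
    return now - last_retrieval >= STALE_AFTER_SECONDS
```

For example, a cluster whose last successful retrieval was two hours ago and whose reason is anything other than NoChannel would satisfy the condition, while a CI cluster with the channel removed would not.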
3cc123d to 1005398
/test e2e-aws-upgrade
🎉 /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jottofar, wking
/test e2e-aws-upgrade
/retest Please review the full test history for this PR and help us cut down flakes.
/test e2e-aws-upgrade
/retest Please review the full test history for this PR and help us cut down flakes.
CI and nightly releases are not part of the official Red Hat Cincinnati graphs. This commit removes the channel property [1], which will result in the NoChannel condition [2], but will keep the CVO from attempting to find its current version in the official Cincinnati [3]. That in turn should keep the CVO from throwing a new, critical alert [4], which will keep it from running afoul of recent update e2e logic that forbids critical alerts [5]. See the related PR [6], which does this for other platforms' IPI install step.

[1]: https://github.com/openshift/installer/blob/4eca2efd615f8abd65f576721e2410b19f0d40d0/data/data/manifests/bootkube/cvo-overrides.yaml.template#L8
[2]: https://github.com/openshift/cluster-version-operator/blob/fa452c2d270f1f989f3868ef97ae8cf825713583/docs/user/status.md#nochannel
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1827378#c4
[4]: openshift/cluster-version-operator#357
[5]: openshift/origin#24786
[6]: openshift/release#8631

Co-authored-by: W. Trevor King <wking@tremily.us>