Extend CVO alerts to cover update retrieval #357

Merged

Conversation

jottofar
Contributor

No description provided.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 22, 2020
@jottofar
Contributor Author

/test e2e-aws

@jottofar jottofar force-pushed the ota-151-extend-alerts branch 2 times, most recently from 5010537 to 2d99330 Compare April 22, 2020 19:37
@jottofar
Contributor Author

/test e2e-aws

@jottofar jottofar force-pushed the ota-151-extend-alerts branch 2 times, most recently from 1487e3b to a9f575a Compare April 23, 2020 02:08
pkg/cvo/metrics.go Outdated
@jottofar jottofar force-pushed the ota-151-extend-alerts branch 2 times, most recently from 1c9b7dd to 6aaa85a Compare April 23, 2020 20:28
@jottofar jottofar changed the title WIP: Extend CVO alerts to cover update retrieval Extend CVO alerts to cover update retrieval Apr 23, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 23, 2020
@jottofar
Contributor Author

/retest

1 similar comment
@jottofar
Contributor Author

/retest

@jottofar
Contributor Author

/test e2e-aws

wking added a commit to wking/openshift-release that referenced this pull request Apr 30, 2020
CI and nightly releases are not part of the official Red Hat
Cincinnati graphs.  This commit removes the channel property [1],
which will result in the NoChannel condition [2], but will keep the
CVO from attempting to find its current version in the official
Cincinnati [3].  That in turn should keep the CVO from throwing a new,
critical alert [4], which will keep it from running afoul of recent
update e2e logic that forbids critical alerts [5].

[1]: https://github.com/openshift/installer/blob/4eca2efd615f8abd65f576721e2410b19f0d40d0/data/data/manifests/bootkube/cvo-overrides.yaml.template#L8
[2]: https://github.com/openshift/cluster-version-operator/blob/fa452c2d270f1f989f3868ef97ae8cf825713583/docs/user/status.md#nochannel
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1827378#c4
[4]: openshift/cluster-version-operator#357
[5]: openshift/origin#24786
@jottofar
Contributor Author

jottofar commented May 3, 2020

/test e2e-aws

pkg/cvo/metrics.go Outdated
@jottofar jottofar force-pushed the ota-151-extend-alerts branch 8 times, most recently from b81e6f0 to 3cc123d Compare June 15, 2020 20:32
@jottofar
Contributor Author

/test images

@jottofar
Contributor Author

/retest

Added metric cluster_version_operator_update_retrieval_timestamp_seconds
to track the last known successful update retrieval time, and used metric
cluster_operator_conditions to determine the update retrieval failure reason.

Added alert CannotRetrieveUpdates that fires when the time elapsed since
cluster_version_operator_update_retrieval_timestamp_seconds is >= 3600
seconds, unless the reason reported by cluster_operator_conditions is
NoChannel. The alert reports the last known successful update retrieval
time, the reason updates could not be retrieved, and a console URL for
more information.
@jottofar
Contributor Author

/test e2e-aws-upgrade

1 similar comment
@jottofar
Contributor Author

/test e2e-aws-upgrade

@wking
Member

wking commented Jun 17, 2020

🎉

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jottofar, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jottofar
Contributor Author

/test e2e-aws-upgrade

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@jottofar
Contributor Author

/test e2e-aws-upgrade

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 2ddb4a2 into openshift:master Jun 18, 2020
stbenjam added a commit to stbenjam/dev-scripts that referenced this pull request Jun 3, 2021
stbenjam added a commit to stbenjam/dev-scripts that referenced this pull request Jun 3, 2021
CI and nightly releases are not part of the official Red Hat
Cincinnati graphs.  This commit removes the channel property [1],
which will result in the NoChannel condition [2], but will keep the
CVO from attempting to find its current version in the official
Cincinnati [3].  That in turn should keep the CVO from throwing a new,
critical alert [4], which will keep it from running afoul of recent
update e2e logic that forbids critical alerts [5].

See this related PR [6], which does this for other platforms' IPI
install step.

[1]: https://github.com/openshift/installer/blob/4eca2efd615f8abd65f576721e2410b19f0d40d0/data/data/manifests/bootkube/cvo-overrides.yaml.template#L8
[2]: https://github.com/openshift/cluster-version-operator/blob/fa452c2d270f1f989f3868ef97ae8cf825713583/docs/user/status.md#nochannel
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1827378#c4
[4]: openshift/cluster-version-operator#357
[5]: openshift/origin#24786
[6]: openshift/release#8631
stbenjam added a commit to stbenjam/dev-scripts that referenced this pull request Jun 3, 2021

Co-authored-by: W. Trevor King <wking@tremily.us>
openshift-merge-robot pushed a commit to openshift-metal3/dev-scripts that referenced this pull request Sep 23, 2021
Co-authored-by: W. Trevor King <wking@tremily.us>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

8 participants