OCPBUGS-9133: pkg/cvo/metrics: Connect ClusterVersion to ClusterOperatorDown and ClusterOperatorDegraded #746
Conversation
Force-pushed from 30aa6e6 to 43f13d7
@wking: This pull request references Bugzilla bug 2058416, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
Requesting review from QA contact.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. /lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen. /close
@openshift-bot: Closed this PR.
@wking: This pull request references Bugzilla bug 2058416. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.
@wking: This pull request references Bugzilla bug 2058416, which is invalid:
Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen. /close
@openshift-bot: Closed this PR.
@wking: An error was encountered removing this pull request from the external tracker bugs for bug 2058416 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.
Full error message: response code 400 not 200
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.
/retitle OCPBUGS-9133: pkg/cvo/metrics: Connect ClusterVersion to ClusterOperatorDown and ClusterOperatorDegraded
/retest-required
Testing with Cluster Bot:
$ oc adm cordon -l node-role.kubernetes.io/control-plane=
node/ip-10-0-120-3.us-west-1.compute.internal cordoned
node/ip-10-0-31-66.us-west-1.compute.internal cordoned
node/ip-10-0-71-7.us-west-1.compute.internal cordoned
$ oc -n openshift-authentication delete "$(oc -n openshift-authentication get -o name pods | head -n1)"
pod "oauth-openshift-6cbcb6579f-2fg22" deleted Do the Machine-approver too, for good measure, as I pick on things that should spook cluster-operators without actually hurting cluster performance (I'm not trying to scale up new Machines/Nodes): $ oc -n openshift-cluster-machine-approver delete "$(oc -n openshift-cluster-machine-approver get -o name pods | head -n1)"
pod "machine-approver-74b7866855-z2flp" deleted With operand pods removed, and the cordon blocking its replacement from scheduling, the operators should be grumbling, but even after 15m, they're still all happy: $ oc get -o json clusteroperator | jq -c '.items[].status.conditions[] | select(.type == "Available" or .type == "Degraded") | {type, status}' | sort | uniq -c
33 {"type":"Available","status":"True"}
33 {"type":"Degraded","status":"False"} Maybe I should go after the registry: $ oc adm cordon -l node-role.kubernetes.io/worker=
node/ip-10-0-108-53.us-west-1.compute.internal cordoned
node/ip-10-0-33-86.us-west-1.compute.internal cordoned
node/ip-10-0-87-58.us-west-1.compute.internal cordoned
$ oc delete namespace openshift-image-registry
namespace "openshift-image-registry" deleted Hey, now an operator is mad: $ oc get -o json clusteroperator | jq -c '.items[] | .metadata.name as $n | .status.conditions[] | select((.type == "Available" and .status == "False") or (.type == "Degraded" and .status == "True")) | .name = $n' | sort
{"lastTransitionTime":"2024-04-10T06:48:00Z","message":"1 of 6 credentials requests are failing to sync.","reason":"CredentialsFailing","status":"True","type":"Degraded","name":"cloud-credential"} But that's not one I'd been trying to poke, and then it got happy again. Probably just trying to create the registry's CredentialsRequest Secret, and struggling until the CVO had recreated that namespace. Ah, eventually $ oc get -o json clusteroperator | jq -c '.items[] | .metadata.name as $n | .status.conditions[] | select((.type == "Available" and .status == "False") or (.type == "Degraded" and .status == "True")) | .name = $n' | sort
{"lastTransitionTime":"2024-04-10T06:50:28Z","message":"Failed to resync 4.16.0-0.test-2024-04-10-052103-ci-ln-xtnbb9k-latest because: error during syncRequiredMachineConfigPools: [context deadline exceeded, failed to update clusteroperator: [client rate limiter Wait returned an error: context deadline exceeded, error required MachineConfigPool master is not ready, retrying. Status: (total: 3, ready 0, updated: 3, unavailable: 3, degraded: 0)]]","reason":"RequiredPoolsFailed","status":"True","type":"Degraded","name":"machine-config"} and ~5m later, $ oc get -o json clusteroperator | jq -c '.items[] | .metadata.name as $n | .status.conditions[] | select((.type == "Available" and .status == "False") or (.type == "Degraded" and .status == "True")) | .name = $n' | sort
{"lastTransitionTime":"2024-04-10T06:50:28Z","message":"Failed to resync 4.16.0-0.test-2024-04-10-052103-ci-ln-xtnbb9k-latest because: error during syncRequiredMachineConfigPools: [context deadline exceeded, failed to update clusteroperator: [client rate limiter Wait returned an error: context deadline exceeded, error required MachineConfigPool master is not ready, retrying. Status: (total: 3, ready 0, updated: 3, unavailable: 3, degraded: 0)]]","reason":"RequiredPoolsFailed","status":"True","type":"Degraded","name":"machine-config"}
{"lastTransitionTime":"2024-04-10T06:55:30Z","message":"OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()","reason":"OAuthServerDeployment_UnavailablePod","status":"True","type":"Degraded","name":"authentication"} And the CVO passes these along: $ oc adm upgrade
Failing=True:
Reason: ClusterOperatorsDegraded
Message: Cluster operators authentication, machine-config are degraded
Error while reconciling 4.16.0-0.test-2024-04-10-052103-ci-ln-xtnbb9k-latest: authentication, machine-config has an unknown error: ClusterOperatorsDegraded
...
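The conditions driving that output are also exported through the CVO's cluster_operator_conditions metric, which is what the alert changes in this PR build on. As a hedged sketch (not a query quoted from this PR), something along these lines in the monitoring console would surface the same sad operators as the jq filters above:

# Sketch: list ClusterOperators reporting Available=False or Degraded=True,
# mirroring the jq filters in the transcript above.
cluster_operator_conditions{condition="Available"} == 0
  or
cluster_operator_conditions{condition="Degraded"} == 1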
operator alert renders with an extra |
…usterOperatorDegraded

By adding cluster_operator_up handling for ClusterVersion, with 'version' as the component name, the same way we handle cluster_operator_conditions. This plugs us into ClusterOperatorDown (based on cluster_operator_up) and ClusterOperatorDegraded (based on both cluster_operator_conditions and cluster_operator_up).

I've adjusted the ClusterOperatorDegraded rule so that it fires on ClusterVersion Failing=True and does not fire on Failing=False. Thinking through an update from a release from before this change:

1. Outgoing CVO does not serve cluster_operator_up{name="version"}.
2. User requests an update to a release with this change.
3. New CVO comes in, starts serving cluster_operator_up{name="version"}.
4. The old ClusterOperatorDegraded rule sees no matching cluster_operator_conditions{name="version",condition="Degraded"}, falls through to cluster_operator_up{name="version"}, and starts cooking the 'for: 30m'.
5. If we go more than 30m before updating the ClusterOperatorDegraded rule to understand Failing, ClusterOperatorDegraded would fire.

We'll need to backport the ClusterOperatorDegraded expr change to one 4.y release before the CVO-metrics change lands to get:

1. Outgoing CVO does not serve cluster_operator_up{name="version"}.
2. User requests an update to a release with the expr change.
3. Incoming ClusterOperatorDegraded sees no cluster_operator_conditions{name="version",condition="Degraded"}, cluster_operator_conditions{name="version",condition="Failing"} (we hope), or cluster_operator_up{name="version"}, so it doesn't fire. Unless we are Failing=True, in which case, hooray, we'll start alerting about it.
4. User requests an update to a release with the CVO-metrics change.
5. New CVO starts serving cluster_operator_up, just like the fresh-modern-install situation, and everything is great.

The missing-ClusterVersion metrics don't matter all that much today, because the CVO has been creating a replacement ClusterVersion since at least 90e9881 (cvo: Change the core CVO loops to report status to ClusterVersion, 2018-11-02, openshift#45). But it will become more important with [1], which is planning on removing that default creation. When there is no ClusterVersion, we expect ClusterOperatorDown to fire.

The awkward:

{{ "{{ ... \"version\" }} ... {{ end }}" }}

business is because this content is unpacked in two rounds of templating:

1. The cluster-version operator's getPayloadTasks' renderManifest preprocessing for the CVO directory, which is based on Go templates.
2. Prometheus alerting-rule templates, which use console templates [2], which are also based on Go templates [3].

The '{{ "..." }}' wrapping is consumed by the CVO's templating, and the remaining:

{{ ... "version" }} ... {{ end }}

is left for Prometheus' templating.

[1]: openshift#741
[2]: https://prometheus.io/docs/prometheus/2.51/configuration/alerting_rules/#templating
[3]: https://prometheus.io/docs/visualization/consoles/
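For orientation, here is a rough PromQL sketch of the two rules described above. It illustrates the wiring the commit message names (cluster_operator_up and cluster_operator_conditions), not the exact expressions from the shipped manifests:

# ClusterOperatorDown sketch: fires when a component's cluster_operator_up
# drops to 0; per the note above, a missing ClusterVersion after [1] should
# also land here via cluster_operator_up{name="version"}.
max by (namespace, name, reason) (cluster_operator_up == 0)

# ClusterOperatorDegraded sketch: fires on Degraded=True, on ClusterVersion
# Failing=True, or falls back to cluster_operator_up for components exposing
# no matching condition; the real rule also holds 'for: 30m' before firing.
max by (namespace, name, reason) (
  (
    cluster_operator_conditions{condition="Degraded"}
      or
    cluster_operator_conditions{name="version",condition="Failing"}
      or on (namespace, name)
    group by (namespace, name) (cluster_operator_up)
  ) == 1
)

Folding ClusterVersion in under name="version" is what lets a single pair of rules cover both ClusterOperators and the CVO itself.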
Force-pushed from 74312d1 to 10849d7
/retest
@@ -247,6 +247,7 @@ func (o *Options) run(ctx context.Context, controllerCtx *Context, lock resource
 	}
 	klog.Infof("Failed to initialize from payload; shutting down: %v", err)
 	resultChannel <- asyncResult{name: "payload initialization", error: firstError}
+	return
This was the panic?
yup, essay in 2952a2f ;)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: petr-muller, wking
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@wking: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Single failure from the hypershift conformance job; does not seem to be related.
@petr-muller: Overrode contexts on behalf of petr-muller: ci/prow/e2e-hypershift-conformance
/qe-approved
/label qe-approved cc @dis016
/label qe-approved
@wking: This pull request references Jira Issue OCPBUGS-9133, which is valid. 3 validation(s) were run on this bug.
Requesting review from QA contact.
@wking: Jira Issue OCPBUGS-9133: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-9133 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER] This PR has been included in build cluster-version-operator-container-v4.16.0-202404181209.p0.g5e73deb.assembly.stream.el9 for distgit cluster-version-operator.
Fix included in accepted release 4.16.0-0.nightly-2024-04-21-123502
…sight

Structure this output, so it gets all the usual pretty-printing, detail, etc. that the sad-ClusterOperator conditions are getting.

Sometimes Failing will complain about sad ClusterOperators, and in that case we'll double up on that messaging. But we're punting on "consolidate when multiple updateInsights complain about the same root cause" for now. And sometimes Failing will complain about other resources, such as ProgressDeadlineExceeded operator Deployments [1], and in that case the information is only flowing out through ClusterVersion, and not via the other resources we check when rendering status.

The links to ClusterOperatorDegraded are because [2] folded Failing=True into ClusterOperatorDegraded alerting, although we still need to update the runbook to address that change.

The *output updates are via:

$ go build ./cmd/oc
$ export OC_ENABLE_CMD_UPGRADE_STATUS=true
$ for X in pkg/cli/admin/upgrade/status/examples/*-cv.yaml; do ./oc adm upgrade status --mock-clusterversion "${X}" > "${X/-cv.yaml/.output}"; ./oc adm upgrade status --detailed=all --mock-clusterversion "${X}" > "${X/-cv.yaml/.detailed-output}"; done

[1]: https://github.com/openshift/cluster-version-operator/blob/1acac06742fb0e3e49ffe2294864007f26a7799d/lib/resourcebuilder/apps.go#L122C124-L122C148
[2]: openshift/cluster-version-operator#746