New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor ClusterOperator status updates #213
refactor ClusterOperator status updates #213
Conversation
152dd10
to
487bb16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me, just a couple comments.
co.Status.RelatedObjects = relatedObjects | ||
co.Status.Versions = computeClusterOperatorVersions() | ||
|
||
operatorIsDisabled, err := utils.IsOperatorDisabled(client, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move this condition up before "defaultUnsetConditions"? It feels a little strange to get all status handler conditions, then set all defaults, then change one of the defaults to non-default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason I put it here is so that i don't have to do the little dance checking to see if the condition exists and if it doesn't create/append it, etc.
though you did make me think of an issue with this. we shouldn't just blindly overwrite the reason if the condition was set False
by a handler. i'll fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plus it isn't changing the Status
on the condition, just the Reason
and Message
.
487bb16
to
cdf0ce5
Compare
/test e2e-aws-upgrade |
cdf0ce5
to
31c9526
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgoodwin, sjenning The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The cloud-cred operator hasn't set explicit happy-state reasons since [1]: Because, with the new interface, controllers only report non-default conditions, the reasons for being in the expected steady state for the condition is no longer reported (i.e. ReconcilingComplete and NoCredentialsFailing). Having an explicit happy reason makes it easier to do things like: sort_desc(count by (reason) (cluster_operator_conditions{name="cloud-credential",condition="Degraded"})) without having to worry about empty reason being implicitly happy, or some weird, sad condition that just happened to not get a reason string. These choices are fairly common. For example, in a recent 4.10 CI job [2]: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp/1459283637437468672/artifacts/e2e-gcp/gather-extra/artifacts/clusteroperators.json | jq -r '.items[].status.conditions[] | .reason + ": " + .message' | sort | uniq -c | sort -n | tail 1 UnsupportedPlatform: Nothing to do on this Platform 2 : Deployed 0.19.0 2 : Samples installation successful at 4.10.0-0.nightly-2021-11-12-221241 2 AsExpected: NodeInstallerProgressing: 3 nodes are at revision 7 2 AsExpected: StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 7 2 OperatorAvailable: Available release version: 4.10.0-0.nightly-2021-11-12-221241 4 AsExpected: NodeControllerDegraded: All master nodes are ready 13 AsExpected: 28 : 34 AsExpected: All is well [1]: openshift#213 (comment) [2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp/1459283637437468672
The cloud-cred operator hasn't set explicit happy-state reasons since [1]: Because, with the new interface, controllers only report non-default conditions, the reasons for being in the expected steady state for the condition is no longer reported (i.e. ReconcilingComplete and NoCredentialsFailing). Having an explicit happy reason makes it easier to do things like: sort_desc(count by (reason) (cluster_operator_conditions{name="cloud-credential",condition="Degraded"})) without having to worry about empty reason being implicitly happy, or some weird, sad condition that just happened to not get a reason string. These choices are fairly common. For example, in a recent 4.10 CI job [2]: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp/1459283637437468672/artifacts/e2e-gcp/gather-extra/artifacts/clusteroperators.json | jq -r '.items[].status.conditions[] | .reason + ": " + .message' | sort | uniq -c | sort -n | tail 1 UnsupportedPlatform: Nothing to do on this Platform 2 : Deployed 0.19.0 2 : Samples installation successful at 4.10.0-0.nightly-2021-11-12-221241 2 AsExpected: NodeInstallerProgressing: 3 nodes are at revision 7 2 AsExpected: StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 7 2 OperatorAvailable: Available release version: 4.10.0-0.nightly-2021-11-12-221241 4 AsExpected: NodeControllerDegraded: All master nodes are ready 13 AsExpected: 28 : 34 AsExpected: All is well [1]: openshift#213 (comment) [2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp/1459283637437468672
@dgoodwin @joelddiaz @derekwaynecarr
This PR refactors the code responsible for updating the
cloud-credential
ClusterOperator
status.Previous to the recent additions of the
awspodidentity
andoidcdiscoveryendpoint
controllers, the only controller whose status effected theClusterOperator
status was thecredentialsrequest
controller and thus, the code to modify the status conditions was authoritatively controlled there.However, with the new controllers, there are now 3 controllers whose state should be reflected in the status conditions.
The PR seeks to pull the code responsible for reconciling the
ClusterOperator
status out of thecredentialsrequest
controller so that other controllers can also register asStatusHandlers
that can impact the status conditions and related objects.This PR is largely as no-op, only seeking to separate the status reporting code, creating an interface that controllers can use, and switch the
credentialsrequest
controller to use that interface. Because, with the new interface, controllers only report non-default conditions, the reasons for being in the expected steady state for the condition is no longer reported (i.e.ReconcilingComplete
andNoCredentialsFailing
)