New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move status calculation into its own controller #255
move status calculation into its own controller #255
Conversation
/assign @dgoodwin |
ee41f62
to
8336503
Compare
@@ -118,4 +123,19 @@ var ( | |||
ModeDegraded, | |||
ModeUnknown, | |||
} | |||
|
|||
// Add known new credentials for next version upgrade |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the rationale behind going to constants vs the actuator? Too difficult or awkward to instantiate an actuator in the status controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly seemed weird to instantiate the entire credentials actuator to just get a static list of objects. My first pass had me setting up the status controller to embed the actuator, but I ended up with this instead.
// Need to be able to force a status calculation here because the status | ||
// controller has no Watch that could cause it to run for this situation. | ||
// Could we annotate our secret to trigger a status calc??? | ||
status.SyncStatus(r.controllerRuntimeClient, r.logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stared at this for a bit, still not sure what we should do to avoid that. Need to think on it a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really annoyed by this one. This thought occurred to me this morning: we could annotate the config CR with the current datestamp from the oidc controller???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this controller sees the degraded state change, annotated the CloudConfig CR with a timestamp to force a reconcile? That would put it into the normal controller flow, with error handling etc. Seems like a good idea to me.
6898026
to
ceb48d6
Compare
/test e2e-aws |
1 similar comment
/test e2e-aws |
@dgoodwin un-WIPed as I finally tested this and it looked fine to me. The working (but IMO non-ideal) syncing from the oidc controller is still in the current version of the PR. |
if existingCondition == nil { | ||
existing = append(existing, newCondition) | ||
} else { | ||
logger.WithField("statushandler", handlerName).Infof("condition already set for type %s by a previous handler, the new condition from the current handler will be accepted: %v", newCondition.Type, newCondition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last write wins? Could you clarify this in the function comment a little. This should probably be a warning as well as info is being lost. The suggestion above for logging every condition should help when combined with that.
if err != nil { | ||
logger.WithError(err).WithField("statushandler", handlerName).Errorf("failed to get conditions from status handler") | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you info log each condition we got here and the status handler it came from. Then below when we see a conflict we can log that as a warning and still know the user/admin can see which handlers reported the same condition.
rootSecret.Name = constants.VSphereCloudCredSecretName | ||
secrets = append(secrets, constants.VsphereUpcomingSecrets...) | ||
default: | ||
log.Errorf("unable to provide upcoming secrets for unknown platform: %v", platformType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it will be called on bare metal, so hitting this line is really not an error IMO, I'd go with info (or maybe even debug) and soften the message up a little bit.
}, | ||
}, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coverage seems a little light in here, could we get something to simulate two handlers returning the same condition? Or is the handler registration making this infeasible?
aea4378
to
dcd7d1a
Compare
@joelddiaz: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
dcd7d1a
to
fabe045
Compare
/test e2e-aws |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small typo but I think we're gtg otherwise, please squash.
// cannot watch for. So just update the config CR (which the status controller does watch) | ||
// to trigger a status recalculation. | ||
if err := r.updateConfigCRTimestamp(); err != nil { | ||
r.logger.WithError(err).Error("failed to updated config CR to trigger status caluclation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small calculation typo here
Rather than have each controller responsible for triggering a status calculation, move that responsiblity into its own controller. Store the upcoming list of Secrets as a shared constant/variable (for use by both the Actuator's Upgradeable() and the status controller's Secret watch list). Remove GetUpcomingSecrets() from the Actuator interface. Stop setting default ClusterOperator status condition values from the Actuator's Upgradeable() API call (it is already set in the current status calculation codepath). Update controllers to register themselves to the new status controller for status calculation callbacks. Convert previous pkg/util/status into the new status controller (now in pkg/operator/status). Remove hack of sending fake CredentialsRequests just to trigger a status sync from credentialsrequest controller (new status calculator now will watch and react appropriately). Update credentialsrequest test cases to only check for the setting of non-default conditions. Update credentialsrequest test cases to no longer need to register and deregister status callbacks. Move credentialsrequest status test cases to new status controller. Update secretannotator actuators to no longer trigger a status sync. Note: for situations where the status controller cannot watch for certain conditions that should cause a status recalculation, the controller can annotate the CCO's operator config object to add a timestamp (which will in turn trigger a status calc). Important changes: Ensure the managementState field is set on our CCO operator config (otherwise we run into problems trying to annotate the operator config). The operator config should not allow an empty string for the managementState field, but it absolutely is not set at all (which shows up as being set to the empty string after doing a Get() of the operator config). Detect this, and assume that the empty string means a managementState of "Managed", and send back "Managed" on Update().
fabe045
to
dbfdb78
Compare
/lgtm Nice work, looking forward to this as I have mass conflicts every time my creds rotate. :) |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgoodwin, joelddiaz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
Rather than have each controller responsible for triggering a status calculation, move that responsiblity into its own controller.
Note: for situations where the status controller cannot watch for certain conditions that should cause a status recalculation, the controller can annotate the CCO's operator config object to add a timestamp (which will in turn trigger a status calc).
Also note: Ensure the managementState field is set on our CCO operator config (otherwise we run into problems trying to annotate the operator config). The operator config should not allow an empty string for the managementState field, but it absolutely is not set at all (which shows up as being set to the empty string after doing a Get() of the operator config). Detect this, and assume that the empty string means a managementState of "Managed", and send back "Managed" on Update().