New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vsphere creds checking #1186
vsphere creds checking #1186
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great.
Given what's going on don't worry about these changes, I or someone on the team will get them.
if authWorking { | ||
status = corev1.ConditionFalse | ||
reason = platformAuthSuccessReason | ||
message = "Platform credentails passed authentication check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
credentails/credentials
authCondition := controllerutils.FindClusterDeploymentCondition(cd.Status.Conditions, hivev1.AuthenticationFailureClusterDeploymentCondition) | ||
if authCondition != nil && authCondition.Status == corev1.ConditionTrue { | ||
cdLog.Info("Skipping provision while platform credentials authentication is failing.") | ||
// Periodically retry??? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be really nice to just requeue every 5 min and be done with it, but there's no reason to think that would help. It probably needs to be done right, watch secrets for changes to a platform secret, requeue the CD if we spot one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now return an error when credentials are not valid.
@@ -313,6 +313,9 @@ const ( | |||
|
|||
// DeprovisionLaunchErrorCondition is set when a cluster deprovision fails to launch. | |||
DeprovisionLaunchErrorCondition ClusterDeploymentConditionType = "DeprovisionLaunchError" | |||
|
|||
// AuthenticationFailureCondition is true when platform credentials cannot be used because of authentication failure | |||
AuthenticationFailureClusterDeploymentCondition ClusterDeploymentConditionType = "AuthenticationFailure" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think there's any opportunity for better synergy with #1109? I can't see any immediately.
@@ -57,6 +57,9 @@ const ( | |||
defaultRequeueTime = 10 * time.Second | |||
maxProvisions = 3 | |||
|
|||
platformAuthFailureResason = "PlatformAuthError" | |||
platformAuthSuccessReason = "PlatformAuthWorking" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest PlatformAuthSuccess instead of Working.
message = "Platform credentails passed authentication check" | ||
} else { | ||
status = corev1.ConditionTrue | ||
reason = platformAuthFailureResason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resason.
@@ -271,6 +268,10 @@ type ReconcileClusterDeployment struct { | |||
// for the remote cluster's API server | |||
remoteClusterAPIClientBuilder func(cd *hivev1.ClusterDeployment) remoteclient.Builder | |||
|
|||
// validateCredentialsForClusterDeployment is what this controller will call to validate | |||
// that the platform creds are good (used for testing) | |||
validateCredentialsForClusterDeployment func(client.Client, *hivev1.ClusterDeployment, log.FieldLogger) (bool, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda wish this could be a separate controller, but obviously we don't want to process with launching an install here until it's been run. I know we typically wouldn't see a condition false unless it had previously been true, but would it make sense in this case? Don't launch unless we see an AuthenticationFailure condition set to False? It would be nice to encapsulate the responsibility for setting this across cloud providers into it's own controller.
Also please add this new condition to the array in provision_underway_collector.go so it's reported in the alerts if this happens. |
Fix the immediate problem where passing bad credentials for a vSphere installation means that deleting the ClusterDeployment doesn't get deleted (b/c the deprovision fails (b/c of the bad creds)). Test the creds, and set a condition so that we can avoid ever even creating a ClusterProvision when we detect bad credentials. This means deprovision works b/c the provision was never attempted. Right now we only check vSphere credentials, but more platforms can (and should) be added. Add test cases to validate this.
f6a63a9
to
c743702
Compare
/test e2e |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgoodwin, joelddiaz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test e2e |
1 similar comment
/test e2e |
/retest Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
/retest Please review the full test history for this PR and help us cut down flakes. |
Fix the immediate problem where passing bad credentials for a vSphere installation means that deleting the ClusterDeployment doesn't get deleted (b/c the deprovision fails (b/c of the bad creds)).
Test the creds, and set a condition so that we can avoid ever even creating a ClusterProvision when we detect bad credentials. This means deprovision works b/c the provision was never attempted. Right now we only check vSphere credentials, but more platforms can (and should) be added.
Add test cases to validate this.