Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE-387: Added support for sts clusters #51

Merged
merged 5 commits into from
Apr 22, 2022

Conversation

thejasn
Copy link
Contributor

@thejasn thejasn commented Apr 18, 2022

No description provided.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 18, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 18, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 18, 2022
@thejasn thejasn marked this pull request as ready for review April 19, 2022 09:53
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 19, 2022
@openshift-ci openshift-ci bot requested review from arjunrn and Miciah April 19, 2022 09:53
Copy link
Contributor

@arjunrn arjunrn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still needs some work.

@@ -50,6 +51,8 @@ const (
controllerWebhookPort = 9443
// common prefix for all resource of an operand
controllerResourcePrefix = "aws-load-balancer-controller"
// controllerReEnqueueDuration is the delay to re-enqueue.
controllerReEnqueueDuration = time.Second * 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be precise this value is used when the secret is missing, for some other retry condition it could be something else. So a better name would be secretMissingReEnqueueDuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -135,6 +138,20 @@ func (r *AWSLoadBalancerControllerReconciler) Reconcile(ctx context.Context, req
return ctrl.Result{}, fmt.Errorf("failed to ensure CredentialsRequest for AWSLoadBalancerController %q: %w", req.Name, err)
}

secretProvisioned, err := r.ensureCredentialsRequestSecret(ctx, credentialsRequest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are not ensuring the credentials secret here. We are just checking if it is available. So something like provisioned, err := r.credentialsSecretProvisioned() would be a better fit. Also if you get a IsNotFound error then just return false. It keeps the outer calling function cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if you get a IsNotFound error then just return false. It keeps the outer calling function cleaner.

It doesn't help much, since we still need to handle other types of errors (unless we suppress those errors)

if err == nil && !secretProvisioned {
       // update status and re-enqueue
} else if err != nil {
      // return non-NotFound errors
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All other errors we simply return an error even if secretProvisioned is true(because we never expect secretProvisioned to be true and for there to be an error).

provisioned, err := r.credentialsSecretProvisioned()
if err!=nil {
   // this is only possible when there is some error other than not found
  // and we want to immediately retry because it's probably transient
  return err
}

r.updateStatus(..provisioned)

// retrying after delay to ensure secret provisioning.
return ctrl.Result{RequeueAfter: controllerReEnqueueDuration}, nil
} else if err != nil {
return ctrl.Result{}, fmt.Errorf("failed to ensure AWSLoadBalancerController %q service account: %w", req.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Name: boundSATokenVolumeName,
VolumeSource: corev1.VolumeSource{
Projected: &corev1.ProjectedVolumeSource{
DefaultMode: aws.Int32(420),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a constant. Also you shouldn't be using the aws pointer helper function. Kubernetes has it's own pointer helper called pointer.Int32().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Sources: []corev1.VolumeProjection{{
ServiceAccountToken: &corev1.ServiceAccountTokenProjection{
Audience: "openshift",
ExpirationSeconds: aws.Int64(3600),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Sources: []corev1.VolumeProjection{{
ServiceAccountToken: &corev1.ServiceAccountTokenProjection{
Audience: "openshift",
ExpirationSeconds: aws.Int64(3600),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this please. Also do we have tests covering any change we make to the bound service account token?

@@ -63,6 +64,7 @@ func desiredAWSLoadBalancerServiceAccount(namespace string, controller *albo.AWS
Namespace: namespace,
Name: fmt.Sprintf("%s-%s", controllerResourcePrefix, controller.Name),
},
AutomountServiceAccountToken: aws.Bool(true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we are using a non default value we have to update the functions which verifies if the service account needs an update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my bad, I forgot this. Added.

CredentialsRequestAvailableCondition = "CredentialsRequestAvailable"
DeploymentAvailableCondition = "DeploymentAvailable"
DeploymentUpgradingCondition = "DeploymentUpgrading"
CredentialsRequestAvailable = "CredentialsRequestAvailable"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant should still be called CredentialsRequestAvailableCondition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -34,42 +39,25 @@ func (r *AWSLoadBalancerControllerReconciler) updateControllerStatus(ctx context
return nil
}

func credentialRequestsConditions(cr *cco.CredentialsRequest, generation int64) []metav1.Condition {
func credentialRequestsConditions(cr *cco.CredentialsRequest, secretProvisioned bool, generation int64) []metav1.Condition {
var conditions []metav1.Condition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the credentials request being used anywhere. Do we still have to pass it to this function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the condition name now doesn't make sense. We should call it something like CredentialsSecretAvailable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the credentials request being used anywhere. Do we still have to pass it to this function.

Changed argument to be just the secret name.

Also the condition name now doesn't make sense. We should call it something like CredentialsSecretAvailable.

Fixed

docs/sts/sts.md Outdated Show resolved Hide resolved
docs/sts/sts.md Outdated Show resolved Hide resolved
docs/sts/sts.md Outdated
Comment on lines 14 to 15
### Extract and prepare the `ccoctl` binary

1. Obtain the OpenShift Container Platform release image:

```bash
RELEASE_IMAGE=$(./openshift-install version | awk '/release image/ {print $3}')
```

2. Get the CCO container image from the OpenShift Container Platform release image:

```bash
CCO_IMAGE=$(oc adm release info --image-for='cloud-credential-operator' $RELEASE_IMAGE)
```

3. Extract the ccoctl binary from the CCO container image within the OpenShift
Container Platform release image:

```bash
oc image extract $CCO_IMAGE --file="/usr/bin/ccoctl" -a ~/.pull-secret
```

4. Change the permissions to make ccoctl executable:

```bash
chmod 775 ccoctl
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just link to the existing documentation where they describe how to get ccoctl.

docs/sts/sts.md Outdated
3. Apply the secrets to your cluster:

```bash
ls <path_to_ccoctl_output_dir>/manifests/*-credentials.yaml | xargs -I{} oc apply -f {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<path_to_ccoctl_output_dir> is not defined anywhere.

@@ -78,6 +78,19 @@ func (r *AWSLoadBalancerControllerReconciler) ensureCredentialsRequest(ctx conte
return current, nil
}

func (r *AWSLoadBalancerControllerReconciler) credentialsSecretProvisioned(ctx context.Context, cr *cco.CredentialsRequest) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should return a false, nil when the error is IsNotFound so that the caller doesn't have to handle it.

@@ -24,6 +24,9 @@ const (
appLabelName = "app.kubernetes.io/name"
appName = "aws-load-balancer-operator"
appInstanceName = "app.kubernetes.io/instance"
// awsSDKLoadConfigName is the name of the environment variable which enables shared configs.
// Without which certain fields in the config aren't set. Eg: role_arn.
awsSDKLoadConfigName = "AWS_SDK_LOAD_CONFIG"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you verify that this is actually required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, https://github.com/aws/aws-sdk-go/blob/v1.43.42/aws/session/shared_config.go#L332 doesn't bind role_arn without setting exOpts and AWS_SDK_LOAD_CONFIG sets that value.

@@ -135,6 +138,23 @@ func (r *AWSLoadBalancerControllerReconciler) Reconcile(ctx context.Context, req
return ctrl.Result{}, fmt.Errorf("failed to ensure CredentialsRequest for AWSLoadBalancerController %q: %w", req.Name, err)
}

secretProvisioned, err := r.credentialsSecretProvisioned(ctx, credentialsRequest)
if err != nil {
return ctrl.Result{}, fmt.Errorf("failed to ensure credentialsrequest secret %q for AWSLoadBalancerController %q: %w", credentialsRequest.Spec.SecretRef.Name, req.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should be different. Probably failed to verify if credentials secret %q for AWSLoadBalancerController %q has been provisioned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -135,6 +138,23 @@ func (r *AWSLoadBalancerControllerReconciler) Reconcile(ctx context.Context, req
return ctrl.Result{}, fmt.Errorf("failed to ensure CredentialsRequest for AWSLoadBalancerController %q: %w", req.Name, err)
}

secretProvisioned, err := r.credentialsSecretProvisioned(ctx, credentialsRequest)
if err != nil {
return ctrl.Result{}, fmt.Errorf("failed to ensure credentialsrequest secret %q for AWSLoadBalancerController %q: %w", credentialsRequest.Spec.SecretRef.Name, req.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should be different. Probably failed to verify if credentials secret %q for AWSLoadBalancerController %q has been provisioned.

@arjunrn
Copy link
Contributor

arjunrn commented Apr 21, 2022

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 21, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 21, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arjunrn, thejasn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@thejasn
Copy link
Contributor Author

thejasn commented Apr 21, 2022

/assign @lihongan

@lihongan
Copy link

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Apr 22, 2022
@arjunrn
Copy link
Contributor

arjunrn commented Apr 22, 2022

/label docs-approved
/label px-approved

@openshift-ci openshift-ci bot added docs-approved Signifies that Docs has signed off on this PR px-approved Signifies that Product Support has signed off on this PR labels Apr 22, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 22, 2022

@thejasn: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 03659f5 into openshift:main Apr 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. docs-approved Signifies that Docs has signed off on this PR lgtm Indicates that a PR is ready to be merged. px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants