✨ Rosa Config implementation #5499
Dockerfile
Outdated
@@ -28,12 +28,17 @@ WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
COPY ./rosa /workspace/rosa
Why are we adding this? What is the rosa file?
Dockerfile
Outdated
# Cache deps before building and copying source so that we don't need to re-download as much
# and so that source changes don't invalidate our downloaded layer
RUN --mount=type=cache,target=/root/.local/share/golang \
    --mount=type=cache,target=/go/pkg/mod \
    go mod download

# RUN go mod download
No need for this line.
PROJECT
Outdated
Can you just add the RosaRoleConfig item without changing the order of the other items?
config/default/tmp.yaml
Outdated
Why do we need this file?
config/crd/kustomization.yaml
Outdated
@@ -38,6 +39,7 @@ patchesStrategicMerge:
- patches/webhook_in_awsmanagedcontrolplanes.yaml
- patches/webhook_in_eksconfigs.yaml
- patches/webhook_in_eksconfigtemplates.yaml
#- patches/webhook_in_rosaroleconfigs.yaml
I believe you need to uncomment this line
}
}

if scope.RosaRoleConfig.Status.OIDCID == "" {
I believe you should check/set the RosaRoleConfig condition first, then get the OIDC config using the OCM client; if it does not exist, create it.
The same applies to account-roles and operator-roles.
}
}

err = r.deleteOperatorRoles(ocmClient, awsClient, scope.RosaRoleConfig.Spec.AccountRoleConfig.Prefix)
Do we need to delete the operator roles before the oidc-provider?
Nope, I changed it so it matches the reverse creation order.
	return ocmClient.DeleteOidcConfig(oidcConfigID)
}

type reporter struct {
Please move this to another file, preferably under pkg/.../rosa.
@@ -179,6 +183,24 @@ func (r *ROSAControlPlane) validateExternalAuthProviders() *field.Error {
	return nil
}

func (r *ROSAControlPlane) validateRosaRoleConfig() *field.Error {
Not sure the logic here is valid: hasDirectRoleFields will be true even if only one condition is true and the others are false. For example, r.Spec.OIDCID can be set while all other fields are empty; in that case hasDirectRoleFields will still be true.
This is intended, but we need both the && and the || of all values to get the behavior you described in the comment below.
r.Spec.RolesRef.NetworkARN != "" || r.Spec.RolesRef.KubeCloudControllerARN != "" || r.Spec.RolesRef.NodePoolManagementARN != "" ||
r.Spec.RolesRef.ControlPlaneOperatorARN != "" || r.Spec.RolesRef.KMSProviderARN != ""

if hasRosaRoleConfigRef && hasDirectRoleFields {
Let's make the logic as follows:
If RosaRoleConfigRef is set, we use it to get all roles (ignoring all other role fields even if they are set) and just log a warning.
If RosaRoleConfigRef is not set and all other role fields are set, we use the role fields.
If RosaRoleConfigRef is not set and some role fields are missing, we raise an error.
If RosaRoleConfigRef is not set and all role fields are missing, we raise an error.
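The four rules above can be sketched as a single function. This is only an illustration: roleSpec is a hypothetical, trimmed-down stand-in for the role-related fields of the ROSAControlPlane spec (the rolesRef ARNs are left out for brevity), and validateRoles is not the function from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// roleSpec is a hypothetical stand-in for the role-related spec fields;
// it is not the real API type.
type roleSpec struct {
	RosaRoleConfigRef *string // stand-in for the real object reference
	OIDCID            string
	InstallerRoleARN  string
	SupportRoleARN    string
	WorkerRoleARN     string
}

// validateRoles returns an optional warning and an error, following the
// rules in the comment above.
func validateRoles(s roleSpec) (string, error) {
	direct := []struct{ name, value string }{
		{"oidcID", s.OIDCID},
		{"installerRoleARN", s.InstallerRoleARN},
		{"supportRoleARN", s.SupportRoleARN},
		{"workerRoleARN", s.WorkerRoleARN},
	}
	var missing, set []string
	for _, f := range direct {
		if f.value == "" {
			missing = append(missing, f.name)
		} else {
			set = append(set, f.name)
		}
	}

	if s.RosaRoleConfigRef != nil {
		// The reference wins; direct fields are ignored with a warning.
		if len(set) > 0 {
			return "rosaRoleConfigRef is set; ignoring " + strings.Join(set, ", "), nil
		}
		return "", nil
	}
	if len(missing) > 0 {
		// Covers both "some fields missing" and "all fields missing".
		return "", fmt.Errorf("rosaRoleConfigRef is not set and role fields are missing: %s",
			strings.Join(missing, ", "))
	}
	return "", nil
}

func main() {
	ref := "role-config"
	fmt.Println(validateRoles(roleSpec{RosaRoleConfigRef: &ref, OIDCID: "abc"}))
	fmt.Println(validateRoles(roleSpec{OIDCID: "abc"}))
}
```

Collecting the missing field names in one pass also gives the error message enough detail to tell the user which fields to fill in.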
conditions.MarkTrue(rosaScope.ControlPlane, rosacontrolplanev1.ROSARoleConfigReadyCondition)

// Update spec fields from RosaRoleConfig
This logic is not correct: if the user has already set those fields, we overwrite their values. I believe we should define an internal roleConfig in the ROSAControlPlane scope and then choose which roles to use based on availability.
OperatorRoleConfig OperatorRoleConfig `json:"operatorRoleConfig"`
OIDCConfig OIDCConfig `json:"oidcConfig"`
IdentityRef *infrav1.AWSIdentityReference `json:"identityRef,omitempty"`
Region string `json:"region,omitempty"`
Missing description and kubebuilder tags.
As we discussed, there is no need for region.
Region string `json:"region,omitempty"`
Prefix string `json:"prefix"`
Missing description and kubebuilder tags.
// User-defined prefix for generated AWS operator policies.
// +kubebuilder:validation:MaxLength:=4
// +kubebuilder:validation:Required
Prefix string `json:"prefix"`
Prefix must be immutable, because changing the prefix will create a new role.
// User-defined prefix for all generated AWS resources
// +kubebuilder:validation:MaxLength:=4
// +kubebuilder:validation:Required
Prefix string `json:"prefix"`
Prefix must be immutable, because changing the prefix will create new roles.
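One way to enforce this is an update-time check that compares the old and new values. The sketch below is an assumption about how such a check could look, not code from this PR: validatePrefixUpdate and maxPrefixLen are hypothetical names, and in the real project the comparison would presumably live in the webhook's update validation.

```go
package main

import "fmt"

// maxPrefixLen mirrors the MaxLength:=4 marker shown in the diff above.
const maxPrefixLen = 4

// validatePrefixUpdate is a hypothetical helper: it rejects prefixes that
// are too long and any change to an existing prefix.
func validatePrefixUpdate(oldPrefix, newPrefix string) error {
	if len(newPrefix) > maxPrefixLen {
		return fmt.Errorf("prefix %q is longer than %d characters", newPrefix, maxPrefixLen)
	}
	if oldPrefix != newPrefix {
		return fmt.Errorf("prefix is immutable: cannot change %q to %q", oldPrefix, newPrefix)
	}
	return nil
}

func main() {
	fmt.Println(validatePrefixUpdate("capa", "capa")) // unchanged prefix
	fmt.Println(validatePrefixUpdate("capa", "rosa")) // changed prefix is rejected
}
```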
)

// SetupWebhookWithManager will setup the webhooks for the ROSARoleConfig.
func (r *ROSARoleConfig) SetupWebhookWithManager(mgr ctrl.Manager) error {
This routine is never called from main.go, which is why I'm getting:
$ kubectl apply -f rosa-roleconfig-01.yaml
Error from server (InternalError): error when creating "rosa-roleconfig-01.yaml": Internal error occurred: failed calling webhook "default.rosaroleconfig.infrastructure.cluster.x-k8s.io": failed to call webhook: the server could not find the requested resource
With this change in place the above error goes away:
$ git diff
diff --git a/main.go b/main.go
index c65ff7356..6348fd0de 100644
--- a/main.go
+++ b/main.go
@@ -285,6 +285,11 @@ func main() {
 			setupLog.Error(err, "unable to create webhook", "webhook", "ROSAMachinePool")
 			os.Exit(1)
 		}
+
+		if err := (&expinfrav1.ROSARoleConfig{}).SetupWebhookWithManager(mgr); err != nil {
+			setupLog.Error(err, "unable to create webhook", "webhook", "ROSARoleConfig")
+			os.Exit(1)
+		}
 	}
 	if err = (&expcontrollers.ROSARoleConfigReconciler{
Although I don't quite understand the point of this webhook, since it's currently not doing anything.
err = r.createOIDCConfig(roleConfig, scope, ocmClient)
if err != nil {
	conditions.MarkFalse(scope.RosaRoleConfig, expinfrav1.RosaRoleConfigReadyCondition, expinfrav1.RosaRoleConfigReconciliationFailedReason, clusterv1.ConditionSeverityError, "Failed to create OIDC Config: %v", err)
	return ctrl.Result{RequeueAfter: time.Second * 60}, fmt.Errorf("failed to OICD Config: %w", err)
The above line (and all the other returns with a non-empty result and a non-nil error) will produce the following log entry:
I0722 11:52:38.495821 15 controller.go:345] "Warning: Reconciler returned both a non-zero result and a non-nil error. The result will always be ignored if the error is non-nil and the non-nil error causes requeuing with exponential backoff. For more details, see: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile#Reconciler" controller="rosaroleconfig"
// - ocmToken: eyJhbGciOiJIUzI1NiIsI....
// - ocmApiUrl: Optional, defaults to 'https://api.openshift.com'
Please remove those two lines. The secret currently holds different information for authentication.
r.Spec.RolesRef.NetworkARN != "" && r.Spec.RolesRef.KubeCloudControllerARN != "" && r.Spec.RolesRef.NodePoolManagementARN != "" &&
r.Spec.RolesRef.ControlPlaneOperatorARN != "" && r.Spec.RolesRef.KMSProviderARN != ""

if hasRosaRoleConfigRef && hasAnyDirectRoleFields {
Not sure why you are making this check; if the rosaRoleConfig is defined, we use it and ignore the other role fields here.
@@ -179,6 +183,29 @@ func (r *ROSAControlPlane) validateExternalAuthProviders() *field.Error {
	return nil
}

func (r *ROSAControlPlane) validateRosaRoleConfig() *field.Error {
	hasRosaRoleConfigRef := r.Spec.RosaRoleConfigRef != nil
	hasAnyDirectRoleFields := r.Spec.OIDCID != "" || r.Spec.InstallerRoleARN != "" || r.Spec.SupportRoleARN != "" || r.Spec.WorkerRoleARN != "" ||
No need to check with ||; having even one field missing should raise an error if the rosaRoleConfig is not set.
@@ -179,6 +183,29 @@ func (r *ROSAControlPlane) validateExternalAuthProviders() *field.Error {
	return nil
}

func (r *ROSAControlPlane) validateRosaRoleConfig() *field.Error {
	hasRosaRoleConfigRef := r.Spec.RosaRoleConfigRef != nil
Suggested change:
- hasRosaRoleConfigRef := r.Spec.RosaRoleConfigRef != nil
+ if r.Spec.RosaRoleConfigRef != nil {
+ 	return nil
+ }
	return field.Invalid(field.NewPath("spec.rosaRoleConfigRef"), r.Spec.RosaRoleConfigRef, "rosaRoleConfigRef and direct role fields (oidcID, installerRoleARN, supportRoleARN, workerRoleARN, rolesRef) are mutually exclusive")
}

if !hasRosaRoleConfigRef && !hasAllDirectRoleFields {
Suggested change:
- if !hasRosaRoleConfigRef && !hasAllDirectRoleFields {
+ if !hasAllDirectRoleFields {
+ 	// raise an error here specifying which fields are missing,
+ 	// OR check each field and return field.Invalid as soon as one is missing
+ }
Region string `json:"region,omitempty"`
// Prefix is the prefix for the OIDC config.
// +immutable
Prefix string `json:"prefix"`
Let's give all the prefix fields a maximum length of 4.
IdentityRef *infrav1.AWSIdentityReference `json:"identityRef,omitempty"`
// Region is the AWS region for the OIDC config.
// +immutable
Region string `json:"region,omitempty"`
The region is defined above for all accountRole, operatorRole and oidcConfig
}

if scope.RosaRoleConfig.Status.OIDCID == "" {
	err = r.createOIDCConfig(roleConfig, scope, ocmClient)
The logic needs to change as follows.
First, get the oidcConfig using the OCM client:
1- If the oidcConfig exists, we set the oidcConfig status info and condition.
2- If the oidcConfig does not exist, we create it, then set the oidcConfig status info and condition.
The same applies to accountRoles, operatorRoles and oidcProvider.
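The get-then-create flow described above could look roughly like this. Everything here is illustrative: oidcClient is a hypothetical interface rather than the OCM client's real API, and fakeClient exists only to make the sketch runnable. Setting status and conditions is left to the caller, which can use the returned created flag.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("oidc config not found")

// oidcClient is a hypothetical interface, not the OCM client's real API.
type oidcClient interface {
	GetOIDCConfig(id string) (string, error)
	CreateOIDCConfig() (string, error)
}

// ensureOIDCConfig looks the config up first and creates it only when it
// does not exist. It reports whether a new config was created.
func ensureOIDCConfig(c oidcClient, id string) (string, bool, error) {
	if id != "" {
		got, err := c.GetOIDCConfig(id)
		if err == nil {
			return got, false, nil // already exists: reuse it
		}
		if !errors.Is(err, errNotFound) {
			return "", false, fmt.Errorf("getting OIDC config %q: %w", id, err)
		}
	}
	newID, err := c.CreateOIDCConfig()
	if err != nil {
		return "", false, fmt.Errorf("creating OIDC config: %w", err)
	}
	return newID, true, nil
}

// fakeClient is an in-memory implementation used only for this example.
type fakeClient struct {
	store map[string]bool
	next  int
}

func (f *fakeClient) GetOIDCConfig(id string) (string, error) {
	if f.store[id] {
		return id, nil
	}
	return "", errNotFound
}

func (f *fakeClient) CreateOIDCConfig() (string, error) {
	f.next++
	id := fmt.Sprintf("oidc-%d", f.next)
	f.store[id] = true
	return id, nil
}

func main() {
	c := &fakeClient{store: map[string]bool{"existing-id": true}}
	fmt.Println(ensureOIDCConfig(c, "existing-id"))
	fmt.Println(ensureOIDCConfig(c, ""))
}
```

The same shape would apply to account roles, operator roles and the OIDC provider, each with its own lookup and create calls.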
rosacontrolplanev1.ROSARoleConfigNotReadyReason,
clusterv1.ConditionSeverityWarning,
"RosaRoleConfig %s/%s is not ready", rosaScope.ControlPlane.Namespace, rosaScope.ControlPlane.Spec.RosaRoleConfigRef.Name)
return ctrl.Result{}, fmt.Errorf("RosaRoleConfig %s/%s is not ready", rosaScope.ControlPlane.Namespace, rosaScope.ControlPlane.Spec.RosaRoleConfigRef.Name)
I think it would be more suitable to log an info message here and just requeue without returning the error.
}

// OperatorRoleConfig defines cluster-specific operator IAM roles based on your cluster configuration.
type OperatorRoleConfig struct {
Please follow the proposal definition for the OperatorRoleConfig. OperatorRoleConfig must have an option to assign the oidc-Id, and it should be mutually exclusive with oidcConfig->createManagedOIDC.
The OidcConfig is missing from the ROSARoleConfig API.
@@ -907,7 +958,7 @@ func validateControlPlaneSpec(ocmClient rosa.OCMClient, rosaScope *scope.ROSACon
 	return "", nil
 }

-func buildOCMClusterSpec(controlPlaneSpec rosacontrolplanev1.RosaControlPlaneSpec, creator *rosaaws.Creator) (ocm.Spec, error) {
+func buildOCMClusterSpec(controlPlaneSpec rosacontrolplanev1.RosaControlPlaneSpec, roleConfig *expinfrav1.ROSARoleConfig, creator *rosaaws.Creator) (ocm.Spec, error) {
ExternalAuth needs to be defined in the ROSARoleConfig as well and assigned to the cluster spec, similar to here.
We discussed this in the meeting, and we don't need to include ExternalAuth in RosaRoleConfig.
/test pull-cluster-api-provider-aws-e2e-blocking
// +optional
// +immutable
SharedVPCConfig SharedVPCConfig `json:"sharedVPCConfig,omitempty"`
// OIDCID is the ID of the OIDC config that will be used to create the operator roles.
Suggested change:
- // OIDCID is the ID of the OIDC config that will be used to create the operator roles.
+ // OIDCID is the ID of the OIDC config that will be used to create the operator roles. A managed OIDC provider will be created if OIDCID is not specified.
OperatorRoleConfig OperatorRoleConfig `json:"operatorRoleConfig"`
OIDCConfig OIDCConfig `json:"oidcConfig"`
IdentityRef *infrav1.AWSIdentityReference `json:"identityRef,omitempty"`
Region string `json:"region,omitempty"`
As we discussed, there is no need for region.
}

oidcID := scope.RosaRoleConfig.Status.OIDCID
err = r.deleteOIDCProvider(ocmClient, awsClient, oidcID)
We don't delete the oidc-provider if the user set it under the spec.operatorRole.OIDCConfig
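A guard for this could be as small as the sketch below. The struct and its field names are hypothetical stand-ins for the spec and status values involved; the assumption is that a non-empty user-supplied ID in the spec means the OIDC config is owned by the user and must survive deletion of the RosaRoleConfig.

```go
package main

import "fmt"

// oidcState is a hypothetical stand-in for the relevant spec/status fields.
type oidcState struct {
	SpecOIDCID   string // set by the user in the spec; empty when managed
	StatusOIDCID string // recorded in status after reconciliation
}

// shouldDeleteOIDC reports whether the controller created the OIDC config
// itself and is therefore responsible for cleaning it up.
func shouldDeleteOIDC(s oidcState) bool {
	return s.SpecOIDCID == "" && s.StatusOIDCID != ""
}

func main() {
	fmt.Println(shouldDeleteOIDC(oidcState{StatusOIDCID: "managed-id"}))
	fmt.Println(shouldDeleteOIDC(oidcState{SpecOIDCID: "user-id", StatusOIDCID: "user-id"}))
}
```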
/retest-required
/test pull-cluster-api-provider-aws-test
/test pull-cluster-api-provider-aws-apidiff-main
Based on proposal #5451
Adds the RosaRoleConfig API and its implementation, which should create the account/operator roles and the OIDC config/provider needed to create a ROSA cluster.
We need to move the RosaMachinePoolAutoScaling definition to controlplane, because otherwise there would be a circular dependency.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Special notes for your reviewer:
Checklist:
Release note: