Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSSM-5730: Allow creating 2 cluster-wide SMCPs when one of them is a gateway controller #1585

Closed
wants to merge 32 commits into from

Conversation

jewertow
Copy link
Member

@jewertow jewertow commented Jan 22, 2024

This change aims to allow installing SMCP as a gateway controller and another one as a standard service mesh at the same time. What's more, service mesh should be allowed to be cluster-wide, even though the gateway controller is also cluster-wide, because if gateway controller would be pre-installed in the future, then cluster-wide SMCP could not be installed at all. Cluster-wide service mesh still makes sense, because users can narrow its scope using discovery selectors. There is a risk of overwriting istio-ca-root-cert config maps, so I added a validation which makes sure that 2 SMCPs can't be installed with the default config map name. Beside that we should require customizing the GatewayClass name, but it could be implemented in a follow-up.

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
…troller

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
…bles

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Copy link

openshift-ci bot commented Jan 22, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@jewertow
Copy link
Member Author

/test all

@openshift-ci openshift-ci bot added the size/L label Jan 22, 2024
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
… already exists

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
@openshift-ci openshift-ci bot added size/XL and removed size/L labels Jan 30, 2024
…ayController

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
…ap names

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
@jewertow jewertow marked this pull request as ready for review January 31, 2024 22:31
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Copy link
Contributor

@dgn dgn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Jacek, especially for the good test coverage! I only had time for a first pass, will give it a more thorough look tomorrow. we don't need to allow multi-tenant gateway-controller modes, which simplifies the logic quite a bit; I left some comments to that extent

if currentSmcp.IsClusterScoped() {
// allow cluster-wide gateway controller when another cluster-wide non gateway controller already exists
// this is the case where openshift-ingress controller and cluster-wide mesh co-exist
if (newSmcp.IsGatewayController() && !currentSmcp.IsGatewayController()) || (!newSmcp.IsGatewayController() && currentSmcp.IsGatewayController()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, check negatives first, then happy path

if newSmcp.IsGatewayController() == currentSmcp.IsGatewayController() {
    return append(allErrors, otherSmcpExists)
}
return append(allErrors, validateOverlappingCaCertConfigMapNames(meta, newSmcp, &smcps, allErrors)...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

}
}
if len(smcps.Items) > 1 {
if newSmcp.IsGatewayController() && countGatewayControllers(smcps.Items) > 1 ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how would countGatewayControllers() > 1 ever happen? I'd say if len(smcps.Items) > 1 we will never allow creation of a cluster-wide smcp, just return the error straight away

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I added a test case, which cannot exist and then I implemented conditions, and I didn't notice that it does not make any sense.

{
        name: "creating cluster-wide gateway controller SMCP when another already exists - expected error (2nd execution)",
        smcp: clusterWideGatewayController,
        existingObjs: []*maistrav2.ServiceMeshControlPlane{
                NewV2SMCPResource("basic", "istio-system-1", clusterWideGatewayController),
                NewV2SMCPResource("basic", "istio-system-2", clusterWideGatewayController2),
        },
        expectedErr: fmt.Errorf("a cluster-scoped SMCP may only be created when no other SMCPs exist"),
},

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong. We need that test case as well, because 2 gateway-controllers can exist when a user creates: gateway-controller and simple cluster-wide and then changes the 2nd to gateway-controller. That's why more than 1 gateway-controller can be found and in such a case we have to return an error and stop reconciliation.

Copy link
Member Author

@jewertow jewertow Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I could simplify this condition and implement this validation in ValidateUpdate, which is called when SMCP mode is changed. That approach would be better, because currently, users will be able to update the SMCP without any error, but the reconciliation will fail. If I implemented this check in ValidateUpdate, users would get validation error.

}
// only cluster-wide gateway controller can be created when another SMCP exist
if !newSmcp.IsGatewayController() {
return append(allErrors, otherSmcpExists)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole block can be removed if we check above if !currentSmcp.IsClusterScoped()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you can see in another comment, this condition would have to be merged with !currentSmcp.IsClusterScoped().

// an SMCP already exists and new one is created
if len(smcps.Items) == 1 && smcps.Items[0].UID != meta.GetUID() {
currentSmcp := smcps.Items[0].Spec
if currentSmcp.IsClusterScoped() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest starting with the negative cases, so:

	if newSmcp.IsClusterScoped() {
		// an SMCP already exists and new one is created
		if len(smcps.Items) == 1 && smcps.Items[0].UID != meta.GetUID() {
			currentSmcp := smcps.Items[0].Spec
			if !currentSmcp.IsClusterScoped() {
				return append(allErrors, otherSmcpExists)
			}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is incorrect, because when existing SMCP is multi-tenant we have to check if the new one is cluster-wide gateway controller, so it could be:

	if newSmcp.IsClusterScoped() {
		// an SMCP already exists and new one is created
		if len(smcps.Items) == 1 && smcps.Items[0].UID != meta.GetUID() {
			currentSmcp := smcps.Items[0].Spec
			if !currentSmcp.IsClusterScoped() && !newSmcp.IsGatewayController() {
				return append(allErrors, otherSmcpExists)
			}

return append(allErrors, validateOverlappingCaCertConfigMapNames(meta, newSmcp, &smcps, allErrors)...)
}
// do not allow more than 1 cluster-wide gateway controller
if newSmcp.IsGatewayController() && currentSmcp.IsGatewayController() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole if-clause is not needed anymore if you check the negative cases first

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
…ditions

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>
@dgn
Copy link
Contributor

dgn commented Feb 5, 2024

I'm not sure we really need to cover multitenant meshes- we could probably say they're incompatible with GW-API support. That would simplify things as we don't have to mix cluster-wide with multitenant in the same cluster

@jewertow jewertow closed this Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants