Similar issue #2656 (AKS Cluster must have at least one system pool) happens on the latest version of capz. #4341

Closed
kstreee-furiosa opened this issue Dec 6, 2023 · 9 comments · Fixed by #4392

@kstreee-furiosa

kstreee-furiosa commented Dec 6, 2023

/kind bug

[Before submitting an issue, have you checked the Troubleshooting Guide?]

What steps did you take and what happened:
An issue similar to #2656 occurs on the latest version of capz.

It happens while running clusterctl move --to-kubeconfig=...:

Performing move...
Discovering Cluster API objects
Moving Cluster API objects Clusters=1
Moving Cluster API objects ClusterClasses=0
Waiting for all resources to be ready to move
Creating objects in the target cluster
Deleting objects from the source cluster
Error: action failed after 10 attempts: error deleting "infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool" k8s/k8s-pool0: admission webhook "validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool

What did you expect to happen:

clusterctl move --to-kubeconfig=... should complete successfully.

Anything else you would like to add:
The same move succeeded on capz version "v1.11.6".

Environment:

host cluster (kind) version: 1.27.3

clusterctl version:

Fetching providers
Installing cert-manager Version="v1.13.2"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.6.0" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.6.0" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.6.0" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-azure" Version="v1.12.0" TargetNamespace="capz-system"
@k8s-ci-robot added the kind/bug label (Categorizes issue or PR as related to a bug.) on Dec 6, 2023
@CecileRobertMichon
Contributor

It seems that #3426 may not be working as we think...

Do you see the annotation "clusterctl.cluster.x-k8s.io/delete-for-move" on the owner Cluster when the move is happening?

@CecileRobertMichon
Contributor

@nawazkh @willie-yao would one of you be able to look into this one?

@CecileRobertMichon
Contributor

Related: #4157

@nawazkh
Member

nawazkh commented Dec 11, 2023

/assign

@willie-yao
Contributor

@nawazkh Sorry I got to this a bit late. Let me know if you want to pair on this!

@nawazkh
Member

nawazkh commented Dec 14, 2023

> @nawazkh Sorry I got to this a bit late. Let me know if you want to pair on this!

Thanks @willie-yao ! I will reach out if needed! :)

@nawazkh
Member

nawazkh commented Dec 14, 2023

I am able to reproduce this locally.

❯ clusterctl -v 5 move --to-kubeconfig capi-quickstart-config.config
No default config file available
Performing move...
Discovering Cluster API objects
ManagedCluster Count=1
ConfigMap Count=1
AzureManagedMachinePool Count=2
ResourceGroup Count=1
Secret Count=5
Cluster Count=1
MachinePool Count=2
AzureManagedCluster Count=1
AzureManagedControlPlane Count=1
AzureClusterIdentity Count=1
ManagedClustersAgentPool Count=2
Total objects Count=18
Excluding secret from move (not linked with any Cluster) name="cluster-identity-secret"
Moving Cluster API objects Clusters=1
Moving Cluster API objects ClusterClasses=0
Pausing the source cluster
Set Cluster.Spec.Paused Paused=true Cluster="capi-quickstart" Namespace="default"
Pausing the source cluster classes
Creating target namespaces, if missing
Creating objects in the target cluster
Creating AzureClusterIdentity="cluster-identity" Namespace="default"
Creating Cluster="capi-quickstart" Namespace="default"
Creating MachinePool="capi-quickstart-pool1" Namespace="default"
Creating AzureManagedControlPlane="capi-quickstart" Namespace="default"
Creating MachinePool="capi-quickstart-pool0" Namespace="default"
Creating AzureManagedCluster="capi-quickstart" Namespace="default"
Creating ResourceGroup="capi-quickstart" Namespace="default"
Creating AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Creating AzureManagedMachinePool="capi-quickstart-pool1" Namespace="default"
Creating Secret="capi-quickstart-aso-secret" Namespace="default"
Creating Secret="capi-quickstart-kubeconfig" Namespace="default"
Creating Secret="capi-quickstart-ca" Namespace="default"
Creating ManagedCluster="capi-quickstart" Namespace="default"
Creating ManagedClustersAgentPool="capi-quickstart-pool0" Namespace="default"
Creating Secret="capi-quickstart-aso-kubeconfig" Namespace="default"
Creating ManagedClustersAgentPool="capi-quickstart-pool1" Namespace="default"
Deleting objects from the source cluster
Deleting ManagedClustersAgentPool="capi-quickstart-pool0" Namespace="default"
Deleting Secret="capi-quickstart-aso-kubeconfig" Namespace="default"
Deleting ManagedClustersAgentPool="capi-quickstart-pool1" Namespace="default"
Deleting ManagedCluster="capi-quickstart" Namespace="default"
Deleting ResourceGroup="capi-quickstart" Namespace="default"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Retrying with backoff Cause="error deleting \"infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool\" default/capi-quickstart-pool0: admission webhook \"validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io\" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool"
Deleting AzureManagedMachinePool="capi-quickstart-pool0" Namespace="default"
Deleting AzureManagedMachinePool="capi-quickstart-pool1" Namespace="default"
Deleting Secret="capi-quickstart-aso-secret" Namespace="default"
Deleting Secret="capi-quickstart-kubeconfig" Namespace="default"
Deleting Secret="capi-quickstart-ca" Namespace="default"
Error: action failed after 10 attempts: error deleting "infrastructure.cluster.x-k8s.io/v1beta1, Kind=AzureManagedMachinePool" default/capi-quickstart-pool0: admission webhook "validation.azuremanagedmachinepools.infrastructure.cluster.x-k8s.io" denied the request: if the delete is triggered via owner MachinePool please refer to trouble shooting section in https://capz.sigs.k8s.io/topics/managedcluster.html: AKS Cluster must have at least one system pool

Update: probing further.

@nawazkh
Member

nawazkh commented Dec 14, 2023

I figured out why this is happening.
In CAPZ, the webhook path ValidateDelete() -> validateLastSystemNodePool() checks the ownerCluster (i.e. clusters.cluster.x-k8s.io) for the clusterctlv1.DeleteForMoveAnnotation annotation instead of checking the annotations on the AzureManagedMachinePool itself.

So either we update the check in CAPZ's webhook

// checking if the Cluster is going to be deleted for clusterctl move operation
if _, found := ownerCluster.Annotations[clusterctlv1.DeleteForMoveAnnotation]; found {
	return nil
}

so that it looks at the AzureManagedMachinePool's own annotations.

OR

Update CAPI's move operation so that it also sets the annotation on the owner Cluster (in here)

In clusterctl's current implementation, sourceObj in the CAPI code snippet is of type AzureManagedMachinePool, not Cluster. Therefore DeleteForMoveAnnotation gets applied at the AzureManagedMachinePool level.
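
To make the mismatch concrete, here is a minimal, self-contained model of the behaviour described above. It is only a sketch: the object type and the annotateForMove and isDeleteForMove helpers are hypothetical stand-ins and not clusterctl or CAPZ code. Only the annotation key and the idea of checking it on either the owner Cluster or the pool come from this thread.

```go
package main

import "fmt"

// Annotation key quoted earlier in this thread.
const deleteForMoveAnnotation = "clusterctl.cluster.x-k8s.io/delete-for-move"

// object is a hypothetical stand-in for a Kubernetes object's metadata.
type object struct {
	Kind        string
	Name        string
	Annotations map[string]string
}

// annotateForMove models the behaviour described above: the annotation is
// set on the object being deleted (sourceObj), not on its owner Cluster.
func annotateForMove(sourceObj *object) {
	if sourceObj.Annotations == nil {
		sourceObj.Annotations = map[string]string{}
	}
	sourceObj.Annotations[deleteForMoveAnnotation] = ""
}

// isDeleteForMove reports whether obj carries the annotation,
// regardless of the annotation's value.
func isDeleteForMove(obj *object) bool {
	_, found := obj.Annotations[deleteForMoveAnnotation]
	return found
}

func main() {
	cluster := &object{Kind: "Cluster", Name: "capi-quickstart"}
	pool := &object{Kind: "AzureManagedMachinePool", Name: "capi-quickstart-pool0"}

	// Assumed to mirror what the move does right before deleting the pool.
	annotateForMove(pool)

	fmt.Println("check on owner Cluster passes:", isDeleteForMove(cluster))
	fmt.Println("check on AzureManagedMachinePool passes:", isDeleteForMove(pool))
}
```

In this model the check against the owner Cluster reports false while the check against the pool reports true, which matches the webhook rejecting the delete even though the move has annotated the object it is deleting.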

@nawazkh
Member

nawazkh commented Dec 14, 2023

Looking at the original PR that added DeleteForMoveAnnotation in CAPI, kubernetes-sigs/cluster-api#8322:

Add move annotation on objects those are going to be deleted for cluster move operation.

My reading is that the author wanted DeleteForMoveAnnotation to be added to every object that is deleted as part of the move.
(Annotating all the (sub)resources, rather than only the top-level resource, also seems like the more sensible design.)
Therefore, I think the better fix is to update CAPZ's webhook to check for DeleteForMoveAnnotation at the AzureManagedMachinePool level.
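
A rough sketch of what the proposed check could look like is below. It is not the actual CAPZ webhook: managedPool, validateDelete, and the otherSystemPools parameter are hypothetical simplifications, and marking the example pool as a system pool is an assumption. Only the annotation key and the error text come from this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Annotation key quoted earlier in this thread.
const deleteForMoveAnnotation = "clusterctl.cluster.x-k8s.io/delete-for-move"

// managedPool is a hypothetical, simplified stand-in for AzureManagedMachinePool.
type managedPool struct {
	Name        string
	SystemMode  bool
	Annotations map[string]string
}

var errLastSystemPool = errors.New("AKS Cluster must have at least one system pool")

// validateDelete sketches the proposed order of checks: consult the pool's
// own annotations first, and only then enforce the last-system-pool rule.
// otherSystemPools is the number of other system pools left in the cluster.
func validateDelete(pool managedPool, otherSystemPools int) error {
	if _, found := pool.Annotations[deleteForMoveAnnotation]; found {
		return nil // the delete is part of a clusterctl move
	}
	if pool.SystemMode && otherSystemPools == 0 {
		return errLastSystemPool
	}
	return nil
}

func main() {
	// Assumption: this pool is the only system pool in the cluster.
	pool := managedPool{Name: "capi-quickstart-pool0", SystemMode: true}
	fmt.Println("plain delete:", validateDelete(pool, 0))

	pool.Annotations = map[string]string{deleteForMoveAnnotation: ""}
	fmt.Println("delete during move:", validateDelete(pool, 0))
}
```

Checking the annotation before the system-pool rule keeps ordinary deletes protected while letting a move remove the last system pool from the source cluster.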
