
[Bare Metal] Scale out of workload cluster worker node group causes control plane to roll #7993

Closed
jacobweinstock opened this issue Apr 15, 2024 · 3 comments

@jacobweinstock
Member

What happened:
I have a single node workload cluster. I added exactly one additional machine to my hardware.csv in order to add one worker node group with one worker node to the cluster. When I run eksctl anywhere upgrade cluster, EKS Anywhere starts to roll my single control plane node. As I don't have any spare hardware, the cluster does not upgrade and is left in an unmanageable state, meaning I cannot perform any other cluster lifecycle phases with the CLI.
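
For context, the added entry follows the standard EKS Anywhere bare metal hardware.csv columns; the row below is an illustrative sketch, not my actual values:

hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
worker-01,10.10.44.2,root,REDACTED,CC:48:3A:00:00:02,10.10.50.3,255.255.254.0,10.10.50.1,8.8.8.8,type=worker,/dev/sda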

This potentially has something to do with #7991 .

What you expected to happen:
The control plane should not roll. The cluster should not become unmanageable.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • EKS Anywhere Release: v0.19.2
  • EKS Distro Release:
@jacobweinstock
Member Author

Update:
When a single node cluster is created, the single node is configured so that workload pods are permitted to run on it, meaning there is no taint or label prohibiting this. Specifically, neither the taint node-role.kubernetes.io/control-plane:NoSchedule nor the label node-role.kubernetes.io/control-plane= is applied.

When scaling out a single node cluster with a worker node group, the control plane node "reverts" to being just a control plane node: the taint node-role.kubernetes.io/control-plane:NoSchedule and the label node-role.kubernetes.io/control-plane= are added back to the control plane spec. CAPI sees this as a spec change and triggers a rollout: "Rolling out Control Plane machines: Machine [machine object] needs rollout: Machine InitConfiguration or JoinConfiguration are outdated"
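
That rollout message comes from the KubeadmControlPlane controller. Assuming a standard clusterctl-managed CAPI installation, it should be visible in the controller manager's logs:

kubectl -n capi-kubeadm-control-plane-system logs deploy/capi-kubeadm-control-plane-controller-manager | grep "needs rollout"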

In the code/spec this is the difference between:

Single node cluster control plane spec (kubectl get kcp -o yaml):

initConfiguration:
  localAPIEndpoint: {}
  nodeRegistration:
    imagePullPolicy: IfNotPresent
    kubeletExtraArgs:
      anonymous-auth: "false"
      provider-id: PROVIDER_ID
      read-only-port: "0"
      tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    taints: []
joinConfiguration:
  bottlerocketAdmin: {}
  bottlerocketBootstrap: {}
  bottlerocketControl: {}
  discovery: {}
  nodeRegistration:
    ignorePreflightErrors:
    - DirAvailable--etc-kubernetes-manifests
    imagePullPolicy: IfNotPresent
    kubeletExtraArgs:
      anonymous-auth: "false"
      provider-id: PROVIDER_ID
      read-only-port: "0"
      tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    taints: []
  pause: {}
  proxy: {}
  registryMirror: {}

Single node cluster control plane spec scaled out with a worker node group (kubectl get kcp -o yaml):

initConfiguration:
  localAPIEndpoint: {}
  nodeRegistration:
    imagePullPolicy: IfNotPresent
    kubeletExtraArgs:
      anonymous-auth: "false"
      provider-id: PROVIDER_ID
      read-only-port: "0"
      tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
joinConfiguration:
  bottlerocketAdmin: {}
  bottlerocketBootstrap: {}
  bottlerocketControl: {}
  discovery: {}
  nodeRegistration:
    ignorePreflightErrors:
    - DirAvailable--etc-kubernetes-manifests
    imagePullPolicy: IfNotPresent
    kubeletExtraArgs:
      anonymous-auth: "false"
      provider-id: PROVIDER_ID
      read-only-port: "0"
      tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  pause: {}
  proxy: {}
  registryMirror: {}

Diff:

--- singleNode.yaml     2024-04-18 12:56:22.396824833 -0600
+++ scaledout.yaml      2024-04-18 12:56:31.960841460 -0600
@@ -7,7 +7,6 @@
       provider-id: PROVIDER_ID
       read-only-port: "0"
       tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
-    taints: []
 joinConfiguration:
   bottlerocketAdmin: {}
   bottlerocketBootstrap: {}
@@ -22,7 +21,6 @@
       provider-id: PROVIDER_ID
       read-only-port: "0"
       tls-cipher-suites: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
-    taints: []
   pause: {}
   proxy: {}
   registryMirror: {}

In code, this is the difference between a nil slice and an empty slice of []corev1.Taint. Ref: https://github.com/abhay-krishna/cluster-api/blob/f2c51dfbb9cd4dc60718f1ea7218b2ebe43bd0b3/bootstrap/kubeadm/api/v1beta1/kubeadm_types.go#L387
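
To illustrate, here is a minimal, self-contained Go sketch of why that distinction produces a spec diff. The types below are hypothetical stand-ins, not the real CAPI API (the real field is the Taints slice on NodeRegistrationOptions at the link above):

package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// taint and nodeRegistration are hypothetical stand-ins for
// corev1.Taint and CAPI's NodeRegistrationOptions.
type taint struct {
	Key    string `json:"key"`
	Effect string `json:"effect"`
}

type nodeRegistration struct {
	// No omitempty: a nil slice and an empty slice serialize differently.
	Taints []taint `json:"taints"`
}

func main() {
	// Single node cluster: an explicit empty slice tells kubeadm
	// not to apply the default control-plane taint.
	singleNode := nodeRegistration{Taints: []taint{}}

	// Scaled out with a worker node group: the field is left nil,
	// so kubeadm falls back to the default control-plane taint.
	scaledOut := nodeRegistration{Taints: nil}

	a, _ := json.Marshal(singleNode)
	b, _ := json.Marshal(scaledOut)
	fmt.Println(string(a)) // {"taints":[]}
	fmt.Println(string(b)) // {"taints":null}

	// The two specs are not equal, which is the kind of difference
	// that makes KCP decide the machines need a rollout.
	fmt.Println(reflect.DeepEqual(singleNode, scaledOut)) // false
}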

@jacobweinstock
Member Author

I also observed that the eksctl anywhere upgrade cluster command fails at the command line, even though the new worker node does make it into the cluster:

kubectl get nodes
NAME                STATUS   ROLES           AGE    VERSION
<new worker node>   Ready    <none>          145m   v1.27.4-eks-cedffd4
<original CP node>  Ready    control-plane   21h    v1.27.4-eks-cedffd4

The workload cluster is left in a bad state, though, and subsequent lifecycle commands fail with:

❌ Validation failed	{"validation": "control plane ready", "error": "1 control plane replicas are unavailable", "remediation": "ensure control plane nodes and pods for cluster workload-test are Ready"}

This happens because the control plane needs to be rolled but there is no available hardware:

kubectl get kcp -n eksa-system workload-cluster
NAME               CLUSTER            INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
workload-cluster   workload-cluster   true          true                   2          1       1         1              4h   v1.27.11-eks-1-27-25

@jacobweinstock
Member Author

Final update:

It appears that versions of EKS Anywhere before v0.19 did not have this behavior and would not roll control plane nodes when adding or removing the only worker node group configuration. That, for better or worse, was a bug rather than the intended behavior, and v0.19 has "fixed" it. We discussed this internally and will not be pursuing any change to the current behavior; there will be some doc updates to make this clear. One last note on why we won't be changing it (this will be in the docs too): going from a control-plane-only cluster to a cluster with worker node(s) changes the nature and fundamental behavior of the control plane nodes, and a change like that has significant internal code, behavior, and spec consequences. These are among the reasons we have decided not to pursue a change to the current v0.19 behavior.
