KCP >= v1.2.8 and >= v1.3.0 doesn't work with certain Kubernetes versions #7833

Closed · 4 tasks done
sbueringer opened this issue Jan 3, 2023 · 10 comments

Labels: kind/bug (Categorizes issue or PR as related to a bug.), triage/accepted (Indicates an issue or PR is ready to be actively worked on.)

Comments

sbueringer (Member) commented Jan 3, 2023

As discussed in #7768, the new KCP versions don't work with all Kubernetes / kubeadm versions. This issue only occurs if KCP.spec.kubeadmConfigSpec.clusterConfiguration.imageRepository is one of "", "k8s.gcr.io" or "registry.k8s.io" (i.e. custom registries are not affected).
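
For reference, here is a minimal sketch of where this field lives; the metadata values are placeholders, not taken from this issue:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: my-control-plane   # hypothetical name
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      # Affected values: unset / "" / "k8s.gcr.io" / "registry.k8s.io".
      # A custom registry (e.g. my.registry.example.com) is not affected.
      imageRepository: ""
```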

Current behavior

kubeadm:

  • kubeadm init uses its embedded default registry and uploads it to the kubeadm-config ConfigMap in the workload cluster
  • kubeadm join uses the registry from the kubeadm-config ConfigMap
  • Both kubeadm init and kubeadm join run preflight checks which verify that the relevant images (including CoreDNS) can be pulled (on both control plane and worker machines)
  • If the imageRepository in the kubeadm-config ConfigMap (either .dns.imageRepository or, as fallback, .imageRepository) is equal to the default registry embedded in kubeadm, kubeadm will
    use <registry>/coredns as the imageRepository for the CoreDNS image (i.e. it will pull from e.g. registry.k8s.io/coredns/coredns); see the sketch below
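
A sketch of the relevant parts of the kubeadm-config ConfigMap; the field names are kubeadm's, the concrete values are illustrative, and the ClusterConfiguration API version varies by kubeadm release:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadm-config
  namespace: kube-system
data:
  ClusterConfiguration: |
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    imageRepository: registry.k8s.io   # fallback for all images, including CoreDNS
    dns:
      imageRepository: ""              # if set, takes precedence for the CoreDNS image
    # (other fields omitted)
```

If the effective repository equals kubeadm's embedded default, kubeadm pulls <registry>/coredns/coredns:<tag>; for any other repository it pulls <registry>/coredns:<tag>.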

KCP:

  • KCP will automatically change the imageRepository in the kubeadm-config ConfigMap to registry.k8s.io for Kubernetes >= v1.22.0 and < v1.26.0 (unless an imageRepository has been explicitly set
    in KCP.spec.kubeadmConfigSpec.clusterConfiguration)
    • The code which modifies the ConfigMap is only triggered during a KCP rollout (this includes any kind of change which requires a rollout; a Kubernetes upgrade is just one case)
    • There is similar code which modifies the kube-proxy DaemonSet; it runs during every reconcile.

CAPD:

  • CAPD ignores all kubeadm preflight errors

Error cases

Error case 1: Pinning to the wrong default registry (occurred in CAPD: Cluster API v1.1 upgrade v1.23.15 => v1.24.9)

(job)

Explanation:

  • The test fails because the conformance test detects that the CoreDNS Pods are not ready due to image pull errors.
  • Both kubeadm v1.23.15 and v1.24.9 use the new registry as default.
  • The problem was that we "pinned" the registry to k8s.gcr.io in KCP.spec.kubeadmConfigSpec.clusterConfiguration
  • Pinning the registry to one of the default registries (k8s.gcr.io & registry.k8s.io) which is not the default registry of the kubeadm binary is not supported
  • Because of that, kubeadm init did not use <registry>/coredns as the imageRepository for CoreDNS, thus the CoreDNS Deployment had the k8s.gcr.io/coredns:v1.8.6 image, which doesn't
    exist (<registry>/coredns/coredns:v1.8.6 would have been correct); see the sketch below
  • Usually kubeadm init would already have failed, but that didn't happen because CAPD ignores all kubeadm preflight errors.
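
For illustration, the relevant container spec of the resulting CoreDNS Deployment in kube-system would have looked roughly like this (a sketch, not copied from the failing cluster):

```yaml
containers:
- name: coredns
  # Broken: this tag only exists under the coredns/coredns sub-path.
  image: k8s.gcr.io/coredns:v1.8.6
  # Correct would have been:
  # image: k8s.gcr.io/coredns/coredns:v1.8.6
```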

Solution:

Error case 2: Upgrade to a Kubernetes version >= v1.22.0 which still has a kubeadm binary with the old default registry

Example: Upgrade from v1.21.14 to v1.22.16 (imageRepository is not set in KCP.spec.kubeadmConfigSpec.clusterConfiguration)

Explanation:

  • This case will fail as soon as the first v1.22.16 node is joined
  • Both kubeadm v1.21.14 and v1.22.16 use the old registry as default.
  • The following happens:
    • kubeadm init uses the embedded k8s.gcr.io imageRepository and uploads it to the kubeadm-config ConfigMap
    • Once the version is changed to v1.22.16, KCP migrates the registry in the ConfigMap to registry.k8s.io
    • Subsequent kubeadm joins fail with a preflight error because the kubeadm binary only handles the CoreDNS imageRepository correctly for the k8s.gcr.io registry (see the sketch below)
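
To make the mismatch concrete, this is roughly the state the joining kubeadm v1.22.16 binary sees after KCP's migration (a sketch):

```yaml
# Inside the ClusterConfiguration stored in the kubeadm-config ConfigMap:
imageRepository: registry.k8s.io
# kubeadm v1.22.16 only special-cases its own embedded default (k8s.gcr.io),
# so it computes the CoreDNS image as registry.k8s.io/coredns:<tag> instead of
# registry.k8s.io/coredns/coredns:<tag>, and the image-pull preflight check fails.
```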

Solution:

Notes:

  • The "CAPZ: Cluster API v1.3 upgrade v1.23.13 => v1.24.7" e2e test was failing because of this (job).
  • The "CAPA: Cluster API v1.3 CSI Migration upgrade v1.22.4 => v1.23.3" e2e test was failing because of this (job)
  • The "CAPM3: Cluster API v1.2.8 upgrade v1.23.8 => v1.24.1" e2e test was failing because of this (issue comment)

Current state: Compatibility of KCP >= v1.2.8 & >= v1.3.0 with Kubernetes / kubeadm

| Kubernetes / kubeadm version | kubeadm default registry | registry that KCP sets | Works |
| --- | --- | --- | --- |
| <= v1.21.x | k8s.gcr.io | - | ✔️ |
| v1.22.0 - v1.22.16 | k8s.gcr.io | registry.k8s.io | ❌ |
| >= v1.22.17 | registry.k8s.io | registry.k8s.io | ✔️ |
| v1.23.0 - v1.23.14 | k8s.gcr.io | registry.k8s.io | ❌ |
| >= v1.23.15 | registry.k8s.io | registry.k8s.io | ✔️ |
| v1.24.0 - v1.24.8 | k8s.gcr.io | registry.k8s.io | ❌ |
| >= v1.24.9 | registry.k8s.io | registry.k8s.io | ✔️ |
| >= v1.25.0 | registry.k8s.io | registry.k8s.io | ✔️ |
| >= v1.26.0 | registry.k8s.io | - | ✔️ |

tl;dr: KCP is broken for all v1.22.x, v1.23.x and v1.24.x kubeadm versions which still have the old default registry.
The error occurs whenever a new Machine is joined after KCP sets the new registry in the kubeadm-config ConfigMap (which happens whenever a rollout is needed; a Kubernetes upgrade is just one case).

Background information

Kubeadm default registries:

  • registry.k8s.io: >= v1.22.17, >= v1.23.15, >= v1.24.9, >= v1.25.0
  • k8s.gcr.io: all older kubeadm versions

CoreDNS images available: (ignoring all versions < v1.6.0)

  • k8s.gcr.io/coredns & registry.k8s.io/coredns: 1.6.2, 1.6.5, 1.6.6, 1.6.7, 1.7.0
  • k8s.gcr.io/coredns/coredns & registry.k8s.io/coredns/coredns: v1.6.6, v1.6.7, v1.6.9, v1.7.0, v1.7.1, v1.8.0, v1.8.3, v1.8.4, v1.8.5, v1.8.6, v1.9.3, v1.9.4, v1.10.0

/kind bug

k8s-ci-robot added the kind/bug and needs-triage labels on Jan 3, 2023
sbueringer changed the title from "[WIP] Cluster API >= v1.2.8 and >= v1.3.0 doesn't work with certain Kubernetes versions" to "KCP >= v1.2.8 and >= v1.3.0 doesn't work with certain Kubernetes versions" on Jan 3, 2023
sbueringer (Member, Author) commented Jan 3, 2023

cc @fabriziopandini @ykakarap @killianmuldoon @CecileRobertMichon @jackfrancis @furkatgofurov7 @oscr @Ankitasw @chrischdi @lentzi90 (just cc'ed everyone from the old issue in case this one is relevant for you as well)

Please let me know if you see error cases not covered above.

fabriziopandini (Member) commented

/triage accepted
Thanks for this summary! This is definitely something we should fix IMO.

k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Jan 3, 2023
sbueringer (Member, Author) commented

/assign

sbueringer (Member, Author) commented

The PR has been merged. Now waiting for CI, then merging the cherry-picks as well:

sbueringer (Member, Author) commented Jan 12, 2023

One further follow-up is this PR: #7914

Essentially:

  • the topology controller was not waiting until KCP was entirely stable before triggering a rollout of the control plane
  • the test framework wasn't waiting until KCP was entirely stable
    • in the upgrade test this means the upgrade is triggered while KCP is still not stable from the initial creation

sbueringer (Member, Author) commented Jan 12, 2023

Another follow-up is #7915

During triaging we found that we have an unexpected double rollout in our Cluster upgrade test (more details in the linked issue).

sbueringer (Member, Author) commented Jan 12, 2023

The last known remaining edge case is the following:

Example:

  • Let's assume we create a Cluster with v1.24.7 (kubeadm v1.24.7 is using the old registry)
  • KCP has 3 replicas, the first two are already ready and the last one is currently being created
  • At this moment the infrastructure provider is still creating the server (notably: kubeadm join has not run yet)
  • Now someone modifies KCP to change the version to v1.25.3 (kubeadm v1.25.3 is using the new registry)
  • The KCP controller will modify the kubeadm-config ConfigMap in the workload cluster to change the imageRepository to registry.k8s.io
  • Our 3rd KCP replica is now executing kubeadm join. The kubeadm join (with kubeadm v1.24.7) will fail because of a registry mismatch:
    • kubeadm itself has handling for the old registry
    • the kubeadm-config ConfigMap has the new registry
    • The result is that kubeadm join fails in the ImagePull preflight check because it tries to pull the CoreDNS image from registry.k8s.io/coredns instead of registry.k8s.io/coredns/coredns

So tl;dr concurrently joining a Machine with a version using the old registry while upgrading KCP to a version with the new registry will lead to a failed kubeadm join of that machine.

We won't fix this edge case because:

  • It only occurs when a Machine is joined while KCP / the cluster is concurrently being migrated to the new registry
  • This is mitigated by the follow-ups above, i.e. by waiting until KCP is stable before triggering a rollout
  • In general, kubeadm doesn't officially support joining with an older minor version (v1.24.7 vs v1.25.3). In this case the join only fails because of the registry migration, but it's still not supported in general.

sbueringer (Member, Author) commented

> So tl;dr concurrently joining a Machine with a version using the old registry while upgrading KCP to a version with the new registry will lead to a failed kubeadm join of that machine.

Focusing on this: it is recommended to avoid bumping the KCP version while KCP is not stable (e.g. while a Machine is in the process of joining).
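
A hypothetical illustration of a KubeadmControlPlane status that looks stable; the field names exist on KCP, but treating these exact values as the stability criterion is an assumption here, not something stated in this issue:

```yaml
status:
  replicas: 3
  updatedReplicas: 3
  readyReplicas: 3
  unavailableReplicas: 0
  # Looks stable, but due to the stale-cache problem described below this
  # status can lag behind the actual state of the control plane.
```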

With ClusterClass, we try to avoid this by waiting for KCP to be stable before triggering upgrades. This is correctly implemented on the Cluster topology controller side, but unfortunately, due to a stale cache, the KCP controller can write inconsistent status information to a KCP object, i.e. the KCP object looks stable when in fact it isn't.

We're currently trying to figure out how to improve patchHelper, or how we call patchHelper, to fix this issue.
As far as we can tell, what we're observing is the same as described in this issue: #7594

Stay tuned! :)

sbueringer (Member, Author) commented

Let's close this issue, as the issue itself is resolved and we had no further reports.

We'll follow up in another issue regarding the patch helper.

/close

k8s-ci-robot (Contributor) commented

@sbueringer: Closing this issue.

In response to this:

> Let's close this issue, as the issue itself is resolved and we had no further reports.
>
> We'll follow up in another issue regarding the patch helper.
>
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
