
Re-add handling of rename from official image "k8s.gcr.io/coredns" to "k8s.gcr.io/coredns/coredns" to support deprecated installations #114978

Closed
AndiDog wants to merge 1 commit

Conversation


@AndiDog AndiDog commented Jan 11, 2023

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

This handling was accidentally replaced in 50bea1d (PR #109938). By re-adding it, `kubeadm join` will no longer fail with `failed to pull image k8s.gcr.io/coredns:v1.8.6` for clusters that still have `k8s.gcr.io` explicitly configured as the image repository, such as CAPI-managed clusters.

Example failure that I'm facing:

  • The cluster was created via CAPI for an older Kubernetes version. The default value k8s.gcr.io got stored in the cluster (kubectl get configmap -n kube-system kubeadm-config -o yaml | grep imageRepository: # prints the k8s.gcr.io value)
  • The cluster is upgraded by CAPI to v1.23.15, or any other version that switched to registry.k8s.io. This means starting a new cloud server as a node and running kubeadm join on it.
  • kubeadm (buggy version v1.23.15) reads the in-cluster configuration, sees ClusterConfiguration.imageRepository=k8s.gcr.io and then fails with failed to pull image k8s.gcr.io/coredns:v1.8.6, because the renaming logic for official images is gone (see the sketch after this list)
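A minimal sketch of the rename handling this PR re-adds (simplified; function and parameter names are illustrative, not the exact kubeadm symbols):

package main

import "fmt"

// getDNSImage sketches kubeadm's CoreDNS image resolution with the
// re-added rename handling. Both official registries host CoreDNS under
// a "/coredns" subpath; custom repositories are left untouched.
func getDNSImage(imageRepository, coreDNSTag string) string {
	if imageRepository == "registry.k8s.io" || imageRepository == "k8s.gcr.io" {
		imageRepository += "/coredns"
	}
	return fmt.Sprintf("%s/coredns:%s", imageRepository, coreDNSTag)
}

func main() {
	fmt.Println(getDNSImage("k8s.gcr.io", "v1.8.6"))       // k8s.gcr.io/coredns/coredns:v1.8.6
	fmt.Println(getDNSImage("registry.k8s.io", "v1.8.6"))  // registry.k8s.io/coredns/coredns:v1.8.6
	fmt.Println(getDNSImage("private.repo.com", "v1.8.6")) // private.repo.com/coredns:v1.8.6
}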

Which issue(s) this PR fixes:

Did not create an issue

Context:

Special notes for your reviewer:

Maybe @dims, who authored the related migration, wants to have a look 😉

Does this PR introduce a user-facing change?

Fix `kubeadm join` pulls of the CoreDNS image for installations that still use the old registry `k8s.gcr.io`

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


… "k8s.gcr.io/coredns/coredns" to support deprecated installations

This was accidentally replaced in
kubernetes@50bea1d.
By re-adding it, `kubeadm join` will no longer fail with `failed to pull image
k8s.gcr.io/coredns:v1.8.6` for clusters that still have `k8s.gcr.io` explicitly
configured as image repository, such as CAPI-managed clusters.
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 11, 2023
@k8s-ci-robot
Contributor

Welcome @AndiDog!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 11, 2023
@k8s-ci-robot
Contributor

@AndiDog: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

Hi @AndiDog. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jan 11, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: AndiDog
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval by writing /assign @neolit123 in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 11, 2023
Member

@neolit123 neolit123 left a comment


Cluster got created via CAPI

capi has specific handling of the migration that they were working on recently. please contact the capi maintainers about your use case.

@@ -48,8 +48,9 @@ func GetDNSImage(cfg *kubeadmapi.ClusterConfiguration) string {
 	if cfg.DNS.ImageRepository != "" {
 		dnsImageRepository = cfg.DNS.ImageRepository
 	}
-	// Handle the renaming of the official image from "registry.k8s.io/coredns" to "registry.k8s.io/coredns/coredns
 	if dnsImageRepository == kubeadmapiv1beta3.DefaultImageRepository {
+	// Handle the renaming of the official image from "registry.k8s.io/coredns" to "registry.k8s.io/coredns/coredns"
Member

@neolit123 neolit123 Jan 11, 2023


this is actually not valid. the official image remains with a /coredns subpath. if the user has a custom repo the subpath is always absent.

Author


I don't understand what's not valid here.

$ docker pull k8s.gcr.io/coredns:v1.8.6
Trying to pull k8s.gcr.io/coredns:v1.8.6...
Error: initializing source docker://k8s.gcr.io/coredns:v1.8.6: reading manifest v1.8.6 in k8s.gcr.io/coredns: manifest unknown: Failed to fetch "v1.8.6" from request "/v2/coredns/manifests/v1.8.6".

$ docker pull k8s.gcr.io/coredns/coredns:v1.8.6
Trying to pull k8s.gcr.io/coredns/coredns:v1.8.6...
Getting image source signatures
Copying blob sha256:88efb86cbcab356a7db0ba9ea09694d394f2e088f3e5119aa331025c1e6cb7fa
[...]

$ docker pull registry.k8s.io/coredns:v1.8.6
Trying to pull registry.k8s.io/coredns:v1.8.6...
Error: initializing source docker://registry.k8s.io/coredns:v1.8.6: reading manifest v1.8.6 in registry.k8s.io/coredns: manifest unknown: Failed to fetch "v1.8.6"

$ docker pull registry.k8s.io/coredns/coredns:v1.8.6
Trying to pull registry.k8s.io/coredns/coredns:v1.8.6...
Getting image source signatures
Copying blob sha256:88efb86cbcab356a7db0ba9ea09694d394f2e088f3e5119aa331025c1e6cb7fa
[...]

So k8s.gcr.io and registry.k8s.io seem to behave the same, and therefore require the same workaround. Before 50bea1d, this exact code location did the rewrite for k8s.gcr.io. Or am I missing something?

Member


this code:
50bea1d#diff-54f02c1435b3faae5f85c0507e8ccb2d3986f080287d9de55473b6587e126fa9R51
is about appending /coredns to registry.k8s.io so that the location ends up as registry.k8s.io/coredns/coredns:foo and is the valid one.
for newer versions of kubeadm we should not track the old registry k8s.gcr.io.
at that point kubeadm considers k8s.gcr.io a "custom" registry and the coredns image location is k8s.gcr.io/coredns:foo
this is by design and was done for backwards compatibility once coredns maintainers introduced the /coredns subpath.
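For example (illustrative tag): with the default repository the CoreDNS image resolves to registry.k8s.io/coredns/coredns:v1.9.3, while a custom repository such as private.repo.com resolves to private.repo.com/coredns:v1.9.3, with no subpath added.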

@neolit123
Member

neolit123 commented Jan 11, 2023

for clusters that still have k8s.gcr.io explicitly configured as image repository, such as CAPI-managed clusters.

this should not be done as it will trip kubeadm upgrade. you should remove the explicit setting and let kubeadm swap to registry.k8s.io. this applies to kubeadm standalone, not sure about capi.

@AndiDog
Author

AndiDog commented Jan 11, 2023

Cluster got created via CAPI

capi has specific handling of the migration that they were working on recently. please contact the capi maintainers about your use case.

Yes, I had dug through all migration code related to registry.k8s.io. The code in CAPI only auto-migrates the CoreDNS image if the official repos (k8s.gcr.io / registry.k8s.io) are used. But since my use case has a custom CoreDNS installation, I don't actually want kubeadm to even pull CoreDNS (see kubernetes/kubeadm#2603 for skipping image pulls). While that bug is still open, we could fix the image pull attempts as suggested in this PR. So my use case isn't CAPI-specific: CAPI only delegates to kubeadm for joining a cluster and won't rewrite my custom CoreDNS image (e.g. private.repo.com/coredns).

for clusters that still have k8s.gcr.io explicitly configured as image repository, such as CAPI-managed clusters.

this should not be done as it will trip kubeadm upgrade. you should remove the explicit setting and let kubeadm swap to registry.k8s.io. this applies to kubeadm standalone, not sure about capi.

That would mean manual intervention on every managed cluster that has the explicit setting. If Kubernetes allows smooth, automatic upgrades of existing clusters from old versions, users will be more likely to upgrade. With a later major version, Kubernetes could then automatically replace any image pulls of k8s.gcr.io with registry.k8s.io instead of allowing the old registry or having to support migration code.

AndiDog added a commit to giantswarm/cluster-aws that referenced this pull request Jan 11, 2023
….15 tries to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades.
AndiDog added a commit to giantswarm/cluster-aws that referenced this pull request Jan 11, 2023
…s to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
@neolit123
Member

neolit123 commented Jan 11, 2023

see kubernetes/kubeadm#2603 for skipping image pulls

that is actually a complicated problem and all solutions had trade-offs, IIRC.

That would mean manual intervention on every managed cluster that has the explicit setting. If Kubernetes allows smooth, automatic upgrades of existing Kubernetes clusters with old versions, users will be more likely to upgrade. Then with a later major version, Kubernetes could automatically replace any image pulls for k8s.gcr.io with registry.k8s.io instead of allowing the old registry or having to support some migration code.

kubeadm (no capi) users had the following problem:

  • they heard about the k8s.gcr.io -> registry.k8s.io migration and pinned k8s.gcr.io in their ClusterConfiguration.imageRepository, to delay the migration
  • that is not going to work, since kubeadm upgrade/join/... (from a binary that has registry.k8s.io as the default) will see k8s.gcr.io and think that this is a "custom" / unknown registry
  • coredns image pulls will fail since k8s.gcr.io/coredns does not exist
  • the fix is for them to remove the k8s.gcr.io pin and let kubeadm migrate from old default -> new default registry

we want users to stop using k8s.gcr.io ASAP which means that the transition had to be immediate and not track the old registry at all. k8s.gcr.io is currently still generating traffic that costs a lot of $$.

@AndiDog
Author

AndiDog commented Jan 11, 2023

Fully agree with all the explanations! I didn't want to immediately suggest a more drastic change as a first-time contributor. Would it then make sense to enforce the migration by replacing k8s.gcr.io/coredns with registry.k8s.io/coredns/coredns? The image pull of k8s.gcr.io/coredns:TAG already fails right now, so we wouldn't make things worse. And the change could be scoped to the CoreDNS image for now in order to avoid any impact on cluster upgrades/joins. A sketch of what I mean follows below.
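A minimal sketch of that suggestion (hypothetical helper; kubeadm contains no such function):

package main

import (
	"fmt"
	"strings"
)

// migrateCoreDNSImage illustrates the suggested forced migration, scoped to
// the CoreDNS image only. Hypothetical code, not part of kubeadm.
func migrateCoreDNSImage(image string) string {
	const old = "k8s.gcr.io/coredns:"
	// Only the known-broken official location is rewritten; custom
	// repositories such as private.repo.com/coredns stay untouched.
	if strings.HasPrefix(image, old) {
		return "registry.k8s.io/coredns/coredns:" + strings.TrimPrefix(image, old)
	}
	return image
}

func main() {
	fmt.Println(migrateCoreDNSImage("k8s.gcr.io/coredns:v1.8.6"))       // registry.k8s.io/coredns/coredns:v1.8.6
	fmt.Println(migrateCoreDNSImage("private.repo.com/coredns:v1.8.6")) // unchanged
}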

@neolit123
Member

this branch now tracks .27. i am generally not in favour of keeping track of the old registry in kubeadm code for .27 or backporting more changes to .26.

of course, the other reviewers might have a different opinion.

AndiDog added a commit to giantswarm/cluster-aws that referenced this pull request Jan 11, 2023
…s to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
AndiDog added a commit to giantswarm/cluster-aws that referenced this pull request Jan 11, 2023
…s to pull the official image incorrectly (#197)

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
@pacoxu
Member

pacoxu commented Mar 13, 2023

kubernetes/kubeadm#2714 (comment)

  • If I understand correctly, you may try to use the old image registry by using a custom configuration like the one below:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  imageRepository: k8s.gcr.io/coredns
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
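Presumably this configuration would be applied at init time, e.g. kubeadm init --config kubeadm-config.yaml (file name assumed). The dns.imageRepository override makes kubeadm compose the working location k8s.gcr.io/coredns/coredns, since custom DNS repositories get no extra subpath appended.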

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2023
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 15, 2023
@AndiDog AndiDog closed this Jun 15, 2023