switch the default kubeadm image registry to registry.k8s.io #2671

Closed · 5 tasks done
neolit123 opened this issue Mar 24, 2022 · 28 comments
Assignees: neolit123
Labels: kind/deprecation, kind/feature, lifecycle/active, priority/important-longterm

Comments


neolit123 commented Mar 24, 2022

The k8s project is moving away from k8s.gcr.io to registry.k8s.io.

  • registry.k8s.io is currently redirecting to k8s.gcr.io
  • in 1.25 we should switch kubeadm's code to use registry.k8s.io by default.
  • k8s.gcr.io will continue working (for some time) by redirecting to registry.k8s.io

https://groups.google.com/g/kubernetes-sig-testing/c/U7b_im9vRrM/m/7qywJeUTBQAJ
xref kubernetes/k8s.io#1834

1.25

1.26

neolit123 added the priority/important-longterm, kind/feature, kind/deprecation labels on Mar 24, 2022
neolit123 added this to the v1.25 milestone on Mar 24, 2022

SataQiu commented Mar 27, 2022

/cc

neolit123 commented:

PR for this in 1.25:
kubernetes/kubernetes#109938


neolit123 commented Jun 2, 2022

kubernetes/kubernetes#109938 broke our e2e tests for 'latest':
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm

I think what's happening is the following. A cluster is created with the older version of kubeadm, and clusterconfiguration.imageRepository is defaulted to "k8s.gcr.io". Then, when the upgrade happens, the new kubeadm binary thinks that "k8s.gcr.io" is a custom repository. Kinder is not pre-pulling the right images.

For kubeadm we need to mutate the image repo field to registry.k8s.io.

For kinder we need to ensure it uses the output of "kubeadm config images" for prepull (might be the case already).
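
A minimal way to see the mismatch on a live cluster (a diagnostic sketch, assuming kubectl access to a kubeadm-created cluster):

# What the existing cluster was initialized with; kubeadm stores the fully
# defaulted ClusterConfiguration in the kubeadm-config ConfigMap, so an old
# cluster should show imageRepository: k8s.gcr.io here.
kubectl -n kube-system get configmap kubeadm-config \
  -o jsonpath='{.data.ClusterConfiguration}' | grep imageRepository

# What the new kubeadm binary defaults to.
kubeadm config images list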


neolit123 commented Jun 2, 2022

First thing we have to do is here:
kubernetes/kubernetes#110343

Once it merges I can try debugging kinder workflows.
EDIT: but looking at the logs we might be OK without kinder changes, since it prepares registry.k8s.io images for the upgrade step; so as long as clusterconfiguration.imageRepository has registry.k8s.io, it should be good.


kubernetes/kubernetes#110343 merged and I was expecting it to be a sufficient fix, but it seems like https://storage.googleapis.com/k8s-release-dev/ci/latest.txt is not pointing to the latest CI build, thus the jobs are continuing to fail.

Notified #release-management and #release-ci-signal on Slack:
https://kubernetes.slack.com/archives/CN0K3TE2C/p1654258115320499
https://kubernetes.slack.com/archives/CJH2GBF7Y/p1654257907646369
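
For reference, checking what that marker currently resolves to is a one-liner against the URL above:

# Print the build the CI "latest" marker points to; if this lags the most
# recent merges, jobs consuming it keep using stale binaries and images.
curl -s https://storage.googleapis.com/k8s-release-dev/ci/latest.txt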


pacoxu commented Jun 7, 2022

time="11:29:00" level=info msg="Downloading https://storage.googleapis.com/k8s-release-dev/ci/v1.24.2-rc.0.10+a6b031e314ef2e/bin/linux/amd64/kube-apiserver.tar\n"
time="11:33:13" level=info msg="fixing /private/var/folders/g7/ywncky4109zfc_6v5ww1fc2w0000gn/T/kinder-alter-image3586133515/bits/init/kube-apiserver.tar"
fixed: k8s.gcr.io/kube-apiserver-amd64 -> k8s.gcr.io/kube-apiserver
fixed: k8s.gcr.io/kube-apiserver-amd64 -> k8s.gcr.io/kube-apiserver

During kubeadm init with latest kubeadm, it will try registry.k8s.io as the image repo.

[root@paco ~]# ./kubeadm-v1.25 config images list --kubernetes-version=v1.24.2-rc.0.10+a6b031e314ef2e
registry.k8s.io/kube-apiserver:v1.24.2-rc.0.10_a6b031e314ef2e
registry.k8s.io/kube-controller-manager:v1.24.2-rc.0.10_a6b031e314ef2e
registry.k8s.io/kube-scheduler:v1.24.2-rc.0.10_a6b031e314ef2e
registry.k8s.io/kube-proxy:v1.24.2-rc.0.10_a6b031e314ef2e
registry.k8s.io/pause:3.7
registry.k8s.io/etcd:3.5.3-0
registry.k8s.io/coredns/coredns:v1.8.6


neolit123 commented Jun 7, 2022

That's actually tricky to fix, because the images are created from tars.
Given kubeadm was downloaded with the tars and is already on the node, we could execute "kubeadm config images" and check what the default repository for the "kubeadm init" binary is. Based on that, "fix" the repo to be "registry.k8s.io" and not "k8s.gcr.io".

if err := fixImageTar(v); err != nil {

Alternatively, we could just add a workaround for the ci-kubernetes-e2e-kubeadm-kinder-latest-on-1-24 job, such that we introduce a new "task" in the kinder workflow to execute on the nodes and re-tag / alias the images from k8s.gcr.io -> registry.k8s.io.
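
A sketch of what such a task could run on each node, assuming Docker as the node runtime (a containerd variant is sketched further down); tagging only adds a second name, nothing is re-pulled:

# Alias every k8s.gcr.io image already on the node under the
# registry.k8s.io name, so the newer kubeadm finds the images it expects.
for img in $(docker images --format '{{.Repository}}:{{.Tag}}' | grep '^k8s\.gcr\.io/'); do
  docker tag "$img" "${img/k8s.gcr.io/registry.k8s.io}"
done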


pacoxu commented Jun 7, 2022

Alternatively, we could just add a workaround for the ci-kubernetes-e2e-kubeadm-kinder-latest-on-1-24 job, such that we introduce a new "task" in the kinder workflow to execute on the nodes and re-tag / alias the images from k8s.gcr.io -> registry.k8s.io.

Re-tagging seems to be the simplest way here. (Or a trick change in fixImageTar to tag it to both registries.)


neolit123 commented Jun 7, 2022

(Or a trick change in fixImageTar to tag it to both registries)

I'm testing a hack in fixImageTar right now. It's not great, but it can be removed once we no longer test the 1.24 k8s / 1.25 kubeadm skew.

neolit123 commented:

@pacoxu PR is here #2705


neolit123 modified the milestones: v1.25, v1.26 on Jun 7, 2022
neolit123 self-assigned this on Jun 7, 2022
neolit123 modified the milestones: v1.26, v1.25 on Jun 7, 2022
neolit123 added the lifecycle/active label on Jun 7, 2022

ameukam commented Aug 4, 2022

@neolit123 is there anything that needs to be done here? Happy to provide some help.

neolit123 commented:

@ameukam in the OP we have a task to do some cleanups in 1.26.
This can happen after k/k master opens post code freeze.


afbjorklund commented Aug 9, 2022

Does something need to be done for kubeadm < 1.25 too, or can they just rely on the image redirect?

This also goes for the old tarballs that were generated with the content from kubeadm config images.

Like: kubeadm config images list | xargs docker save

They could need some retagging after image loading, if so.
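
For containerd-based nodes, the same aliasing can be done with ctr (a sketch; images.tar is a placeholder name for such a tarball):

# Load the saved tarball into containerd's k8s.io namespace, then alias the
# old names; "ctr images tag" adds a second reference to the same image.
ctr -n k8s.io images import images.tar
for ref in $(ctr -n k8s.io images ls -q | grep '^k8s\.gcr\.io/'); do
  ctr -n k8s.io images tag "$ref" "${ref/k8s.gcr.io/registry.k8s.io}"
done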

BenTheElder commented:

Does something need to be done for kubeadm < 1.25 too, or can they just rely on the image redirect ?

k8s.gcr.io will continue to exist as a source of truth, and is itself an alias to the actual registries (unadvertised).

It would be a breaking change for older releases to switch the primary registry alias, e.g. for the reasons you mentioned, so they will not be updated.

However, in the near future k8s.gcr.io may begin to redirect to registry.k8s.io, which will again contain all the same images; the only change that effectively backports is the need to allow registry.k8s.io through the firewall, the image names will not change.

There was just a notice sent about this to the dev mailing list, but it needs wider circulation.

neolit123 commented:

The flip was done on Oct 3rd:

https://groups.google.com/a/kubernetes.io/d/msgid/dev/CAOZRXm_h9CNpnwWe%3Dv07VdtbU60biUzED-2V94FsNmYjfVGQLw%40mail.gmail.com?utm_medium=email&utm_source=footer

kubeadm CI has been green thus far; let's reopen this if we need to change something else.


neolit123 commented Oct 25, 2022

After discussion with @BenTheElder, we should actually backport this to the >=1.23 releases (1.23, 1.24); >=1.25 already has the changes.
This is a matter of allowing the k8s project to not run out of funds ($$$).

Note: this is an exception, as we only backport bugfixes, but in this case it has to be done.

Looks like we need to backport these PRs:
kubernetes/kubernetes#109938
kubernetes/kubernetes#110343

We could leave the backports without doing the cleanups that we did here:
kubernetes/kubernetes#112006

cc @SataQiu @pacoxu for comments

neolit123 commented:

cc @fabriziopandini @sbueringer in case this affects CAPI.


afbjorklund commented Oct 26, 2022

Need to do some retagging effort in minikube, or it would break the old preloads and caches (which only have the old registry).

Alternatively, re-generate all the preloads, but that could already be "too late" if they are being cached and have been downloaded.

It would only break air-gapped installs (and China?).

The others would just be able to fetch the "new" image.


BenTheElder commented Oct 26, 2022

Only if using the new patch release for which the images don't exist yet. Also, for older releases we still currently plan to publish the tags to k8s.gcr.io; it just won't be the default. Upgrades should always be taken carefully, and we'll certainly need a prominent release note.


afbjorklund commented Oct 27, 2022

Yeah, on second thought, this would just affect new (minor) releases of kubeadm.

So all that is needed is to mirror the version selection logic, like today (>= 1.25.0-alpha.1).
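
A minimal sketch of that version gate in shell (sort -V is only an approximation of semver, but it does order v1.25.0-alpha.1 before the v1.25.x releases; the version value is just an example input):

# Pick the default registry based on the requested Kubernetes version,
# mirroring kubeadm's cutover at v1.25.0-alpha.1.
version="v1.25.3"   # example input
cutover="v1.25.0-alpha.1"
if [ "$(printf '%s\n' "$cutover" "$version" | sort -V | head -n1)" = "$cutover" ]; then
  registry="registry.k8s.io"
else
  registry="k8s.gcr.io"
fi
echo "$registry"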



pacoxu commented Dec 23, 2022

It seems the work is done in v1.26.

  • reopen if there are still remaining tasks.
    /close

k8s-ci-robot commented:

@pacoxu: Closing this issue.

In response to this:

It seems the work is done in v1.26.

  • reopen if there are still remaining tasks.
    /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


afbjorklund commented Dec 23, 2022

I added the backports to minikube. For most users, the cache is not used anyway.

By default, minikube will fetch the preload from GCS instead of using the cache:

https://storage.googleapis.com/minikube-preloaded-volume-tarballs/

But with --preload=false and without a mirror, it will still use the cache.
