minikube image load loads the same image but with different tags again #11322

Open
kochetov-dmitrij opened this issue May 7, 2021 · 12 comments
Labels
  • area/image: Issues/PRs related to the minikube image subcommand
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.

Comments

@kochetov-dmitrij

Problem

minikube image load loads the same image but with different tags again instead of just comparing layer hashes and skipping the second unnecessary uploading.

Steps to reproduce the issue:

  1. Use a fat image, for example, python:3.8-buster (330MB)
  2. Upload it to minikube, it will take a minute
  3. Use another tag for the image
  4. Upload the new tag to minikube, it will take approx the same time that makes no sense and should be optimized

$ docker pull python:3.8-buster
$ time minikube image load python:3.8-buster

real    0m54,873s
user    0m37,457s
sys     0m2,101s

$ docker tag python:3.8-buster python:3.8-buster-mytag
$ time minikube image load python:3.8-buster-mytag

real    0m47,699s
user    0m37,212s
sys     0m1,855s

The total size of all images doesn't change after loading different tags of the same image:

$ minikube ssh
minikube$ docker system df
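
A quick way to confirm that both tags really point at the same image is to compare their image IDs. The following is a minimal sketch, assuming the docker CLI and locally available tags; the helper names `image_id` and `same_image` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: report whether two image references resolve to the same image ID.
# Assumes the docker CLI is on PATH and both references exist locally.

image_id() {
  # Print the content-addressed ID (sha256:...) of one image reference.
  docker image inspect --format '{{.Id}}' "$1"
}

same_image() {
  # Exit 0 when both references share one image ID, 1 when they differ,
  # and 2 when either reference cannot be inspected.
  local first second
  first=$(image_id "$1") || return 2
  second=$(image_id "$2") || return 2
  [ "$first" = "$second" ]
}

# Usage (after `docker tag`):
#   same_image python:3.8-buster python:3.8-buster-mytag && echo "same image"
```

Run inside `minikube ssh` after both loads, the same check would show why the total size stays flat: the second tag is only another name for layers that are already stored.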

Sys info

😄  minikube v1.19.0 on Ubuntu 18.04
✨  Using the virtualbox driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating virtualbox VM (CPUs=4, Memory=16384MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.20.2 on Docker 20.10.4 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
@afbjorklund
Collaborator

afbjorklund commented May 7, 2021

> minikube image load loads the same image but with different tags again instead of just comparing layer hashes and skipping the second unnecessary uploading.

There is no such feature in the current implementation; it is purely based on the name:tag.

This means that you need to use the alternatives (docker/podman/ctr) if you require it now.

@afbjorklund afbjorklund added the kind/feature Categorizes issue or PR as related to a new feature. label May 7, 2021
@afbjorklund
Collaborator

afbjorklund commented May 7, 2021

> that makes no sense and should be optimized

Both load and build are based on tarballs, not layers or files.

So it looks something like:

  1. docker image save
  2. scp
  3. docker image load

For build context it is like:

  1. tar c
  2. scp
  3. tar x
  4. docker build

And then similar equivalents for the others.
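
For reference, the three load steps listed above could be scripted roughly as follows. This is only a sketch and not minikube's actual code: the function name, the /tmp staging path, and the `docker` SSH user on the node are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the save -> scp -> load path described above.
# Assumptions (not from the thread): the node accepts SSH as user "docker",
# and /tmp is usable as a staging directory on both the host and the node.

load_via_tarball() {
  local image=$1
  local tarball
  tarball="/tmp/image-load.$$.tar"

  # 1. Export the whole image (all layers) into one tarball on the host.
  docker image save -o "$tarball" "$image" || return 1

  # 2. Copy the tarball to the node; every byte is sent, every time.
  scp -i "$(minikube ssh-key)" "$tarball" "docker@$(minikube ip):$tarball" || return 1

  # 3. Import the tarball into the node's Docker daemon, then clean up there.
  minikube ssh "docker image load -i $tarball && rm -f $tarball" || return 1

  # Remove the host-side copy as well.
  rm -f "$tarball"
}

# Usage:
#   load_via_tarball python:3.8-buster
```

Because step 2 copies one opaque file, nothing in this path can notice that the node already has identical layers under another tag.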

@afbjorklund
Collaborator

afbjorklund commented May 7, 2021

For load: using the cluster registry addon should also be able to fix this.

$ docker tag python:3.8-buster $(minikube ip):5000/python:3.8-buster
$ time docker push $(minikube ip):5000/python:3.8-buster
The push refers to repository [192.168.49.2:5000/python]
a43310659d53: Pushed 
8be90fda4620: Pushed 
ddc3469d87c0: Pushed 
8d18b38717e2: Pushed 
651326e9f1ca: Pushed 
5d5962699bd5: Pushed 
a42439ce9650: Pushed 
26270c5e25fa: Pushed 
e2c6ff462357: Pushed 
3.8-buster: digest: sha256:5ca75ad9cdf54ceebfd30f2e7e6b396c6779a7efac5d7aaa40cfd73190c2e8fc size: 2217

real	0m31,681s
user	0m0,192s
sys	0m0,090s
$ docker tag python:3.8-buster $(minikube ip):5000/python:3.8-buster-mytag
$ time docker push $(minikube ip):5000/python:3.8-buster-mytag
The push refers to repository [192.168.49.2:5000/python]
a43310659d53: Layer already exists 
8be90fda4620: Layer already exists 
ddc3469d87c0: Layer already exists 
8d18b38717e2: Layer already exists 
651326e9f1ca: Layer already exists 
5d5962699bd5: Layer already exists 
a42439ce9650: Layer already exists 
26270c5e25fa: Layer already exists 
e2c6ff462357: Layer already exists 
3.8-buster-mytag: digest: sha256:5ca75ad9cdf54ceebfd30f2e7e6b396c6779a7efac5d7aaa40cfd73190c2e8fc size: 2217

real	0m0,199s
user	0m0,148s
sys	0m0,064s

That is because the registry push works on layers, not tarballs.


For build one would need to keep a local build context (on the node), and instead use rsync to update it before building.

That way only the delta would be transferred, rather than having to archive the whole build context (dir) and send it again.

testbuild/Dockerfile

minikube image build testbuild
#/tmp/build.554899704.tar
#scp /tmp/build.554899704.tar --> /var/lib/minikube/build/build.554899704.tar
#sudo mkdir -p /var/lib/minikube/build/build.554899704
#sudo tar -C /var/lib/minikube/build/build.554899704 -xf /var/lib/minikube/build/build.554899704.tar
testbuild/Dockerfile

rsync -a --update --delete testbuild/ minikube:/var/lib/minikube/build/testbuild/
minikube build file:///var/lib/minikube/build/testbuild

That is, it would transfer (partial) files, not tarballs.

@afbjorklund
Collaborator

afbjorklund commented May 7, 2021

@spowelljr something for your benchmarks

https://minikube.sigs.k8s.io/docs/benchmarks/imagebuild/

@afbjorklund
Collaborator

I should also mention that the problem with using the registry is that the container start then needs to pull the image...

i.e. it has been uploaded to /var/lib/registry but needs to be copied over to /var/lib/docker (etc) before it becomes available

@kochetov-dmitrij: for your artificial use case, would it help if there was a minikube image tag command?

minikube image tag python:3.8-buster python:3.8-buster-mytag

When minikube is pulling the image directly from an external registry, it will also use layers - like it normally does.

The load command is intended to load something from the cache or from the host locally, not really from external.

So if that is the case, it might be better to let the kubelet handle it - or maybe to use minikube image load --pull

minikube image load --pull python:3.8-buster  # uses crictl

@afbjorklund
Collaborator

afbjorklund commented May 8, 2021

This is (sort of) related to: #11276

Currently there is an optimization to not load images that exist, but it only looks at name:tag and not at contents.

@kochetov-dmitrij
Author

Thanks for the suggestions!

My use case is building images on my host and running them on minikube in my dev pipeline. Sometimes there are no changes in the image but a new tag gets assigned.

I thought minikube image load was a more convenient alternative to eval $(minikube docker-env),
so minikube image tag and minikube image load --pull wouldn't help me.

Looks like using eval $(minikube docker-env) is still the best option in my case. The things that disturb me are:

  • Sometimes forgetting to reset the env with eval $(minikube docker-env -u)
  • Keeping build cache in minikube so after doing minikube delete; minikube start it gets removed
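
One way around the forgotten reset is to apply the docker-env settings inside a subshell, so they never reach the interactive shell. This is a minimal sketch; the function name `with_minikube_docker` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: run one command against minikube's Docker daemon without leaving
# DOCKER_HOST and related variables set in the calling shell.

with_minikube_docker() {
  # The parentheses start a subshell, so the variables exported by
  # `minikube docker-env` vanish when the command finishes and no
  # `eval $(minikube docker-env -u)` is needed afterwards.
  (
    env_script=$(minikube docker-env) || exit 1
    eval "$env_script"
    "$@"
  )
}

# Usage:
#   with_minikube_docker docker build -t myapp:dev .
```

This does not help with the second point: the build cache still lives inside the node and goes away with minikube delete.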

I can look into implementing minikube image load based on layers when I have free time if you don't mind

@afbjorklund
Collaborator

afbjorklund commented May 8, 2021

> Sometimes there are no changes in the image but a new tag gets assigned.

It would be nice if we could recognize this. It would also help with the "latest" tag,

i.e. the opposite problem: you have the same tag, but the contents changed.

The hope is to be able to use the "id" for this, even if it has other problems...
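
Until something like that exists in minikube itself, an ID check can be approximated from the host. The sketch below is a workaround under stated assumptions, not a description of minikube internals: it assumes the docker runtime inside the node and that image IDs are unchanged by `docker image save` / `docker image load`. The function name `load_unless_present` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: skip the upload when the node already holds an image with the same
# ID, and only add the new tag there.
# Assumptions: docker runtime inside minikube; image IDs survive save/load.

load_unless_present() {
  local image=$1
  local id
  id=$(docker image inspect --format '{{.Id}}' "$image") || return 1

  # List the full image IDs known to the node's Docker daemon.
  # `tr` strips carriage returns that an SSH terminal may add.
  if minikube ssh "docker image ls --no-trunc --quiet" | tr -d '\r' | grep -qxF "$id"; then
    # Same contents already present: tagging only touches metadata.
    minikube ssh "docker image tag $id $image"
  else
    # Contents are new to the node: fall back to the normal tarball load.
    minikube image load "$image"
  fi
}

# Usage:
#   load_unless_present python:3.8-buster-mytag
```

The same comparison would also cover the "latest" case: if the tag exists on the node but the ID differs, the function falls through to a full load.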

@afbjorklund
Collaborator

afbjorklund commented May 8, 2021

> My use case is building images on my host and running them on minikube in my dev pipeline.

The usual workaround/shortcut is to build the images in minikube, instead of building on the host

> Keeping build cache in minikube so after doing minikube delete; minikube start it gets removed

This would need some kind of "cloud storage" I suppose, or at least a backup and restore step.

@kochetov-dmitrij
Author

> This would need some kind of "cloud storage" I suppose, or at least a backup and restore step.

I meant I would build the images and store their cache on my host. Regardless of whether minikube is up or is going to start from scratch, I can always quickly build my images on the host and upload them to minikube by running my pipeline. The "restore step" is simply running the pipeline again.

@afbjorklund
Collaborator

You said that it disturbed you that your build cache stored in the cluster got deleted with the cluster.
So that means that the cache will have to be saved somewhere, between the "delete" and the "start".

It doesn't mean that all images will need to be built outside the cluster, even if that is one solution.
Currently we are using /var/lib/minikube/build and /var/lib/buildkit (and friends), for the build cache.

@spowelljr spowelljr added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label May 17, 2021
@sharifelgamal sharifelgamal added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jun 14, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 12, 2021
@spowelljr spowelljr added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 13, 2021
@spowelljr spowelljr added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jan 19, 2022
@afbjorklund afbjorklund added the area/image Issues/PRs related to the minikube image subcommand label Mar 13, 2023

6 participants