
minikube image cache slow (Support image caching for 'clean resetup') #14032

Closed · hughes20 opened this issue Apr 24, 2022 · 10 comments

Labels: co/docker-driver (Issues related to kubernetes in container), kind/feature (Categorizes issue or PR as related to a new feature), lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed), os/linux, priority/backlog (Higher priority than priority/awaiting-more-evidence)

Comments

@hughes20 commented Apr 24, 2022

What Happened?

I am running a multi-node (2 nodes, to be specific) cluster via minikube. It takes about 10 minutes to set up from scratch (including image pulling).

Once everything is up and running, I want to 'save' the images out so that I can do the clean setup again, but significantly faster (since we should not have to pull the images again). So I resorted to minikube image ls | xargs minikube cache add. This correctly picks up all my images and adds them to ~/.minikube/cache/images. I then minikube delete and start setting up my cluster again. This time, though, my complete setup takes 15 minutes!? When I run minikube image ls I see two copies of every image that I cached; I assume it copied all the images to both of my nodes ('minikube' and 'minikube-m02'). That is neither needed nor desired, but regardless, how is starting from cached images slower than just pulling from the internet? I have about 36 images and some are upwards of 2GB (the total size is only 9GB, though). I know it is not pulling anything over the internet, because I disabled wifi and it was still able to start (I also tried without disabling wifi and it still took the 15 minutes).

Does the cache directory not get mounted into the minikube VM? Meaning it is copying 9GB into a VM, and then into each of the nodes? My computer can copy the cache folder in a few seconds (timing below), so I assume it is doing more than just copying, and that is what is taking forever?

real	0m6.467s
user	0m0.029s
sys	0m5.971s
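
Roughly, the resetup loop looks like this (the minikube start flags shown here are illustrative, not the exact command I used):

    # cache every image currently in the cluster on the host
    minikube image ls | xargs minikube cache add

    # tear the cluster down and rebuild it from the cache
    minikube delete
    minikube start --driver=docker --nodes 2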

I have noticed that cache is being deprecated, so I attempted to use image save, but that does not seem to do anything... I have gotten it to save when I explicitly specify a .tar file, but I cannot seem to get multiple images saved into one tar file (not supported, I assume).
I was able to run minikube image ls | xargs docker pull, save everything via docker into one tar file, and then load that back into minikube, but this does not actually seem to work for all of the images.

I see there is eval $(minikube docker-env) (which does not work for multi-node clusters) to point your docker CLI at minikube's docker daemon, but why is there no config to just have minikube point at my local docker? That way the images would always persist in my local docker and I would not have to try all these hacks to get around it.

Attach the log file

I am not uploading my logs; there are sensitive container names/images in them. Sorry.

Operating System

Ubuntu

Driver

Docker

@afbjorklund (Collaborator) commented Apr 24, 2022

Meaning it is copying 9GB into a VM, then into each of the nodes?

Yes. The cache directory is on the host. It gets copied to the node with scp.

Unfortunately, not all container runtimes support loading more than one image from a single archive.

So there is one tarball per image, which misses out on some optimization possibilities.

why is there not a config to just have minikube point to local docker?

It is not possible to access the host's docker images remotely without using a registry.

But we could run a docker registry on the host, and have the nodes pull from there.

Note however, that there might not be a docker daemon running on the host at all.
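
As an illustration of that idea (a sketch only, nothing minikube sets up for you today; the registry image, port, and flag value are assumptions):

    # run a plain registry on the host
    docker run -d --name host-registry -p 5000:5000 registry:2

    # start minikube so the nodes trust the plain-HTTP registry on the host
    # (host.minikube.internal is how the docker driver's nodes reach the host)
    minikube start --driver=docker --insecure-registry="host.minikube.internal:5000"

The nodes could then pull images as host.minikube.internal:5000/<name>:<tag>.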

@afbjorklund (Collaborator) commented Apr 24, 2022

I have noticed that cache is being deprecated

I think that was a misunderstanding, it isn't really...

It is more that the "minikube cache" and "minikube image" commands have different semantics, and keep different configuration state.

The cached images are recorded in the config, and when a new cluster or node is started they all get copied to it, the same as a cache reload.
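
In practice the semantics look roughly like this (an illustration, with an example image name):

    # an image added to the cache is remembered in the minikube config
    minikube cache add nginx:1.21
    minikube cache list

    # it is pushed to every node on the next start, or immediately with
    minikube cache reload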

@afbjorklund added the co/docker-driver and kind/support labels on Apr 24, 2022
@afbjorklund (Collaborator) commented Apr 24, 2022

When running with Docker on Linux, there are some optimizations that could be done with regards to the cache.

The original kind implementation bakes the images into the "node" image, but that makes it harder to extend later.
The current minikube implementation uses a "preload" storage snapshot, that is unpacked during the initial start.

But one could also use the old crane* image cache, and do a volume mount from ~/.minikube/cache/images/

* https://github.com/google/go-containerregistry#crane

That would save the overhead of having to copy them, like we otherwise need to do when running virtual machines.
However, it would be something of a special case, and it would probably be more straightforward to run a host registry?
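
For context, crane can pull an image straight into a tarball on the host without any docker daemon, and that tarball can later be loaded into the cluster (the image name is just an example):

    # save an image as a tarball on the host, no docker daemon needed
    crane pull alpine:3.15 alpine.tar

    # load the tarball into the cluster
    minikube image load alpine.tar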


I wouldn't say that these are "hacks", though. It is about image distribution, it's the bind-mount that is the hack.

When running on a single Linux node, the easiest workaround would be to deploy Kubernetes right on the host.
It would "fix" the distribution problem, but you would not have any isolation and there would be only one node.

As soon as we have separate nodes, we need some way to distribute the images. The easiest being: a registry.

@afbjorklund added the kind/feature and priority/backlog labels and removed the kind/support label on Apr 24, 2022
@hughes20 (Author) commented Apr 25, 2022

As soon as we have separate nodes, we need some way to distribute the images. The easiest being: a registry.

This all makes sense to me. For the docker registry, then, I would configure it as a pull-through cache to get images from something like quay.io, docker.io, etc. Do I have to do anything specific with minikube to get it to use that registry?
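
For reference, the standard registry image supports that mode through its proxy setting; a minimal sketch of such a config file (values are examples only; the registry:2 image reads its config from /etc/docker/registry/config.yml):

    version: 0.1
    storage:
      filesystem:
        rootdirectory: /var/lib/registry
    http:
      addr: :5000
    proxy:
      remoteurl: https://registry-1.docker.io

Note that a single registry instance can only proxy one upstream, so mirroring both docker.io and quay.io would need one instance per upstream.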

@afbjorklund (Collaborator)

Only the in-cluster registry is supported now, and only through the localhost:5000 proxy.

See the "registry" plugin
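
For reference, enabling that addon and reaching it through localhost:5000 looks roughly like this (the service name and port follow the addon's defaults, as far as I understand them):

    # enable the in-cluster registry addon
    minikube addons enable registry

    # expose it on localhost:5000 so images can be pushed from the host
    kubectl port-forward --namespace kube-system service/registry 5000:80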

@hughes20 (Author)

I see, so I would have to re-tag all my images to point at that registry, and then push them all to it on startup, assuming I have them stored locally in docker to begin with? (This is if I want to do it all offline.)
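
Something along these lines, presumably (the image name and tag are hypothetical):

    # re-tag a locally stored image for the localhost:5000 registry and push it
    docker tag my-app:1.0 localhost:5000/my-app:1.0
    docker push localhost:5000/my-app:1.0

    # the cluster can then pull it as localhost:5000/my-app:1.0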

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jul 24, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Aug 23, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 22, 2022
@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
