feature request: Take a snapshot of a cluster #9991

Open
owenthereal opened this issue Dec 17, 2020 · 15 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@owenthereal

owenthereal commented Dec 17, 2020

Steps to reproduce the issue:

This is a feature request, not an issue.

I would like to be able to take a snapshot of the cluster and reuse it for CI for a faster boot time. I have functional tests that run on a minikube cluster but the tests require installation of a bunch of software, e.g. ingress-nginx, cert-manager etc. I'm wondering whether there is a way to "pre-burn" a minikube image by taking a snapshot of a running cluster. The image includes a version of the minikube k8s and software that I install on it. The image is then used in CI to avoid reinstalling all the basic software every time. This would immensely shorten the CI time.

Full output of failed command:

N/A

Full output of minikube start command used, if not already included:

N/A

Optional: Full output of minikube logs command:

N/A

@afbjorklund
Collaborator

Did you check out the minikube cache command already?

@owenthereal
Author

minikube cache adds/deletes/loads an image to/from minikube. What I wanted is to burn the whole cluster with software that I install into the cluster as an image so that it can be reused elsewhere, e.g. CI.

@tstromberg tstromberg added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 18, 2020
@afbjorklund
Collaborator

afbjorklund commented Dec 20, 2020

You did not mention what platform and driver this was for. I'm not sure the "pause" feature will survive such a "burn"...

We talked about adding the suspend/resume feature earlier, but ended up adding the pause/unpause feature instead.
In theory the VirtualBox driver should support it, but I have only done it locally, and there is no minikube support for it.

There might be other "kubernetes backup" software, to take snapshots of running clusters. Maybe they can be integrated.

@afbjorklund afbjorklund added the triage/discuss Items for discussion label Dec 20, 2020
@owenthereal
Author

owenthereal commented Dec 20, 2020

> You did not mention what platform and driver this was for

I'm looking for this feature for Linux with the Docker driver mainly. A quick google shows that it is possible and easy to do: https://stackoverflow.com/questions/26331651/how-can-i-backup-a-docker-container-with-its-data-volumes.
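For reference, the general pattern described in that link can be sketched as below. This is only a rough sketch: the `run` and `backup_volumes` helpers are hypothetical, the container name `minikube` and the `/var` data directory in the usage line are assumptions about how the Docker driver lays things out, and whether an archive made this way restores into a working cluster on another host is untested. Commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# backup_volumes CONTAINER DATA_DIR OUT_DIR  (hypothetical helper)
# Archives DATA_DIR from CONTAINER's volumes into OUT_DIR using a
# throwaway busybox container that shares the same volumes.
backup_volumes() {
    container=$1
    data_dir=$2
    out_dir=$3
    run docker run --rm --volumes-from "$container" \
        -v "$out_dir:/backup" busybox \
        tar cf "/backup/$container-backup.tar" "$data_dir"
}
```

A possible invocation, after stopping the cluster so the data is quiescent, would be `DRY_RUN=0 backup_volumes minikube /var "$PWD"`.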

Regarding other drivers, most of them should have the feature of snapshotting or suspending/resuming as you mentioned.

> There might be other "kubernetes backup" software, to take snapshots of running clusters. Maybe they can be integrated.

I would love to see the integration of such software. Any pointers?

@afbjorklund
Collaborator

Something like https://github.com/vmware-tanzu/velero perhaps?

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Dec 23, 2020
@owenthereal
Author

I checked https://github.com/vmware-tanzu/velero and it's cloud-provider specific. I'm unclear how that would apply to minikube.

@medyagh
Member

medyagh commented Jan 6, 2021

This is a good idea! We have plans to explore freezing a cluster, so that minikube could potentially start faster from a frozen cluster.

This is on our roadmap to explore. Both @ilya-zuyev and @priyawadhwa are looking into this.

@priyawadhwa

Yup, we are aiming to have a design for this this quarter. I will update this issue as we know more!

@priyawadhwa priyawadhwa removed the triage/discuss Items for discussion label Jan 20, 2021
@medyagh medyagh changed the title Take a snapshot of a cluster feature request: Take a snapshot of a cluster Mar 3, 2021
@medyagh medyagh added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed kind/support Categorizes issue or PR as a support question. labels Mar 3, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 1, 2021
@owenthereal
Author

Hello, while we are waiting for this feature to be ready, is there anything you could share that we can do in the meantime, e.g. commands to take a snapshot of the running cluster, export it, and run the snapshot somewhere else?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 11, 2021
@spowelljr spowelljr added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jul 14, 2021
@sharifelgamal
Collaborator

This unfortunately fell by the wayside as priorities shifted in our roadmap. I still think it's a great idea, but a design was never fully finished.

@mholtzhausen

Is there a chance this is coming back?

@sharifelgamal sharifelgamal added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 18, 2022
@sharifelgamal
Collaborator

The minikube maintainers don't currently have the bandwidth to work on this feature, but it's still a cool idea! I'll add the help wanted label and would be happy to look at a PR.

@nirs
Contributor

nirs commented Nov 16, 2022

I can see something like this:

Creating cluster:

minikube start ...
play with cluster ...
minikube stop

minikube snapshot snap1

Using the cluster from the last persisted state:

minikube start
Play ...
minikube stop

Reverting to snapshot, dropping all changes since the snapshot:

minikube revert snap1

Taking a snapshot is easy with VM-based drivers (e.g. kvm2). I'm not sure about other drivers,
but such a feature can be useful even if it works only with some drivers.

For kvm2/qemu-based drivers, creating a snapshot can be as simple as:

mv disk.qcow2 snap1.qcow2
qemu-img create -f qcow2 -b snap1.qcow2 -F qcow2 disk.qcow2

Reverting a snapshot:

mv snap1.qcow2 disk.qcow2 

If the original disk was raw, we need to change the disk format in the VM XML.
If there are multiple snapshots, the qcow2 chain may need to be modified.
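The two operations above could be wrapped roughly as follows. This is a sketch only: `run`, `snapshot_create` and `snapshot_revert` are hypothetical helper names, not minikube or libvirt commands, and it assumes the VM is stopped, the disk is already qcow2, and there is a single snapshot. Commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# snapshot_create DISK SNAP  (hypothetical helper)
# Freezes the current DISK as SNAP, then creates a new empty qcow2
# overlay at DISK that uses SNAP as its backing file.
# Note: a relative backing path is resolved relative to the overlay's
# directory, so run this from the disk's directory or use absolute paths.
snapshot_create() {
    disk=$1
    snap=$2
    run mv "$disk" "$snap"
    run qemu-img create -f qcow2 -b "$snap" -F qcow2 "$disk"
}

# snapshot_revert DISK SNAP  (hypothetical helper)
# Replaces the overlay with the snapshot, dropping every change made
# since the snapshot. The snapshot file is consumed by the move.
snapshot_revert() {
    disk=$1
    snap=$2
    run mv "$snap" "$disk"
}
```

Because the revert consumes the snapshot file, reverting to the same state more than once would require calling `snapshot_create` again right after `snapshot_revert`.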

Creating and deleting live snapshots would be nice, but is not really needed
to make this useful.

This can be an awesome feature for CI. In ramen we have a test environment that creates 3 clusters
with a complicated deployment, taking 7-8 minutes to create:
https://github.com/RamenDR/ramen/blob/main/test/regional-dr.yaml

If we can re-create the environment periodically and start from a snapshot,
we can get a running test environment in 1-2 minutes. Of course, if the tests run
on the environment take 60 minutes, saving 6 minutes when creating the environment
is not a huge saving, but it is still important.

Is it possible to implement such a feature as a 3rd-party addon? Do we have APIs to query
and modify the VM disks, so that such an addon will not break when minikube changes the way
disks are stored?
