Checkpoint/Restart or Live Motion #3803

Open
jonathan-3play opened this issue Feb 28, 2024 · 2 comments
Labels
area/kubernetes K8s including EKS, EKS-A, and including VMW type/enhancement New feature or request

Comments

@jonathan-3play

What I'd like:

Live container migration: the ability to checkpoint containers and restart them on a different node, ideally with an ease and confidence rivaling the workload migration seen in enterprise IT using virtual machines (e.g. VMware vMotion or Hyper-V Live Migration).

This will perhaps be viewed as outside the remit of a minimized, security-first OS such as Bottlerocket. OTOH Bottlerocket aspires to be the OS, foundation, and infrastructure for containerized workloads and world-leading k8s environments (e.g. EKS). Enterprise computing has long enjoyed workload migration (vMotion released 2003 and known to be used in production at scale by 2006). We'd love to see that in the container/k8s world.

In fact, we need workload migration in the container/k8s world. While autoscalers (e.g. Karpenter) can eagerly provision more resources when needed, if a workload contains a mixture of short-, medium-, and long-duration jobs (ours most certainly do!), autoscalers are almost guaranteed to "strand" some nodes awaiting completion of the longest-running jobs. Without workload migration, there is no way to effectively consolidate the long-running jobs and "compact" the cluster's resources.
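To make the stranding effect concrete, here is a minimal sketch, not a model of Karpenter or any real scheduler. The node capacity, job sizes, and the helper names `nodes_in_use` and `compact` are all hypothetical, chosen only for illustration. It compares how many nodes stay occupied once the short jobs have finished against how many would be needed if the remaining long-running jobs could be migrated and repacked (here with a simple first-fit decreasing heuristic).

```python
def nodes_in_use(placement):
    """Count nodes that still host at least one running job."""
    return sum(1 for jobs in placement if jobs)


def compact(placement, capacity):
    """Repack all running jobs with first-fit decreasing,
    as if every job could be migrated to another node."""
    sizes = sorted((size for jobs in placement for size in jobs), reverse=True)
    bins = []
    for size in sizes:
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            # No existing node has room, so keep one more node.
            bins.append([size])
    return bins


# Hypothetical cluster: 4 nodes with 4 CPUs each. Each inner list holds the
# CPU requests of the long-running jobs left after the short jobs finished.
stranded = [[1], [1], [2], []]
packed = compact(stranded, capacity=4)

print(nodes_in_use(stranded), "nodes occupied without migration")
print(nodes_in_use(packed), "nodes needed if jobs could migrate")
```

In this toy case three mostly idle nodes are held open by three small jobs that would fit on a single node, which is the consolidation that migration would unlock.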

Any alternatives you've considered:

  1. Segmenting possibly long-running jobs onto a separate node pool in the hopes of stranding fewer resources. Effortful and home-grown. Difficult to accurately determine every job's likely run duration a priori. Somewhat challenging to link app-based duration signals with infrastructure-level (Karpenter/k8s) scheduling controls. Not clear node segmentation would be efficient/efficacious.
  2. CRIU. Unclear if supported on Bottlerocket, or how well.
  3. DIY checkpoint/restart. Effortful and home-grown. Feels like it should be system-supported, as in the VM world.
  4. Lighting a candle that Karpenter over time becomes smarter about recognizing node stranding and uses that understanding to better bin-pack jobs and revisit previous do-not-schedule and deprovisioning decisions.
  5. Probably others. None feels compelling.
@jonathan-3play jonathan-3play added status/needs-triage Pending triage or re-evaluation type/enhancement New feature or request labels Feb 28, 2024
@yeazelm
Contributor

yeazelm commented Feb 29, 2024

Hello @jonathan-3play, thanks for cutting this well-written issue! There has been some work around the ability to checkpoint and restore containers in cri-o and k8s. I don't believe there is anything off the shelf, though, for what you are describing. This is a pretty interesting feature request, and I think there are some compelling use cases for being able to checkpoint/restore a long-running container. This issue will first require a deep dive into the current state of the various tools and what might be needed to deliver this type of functionality. I'd like to use this issue to track any findings that might be of interest around checkpoint/restore and CRIU in Bottlerocket.

@yeazelm yeazelm added area/kubernetes K8s including EKS, EKS-A, and including VMW and removed status/needs-triage Pending triage or re-evaluation labels Feb 29, 2024
@kannon92

3 participants