What I'd like:
Live container migration: the ability to checkpoint containers and restart them on a different node, ideally with an ease and confidence rivaling the workload migration long enjoyed in enterprise IT on virtual machines (e.g. VMware vMotion or Hyper-V Live Migration).
This will perhaps be viewed as outside the remit of a minimized, security-first OS such as Bottlerocket. OTOH Bottlerocket aspires to be the OS, foundation, and infrastructure for containerized workloads and world-leading k8s environments (e.g. EKS). Enterprise computing has long enjoyed workload migration (vMotion released 2003 and known to be used in production at scale by 2006). We'd love to see that in the container/k8s world.
In fact, we need workload migration in the container/k8s world. While autoscalers (e.g. Karpenter) can eagerly provision more resources when needed, if a workload contains a mixture of short-, medium-, and long-duration jobs (ours most certainly do!), autoscalers are almost guaranteed to "strand" some nodes awaiting completion of the longest-running jobs. Without workload migration, there is no way to effectively consolidate the long-running jobs and "compact" the cluster's resources.
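To make the stranding effect concrete, here is a toy back-of-the-envelope sketch (hypothetical job durations and node size, not real Karpenter logic): each node is pinned until its longest job finishes, so the slots freed by short jobs sit idle unless the long jobs can be migrated and consolidated.

```python
# Toy model: two nodes, four job slots each. A node can only be reclaimed
# once ALL of its jobs have finished, so one long job pins the whole node.
durations = [1, 1, 1, 24, 1, 1, 1, 24]  # job durations in hours (hypothetical)
node_size = 4
nodes = [durations[i:i + node_size] for i in range(0, len(durations), node_size)]

# Idle slot-hours stranded without migration: each node is held for
# max(jobs) hours across len(jobs) slots, but only sum(jobs) hours do work.
stranded = sum(max(jobs) * len(jobs) - sum(jobs) for jobs in nodes)
print(stranded)  # → 138 idle slot-hours across the two nodes

# With live migration, the two 24h jobs could be consolidated onto one node
# after hour 1 and the other node reclaimed entirely.
```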
Any alternatives you've considered:
- Segmenting possibly long-running jobs onto a separate node pool, in the hope of stranding fewer resources. Effortful and home-grown. Difficult to accurately determine every job's likely run duration a priori, and somewhat challenging to link app-based duration signals with infrastructure-level (Karpenter/k8s) scheduling controls. Not clear node segmentation would be efficient or efficacious.
- CRIU. Unclear if supported on Bottlerocket, or how well.
- DIY checkpoint/restart. Effortful and home-grown. Feels like something that should be system-supported, as in the VM world.
- Lighting a candle that Karpenter over time becomes smarter about recognizing node stranding, and uses that understanding to better bin-pack jobs and revisit previous do-not-schedule and deprovisioning decisions.
- Probably others. None feels compelling.
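For reference, plain CRIU usage outside of k8s looks roughly like the following. This is a sketch, not a Bottlerocket recipe: it assumes CRIU is installed and the kernel has the required support, which is exactly what's unclear on Bottlerocket. `$PID` is a placeholder for the target process.

```shell
# Checkpoint a running process tree to an images directory, leaving the
# original process running; --shell-job is needed for terminal-attached
# processes. Typically requires root.
sudo criu dump --tree "$PID" --images-dir /tmp/ckpt --shell-job --leave-running

# Restore the process tree from the saved images (e.g. on another machine
# after copying /tmp/ckpt over).
sudo criu restore --images-dir /tmp/ckpt --shell-job
```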
Hello @jonathan-3play, thanks for cutting this well-written issue! There has been some work on the ability to checkpoint and restore containers in CRI-O and k8s, but I don't believe there is anything off the shelf for what you are describing. This is a pretty interesting feature request, and I think there are some compelling use cases for being able to checkpoint/restore a long-running container. This issue will first require a deep dive into the current state of the various tools and what might be needed to deliver this type of functionality. I'd like to use this task to track any findings of interest around checkpoint/restore and CRIU in Bottlerocket.
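For anyone following along, the upstream k8s work referred to here is "forensic container checkpointing" (alpha since v1.25, behind the `ContainerCheckpoint` feature gate), which requires a CRIU-enabled runtime such as CRI-O. A rough sketch of that flow, with placeholder pod/container names, looks like:

```shell
# 1. Ask the kubelet on the node running the pod to checkpoint a container
#    (kubelet API, port 10250; uses the kubelet client credentials).
curl -sk -X POST \
  --cert /var/lib/kubelet/pki/kubelet-client-current.pem \
  --key  /var/lib/kubelet/pki/kubelet-client-current.pem \
  "https://localhost:10250/checkpoint/default/my-pod/my-container"

# 2. The kubelet writes the checkpoint archive to its checkpoint directory.
ls /var/lib/kubelet/checkpoints/

# 3. Wrap the archive into an OCI image so it can be moved to another node;
#    CRI-O can restore a container created from such an image.
newcontainer=$(buildah from scratch)
buildah add "$newcontainer" /var/lib/kubelet/checkpoints/checkpoint-my-pod_default-my-container-*.tar /
buildah config --annotation=io.kubernetes.cri-o.annotations.checkpoint.name=my-container "$newcontainer"
buildah commit "$newcontainer" checkpoint-image:latest
```

Note this is checkpoint-only on the kubelet side; the restore half is still manual, which is part of the gap this issue describes.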