Feature Request: enable user-managed Pod Migration #43405

Open
erictune opened this issue Mar 20, 2017 · 9 comments
Labels
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@erictune
Member

A user wants to extend Kubernetes to allow for application-specific migration in response to pod deletion events, whenever possible.

  1. Normally, there should be 1 instance of a pod -- call it pod-1.
  2. However, when something (usually the system, e.g. the rescheduler or a node upgrade) wants to delete pod-1, a replica, pod-2, should be created.
  3. Before pod-1 is actually terminated, it will discover pod-2 and they will do an application-level handoff of state.
  4. After handoff, scale down to just 1 pod, for economy.

This issue is created to suggest possible ways to implement this pattern.

@erictune
Member Author

Possible approach 1:

  • Use a StatefulSet for this pod, initial size 1, plus a headless Service.
  • When pod-1 gets a graceful termination notice, it should call back to the API and scale its StatefulSet up to size 2, then wait to discover pod-2 via DNS, then sync with it, then exit (see the sketch at the end of this comment).
  • When a new instance of pod-1 is later created, sync the other way, and scale back down.

Advantages of this approach:

  • Can be implemented today without Kubernetes changes, using basically any script or program that can scale up, scale down, and discover the peer via DNS.

Drawbacks to this approach:

  • Requires migrating twice in order to get the StatefulSet back down to size 1, when in principle only one migration is needed.
  • Requires authorizing the pod to scale its own controller. Might be undesirable for some security-sensitive applications.
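A minimal sketch of the scale-up-on-termination step, assuming client-go, in-cluster credentials, and placeholder names (StatefulSet "myapp" in namespace "default", headless Service "myapp-headless", so the peer replica would be "myapp-1"). The actual state handoff is application-specific and only marked with a comment here:

```go
// Sketch only: scale our own StatefulSet to 2 on SIGTERM, then wait for the
// peer to show up in DNS before handing off state and exiting.
package main

import (
	"context"
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Block until the kubelet sends the graceful-termination signal.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)
	<-sigs

	ctx := context.Background()
	sts := client.AppsV1().StatefulSets("default")

	// Scale the StatefulSet from 1 to 2 so a peer replica gets created.
	scale, err := sts.GetScale(ctx, "myapp", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	scale.Spec.Replicas = 2
	if _, err := sts.UpdateScale(ctx, "myapp", scale, metav1.UpdateOptions{}); err != nil {
		log.Fatal(err)
	}

	// Discover the peer via the headless Service's per-pod DNS record.
	peer := "myapp-1.myapp-headless.default.svc.cluster.local"
	for {
		if _, err := net.LookupHost(peer); err == nil {
			break
		}
		time.Sleep(2 * time.Second)
	}

	// Application-level state handoff to the peer goes here; exit afterwards.
	log.Printf("peer %s is up; hand off state, then exit", peer)
}
```

For this to work the pod needs RBAC permission on statefulsets/scale (the second drawback above) and a terminationGracePeriodSeconds long enough for the handoff to finish.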

@erictune
Member Author

erictune commented Mar 20, 2017

Possible approach 2:

  • Write an "operator", in the style of https://github.com/upmc-enterprises/elasticsearch-operator, that implements a "migratingPod" concept with a pod template.
  • It makes one pod with a random name, say pod-32jdg
  • When that pod gets a graceful termination notice, it waits for a peer to appear.
  • When the operator sees a deletion timestamp on pod-32jdg, it creates a replacement such as pod-m2k87, using the same labels (see the sketch at the end of this comment).
  • Both pods discover each other, either by watching DNS + using a headless service, or by direct message from the operator. They initiate migration.
  • pod-32jdg exits gracefully when migration is done.
  • A regular service in front of both of them can provide a stable name as the pods go through different random names. The operator could even manage endpoints directly if it wants to control the moment of handoff between the two instances (modulo pre-existing connections).

Advantages of this approach:

  • Can be implemented today without Kubernetes changes.
  • Only one migration is needed; a regular service in front provides a stable name throughout, as noted above.

Drawbacks to this approach:

  • Writing an operator is more complex than Approach 1, but not too bad.
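A minimal sketch of the operator's core loop, using a bare client-go watch rather than an operator framework. The namespace, the "app=migrating" label selector, and the GenerateName prefix are placeholders; peer discovery and the actual handoff stay inside the application as described above:

```go
// Sketch only: when a labeled pod gets a deletion timestamp, create a
// successor pod with the same labels and a new random name.
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	pods := client.CoreV1().Pods("default")

	// Watch only the pods this hypothetical "migratingPod" operator owns.
	w, err := pods.Watch(ctx, metav1.ListOptions{LabelSelector: "app=migrating"})
	if err != nil {
		log.Fatal(err)
	}

	replaced := map[string]bool{} // terminating pods we already replaced

	for ev := range w.ResultChan() {
		pod, ok := ev.Object.(*corev1.Pod)
		if !ok || ev.Type != watch.Modified {
			continue
		}
		if pod.DeletionTimestamp == nil || replaced[pod.Name] {
			continue
		}

		// The old pod is terminating gracefully: create its successor now,
		// reusing the labels so both sit behind the same Service.
		spec := *pod.Spec.DeepCopy()
		spec.NodeName = "" // let the scheduler place the successor
		successor := &corev1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				GenerateName: "migrating-",
				Labels:       pod.Labels,
			},
			Spec: spec,
		}
		if _, err := pods.Create(ctx, successor, metav1.CreateOptions{}); err != nil {
			log.Printf("creating successor for %s: %v", pod.Name, err)
			continue
		}
		replaced[pod.Name] = true
		log.Printf("created successor for terminating pod %s", pod.Name)
	}
}
```

A real operator would use an informer with resync instead of a bare watch, own a "migratingPod" custom resource holding the pod template, and handle restarts of its own process; this only shows the replace-on-deletion-timestamp step.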

@davidopp added the sig/apps and sig/scheduling labels Mar 21, 2017
@0xmichalis
Contributor

0xmichalis commented Mar 23, 2017

Approach 1 seems like a custom strategy: #14510

cc: @kubernetes/sig-apps-feature-requests

@dhilipkumars

This proposal can simplify both approaches. I believe future operators can become lightweight if we allow a more elaborate cleanup mechanism.

@bgrant0607
Member

Ref #3949

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jan 29, 2018
@kow3ns added this to Backlog in Workloads Feb 27, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 28, 2018
@kow3ns added the lifecycle/frozen label and removed the lifecycle/rotten label Mar 3, 2018
@kow3ns removed this from Backlog in Workloads Mar 3, 2018
@kow3ns added this to Backlog in Workloads via automation Mar 3, 2018
@kow3ns moved this from Backlog to Freezer in Workloads Mar 3, 2018
@xtchenhui

@erictune have you tried approach 2? I'm actually investigating a similar method now, to migrate runv containers from one node to another.

@krmayankk added this to Enterprise Readiness in Technical Debt Research Oct 21, 2019
@ashish-billore
Contributor

Related: #3949
