
Write proposal for controller pod management: adoption, orphaning, ownership, etc. (aka controllers v2) #14961

Open
bgrant0607 opened this issue Oct 2, 2015 · 37 comments
Labels
area/app-lifecycle lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps.
Comments

@bgrant0607
Member

bgrant0607 commented Oct 2, 2015

Write a comprehensive proposal for how controllers should manage sets of pods. The main goal is to make controller APIs more usable and less error-prone.

We've discussed a number of changes:

We may want to split the following into separate issues:

Changes that would facilitate static work/role assignment:

Long-standing idea to improve security and reusability around templates:

Reusability could also be addressed by:

Also need to make it easier to update existing pods:

@bgrant0607 bgrant0607 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/documentation Categorizes issue or PR as related to documentation. team/ux labels Oct 2, 2015
@bgrant0607 bgrant0607 self-assigned this Oct 2, 2015
@bgrant0607
Member Author

A wrinkle: updating a selector in Deployment #14894

@bgrant0607
Member Author

Some use cases for orphaning and/or adoption of pods:

  • Stop the controller and restart it later. For instance, that's the way we pause Deployment at the moment.
  • Update some attributes of the controller that can't be updated by deleting and re-creating it. For instance, renaming the controller, which we do in "simple rolling update".
  • Adoption by a non-backward-compatible API resource.
  • Update labels on existing pods in a way that would be incompatible with the controller: delete the controller, change the pod labels, then create a new controller with a new selector.
  • Bootstrapping. Create pods to start the control plane, then create their controllers.
  • System exploration/learning. Create a pod, then a controller to manage it.
  • Debugging. Change a pod's labels to orphan it so that it can be replaced and debugged out of the critical path.
  • Replacement. Replace a pod generated from the template with a special one, perhaps with special support for profiling, tracing, audit, monitoring, etc.
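The label-selector mechanics behind several of these use cases (debugging, orphaning, replacement) can be sketched in a few lines. This is a toy model using the simplified equality-based form of selectors; all names are illustrative, not API fields:

```python
def matches(selector, labels):
    """Equality-based selector: every selector key/value must appear in the labels."""
    return all(labels.get(k) == v for k, v in selector.items())

def select_pods(selector, pods):
    """The pods a controller would manage: those whose labels match its selector."""
    return [p for p in pods if matches(selector, p["labels"])]

# A controller managing pods labeled app=web.
selector = {"app": "web"}
pods = [
    {"name": "web-1", "labels": {"app": "web"}},
    {"name": "web-2", "labels": {"app": "web"}},
]

# Debugging use case: relabel web-2 to orphan it out of the critical path.
pods[1]["labels"] = {"app": "web-debug"}

managed = select_pods(selector, pods)
# web-2 is now orphaned; the controller sees only one pod and would create a replacement.
```

The same relabeling move also models the "replacement" use case: the special pod simply carries the matching labels while the original is relabeled away.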

@bgrant0607 bgrant0607 changed the title Write proposal for controller pod management: adoption, orphaning, ownership, etc. Write proposal for controller pod management: adoption, orphaning, ownership, etc. (aka controllers v2) Oct 2, 2015
@bgrant0607
Member Author

If we had an imperative API, a controllerRef backpointer could be manipulated to transfer ownership: transfer pods from controller X to controller Y. Orphaning would be somewhat weird in that a dangling controllerRef pointer would need to be left. Otherwise, there would be no attribute to select on for re-adoption. Or one could require transferring ownership before deleting the previous controller.
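An imperative ownership transfer of the kind just described could be modeled like this. This is a hedged toy sketch of the semantics, not the eventual API; the `controller_ref` field name is illustrative:

```python
class Pod:
    def __init__(self, name, labels, controller_ref=None):
        self.name = name
        self.labels = labels
        self.controller_ref = controller_ref  # name of the owning controller, or None

def transfer(pods, src, dst):
    """Imperative transfer: move every pod owned by src to dst."""
    for p in pods:
        if p.controller_ref == src:
            p.controller_ref = dst

def orphan(pod):
    """Orphaning clears the backpointer. Note the problem described above:
    once cleared, nothing but labels identifies this pod for re-adoption."""
    pod.controller_ref = None

pods = [
    Pod("web-1", {"app": "web"}, controller_ref="rc-x"),
    Pod("web-2", {"app": "web"}, controller_ref="rc-x"),
]
transfer(pods, "rc-x", "rc-y")  # pods now belong to rc-y
```

The alternative mentioned above, leaving a dangling `controller_ref` instead of clearing it, would keep a selectable attribute but at the cost of a pointer to a controller that no longer exists.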

With our declarative API, we can't get rid of the labels and selector since those identify which pods should be adopted -- there needs to be some known unique set of attributes to select on.

Will also include nominal services #260, splitting out the template #170, and nominal jobs #14188 in the proposal. See also #12450.

Ref #8190, since this is a "next-gen" API proposal.

@pmorie
Member

pmorie commented Oct 3, 2015

@kubernetes/rh-cluster-infra

@davidopp
Member

davidopp commented Oct 5, 2015

If we had an imperative API, a controllerRef backpointer could be manipulated to transfer ownership: transfer pods from controller X to controller Y. Orphaning would be somewhat weird in that a dangling controllerRef pointer would need to be left. Otherwise, there would be no attribute to select on for re-adoption. Or one could require transferring ownership before deleting the previous controller.

With our declarative API, we can't get rid of the labels and selector since those identify which pods should be adopted -- there needs to be some known unique set of attributes to select on.

Maybe it's a semantic quibble, but I don't think the issue is imperative vs. declarative -- it's implicit vs. explicit encoding of which controller owns a pod. I see them both as declarative.

Do we have any examples/use cases of adoption yet?

I think it would be great if we could allow people to write controllers that only support the explicit model, i.e. don't have a Selector. (The controller would fill in controllerRef for the pods it creates.)
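The explicit, selector-free model suggested here reduces a controller's pod lookup to filtering on the backpointer it stamped onto pods at creation time. A minimal sketch, assuming a `controller_ref` field (an illustrative name, not a real API field):

```python
def owned_pods(controller_name, pods):
    # No selector at all: ownership is exactly the set of pods whose
    # backpointer names this controller.
    return [p for p in pods if p.get("controller_ref") == controller_name]

pods = [
    {"name": "a", "controller_ref": "rc-1"},
    {"name": "b", "controller_ref": "rc-2"},
    {"name": "c", "controller_ref": None},  # orphan: no owner at all
]
mine = owned_pods("rc-1", pods)
```

In this model adoption and accidental selector overlap simply cannot happen, since two controllers can never both "match" the same pod.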

@bgrant0607
Member Author

By imperative, I meant operations like "transfer pods from controller X to controller Y".

Adoption requests: #11209
Other scenarios are described above.

@nikhiljindal
Contributor

/sub

@bgrant0607
Member Author

cc @smarterclayton

@soltysh
Contributor

soltysh commented Nov 9, 2015

/sub

@davidopp
Member

davidopp commented Dec 1, 2015

Having a backpointer from pod to controller will greatly simplify scheduler code; consider, for example, the hoops we jump through to determine whether a pod is controlled by a particular RC in CalculateSpreadPriority() and CalculateAntiAffinityPriority() (both in selector_spreading.go). A backpointer to the service would also be nice (it would simplify CalculateSpreadPriority(), which spreads on both RC and service).
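The simplification being described: with a backpointer, counting a controller's pods per node for spreading becomes a single equality check per pod, instead of matching every pod's labels against every controller's selector. A toy sketch (not the scheduler's actual code; field names are illustrative):

```python
from collections import Counter

def pods_per_node(controller, pods):
    """Count this controller's pods on each node via the backpointer alone."""
    return Counter(p["node"] for p in pods if p.get("controller_ref") == controller)

pods = [
    {"name": "a", "node": "n1", "controller_ref": "rc-1"},
    {"name": "b", "node": "n1", "controller_ref": "rc-1"},
    {"name": "c", "node": "n2", "controller_ref": "rc-1"},
    {"name": "d", "node": "n2", "controller_ref": "rc-2"},
]
counts = pods_per_node("rc-1", pods)  # rc-1 has fewer pods on n2, so prefer n2
```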

@smarterclayton
Contributor

Back pointer to service has other undesirable characteristics -- services would have to mutate pods to reference them, which makes the service controller another thing with effective root access to the cluster. Is the benefit high enough?


@bgrant0607
Member Author

Template proposal: #18215

PetSet proposal: #18016

@bgrant0607 bgrant0607 removed this from the next-candidate milestone Apr 28, 2016
@bgrant0607 bgrant0607 removed their assignment Apr 28, 2016
@bgrant0607
Member Author

For 1.3, we're working on:

  • controllerRef
  • cascading deletion
  • finalizers
  • generation/observedGeneration
  • PetSet
  • Templates
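The generation/observedGeneration item in that list is a small but important convention: the API server bumps metadata.generation on each spec change, and the controller records the generation it last acted on in status.observedGeneration, so clients can tell whether reported status is stale. A minimal sketch of the check, using plain dicts in place of API objects:

```python
def status_is_current(obj):
    """True once the controller has observed (acted on) the latest spec."""
    return obj["status"].get("observedGeneration", 0) >= obj["metadata"]["generation"]

deploy = {"metadata": {"generation": 3}, "status": {"observedGeneration": 2}}
# A spec edit bumped generation to 3, but the controller has only seen 2,
# so any status it currently reports still describes the old spec.
```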

@bgrant0607
Member Author

Note that unique label/selector generation, similar to what we do for Job, will require changes to kubectl expose if we want users to be able to do a rolling update on an RC/RS with auto-generated labels.

#17902
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kubernetes-dev/WbqQVNkDZUE/1f5WZK2zCAAJ
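The Job-style approach referenced here stamps a unique, controller-derived label (Job uses its UID under a controller-uid label) into both the selector and the pod template, so each controller selects exactly its own pods without the user inventing labels. A sketch of that generation step, with illustrative helper names:

```python
import uuid

def with_generated_selector(template_labels):
    """Derive a unique selector the way Job does with its controller-uid label."""
    uid = str(uuid.uuid4())
    selector = {"controller-uid": uid}
    # The pod template must carry the generated label too, or the
    # selector would never match the pods created from it.
    pod_labels = dict(template_labels, **selector)
    return selector, pod_labels

selector, pod_labels = with_generated_selector({"app": "batch-worker"})
# selector now matches only pods created from this template, and no others
```

The kubectl expose problem mentioned above follows directly: a user-facing tool has to discover the generated label rather than assume the user-supplied ones.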

@0xmichalis
Contributor

Make controllers aware of namespace termination: #38612

@bgrant0607
Member Author

v1 plan: #42752

cc @kow3ns

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with a /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2018
@soltysh
Contributor

soltysh commented Jan 16, 2018

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 16, 2018
@kow3ns kow3ns added this to Backlog in Workloads Mar 1, 2018