Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for ControllerReference #25256

Merged
merged 1 commit into from Jul 20, 2016
Merged

Conversation

gmarek
Copy link
Contributor

@gmarek gmarek commented May 6, 2016

Proposal for including the reference pointing to the owning "collection" (controller) for objects that can be grouped. The goal is to prevent a situation when two controllers are fighting over some resources.

cc @bgrant0607 @lavalamp @caesarxuchao @davidopp @fgrzadkowski @wojtek-t @kubernetes/sig-api-machinery

@gmarek gmarek added kind/design Categorizes issue or PR as related to design. kind/documentation Categorizes issue or PR as related to documentation. labels May 6, 2016
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels May 6, 2016
@gmarek gmarek assigned lavalamp and unassigned smarterclayton May 6, 2016
@gmarek
Copy link
Contributor Author

gmarek commented May 6, 2016

There's a couple questions that need answers:

  • do we care about determinism between controllers of the same kind
  • what do we want to do with invalid setReferences that don't have corresponding controllerManagers running (because of downgrade, crash or some bug)

@gmarek gmarek force-pushed the proposal branch 2 times, most recently from 208b69b to 4d026d9 Compare May 6, 2016 14:33
```
ObjectMeta {
setReference *SetReference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "set" is a bad name, it sounds like a verb. "collection"? "organizer"? "boss"? "manager"? "director"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overseer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarek's original idea was collection. I suggested set, on the argument that it is no more vague than collection (except to former Borg users) and it contains the suffix we're kinda trying to use for API objects (I say kinda because we did ReplicaSet and DaemonSet, but then fell off the wagon with Job and Deployment). You make a good point that "set" might be interpreted as a verb. I'm fine collection. I don't think the other terms you listed are good, because I think we want to covey that this is the set, not the logic that manages the set (which AIUI is part of the reason we moved away from calling collections *Controller, to calling them *Set).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to CollectionRef.

@lavalamp
Copy link
Member

lavalamp commented May 6, 2016

@smarterclayton I'd like to get an LGTM from you or one of your minions on this before we accept the design.


# Upgrade/downgrade procedure

During system upgrade controllers will adopt objects that match their label selector using ordinary adoption procedure. Adoption, by default, will set *IsParent* to *false,* so the behavior of objects existing before the update won’t change (i.e. deleting owners won't cause cascading deletion - see OwnerReference design).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by default IsParent should set true, so if the user later want to do cascading deletion, it will be possible.

We can keep the backward compatibility by defaulting the DeleteOptions.OrphanDependents to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG.

@gmarek gmarek force-pushed the proposal branch 2 times, most recently from ba08332 to c9e3d4c Compare May 9, 2016 09:26
@gmarek gmarek added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels May 9, 2016
// UID of the referent.
UID types.UID
// Should this CollectionReference be treated as an owner for cascading deletion purposes
Owner bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative to this, we could instead add a CollectionOwner bool field to OwnerRef, making this a special OwnerReference. Did we discuss that approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you want the garbage collector to be responsible for clearing these, then we should really consider making these less different from the things the garbage collector actually looks at.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did not. We were talking about how to do this with @caesarxuchao but run out of time. What you propose is that a collectionRef is always an owner. I'm fine with that, if everyone are (@bgrant0607).

Then we can have collectionRef of an *OwnerRef type as a copy of the value from OwnerReferences. Everything that modifies OwnereReferences would be responsible for keeping collectionRef up to date. Other solution would require iterating over all owner every time we need to get to the "managing collection" (I'm running out of words...), and as it's done on every update it may be quite expensive (@wojtek-t)

@bgrant0607
Copy link
Member

@gmarek The title of this PR is misleading.

It's also not helpful for others to omit the PR description.

- [Implementation plan (sketch)](#implementation-plan-sketch)
- [Considered alternatives](#considered-alternatives)

# Goal of CollectionReference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use the term "Collection" ANYWHERE. Please don't introduce a new term.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a better suggestion? AFAIK we don't have a term for the set of pods belonging to a single controller.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were planning to use "Set" as in SetReference but @lavalamp pointed out that SetReference sounds like Set is being used as a verb.

@gmarek
Copy link
Contributor Author

gmarek commented Jul 1, 2016

@lavalamp PTAL

@gmarek
Copy link
Contributor Author

gmarek commented Jul 7, 2016

Ping @lavalamp

@gmarek
Copy link
Contributor Author

gmarek commented Jul 7, 2016

@smarterclayton - can someone from RH take a look.
@davidopp - can you find some approver for this proposal?

@gmarek gmarek assigned smarterclayton and davidopp and unassigned lavalamp Jul 7, 2016
@davidopp
Copy link
Member

davidopp commented Jul 7, 2016

I think @bgrant0607 or @lavalamp or @caesarxuchao are the best approvers for this. I don't mind doing it but I wasn't involved in all the design discussions so I might not notice something that was omitted.

@caesarxuchao
Copy link
Member

I'll take another pass tonight.

@smarterclayton
Copy link
Contributor

I have no further comments

* there are no `OwnerReferences` with `Controller` set to true in `OwnerReferences` array
* `DeletionTimestamp` is not set
and
* Controller is the oldest controller of a given kind from ones which has a selector that matches labels of adopted object that does not have `DeletionTimestamp` set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remind me why we want this determinism?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testability and general complexity of the system. It's more like philosophy than anything else, but I believe that deterministic behavior is way simpler for people to understand, and various interaction between different components are slightly clearer. Maybe @bgrant0607 can give more reasons.

That being said I agree that the system will work mostly the same from the user's perspective without this assumption.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm asking this because to enforce this rule will make the controller code complex and less efficient based on my experience of refactoring the rc manager. Let's say rc1 and rc2 both have matching selectors, and rc1 is older. If rc2 gets processed by the rm manager's syncHandler before rc1, the only way to prevent rc2 to adopt a pod is checking if there is older matching rc in the system. This is time consuming (if the lookupCache is invalidated) and make the code complex. I'm not convince we should do this for "testability and general complexity". Having multiple controllers matching a pod is still an anti-pattern. We are not obligated to make the system behavior simple to understand if this happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testability implies robustness. Let me think about the implementation next week, if I won't think of something I may remove this requirement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @caesarxuchao here. How about using the pattern @erictune introduced in Jobs, where upon creation we add an additional selector which guarantees pod selection uniqueness? See this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an orthogonal to the issue here and is something that should be discussed in much wider audience and we'd need to overwrite it for RS created by the Deployment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, since #14961, which this proposal addresses mentioned #12298 as one of the topics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I didn't knew about that.

@wojtek-t
Copy link
Member

We are finishing implementation - can we have the proposal finally merged?

k8s-github-robot pushed a commit that referenced this pull request Jul 15, 2016
Automatic merge from submit-queue

[GarbageCollector] Let the RC manager set/remove ControllerRef

What's done:
* RC manager sets Controller Ref when creating new pods
* RC manager sets Controller Ref when adopting pods with matching labels but having no controller
* RC manager clears Controller Ref when pod labels change
* RC manager clears pods' Controller Ref when rc's selector changes
* RC manager stops adoption/creating/deleting pods when rc's DeletionTimestamp is set
* RC manager bumps up ObservedGeneration: The [original code](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replication/replication_controller_utils.go#L36) will do this.
* Integration tests:
  * verifies that changing RC's selector or Pod's Labels triggers adoption/abandoning
* e2e tests (separated to #27151):
  * verifies GC deletes the pods created by RC if DeleteOptions.OrphanDependents=false, and orphans the pods if DeleteOptions.OrphanDependents=true.

TODO:

- [x] we need to be able to select Pods that have a specific ControllerRef. Then each time we sync the RC, we will iterate through all the Pods that has a controllerRef pointing the RC, event if the labels of the Pod doesn't match the selector of RC anymore. This will prevent a Pod from stuck with a stale controllerRef, which could be caused by the race between abandoner (the goroutine that removes controllerRef) and worker the goroutine that add controllerRef to pods).
- [ ] use controllerRef instead of calling `getPodController`. This might be carried out by the control-plane team.
- [ ] according to the controllerRef proposal (#25256): "For debugging purposes we want to add an adoptionTime annotation prefixed with kubernetes.io/ which will keep the time of last controller ownership transfer." This might be carried out by the control-plane team.

cc @lavalamp @gmarek
@gmarek
Copy link
Contributor Author

gmarek commented Jul 18, 2016

@caesarxuchao - removed the 'oldest controller' part.

@soltysh
Copy link
Contributor

soltysh commented Jul 18, 2016

Current version LGTM. Let's get this in and we can iterate further.

@gmarek
Copy link
Contributor Author

gmarek commented Jul 18, 2016

I'll squash when I get home.

@caesarxuchao
Copy link
Member

It seems @soltysh's comment is lost on this thread. I found it in the email notification and @soltysh's suggestions is using server generated selector upon controller creation to make the pod selection unique (code). Does this make adoption impossible?

@gmarek
Copy link
Contributor Author

gmarek commented Jul 19, 2016

Automatic adoption yes. But user always can set the label by hand.

@k8s-bot
Copy link

k8s-bot commented Jul 19, 2016

GCE e2e build/test passed for commit 8e0ff44.

@caesarxuchao caesarxuchao self-assigned this Jul 20, 2016
@caesarxuchao
Copy link
Member

LGTM. Thanks, @gmarek! Since we are already implementing it, let's merge and iterate if there are issues.

@caesarxuchao caesarxuchao added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 20, 2016
@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit b7490d5 into kubernetes:master Jul 20, 2016
@gmarek gmarek deleted the proposal branch August 30, 2016 09:48
xingzhou pushed a commit to xingzhou/kubernetes that referenced this pull request Dec 15, 2016
Automatic merge from submit-queue

Proposal for ControllerReference

Proposal for including the reference pointing to the owning "collection" (controller) for objects that can be grouped. The goal is to prevent a situation when two controllers are fighting over some resources.

cc @bgrant0607 @lavalamp @caesarxuchao @davidopp @fgrzadkowski @wojtek-t @kubernetes/sig-api-machinery
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/design Categorizes issue or PR as related to design. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet