New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for ControllerReference #25256
Conversation
There's a couple questions that need answers:
|
208b69b
to
4d026d9
Compare
``` | ||
ObjectMeta { | ||
… | ||
setReference *SetReference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "set" is a bad name, it sounds like a verb. "collection"? "organizer"? "boss"? "manager"? "director"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overseer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmarek's original idea was collection. I suggested set, on the argument that it is no more vague than collection (except to former Borg users) and it contains the suffix we're kinda trying to use for API objects (I say kinda because we did ReplicaSet and DaemonSet, but then fell off the wagon with Job and Deployment). You make a good point that "set" might be interpreted as a verb. I'm fine collection. I don't think the other terms you listed are good, because I think we want to covey that this is the set, not the logic that manages the set (which AIUI is part of the reason we moved away from calling collections *Controller, to calling them *Set).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to CollectionRef.
@smarterclayton I'd like to get an LGTM from you or one of your minions on this before we accept the design. |
|
||
# Upgrade/downgrade procedure | ||
|
||
During system upgrade controllers will adopt objects that match their label selector using ordinary adoption procedure. Adoption, by default, will set *IsParent* to *false,* so the behavior of objects existing before the update won’t change (i.e. deleting owners won't cause cascading deletion - see OwnerReference design). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think by default IsParent should set true, so if the user later want to do cascading deletion, it will be possible.
We can keep the backward compatibility by defaulting the DeleteOptions.OrphanDependents to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG.
ba08332
to
c9e3d4c
Compare
// UID of the referent. | ||
UID types.UID | ||
// Should this CollectionReference be treated as an owner for cascading deletion purposes | ||
Owner bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an alternative to this, we could instead add a CollectionOwner bool
field to OwnerRef, making this a special OwnerReference. Did we discuss that approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you want the garbage collector to be responsible for clearing these, then we should really consider making these less different from the things the garbage collector actually looks at.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did not. We were talking about how to do this with @caesarxuchao but run out of time. What you propose is that a collectionRef is always an owner. I'm fine with that, if everyone are (@bgrant0607).
Then we can have collectionRef of an *OwnerRef type as a copy of the value from OwnerReferences. Everything that modifies OwnereReferences would be responsible for keeping collectionRef up to date. Other solution would require iterating over all owner every time we need to get to the "managing collection" (I'm running out of words...), and as it's done on every update it may be quite expensive (@wojtek-t)
@gmarek The title of this PR is misleading. It's also not helpful for others to omit the PR description. |
- [Implementation plan (sketch)](#implementation-plan-sketch) | ||
- [Considered alternatives](#considered-alternatives) | ||
|
||
# Goal of CollectionReference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use the term "Collection" ANYWHERE. Please don't introduce a new term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a better suggestion? AFAIK we don't have a term for the set of pods belonging to a single controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were planning to use "Set" as in SetReference but @lavalamp pointed out that SetReference sounds like Set is being used as a verb.
@lavalamp PTAL |
Ping @lavalamp |
@smarterclayton - can someone from RH take a look. |
I think @bgrant0607 or @lavalamp or @caesarxuchao are the best approvers for this. I don't mind doing it but I wasn't involved in all the design discussions so I might not notice something that was omitted. |
I'll take another pass tonight. |
I have no further comments |
* there are no `OwnerReferences` with `Controller` set to true in `OwnerReferences` array | ||
* `DeletionTimestamp` is not set | ||
and | ||
* Controller is the oldest controller of a given kind from ones which has a selector that matches labels of adopted object that does not have `DeletionTimestamp` set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remind me why we want this determinism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testability and general complexity of the system. It's more like philosophy than anything else, but I believe that deterministic behavior is way simpler for people to understand, and various interaction between different components are slightly clearer. Maybe @bgrant0607 can give more reasons.
That being said I agree that the system will work mostly the same from the user's perspective without this assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm asking this because to enforce this rule will make the controller code complex and less efficient based on my experience of refactoring the rc manager. Let's say rc1 and rc2 both have matching selectors, and rc1 is older. If rc2 gets processed by the rm manager's syncHandler before rc1, the only way to prevent rc2 to adopt a pod is checking if there is older matching rc in the system. This is time consuming (if the lookupCache is invalidated) and make the code complex. I'm not convince we should do this for "testability and general complexity". Having multiple controllers matching a pod is still an anti-pattern. We are not obligated to make the system behavior simple to understand if this happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testability implies robustness. Let me think about the implementation next week, if I won't think of something I may remove this requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with @caesarxuchao here. How about using the pattern @erictune introduced in Jobs, where upon creation we add an additional selector which guarantees pod selection uniqueness? See this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's an orthogonal to the issue here and is something that should be discussed in much wider audience and we'd need to overwrite it for RS created by the Deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I didn't knew about that.
We are finishing implementation - can we have the proposal finally merged? |
Automatic merge from submit-queue [GarbageCollector] Let the RC manager set/remove ControllerRef What's done: * RC manager sets Controller Ref when creating new pods * RC manager sets Controller Ref when adopting pods with matching labels but having no controller * RC manager clears Controller Ref when pod labels change * RC manager clears pods' Controller Ref when rc's selector changes * RC manager stops adoption/creating/deleting pods when rc's DeletionTimestamp is set * RC manager bumps up ObservedGeneration: The [original code](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replication/replication_controller_utils.go#L36) will do this. * Integration tests: * verifies that changing RC's selector or Pod's Labels triggers adoption/abandoning * e2e tests (separated to #27151): * verifies GC deletes the pods created by RC if DeleteOptions.OrphanDependents=false, and orphans the pods if DeleteOptions.OrphanDependents=true. TODO: - [x] we need to be able to select Pods that have a specific ControllerRef. Then each time we sync the RC, we will iterate through all the Pods that has a controllerRef pointing the RC, event if the labels of the Pod doesn't match the selector of RC anymore. This will prevent a Pod from stuck with a stale controllerRef, which could be caused by the race between abandoner (the goroutine that removes controllerRef) and worker the goroutine that add controllerRef to pods). - [ ] use controllerRef instead of calling `getPodController`. This might be carried out by the control-plane team. - [ ] according to the controllerRef proposal (#25256): "For debugging purposes we want to add an adoptionTime annotation prefixed with kubernetes.io/ which will keep the time of last controller ownership transfer." This might be carried out by the control-plane team. cc @lavalamp @gmarek
@caesarxuchao - removed the 'oldest controller' part. |
Current version LGTM. Let's get this in and we can iterate further. |
I'll squash when I get home. |
Automatic adoption yes. But user always can set the label by hand. |
GCE e2e build/test passed for commit 8e0ff44. |
LGTM. Thanks, @gmarek! Since we are already implementing it, let's merge and iterate if there are issues. |
Automatic merge from submit-queue |
Automatic merge from submit-queue Proposal for ControllerReference Proposal for including the reference pointing to the owning "collection" (controller) for objects that can be grouped. The goal is to prevent a situation when two controllers are fighting over some resources. cc @bgrant0607 @lavalamp @caesarxuchao @davidopp @fgrzadkowski @wojtek-t @kubernetes/sig-api-machinery
Proposal for including the reference pointing to the owning "collection" (controller) for objects that can be grouped. The goal is to prevent a situation when two controllers are fighting over some resources.
cc @bgrant0607 @lavalamp @caesarxuchao @davidopp @fgrzadkowski @wojtek-t @kubernetes/sig-api-machinery