Proposal for DaemonSet history and rollback #527
Conversation
@erictune @bgrant0607 if we go down this path we'll need to figure out how we could do "safe" pod templates (the original use case). I proposed in the other issue that the isolation could be preserved if a user would have to activate a pod template before it could mint pods (thus the controllers could create them, but only a user could authorize them to instantiate). How concerned are we that using pod templates for this limits the options for future changes along these lines?

#### API
Implement a subresource for DaemonSet history (`daemonsets/foo/history`) that
Is this generic (pod template list, intended for generic clients) or type specific (array of daemonsets, with only some fields preserved)?
Can we move this into a separate proposal since a history subresource is not needed for upgrades and it's something we'll need to do for all the workload APIs?
@smarterclayton something generic if we use PodTemplates to store history
@Kargakis you meant moving the `daemonsets/foo/history` part to a separate proposal, right? Sure.
This section is obsolete?
@smarterclayton @erictune @bgrant0607 My primary concern about PodTemplates actually stems from the ability to mitigate inter-controller history overlap (e.g. a DaemonSet and a StatefulSet with the same name). A History object per controller Kind provides adequate mitigation. My primary concern about a per-controller-Kind History object is that third-party operators or controllers have no way to integrate with it at the moment; however, TPRs and subresources are currently insufficient for this. Changing the method of persistence really doesn't impact the overall design much (provided we do so prior to implementation). If we use a per-Kind History object we could make the following claim about controllers.
// +optional
UpdatedAnnotations map[string]string `json:"updatedAnnotations,omitempty"`
// The config of this DaemonSet rollback.
RollbackTo RollbackConfig `json:"rollbackTo"`
This is the same struct used by Deployments, right?
Right
Yes just reuse RollbackConfig
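For reference, a minimal sketch of the reused types, based on the extensions/v1beta1 Deployment API (field layout here follows the quoted diff above; the surrounding `DaemonSetRollback` fields not shown in the diff are assumptions):

```go
package main

import "fmt"

// RollbackConfig mirrors the struct already used by Deployments
// (extensions/v1beta1); the DaemonSet proposal reuses it as-is.
type RollbackConfig struct {
	// The revision to roll back to. If set to 0, roll back to the last revision.
	Revision int64 `json:"revision,omitempty"`
}

// DaemonSetRollback is a sketch of the rollback request object,
// analogous to DeploymentRollback.
type DaemonSetRollback struct {
	Name               string            `json:"name"`
	UpdatedAnnotations map[string]string `json:"updatedAnnotations,omitempty"`
	RollbackTo         RollbackConfig    `json:"rollbackTo"`
}

func main() {
	rb := DaemonSetRollback{Name: "foo", RollbackTo: RollbackConfig{Revision: 3}}
	fmt.Println(rb.RollbackTo.Revision) // 3
}
```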
seen as old pods.
- Find existing PodTemplates owned by DaemonSet, and sort them by the value
of `pod-template-generation` label
- Clean up PodTemplates based on `.spec.revisionHistoryLimit`
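The quoted steps can be sketched roughly as follows, using simplified, assumed types (the actual controller operates on API objects, and per the discussion below the cleanup would run as the last step of the sync loop):

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// PodTemplate is a simplified stand-in for the API object.
type PodTemplate struct {
	Name   string
	Labels map[string]string
}

// generation reads the pod-template-generation label (0 if missing/invalid).
func generation(t PodTemplate) int {
	g, _ := strconv.Atoi(t.Labels["pod-template-generation"])
	return g
}

// templatesToDelete sorts templates by generation and returns the oldest
// ones beyond the revision history limit.
func templatesToDelete(templates []PodTemplate, limit int) []PodTemplate {
	sort.Slice(templates, func(i, j int) bool {
		return generation(templates[i]) < generation(templates[j])
	})
	if len(templates) <= limit {
		return nil
	}
	return templates[:len(templates)-limit]
}

func main() {
	ts := []PodTemplate{
		{"t3", map[string]string{"pod-template-generation": "3"}},
		{"t1", map[string]string{"pod-template-generation": "1"}},
		{"t2", map[string]string{"pod-template-generation": "2"}},
	}
	for _, t := range templatesToDelete(ts, 1) {
		fmt.Println(t.Name) // t1, then t2
	}
}
```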
I think cleanup should be the last step
I don't think there's much difference, but sure, we can move it to be the last step.
On a side note, we don't need to keep history for `OnDelete` DaemonSets, right?
Right
On a side note, we don't need to keep history for OnDelete DaemonSets, right?
It seems that people can use OnDelete the same way Paused can be used for Deployments; what is our stance on that?
`OnDelete` is mainly for backward compatibility. Without history, it's still backward compatible. If we decide to maintain history for `OnDelete`, this history will be available when switching to `RollingUpdate`. The logic for `OnDelete` would be similar to `RollingUpdate`, except for the pod killing / creating part.
I'm not opposed to keeping history for `OnDelete`, but I'm just not sure who is really using `OnDelete` to update their DaemonSets and wants to keep the history.
@Kargakis another question: do people generally use `pause` before triggering a rollout (your example), or in the middle of it? Will this usage be different in Deployments vs. DaemonSets?
In your example, there will be 3 more PodTemplates created for OnDelete. Is that expected? In this case, it seems that a single history with all three changes makes more sense (since you want to trigger all 3 changes at once).
No new PodTemplate should be generated in the OnDelete case. At least, when a Deployment is paused, we won't create a new ReplicaSet for it.
@Kargakis another question, do people generally use pause before triggering a rollout (your example), or in the middle of it?
Based on feedback from others and on my own usage, the most common flow is the one I explained above: use `pause` before triggering. Ideally, for manual deployments (e.g. prod rollouts), I think we want to run one-off rollouts and always run with `paused=true`. Pausing in the middle or at any point in the rollout is auto-pause (kubernetes/kubernetes#11505), and that ability solves many other use cases (canaries, A/B, hooks).
Will this usage be different in Deployments vs. DaemonSet?
IMO, the flow described above applies equally to DaemonSets and StatefulSets. Auto-pause will probably differ slightly.
In your example, there will be 3 more PodTemplates created for OnDelete. Is that expected? In this case, it seems that a single history with all three changes makes more sense (since you want to trigger all 3 changes at once).
No new PodTemplate should be generated in the OnDelete case. At least, when a Deployment is paused, we won't create a new ReplicaSet for it.
If we don't create history for `OnDelete`, a history with all three changes will be created when switching to `RollingUpdate`. If we create history for `OnDelete`, there should be one history created each time templateGeneration is updated. So you agree on not creating history for `OnDelete`? Or do you have other thoughts?
Yes, I think `OnDelete` should not create templates, but we should probably allow cleanup of older templates, if we can delete any.
On second thought, we still need history for OnDelete; otherwise, once an OnDelete DS is switched to RollingUpdate, all its pods will be killed and recreated. We also need history to compare hashes.
- Remove PodTemplates from the ones with the smallest `pod-template-generation`
to the highest, until `.spec.revisionHistoryLimit` is hit or no
PodTemplates are allowed to be removed
- Create a PodTemplate based on DaemonSet
Maybe we should first check if it's not a "rollback by update": a user may update image x to y, then y to x.
I think we should first check all PodTemplates and, if we find a matching one, use it instead of creating a new one.
The `pod-template-generation` label should be updated then; also, for all pods which are still using this template, we should update their label.
Basically it's a normal rollback.
This proposal doesn't differentiate rollback by update vs. by calling /rollback.
Relabeling a pod is already included (see L221).
The reason for duplicating PodTemplates is to keep track of the templateGeneration vs. PodSpec mapping. For example, let's say the current templateGeneration is 5, v1 = v3 = v5, and there are existing v1 and v3 pods. If we reuse PodTemplates, we only know that v3 = v5 (or v1 = v5, depending on the current label of the PodTemplate), and will have to kill the v1 (or v3) pods. Alternatively, we can use a revision annotation (instead of relabeling Pods and PodTemplates) to solve this, but templateGeneration + revision may be confusing since they have similar meanings (judging from the names).
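The mapping problem described above can be made concrete with a tiny worked example (the specs "A"/"B"/"C" are hypothetical):

```go
package main

import "fmt"

// Worked example of the scenario above: templateGenerations 1, 3 and 5
// all point at the same (hypothetical) PodSpec "A".
var specOf = map[int]string{1: "A", 2: "B", 3: "A", 4: "C", 5: "A"}

// sameSpecAsCurrent reports whether pods of the given generation already
// run the current (generation 5) spec and could be adopted via relabeling.
func sameSpecAsCurrent(gen int) bool {
	return specOf[gen] == specOf[5]
}

func main() {
	// With duplicated PodTemplates the controller keeps the full
	// generation -> spec mapping, so both v1 and v3 pods are recognized
	// as current, while v2 pods are not:
	fmt.Println(sameSpecAsCurrent(1), sameSpecAsCurrent(3), sameSpecAsCurrent(2))
	// A deduplicated PodTemplate labeled only with generation 5 loses
	// the fact that generations 1 and 3 were served by the same spec,
	// so pods of one of those generations would be killed needlessly.
}
```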
Ok, I'm convinced :) I'm not sure if this approach is optimal but for sure it's much safer.
Why can't we deduplicate and keep a label in the latest PodTemplate with all the templateGenerations it has served? We already do that in Deployments for the revision annotation.
Yes, we can deduplicate, and then we need a way to remember a PodTemplate's previous templateGenerations.
One way is to use revision annotations, like I mentioned above.
Another way is to use a label or annotation, say something like `previous-template-generation="1,3,5"`, and then append more generations to the new PodTemplate after a rollout (`previous-template-generation="1,3,5,7"`). It may eventually become too long, so we'd want to clean it up when no pods with that generation exist anymore. We'll also need to parse the string and search for the generation of each pod to see if it's from this PodTemplate.
I thought about these alternatives, and I prefer duplicating PodTemplates because it's simpler.
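The label-based alternative described above could be sketched like this (the `previous-template-generation` label name comes from the comment; the helper names are illustrative):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// appendGeneration records a newly served generation on the template's
// previous-template-generation label value.
func appendGeneration(label string, gen int) string {
	if label == "" {
		return strconv.Itoa(gen)
	}
	return label + "," + strconv.Itoa(gen)
}

// servesGeneration reports whether the label lists the given generation,
// i.e. whether a pod with that generation came from this PodTemplate.
func servesGeneration(label string, gen int) bool {
	for _, s := range strings.Split(label, ",") {
		if g, err := strconv.Atoi(s); err == nil && g == gen {
			return true
		}
	}
	return false
}

func main() {
	label := "1,3,5"
	label = appendGeneration(label, 7) // after a rollout: "1,3,5,7"
	fmt.Println(label, servesGeneration(label, 3), servesGeneration(label, 2))
}
```

As the comment notes, the string grows with every rollout and must be parsed for every pod, which is part of why duplicating PodTemplates was preferred.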
Documenting something brought up in the sig-apps meeting today:
- If we reuse PodTemplates, we need to relabel them with a different templateGeneration when rolling back, and the name of the PodTemplate will differ from its label (regarding the templateGeneration number) -- possible user confusion.
- If we duplicate PodTemplates, it's hard for Deployment-like controllers to follow this pattern.
Will this approach work:
- create a new PodTemplate
- start deleting old pods; if an old pod's PodTemplate == the new PodTemplate, don't delete it, just relabel it
- if an old pod's PodTemplate != the new PodTemplate, delete the pod

PS: Is there anything more which needs to be discussed before implementation?
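The approach proposed above might look roughly like this (simplified, assumed types; the real controller compares API template objects rather than strings):

```go
package main

import "fmt"

// Pod is a simplified stand-in for the API object.
type Pod struct {
	Name     string
	Template string // content of the template the pod was created from
	Labels   map[string]string
}

// reconcile adopts old pods whose template content matches the new
// PodTemplate by relabeling them, and marks the rest for deletion.
func reconcile(pods []Pod, newTemplate, newGeneration string) (relabeled, deleted []string) {
	for i := range pods {
		if pods[i].Template == newTemplate {
			pods[i].Labels["pod-template-generation"] = newGeneration
			relabeled = append(relabeled, pods[i].Name)
		} else {
			deleted = append(deleted, pods[i].Name)
		}
	}
	return
}

func main() {
	pods := []Pod{
		{"a", "specA", map[string]string{}},
		{"b", "specB", map[string]string{}},
	}
	r, d := reconcile(pods, "specA", "7")
	fmt.Println(r, d) // [a] [b]
}
```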
After agreeing on continuing with hashing and the avoidance mechanism for Deployments, are we going to switch DaemonSets to use hashing too? If we want to switch then it'll be much easier to do it now rather than in the future. Hashing also would obsolete the need to relabel pods when we adopt them as part of a rollback.
Yeah. I agree. We should deprecate TemplateGeneration and start using hashing.
Yep switching to hash and deprecating templateGeneration is the plan. Will update this proposal when the other "history resource" proposal is out
From afar I wonder if this belongs in the ecosystem, or at least whether some portion of it belongs outside of the Kubernetes core?
@jayunit100 based on the experience we have so far with Deployments (and DeploymentConfigs in OpenShift), providing sensible update strategies works for most use cases. Extensible strategies are tracked in kubernetes/kubernetes#14510 + kubernetes/kubernetes#39337
I have one more question. What should we do if a user deletes a PodTemplate, either the current one or one from history?
Not sure, but at first thought I don't think it's necessary. The only thing that we could do is recreate the current PodTemplate if a delete event comes down the watch, but apart from that, not much else. We should definitely list them to clean them up regularly.
@lukaszo thanks for bringing this up. We should always create a current/new history that matches the current DS template if it doesn't exist yet (and if it's a rolling-update DS). This will be done in the sync loop (list all existing history and compare), not in a watch. I don't think we need to watch for history deletions/updates yet, and if users mess with the history it's considered user error.
Why not make history read-only?
Yes, the plan is to make the history object read-only. The previous comment answers the question of what to do if users can change history. We can make the history read-only and disallow users from deleting it (they can only delete history implicitly via changing the cleanup policy). Even if we choose not to disallow them, we'll only recreate the current history (not via watch, but from the normal sync loop).
Where can I read more about this plan? And what is a history object?
The proposal that handles history is #594
This proposal needs to be updated. Controller history (#594) will only replace the …
Updated the proposal to deprecate templateGeneration (use hash instead), replace PodTemplate with controller history, and add a hash collision avoidance mechanism. Note that this proposal is not finalized until controller history is finalized; it needs to be consistent with other controller history. Things that need to be considered include (but are not limited to):
// DefaultDaemonSetUniqueLabelKey is the default label key that is added
// to existing DaemonSet pods to distinguish between old and new
// DaemonSet pods during DaemonSet template updates.
DefaultDaemonSetUniqueLabelKey string = "pod-template-hash"
Now that's why I was asking for making this namespaced. This collides with the label added in Deployments pods with the potential for cross-controller overlaps.
This is not final. #594 proposed hashing `AnyState.Data`. If we decide to hash data instead of template, this won't be a proper name. If all controllers (except for Deployments) will use `AnyStates`, we can just use `AnyState.Name` as the label value, which is guaranteed to be unique.
we can just use AnyState.Name as the label value which is guaranteed to be unique.
How is this guaranteed?
You can't have two AnyStates with the same name
I mean how do we guarantee that a history object for a DS won't ask for the same name as a history object owned by a StatefulSet.
I mean how do we guarantee that a history object for a DS won't ask for the same name as a history object owned by a StatefulSet.
Each history has an OwnerRef and labels adopted from the controller's selector. I discussed this with @kow3ns yesterday; the label key will be consistent (format-wise) across controllers, something like "statefulset-history-revision" and "daemonset-history-revision".
Should all controllers follow the same "copying labels/annotations" pattern? If so, #594 needs to include this.
#594 proposed fnv and SHA-2. Since #594 proposed hashing data, we should include the uniquifier in the data.
I'm +1 for server side too, but there are some debates about it. The main concern is that the controller is modifying its own spec.
/cc @skriss
@timothysc: GitHub didn't allow me to request PR reviews from the following users: skriss. Note that only people with write access to kubernetes/community can review this PR. In response to this:
I don't feel strongly about this, but people seem to expect that metadata such as labels and annotations is inherited.
I don't think we need a cryptographic algorithm for hashing; fnv is faster while still being stable enough for what we need.
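The fnv-based hashing with a collision-avoidance uniquifier, as discussed above, could be sketched like this (`computeHash` is an illustrative helper, not the actual controller function; the real implementation hashes the serialized template):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// computeHash sketches the pod-template-hash computation: fnv-1a over the
// serialized template, mixing in a uniquifier (collision count) so that a
// hash collision can be escaped by bumping the uniquifier.
func computeHash(template string, uniquifier int64) string {
	h := fnv.New32a()
	h.Write([]byte(template))
	if uniquifier > 0 {
		fmt.Fprintf(h, "%d", uniquifier)
	}
	return fmt.Sprintf("%d", h.Sum32())
}

func main() {
	fmt.Println(computeHash("spec-v1", 0))
	fmt.Println(computeHash("spec-v1", 1)) // bump the uniquifier on collision
}
```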
I can see the argument, but I am on the fence for this one. I like the way rollbackTo works, but I agree that mutating spec is not good. I think we can have a server-side implementation w/o the need to mutate spec.
DaemonSet controller history is ready for review: kubernetes/kubernetes#45924. In this PR, I have the server-side rollback ready regarding the API changes and the /rollback endpoint, but we may need to do the rollback client-side, considering the "controller is modifying its own spec" drawback. The kubectl rollback PR is here (but not ready for review yet): kubernetes/kubernetes#46144
Automatic merge from submit-queue Implement Daemonset history ~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged) Ref kubernetes/community#527 and kubernetes/community#594 @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis --- TODOs: - [x] API changes - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback - [x] deployment controller - [x] controller revision - [x] owner ref (claim & adoption) - [x] history reconstruct (put revision number, hash collision avoidance) - [x] de-dup history and relabel pods - [x] compare ds template with history - [x] hash labels (put it in controller revision, pods, and maybe deployment) - [x] clean up old history - [x] Rename status.uniquifier when we reach consensus in #44774 - [x] e2e tests - [x] unit tests - [x] daemoncontroller_test.go - [x] update_test.go - [x] ~(maybe) storage_test.go // if we do server side rollback~ kubectl part is in #46144 --- **Release note**: ```release-note ```
Automatic merge from submit-queue (batch tested with PRs 45871, 46498, 46729, 46144, 46804) Implement kubectl rollout undo and history for DaemonSet ~Depends on #45924, only the 2nd commit needs review~ (merged) Ref kubernetes/community#527 TODOs: - [x] kubectl rollout history - [x] sort controller history, print overview (with revision number and change cause) - [x] print detail view (content of a history) - [x] print template - [x] ~(do we need to?) print labels and annotations~ - [x] kubectl rollout undo: - [x] list controller history, figure out which revision to rollback to - if toRevision == 0, rollback to the latest revision, otherwise choose the history with matching revision - [x] update the ds using the history to rollback to - [x] replace the ds template with history's - [x] ~(do we need to?) replace the ds labels and annotations with history's~ - [x] test-cmd.sh @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis @kubernetes/sig-cli-maintainers --- **Release note**: ```release-note ```
Updated proposal. PTAL
If no one objects by the end of this week, I'm going to merge this. This is already implemented.
Proposal for DaemonSet history and rollback
In summary:
An alternative is to create a DaemonSetHistory resource, and/or use template hash instead of template generation.
@erictune @kow3ns @foxish @enisoc @Kargakis @lukaszo @smarterclayton @bgrant0607 @kubernetes/sig-apps-api-reviews