DaemonSet updates - take 2 #41116
Conversation
@kubernetes/sig-apps-pr-reviews
pkg/apis/extensions/types.go
// available (ready for at least minReadySeconds)
NumberAvailable int32

// NumberUnavailable is the number of nodes that should be running the
FWIW, for Deployments, this is the number of pods running but not considered ready/available.
kubernetes/pkg/controller/deployment/sync.go
Line 586 in d953402
unavailableReplicas := totalReplicas - availableReplicas
That means the same thing as !(running && available), as this comment described, right?
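To make the exchange above concrete, here is a minimal sketch (not the actual controller code; `podState` and `unavailableCount` are illustrative stand-ins) showing that `total - available` counts both pods that are not running at all and pods that are running but not yet available:

```go
package main

import "fmt"

// podState is a hypothetical stand-in for a pod's observed state.
type podState struct{ running, available bool }

// unavailableCount mirrors unavailableReplicas := totalReplicas - availableReplicas.
func unavailableCount(pods []podState) int {
	available := 0
	for _, p := range pods {
		if p.available {
			available++
		}
	}
	return len(pods) - available
}

func main() {
	pods := []podState{
		{running: true, available: true},   // counted as available
		{running: true, available: false},  // running but not ready: unavailable
		{running: false, available: false}, // not running at all: unavailable
	}
	fmt.Println(unavailableCount(pods)) // prints 2
}
```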
@@ -219,6 +225,73 @@ func Convert_v1beta1_RollingUpdateDeployment_To_extensions_RollingUpdateDeployme
	return nil
}

func Convert_extensions_DaemonSetSpec_To_v1beta1_DaemonSetSpec(in *extensions.DaemonSetSpec, out *DaemonSetSpec, s conversion.Scope) error {
Why do we need custom conversions?
Hmm, I had some errors with validation earlier and thought it would help. It turns out that only Convert_extensions_RollingUpdateDaemonSet_To_v1beta1_RollingUpdateDaemonSet and Convert_v1beta1_RollingUpdateDaemonSet_To_extensions_RollingUpdateDaemonSet are required.
	}
	allErrs = append(allErrs, ValidateRollingUpdateDaemonSet(strategy.RollingUpdate, fldPath.Child("rollingUpdate"))...)
} else {
	allErrs = append(allErrs, field.Invalid(fldPath.Child("type"), strategy.Type, "RollingUpdate and OnDelete are only supported types"))
use field.NotSupported
done
func ValidateDaemonSetUpdateStrategy(strategy *extensions.DaemonSetUpdateStrategy, fldPath *field.Path) field.ErrorList {
	allErrs := field.ErrorList{}
	if strategy.Type == extensions.OnDeleteDaemonSetStrategyType {
nit: make this a switch
done
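The switch shape suggested above can be sketched like this (simplified stand-in types, not the Kubernetes validation API; the error strings are illustrative):

```go
package main

import "fmt"

type strategyType string

const (
	onDelete          strategyType = "OnDelete"
	rollingUpdateType strategyType = "RollingUpdate"
)

// validateStrategy shows the switch form: each known type gets its own
// case, and the default case rejects anything unsupported.
func validateStrategy(t strategyType) []string {
	var errs []string
	switch t {
	case onDelete:
		// nothing more to validate for OnDelete
	case rollingUpdateType:
		// rollingUpdate parameters (e.g. maxUnavailable) would be validated here
	default:
		errs = append(errs, fmt.Sprintf("unsupported strategy type %q", t))
	}
	return errs
}

func main() {
	fmt.Println(len(validateStrategy(onDelete)), len(validateStrategy("Recreate"))) // 0 1
}
```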
@@ -548,10 +560,12 @@ func (dsc *DaemonSetsController) manage(ds *extensions.DaemonSet) error {
	glog.V(4).Infof("Nodes needing daemon pods for daemon set %s: %+v, creating %d", ds.Name, nodesNeedingDaemonPods, createDiff)
	createWait := sync.WaitGroup{}
	createWait.Add(createDiff)
	dsPodTemplateSpecHash := podutil.GetPodTemplateSpecHashFnv(ds.Spec.Template)
Add a TODO to this helper to be moved in a more generic location.
done
		return true
	}
} else {
	// XXX: Hash does not exist. It's an old pod. For now returning true.
In such a case I think you should compare (apimachinery.Semantic.DeepEqual) the ds pod template spec with the pod spec AND the labels and annotations found in the metadata of the ds podtemplate vs pod metadata.
Why can't we have a helper that will do the Pod->PodTemplateSpec conversion?
badPods := []*v1.Pod{}
goodPods := []*v1.Pod{}
for _, pod := range pods {
	if pod.Status.Phase == v1.PodRunning {
Terminating pods are still running pods, right?
Also we should probably use IsPodReady (uses Conditions) in favor of Phase.
done
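The distinction behind both comments can be sketched with simplified stand-in types (not the v1 API): readiness comes from the Ready condition, not from Phase, and a terminating pod can still be in the Running phase.

```go
package main

import "fmt"

// fakePod is an illustrative stand-in for v1.Pod.
type fakePod struct {
	phase      string
	conditions map[string]bool // condition type -> status
	deleting   bool            // DeletionTimestamp != nil
}

// isPodReady approximates podutil.IsPodReady: look at the Ready
// condition rather than the phase.
func isPodReady(p fakePod) bool {
	return p.conditions["Ready"]
}

func main() {
	terminating := fakePod{phase: "Running", conditions: map[string]bool{"Ready": false}, deleting: true}
	fmt.Println(terminating.phase == "Running", isPodReady(terminating)) // true false
}
```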
func GetNumUnavailable(ds *extensions.DaemonSet, newPods, oldPods []*v1.Pod) int {
	unavailable := int(ds.Status.DesiredNumberScheduled) - len(newPods) - len(oldPods)
	for _, pod := range newPods {
		if pod.Status.Phase != v1.PodRunning {
You need to use IsPodAvailable and provide MinReadySeconds from the ds spec
Also pods that are marked for deletion should count as unavailable.
done
pkg/controller/daemon/update.go
glog.V(4).Infof("Marking all bad old pods for deletion")
for _, pod := range badPods {
	glog.V(4).Infof("Marking pod %s/%s for deletion", ds.Name, pod.Name)
	podsToDelete = append(podsToDelete, pod.Name)
We shouldn't retry the deletion if the pod is already marked for deletion?
I'm not sure. manage doesn't check it. Also, RollingUpdate shouldn't be executed until all expected deletions are complete.
Pods won't be deleted immediately after Delete returns; they may be terminating, depending on their termination grace period.
@@ -716,6 +744,17 @@ func (dsc *DaemonSetsController) syncDaemonSet(key string) error {
	}
}

dsNeedsSync = dsc.expectations.SatisfiedExpectations(dsKey)
Already called above. Although expectations never worked as expected and we should probably get rid of them from this controller too.
manage can set new expectations. I want to run updates only if manage didn't change anything.
}

func GetNumUnavailable(ds *extensions.DaemonSet, newPods, oldPods []*v1.Pod) int {
	unavailable := int(ds.Status.DesiredNumberScheduled) - len(newPods) - len(oldPods)
DesiredNumberScheduled may not be synced when you call GetNumUnavailable. Generally, utilities for constructing the status of an object should not treat fields from the status object as correct. Instead, try to identify what DesiredNumberScheduled really is.
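The reviewer's point can be sketched as: recompute the desired number from the current node list instead of trusting the possibly stale status field. Simplified stand-in types; the real check involves the node selector and scheduling predicates.

```go
package main

import "fmt"

// fakeNode is illustrative: shouldRun stands in for "the scheduling
// predicates say a daemon pod should run on this node".
type fakeNode struct {
	name      string
	shouldRun bool
}

// desiredNumberScheduled derives the desired count from the node list
// itself rather than reading ds.Status.DesiredNumberScheduled.
func desiredNumberScheduled(nodes []fakeNode) int {
	n := 0
	for _, nd := range nodes {
		if nd.shouldRun {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(desiredNumberScheduled([]fakeNode{{"a", true}, {"b", true}, {"c", false}})) // 2
}
```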
pkg/controller/daemon/update.go
if err != nil {
	return fmt.Errorf("couldn't get list of nodes during rolling update of daemon set %#v: %v", ds, err)
}
maxUnavailable, err := intstrutil.GetValueFromIntOrPercent(ds.Spec.UpdateStrategy.RollingUpdate.MaxUnavailable, len(nodeList), true)
There is a helper for this in the deployment utilities.
This one: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/util/deployment_util.go#L417 ? I cannot use it.
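For context, the semantics of GetValueFromIntOrPercent being discussed can be sketched with a simplified stand-in (valueFromIntOrPercent here is illustrative, not the real intstr API): resolve maxUnavailable from either an absolute integer or a percentage of the node count, rounding a percentage up when roundUp is true, as in the PR.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// valueFromIntOrPercent resolves a value like "2" or "25%" against a
// total, rounding percentages up or down per roundUp.
func valueFromIntOrPercent(v string, total int, roundUp bool) (int, error) {
	if strings.HasSuffix(v, "%") {
		p, err := strconv.Atoi(strings.TrimSuffix(v, "%"))
		if err != nil {
			return 0, err
		}
		f := float64(p) * float64(total) / 100.0
		if roundUp {
			return int(math.Ceil(f)), nil
		}
		return int(math.Floor(f)), nil
	}
	return strconv.Atoi(v)
}

func main() {
	n, _ := valueFromIntOrPercent("25%", 10, true)
	fmt.Println(n) // 3: ceil(2.5)
	n, _ = valueFromIntOrPercent("2", 10, true)
	fmt.Println(n) // 2: absolute value passes through
}
```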
	return utilerrors.NewAggregate(errors)
}

func (dsc *DaemonSetsController) syncNodes(ds *extensions.DaemonSet, podsToDelete, nodesNeedingDaemonPods []string) []error {
godoc what syncNodes is supposed to do.
done
pkg/controller/daemon/update.go
podutil "k8s.io/kubernetes/pkg/controller/deployment/util"
)

func (dsc *DaemonSetsController) rollingUpdate(ds *extensions.DaemonSet) error {
godoc what rollingUpdate is doing exactly. It seems that it simply deletes pods if possible, and then manage will make sure the new pods spin up?
Yes, manage will create new pods but I'm not sure if this is a correct approach. I need to run more tests.
Add a comment to describe that? // rollingUpdate does such and such...
done
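The control flow described in this exchange can be sketched as follows (illustrative stand-in, not the controller's actual code): rollingUpdate only deletes old pods within the unavailability budget, and the subsequent manage pass recreates pods on nodes that lost theirs.

```go
package main

import "fmt"

// oldPodsToDelete picks old pods to delete without exceeding
// maxUnavailable; deleting a pod immediately counts it as unavailable.
func oldPodsToDelete(oldPods []string, numUnavailable, maxUnavailable int) []string {
	var out []string
	for _, p := range oldPods {
		if numUnavailable >= maxUnavailable {
			break // respect the unavailability budget
		}
		out = append(out, p)
		numUnavailable++
	}
	return out
}

func main() {
	fmt.Println(oldPodsToDelete([]string{"p1", "p2", "p3"}, 0, 2)) // [p1 p2]
}
```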
pkg/controller/daemon/update.go
}

func (dsc *DaemonSetsController) getAllDaemonSetPods(ds *extensions.DaemonSet) ([]*v1.Pod, []*v1.Pod, error) {
	newPods := []*v1.Pod{}
Use var newPods []*v1.Pod instead, consistent with the conventional way of defining stuff in the codebase (it also avoids an extra allocation).
done
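A tiny illustration of the style nit: a nil slice declared with var appends exactly like an empty literal, with no allocation until the first append.

```go
package main

import "fmt"

func main() {
	var newPods []string // nil slice: no allocation yet
	fmt.Println(newPods == nil) // true
	newPods = append(newPods, "pod-a")
	fmt.Println(len(newPods), newPods == nil) // 1 false
}
```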
	updated = true
}
if ready && updated {
	numberAvailable++
This is not what available should be. Available is the # of pods (old or new) that are ready for at least minReadySeconds.
.status.numberAvailable = the total number of DaemonSet pods that have become Ready for MinReadySeconds
.status.numberUnavailable = .status.desiredNumberScheduled - .status.numberAvailable
done
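The status arithmetic agreed above can be sketched with simplified inputs (illustrative stand-in, not the controller code): numberAvailable counts all pods, old or new, available for at least minReadySeconds, and numberUnavailable is the remainder of desiredNumberScheduled.

```go
package main

import "fmt"

// statusCounts derives both status fields from a desired count and a
// per-pod availability flag (already incorporating minReadySeconds).
func statusCounts(desiredNumberScheduled int, available []bool) (numberAvailable, numberUnavailable int) {
	for _, a := range available {
		if a {
			numberAvailable++
		}
	}
	return numberAvailable, desiredNumberScheduled - numberAvailable
}

func main() {
	a, u := statusCounts(5, []bool{true, true, false, true})
	fmt.Println(a, u) // 3 2
}
```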
cc @kubernetes/mirantis
Can you paste the errors you are getting? We should be good with generated conversions for DS afaik.
We need more e2e tests, and kubectl rollout status should probably respect strategy params and quit as soon as the minimum available pods are updated. Fine with a follow-up. lgtm
@k8s-bot non-cri e2e test this
@k8s-bot gce etcd3 e2e test this
Nothing changed, just rebased and updated the generated part.
[APPROVALNOTIFIER] This PR is NOT APPROVED. The following people have approved this PR: janetkuo, k8s-merge-robot, lukaszo. Needs approval from an approver in each of these OWNERS files:
We suggest the following people:
pkg/apis/extensions/v1beta1/types.go
@@ -508,8 +507,20 @@ type DaemonSet struct {
	// More info: http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#spec-and-status
	// +optional
	Status DaemonSetStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`

	// A sequence number representing a specific generation of the template.
	// Populated by the system. Read-only.
This is not entirely true since users can populate it at creation time, right?
I will prepare a follow-up PR with a fix.
}
available := false
for _, pod := range daemonPods {
	if v1.IsPodAvailable(pod, ds.Spec.MinReadySeconds, metav1.Now()) {
getNodesToDaemonPods does not filter out terminating pods, and I don't think IsPodAvailable filters them out either. You should consider them as unavailable, so before this check you probably want to continue on DeletionTimestamp != nil.
P2 because it has been rebased a couple of times already and needs to land sooner than most of the PRs in the queue.
@k8s-bot unit test this
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)
kubectl: respect DaemonSet strategy parameters for rollout status
It handles "after-merge" comments from kubernetes#41116. cc @Kargakis @janetkuo
I will add one more e2e test later. I need to handle some in-company stuff.
It implements https://github.com/kubernetes/community/blob/master/contributors/design-proposals/daemonset-update.md
Feature kubernetes/enhancements#124
#15310
cc @Kargakis @janetkuo @mikedanese