
Set reason and message on Pod during nodecontroller eviction #36017

Merged
merged 3 commits on Nov 4, 2016

Conversation

foxish (Contributor) commented Nov 1, 2016

What this PR does / why we need it: Pods which are evicted by the node controller due to network partition or an unresponsive kubelet should be differentiated from terminations initiated by other sources. The reason/message are consumed by kubectl to provide a better summary in get/describe.

Which issue this PR fixes (optional, in fixes #<issue number>(, #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #35725

Release note:

Pods that are terminating due to eviction by the node controller (typically due to an unresponsive kubelet or network partition) now surface in `kubectl get` output as being in state "Unknown", along with a longer description in `kubectl describe` output.
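
For context, a minimal sketch of what such a status update could look like; the "NodeLost" reason string, the message text, and the clientset call shape are assumptions about the 1.5-era API, not code copied from this PR:

// Hypothetical sketch: mark a pod as lost before the node controller deletes it,
// so that kubectl get/describe can surface the eviction. The "NodeLost" reason
// string and the message wording are assumptions; kubeClient is assumed to be
// the internal clientset of that era.
func setPodTerminationReason(kubeClient clientset.Interface, pod *api.Pod, nodeName string) (*api.Pod, error) {
    if pod.Status.Reason == "NodeLost" {
        // Already marked; nothing to do.
        return pod, nil
    }
    pod.Status.Reason = "NodeLost"
    pod.Status.Message = fmt.Sprintf("Node %v which was running pod %v is unresponsive", nodeName, pod.Name)

    // Persist the status so it is visible until the pod is actually deleted.
    updatedPod, err := kubeClient.Core().Pods(pod.Namespace).UpdateStatus(pod)
    if err != nil {
        return nil, err
    }
    return updatedPod, nil
}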


@foxish foxish added the release-note and do-not-merge labels Nov 1, 2016
@foxish foxish added this to the v1.5 milestone Nov 1, 2016
@foxish foxish self-assigned this Nov 1, 2016
@k8s-github-robot k8s-github-robot added the size/XXL label Nov 1, 2016
@foxish foxish force-pushed the kubectl-new-2 branch 2 times, most recently from d0ee9d9 to c1ce04b on November 2, 2016 03:12
@k8s-github-robot k8s-github-robot added the size/M label and removed the size/XXL label Nov 2, 2016
@foxish foxish assigned smarterclayton, gmarek and mengqiy and unassigned foxish Nov 2, 2016
foxish (Contributor, Author) commented Nov 2, 2016

/cc @kubernetes/sig-apps @erictune @janetkuo

@foxish foxish removed the do-not-merge label Nov 2, 2016
soltysh (Contributor) left a comment

One nit, otherwise LGTM.

@@ -70,6 +71,14 @@ func deletePods(kubeClient clientset.Interface, recorder record.EventRecorder, n
continue
}

// Set reason and message in the pod object.
soltysh (Contributor) commented:
It would be nice to have a test case covering this method and that action.

foxish (Contributor, Author) replied:

@soltysh Added test

@foxish foxish force-pushed the kubectl-new-2 branch 3 times, most recently from 7ebae58 to 7bbcb76 on November 2, 2016 20:00
@k8s-github-robot k8s-github-robot added the size/L label and removed the size/M label Nov 2, 2016
@@ -70,6 +71,14 @@ func deletePods(kubeClient clientset.Interface, recorder record.EventRecorder, n
continue
}

// Set reason and message in the pod object.
if updatedPod, err := setPodTerminationReason(kubeClient, &pod, nodeName); err != nil {
Contributor commented:

This would block eviction. Do we want to do that?

smarterclayton (Contributor) commented:

Maybe we should keep the error and return it after deletion. Alternatively we can just hope we'll retry here (will we retry?)

foxish (Contributor, Author) commented Nov 2, 2016:

Yes, when we return false, we retry right away with 0 delay.

foxish (Contributor, Author) commented:

@smarterclayton I can't think of a scenario where the update fails when the delete could have proceeded. However, like you said, it would be more defensive to keep the error and not block the eviction, I'll go ahead and do that. At worst, it could lead to multiple delete calls due to the retry loop.

soltysh (Contributor) left a comment

LGTM

@foxish foxish force-pushed the kubectl-new-2 branch 2 times, most recently from 85b94b0 to 940565e on November 2, 2016 22:00
@googlebot googlebot added cla: no and removed cla: yes labels Nov 2, 2016
foxish (Contributor, Author) commented Nov 2, 2016

@smarterclayton Updated. PTAL.

gmarek (Contributor) commented Nov 3, 2016

If you think that such 'best effort' behavior is better, then I'm not going to block this PR. I just think it may cause more harm than good, and it might be a source of red herrings when debugging NC issues (so whatever you do, please at least add a retry loop).

smarterclayton (Contributor) commented:

The retry loop is the evictor:

remaining, err := deletePods(nc.kubeClient, nc.recorder, value.Value, nodeUid, nc.daemonSetStore)
if err != nil {
    utilruntime.HandleError(fmt.Errorf("unable to evict node %q: %v", value.Value, err))
    return false, 0
}


If we fail with an error, we retry with no additional delay.

gmarek (Contributor) commented Nov 3, 2016

Oh, that one, yes. But this is actually bad - if we succeed in deleting, then there's nothing to update in the next round (e.g. if the Node no longer exists and the Pod will be force-deleted by PodGC).

smarterclayton (Contributor) commented:

Good point. So either we force the loop to run again if we get a conflict, or we don't worry about retry at all. I'd prefer the former - queue the error, don't delete the pod, return all errors, expect to get run again, then succeed. For maximum safety, we could do that only for Conflict errors, and in all other cases just continue with the delete (i.e. if you don't grant the node controller permission to update pod status, we should still delete the pod).

So:

for _, pod := range pods {
    // ...
    if err := updateStatus(); err != nil {
        if errors.IsConflict(err) {
            updateErrs = append(updateErrs, err)
            continue
        }
        utilruntime.HandleError(fmt.Errorf("unable to update pod status to indicate unreachable"))
    }
    deletePod()
}
if len(updateErrs) > 0 {
    return false, utilerrors.NewAggregate(updateErrs)
}


foxish (Contributor, Author) commented Nov 3, 2016

Is the concern that we wouldn't have updated the status and would have deleted the pod too soon?

gmarek (Contributor) commented Nov 3, 2016

I'm fine with @smarterclayton's proposal.

foxish (Contributor, Author) commented Nov 3, 2016

@smarterclayton @gmarek Do we also want to cover other failure reasons? For example, ServerTimeout (StatusReasonServerTimeout StatusReason = "ServerTimeout") seems like something we should account for and retry on, in addition to Conflict.

smarterclayton (Contributor) commented:

The client should already retry those for you.


foxish (Contributor, Author) commented Nov 3, 2016

@smarterclayton SG. I've updated the PR with the special retry in case of update conflict.
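
For readers following along, a rough sketch of that conflict-specific handling inside deletePods; the error-package helpers, the delete call, and the loop structure are assumptions, not the merged code:

// Hypothetical shape of the eviction loop: set the termination reason first,
// then delete. Only a Conflict on the status update defers the delete so that
// the evictor retries; any other update failure is logged and the delete
// proceeds (e.g. if the controller lacks permission to update pod status).
var updateErrList []error
for _, pod := range pods.Items {
    // ... daemon-set and mirror-pod checks elided ...
    if _, err := setPodTerminationReason(kubeClient, &pod, nodeName); err != nil {
        if apierrors.IsConflict(err) {
            updateErrList = append(updateErrList,
                fmt.Errorf("update status failed for pod %q: %v", pod.Name, err))
            continue // skip the delete; the evictor will retry this node
        }
        utilruntime.HandleError(fmt.Errorf("unable to mark pod %q unreachable: %v", pod.Name, err))
    }
    if err := kubeClient.Core().Pods(pod.Namespace).Delete(pod.Name, nil); err != nil {
        return false, err
    }
}
if len(updateErrList) > 0 {
    return false, utilerrors.NewAggregate(updateErrList)
}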

foxish (Contributor, Author) commented Nov 3, 2016

Oops, there is an issue with it. Will push an update shortly.

Pods which are evicted by the node controller due to network malfunction or an unresponsive kubelet should be differentiated from terminations initiated by other sources. The reason/message are consumed by kubectl to provide a better summary using get/describe.
foxish (Contributor, Author) commented Nov 3, 2016

@smarterclayton Updated. PTAL.

smarterclayton (Contributor) commented:

/lgtm

@k8s-github-robot k8s-github-robot added the lgtm label Nov 3, 2016
foxish (Contributor, Author) commented Nov 3, 2016

Bumping priority to P2 as #34825 will need to be rebased afterwards.

@foxish foxish added the priority/backlog label Nov 3, 2016
k8s-ci-robot (Contributor) commented:

Jenkins GCI GCE e2e failed for commit 6d7213d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

k8s-github-robot commented:

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

foxish (Contributor, Author) commented Nov 4, 2016

@k8s-bot gci gce e2e test this

k8s-github-robot commented:

Automatic merge from submit-queue

Labels: lgtm, priority/backlog, release-note, size/L
Projects: None yet
Successfully merging this pull request may close these issues:
Pods that are terminating due to node loss should be surfaced in kubectl output
10 participants