
Pods moving from Succeeded to Pending #54499

Closed
sjenning opened this issue Oct 24, 2017 · 15 comments · Fixed by #54593
Assignees
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@sjenning
Contributor

After #53233, we observed pods moving from phase Succeeded to Pending. This is present on master and on release-1.8, where the change was picked in #53422.

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - command:
    - bash
    - -c
    - trap 'excode=$?; trap "" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM;
      sleep 5
    env:
    image: centos:centos7
    imagePullPolicy: IfNotPresent
    name: test-container 
  restartPolicy: Never
  terminationGracePeriodSeconds: 30

Test script:

kubectl create -f pod.yaml; sleep 3; kubectl delete -f pod.yaml

oc has an observe command that basically does a CLI watch:

oc observe pods --output=gotemplate -a '{{.}}' | grep phase

Alternatively, you can look in the kubelet log to see the phase transition from Succeeded to Pending.

Thanks to @joelsmith @frobware for helping isolate this 👍

@dashpole @derekwaynecarr @smarterclayton

/sig node

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Oct 24, 2017
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Oct 24, 2017
@sjenning
Contributor Author

/assign @dashpole

@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Oct 24, 2017
@dashpole
Contributor

Taking a look, although at first glance it is strange that this didn't happen with evicted pods prior to this.

@sjenning
Contributor Author

Thanks. FYI, I did confirm that reverting the mentioned PR on master corrected the issue.

@smarterclayton
Contributor

This showed up breaking a job controller that depended on the pod not transitioning back to Pending. It's possible this can cause applications to fail or run multiple times in potentially dangerous ways.

I'm not sure we should allow, via validation, a status update that moves a pod back to a non-terminal state. It's not a legal transition as we have defined phase. We should separately put a guard in place that reports a clear error to clients.
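
Such a guard might look roughly like the following. This is a minimal sketch only; the package, function names, and the idea of comparing old and new `v1.PodStatus` values are assumptions for illustration, not the actual status manager code.

```go
package status

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// isTerminalPhase reports whether a pod phase is terminal.
func isTerminalPhase(phase v1.PodPhase) bool {
	return phase == v1.PodSucceeded || phase == v1.PodFailed
}

// checkPhaseTransition rejects a status update that would move a pod from a
// terminal phase back to a non-terminal one, which is never a legal transition.
func checkPhaseTransition(oldStatus, newStatus v1.PodStatus) error {
	if isTerminalPhase(oldStatus.Phase) && !isTerminalPhase(newStatus.Phase) {
		return fmt.Errorf("illegal phase transition: %s -> %s", oldStatus.Phase, newStatus.Phase)
	}
	return nil
}
```

The check that was eventually merged (#54597, below) operates on individual container states rather than the pod phase, but the principle is the same: once terminated, never back to non-terminated.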

@sjenning
Contributor Author

sjenning commented Oct 25, 2017

@dashpole after thinking on it, I think that the situation with eviction is a little different in that the containers are terminated but they are not immediately cleaned up. It is the proximity between pod termination time and container removal time that is uncovering the issue, I think.

AFAICT, some path is attempting to get the container status after the container has been removed. When this fails, something in the status manager gets confused and it is under the impression that the pod is new and the status needs to be reinitialized.

@dashpole
Contributor

Here are the relevant logs showing the bug:
status_manager.go:442] Status for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" updated successfully: (5, {Phase:Failed ...
kubelet_pods.go:864] Pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" is terminated, but some containers have not been cleaned up
generic.go:147] GenericPLEG: f8834113-b91d-11e7-a5fa-42010a800002/66f42e6c9d77ab4d4a429a35291303751f7aac4803f8fe47f73b11afc4a32d37: exited -> non-existent
kuberuntime_manager.go:834] getSandboxIDByPodUID got sandbox IDs ["7c0c2d55bb1b191f5179c79c6ac0ae4cfc5ba256e766c8308ec2c457dd521e0d"] for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
generic.go:381] PLEG: Write status for simple-mount-pod/e2e-tests-create-delete-pod-test-vprql: &container.PodStatus{ID:"f8834113-b91d-11e7-a5fa-42010a800002", Name:"simple-mount-pod", Namespace:"e2e-tests-create-delete-pod-test-vprql", IP:"", ContainerStatuses:[]*container.ContainerStatus{}, SandboxStatuses:[]*runtime.PodSandboxStatus{(*runtime.PodSandboxStatus)(0xc421423680)}} (err: <nil>)
kubelet_pods.go:1294] Generating status for "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
kubelet.go:1884] SyncLoop (RECONCILE, "api"): "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
status_manager.go:442] Status for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" updated successfully: (6, {Phase:Pending ...

Still trying to figure out what caused the status update.

@dashpole
Contributor

Ok, I figured out why this is happening. When we compute a container's status, we assume that if it is not found, its status is ContainerCreating. So deleting a pod's containers before the pod object is removed causes the state to return to pending.
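
For illustration, the problematic pattern is roughly the following; this is a sketch with hypothetical names (`containerStatus`, `runtimeStates`), not the actual kubelet code.

```go
package kubelet

import v1 "k8s.io/api/core/v1"

// containerStatus sketches the buggy lookup: when the runtime no longer
// reports the container (for example because it was removed during pod
// deletion), the status falls back to a fresh "waiting" state, which drags
// the pod phase back to Pending.
func containerStatus(runtimeStates map[string]v1.ContainerState, name string) v1.ContainerStatus {
	state, found := runtimeStates[name]
	if !found {
		// Bug: a removed container is indistinguishable from one that has
		// not been created yet.
		return v1.ContainerStatus{
			Name:  name,
			State: v1.ContainerState{Waiting: &v1.ContainerStateWaiting{Reason: "ContainerCreating"}},
		}
	}
	return v1.ContainerStatus{Name: name, State: state}
}
```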

@sjenning
Contributor Author

Maybe we should expand the check I added to prevent init container status resets so that it covers all containers with an oldStatus:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go?utf8=%E2%9C%93#L1427-L1430

@dashpole
Contributor

dashpole commented Oct 25, 2017

Would that prevent updating a container that crashed and restarted?
Edit: nvm, I think that might work @sjenning

@dchen1107
Member

I added a comment at #54530 (comment).

@derekwaynecarr
Member

I also commented here that it would be good if syncPod in the status manager protected against an invalid phase transition...

see: #54530 (comment)

can we do that in this same PR?

@dashpole
Contributor

My hope is to cherrypick this to 1.8. Is that check something we want to cherrypick as well? @derekwaynecarr

@derekwaynecarr
Member

I don't see why not... absent that, it's hard to know this fix actually fixes it... or that we don't regress.

@sjenning
Contributor Author

@derekwaynecarr do we want to do it in syncPod or updateStatusInternal? syncPod is on the other side of a channel, whereas a stack trace on the caller of updateStatusInternal could be more informative.

@derekwaynecarr
Member

@sjenning both? I just know that we get the latest state, as told to the API server, on the other end of the channel...

k8s-github-robot pushed a commit that referenced this issue Oct 26, 2017
Automatic merge from submit-queue (batch tested with PRs 54597, 54593, 54081, 54271, 54600). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet: check for illegal container state transition

supersedes #54530

Puts a state transition check in the kubelet status manager to detect and block illegal transitions; namely from terminated to non-terminated.

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware

I confirmed that the reproducer in #54499 does not work with this check in place. The erroneous kubelet status update is rejected:

```
status_manager.go:301] Status update on pod default/test aborted: terminated container test-container attempted illegal transition to non-terminated state
```

After fix #54593, I no longer see this message with the above-mentioned reproducer.
k8s-github-robot pushed a commit that referenced this issue Oct 26, 2017
Automatic merge from submit-queue (batch tested with PRs 54593, 54607, 54539, 54105). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

 Removed containers are not always waiting

fixes #54499 
The issue was that a container that has been removed (during pod deletion, for example) is assumed to be in a "waiting" state.
Instead, we should use the previous container state.
Fetching the most recent status is required to ensure that we accurately reflect the previous state; the status attached to a pod object is often stale.
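
A rough sketch of that fallback, reusing the hypothetical names from the sketch earlier in this thread (again, not the actual kubelet code):

```go
package kubelet

import v1 "k8s.io/api/core/v1"

// containerStatusWithFallback prefers the last observed status when the
// runtime no longer reports the container, so an already-terminated container
// that has been removed does not reappear as "waiting".
func containerStatusWithFallback(runtimeStates map[string]v1.ContainerState, oldStatus *v1.ContainerStatus, name string) v1.ContainerStatus {
	if state, found := runtimeStates[name]; found {
		return v1.ContainerStatus{Name: name, State: state}
	}
	if oldStatus != nil {
		// Keep the previous state (e.g. Terminated) instead of resetting it
		// to Waiting/ContainerCreating.
		return *oldStatus
	}
	return v1.ContainerStatus{
		Name:  name,
		State: v1.ContainerState{Waiting: &v1.ContainerStateWaiting{Reason: "ContainerCreating"}},
	}
}
```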

I verified this by looking through the kubelet logs during a deletion and confirming that the status updates do not transition from terminated -> pending.

cc @kubernetes/sig-node-bugs @sjenning @smarterclayton @derekwaynecarr @dchen1107 

```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
dashpole added a commit to dashpole/kubernetes that referenced this issue Oct 31, 2017
k8s-github-robot pushed a commit that referenced this issue Nov 1, 2017
Automatic merge from submit-queue (batch tested with PRs 53962, 54708). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Prevent successful containers from restarting with OnFailure restart policy

**What this PR does / why we need it**:

This is a follow-on to #54597 which makes sure that its validation
also applies to pods with a restart policy of OnFailure. This
deficiency was pointed out by @smarterclayton here:
#54530 (comment)

**Which issue this PR fixes**: This is another fix to address #54499

**Release note**:
```release-note
NONE
```
k8s-github-robot pushed a commit that referenced this issue Nov 2, 2017
…93-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #54593

Cherry pick of #54593 on release-1.8.

#54593: fix #54499. Removed containers are not waiting

```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
k8s-github-robot pushed a commit that referenced this issue Dec 19, 2017
Automatic merge from submit-queue (batch tested with PRs 56108, 56811, 57335, 57331, 54530). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

api: validate container phase transitions

#54499 exposed an issue where a pod was transitioning from the terminal phase `Succeeded` or `Failed` back to `Pending`. It is due to a bug in the kubelet, but additional validation in the API server can prevent this invalid phase transition from being accepted.
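
A sketch of the kind of validation described here; the function and parameter names are illustrative, not the actual API server validation code.

```go
package validation

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// validateContainerStateTransitions rejects a status update that moves a
// container out of a terminated state when the pod's restart policy makes
// that impossible (Never, or OnFailure after a successful exit).
func validateContainerStateTransitions(oldStatuses, newStatuses []v1.ContainerStatus, restartPolicy v1.RestartPolicy) error {
	if restartPolicy == v1.RestartPolicyAlways {
		return nil // containers may legitimately restart
	}
	for _, old := range oldStatuses {
		t := old.State.Terminated
		if t == nil {
			continue
		}
		// Under OnFailure, a container that exited non-zero may restart.
		if restartPolicy == v1.RestartPolicyOnFailure && t.ExitCode != 0 {
			continue
		}
		for _, updated := range newStatuses {
			if updated.Name == old.Name && updated.State.Terminated == nil {
				return fmt.Errorf("container %q may not be transitioned to a non-terminated state", old.Name)
			}
		}
	}
	return nil
}
```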

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware 

I confirmed that the reproducer in #54499 does not work with this validation in place.  The erroneous kubelet status update is rejected:
```
status_manager.go:437] Failed to update status for pod "test_default(2f02ecdf-b92a-11e7-a2d0-1c1b0deeddfa)": Pod "test" is invalid: status.containerStatuses[0].state: Forbidden: may not be transitioned to non-terminated state
```

However, it only works up to a point with this particular issue. The termination hangs, and eventually the resource is removed from etcd and the status update goes through because there are no old statuses to compare against. Not exactly sure how this happens since there is no pod in etcd anymore ¯\\_(ツ)_/¯