
Pods moving from Succeeded to Pending #54499

Closed
sjenning opened this issue Oct 24, 2017 · 15 comments · Fixed by #54593
Assignees
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@sjenning
Contributor

After #53233, we observed pods moving from phase Succeeded to Pending. This is present on master and on release-1.8, where the change was picked in #53422.

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - command:
    - bash
    - -c
    - trap 'excode=$?; trap "" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM;
      sleep 5
    env:
    image: centos:centos7
    imagePullPolicy: IfNotPresent
    name: test-container 
  restartPolicy: Never
  terminationGracePeriodSeconds: 30

Test script:

kubectl create -f pod.yaml; sleep 3; kubectl delete -f pod.yaml

oc has an observe command that basically does a CLI watch:

oc observe pods --output=gotemplate -a '{{.}}' | grep phase

Alternatively, you can look in the kubelet log to see the phase transition from Succeeded to Pending.

Thanks to @joelsmith @frobware for helping isolate this 👍

@dashpole @derekwaynecarr @smarterclayton

/sig node

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Oct 24, 2017
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Oct 24, 2017
@sjenning
Contributor Author

/assign @dashpole

@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Oct 24, 2017
@dashpole
Contributor

Taking a look, although at first glance it is strange that this didn't happen with evicted pods prior to this.

@sjenning
Contributor Author

Thanks. FYI, I did confirm that reverting the mentioned PR on master corrected the issue.

@smarterclayton
Contributor

This showed up breaking a job controller that depended on the pod not transitioning back to Pending. It's possible this can cause applications to fail or run multiple times in potentially dangerous ways.

I'm not sure we should allow, via validation, a status update that moves a pod back to a non-terminal state. It's not a legal transition as we have defined phase. We should separately put a guard in place that reports a clear error to clients.
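
Such a guard might look roughly like the following. This is a minimal sketch only; the package, function names, and the idea of comparing old and new `v1.PodStatus` values are assumptions for illustration, not the actual status manager code.

```go
package status

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// isTerminalPhase reports whether a pod phase is terminal.
func isTerminalPhase(phase v1.PodPhase) bool {
	return phase == v1.PodSucceeded || phase == v1.PodFailed
}

// checkPhaseTransition rejects a status update that would move a pod from a
// terminal phase back to a non-terminal one, which is never a legal transition.
func checkPhaseTransition(oldStatus, newStatus v1.PodStatus) error {
	if isTerminalPhase(oldStatus.Phase) && !isTerminalPhase(newStatus.Phase) {
		return fmt.Errorf("illegal phase transition: %s -> %s", oldStatus.Phase, newStatus.Phase)
	}
	return nil
}
```

The check that was eventually merged (#54597, below) operates on individual container states rather than the pod phase, but the principle is the same: once terminated, never back to non-terminated.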

@sjenning
Contributor Author

sjenning commented Oct 25, 2017

@dashpole after thinking on it, I think that the situation with eviction is a little different in that the containers are terminated but they are not immediately cleaned up. It is the proximity between pod termination time and container removal time that is uncovering the issue, I think.

AFAICT, some path is attempting to get the container status after the container has been removed. When this fails, something in the status manager gets confused and it is under the impression that the pod is new and the status needs to be reinitialized.

@dashpole
Contributor

Here are the relevant logs showing the bug:
status_manager.go:442] Status for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" updated successfully: (5, {Phase:Failed ...
kubelet_pods.go:864] Pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" is terminated, but some containers have not been cleaned up
generic.go:147] GenericPLEG: f8834113-b91d-11e7-a5fa-42010a800002/66f42e6c9d77ab4d4a429a35291303751f7aac4803f8fe47f73b11afc4a32d37: exited -> non-existent
kuberuntime_manager.go:834] getSandboxIDByPodUID got sandbox IDs ["7c0c2d55bb1b191f5179c79c6ac0ae4cfc5ba256e766c8308ec2c457dd521e0d"] for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
generic.go:381] PLEG: Write status for simple-mount-pod/e2e-tests-create-delete-pod-test-vprql: &container.PodStatus{ID:"f8834113-b91d-11e7-a5fa-42010a800002", Name:"simple-mount-pod", Namespace:"e2e-tests-create-delete-pod-test-vprql", IP:"", ContainerStatuses:[]*container.ContainerStatus{}, SandboxStatuses:[]*runtime.PodSandboxStatus{(*runtime.PodSandboxStatus)(0xc421423680)}} (err: <nil>)
kubelet_pods.go:1294] Generating status for "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
kubelet.go:1884] SyncLoop (RECONCILE, "api"): "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)"
status_manager.go:442] Status for pod "simple-mount-pod_e2e-tests-create-delete-pod-test-vprql(f8834113-b91d-11e7-a5fa-42010a800002)" updated successfully: (6, {Phase:Pending ...

Still trying to figure out what caused the status update.

@dashpole
Contributor

Ok, I figured out why this is happening. When we compute a container's status, we assume that if it is not found, its status is ContainerCreating. So deleting a pod's containers before the pod object is removed causes the state to return to pending.
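
For illustration, the problematic pattern is roughly the following; this is a sketch with hypothetical names (`containerStatus`, `runtimeStates`), not the actual kubelet code.

```go
package kubelet

import v1 "k8s.io/api/core/v1"

// containerStatus sketches the buggy lookup: when the runtime no longer
// reports the container (for example because it was removed during pod
// deletion), the status falls back to a fresh "waiting" state, which drags
// the pod phase back to Pending.
func containerStatus(runtimeStates map[string]v1.ContainerState, name string) v1.ContainerStatus {
	state, found := runtimeStates[name]
	if !found {
		// Bug: a removed container is indistinguishable from one that has
		// not been created yet.
		return v1.ContainerStatus{
			Name:  name,
			State: v1.ContainerState{Waiting: &v1.ContainerStateWaiting{Reason: "ContainerCreating"}},
		}
	}
	return v1.ContainerStatus{Name: name, State: state}
}
```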

@sjenning
Contributor Author

Maybe we should expand the check I added to prevent init container status resets so that it covers all containers with an oldStatus:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go?utf8=%E2%9C%93#L1427-L1430

@dashpole
Contributor

dashpole commented Oct 25, 2017

Would that prevent updating a container that crashed and restarted?
Edit: nvm, I think that might work @sjenning

@dchen1107
Member

I added a comment at #54530 (comment).

@derekwaynecarr
Member

I also commented here that it would be good if syncPod in the status manager protected against an invalid phase transition...

see: #54530 (comment)

can we do that in this same PR?

@dashpole
Contributor

My hope is to cherrypick this to 1.8. Is that check something we want to cherrypick as well? @derekwaynecarr

@derekwaynecarr
Member

I don't see why not... absent that, it's hard to know this fix actually fixes it... or that we don't regress.

@sjenning
Contributor Author

@derekwaynecarr do we want to do it in syncPod or updateStatusInternal? syncPod is on the other side of a channel, whereas a stack trace on the caller of updateStatusInternal could be more informative.

@derekwaynecarr
Member

@sjenning both? I just know that we get the latest state, as told to the API server, on the other end of the channel...

k8s-github-robot pushed a commit that referenced this issue Oct 26, 2017
Automatic merge from submit-queue (batch tested with PRs 54597, 54593, 54081, 54271, 54600). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet: check for illegal container state transition

supersedes #54530

Puts a state transition check in the kubelet status manager to detect and block illegal transitions; namely from terminated to non-terminated.

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware

I confirmed that the reproducer in #54499 does not work with this check in place. The erroneous kubelet status update is rejected:

```
status_manager.go:301] Status update on pod default/test aborted: terminated container test-container attempted illegal transition to non-terminated state
```

After fix #54593, I no longer see this message with the above-mentioned reproducer.
k8s-github-robot pushed a commit that referenced this issue Oct 26, 2017
Automatic merge from submit-queue (batch tested with PRs 54593, 54607, 54539, 54105). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

 Removed containers are not always waiting

fixes #54499 
The issue was that a container that has been removed (during pod deletion, for example) is assumed to be in a "waiting" state.
Instead, we should use the previous container state.
Fetching the most recent status is required to ensure that we accurately reflect the previous state; the status attached to a pod object is often stale.
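
A rough sketch of that fallback, reusing the hypothetical names from the sketch earlier in this thread (again, not the actual kubelet code):

```go
package kubelet

import v1 "k8s.io/api/core/v1"

// containerStatusWithFallback prefers the last observed status when the
// runtime no longer reports the container, so an already-terminated container
// that has been removed does not reappear as "waiting".
func containerStatusWithFallback(runtimeStates map[string]v1.ContainerState, oldStatus *v1.ContainerStatus, name string) v1.ContainerStatus {
	if state, found := runtimeStates[name]; found {
		return v1.ContainerStatus{Name: name, State: state}
	}
	if oldStatus != nil {
		// Keep the previous state (e.g. Terminated) instead of resetting it
		// to Waiting/ContainerCreating.
		return *oldStatus
	}
	return v1.ContainerStatus{
		Name:  name,
		State: v1.ContainerState{Waiting: &v1.ContainerStateWaiting{Reason: "ContainerCreating"}},
	}
}
```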

I verified this by looking through the kubelet logs during a deletion and confirming that the status updates do not transition from terminated -> pending.

cc @kubernetes/sig-node-bugs @sjenning @smarterclayton @derekwaynecarr @dchen1107 

```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
dashpole added a commit to dashpole/kubernetes that referenced this issue Oct 31, 2017
k8s-github-robot pushed a commit that referenced this issue Nov 1, 2017
Automatic merge from submit-queue (batch tested with PRs 53962, 54708). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Prevent successful containers from restarting with OnFailure restart policy

**What this PR does / why we need it**:

This is a follow-on to #54597 which makes sure that its validation
also applies to pods with a restart policy of OnFailure. This
deficiency was pointed out by @smarterclayton here:
#54530 (comment)

**Which issue this PR fixes**: This is another fix to address #54499

**Release note**:
```release-note
NONE
```
k8s-github-robot pushed a commit that referenced this issue Nov 2, 2017
…93-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #54593

Cherry pick of #54593 on release-1.8.

#54593: fix #54499. Removed containers are not waiting

```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
k8s-github-robot pushed a commit that referenced this issue Dec 19, 2017
Automatic merge from submit-queue (batch tested with PRs 56108, 56811, 57335, 57331, 54530). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

api: validate container phase transitions

#54499 exposed an issue where a pod was transitioning from the terminal phase `Succeeded` or `Failed` back to `Pending`. It is due to a bug in the kubelet, but additional validation in the API server can prevent this invalid phase transition from being accepted.
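
A sketch of the kind of validation described here; the function and parameter names are illustrative, not the actual API server validation code.

```go
package validation

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// validateContainerStateTransitions rejects a status update that moves a
// container out of a terminated state when the pod's restart policy makes
// that impossible (Never, or OnFailure after a successful exit).
func validateContainerStateTransitions(oldStatuses, newStatuses []v1.ContainerStatus, restartPolicy v1.RestartPolicy) error {
	if restartPolicy == v1.RestartPolicyAlways {
		return nil // containers may legitimately restart
	}
	for _, old := range oldStatuses {
		t := old.State.Terminated
		if t == nil {
			continue
		}
		// Under OnFailure, a container that exited non-zero may restart.
		if restartPolicy == v1.RestartPolicyOnFailure && t.ExitCode != 0 {
			continue
		}
		for _, updated := range newStatuses {
			if updated.Name == old.Name && updated.State.Terminated == nil {
				return fmt.Errorf("container %q may not be transitioned to a non-terminated state", old.Name)
			}
		}
	}
	return nil
}
```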

@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware 

I confirmed that the reproducer in #54499 does not work with this validation in place.  The erroneous kubelet status update is rejected:
```
status_manager.go:437] Failed to update status for pod "test_default(2f02ecdf-b92a-11e7-a2d0-1c1b0deeddfa)": Pod "test" is invalid: status.containerStatuses[0].state: Forbidden: may not be transitioned to non-terminated state
```

However, it only works up to a point with this particular issue. The termination hangs, and eventually the resource is removed from etcd and the status update goes through because there are no old statuses to compare against. Not exactly sure how this happens since there is no pod in etcd anymore ¯\\_(ツ)_/¯