
Job backoffLimit does not cap pod restarts when restartPolicy: OnFailure #54870

Closed
ritsok opened this issue Oct 31, 2017 · 21 comments
@ritsok ritsok commented Oct 31, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
When creating a job with backoffLimit: 2 and restartPolicy: OnFailure, the pod with the configuration included below continued to restart and was never marked as failed:

$ kubectl get pods
NAME               READY     STATUS    RESTARTS   AGE
failed-job-t6mln   0/1       Error     4          57s
$ kubectl describe job failed-job
Name:           failed-job
Namespace:      default
Selector:       controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
Labels:         controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
                job-name=failed-job
Annotations:    kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"failed-job","namespace":"default"},"spec":{"backoffLimit":2,"template":{"met...
Parallelism:    1
Completions:    1
Start Time:     Tue, 31 Oct 2017 10:38:46 -0700
Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
           job-name=failed-job
  Containers:
   nginx:
    Image:  nginx:1.7.9
    Port:   <none>
    Command:
      bash
      -c
      exit 1
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  1m    job-controller  Created pod: failed-job-t6mln

What you expected to happen:

We expected only 2 attempted pod restarts before the job was marked as failed. Instead, the pod kept restarting, and the job's pod status was never set to Failed; it remained active.

How to reproduce it (as minimally and precisely as possible):

 apiVersion: batch/v1
 kind: Job
 metadata:
   name: failed-job
   namespace: default
 spec:
   backoffLimit: 2
   template:
     metadata:
       name: failed-job
     spec:
       containers:
       - name: nginx
         image: nginx:1.7.9
         command: ["bash", "-c", "exit 1"]
       restartPolicy: OnFailure

Create the above job and observe the number of pod restarts.

Anything else we need to know?:

The backoffLimit field works as expected when restartPolicy: Never.

Environment:

  • Kubernetes version (use kubectl version): 1.8.0 using minikube
@ritsok ritsok commented Oct 31, 2017

@kubernetes/sig-api-machinery-bugs

@k8s-ci-robot k8s-ci-robot commented Oct 31, 2017

@ritsok: Reiterating the mentions to trigger a notification:
@kubernetes/sig-api-machinery-bugs

In response to this:

@kubernetes/sig-api-machinery-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@weiwei04 weiwei04 commented Nov 2, 2017

I think this is because with restartPolicy: "OnFailure", the kubelet restarts the failed pod in place, so the pod's status remains Running even though its container keeps failing to run. The job controller therefore gets the illusion that the pod is still running.

Here are my naive thoughts: in the short term, setting spec.activeDeadlineSeconds avoids this; in the long term, we need to consider how to recognize a continually restarting pod and count that kind of pod as failed.
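To sketch the short-term workaround suggested above (activeDeadlineSeconds is an existing batch/v1 Job field; the 30-second value is an arbitrary illustration, not from this thread), the job from the original report could be capped like this:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: failed-job
spec:
  backoffLimit: 2
  # Workaround: hard wall-clock cap. The job is marked Failed once this
  # deadline passes, even if the kubelet keeps restarting the pod in place.
  activeDeadlineSeconds: 30
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        command: ["bash", "-c", "exit 1"]
      restartPolicy: OnFailure
```

Note this is a time-based cap, not a retry count: the number of in-place restarts that fit inside the deadline depends on the kubelet's exponential back-off.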

cc @soltysh

@lavalamp lavalamp added sig/apps and removed sig/api-machinery labels Nov 2, 2017
@timoreimann timoreimann commented Nov 7, 2017

Just ran into the issue as well. It might be useful to amend the documentation to warn about the fact that backoffLimit won't be effective if OnFailure is chosen as restart policy. At least, it wasn't obvious to me and I didn't catch any relevant note in the docs. Happy to do a quick PR if that's a desirable short-term action.

For Pods failing as reported in this issue, what's the difference in behavior between a Never restart policy and a hypothetical scenario where backoffLimit is respected by OnFailure? Anything other than failing Pods being replaced by newly created ones in the former case vs. existing Pods being restarted in the latter?

@weiwei04 weiwei04 commented Nov 8, 2017

With OnFailure, the failed pod restarts on the same kubelet host, so any emptyDir/hostPath volumes are still there. With Never, the failed pod is deleted and a replacement is created on another (possibly the same) host; if the pod used emptyDir/hostPath to store temporary data, that data may be lost.

For pods that don't use emptyDir/hostPath, I think there's not much practical difference. (Never involves the controller, apiserver, and scheduler in extra work, since a replacement pod must be created; the backoffLimit policy prevents a bad job spec or container from creating so many replacements that control-plane bandwidth is consumed.)

@timoreimann timoreimann commented Nov 8, 2017

Might it be a reasonable idea to move the backoffLimit parameter into the Pod spec? I'd think it could prove valuable to have an upper bound on restarts on Pods themselves, independently of Jobs, and it may solve the "how does the CronJob controller determine that the limit has been reached" problem more elegantly if the Pod's state automatically transitions to failed once the limit is hit.

WDYT?

@innovia innovia commented Dec 13, 2017

Can anyone tell me: if I set restartPolicy: Never and backoffLimit: 3, will it retry 3 times? This is very confusing.

@timoreimann timoreimann commented Dec 13, 2017

@innovia Yes, it will. In fact, setting restartPolicy to Never is currently the only way to have the backoff limit applied and effective.
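For illustration, a minimal job where the limit is effective because restartPolicy is Never (a hypothetical manifest, not from this thread; the name and image are arbitrary choices):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: backoff-demo      # illustrative name
spec:
  backoffLimit: 3         # cap on retries; honored because restartPolicy is Never
  template:
    spec:
      containers:
      - name: fail
        image: busybox
        command: ["sh", "-c", "exit 1"]   # always fails with exit code 1
      # Each retry is a brand-new pod, which the job controller can count.
      # With OnFailure, the kubelet restarts the container inside the same
      # pod, which is exactly the counting gap this issue describes.
      restartPolicy: Never
```

Expect roughly backoffLimit retries before the job is marked Failed, though as the experiments later in this thread show, the exact pod count can be off by one.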

@timoreimann timoreimann commented Dec 13, 2017

I should mention that my statement is based on Kubernetes 1.8. I don't know if any fixes possibly made it into the upcoming 1.9. Then again, if there were, this issue should have been updated or at least referenced.

@innovia innovia commented Dec 25, 2017

@timoreimann this is not working for a CronJob job spec (I am running 1.8.6). In the screenshot below I still get new pods when I kill the previous one (5 were created; only 3 should have been, and the job should have failed).

Any idea how to get this to actually work?

 jobTemplate:
   spec:
     backoffLimit: 3
     template:
       spec:
         containers:
         - name: cron-job
           command:
           - "/bin/bash"
           - "-c"
           - |
             WORKER_PID=""

             handle_sig_term(){
               echo -e "Sending SIGTERM to $WORKER_PID"
               kill -SIGTERM $WORKER_PID
               echo -e "Waiting for $WORKER_PID"
               wait $WORKER_PID
               exit 143  # exit code of SIGTERM is 128 + 15
             }

             trap "handle_sig_term" TERM

             x=1
             while [ $x -le 100 ]; do
               echo "Welcome $x times"
               x=$(( $x + 1 ))
               sleep 5
             done & WORKER_PID=$!

             echo "Job PID: ${WORKER_PID}"

             # wait <pid> blocks until the process with that ID completes
             wait $WORKER_PID

             EXIT_CODE=$?

             if [ $EXIT_CODE -ne 0 ]; then
               echo "Job failed with exit code: $EXIT_CODE"
               exit $EXIT_CODE
             else
               echo "Job done!"
             fi
         restartPolicy: Never

[Screenshot from 2017-12-25 12:07 PM: pod list showing five created pods]

@timoreimann timoreimann commented Dec 25, 2017

@innovia what's your spec's schedule? I don't see it in your manifest.

@innovia innovia commented Dec 25, 2017

Every minute, just for testing.

@timoreimann timoreimann commented Dec 25, 2017

Aren't you risking that restarted jobs from a failed schedule overlap with those from a succeeding schedule? What does it look like when you set the interval to, say, 10 minutes?

@innovia innovia commented Dec 25, 2017

@timoreimann same result with 10 minutes interval

@timoreimann timoreimann commented Dec 25, 2017

@innovia I ran a simplified version of your manifest against my Kubernetes 1.8.2 cluster with the command field replaced by `/bin/bash -c "exit 1"` so that every run fails, the schedule set to every 5 minutes, and the back-off limit reduced to 2. Here's my manifest:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          containers:
          - name: cron-job
            image: ubuntu
            command:
            - "/bin/bash"
            - "-c"
            - "exit 1"
          restartPolicy: Never

Initially, I get a series of three failing pods:

Mon Dec 25 23:15:28 CET 2017
NAME                     READY     STATUS    RESTARTS   AGE
hello-1514240100-7mktd   0/1       Error     0          19s
hello-1514240100-bf6rw   0/1       Error     0          5s
hello-1514240100-h44kz   0/1       Error     0          15s

Then it takes pretty much 5 minutes for the next batch to start up and fail:

Mon Dec 25 23:20:23 CET 2017
NAME                     READY     STATUS    RESTARTS   AGE
hello-1514240400-cj4bn   0/1       Error     0          4s
hello-1514240400-shnph   0/1       Error     0          14s

Interestingly, I only get two new pods instead of three. The next batch, though, comes with three again:

Mon Dec 25 23:25:16 CET 2017
NAME                     READY     STATUS    RESTARTS   AGE
hello-1514240700-jr9sp   0/1       Error     0          3s
hello-1514240700-jxck9   0/1       Error     0          13s
hello-1514240700-x26kg   0/1       Error     0          16s

and so does the final batch I tested:

Mon Dec 25 23:30:16 CET 2017
NAME                     READY     STATUS    RESTARTS   AGE

hello-1514241000-mm8c9   0/1       Error     0          2s
hello-1514241000-pqx8f   0/1       Error     0          16s
hello-1514241000-qjtn5   0/1       Error     0          12s

Apart from the supposed off-by-one error, things seem to work for me.

@timoreimann timoreimann commented Dec 25, 2017

Maybe you can double-check using my manifest?

@soltysh soltysh commented Jan 29, 2018

In the original proposal we stated it should work for both restart policies; I've addressed the problem in #58972. It counts the pods' restartCount: if the overall number (i.e., across all containers in all pods) exceeds the backoff limit, the job fails.

@kow3ns kow3ns added this to Backlog in Workloads Feb 27, 2018
Workloads automation moved this from Backlog to Done Apr 19, 2018
k8s-github-robot pushed a commit that referenced this issue Apr 19, 2018
Kubernetes Submit Queue
Automatic merge from submit-queue (batch tested with PRs 61962, 58972, 62509, 62606). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Fix job's backoff limit for restart policy OnFailure

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #54870

**Release note**:
```release-note
NONE
```

/assign janetkuo
@zhangguanzhang zhangguanzhang commented Jun 3, 2018

I think the backoffLimit doesn't work at all. My kubernetes version is 1.10.3

[root@k8s-m1 k8s]# cat job.yml 
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job
spec:
  backoffLimit: 2
  template:
    spec:
      containers:
      - name: pi
        image: busybox
        command: ["exit 1"]
      restartPolicy: Never
[root@k8s-m1 k8s]# kubectl apply -f job.yml 
job.batch "test-job" created
[root@k8s-m1 k8s]# kubectl get pod
NAME             READY     STATUS               RESTARTS   AGE
pod-test         1/1       Running              0          6h
test-affinity    1/1       Running              0          6h
test-job-g4r8w   0/1       ContainerCreating    0          1s
test-job-tqbvl   0/1       ContainerCannotRun   0          3s
[root@k8s-m1 k8s]# kubectl get pod
NAME             READY     STATUS               RESTARTS   AGE
pod-test         1/1       Running              0          6h
test-affinity    1/1       Running              0          6h
test-job-2h5vp   0/1       ContainerCannotRun   0          13s
test-job-2klzs   0/1       ContainerCannotRun   0          15s
test-job-4c2xz   0/1       ContainerCannotRun   0          24s
test-job-6bcqc   0/1       ContainerCannotRun   0          8s
test-job-6r7wq   0/1       ContainerCannotRun   0          11s
test-job-7cz7c   0/1       Terminating          0          8s
test-job-7mkdn   0/1       Terminating          0          16s
test-job-88ws8   0/1       ContainerCannotRun   0          26s
test-job-bqfk4   0/1       ContainerCannotRun   0          22s
test-job-jh7dp   0/1       ContainerCannotRun   0          4s
test-job-k2c4r   0/1       ContainerCannotRun   0          18s
test-job-qfj7m   0/1       ContainerCannotRun   0          6s
test-job-r8794   0/1       ContainerCreating    0          1s
test-job-r9gz6   0/1       Terminating          0          6s
test-job-w4f9r   0/1       Terminating          0          1s
[root@k8s-m1 k8s]# kubectl delete  -f job.yml 
job.batch "test-job" deleted
@soltysh soltysh commented Jun 5, 2018

See #62382.
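As an aside, the manifest above never produces an ordinary non-zero exit: `command: ["exit 1"]` fails with ContainerCannotRun because `exit` is a shell builtin, not an executable (and busybox does not ship bash). A variant that actually starts, runs, and fails, if the intent is to exercise the backoff path, would wrap the exit in a shell (independent of whatever controller-side behavior #62382 covers):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job
spec:
  backoffLimit: 2
  template:
    spec:
      containers:
      - name: pi
        image: busybox
        # "exit" is a shell builtin; invoking sh lets the container start
        # normally and then terminate with a real non-zero exit code.
        command: ["sh", "-c", "exit 1"]
      restartPolicy: Never
```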

diegodelemos added a commit to diegodelemos/reana-job-controller that referenced this issue Jun 6, 2018
* This causes a new pod to be started for each retry instead of retrying
  with the same pod all the time. Even though it seems to be fixed
  (kubernetes/kubernetes#54870 (comment)), in Kubernetes 1.9.4
  backoffLimit is ignored with the OnFailure policy.
diegodelemos added a commit to diegodelemos/reana-job-controller that referenced this issue Jun 6, 2018
* Upgrades code to support Kubernetes 1.9.4 (already 1.10-compatible).

* Uses restartPolicy Never. This causes a new pod to be started for each
  retry instead of retrying with the same pod all the time. Even though it
  seems to be fixed (kubernetes/kubernetes#54870 (comment)), in Kubernetes
  1.9.4 backoffLimit is ignored with the OnFailure policy.
@lowang-bh lowang-bh commented Dec 6, 2018

kubectl get pod

NAME READY STATUS RESTARTS AGE
job-backofflimit-2-9n74x 0/1 Error 0 34s
job-backofflimit-2-bt796 0/1 Error 0 1m
job-backofflimit-2-fgktc 0/1 Error 0 49s
job-backofflimit-2-gfhlx 0/1 Error 0 1m
job-backofflimit-2-gzd54 0/1 ContainerCreating 0 8s
job-backofflimit-2-ljlpx 0/1 Error 0 1m
job-backofflimit-2-p5fxm 0/1 Error 0 1m
job-backofflimit-2-w6s5d 0/1 Error 0 21s

cat job-with-second.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: job-backofflimit-2
spec:
  backoffLimit: 2
  template:
    spec:
      containers:
      - name: job-with-python2
        image: registry.laincloud.test/k8s-job/jobtest:0.0
      restartPolicy: Never

k8s version

v1.10.4

@moshucong moshucong commented Dec 24, 2018

I should mention that my statement is based on Kubernetes 1.8. I don't know if any fixes possibly made it into the upcoming 1.9. Then again, if there were, this issue should have been updated or at least referenced.

1.9 has the same problem. My cluster is on 1.9.6, and I ran into the same issue.

dleehr pushed a commit to Duke-GCB/calrissian that referenced this issue Jan 17, 2019
This should prevent them from restarting when the workflow fails.
Note that on Kubernetes before 1.12, kubernetes/kubernetes#54870 exists and these fields are not interpreted.