Job backoffLimit does not cap pod restarts when restartPolicy: OnFailure #54870

@ritsok

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
When creating a job with backoffLimit: 2 and restartPolicy: OnFailure, the pod with the configuration included below continued to restart and was never marked as failed:

$ kubectl get pods
NAME               READY     STATUS    RESTARTS   AGE
failed-job-t6mln   0/1       Error     4          57s
$ kubectl describe job failed-job
Name:           failed-job
Namespace:      default
Selector:       controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
Labels:         controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
                job-name=failed-job
Annotations:    kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"failed-job","namespace":"default"},"spec":{"backoffLimit":2,"template":{"met...
Parallelism:    1
Completions:    1
Start Time:     Tue, 31 Oct 2017 10:38:46 -0700
Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=58c6d945-be62-11e7-86f7-080027797e6b
           job-name=failed-job
  Containers:
   nginx:
    Image:  nginx:1.7.9
    Port:   <none>
    Command:
      bash
      -c
      exit 1
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  1m    job-controller  Created pod: failed-job-t6mln

What you expected to happen:

We expected at most 2 pod restarts before the job was marked as failed. Instead, the pod kept restarting, and the job's pod status was never set to failed; it remained active.
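To make the expectation concrete, here is a minimal sketch of the check we had in mind. It assumes the column layout of the `kubectl get pods` output shown above. The helper names `parse_restarts` and `exceeds_backoff_limit` are hypothetical and are not part of Kubernetes or kubectl.

```python
from typing import Dict


def parse_restarts(kubectl_output: str) -> Dict[str, int]:
    """Map pod name -> RESTARTS count from `kubectl get pods` text output.

    Assumes a header row containing NAME and RESTARTS columns, as in the
    output pasted in this report.
    """
    lines = [ln for ln in kubectl_output.splitlines() if ln.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    restarts_idx = header.index("RESTARTS")
    restarts = {}
    for line in lines[1:]:
        cols = line.split()
        restarts[cols[name_idx]] = int(cols[restarts_idx])
    return restarts


def exceeds_backoff_limit(restarts: int, backoff_limit: int) -> bool:
    """True when a pod has restarted more often than backoffLimit allows.

    This encodes the expectation stated in this report, not the job
    controller's actual logic.
    """
    return restarts > backoff_limit


# Output copied from the "What happened" section of this report.
SAMPLE = """\
NAME               READY     STATUS    RESTARTS   AGE
failed-job-t6mln   0/1       Error     4          57s
"""

if __name__ == "__main__":
    for pod, count in parse_restarts(SAMPLE).items():
        print(pod, count, exceeds_backoff_limit(count, backoff_limit=2))
```

With the output above, the pod shows 4 restarts against backoffLimit: 2, so the check flags it, yet the job still reports 0 Failed.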

How to reproduce it (as minimally and precisely as possible):

 apiVersion: batch/v1
 kind: Job
 metadata:
   name: failed-job
   namespace: default
 spec:
   backoffLimit: 2
   template:
     metadata:
       name: failed-job
     spec:
       containers:
       - name: nginx
         image: nginx:1.7.9
         command: ["bash", "-c", "exit 1"]
       restartPolicy: OnFailure

Create the above job and observe the number of pod restarts.

Anything else we need to know?:

The backoffLimit field works as expected when restartPolicy: Never is used.

Environment:

  • Kubernetes version (use kubectl version): 1.8.0 using minikube


Labels

kind/bug: Categorizes issue or PR as related to a bug.
sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
