Documentation for Indexed completion mode

Signed-off-by: Aldo Culquicondor <acondor@google.com>
kubernetes · Mar 16, 2021 · 0af3ede · 0af3ede
1 parent 0b32f89
commit 0af3ede

Showing 8 changed files with 300 additions and 21 deletions.
diff --git a/content/en/docs/concepts/workloads/controllers/job.md b/content/en/docs/concepts/workloads/controllers/job.md
@@ -145,8 +145,8 @@ There are three main types of task suitable to run as a Job:
    - the Job is complete as soon as its Pod terminates successfully.
 1. Parallel Jobs with a *fixed completion count*:
    - specify a non-zero positive value for `.spec.completions`.
-   - the Job represents the overall task, and is complete when there is one successful Pod for each value in the range 1 to `.spec.completions`.
-   - **not implemented yet:** Each Pod is passed a different index in the range 1 to `.spec.completions`.
+   - the Job represents the overall task, and is complete when there are `.spec.completions` successful Pods.
+   - when using `.spec.completionMode="Indexed"`, each Pod gets a different index in the range 0 to `.spec.completions-1`.
 1. Parallel Jobs with a *work queue*:
    - do not specify `.spec.completions`, default to `.spec.parallelism`.
    - the Pods must coordinate amongst themselves or an external service to determine what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
@@ -166,7 +166,6 @@ a non-negative integer.
 
 For more information about how to make use of the different types of job, see the [job patterns](#job-patterns) section.
 
-
 #### Controlling parallelism
 
 The requested parallelism (`.spec.parallelism`) can be set to any non-negative value.
@@ -185,6 +184,33 @@ parallelism, for a variety of reasons:
 - The Job controller may throttle new Pod creation due to excessive previous pod failures in the same Job.
 - When a Pod is gracefully shut down, it takes time to stop.
 
+#### Completion Mode
+
+{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
+
+{{< note >}}
+To be able to create Indexed Jobs, make sure to enable the `IndexedJob`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+on the [API server](/docs/reference/command-line-tools-reference/kube-apiserver/)
+and the [controller manager](/docs/reference/command-line-tools-reference/kube-controller-manager/).
+{{< /note >}}
+
+Jobs with _fixed completion count_ - that is, jobs that have non-null
+`.spec.completions` - can have a completion mode that is specified in `.spec.completionMode`:
+
+- `NonIndexed` (default): the Job is considered complete when there have been
+  `.spec.completions` successfully completed Pods. In other words, all Pod
+  completions are equivalent to each other. Note that Jobs that have null
+  `.spec.completions` are implicitly `NonIndexed`.
+- `Indexed`: the Pods of a Job get an associated completion index from 0 to
+  `.spec.completions-1`, available in the annotation `batch.kubernetes.io/job-completion-index`.
+  The Job is considered complete when there is one successfully completed Pod
+  for each index. For more information about how to use this mode, see
+  [Indexed Job for Parallel Processing with Static Work Assignment](/docs/tasks/job/indexed-parallel-processing-static/).
+  Note that, although rare, more than one Pod could be started for the same
+  index, but only one of them will count towards the completion count.
+
+
 ## Handling Pod and container failures
 
 A container in a Pod may fail for a number of reasons, such as because the process in it exited with
@@ -348,12 +374,12 @@ The tradeoffs are:
 The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs.
 The pattern names are also links to examples and more detailed description.
 
-|                            Pattern                                   | Single Job object | Fewer pods than work items? | Use app unmodified? |  Works in Kube 1.1? |
-| -------------------------------------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|:-------------------:|
-| [Job Template Expansion](/docs/tasks/job/parallel-processing-expansion/)            |                   |                             |          ✓          |          ✓          |
-| [Queue with Pod Per Work Item](/docs/tasks/job/coarse-parallel-processing-work-queue/)   |         ✓         |                             |      sometimes      |          ✓          |
-| [Queue with Variable Pod Count](/docs/tasks/job/fine-parallel-processing-work-queue/)  |         ✓         |             ✓               |                     |          ✓          |
-| Single Job with Static Work Assignment                               |         ✓         |                             |          ✓          |                     |
+|                  Pattern                  | Single Job object | Fewer pods than work items? | Use app unmodified? |
+| ----------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|
+| [Queue with Pod Per Work Item]            |         ✓         |                             |      sometimes      |
+| [Queue with Variable Pod Count]           |         ✓         |             ✓               |                     |
+| [Indexed Job with Static Work Assignment] |         ✓         |                             |          ✓          |
+| [Job Template Expansion]                  |                   |                             |          ✓          |
 
 When you specify completions with `.spec.completions`, each Pod created by the Job controller
 has an identical [`spec`](https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status).  This means that
@@ -364,13 +390,17 @@ are different ways to arrange for pods to work on different things.
 This table shows the required settings for `.spec.parallelism` and `.spec.completions` for each of the patterns.
 Here, `W` is the number of work items.
 
-|                             Pattern                                  | `.spec.completions` |  `.spec.parallelism` |
-| -------------------------------------------------------------------- |:-------------------:|:--------------------:|
-| [Job Template Expansion](/docs/tasks/job/parallel-processing-expansion/)           |          1          |     should be 1      |
-| [Queue with Pod Per Work Item](/docs/tasks/job/coarse-parallel-processing-work-queue/)   |          W          |        any           |
-| [Queue with Variable Pod Count](/docs/tasks/job/fine-parallel-processing-work-queue/)  |          1          |        any           |
-| Single Job with Static Work Assignment                               |          W          |        any           |
-
+|             Pattern                       | `.spec.completions` |  `.spec.parallelism` |
+| ----------------------------------------- |:-------------------:|:--------------------:|
+| [Queue with Pod Per Work Item]            |          W          |        any           |
+| [Queue with Variable Pod Count]           |         null        |        any           |
+| [Indexed Job with Static Work Assignment] |          W          |        any           |
+| [Job Template Expansion]                  |          1          |     should be 1      |
+
+[Queue with Pod Per Work Item]: /docs/tasks/job/coarse-parallel-processing-work-queue/
+[Queue with Variable Pod Count]: /docs/tasks/job/fine-parallel-processing-work-queue/
+[Indexed Job with Static Work Assignment]: /docs/tasks/job/indexed-parallel-processing-static/
+[Job Template Expansion]: /docs/tasks/job/parallel-processing-expansion/
 
 ## Advanced usage
 

diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
@@ -169,6 +169,7 @@ different Kubernetes components.
 | `StorageVersionHash` | `true` | Beta | 1.15 | |
 | `Sysctls` | `true` | Beta | 1.11 | |
 | `TTLAfterFinished` | `false` | Alpha | 1.12 | |
+| `IndexedJob` | `false` | Alpha | 1.21 | |
 | `TopologyManager` | `false` | Alpha | 1.16 | 1.17 |
 | `TopologyManager` | `true` | Beta | 1.18 | |
 | `ValidateProxyRedirects` | `false` | Alpha | 1.12 | 1.13 |
@@ -628,10 +629,12 @@ Each feature gate is designed for enabling/disabling a specific feature:
 - `HyperVContainer`: Enable
   [Hyper-V isolation](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/hyperv-container)
   for Windows containers.
-- `IPv6DualStack`: Enable [dual stack](/docs/concepts/services-networking/dual-stack/)
-  support for IPv6.
 - `ImmutableEphemeralVolumes`: Allows for marking individual Secrets and ConfigMaps as
   immutable for better safety and performance.
+- `IndexedJob`: Allows the [Job](/docs/concepts/workloads/controllers/job/)
+  controller to manage Pod completions per completion index.
+- `IPv6DualStack`: Enable [dual stack](/docs/concepts/services-networking/dual-stack/)
+  support for IPv6.
 - `KubeletConfigFile` (*deprecated*): Enable loading kubelet configuration from
   a file specified using a config file.
   See [setting kubelet parameters via a config file](/docs/tasks/administer-cluster/kubelet-config-file/)

diff --git a/content/en/docs/tasks/job/coarse-parallel-processing-work-queue.md b/content/en/docs/tasks/job/coarse-parallel-processing-work-queue.md
@@ -2,7 +2,7 @@
 title: Coarse Parallel Processing Using a Work Queue
 min-kubernetes-server-version: v1.8
 content_type: task
-weight: 30
+weight: 20
 ---
 
 

diff --git a/content/en/docs/tasks/job/fine-parallel-processing-work-queue.md b/content/en/docs/tasks/job/fine-parallel-processing-work-queue.md
@@ -2,7 +2,7 @@
 title: Fine Parallel Processing Using a Work Queue
 content_type: task
 min-kubernetes-server-version: v1.8
-weight: 40
+weight: 30
 ---
 
 <!-- overview -->

diff --git a/content/en/docs/tasks/job/indexed-parallel-processing-static.md b/content/en/docs/tasks/job/indexed-parallel-processing-static.md
@@ -0,0 +1,184 @@
+---
+title: Indexed Job for Parallel Processing with Static Work Assignment
+content_type: task
+min-kubernetes-server-version: v1.21
+weight: 30
+---
+
+{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
+
+<!-- overview -->
+
+
+In this example, you will run a Kubernetes Job that uses multiple parallel
+worker processes.
+Each worker is a different container running in its own Pod. The Pods have an
+_index number_ that the control plane sets automatically, which allows each Pod
+to identify which part of the overall task to work on.
+
+The pod index is available in the {{< glossary_tooltip text="annotation" term_id="annotation" >}}
+`batch.kubernetes.io/job-completion-index` as a string representing its
+decimal value. In order for the containerized task process to obtain this index,
+you can publish the value of the annotation using the [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#the-downward-api)
+mechanism.
+For convenience, the control plane automatically sets the downward API to
+expose the index in the `JOB_COMPLETION_INDEX` environment variable.
+
+Here is an overview of the steps in this example:
+
+1. **Create an image that can read the pod index**. You might modify the worker
+   program or add a script wrapper.
+2. **Start an Indexed Job**. The downward API allows you to pass the annotation
+   as an environment variable or file to the container.
+
+## {{% heading "prerequisites" %}}
+
+You should already be familiar with the basic,
+non-parallel use of [Job](/docs/concepts/workloads/controllers/job/).
+
+{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
+
+To be able to create Indexed Jobs, make sure to enable the `IndexedJob`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+on the [API server](/docs/reference/command-line-tools-reference/kube-apiserver/)
+and the [controller manager](/docs/reference/command-line-tools-reference/kube-controller-manager/).
+
+<!-- steps -->
+
+## Create a container image
+
+To access the work item from the worker program, you have a few options:
+
+1. Read the `JOB_COMPLETION_INDEX` environment variable. The Job
+   {{< glossary_tooltip text="controller" term_id="controller" >}}
+   automatically links this variable to the annotation containing the completion
+   index.
+1. Read a file that contains the completion index.
+1. Assuming that you can't modify the program, you can wrap it with a script
+   that reads the index using any of the methods above and converts it into
+   something that the program can use as input.
+
+For this example, imagine that you chose option 3 and you want to run the
+[rev](https://man7.org/linux/man-pages/man1/rev.1.html) utility. This
+program accepts a file as an argument and prints each line with its characters reversed.
+
+```shell
+rev data.txt
+```
+
+This program is available in the [busybox container image](https://hub.docker.com/_/busybox):
+
+{{< tabs name="busybox container image" >}}
+{{< tab name="Docker Hub" >}}
+docker.io/library/busybox
+{{< /tab >}}
+{{< tab name="Google Container Registry" >}}
+mirror.gcr.io/library/busybox
+{{< /tab >}}
+{{< /tabs >}}
+
+## Define an Indexed Job
+
+Here is a job definition. You'll need to edit the container image to match your
+preferred registry.
+
+{{< codenew language="yaml" file="application/job/indexed-job.yaml" >}}
+
+In the example above, you use the built-in `JOB_COMPLETION_INDEX` environment
+variable set by the Job controller for all containers. An [init container](/docs/concepts/workloads/pods/init-containers/)
+maps the index to a static value and writes it to a file that is shared with the
+container running the worker through an [emptyDir volume](/docs/concepts/storage/volumes/#emptydir).
+Optionally, you can [define your own environment variable through the downward
+API](/docs/tasks/inject-data-application/environment-variable-expose-pod-information/)
+to publish the index to containers. You can also choose to load a list of values
+from a [ConfigMap as an environment variable or file](/docs/tasks/configure-pod-container/configure-pod-configmap/).
+
+Alternatively, you can directly [use the downward API to pass the annotation
+value as a volume file](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#store-pod-fields),
+as shown in the following example:
+
+{{< codenew language="yaml" file="application/job/indexed-job-vol.yaml" >}}
+
+## Running the Job
+
+Now run the Job:
+
+```shell
+kubectl apply -f ./indexed-job.yaml
+```
+
+Wait a bit, then check on the job:
+
+```shell
+kubectl describe jobs/indexed-job
+```
+
+The output is similar to:
+
+```
+Name:              indexed-job
+Namespace:         default
+Selector:          controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
+Labels:            controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
+                   job-name=indexed-job
+Annotations:       <none>
+Parallelism:       3
+Completions:       5
+Start Time:        Thu, 11 Mar 2021 15:47:34 +0000
+Pods Statuses:     2 Running / 3 Succeeded / 0 Failed
+Completed Indexes: 0-2
+Pod Template:
+  Labels:  controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
+           job-name=indexed-job
+  Init Containers:
+   input:
+    Image:      docker.io/library/bash
+    Port:       <none>
+    Host Port:  <none>
+    Command:
+      bash
+      -c
+      items=(foo bar baz qux xyz)
+      echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt
+
+    Environment:  <none>
+    Mounts:
+      /input from input (rw)
+  Containers:
+   worker:
+    Image:      docker.io/library/busybox
+    Port:       <none>
+    Host Port:  <none>
+    Command:
+      rev
+      /input/data.txt
+    Environment:  <none>
+    Mounts:
+      /input from input (rw)
+  Volumes:
+   input:
+    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
+    Medium:
+    SizeLimit:  <unset>
+Events:
+  Type    Reason            Age   From            Message
+  ----    ------            ----  ----            -------
+  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-njkjj
+  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-9kd4h
+  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-qjwsz
+  Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-fdhq5
+  Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-ncslj
+```
+
+In this example, you run the Job with custom values for each index. You can
+inspect the output of the pods:
+
+```shell
+kubectl logs indexed-job-fdhq5 # Change this to match the name of a Pod in your cluster.
+```
+
+The output is similar to:
+
+```
+xuq
+```
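
The wrapper approach listed as option 3 in the page above can also be sketched as a standalone script rather than an init container. This is a minimal illustration and not part of the commit: the function name `index_to_input`, the `INPUT_FILE` variable and its `/tmp/data.txt` default are assumptions made for the example. Only `JOB_COMPLETION_INDEX` and the item list come from the documentation.

```shell
#!/bin/sh
# Hypothetical wrapper: turn the completion index into an input file for a
# program that only accepts a file argument.

# index_to_input INDEX OUTFILE
# Writes the work item for INDEX to OUTFILE. Fails on a missing, non-numeric
# or out-of-range index instead of silently writing an empty file.
index_to_input() {
  index=$1
  outfile=$2
  case $index in
    ''|*[!0-9]*) echo "invalid index: '$index'" >&2; return 1 ;;
  esac
  # Same illustrative items as the example manifest.
  set -- foo bar baz qux xyz
  if [ "$index" -ge "$#" ]; then
    echo "index $index out of range (0-$(($# - 1)))" >&2
    return 1
  fi
  shift "$index"
  printf '%s\n' "$1" > "$outfile"
}

# Inside a Pod of an Indexed Job, JOB_COMPLETION_INDEX is set automatically.
# Outside a cluster the variable is unset and nothing runs.
if [ -n "${JOB_COMPLETION_INDEX:-}" ]; then
  index_to_input "$JOB_COMPLETION_INDEX" "${INPUT_FILE:-/tmp/data.txt}" &&
    rev "${INPUT_FILE:-/tmp/data.txt}"
fi
```

To try the mapping locally, set the variable by hand, for example `JOB_COMPLETION_INDEX=3 sh wrapper.sh` (the file name is arbitrary). The explicit range check is a design choice for the sketch: a Pod that receives an unexpected index fails visibly rather than processing empty input.
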
diff --git a/content/en/docs/tasks/job/parallel-processing-expansion.md b/content/en/docs/tasks/job/parallel-processing-expansion.md
@@ -2,7 +2,7 @@
 title: Parallel Processing using Expansions
 content_type: task
 min-kubernetes-server-version: v1.8
-weight: 20
+weight: 50
 ---
 
 <!-- overview -->

diff --git a/content/en/examples/application/job/indexed-job-vol.yaml b/content/en/examples/application/job/indexed-job-vol.yaml
@@ -0,0 +1,27 @@
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: 'indexed-job'
+spec:
+  completions: 5
+  parallelism: 3
+  completionMode: Indexed
+  template:
+    spec:
+      restartPolicy: Never
+      containers:
+      - name: 'worker'
+        image: 'docker.io/library/busybox'
+        command:
+        - "rev"
+        - "/input/data.txt"
+        volumeMounts:
+        - mountPath: /input
+          name: input
+      volumes:
+      - name: input
+        downwardAPI:
+          items:
+          - path: "data.txt"
+            fieldRef:
+              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
diff --git a/content/en/examples/application/job/indexed-job.yaml b/content/en/examples/application/job/indexed-job.yaml
@@ -0,0 +1,35 @@
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: 'indexed-job'
+spec:
+  completions: 5
+  parallelism: 3
+  completionMode: Indexed
+  template:
+    spec:
+      restartPolicy: Never
+      initContainers:
+      - name: 'input'
+        image: 'docker.io/library/bash'
+        command:
+        - "bash"
+        - "-c"
+        - |
+          items=(foo bar baz qux xyz)
+          echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt
+        volumeMounts:
+        - mountPath: /input
+          name: input
+      containers:
+      - name: 'worker'
+        image: 'docker.io/library/busybox'
+        command:
+        - "rev"
+        - "/input/data.txt"
+        volumeMounts:
+        - mountPath: /input
+          name: input
+      volumes:
+      - name: input
+        emptyDir: {}
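
With `completions: 5` and `completionMode: Indexed`, the manifest above describes a Job that finishes once each index from 0 to 4 has one successfully completed Pod, and a second success for the same index does not count again. The following sketch only illustrates that rule as the documentation states it. It is not the Job controller's implementation, and the function name `indexed_job_complete` is made up for the example.

```shell
#!/bin/sh
# Illustration only: mimics the documented completion rule for Indexed Jobs.

# indexed_job_complete COMPLETIONS INDEX...
# Succeeds (exit status 0) when every index 0..COMPLETIONS-1 appears at least
# once among the succeeded indexes. Duplicate indexes count once.
indexed_job_complete() {
  completions=$1
  shift
  i=0
  while [ "$i" -lt "$completions" ]; do
    found=no
    for idx in "$@"; do
      if [ "$idx" -eq "$i" ]; then
        found=yes
        break
      fi
    done
    [ "$found" = yes ] || return 1
    i=$((i + 1))
  done
  return 0
}

# Matches the sample `kubectl describe` output: indexes 0-2 are done, so a
# Job with 5 completions is still in progress.
if indexed_job_complete 5 0 1 2; then
  echo "complete"
else
  echo "still running"
fi
```

Five successes that all carry index 0 would also leave the Job incomplete under this rule, which is the difference from `NonIndexed` mode, where any five successful Pods are enough.
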