# Workloads

## Introduction
> <font size=+1> A workload is an application running on `Kubernetes`. </font>

It can be divided into `Pods` and `Workload Resources`.

### Pods overview
Each `workload` runs within __a set of `Pods`__, and each `Pod` __represents a set of containers running together.__

- `Pod`s are scheduled to run on nodes.
- If a `node` fails, all `Pod`s on the node are deleted (__`Pod`s are ephemeral and are easy to (re)create__).
- If a `Pod` fails, it stays in the failed state; nothing recreates it automatically.

Notably, since there are many failure points, a cluster admin managing `Pod`s by hand would have to
- constantly verify that the `Pod`s are running well.
- manually reschedule `Pod`s to different `Node`s.
- manually restart `Pod`s if any container fails.

### Workload resources overview

> __`Workload resources` were created to address the inefficiency of manual handling.__

They can specify, among other things,

- when a `Pod` should be recreated.
- what to do when a failure occurs.
- the number of times to attempt rescheduling before stopping. 

> `Workload Resources` should be used to manage the lifecycle of `Pod`s. Furthermore, avoid deploying 'bare Pods'.

Here are the most common `Workload Resources`:

- Deployment
- StatefulSet
- DaemonSet
- Job and CronJob

In this lesson, we will learn about Pods and Workload Resources, as well as how to implement them.

## Pods


> <font size=+1>A group of one or more containers, with a shared context and a specification for running the containers.</font>

Pods add another layer of abstraction to containers and behave similarly to `docker-compose`, i.e. __they can connect multiple related containers in one logical grouping__.

*Shared context*
- Shared storage
- Shared network resources (e.g. IP)
- Linux namespaces

__Notably, individual applications can be further isolated within a Pod.__

> A pod is a minimal deployment unit in Kubernetes.

This means that containers cannot be deployed on their own; they always run inside a Pod.


### Lifecycle

> `Pod`s remain on their scheduled `Node` until termination (according to the restart `policy` if a failure occurs) or deletion.

__Further, they are never moved across nodes; instead, they are replaced by newly created `Pod`s.__

#### Features
- `Pod`s cannot self-heal (i.e. initiate self-restart). The restart is carried out by the appropriate `Workload Resources` or the cluster admin.
- __The `kubelet` can restart failed containers within a `Pod`__ (according to the `Pod`'s restart policy).
- Related resources (e.g. `volume`s) are also deleted after `Pod` termination, unless otherwise specified.
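
As a minimal sketch of how the restart policy mentioned above is declared, the config below sets `restartPolicy` on a Pod. The Pod name, the container name, the `busybox` image and the command are illustrative assumptions, not part of this lesson's setup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restart-demo          # hypothetical name
spec:
  # One of Always (the default for Pods), OnFailure or Never
  restartPolicy: OnFailure
  containers:
  - name: worker              # hypothetical container
    image: busybox
    # Exits with a non-zero code, so the kubelet restarts the container
    command: ["sh", "-c", "echo working; exit 1"]
```

With `OnFailure`, the container is restarted in place on the same node; the Pod itself is not recreated or moved.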

`Pod`s can be in one of multiple phases:
- `Pending`: The pod has been accepted by the `k8s` cluster; however, one or more containers are not yet running because, for example,
    - the `Pod` is waiting to be scheduled to a node.
    - the container image is still downloading.
- `Running`: The `Pod` is bound to a node, and at least one `container` within the `Pod` is running (or being started or restarted).
- `Succeeded`: All the containers in the `Pod` have terminated successfully, __and will not be restarted.__
- `Failed`: All the containers in the `Pod` have terminated, and at least one of them terminated in failure.
- `Unknown`: The state of the `Pod` could not be obtained (__typically caused by a communication error with the `Node`__).

### Container states

> __Kubernetes also watches the state of individual containers within `Pod`s.__

Containers can be in one of three states:
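- `Waiting`: the container is still completing the operations it needs to start, e.g. pulling the image or applying `secrets` (__the reason for this state is provided for monitoring__).
- `Running`: the container is executing without issues.
- `Terminated`: the container either ran to completion or failed (__the reason and `exit code` are provided for monitoring__).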
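
To make the phases and container states more concrete, here is an abridged, illustrative excerpt of the `status` section reported by `kubectl get pod <pod-name> -o yaml`. The container name, timestamps and exit code are made-up placeholder values.

```yaml
status:
  phase: Running                # the Pod phase
  containerStatuses:
  - name: hello1                # hypothetical container name
    ready: true
    restartCount: 1
    state:                      # current container state
      running:
        startedAt: "2024-01-01T10:05:00Z"
    lastState:                  # state before the most recent restart
      terminated:
        exitCode: 1
        reason: Error
        finishedAt: "2024-01-01T10:04:50Z"
```

Here the Pod is `Running` even though its container failed once, because the container was restarted and is running again.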


### Single-container pods

Generally, Pods are run with a single container. A Pod can be considered as a container wrapper.

Examples include the following:
- A FastAPI server receiving requests and saving them to a shared database.
- A Docker container receiving images as requests and returning their classification.


### Multiple-container pods

> Multi-container pods are employed in more advanced use cases. They comprise multiple, tightly coupled containers, __affording a cohesive service unit.__

<p align=center><img src=images/pod.svg width=350></p>

Here are examples of use cases:
- Training multiple ML models, where the
    - first container accesses the shared storage of raw data and transforms it.
    - second container trains the neural network on the presented data.
    - third container pushes the trained model to the serving container.
    - fourth container serves the model.
- One container serves data to the public (`read only` permissions), while another, an internal one, writes data to shared storage.

> __The containers of a Pod are scheduled on the same 'logical host' (the same VM in the cloud, or the same physical computer in a cluster of servers) because of their tight coupling.__
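
The shared-storage use case can be sketched as a Pod with two containers mounting the same volume. This is only an illustration under assumed names: the Pod, container and volume names, the `busybox` image and the shell commands are hypothetical and stand in for real applications.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-storage-demo     # hypothetical name
spec:
  volumes:
  - name: shared-data           # lives as long as the Pod does
    emptyDir: {}
  containers:
  - name: writer                # internal container with write access
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader                # serving container, read-only access
    image: busybox
    command: ["sh", "-c", "while true; do tail -n 1 /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
      readOnly: true
```

Both containers are scheduled together on one node and see the same files under `/data`, which is what makes this kind of tight coupling possible.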

As mentioned in 'Introduction', pods in a cluster can be viewed using the `kubectl get pod` command. However, only the pods in the default namespace will be shown. To observe the pods in all namespaces, add the `-A` flag.

_If you have not done so already, run `minikube start` to create a cluster on your local machine._

In [4]:
!kubectl get pod -A

NAMESPACE              NAME                                         READY   STATUS    RESTARTS       AGE
default                hello-minikube-6ddfcc9757-mndds              1/1     Running   2 (106s ago)   16h
kube-system            coredns-78fcd69978-8j8hz                     1/1     Running   2 (106s ago)   19h
kube-system            etcd-minikube                                1/1     Running   2 (106s ago)   19h
kube-system            kube-apiserver-minikube                      1/1     Running   2 (106s ago)   19h
kube-system            kube-controller-manager-minikube             1/1     Running   2 (106s ago)   19h
kube-system            kube-proxy-jb22b                             1/1     Running   2 (106s ago)   16h
kube-system            kube-scheduler-minikube                      1/1     Running   2 (106s ago)   19h
kube-system            storage-provisioner                          1/1     Running   4 (32s ago)    16h
kubernetes-dashboard   dashboard-metrics-scraper-559445

As shown, we already have some Pods. This is because `minikube` comes with some pods by default. In the next section, we will see how to deploy pods.

### Defining pods (pod template)

As mentioned in the last lesson, Kubernetes objects can be defined imperatively (using defined steps) or declaratively. In this lesson, we will focus on defining the declarative configurations, which are specified using `.yaml` files.

A pod, as a basic `kind`, can be specified via the `.yaml` config file; however, __we strongly advise against specifying bare `Pod`s.__

It is easy to specify a 'bare `Pod`':

```
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  labels:
    tier: frontend
spec:
  containers:
  - name: hello1
    image: gcr.io/google-samples/hello-app:2.0
```

In the previous lesson, we learnt about the API versions and how to find them in the [API docs](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/). 

Inside `spec`, we added `containers`, which lists the containers belonging to the pod. Inside each entry of `containers`, we specified `image`, which corresponds to the Docker image name. In this case, it is a sample image from Google.

In [10]:
# Observe the pods in the default namespace
!kubectl ###Your command here


No resources found in default namespace.


In [13]:
# Spin up the pod corresponding to the single-pod configuration above
!kubectl

pod/pod1 created


In [14]:
# Observe the pods in the default namespace again
!kubectl

NAME   READY   STATUS              RESTARTS   AGE
pod1   0/1     ContainerCreating   0          23s


- Delete the pod using the correct `kubectl` command.
- Observe the pods in the default namespace once again.

In [16]:
# Delete the pod
!kubectl

pod "pod1" deleted


In [17]:
!kubectl

No resources found in default namespace.


Unlike in the last notebook, the pod disappears here. This is because no instructions were provided to keep the pod alive: the pod is 'bare', without any resource to recreate it after it fails or is deleted.

Keeping it 'alive' is one of the desired states that can be specified in the config file, and it can be achieved with a `Deployment` or a `ReplicaSet`. More options can be found within the workload resources.

## Workload Resources

Before diving into specific `Workload Resources`, here are a few important concepts.

- Each `workload` presented below uses the `.spec.template` field, which specifies how to create a pod.
- The `template` is essentially the same as a pod config, except that it has no `kind` and `apiVersion` fields.
- Each `workload` has a `.spec.selector` that specifies which pods are handled by the `workload resource`.
- `.spec.selector` matches pods by their `labels`, so it may also 'handle' pods defined outside its own config file.

### Implementation
Workload resources can be implemented via
- `Deployment`
- `DaemonSet`
- `Job`

In subsequent lessons, we will learn how to implement another type of workload, `StatefulSets`.

### ReplicaSet

>  This maintains a stable set of replicated pods running at any given time.

`ReplicaSet`:
- creates new `Pod`s (from the `.spec.template` config) until their number matches the `.spec.replicas` field value.
- deletes `Pod`s if more of them are running than `.spec.replicas` specifies.
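
A minimal `ReplicaSet` config might look like the sketch below. The name `frontend` and the replica count are assumptions chosen for illustration; the label and image are reused from the bare `Pod` example earlier in this lesson.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend                # hypothetical name
spec:
  replicas: 3                   # desired number of Pods
  selector:
    matchLabels:
      tier: frontend            # must match the template labels below
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: hello1
        image: gcr.io/google-samples/hello-app:2.0
```

Note that the selector works on labels only, so any other Pod carrying `tier: frontend` counts towards the three replicas.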

#### Acquiring pods

> `ReplicaSet` is linked to the pods via `metadata.ownerReferences` and acquired via `.spec.selector` matching.

The above works as follows:
- Each pod has `metadata.ownerReferences`, which was __automatically added by `k8s`.__
- The above specifies who manages the pod, e.g. another controller.
- __If the `Pod`__ has no 'owner' (e.g. a bare `Pod`) __or__ its owner __is not a controller, and__ the `.spec.selector` matches its labels, then the `Pod` is acquired by the `ReplicaSet`.
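
For illustration, the metadata of an acquired `Pod` could look roughly like the abridged excerpt below. The Pod name, the owner name and the `uid` are placeholders; real values are generated by the cluster.

```yaml
metadata:
  name: frontend-b2zdv          # hypothetical generated Pod name
  labels:
    tier: frontend
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: frontend              # the owning ReplicaSet (hypothetical)
    uid: 00000000-0000-0000-0000-000000000000   # placeholder
    controller: true            # this owner is the managing controller
    blockOwnerDeletion: true
```

A bare `Pod` has no `ownerReferences` entry at all, which is why a matching selector is enough for it to be acquired.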

> <font size=+1>The process above works the same for other `workload resources` (or managers).</font>

#### Using `ReplicaSet`s

> __Generally, it is advised to use the higher-level `Deployment` `workload resource` instead.__

`Deployment`, a high-level concept, __manages `ReplicaSet`s__ and, in addition, provides __declarative updates to pods.__

`ReplicaSet`s may be employed if
- custom-update orchestration is to be performed.
- the `config` file will never be updated.

> <font size=+1>Note, however, that we recommend using Deployment instead.</font>

For more detailed information on `ReplicaSet`s, check [here](https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/).

### Deployment

> A `Deployment` provides declarative updates for `Pods` and `ReplicaSets`.

Consider the example config below, and attempt to decipher the meaning of each field:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
```

In the example above, the resource is being instructed to ensure that three pods are running at all times. It finds its pods using `selector.matchLabels`, which matches every pod whose labels include the key `app` with the value `nginx`; the `template` assigns exactly this label to the pods it creates.

#### Example

1. Create a .yaml file with the above configuration.
2. Run the right `kubectl` command to apply the .yaml file.
3. Observe the number of pods.
4. Delete one pod.
5. Observe the number of pods once more.

In [1]:
!kubectl

deployment.apps/nginx-deployment created


In [2]:
# Observe how many pods you have
!kubectl

NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-66b6c48dd5-7qpvr   0/1     ContainerCreating   0          17s
nginx-deployment-66b6c48dd5-gc7z7   0/1     ContainerCreating   0          17s
nginx-deployment-66b6c48dd5-vzsx2   0/1     ContainerCreating   0          17s


Now, we can use the running `nginx` containers to run some commands manually.

In this case, we run the command in the terminal:

<p align=center><img src=images/nginx.png width=700></p>

Next, delete one of the pods. As shown below, the deleted pod is replaced by a new one with a different name:

In [1]:
# Delete one of the pods
!kubectl delete pod nginx-deployment-66b6c48dd5-vzsx2

pod "nginx-deployment-66b6c48dd5-vzsx2" deleted


In [2]:
!kubectl get pod

NAME                                READY   STATUS    RESTARTS        AGE
nginx-deployment-66b6c48dd5-7qpvr   1/1     Running   1 (8m15s ago)   8h
nginx-deployment-66b6c48dd5-fsck5   1/1     Running   0               40s
nginx-deployment-66b6c48dd5-gc7z7   1/1     Running   1 (8m15s ago)   8h


To avoid confusion, delete the deployment resource.

In [3]:
# Delete the deployment resource
!kubectl delete deployment nginx-deployment

deployment.apps "nginx-deployment" deleted


### DaemonSet

> A DaemonSet ensures that a copy of a pod runs on every Node, including Nodes that are added to the cluster later.

With a DaemonSet, we have one pod per node; therefore, if a node is removed, the number of pods decreases, and if a node is added, it increases.

DaemonSets are generally used for monitoring services. A single pod per node can be used to monitor the health or capture the logs of that node. As you can probably tell, they are quite similar to `Deployment`s; however, they do not have replicas.

#### Required fields

> The required fields are similar to those for `Deployment`, i.e. `.spec.template`, `.spec.selector` and __no `.spec.replicas` (as the same `daemon` is run per node).__

- Similarly, `POD`s are acquired via `.spec.selector` matching.
- Additionally, __`Node`s are acquired via the `.spec.template.spec.nodeSelector` field.__

### Assigning pods to nodes

> Generally, we do not have to interfere with how `k8s` assigns `Pod`s to specific `Node`s.

However, interference may be warranted for the following reasons:
- To ensure that the `Pod` ends up on a `Node` with an `SSD` attached.
- To co-locate `Pod`s from different services in the same zone if they communicate frequently.

`k8s` comes with a set of predefined `labels` for `Node`s. 

Here are a few common labels (the full list can be found [here](https://kubernetes.io/docs/reference/labels-annotations-taints/)):
- Region of deployment (for cloud): `topology.kubernetes.io/region=us-east-1`.
- Hostname of the node: `kubernetes.io/hostname=ip-172-20-114-199.ec2.internal`.
- Operating system of the node: `kubernetes.io/os=linux`.

> __In the `kubectl` lesson, we will learn how to add custom `label`s to `Node`s.__
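
As a sketch of the SSD use case, a single `Pod` can be pinned to labelled nodes through `.spec.nodeSelector`. The label `disktype: ssd` is a hypothetical custom label that would have to be added to the node first, and the Pod name is likewise an assumption.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod                 # hypothetical name
spec:
  # Schedule only onto nodes carrying this (custom, hypothetical) label
  nodeSelector:
    disktype: ssd
  containers:
  - name: hello1
    image: gcr.io/google-samples/hello-app:2.0
```

If no node carries the label, the Pod stays in the `Pending` phase until a matching node becomes available.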

Now, the `.spec.template.spec.nodeSelector` can be used as follows (here, restricting the `DaemonSet` to Linux nodes):

```
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      # Only run on nodes carrying this predefined label
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
      terminationGracePeriodSeconds: 30
```

Before proceeding, we will briefly go over container resources. As can be observed in the Kubernetes object, there is a new field named `resources` under `template.spec.containers`. This field, in turn, has two more fields: `limits` and `requests`.

- `limits`: These values represent the maximum capacities allotted to a container. If the process uses more than 200 MiB of RAM, the container is terminated and, depending on the restart policy, restarted.
- `requests`: These values represent the capacities reserved for a container, which the scheduler uses to pick a node with enough free resources. In the example above, 200 MiB of RAM and 100 millicores are requested. A _millicore_ is a thousandth of a CPU core, i.e. each core is equivalent to 1000 millicores.
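
The units can be illustrated with a small, hypothetical `resources` fragment (the values are arbitrary and only meant to show the notation):

```yaml
resources:
  requests:
    cpu: 250m          # 250 millicores, i.e. a quarter of a core (same as "0.25")
    memory: 64Mi       # 64 mebibytes
  limits:
    cpu: 500m          # half a core
    memory: 128Mi
```

Requests should not exceed limits; a container may use more than it requested, but never more than its limit.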

For more information on container resources, visit the [link](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/).

As you can observe, most of the fields in the DaemonSet are the same as those in the Deployment resource. Observe that the `replicas` key is not used. This is because, as mentioned, there will be a single `DaemonSet` pod per node. For this demonstration, we add a node to the existing minikube cluster.

In [None]:
!minikube node add

### Jobs

A `Job` creates one or more pods and keeps retrying their execution until a specified number of them terminate successfully.

#### Use cases
- A `Job` is created to ensure the successful running of a task.
- The same task can be run `N` times in parallel.

Below is an example config `Job` workload that calculates the `pi` value:

```
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
```

#### `.spec`ification

For Jobs, the standard fields are necessary, in addition to the following:

- `.spec.template.spec.restartPolicy`: must be set to either `Never` or `OnFailure` (the `Pod` default, `Always`, is not allowed for `Job`s).
- `.spec.completions`: the number of `pod`s that must terminate successfully for the `Job` to be complete.
- `.spec.parallelism`: the maximum number of `pod`s the `Job` runs simultaneously.

Using `.spec.completions` and `.spec.parallelism`, we can construct different levels of parallelism:

- __non-parallel__: leave both fields at their default of `1`, and only one `pod` will be started (__a new one will only start if this one fails__).
- __parallel with a fixed completion count__: specify `.spec.completions=N`; the `Job` is complete once `N` `pod`s have succeeded, with __at most `.spec.parallelism` `pod`s running at a given time__ (the controller will recreate the `pod`s in case of a failure).
- __parallel with work queue__: leave `.spec.completions` unset and specify `.spec.parallelism=N`; `N` pods will run simultaneously; once one of them succeeds, no new pods are started, and __the execution of the rest will continue until termination__ (the pods must coordinate the work among themselves or through an external queue).

`non-parallel` is the default mode, while `.spec.completions=1` and `.spec.parallelism=1` are the default values.
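
As a sketch of the 'parallel with a fixed completion count' mode, the `pi` example can be extended with both fields. The name `pi-parallel` and the chosen numbers are assumptions for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-parallel             # hypothetical name
spec:
  completions: 6                # the Job is complete after 6 successful pods
  parallelism: 2                # at most 2 pods run at the same time
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

With these values, the pods run roughly in three waves of two, assuming none of them fails.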

One could also specify the [completion Mode](https://kubernetes.io/docs/concepts/workloads/controllers/job/#completion-mode), which allows us to modify the behaviour of the pods upon termination.

- `.spec.backoffLimit`: the number of retries before considering the `Job` as failed (`6` by default).

Things to note in this case:
- Depending on the settings, once `.spec.completions=N` successful `pod`s are reached, the __job__ is considered successful.
- Until then, the `Job` recreates failed `pod`s, up to `.spec.backoffLimit` times.
- Exponential back-off delay is applied:
    - first retry after `10s`
    - second after `20s`
    - third after `40s`
    - __capped at `6m` backoff.__
    
- `.spec.activeDeadlineSeconds`: how long (in seconds) the __whole job__ may run, regardless of how many `pod`s are created. Once reached, __all the running `pod`s are terminated__ and the `Job` fails (takes precedence over `.spec.backoffLimit`).
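
Both limits can be combined in one config, as in the sketch below. The name `pi-with-timeout` and the numbers are illustrative assumptions.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout         # hypothetical name
spec:
  backoffLimit: 5               # give up after 5 retries
  activeDeadlineSeconds: 100    # or after 100 seconds, whichever comes first
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

If the deadline is reached while retries remain, the `Job` still fails, because the deadline takes precedence.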
    
#### Cleaning up

> Once completed, a `Job` __will not automatically be removed from the cluster.__

> This is disadvantageous because `kube-apiserver` will still query `Job` and look for its `pod`s, __applying unnecessary pressure on `k8s`.__

This is the default behaviour because the user may need to check
- the logs of finished jobs, which are stored within `pod`s or in an external storage volume.
- the status of `Job`(s).

__`TTL` (time to live) offers an adequate solution to this issue.__

> __`TTL` specifies when the `job` should be removed from the cluster, including all of its `pod`s and dependencies__.

TTL can be set up via the `.spec.ttlSecondsAfterFinished` field, as shown below:

```
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

As mentioned, `Job`s terminate once the required number of Pods has completed successfully. To repeat that operation, you would need to `apply` the Kubernetes object again to re-run the job. Fortunately, there is an easier approach to generate `Job`s periodically with the desired frequency: __`CronJob`__.

A `CronJob` creates `Job`s on a repeating schedule. The desired schedule can be specified in the specs:
```
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
```

## Conclusion
At this point, you should have a good understanding of
- pods and workload resources (Deployments, DaemonSets, Jobs and CronJobs).
- how to define them declaratively in .yaml files.