# Storage

> `POD`s and containers are __ephemeral__, which has its upsides and downsides

The upsides, as we have seen:
- Easy to create
- Easy to destroy
- Immutable and easier to reason about
- Easy to parallelize

But, unfortunately, there are some downsides:
- __When a container/`POD` is removed, all of the data it created on local disk is removed as well__
- Each container is a separate entity, hence sharing data between containers is not possible...

... not possible unless we introduce `Volume`s!

# Volumes

`Volume`s in `k8s` are a little different from the ones in `Docker`.

Brief overview of `volume`s in `Docker`:
- Directory on `disk` or in another `container`
- `Volume`s are mounted to containers during runtime
- One can share data across instances (or using `cloud` storage) via `drivers` (see [here](https://docs.docker.com/storage/volumes/#share-data-among-machines))

The above resolves some of our problems, but the feature set is quite limited and __is not enough for handling large-scale deployments__

> `Kubernetes` supports a lot of `volume` types which makes our life substantially easier

High level features:
- __Any number of `volume`s (of different types) can be mounted at the same time__
- __Ephemeral `volume`s__ - have the same lifetime as the `POD`
- __Persistent `volume`s__ - are independent of the `POD` lifetime
- __Data is preserved across `container` restarts__ (handled by `kubelet`)

Providing volumes to `POD`s is done via:
- `.spec.volumes` - specifies which `volumes` to use
- `.spec.containers[*].volumeMounts` - where and which volume to mount for specific container
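As a minimal sketch of how these two fields fit together, the bare `POD` below shares one `emptyDir` volume (an ephemeral type that lives as long as the `POD`) between two containers. The names `shared-data-demo`, `writer`, `reader` and `scratch`, as well as the `busybox` image and the shell commands, are assumptions made purely for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-data-demo     # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox
    # Appends a timestamp to a file on the shared volume every 5 seconds
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: scratch          # must match an entry in .spec.volumes
      mountPath: /data
  - name: reader
    image: busybox
    # Prints the same file, seen through its own (read-only) mount point
    command: ["sh", "-c", "while true; do cat /input/out.log 2>/dev/null; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /input
      readOnly: true
  volumes:
  - name: scratch
    emptyDir: {}             # data is removed together with the POD
```

Both containers refer to the volume by `name` only; each one chooses its own `mountPath`.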

`DaemonSet` with volume mounting (mounting a `volume` residing on the `node`, similar to the basic `docker` volume case):

In [None]:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        # Mount the volumes defined in `volumes` below (matched by `name`)
        # These are `hostPath` volumes: directories of the Node
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      # Here we define our volumes
      # Directories from the Node will be mounted into the POD
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

## `subPath`

> __`subPath` allows us to share a single `volume` for multiple uses in a single `POD`__

We could also do that via `mountPath`, but:
- there is no access separation between different containers
- more data is mounted than necessary for containers

Below is an example of a `LAMP` stack (Linux, Apache, MySQL, PHP) defined as a "bare POD":

In [None]:
apiVersion: v1
kind: Pod
metadata:
  name: my-lamp-site
spec:
    containers:
    - name: mysql
      image: mysql
      env:
      - name: MYSQL_ROOT_PASSWORD
        value: "rootpasswd"
      volumeMounts:
      - mountPath: /var/lib/mysql
        name: site-data
        subPath: mysql
    - name: php
      image: php:7.0-apache
      volumeMounts:
      - mountPath: /var/www/html
        name: site-data
        subPath: html
    volumes:
    - name: site-data
      persistentVolumeClaim:
        claimName: my-lamp-site-data

What `subPath` does in this case:
- for `php`, data within the `persistentVolume` inside the `html` folder will be mounted under `/var/www/html`
- for `mysql`, data within the `persistentVolume` inside the `mysql` folder will be mounted under `/var/lib/mysql`

Please note that:
- the `html` folder in the `persistentVolume` __is not available for the `mysql` container__
- the `mysql` folder in the `persistentVolume` __is not available for the `php` container__

An additional explanation is given [here](https://stackoverflow.com/questions/65399714/what-is-the-difference-between-subpath-and-mountpath-in-kubernetes)

## Volume Types

`kubernetes` provides quite a few integrations for standard volumes, including:
- `AWS Elastic Block Store`
- Microsoft's Azure `Disk` and `File`
- Self-hosted `cephfs`
- [Google Cloud Persistent Disk](https://kubernetes.io/docs/concepts/storage/volumes/#gcepersistentdisk)

An example config with `awsElasticBlockStore` could be:

In [None]:
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    # This AWS EBS volume must already exist.
    awsElasticBlockStore:
      volumeID: "<volume id>"
      fsType: ext4

To see volume types in more detail, [check here](https://kubernetes.io/docs/concepts/storage/volumes/#volume-types).

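> __`Volume`s declared directly in `.spec.volumes` are tied to the lifetime of their `POD`! For some types (e.g. `emptyDir`) the data is removed with the `POD`, while for others (e.g. `hostPath` or an EBS disk) the underlying data outlives it__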

# Persistent Volumes

> `Kubernetes` provides two resources to manage persistent storage: `PersistentVolume` and `PersistentVolumeClaim`

This abstraction allows us to:
- Abstract how storage is provided
- Abstract how storage is consumed

## PersistentVolume

> A piece of storage in the cluster that __has been provisioned by an administrator or dynamically provisioned using `Storage Classes`__

Features:
- a resource in the cluster (just like a `Node`)
- implemented as volume plugins, just like the `Volume`s described above
- __they have a lifecycle independent of any `POD` using them__
- once bound, they can be used just like a `volume`

## PersistentVolumeClaim

> A request for storage made by a user

Features:
- Conceptually similar to `POD`s:
    - `POD`s consume `Node` resources
    - `PVC`s consume `PV` resources
    - `POD`s can request specific resources (e.g. CPU or memory)
    - `PVC`s can request specific sizes and access modes (e.g. `ReadWriteOnce`)
    
## Provisioning

There are two ways `PV` can be provisioned:
- `statically` - cluster admin creates `PV`s for consumption
- `dynamically` - cluster tries to dynamically provision an appropriate `PersistentVolume` based on the __`StorageClass`__ requested by the `PersistentVolumeClaim` (administrator has to create the `StorageClass`, __described later__)

Features:
- __`Dynamic Provision` will always match exactly the requirements of the `PVC`__
- __`Static Provision` has to match AT LEAST the given claim__ (e.g. a claim for `50Gi` might be bound to a `100Gi` volume)
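The "at least" rule for static provisioning can be sketched as follows. This is an illustrative pair only: the names `pv-static-demo` and `pvc-static-demo` and the `hostPath` location are assumptions, and `hostPath` is only suitable for single-node test clusters.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static-demo        # hypothetical name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""        # no class, so only static binding applies
  hostPath:
    path: /mnt/demo-data      # assumed directory on the Node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static-demo       # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""        # ask for a PV without a class
  resources:
    requests:
      storage: 50Gi           # can be satisfied by the 100Gi volume above
```

Once bound, the claim holds the whole `100Gi` volume; the unused `50Gi` is not available to other claims.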

## Reclaim Policy

> When a user is done with their volume, they can delete the PVC objects from the API that __allows reclamation of the resource__

There are three ways to reclaim the resource:
- [`retain`](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#retain) - leave the data as is (leave the `PersistentVolume` and the external storage in place so that an administrator can reclaim them manually)
- [`delete`](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#delete) - delete `PersistentVolume` __and external storage__. This one is default for `Dynamic Provisioning` (although configurable)
- `recycle` (__now deprecated, dynamic provisioning should be used instead__, see [here](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#recycle))

## Phases

> A volume is always in one of four phases, namely:

- `Available` - free resource, not yet bound to a claim
- `Bound` - CLI will show the `PersistentVolumeClaim` to which the `PersistentVolume` is bound
- `Released` - the `PersistentVolumeClaim` has been deleted, but the resource is not yet reclaimed by the `cluster`
- `Failed` - reclamation failed

## Specifying `PersistentVolume`

As with `POD`, we create them using `.yaml` config files, example below:

In [None]:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0003
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2

Let's describe the arguments provided:
1. `.spec.capacity.storage` - the volume provides `5Gi` of storage. Currently storage size is the only resource that can be set (see [`k8s` resource model for description of units](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/resources.md))
2. `.spec.volumeMode` - either `Filesystem` (default) or `Block`:
    - `Filesystem` - directory mounted in `POD`s
    - `Block` - uses a __raw block of storage__ (without a filesystem created)
    - __`Block` is rarely used as the application needs to know how to access `raw` data__ (see [here](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#raw-block-volume-support) for example)
3. `.spec.accessModes` - how data can be accessed:
    - `ReadWriteOnce` - volume mounted as `rw` __by a single `Node`__
    - `ReadOnlyMany` - __can be mounted by many `nodes`__ but data can only be read
    - `ReadWriteMany` - as above but both `rw` (applications have to handle possible data races!)
    - `ReadWriteOncePod` - __single `POD` can `rw` data__
      
Which of the modes above are supported depends on the type of `PersistentVolume` provider; a few of them are shown below:

![](./images/modes_providers.png)

4. __`.spec.storageClassName`__ - specifies the `StorageClass` of this `PV`; if left unspecified, __the `PV` has no class and can only be bound to `PVC`s that request no particular class!__
5. `.spec.mountOptions` - (__NOT SUPPORTED BY ALL TYPES!__); specifies how to mount the `disk`, one can leave it as is
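To make the `Block` volume mode from point 2 more concrete, here is a hedged sketch of a claim for a raw block volume and a `POD` consuming it. All names (`block-pvc`, `pod-with-block-volume`, `data`), the `busybox` image and the device path are assumptions; the cluster also needs a volume plugin that supports raw block devices.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc             # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block           # raw device, no filesystem is created
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeDevices:            # used instead of volumeMounts for Block mode
    - name: data
      devicePath: /dev/xvda   # device node visible inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc
```

Note that the container gets a device node rather than a directory, which is why the application itself has to know how to read and write it.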

# StorageClasses

> A StorageClass provides a way for administrators to describe the "classes" of storage they offer.

These allow us to __dynamically provision storage__ and act like a template for new `PersistentVolume`s.

As per usual, these are defined using `.yaml` files and can be referenced by `PV` and `PVC` config files.

A small example:

In [None]:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate

> ### `.metadata.name` value allows users to request this `StorageClass`!

## Mandatory Fields

> `provisioner` - which volume plugin is used for provisioning `PV`s

Most common ones are shipped with `k8s` under `kubernetes.io` prefix, for example:

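- `local` - `kubernetes.io/no-provisioner` - local volumes __do not support dynamic provisioning__; a `StorageClass` is still created so that volume binding can be delayed until a `POD` is scheduled (`WaitForFirstConsumer`)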

In [None]:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

- `GCEPersistentDisk` - Persistent disk from Google Cloud

In [None]:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  fstype: ext4
  replication-type: none

We are not limited to the ones provided; `provisioner`s can be written by anyone and hosted externally

> `parameters` define per-provisioner specification of volume properties

You can see examples for different internal provisioners [here](https://kubernetes.io/docs/concepts/storage/storage-classes/#aws-ebs)

> `reclaimPolicy` - when `PV` is freed from `PVC` what should be done with created `PersistentVolume`

As mentioned previously, this is either `Delete` (the default) or `Retain`.

## Expandable Volumes

> From `k8s` 1.11 one can __expand volume dynamically__

This happens when we increase the storage request in our `PVC` and `apply` the new config.

> Easiest way to use this feature is to use one of internal cloud providers

__All we have to do is set `.allowVolumeExpansion: true` in our `StorageClass` definition__

Below is a list of `providers` supporting expandable volumes:

![](./images/expandable-volumes.png)
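A minimal sketch of the expansion workflow, assuming a hypothetical `StorageClass` called `expandable` and a claim called `growing-claim` that was originally created with `8Gi`:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable            # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
allowVolumeExpansion: true    # without this, resize requests are rejected
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: growing-claim         # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: expandable
  resources:
    requests:
      storage: 16Gi           # was 8Gi; raise the value and re-apply to expand
```

Only growing a volume is supported; lowering the requested size is not.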

# PersistentVolumeClaims

> The third kind of `.yaml` resource, which specifies a claim for a `PersistentVolume`

As previously, described via `.yaml` and specifying appropriate `kind`.

Also, it contains `spec` and `status` as other `k8s` objects described up to this point:

In [None]:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: slow
  selector:
    matchLabels:
      release: "stable"

## Parameters

This time it will be a little easier:
- `accessModes` - same as `PV`
- `volumeMode` - same as `PV`
- `resources` - amount of storage requested (same units as `capacity` in `PV`)
- `selector` - further specify which volumes can fulfill the claim via matching (basics of which we have already seen for `POD`s)
- __`storageClassName` - request a specific `storageClass`__; only `PV`s of the same class can be bound to the `PVC`

`storageClassName` features:
- if defined as an empty string (`""`), __a request for a `PV` without any `storageClass` is made__
- if not defined, __the default `storageClass` is used__ (if the admin marked one as default via the `storageclass.kubernetes.io/is-default-class: "true"` annotation in the `storageClass` k8s object definition)
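For illustration, marking a `StorageClass` as the cluster default could look like the sketch below. The class name `standard` and the provisioner settings are assumptions reused from the earlier example; only the annotation is the relevant part.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # PVCs that omit storageClassName will get this class
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
```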

# Claims as Volumes

Finally, we can reference a specific `PVC` within our `POD` (or any other `workload`) in order to get a specific storage type:

In [None]:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim

Things to note:
- `claims` must be in the same namespace as `pod`
- `claimName` specifies which `claim` we refer to

# Storage summary

So, given all of the above, what rough guideline could one follow?

> __Decide how your data should be shared__

Answer this question first:
- Shared between containers or `POD`s?

If it is shared between containers, answer this question:
- Should data be preserved after `POD` termination?

__If not, use `ephemeral volumes` to simply exchange data between containers (e.g. an `emptyDir` volume shared within a single `POD`)__

__If it is shared between `POD`s or should be preserved, use `PersistentVolumes`__

## Chose `PersistentVolume`, what now?

Another question to help you:

- Do I need dynamic provisioning (e.g. is it hard to know beforehand how much storage I will need)?

If not, create the following:
- `PersistentVolume` `.yml` config (defines how to create `volume`)
- `PersistentVolumeClaim` `.yml` config (defines what the `volume request` looks like)
- `MyApplication` `.yml` config (your `workload resource`, __avoid bare `POD`s__)
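The third file could be sketched as a `Deployment` instead of a bare `POD`. The name `my-application`, its labels, the `nginx` image and the mount path are assumptions; `myclaim` refers to the `PersistentVolumeClaim` example shown earlier.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application        # hypothetical name
spec:
  replicas: 1                 # a ReadWriteOnce claim can be mounted from one Node only
  selector:
    matchLabels:
      app: my-application
  template:
    metadata:
      labels:
        app: my-application
    spec:
      containers:
      - name: web
        image: nginx
        volumeMounts:
        - name: site-data
          mountPath: /usr/share/nginx/html
      volumes:
      - name: site-data
        persistentVolumeClaim:
          claimName: myclaim  # the PVC must live in the same namespace
```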

## I need `dynamic` provisioning, what now?

> __Use this for large scale apps where "by-hand" provisioning is infeasible__

This might happen due to a few reasons:
- A lot of `POD`s requesting a lot of storage
- We cannot see beforehand how many `POD`s will run

> Prefer cloud storage providers in this case, __as one might run out of local storage for large deployments__

In this case, to the steps outlined above one should add another `.yml` config file:
- `StorageClass` `.yml` config (acts as a template for giving out `PersistentVolume`s to `POD`s in need)

## Other options?

Yes, one can also use aforementioned `expandable volumes` if:
- you know how many `POD`s __at most__ will run at any given time
- you are not sure how much storage will be needed for each `POD`

# StatefulSets

Previously, we saw `kubernetes` basics, namely:
- What it does (high level view)
- Basic concepts and their parts:
    - `cluster`
    - `control-plane`
    - `Node`s
    - `POD`s
- `k8s` objects and how to:
    - Create them and manage
    - How to `apply` using `declarative` approach
    - What are the upsides of such approach
- A few `workload resource`s and why:
    - These are better than using "bare `POD`s"
    - What they can do and how to specify them
    - Specific applications and `field`s required
    
Previously shown `workload`s are used for __stateless applications__ (e.g. the ones not writing to external storage).

In this notebook we will also cover `StatefulSets`

## What is it?

> Workload used to manage __stateful applications__

Differences between `StatefulSet` and `Deployment`:
- __Provides ordering and uniqueness for `POD`s__ (not interchangeable anymore!)
- Works well with `storage` volumes (e.g. persisting data in the cluster)
- A failed `POD` is replaced by a new one with the same identity

## Usage

- __POD persistence across (re)scheduling__
- When deployment order is needed
- When we want to store data
- Ordered, automated rolling updates

## Limitations

- `Storage` must be provisioned by admin (or by `StorageClass` dynamically)
- __Deleting or scaling down a `StatefulSet` will not delete its `Storage`__ (data safety is preferred over automatic purging)
- __Headless `Service`__ is used for network identity of `POD`s __and we have to create it__
- `StatefulSets` provide __no guarantees about `POD` termination when the `StatefulSet` itself is deleted__. Instead, __we should scale it down to `0` before removal__

## Components

Let's see `.yaml` definitions necessary for `StatefulSet`:

In [None]:
---
# Headless Service definition
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
# StatefulSet definition
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  # Has to match created service
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi

- A Headless Service, named `nginx`, is used to control the network domain.
- The `StatefulSet`, named `web`, has a Spec that indicates that 3 replicas of the `nginx` container will be launched in unique Pods.
- The `volumeClaimTemplates` will provide stable storage using `PersistentVolumes` provisioned by a `PersistentVolume` Provisioner.

## PodIdentity

> __Identity sticks to `POD` regardless of where it is (re)scheduled__

It is defined by stable network, stable storage and an identity (ordinal value).

> Ordinal will be `0... N-1` where `N` is the number of requested replicas

This means that each replica __will be a separate "stateful" identity__ 

### Storage

> One `PersistentVolume` is created for each entry in `volumeClaimTemplates`, for every replica

In our case __one `PersistentVolume` of `1Gi` of `my-storage-class` PER `replica`__.

Due to that, if `POD` is rescheduled, __the same `Volume` will be mounted__ (same for `Endpoints` and `Service` updates with new identity).

## Deployment Guarantees

So, how does it work?

- For a `StatefulSet` with `N` replicas, when Pods are being deployed, they are created sequentially, in order from `{0..N-1}`.
- When Pods are being deleted, they are terminated in reverse order, from `{N-1..0}`.
- Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
- Before a Pod is terminated, all of its successors must be completely shutdown.

In our case:
- `web-0`, `web-1`, `web-2` will be deployed
- __Each `POD` with a prior `index` has to be `Running and Ready` before the next one is deployed__
- The order mentioned above __will always be kept__ (even during failures; say, if `web-1` is deployed and `web-0` fails, `web-2` will not be deployed until `web-0` is `Running and Ready` again)

## `.spec.podManagementPolicy`

> Allows us to control __how `StatefulSets` `POD`s are launched/terminated__

- `OrderedReady` - the default one, described above
- `Parallel` - all `POD`s are launched or terminated in parallel, without waiting for one another
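As a sketch, switching the earlier `web` `StatefulSet` to parallel management only requires one extra field; everything else below is copied from that example (storage is left out for brevity).

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  podManagementPolicy: Parallel   # default is OrderedReady
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
```

This setting affects scaling operations only; rolling updates are not affected by it.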

# Challenges

> ### Approach information below AFTER all of `Kubernetes` related lessons!

## Mandatory

- What is [Mount Propagation](https://kubernetes.io/docs/concepts/storage/volumes/#mount-propagation) and how does it work for volumes?
- What is [Storage Object in Use Protection](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection)?
- What is [Volume Binding Mode](https://kubernetes.io/docs/concepts/storage/storage-classes/#volume-binding-mode) for `StorageClasses`?
- What do [Allowed Topologies](https://kubernetes.io/docs/concepts/storage/storage-classes/#allowed-topologies) mean for `StorageClasses`?
- Check out [Update Strategies](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#update-strategies) for `StatefulSet`s

## Additional

- What is the `DefaultStorageClass` admission plugin? How does it change behavior for an __unspecified `StorageClass`__ in case of `PV` or `PVC`?
- Check out `Volume Snapshots` feature of `PersistentVolume`s [here](https://kubernetes.io/docs/concepts/storage/volume-snapshots/)
- How to create __non-empty `PersistentVolume`s?__ One can find answer in [Volume populators](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#volume-populators-and-data-sources) section.
- Check how to monitor health of your `Volume`s [here](https://kubernetes.io/docs/concepts/storage/volume-health-monitoring/)