# Kubernetes Basics

<p align=center><a href=https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/><img src=images/k8s.png width=500></a></p>

> <font size=+1>Kubernetes is an open source platform for managing __containerized__ workloads __focused on declarative configuration and automation__</font>

When you have to run multiple containers, you can use a long `docker run` command or, even better, Docker Compose. However, when you have to deploy an application that needs tens or hundreds of containers, making sure that every single container is healthy and working as expected is a difficult task. This is where Kubernetes (k8s) is going to help us.

Kubernetes is going to orchestrate containers. If you need to spin up a lot of containers in a particular configuration, Kubernetes can do that. If a container goes down for any reason, Kubernetes can start a new one to replace it. 


Kubernetes has many benefits and features and allows us to:
- Deploy many containers across a Kubernetes cluster
    - We can choose which containers should be deployed and how many of them.
    - We can specify how many replicas of each container to run for fault tolerance.
- __Self-healing__: If a container is down (not responding to health checks) Kubernetes will spin up a new one
- __Auto-scaling__: If the running containers are not enough to handle the load, Kubernetes can deploy more containers within the cluster. Conversely, if more containers are running than needed, Kubernetes can remove some of them to keep the balance.
- __Discoverability__: Each container can be accessed via its name (or IP) and port
- __Load Balancing__: If the traffic to a single container is too high, Kubernetes can distribute that load across several containers so that none of them is overloaded.
- Storage orchestration - allows us to mount storage (per-node, shared and others) to save/read data
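
To make the auto-scaling idea more concrete, here is a minimal sketch of the replica formula that the Kubernetes documentation gives for the Horizontal Pod Autoscaler. The function name and the example numbers are assumptions chosen for illustration, and the real autoscaler also applies tolerances and minimum/maximum bounds that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Scale the replica count by the ratio of observed load to target load."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 replicas at 90% average CPU with a 60% target: more replicas are needed
print(desired_replicas(4, 90, 60))
# 4 replicas at 30% average CPU with a 60% target: some replicas can be removed
print(desired_replicas(4, 30, 60))
```

The same ratio drives both directions: a value above 1 scales out, a value below 1 scales in.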

In this notebook you will see:
- [Kubernetes components](#kubernetes-components) that allow Kubernetes to orchestrate the containers
- [Kubernetes Objects](#kubernetes-objects) that will allow you to deploy the resources
- [How to install Kubernetes](#installing-kubernetes) and `kubectl` which will help you create requests to the Kubernetes API

# Kubernetes Components


> Kubernetes has __a lot__ of concepts one should be familiar with in order to work with the platform efficiently

When we __deploy Kubernetes__ we obtain a cluster with the components shown in the diagram below:

![](./images/components-of-kubernetes.svg)

## High level components overview

- __`Node`s__ - worker machines (__either physical or virtual__) which host a containerized application
- __`Pod`s__ - component of the application, __smallest deployable unit of work in `k8s`__. They consist of a group of one or more containers to which Kubernetes is going to have access, so it will be able to orchestrate them
- __Control Plane__ - contains the Kubernetes orchestrators that manage `Node`s and `Pod`s

> For the purpose of learning, at the beginning we will just have a single `Node` on the same physical machine as we are working on

> In a production environment, the __`Control Plane`__ can span multiple machines for increased fault tolerance

> In a production environment, we usually have multiple `Node`s (up to thousands)

## Control Plane Components

### kube-apiserver

> __Exposes `Kubernetes` API using which we can communicate with the cluster__

Communication with the cluster can be done via:
- `http` requests as `kube-apiserver` provides REST server (__read more about it [here](https://kubernetes.io/docs/concepts/overview/kubernetes-api/)__)
- __through command line__:
    - `kubectl` - control kubernetes cluster and workload
    - `kubeadm` - bootstraps kubernetes clusters (initializing it from config, tearing down etc.)
      
> __We should almost exclusively use `command line` tools to interact with `kube-apiserver`!__

Reasons are:
- increased readability
- easier automation
- `cli` tools handle different internal API version of `k8s` for us
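
To see why the command line tools are more convenient, it helps to look at what a raw REST request has to spell out. The sketch below builds the URL path of a namespaced resource collection. The helper name `collection_path` is hypothetical, and the sketch only covers namespaced resources.

```python
def collection_path(api_version: str, namespace: str, plural: str) -> str:
    """Build the REST path for a namespaced resource collection.

    The core group ("v1") lives under /api, while named groups
    such as "apps/v1" live under /apis.
    """
    prefix = "/apis" if "/" in api_version else "/api"
    return f"{prefix}/{api_version}/namespaces/{namespace}/{plural}"


print(collection_path("v1", "kube-system", "pods"))
print(collection_path("apps/v1", "default", "deployments"))
```

With `kubectl` we only name the resource type and the namespace; the tool works out the API group, version and path for us.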

### etcd

> __`etcd` stores Kubernetes Objects which define the whole cluster__

Features:
- Data is stored locally on control plane node(s)
- __If all of our control plane nodes go down, we lose the cluster state__ (and we have to recreate the whole infrastructure anew)

> `etcd` data should be backed up to an external storage, check [here](https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#backing-up-an-etcd-cluster) for more information

### kube-scheduler

> __Watches newly created `pod`s and assigns them to a `node`__

How is the node chosen?
- Resource requirements for our `pod`
- Applied policies/constraints
- Data locality (__move the code towards the data__, as `spark` does)

If you are curious about scheduling pods in more detail check out [this part of documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/)
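
The node selection can be pictured as two steps, filtering and scoring. The following is a toy sketch under strong assumptions: the `Node` class, the resource fields and the scoring rule ("most free CPU left") are invented for illustration, and the real scheduler combines many more filters and scoring plugins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_node(nodes: List[Node], cpu_request: int, memory_request: int) -> Optional[str]:
    # Filtering: keep only the nodes with enough free resources for the pod
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= cpu_request and n.free_memory_mib >= memory_request
    ]
    if not feasible:
        return None  # no node fits, so the pod would stay unscheduled
    # Scoring: prefer the node that keeps the most free CPU after placement (toy rule)
    best = max(feasible, key=lambda n: n.free_cpu_millicores - cpu_request)
    return best.name


nodes = [Node("a", 500, 1024), Node("b", 2000, 4096), Node("c", 4000, 256)]
print(pick_node(nodes, cpu_request=1000, memory_request=512))
```

In this example only node `b` has both enough CPU and enough memory, so it is the one that gets picked.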

### kube-controller-manager

> Runs controller processes

What is a `controller process`?

> Loop which watches the cluster (through `apiserver`) and __attempts to move the current state to the desired one__

Logically, each controller is a separate process, but to reduce complexity they are all compiled into a single binary and run in a single process. One example is the `node controller`, which notices and responds when a node goes down.
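
A single pass of such a control loop can be sketched as follows. This is a simplified illustration, not the actual controller code: the function name and the action strings are assumptions, and a real controller would send requests to the `apiserver` instead of returning text.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[str]:
    """Return the actions one pass of a toy replica controller would take."""
    actions: List[str] = []
    missing = desired_replicas - len(running_pods)
    if missing > 0:
        # Too few pods: request the missing ones
        actions += [f"create pod #{i + 1}" for i in range(missing)]
    elif missing < 0:
        # Too many pods: remove the surplus, starting from the end of the list
        actions += [f"delete {name}" for name in running_pods[desired_replicas:]]
    return actions


print(reconcile(3, ["web-1"]))
print(reconcile(1, ["web-1", "web-2", "web-3"]))
print(reconcile(2, ["web-1", "web-2"]))
```

When the current state already matches the desired one, the pass produces no actions and the loop simply keeps watching.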

### cloud-controller-manager

> Similar to `kube-controller-manager` __but watches the state of cloud resources__

For example, the `node controller` queries our cloud provider to check health of nodes. 

## Node Components

Each node consists of the following:

### kubelet

> __Ensures appropriate containers are run within `pod` and are healthy__, given a `PodSpec` (a specification we will see later) 

### kube-proxy

> Governs network rules on nodes. __Enables communication to/from `pod`s within and outside the cluster__

### Container runtime

> The container runtime is the software that is responsible for running containers

Some possibilities include:
- [`Docker`](https://docs.docker.com/engine/) (which we already know)
- [`containerd`](https://containerd.io/docs/) - simple container runtime (alternative to `Docker`)
- Any other project which implements the [Kubernetes CRI specification](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-node/container-runtime-interface.md)

# Kubernetes Objects


> <font size=+1>Kubernetes objects are persistent entities in `k8s` system which represent the DESIRED STATE of the cluster</font>

Basically, you will create your Kubernetes resources from these Kubernetes objects. We are going to see how to create these objects using declarative commands, but you can also use imperative commands to manage them (take a look at this [link](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/imperative-command/) to see how).

Using objects we can describe:
- What applications to use and where (which node) to run these applications on.
- How applications behave (e.g. when to restart, delete, upgrade)
- Resources available to applications

In order to create them __we have to send requests to the `kube-apiserver` API__!

As mentioned previously, instead of directly sending REST requests we can (and will) use `kubectl` for that. But before _requesting_ the creation of resources, we need to describe the object we want to create.


## Describing Kubernetes objects

Kubernetes is a "low level" platform, on top of which other projects reside; consider it "_a programming language for deployment_". Kubernetes is going to use __declarative programming__, meaning we are going to define the end result we want, rather than specifying the steps we want it to take.

> <font size=+1> Kubernetes uses declarative programming to describe state of the whole system </font>

<br>

<details>
    <summary> <font size=+1>Click here to see the difference between Imperative and Declarative</font> </summary>

<br>

> Using imperative programming (or configuration) __we define each necessary step to get to the result__

Some examples could be:
- Create variable `x` and assign value `1` to it
- Scrape data from the website and get appropriate `<image>` tag
- Pass image through a function to produce a probability that it contains a logo

> Using declarative programming (or configuration) __we describe the desired state of system without describing the steps__

Respectively for the examples above:
- I want variable `x` with value `1` assigned to it
- I want the `<image>` out of this website
- I want a probability of the image containing a logo

</details>

As the API is RESTful, we can send a `JSON` description of the object in the request body.

__This way is supported, although hard to maintain__

> __What we can do instead is use more readable `YAML` to specify our objects and let `kubectl` transform the representation to `JSON`__

Let's see an example object configuration and describe the fields:

```
# configuration file stored in deployment_example.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  # Unique key of the Deployment instance
  name: deployment-example
spec:
  # 3 Pods should exist at all times.
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        # Apply this label to pods and default
        # the Deployment label selector to this value
        app: nginx
    spec:
      containers:
      - name: nginx
        # Run this image
        image: nginx:1.14
```

- `apiVersion` - specifies `k8s` API version. Depending on the application you want to deploy, you might need to use a different API version. 
- `kind` - Kind of object we want to create (we will talk about it later during __Workloads__)
- `metadata` - Uniquely defines the object, usually using `name` field

> <font size=+1>[Take a look at the Kubernetes documentation](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#-strong-api-overview-strong-) for the API version you need based on the resource you are going to use</font>

For example, in the declarative code above, we have a Deployment resource. In the documentation, you can look at the Deployment API to find out that you would need the `apps/v1` API version

<p align=center><img src=images/Deployment.png></p>
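
To make the `YAML` to `JSON` step concrete, here is a minimal sketch that writes the Deployment from `deployment_example.yaml` as a Python dictionary and serialises it with the standard `json` module. It only illustrates that both representations carry the same fields; it is not how `kubectl` is implemented.

```python
import json

# Python dictionary mirroring deployment_example.yaml
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "deployment-example"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "nginx"}},
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {"containers": [{"name": "nginx", "image": "nginx:1.14"}]},
        },
    },
}

# JSON text equivalent to the YAML file
body = json.dumps(deployment, indent=2)
print(body)
```

Comparing the printed `JSON` with the `YAML` file shows why the latter is easier to read and maintain: no braces, no quotes, and room for comments.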

## spec

> __`spec` describes the DESIRED characteristics of our resource__

This is provided by the user, and, in this case, tells the `kubernetes` platform that:
- The `Deployment` (__described later__) manages every `Pod` which has the label `app` with the value `nginx`
- Three replica `Pod`s should be run on the cluster.
- Each one should be created from a `template` which:
    - Gives `app` label with a value `nginx`
    - `spec`ifies it is created from a single container which will be named `nginx` (for discoverability)
    - Uses `image` `nginx:1.14`
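
The way `selector.matchLabels` picks the `Pod`s that belong to the Deployment can be sketched as a simple dictionary check. The function name is hypothetical, and only equality-based matching is covered here.

```python
from typing import Dict


def selector_matches(match_labels: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """True when every key/value pair of the selector is present on the pod."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


selector = {"app": "nginx"}
print(selector_matches(selector, {"app": "nginx", "tier": "web"}))
print(selector_matches(selector, {"app": "redis"}))
```

A `Pod` may carry extra labels and still match; what matters is that all the labels required by the selector are there with the right values.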

In the [API documentation](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#-strong-api-overview-strong-) you can see what `spec`s you can specify for a resource. 

<p align=center><img src=images/Deployment_Spec.png width=600></p>

In the same documentation you can also look up the fields that Kubernetes reports in the `status` of the resource

## status

> __`status` describes the CURRENT state of the resource__

__While we provide `spec`, `status` is updated internally by `kubernetes`__

Providing `status` has the following benefits:
- Allows users to query the current status of the deployment
- __Allows controllers to act based on the current `status`__ and drive the objects toward the desired state
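
As a small illustration of how `spec` and `status` relate, the sketch below compares the number of replicas we asked for with the `readyReplicas` field reported in a Deployment's `status`. The helper name and the sample objects are assumptions; in practice the object would come from the API, for example as the output of `kubectl get -o json`.

```python
def is_ready(deployment: dict) -> bool:
    """Compare the desired replica count with the ready replicas reported in status."""
    wanted = deployment["spec"]["replicas"]
    # The field is absent while no replica is ready, so default to 0
    ready = deployment.get("status", {}).get("readyReplicas", 0)
    return ready >= wanted


rolling_out = {"spec": {"replicas": 3}, "status": {"readyReplicas": 1}}
settled = {"spec": {"replicas": 3}, "status": {"readyReplicas": 3}}
print(is_ready(rolling_out), is_ready(settled))
```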

### Desired vs Actual state

A little digression:

> __Do not worry if the actual state never exactly matches the desired one, because a constantly changing cluster might never reach a stable state!__

As long as `controllers` are continuously running everything is fine.

Discrepancy between the current and the desired state might happen for a plethora of reasons, e.g.:

- Frequent updates of `config` files (`kubernetes` had no time to act on our changes)
- We are handling hundreds of nodes and our change hasn't been propagated yet
- Random node failures (`kubernetes` will reinstantiate the necessary `Pod`s)

## Creating and managing objects

Now that we have described our object, we should tell `kube-apiserver` that we want it created.

There are `3` techniques to do that:

- __Imperative commands__ - operates on existing objects __via commands__ (discouraged!)
- __Imperative object configuration__ - operate on individual files
- __Declarative object configuration__ - works on directories of files

> __Approaches should not be mixed together as it may result in `undefined behavior`!__

As mentioned above, we are going to focus on the `declarative` approach. You can have a look at this [link](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/imperative-command/) to see how to use imperative commands

## Declarative object configuration

> Specify configuration files, __but do not specify the operation which should be performed__

`create`, `update`, `delete` operations are __automatically picked up__ by Kubernetes engine.

Pros of this approach:
- __We can easily work on directories of `config` files__ - all of them will be `applied` appropriately
- __Different configs allow us to handle more objects__
- __Only two commands needed__
- We don't have to replace the whole configuration; we only need to patch the required parts
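
The idea that the operation is worked out for us can be sketched by comparing the objects described in the config files with the ones that already exist in the cluster. This is a toy model with hypothetical names, where every object is reduced to a name and a dictionary; deletions are left out of the sketch.

```python
from typing import Dict, List, Tuple


def plan(live: Dict[str, dict], desired: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Decide which operation each desired object needs, based on the live state."""
    operations = []
    for name, spec in desired.items():
        if name not in live:
            operations.append(("create", name))
        elif live[name] != spec:
            operations.append(("update", name))
        # Objects that already match need no operation at all
    return operations


live = {"web": {"replicas": 1}, "api": {"replicas": 2}}
desired = {"web": {"replicas": 1}, "api": {"replicas": 3}, "worker": {"replicas": 1}}
print(plan(live, desired))
```

We never said "create" or "update" ourselves; the operations follow from the difference between the two states.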

To check which operations will be applied we can run:

```
kubectl diff -f ./configs

kubectl diff -R -f ./configs  # Recursively find all `.yaml` files
```

Once we verify that the parsed configs bring the cluster into the desired state, we can run:

```
kubectl apply -R -f ./configs # R for recursive
```
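
If you want to automate the "diff first, then apply" routine, a minimal sketch could look like the one below. It relies on the documented exit codes of `kubectl diff` (0 for no differences, 1 for differences found, greater than 1 for an error). The function names are hypothetical, and `has_changes` needs `kubectl` and a reachable cluster to run.

```python
import subprocess
from typing import List


def diff_argv(path: str, recursive: bool = True) -> List[str]:
    """Build the argument list for `kubectl diff` on a file or directory."""
    argv = ["kubectl", "diff"]
    if recursive:
        argv.append("-R")
    return argv + ["-f", path]


def describe_exit_code(code: int) -> str:
    """Translate the exit code of `kubectl diff` into a message."""
    if code == 0:
        return "no differences"
    if code == 1:
        return "differences found"
    return "kubectl or diff failed"


def has_changes(path: str) -> bool:
    """Run `kubectl diff` and report whether applying would change anything."""
    result = subprocess.run(diff_argv(path), capture_output=True, text=True)
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return result.returncode == 1


print(" ".join(diff_argv("./configs")))
```

A script could call `has_changes("./configs")` and only run `kubectl apply` when it returns `True`.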

The downsides of using declarative commands are:
- They might be harder to debug (especially when starting out with `k8s`)
- The complexity of the operations is hidden under the hood

This was a lot to process! Maybe a few examples can make everything a little bit more clear. Let's install Kubernetes to see it in action!

# Installing Kubernetes

You can run commands to deploy Kubernetes resources on the cloud using `kubectl`. If, on the other hand, you want to deploy resources on your local machine and experiment, you can use `minikube`. We will install both applications, so you can learn how to deploy your Kubernetes resources without worrying about launching cloud resources. We will also see some examples of how to deploy these Kubernetes objects on the cloud, so you start getting familiar with an industry environment.

## `kubectl`

`kubectl` allows you to deploy your applications by setting up Kubernetes resources from your command line interface. 

<details>
    <summary> For Linux Users </summary>

You can simply run the following commands in your terminal:
```
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
```
If you run into any error during the installation, you can refer to the [Kubernetes website](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)
</details>

<details>
    <summary> For Mac Users </summary>

You can simply use `homebrew` to install it:
```
brew install kubectl 
```
If you run into any error during the installation, you can refer to the [Kubernetes website](https://kubernetes.io/docs/tasks/tools/install-kubectl-macos/)
</details>

<details>
    <summary> For Windows Users </summary>

Download release `v1.22.0` by running the following command:
```
curl -LO "https://dl.k8s.io/release/v1.22.0/bin/windows/amd64/kubectl.exe"
```
Add the directory containing `kubectl.exe` to your PATH.
Important: if you have installed Docker Desktop, it comes with its own `kubectl` version by default. Make sure the one you downloaded is above the Docker entry in your PATH. For example, in my case it is installed in `C:\Users\yingy`, so that entry is above the Docker location:
<p align=center><img src=images/kubectl_path.png width=300></p>
</details>

Once installed, regardless of the OS, run the following command to check that everything is installed correctly:

`kubectl version --client`

We will see more examples of `kubectl` during the next lessons. For now, it is important to know that the syntax looks like this:
```
kubectl [command] [TYPE] [NAME] [flags]
```
- `command`: specifies operation one wants to perform (e.g. `get`, `describe`)
- `TYPE`: specifies resource type; __case-insensitive, works with abbreviations and plurals__

For example: `kubectl get namespace` does the same as `kubectl get ns`. Also `kubectl get pods` does the same as `kubectl get po`

- `NAME`: specifies name of the resource; __can be omitted, if so, will return everything__ (e.g. `kubectl get pods` as seen above)
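
The `kubectl [command] [TYPE] [NAME] [flags]` pattern can also be assembled programmatically, which is handy when calling `kubectl` from a notebook or a script. The helper below is a hypothetical sketch that only builds the argument list and does not run anything.

```python
from typing import List, Optional, Sequence


def kubectl_argv(command: str, resource_type: str,
                 name: Optional[str] = None, flags: Sequence[str] = ()) -> List[str]:
    """Assemble `kubectl [command] [TYPE] [NAME] [flags]` as an argument list."""
    argv = ["kubectl", command, resource_type]
    if name is not None:
        argv.append(name)  # NAME is optional: without it, all resources of TYPE are listed
    return argv + list(flags)


print(" ".join(kubectl_argv("get", "pods")))
print(" ".join(kubectl_argv("describe", "pod", "nginx", ["-n", "kube-system"])))
```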

The most useful commands are described below; the full list can be seen [here](https://kubernetes.io/docs/reference/kubectl/overview/#in-cluster-authentication-and-namespace-overrides)

When working with imperative commands, you will usually come across two main commands:

- `kubectl annotate` - the same annotations can be described in the `.yaml` config of the specific `type`
- `kubectl create` - should not be used; `kubectl apply` and declarative config files are preferred

However, we will focus on the declarative commands that are shown below:

## Core commands


### __apply__

> __Apply configuration change to a `resource`__

Using this command, after starting `cluster` (e.g. with `minikube` for local development or `kubeadm` for real deployment) __we can configure the cluster according to our liking__.

> This is the preferred way of management

Important flags we will often use:
- `-R` - __recursively read `config` files within a directory__
- `-f` - filename (__or directory, usually__) to read `config` from

### __diff__

`diff` checks what changes will be applied to `cluster` desired state. One should always run this command before `apply`ing configuration!

```
kubectl diff -R -f FILENAME [flags]
```


### __get__

`get` lists one or more resources. It allows us to get basic information about `k8s` objects.

```bash
kubectl get (-f FILENAME | TYPE [NAME | /NAME | -l label]) [--watch] [--sort-by=FIELD] [[-o | --output]=OUTPUT_FORMAT] [flags]
```


### __describe__

`describe` displays the state of one or more resources. It allows us to easily get an overview of the resource(s)

```bash
kubectl describe (-f FILENAME | TYPE [NAME_PREFIX | /NAME | -l label]) [flags]
```

As previously:
- One can specify `file`(s) which describe the desired state __in order to get the current one__
- We can specify `TYPE`, `NAME` and other identifying resources for this command


## Helpful debugging Commands

### __attach__

`attach` is used to read a container's `stdout` and/or interact with its `stdin`

```bash
kubectl attach POD -c CONTAINER [-i] [-t] [flags]
```

Works just like `docker attach`, __useful for debugging__


### __cp__

`cp` copies files to/from containers

```bash
kubectl cp <file-spec-src> <file-spec-dest> [options]
```

It is useful for debugging output that containers write to disk


### __exec__

This command `exec`utes a command against `container` within a pod

```bash
kubectl exec POD [-c CONTAINER] [-i] [-t] [flags] [-- COMMAND [args...]]
```

Useful for debugging container state

## `minikube`

[`minikube`](https://minikube.sigs.k8s.io/docs/start/) makes it easy to learn Kubernetes on your local machine. If you already have Docker installed on your computer, you should be able to install it.

<details>
    <summary> For Linux Users </summary>

You can simply run the following commands in your terminal:
```
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
```
and that would do it!
</details>

 <details>
    <summary> For Mac Users </summary>

You can simply run the following commands in your terminal:
```
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64
sudo install minikube-darwin-amd64 /usr/local/bin/minikube
```
If that doesn't work, you will need to use `homebrew`:
```
brew install minikube
```
_If which minikube fails after installation via brew, you may have to remove the old minikube links and link the newly installed binary_
```
brew unlink minikube
brew link minikube
```
</details>
<details>
    <summary> For Windows Users </summary>

You can download the .exe file here: https://storage.googleapis.com/minikube/releases/latest/minikube-installer.exe
After installing it, run PowerShell as Administrator to add the binary to your `PATH`:
```
$oldPath = [Environment]::GetEnvironmentVariable('Path', [EnvironmentVariableTarget]::Machine)
if ($oldPath.Split(';') -inotcontains 'C:\minikube'){ `
  [Environment]::SetEnvironmentVariable('Path', $('{0};C:\minikube' -f $oldPath), [EnvironmentVariableTarget]::Machine) `
}
```
</details>

Make sure the installation succeeded by running `minikube --help`. You might need to close and reopen the terminal that you used for the installation.

Whichever OS you use, you can now run `minikube start`. This will create a Kubernetes `config` file that tells `kubectl` which cluster to deploy the Kubernetes resources to. __You can change this config file in case you want to deploy the resources to a Cloud Infrastructure__

We can check the config context using the following command:

```
!kubectl config get-contexts

CURRENT   NAME       CLUSTER    AUTHINFO   NAMESPACE
*         minikube   minikube   minikube   default
```


You can see that, indeed, we are using the cluster provided by `minikube`. In the configuration file (usually located at `~/.kube/config`), you can find the same information and check that it is consistent with the output above:

<p align=center><img src=images/config_minikube.png width=500></p>
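
If you prefer to inspect the configuration programmatically, the sketch below looks up the active context in a kubeconfig that has been converted to `JSON`. The `SAMPLE` string is a trimmed, hypothetical stand-in for what `kubectl config view -o json` prints, so the exact fields on your machine may differ.

```python
import json

# Trimmed, hypothetical stand-in for the output of `kubectl config view -o json`
SAMPLE = """
{
  "current-context": "minikube",
  "contexts": [
    {"name": "minikube",
     "context": {"cluster": "minikube", "user": "minikube", "namespace": "default"}}
  ]
}
"""


def active_context(kubeconfig: dict) -> dict:
    """Return the name and settings of the context marked as current."""
    current = kubeconfig["current-context"]
    for entry in kubeconfig["contexts"]:
        if entry["name"] == current:
            return {"name": current, **entry["context"]}
    raise KeyError(f"context {current!r} not found")


print(active_context(json.loads(SAMPLE)))
```

The result carries the same columns as `kubectl config get-contexts`: the cluster, the user (shown as `AUTHINFO`) and the namespace.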

One thing that you can observe from the information above is that the namespace is `default`. Namespaces are virtual clusters within a physical cluster. Namespaces should be used for most of our tasks and should group semantically similar `k8s` objects.

> <font size=+1>Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces</font>
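
The scoping rule can be illustrated with a toy object store that uses the kind, the namespace and the name together as the key. The class below is a made-up sketch and not part of any Kubernetes client library.

```python
class ObjectStore:
    """Toy store in which a name must be unique per kind within a namespace."""

    def __init__(self):
        self._objects = {}

    def create(self, kind: str, namespace: str, name: str, spec: dict) -> None:
        key = (kind, namespace, name)
        if key in self._objects:
            raise ValueError(f"{kind} {namespace}/{name} already exists")
        self._objects[key] = spec


store = ObjectStore()
store.create("Pod", "default", "web", {})
store.create("Pod", "kube-system", "web", {})  # same name, different namespace: allowed
try:
    store.create("Pod", "default", "web", {})  # same name, same namespace: rejected
except ValueError as error:
    print(error)
```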

By default, a freshly started cluster, such as the one created by `minikube start`, contains four namespaces. You can take a look at them by running the following command:

```
!kubectl get namespace

NAME              STATUS   AGE
default           Active   143m
kube-node-lease   Active   143m
kube-public       Active   143m
kube-system       Active   143m
```


You can learn more about namespaces in this [link](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/). For now, what you need to know is that each namespace contains different resources. You can take a look at, for example, the pods inside the `kube-system` namespace

```
!kubectl -n kube-system get pods -o wide

NAME                               READY   STATUS    RESTARTS       AGE    IP             NODE       NOMINATED NODE   READINESS GATES
coredns-78fcd69978-8j8hz           1/1     Running   0              170m   172.17.0.2     minikube   <none>           <none>
etcd-minikube                      1/1     Running   0              170m   192.168.49.2   minikube   <none>           <none>
kube-apiserver-minikube            1/1     Running   0              170m   192.168.49.2   minikube   <none>           <none>
kube-controller-manager-minikube   1/1     Running   0              170m   192.168.49.2   minikube   <none>           <none>
kube-proxy-w2pt7                   1/1     Running   0              170m   192.168.49.2   minikube   <none>           <none>
kube-scheduler-minikube            1/1     Running   0              170m   192.168.49.2   minikube   <none>           <none>
storage-provisioner                1/1     Running   1 (169m ago)   170m   192.168.49.2   minikube   <none>
```

One thing we mentioned at the beginning of the notebook is that, if a Kubernetes resource goes down, Kubernetes will try its best to spin it up again. Let's try it by deleting one pod

```
!kubectl -n kube-system delete pod kube-proxy-w2pt7

pod "kube-proxy-w2pt7" deleted
```


If we take a look at the pods again:

```
!kubectl -n kube-system get pods

NAME                               READY   STATUS    RESTARTS   AGE
coredns-78fcd69978-8j8hz           1/1     Running   0          175m
etcd-minikube                      1/1     Running   0          175m
kube-apiserver-minikube            1/1     Running   0          175m
kube-controller-manager-minikube   1/1     Running   0          175m
kube-proxy-jb22b                   1/1     Running   0          77s
kube-scheduler-minikube            1/1     Running   0          175m
storage-provisioner                1/1     Running   0          13s
```


Awesome! It looks like it's up and running again. Finally let's observe the resources we have using a nice User Interface. Minikube has a dashboard that we can initialize by running:
```
minikube dashboard
```
And this is what we will see (after selecting `all namespaces` in the dropdown menu)

<p><img src=images/minikube_dashboard.png></p>

Take a moment to look at all the resources you can see on the left hand side (Workloads, Service, Config and Storage...). During the next lessons you will learn how to create (the most important of) them.

For now, what you learnt in this notebook is a good introduction to how Kubernetes works and how you can deploy your Kubernetes resources from the command line interface.

## Connect to an AWS EKS cluster

You can create an Elastic Kubernetes Service (EKS) cluster in Amazon to deploy your Kubernetes resources to a Cloud Infrastructure. This way you will not only run your own application, you will also be able to run AWS services in an orchestrated way using clusters. If you don't want to deploy your Kubernetes resources to the cloud, you can skip the rest of this notebook and start working locally.

<font size=+1>__The following steps require you to launch an instance that is not free! If you want to continue, be advised that you will need to deploy an instance that costs, at least, 0.0104 USD/hour__</font>

The first step will be creating an Amazon VPC that meets the EKS requirements. In your console, you can run the following command:

```
aws cloudformation create-stack \
  --region us-west-2 \
  --stack-name <your_stack_name> \
  --template-url https://amazon-eks.s3.us-west-2.amazonaws.com/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml
```

In the example above, you are creating the VPC in us-west-2 region (Oregon), but you can choose any EKS supported region. Take a look at this [link](https://docs.aws.amazon.com/general/latest/gr/eks.html) to learn more about it. In this example, we are using `my-eks-vpc-stack` as `<your_stack_name>`

Now we need to create an IAM role. You can do it from the command line or from the AWS console; let's do it from our local machine. In this [link](https://aicore-files.s3.amazonaws.com/Cloud-DevOps/eks_policy.json), you will find a JSON file called `eks_policy.json`. This policy allows EKS to make calls to other AWS services. Using this policy we can create the role:
```
aws iam create-role --role-name <your_role_name> --assume-role-policy-document file://"eks_policy.json"
```
In our example we are using `AiCoreEKSClusterRole` as `<your_role_name>`. Now, attach this policy to the role:
```
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy --role-name <your_role_name>
```
Make sure the role name is consistent!


Go to your AWS portal and search for EKS

<p align=center><img src=images/EKS.png width=600></p>

And click on add cluster. Make sure you are in the right region. In the example we established the policy in us-west-2, which corresponds to Oregon

<p align=center><img src=images/EKS_2.png width=600></p>

In the next window, give your cluster a name and select the IAM role you just created:

<p align=center><img src=images/EKS_3.png width=600></p>

Press Next, and in the specify networking section, select the VPC you created before:
<p align=center><img src=images/EKS_4.png width=600></p>

Leave the remaining values at their defaults, and click Create.

Once created you can use the `aws` command in your command line, and type:
```
aws eks --region <your-region> update-kubeconfig --name <name_of_your_cluster>
```
This will automatically update your kubeconfig file (`~/.kube/config`). After running it, this is what I obtain:

```
!kubectl config get-contexts

CURRENT   NAME                                                        CLUSTER                                                     AUTHINFO                                                    NAMESPACE
*         arn:aws:eks:us-west-2:168573745887:cluster/aicore-cluster   arn:aws:eks:us-west-2:168573745887:cluster/aicore-cluster   arn:aws:eks:us-west-2:168573745887:cluster/aicore-cluster
          minikube                                                    minikube                                                    minikube                                                    default
```


We still need to grant permission to Kubernetes service accounts to access the resources. Thus, we need to create an IAM OpenID Connect (OIDC) provider. This will be used so that Kubernetes service accounts can access AWS resources.

To do so, go to your cluster, click `Configuration`, and then `Details`

<p align=center><img src=images/EKS_5.png width=700></p>

Copy the OpenID Connect provider URL, and go to the IAM console: [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/)

In the Identity Providers section, click on Add Provider

<p align=center><img src=images/EKS_6.png width=700></p>

- For Provider Type, choose OpenID Connect.

- For Provider URL, paste the OIDC provider URL for your cluster, and then choose Get thumbprint.

- For Audience, enter `sts.amazonaws.com` and choose Add provider.


If you want to run AWS services, you need to create a node. You mainly have two alternatives, Fargate and managed nodes:
1. Fargate: In case you want to run applications on AWS Fargate
2. Managed nodes: In case you want to run applications on EC2 instances

In this notebook we will create a managed node, but if you want to see how to create a Fargate node, you can look at this [webpage](https://aws.amazon.com/es/fargate/)

First, you will need to create a new role for the VPC Container Network Interface (CNI) plugin for Kubernetes. This plugin assigns an IP address from your VPC to each pod. 
In this [link](https://aicore-files.s3.amazonaws.com/Cloud-DevOps/cni_policy.json), you will see a `cni_policy.json` file that looks like this:
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "<Your ARN>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "<Your Provider>:sub": "system:serviceaccount:kube-system:aws-node"
        }
      }
    }
  ]
}
```

Make sure you replace `<Your ARN>` and `<Your Provider>` with the corresponding values from the OIDC provider you created before:

<p align=center><img src=images/EKS_7.png width=700></p>
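
If you would rather generate `cni_policy.json` than edit it by hand, a minimal sketch could look like this. The function name, the account number `111122223333` and the provider id `EXAMPLE` are placeholders, and the sketch assumes that the provider ARN has the form `arn:aws:iam::<account>:oidc-provider/<provider URL without https://>`; double check the result against the values shown in your IAM console.

```python
import json


def cni_trust_policy(account_id: str, oidc_provider_url: str) -> dict:
    """Fill the cni_policy.json template for a given account and OIDC provider."""
    provider = oidc_provider_url
    if provider.startswith("https://"):
        # The policy refers to the provider without the URL scheme
        provider = provider[len("https://"):]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{provider}:sub": "system:serviceaccount:kube-system:aws-node"
                    }
                },
            }
        ],
    }


# Placeholder values: replace them with your own account number and provider URL
policy = cni_trust_policy("111122223333", "https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE")
print(json.dumps(policy, indent=2))
```

Saving the printed output as `cni_policy.json` gives a file with the same structure as the template above.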

In your console run the following command to create a new role:
```
aws iam create-role --role-name <role_name> --assume-role-policy-document file://"cni_policy.json"
```
In this example, we replaced `<role_name>` with `AiCoreEKSCNIRole`. Now, attach the CNI policy to the created role:
```
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy --role-name <role_name>
```
Next, associate the Kubernetes service account used by the VPC CNI to the IAM role you just created:
```
aws eks update-addon --region <your_region> --cluster-name <your_cluster> --addon-name vpc-cni --service-account-role-arn arn:aws:iam::<your_account_number>:role/<role_name>
```
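The role ARN passed to `--service-account-role-arn` is easy to mistype, so it can help to build it from its parts. This is only a sketch: `role_arn` is a hypothetical helper, and the commented lines show one possible way to wire it into the command above.

```shell
# Hypothetical helper: builds the IAM role ARN expected by
# --service-account-role-arn.
#   $1 = AWS account number, $2 = role name
role_arn() {
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# Example wiring (commented out because it needs AWS credentials):
# account=$(aws sts get-caller-identity --query Account --output text)
# aws eks update-addon --region <your_region> --cluster-name <your_cluster> \
#   --addon-name vpc-cni \
#   --service-account-role-arn "$(role_arn "$account" AiCoreEKSCNIRole)"
```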



You also need to create a node IAM role, so that the `kubelet` running on each node can make calls to AWS APIs on your behalf. You will use the `node_policy.json` file in this [link](https://aicore-files.s3.amazonaws.com/Cloud-DevOps/node_policy.json):
```
aws iam create-role --role-name <node_role_name> --assume-role-policy-document file://"node_policy.json"
```
We used `AiCoreEKSNodeRole` for the `<node_role_name>`. And as always, attach the necessary policies to this role:
```
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy --role-name <node_role_name>

aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly --role-name <node_role_name>
```
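The `node_policy.json` trust policy used for the node role is only linked, so its exact contents are not shown in this notebook. As a rough sketch, and assuming the linked file follows the usual pattern for EKS node roles, a trust policy that lets EC2 instances assume the role could be generated as follows. The helper name `write_node_trust_policy` is hypothetical.

```shell
# Hypothetical helper: writes a trust policy that allows EC2 instances
# (the worker nodes) to assume the role.
# ASSUMPTION: the linked node_policy.json may differ from this sketch.
#   $1 = output path, e.g. node_policy.json
write_node_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}
```

The trust policy only says who may assume the role. The permissions themselves come from the managed policies attached with `aws iam attach-role-policy`.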

In your cluster, go to `Configuration` and then `Compute` to add a Node Group:
<p align=center><img src=images/EKS_8.png width=600></p>

Give a name to your node group, and select the node IAM role you created:

<p align=center><img src=images/EKS_9.png width=600></p>

The `Set compute and scaling configuration` step is where you choose the instance type, which will be a paid instance. You can use any instance type you want; in this example we went with `t3.medium`. <font size=+1>Once the example is finished, make sure to clean up everything!</font>

In case you don't have a key-pair in the region you specified, you will need one now. You can quickly create one with the following command:
```
aws ec2 create-key-pair --region <your_region> --key-name <key_name> --query 'KeyMaterial' --output text > <key_name>.pem
```
In this example, we used `aicore-key` as `<key_name>`.
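Before using the key for SSH, it can help to check that the saved file really is a PEM private key and to restrict its permissions, since `ssh` refuses private keys that other users can read. A minimal sketch, where the helper name `check_key_file` is made up for illustration:

```shell
# Hypothetical helper: verifies that a file looks like a PEM private key
# and then makes it readable by its owner only.
#   $1 = path to the .pem file
check_key_file() {
  # A PEM file starts with a "-----BEGIN ..." header line.
  if ! head -n 1 "$1" | grep -q '^-----BEGIN '; then
    echo "warning: $1 does not look like a PEM private key" >&2
    return 1
  fi
  chmod 400 "$1"
}
```

For example, `check_key_file aicore-key.pem` returns a non-zero status if the file does not start with a PEM header, which usually means the command output was saved in the wrong format.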

Coming back to the node group configuration, in `Specify networking` enable `Configure SSH access to nodes` and select the key pair you have (or have just created):

<p align=center><img src=images/EKS_10.png width=700></p>
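The same node group can also be created from the command line instead of the console. The sketch below is one possible equivalent, not the method used in this notebook: `run` and `create_node_group` are hypothetical helpers, and the scaling values (one node) are an assumption, since the console values are not stated here. With `DRY_RUN=1` the command is only printed.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN=1, otherwise runs it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper around `aws eks create-nodegroup`.
#   $1 = cluster name, $2 = node group name, $3 = node role ARN,
#   $4 = key pair name, remaining arguments = subnet IDs
create_node_group() {
  cluster="$1"; group="$2"; role="$3"; key="$4"
  shift 4
  # ASSUMPTION: a single t3.medium node; adjust the scaling values as needed.
  run aws eks create-nodegroup \
    --cluster-name "$cluster" \
    --nodegroup-name "$group" \
    --node-role "$role" \
    --subnets "$@" \
    --instance-types t3.medium \
    --scaling-config minSize=1,maxSize=1,desiredSize=1 \
    --remote-access "ec2SshKey=$key"
}
```

Setting `DRY_RUN=1` before calling `create_node_group` lets you review the full command first. Unset it to actually create the node group.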

Once the node group is created, you will be able to see the resources in your cluster, and the workloads that your cluster has deployed:

<p align=center><img src=images/EKS_10.png width=600></p>
<p align=center><img src=images/EKS_11.png width=600></p>

Inside each node, you can see the pods that have been created, as well as the events that take place related to that node:

<p align=center><img src=images/EKS_13.png width=600></p>
<p align=center><img src=images/EKS_14.png width=600></p>

With this, you can deploy your applications on EC2 instances managed by an EKS cluster. However, in these notebooks we are going to focus on learning how to use Kubernetes itself, so we __will use Minikube for that purpose__.

## Clean Up

Before wrapping up, make sure to clean up all the resources you set up! First delete the node group, and then the cluster that contains it.

<p align=center><img src=images/EKS_15.png width=600></p>

Once the node group is deleted, you can delete the whole cluster.

Also (just in case) delete the VPC you created. Go to https://console.aws.amazon.com/cloudformation and select the stack of the VPC you created to delete it.
You can do the same with the IAM roles you created: https://console.aws.amazon.com/iam/
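The EKS part of the clean-up can also be done with the AWS CLI. The order matters, because the cluster cannot be deleted while it still has node groups. The sketch below only prints the commands in the right order; `print_cleanup_plan` is a hypothetical helper, and the cluster and node group names are whatever you chose earlier.

```shell
# Hypothetical helper: prints the EKS clean-up commands in the order
# they should be run. Nothing is deleted by this function itself.
#   $1 = cluster name, $2 = node group name
print_cleanup_plan() {
  cat <<EOF
aws eks delete-nodegroup --cluster-name $1 --nodegroup-name $2
aws eks wait nodegroup-deleted --cluster-name $1 --nodegroup-name $2
aws eks delete-cluster --name $1
aws eks wait cluster-deleted --name $1
EOF
}
```

Running `print_cleanup_plan <your_cluster> <node_group_name>` lets you review the commands before executing them. The IAM roles and the VPC still need to be removed separately.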

# Challenges

> ### You should try these challenges after finishing all of the `kubernetes` related lessons!

It will make your progress substantially faster.


## Mandatory

- What are [addons](https://kubernetes.io/docs/concepts/overview/components/#addons) in the Kubernetes ecosystem? Which one is necessary?
- Check out [kustomize](https://kubernetes.io/blog/2018/05/29/introducing-kustomize-template-free-configuration-customization-for-kubernetes/) (available via `kubectl`). What are the use cases for it? 
- See additional ways to match `Pod`s to specific `Node`s [here](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/)
- Check out [`CronJob` workload resource](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/). Make sure you understand the concept and `cron` syntax included.

## Additional

- Check out [Kubernetes API Conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md) to get a deeper understanding of Kubernetes Objects and the internals of the project.
- What are the alternatives to `minikube`? What is `kind` or `k3s`? What are upsides/downsides of using them?
- Check out [Client Libraries](https://kubernetes.io/docs/reference/using-api/client-libraries/) (which allow us to communicate with `kube-apiserver` via certain programming languages). Check out [Python Client Library](https://github.com/kubernetes-client/python/). Which task could you automate using it?
- Check out the [imperative-command way of managing `k8s` objects](https://kubernetes.io/docs/concepts/overview/working-with-objects/object-management/#imperative-commands), __but don't use it!__ What downsides led us to leave it out of these lessons?
- Read more about [TTL (time to live) Controller](https://kubernetes.io/docs/concepts/workloads/controllers/ttlafterfinished/).
- Read about [canary `Deployment`s](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments)
- Read about [`Job` template expansion](https://kubernetes.io/docs/tasks/job/parallel-processing-expansion/)
- What are `DaemonSet` tolerations?