# MLOps Kubernetes and Kubeflow Hands-On Notebook

## Kubernetes Fundamentals

### Overview of Containers and Docker

#### Task: Build a simple Docker image for a Python application and run it locally.
---
**Solution:** 
* Simple FastAPI web application ---> can be found in the `./webapp/` path
* Dockerfile to build the image ---> can be found at `./Dockerfile`
* Commands to build the image and run the FastAPI application using Docker:
    ```
    $ docker build -t fastapi_sample_app .
    $ docker run -p 8080:8080 fastapi_sample_app
    ```
* Open `localhost:8080` in the browser to access the application (or use Postman to test the API).


### Intro. to Kubernetes : Architecture and Components

#### Task: Explore a running Kubernetes cluster using kubectl (get nodes, pods)
----
* `kubectl get pods`

![get pods](images/pods.png)

* `kubectl get svc`

![get svc](images/svc.png)

* `kubectl get nodes`
* `kubectl get deployments`

Other resource types can be listed the same way.
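
For scripting, the same information is available as JSON through `kubectl get pods -o json`. The sketch below is not part of the original notebook: it counts pods per phase from that output. `summarize_pods` is a hypothetical helper, and the sample data is a trimmed-down stand-in with made-up pod names so the snippet runs without a cluster.

```python
import json
from collections import Counter


def summarize_pods(kubectl_json: str) -> Counter:
    """Count pods per phase from `kubectl get pods -o json` output."""
    pod_list = json.loads(kubectl_json)
    return Counter(
        item.get("status", {}).get("phase", "Unknown")
        for item in pod_list.get("items", [])
    )


# Trimmed-down sample shaped like kubectl's JSON list (pod names are made up).
SAMPLE = json.dumps({
    "kind": "List",
    "items": [
        {"metadata": {"name": "app-dep-abc"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "app-dep-def"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "app-dep-ghi"}, "status": {"phase": "Running"}},
    ],
})

pod_phases = summarize_pods(SAMPLE)
print(dict(pod_phases))
```

Against a real cluster, the input string could come from `subprocess.run(["kubectl", "get", "pods", "-o", "json"], capture_output=True, text=True, check=True).stdout`.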

### Setting up a Kubernetes cluster (Minikube/Kind)

#### Task: Install Minikube and create a local Kubernetes cluster
---

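To install Minikube on Windows (inside WSL2), set Docker as the driver, and start the cluster:
```
   > curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
   > chmod +x ./minikube
   > sudo mv ./minikube /usr/local/bin/
   > minikube config set driver docker
   > minikube start
```
Next, install kubectl and point it at the Minikube cluster. The final alias is optional: it makes `kubectl` resolve to the version bundled with Minikube.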
```
   > curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
   > chmod +x ./kubectl
   > sudo mv ./kubectl /usr/local/bin/
   > kubectl config use-context minikube
   > alias kubectl="minikube kubectl --"
```


### Hands-on: Deploy your first Application on Kubernetes

#### Task 1: Create a simple web app and deploy it as K8s deployment.
---
**Solution:**
1. I am using WSL2 on Windows, so the local Docker CLI must be pointed at the Docker daemon running inside Minikube. Images built this way are directly visible to the cluster.
2. Run the command below in a terminal (it can be found in the Minikube docs):
```
    > eval $(minikube -p minikube docker-env)
```
3. Build the Docker image again. (Alternatively, the locally built image could be pulled from a local registry, but for simplicity I just rebuild it inside Minikube.) Give the image an explicit tag other than `latest`: for an untagged or `latest` image Kubernetes defaults `imagePullPolicy` to `Always` and tries to pull it from a registry, which is most likely why the deployment fails when the image only exists locally.
4. Build and tag the image:
```
    > docker build -t sample_fast_api:v2 .
```
5. Create the deployment
```
    > kubectl create deployment app-dep --image=sample_fast_api:v2
```
6. The application is now deployed. Check the pod with `kubectl get pods`, or use `kubectl get deploy` to view the deployment.
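
A likely explanation for the tagging issue in step 3 is the way Kubernetes picks a default `imagePullPolicy` from the image reference when none is set. The sketch below mirrors that documented defaulting rule in plain Python; `default_pull_policy` is a hypothetical helper for illustration and is not part of any Kubernetes client library.

```python
def default_pull_policy(image: str) -> str:
    """Return the imagePullPolicy Kubernetes applies when none is set."""
    # A digest reference (name@sha256:...) pins the image exactly.
    if "@" in image:
        return "IfNotPresent"
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as localhost:5000.
    last_segment = image.rsplit("/", 1)[-1]
    tag = last_segment.split(":", 1)[1] if ":" in last_segment else ""
    # No tag or `latest` means "always pull", which fails for local-only images.
    return "Always" if tag in ("", "latest") else "IfNotPresent"


for ref in ("sample_fast_api", "sample_fast_api:latest", "sample_fast_api:v2"):
    print(ref, "->", default_pull_policy(ref))
```

With the `v2` tag the policy becomes `IfNotPresent`, so the image built inside Minikube's Docker daemon is used as-is.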


---
#### Task 2: Expose the application using NodePort or LoadBalancer service. 
---
**Solution:**
1. Now that we have a running application, we need to expose it using a NodePort or LoadBalancer service. I will use a NodePort service.
2. Run the command below in a terminal (it can be found in the Minikube docs):
```
    > kubectl expose deployment app-dep --type=NodePort --port=8080
```
3. Now that the service is created, you can test it locally by opening it through Minikube with the command below:
```
    > minikube service app-dep
```
4. We should see something like this:

    ![minikube service](images/minikube_service.png)

---

## Kubernetes Objects

### Pods, Deployments and Services

#### Task 1: Hands-on: Create a deployment with multiple replicas of a pod.
---

**Solution:**
1. I have created a deployment and a service in a single file: `./app-dep-deployment`
2. The deployment has 2 replicas, and the NodePort service exposes port 30000, mapped to the container port 8080.
3. Create the deployment using `kubectl create -f app-dep-deployment`
4. Because I am using Docker Desktop's engine inside WSL2, the node port is not directly reachable from the host, so we need to run the `minikube service app-dep-service` command to expose the service for local testing.
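
For readers without the manifest file at hand, here is a rough sketch of what such a combined Deployment and Service could look like, written as Python dictionaries and printed as JSON (kubectl accepts JSON manifests as well as YAML). The replica count, ports, image tag, and service name come from the steps above; the deployment name, the `app: app-dep` label, and the container name are assumptions and may differ from the actual file.

```python
import json

APP_LABELS = {"app": "app-dep"}  # assumed label linking the Service to the pods

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "app-dep"},  # assumed name
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": APP_LABELS},
        "template": {
            "metadata": {"labels": APP_LABELS},
            "spec": {
                "containers": [{
                    "name": "app",  # assumed container name
                    "image": "sample_fast_api:v2",
                    "ports": [{"containerPort": 8080}],
                }]
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "app-dep-service"},
    "spec": {
        "type": "NodePort",
        "selector": APP_LABELS,
        # nodePort must fall in the cluster's NodePort range (30000-32767 by default).
        "ports": [{"port": 8080, "targetPort": 8080, "nodePort": 30000}],
    },
}

# Wrap both objects in a List so they can be created from a single file.
manifest = {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saved to a file (say `manifest.json`, a hypothetical name), the output could be applied with `kubectl create -f manifest.json`.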

#### Task 2: Expose the deployment as a service and access it.
---

**Solution:**
1. This was already achieved in the previous task. We will see something like this:

    ![minikube service](images/service_nodeport.png)



### ConfigMaps and Secrets

#### Task 1: Create a ConfigMap to store application configuration and a Secret to store sensitive information.
---
**Solution:**
1. I have created a new YAML file named `app-dep-secret-configmap.yaml` that contains a Secret and a ConfigMap. The deployment is similar to the previous task. This time I rebuilt the Docker image, because I made a few changes to the FastAPI application to verify that these values are accessible as environment variables.

    ![newly added settings](images/secret_configmap.png)
2. Run `kubectl create -f app-dep-secret-configmap.yaml`
3. As we are in WSL2 run `minikube service app-dep-service`
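
One detail worth knowing when writing the Secret: if the manifest uses the `data` field, each value must be base64-encoded (the `stringData` field accepts plain text instead). Whether the file above uses `data` or `stringData` is not shown here, so treat this as a general sketch. The helper names and the placeholder value are hypothetical.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Base64-encode a value for the `data` field of a Secret manifest."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse of encode_secret_value, handy for checking what a Secret holds."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("s3cr3t")  # placeholder value, not a real secret
print(encoded)
print(decode_secret_value(encoded))
```

Note that base64 is an encoding, not encryption: anyone who can read the Secret object can decode the value.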

  
---

#### Task 2: Mount ConfigMap and Secret as environment variables in the pod.
---
**Solution:**
1. The ConfigMap and Secret values are set as environment variables in the deployment. If we hit the `/secret` and `/configMap` endpoints, we should see the values in the response.
  ![configmap output](images/configMap.png)
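
On the application side, values injected this way (through `configMapKeyRef` and `secretKeyRef` entries under `env` in the pod spec) are read like any other environment variable. The sketch below shows the idea with the standard library only. `APP_MODE` and `API_KEY` are hypothetical variable names, and `read_settings` is not the function used in the actual FastAPI app.

```python
import os
from typing import Mapping


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the values injected by Kubernetes as environment variables."""
    return {
        # APP_MODE and API_KEY are hypothetical variable names.
        "app_mode": env.get("APP_MODE", "not-set"),
        "api_key": env.get("API_KEY", "not-set"),
    }


# Simulate the pod environment locally with a plain dict.
settings = read_settings({"APP_MODE": "demo", "API_KEY": "s3cr3t"})
print(settings)
```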
---

## Orchestration using Kubeflow

### Setting up Kubeflow on k8s

#### Task: Install Kubeflow on Minikube
---

**Solution:**
1. First we need to install a dependency called Kustomize, which is used to build and install all the Kubeflow components.
2. Run the commands below in a terminal:
```
    > curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    > sudo install kustomize /usr/local/bin/kustomize
    > git clone https://github.com/kubeflow/manifests.git
    > cd manifests
    > while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
    > kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
```
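**Note:** Allocate at least 8 GB of RAM and 8 CPUs to the Minikube cluster (for example with `minikube start --cpus=8 --memory=8192`), otherwise the installation is unlikely to complete.

**Note:** The last command is needed to port-forward the Kubeflow dashboard to `localhost:8080`.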
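
The `while ! ...; do ...; sleep 10; done` loop in the commands above keeps re-running the apply until it succeeds. This is needed because some resources can only be created once the custom resource definitions they depend on are ready. Below is a minimal Python sketch of the same retry pattern. `retry_until_success` is a hypothetical helper, the `max_attempts` cap is my addition (the shell loop retries forever), and the `kubectl apply` call is replaced by a stand-in so the snippet runs without a cluster.

```python
import time
from typing import Callable


def retry_until_success(action: Callable[[], bool], delay_seconds: float = 10.0,
                        max_attempts: int = 30) -> int:
    """Call `action` until it returns True; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if action():
            return attempt
        print("Retrying to apply resources")
        time.sleep(delay_seconds)
    raise RuntimeError(f"still failing after {max_attempts} attempts")


# Stand-in for the apply step: fails twice, then succeeds.
outcomes = iter([False, False, True])
attempts_used = retry_until_success(lambda: next(outcomes), delay_seconds=0)
print("succeeded after", attempts_used, "attempts")
```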

If the installation does not complete, install each component individually by following this link: https://github.com/kubeflow/manifests#install-individual-components

After a successful installation of the Kubeflow components, you can access the Kubeflow dashboard at http://localhost:8080. It should look something like this:

![kubeflowdash](images/kubeflow_dash.png)

#### Task: Kubeflow Pipelines UI overview and its components
---
1. Pipelines: lists all the installed/added pipelines. (These can be added through a YAML file or the Python SDK.)
![pipelines](images/pipeline.png)

2. Experiments: stores all the runs and organizes them under experiment names/IDs.
![experiment](images/experiment.png)

3. Clicking on a run shows the details of the pipeline as a visual graph.
![pipeline_details](images/kfp_run_dag.png)


### Building and Managing pipelines

#### Task: Build a custom ML pipeline and visualize its execution.
---
1. You can find the pipeline code at `webapp/kubeflow_code/kubeflow_pipeline.py`.
2. After compiling the pipeline, you will find the YAML file at `webapp/kubeflow_code/kubeflow_iris_classifier.yaml`.
3. Upload the YAML file in the Kubeflow Pipelines UI and create a run.
4. The run can be triggered manually from the Pipelines UI or scheduled as a recurring run.
5. The pipeline has multiple components, which run one after the other as configured.
6. The execution can be followed step by step in the graph view, as shown below.
![kfp_dag](images/kfp_run_dag.png)
7. After the run completes, the metrics can be seen in the logs section of the `get_metrics` step:
![kfp_metrics](images/metrics.png)
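
To illustrate how "one after the other as configured" works: each step declares which steps it depends on, and the execution order follows from those dependencies. The sketch below uses only the Python standard library (`graphlib`, Python 3.9+), not the Kubeflow Pipelines SDK, so it is a conceptual illustration rather than the pipeline code itself. The step names other than `get_metrics` are assumptions.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose output it needs.
# Step names are assumptions loosely based on the screenshots.
dependencies = {
    "load_data": set(),
    "test_train_split": {"load_data"},
    "train": {"test_train_split"},
    "get_metrics": {"train"},
}

# A topological sort yields an order in which every step runs after its inputs.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(" -> ".join(execution_order))
```

In a real pipeline the dependencies are usually implied by passing one component's output as another component's input.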

## Distributed Training and AutoML

### Distributed Training with Kubeflow


#### Task: Set up a distributed training job with Kubeflow
---
1. To set up distributed training on Kubeflow, we need to install the Training Operator.
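```
# Training Operator: typically one of the two, either the stable release...
> kubectl apply -k "github.com/kubeflow/training-operator.git/manifests/overlays/standalone?ref=v1.8.1"
# ...or the latest development version from master
> kubectl apply -k "github.com/kubeflow/training-operator.git/manifests/overlays/standalone?ref=master"

# Python SDK: likewise, either the released package...
> pip install -U kubeflow-training
# ...or the SDK built from master
> pip install git+https://github.com/kubeflow/training-operator.git@master#subdirectory=sdk/python
```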
This installs the Training Operator and the Python SDK for Kubeflow training. I am installing individual components because of limited memory and compute.

2. Create a training job. As an example, I have created an XGBoost training job that can be found at `webapp/training_operator/xgboost.yaml`. It creates 1 master and 2 worker replicas, which carry out the training.

![dist_train](images/dist_train.png)
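
For orientation, the overall shape of such a job can be sketched as a Python dictionary. This is an assumption-laden outline, not the contents of `xgboost.yaml`: the field names follow my understanding of the Training Operator's `XGBoostJob` resource (`kubeflow.org/v1`) and should be checked against the installed version, while the job name and image reference are placeholders.

```python
import json


def replica_spec(replicas: int, image: str) -> dict:
    """Build one replica group (Master or Worker) of the job."""
    return {
        "replicas": replicas,
        "restartPolicy": "Never",
        "template": {"spec": {"containers": [{"name": "xgboost", "image": image}]}},
    }


IMAGE = "example.com/xgboost-train:v1"  # placeholder image reference

xgboost_job = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "XGBoostJob",
    "metadata": {"name": "xgboost-dist-sketch"},  # placeholder name
    "spec": {
        "xgbReplicaSpecs": {
            "Master": replica_spec(1, IMAGE),
            "Worker": replica_spec(2, IMAGE),
        }
    },
}

# One pod is started per replica: 1 master + 2 workers.
total_pods = sum(
    spec["replicas"] for spec in xgboost_job["spec"]["xgbReplicaSpecs"].values()
)
print(json.dumps(xgboost_job, indent=2))
print("pods:", total_pods)
```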

#### Task: Compare the performance of the distributed training job with the single node training job
---

1. The distributed job completes the classification in about 130 seconds, compared with around 135 seconds for single-node training: a saving of roughly 5 seconds, or about 4%.

<div style="display: flex; justify-content: space-between; position: relative;">
    <div style="width: 48%; position: relative;">
        <img src="images/dist_train.png" alt="dist_train" style="width: 100%;">
    </div>
    <div style="width: 48%; position: relative;">
        <img src="images/test_train_single1.png" alt="single_node_test_train_split" style="width: 100%;">
    </div>
    <div style="width: 48%; position: relative;">
        <img src="images/train_single.png" alt="single_node_train" style="width: 100%;">
    </div>
</div>
<div style="text-align: center; font-size: 0.8em; margin-top: 10px;">
    Left: Distributed job. Middle and right: single-node test/train split and train steps.
</div>
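
To put the two timings in perspective, the speedup and the relative time saved can be worked out directly. This is a small arithmetic sketch using the approximate figures quoted above (130 s and 135 s); the function names are my own.

```python
def speedup(baseline_seconds: float, improved_seconds: float) -> float:
    """How many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


def time_saved_percent(baseline_seconds: float, improved_seconds: float) -> float:
    """Share of the baseline run time that was saved, in percent."""
    return 100 * (baseline_seconds - improved_seconds) / baseline_seconds


single_node, distributed = 135.0, 130.0  # approximate timings from the runs above
print(f"speedup: {speedup(single_node, distributed):.3f}x")
print(f"time saved: {time_saved_percent(single_node, distributed):.1f}%")
```

A gain this small suggests that, for a dataset of this size, the overhead of coordinating the master and worker pods offsets most of the benefit of parallel training.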




