# <img style="float: left; padding-right: 10px; width: 45px" src="https://raw.githubusercontent.com/Harvard-IACS/2018-CS109A/master/content/styles/iacs.png"> APCOMP 295 Advanced Practical Data Science
## Exercise 2: From Docker to Kubernetes



**Harvard University**<br/>
**Fall 2020**<br/>
**Instructors**: Pavlos Protopapas


<hr style="height:2pt">

**Each assignment is graded out of 5 points.  The topic for this assignment is getting started with Kubernetes.**


###  <font color=red>Remember to delete your virtual machines/clusters.</font>

## Question 1: Install (i) [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/), (ii) [VirtualBox](https://www.virtualbox.org/ ), (iii) [minikube](https://kubernetes.io/docs/tasks/tools/install-minikube/)  (1 point)

**minikube:**

After you have installed minikube, start a cluster using the VirtualBox driver. If VirtualBox is not already your default driver, you may first need to run: <br/>
`minikube config set vm-driver virtualbox`

Then start the cluster:

`minikube start`

Now check that everything is working with the status command:
`minikube status`

Hint: If you are on a Linux machine, you may need to reboot after installing VirtualBox before this step works. We think this is because the installation adds some modules to the Linux kernel, which only load on the next boot.
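Before starting the cluster, it can help to confirm that each tool is actually on your `PATH`. The sketch below is one way to do that. It assumes `VBoxManage` is the command-line binary installed by VirtualBox, and the `check_tool` helper name is illustrative rather than part of the course material.

```shell
# Report whether a command-line tool is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: NOT found on PATH"
    return 1
  fi
}

# Count how many of the three required tools are missing.
MISSING=0
for tool in kubectl VBoxManage minikube; do
  check_tool "$tool" || MISSING=$((MISSING + 1))
done
echo "missing tools: $MISSING"

# Only query minikube when everything is installed.
if [ "$MISSING" -eq 0 ]; then
  minikube status || echo "minikube is installed but no cluster is running yet"
fi
```

If a tool is reported as missing, revisit its installation instructions before running `minikube start`.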



**Submit:** 
(i) Screenshot with the output of (a) `which kubectl` and (b) `kubectl version --client`

(ii) Screenshot showing that minikube is running a cluster with the VirtualBox driver 


**Example Submission:**

![image.png](images/image1.png)



![image.png](images/image3.png)


__P1.2 Submission__
![image.png](images/1.png)



![image.png](images/2.png)

## Question 2:  Create our own application (4 points)

In this exercise we are going to deploy a very simple Flask application on Kubernetes. This browser-based application accepts a number (let’s pretend it’s a record id from some .csv file) and returns the corresponding record from our (hypothetical) database. We will create two containers: one for the front end and one for the back end (database). At a high level there are three steps:

(a)  Create the app and run/test locally. 

(b) Dockerize your app, i.e., create Docker images and test them. 

(c) Create Kubernetes yaml files and deploy on minikube and Google Cloud Kubernetes. 


### (a) Create the app and run/test locally
- Step 1: put all the code in place. 
- Step 2: Run/Test

(Step 2:)  Run task1.py on Terminal 1 and run maindb.py on Terminal 2. 

`python task1.py` 

`python maindb.py` 

In your browser, open http://0.0.0.0:8081/ and enter a number, e.g., 1234.

You should see the message “This is the data corresponding to data id: 1234” in your browser. 
Check what you get with http://0.0.0.0:8082/ 

**Submit Screenshots**
![image](images/image11.png)

![image](images/image12.png)


__P1.3.1 Submission__

![image.png](images/3.png)
![image.png](images/4.png)

### (b) Dockerize your app, i.e., create Docker images and test them.
- Step 1: Create the required dockerfiles `Docker_task1frontend` and `Docker_maindb`. Also add `requirements.txt`
- Step 2: Build and run docker images (using docker network)

Partial Dockerfiles are provided below.   

(Step 2:) Build and run our two Docker images. First, build them: 
- Build a docker image with tag `task1:frontend` using the docker file `Docker_task1frontend`
- Build a docker image with tag `webapp:db` using the docker file `Docker_maindb`

(Step 3:) Create a docker network. 
There are two ways to do this - docker compose or docker network. We will be using docker network. (docker compose provided below for your reference) 
- Run `docker network create appNetwork`
- Run `docker network ls`. You should see appNetwork in the list.


(Step 4:) Run our images using docker network. <br/>
`docker run --name mywebdb -d --network appNetwork webapp:db` <br/>
`docker run --name fe -d -p 5000:8081 -e DB_HOST=mywebdb --network appNetwork task1:frontend`

By default, a container does not publish any of its ports to the outside world. The `-p 5000:8081` option publishes a port so that we can reach the app from outside the container: port 5000 on the host (i.e., your laptop) is bound to the container’s port 8081. 
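Steps 2 to 4 can be collected into one script. The following is a sketch, not the official solution: the `run` helper and the `DRY_RUN` switch are illustrative additions that print each command instead of executing it, so you can review the sequence first. Run it with `DRY_RUN=0` to execute the commands for real. It assumes both Dockerfiles are in the current directory.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to actually run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

NETWORK="appNetwork"
DB_CONTAINER="mywebdb"   # also the hostname the frontend uses to reach the database
HOST_PORT=5000           # port on your laptop
CONTAINER_PORT=8081      # port the frontend listens on inside the container

# Step 2: build both images.
run docker build -t task1:frontend -f Docker_task1frontend .
run docker build -t webapp:db -f Docker_maindb .

# Step 3: create the network shared by both containers.
run docker network create "$NETWORK"

# Step 4: start the database first, then the frontend that points at it.
run docker run --name "$DB_CONTAINER" -d --network "$NETWORK" webapp:db
run docker run --name fe -d -p "${HOST_PORT}:${CONTAINER_PORT}" \
  -e "DB_HOST=${DB_CONTAINER}" --network "$NETWORK" task1:frontend

# Show which host port is bound to which container port.
run docker port fe
```

Note that only the frontend is started with `-p`; the database container is reachable from the frontend through the shared network, under the name given by `--name`.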

<hr>

**Test** to ensure everything is working as expected. 

In your browser, open http://0.0.0.0:5000/ and enter a number, e.g., 1234. 
You should see the message “This is the data corresponding to data id: 1234” in your browser. 
Check what you get with  http://0.0.0.0:8082/  and explain the result. 

**Submit Screenshots** <br/>
(i) `docker images` <br/>
(ii) `docker ps`  <br/>
(iii) Output and URL on your browser  <br/>


__P1.3.2 Submission__

![image.png](images/5.png)
![image.png](images/6.png)
![image.png](images/7.png)
![image.png](images/8.png)

#### Optional (docker compose)

We can do the same with docker compose. First stop the running containers and delete `appNetwork`.

To run our images - 
`docker-compose up -d`

To clean up - 
`docker-compose down`
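For orientation, here is a rough sketch of what a compose file for this setup could look like. It is an assumption based on the `docker run` commands above, not the course's provided file, so it is written to a separate file name (`docker-compose.sketch.yml`) to avoid overwriting anything. The `run` helper prints commands by default; set `DRY_RUN=0` to execute them.

```shell
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

COMPOSE_SKETCH="docker-compose.sketch.yml"   # hypothetical file name

# Each service name doubles as a hostname on the network compose creates,
# so the frontend can reach the database as "mywebdb".
cat > "$COMPOSE_SKETCH" <<'EOF'
version: "3"
services:
  mywebdb:
    image: webapp:db
  fe:
    image: task1:frontend
    ports:
      - "5000:8081"
    environment:
      - DB_HOST=mywebdb
    depends_on:
      - mywebdb
EOF

run docker-compose -f "$COMPOSE_SKETCH" up -d
run docker-compose -f "$COMPOSE_SKETCH" down
```

Compose creates and removes the network for you, which is why the manual `docker network create` step is not needed here.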


### (c) Kubernetes ! Create Kubernetes yaml files and deploy.

We are ready to deploy our application - we will use minikube first and then deploy on Google Cloud. 

Fill the blanks in the following .yaml files. 

(Step 2:) 

(a) Start Minikube  
(b) Check resources - `kubectl get all` should be empty <br>
(c) `eval $(minikube docker-env)` This ensures that we use minikube's docker daemon. [(link)](https://stackoverflow.com/questions/52310599/what-does-minikube-docker-env-mean) 
To switch back to your machine's docker daemon, use `eval $(minikube docker-env -u)` <br> <br> 
(d) `docker images` to check that we have the images we need. You may need to build/tag the images again, because we changed the docker-env. <br>
(e) Deploy the webapp_configmap   <br>
(f) Deploy the webapp_db <br>
(g) Deploy the task1 <br>
(h) Check the services, `minikube service list` <br>
(i) Check the URL <br>
 
(Browser input/output should be similar to what you got in step (a), URL would be different. **Take screenshots before cleaning up - see sample below** )
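The deployment steps can be sketched as a script. This assumes the manifests are applied with `kubectl apply -f` and that the ConfigMap lives in a file named `webapp_configmap.yaml`; the exact file name is an assumption, so adjust it to match your files. As a safeguard, the illustrative `run` helper only prints commands unless you set `DRY_RUN=0`.

```shell
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run minikube start
run kubectl get all

# Point this shell's docker CLI at minikube's Docker daemon.
if [ "$DRY_RUN" = "1" ]; then
  echo '+ eval $(minikube docker-env)'
else
  eval "$(minikube docker-env)"
fi
run docker images   # rebuild or retag here if the images are missing

# Apply the ConfigMap first so the deployments that read it can start.
for manifest in webapp_configmap.yaml \
                webapp_db_deployment_k8s.yaml \
                task1_deployment_k8s.yaml; do
  run kubectl apply -f "$manifest"
done

run minikube service list
```

The `eval` only affects the current shell session, so run the script with `source` (or paste the commands) if you want the docker-env change to persist in your terminal.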

*Cleanup*  

`kubectl delete all --all`<br>
`minikube delete`

[minikube stop vs delete](https://kubernetes.io/docs/setup/learning-environment/minikube/#stopping-a-cluster)

*Do not proceed if your app is not successfully deployed on minikube. Post on Ed/attend OH.*


**Submit Screenshots:** 
(i) `docker images`  <br/>
(ii) `kubectl get all`  <br/>
(iii) `minikube service list` <br/>
(iv) Output and URL on your browser

**Example Submission:**

![image](images/image10.png)

![image](images/image9.png)


![image](images/image13.png)

![image](images/image14.png)

![image](images/image15.png)


__P1.3.3 Submission__

![image.png](images/9.png)
![image.png](images/10.png)
![image.png](images/11.png)
![image.png](images/12.png)
![image.png](images/13.png)

### (c) Kubernetes ! Create Kubernetes yaml files and deploy. Now we will deploy on Google Cloud

(i) `export PROJECT_ID=ac295datascience` Create an environment variable, because we will use this value frequently. It must be your own project ID on the Google Cloud console. <br/>

(ii) `eval $(minikube docker-env -u)` Set the docker daemon back to your machine (or open another tab in Terminal). <br/>  

(iii) `gcloud config list` # good to check your config so that we create clusters in the same zone  <br/>

(iv) Please enable [Google Container Registry API](https://console.cloud.google.com/apis/api/containerregistry.googleapis.com/overview?project=ac295datascience) before performing this operation. <br/>
`gcloud auth configure-docker` # We will be using Google Container Registry (sort of docker hub for google cloud, but with private repository)

(v) Now we can either rebuild images or [retag](https://docs.docker.com/engine/reference/commandline/image_tag/) them. 
Our images should be named `gcr.io/${PROJECT_ID}/webapp:db` and `gcr.io/${PROJECT_ID}/task1:frontend`

To retag: 
`docker image tag 305df86edf8a gcr.io/${PROJECT_ID}/task1:frontend`  <br/>
`docker image tag 75af1dcc902e gcr.io/${PROJECT_ID}/webapp:db` <br/>

OR 

Build docker images: <br/>
`docker build -t gcr.io/${PROJECT_ID}/webapp:db -f Docker_maindb .` <br/>
`docker build -t gcr.io/${PROJECT_ID}/task1:frontend -f Docker_task1frontend .`  <br/>

(vi) Push docker images to container registry.   <br/>
`docker push gcr.io/${PROJECT_ID}/webapp:db`  <br/>
`docker push gcr.io/${PROJECT_ID}/task1:frontend`  <br/>

You can check your pushed images on Google Cloud Console - Tools ->  Container Registry.   <br/>
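The retag-and-push steps follow the same pattern for both images, so they can be written as a loop. This sketch tags by image name rather than by image ID, which assumes the local images `webapp:db` and `task1:frontend` already exist. The `run` helper is illustrative and prints commands unless `DRY_RUN=0` is set.

```shell
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Falls back to the example project ID; export your own before running.
PROJECT_ID="${PROJECT_ID:-ac295datascience}"
REGISTRY="gcr.io/${PROJECT_ID}"

for image in webapp:db task1:frontend; do
  # Give the local image a second name that includes the registry path...
  run docker image tag "$image" "${REGISTRY}/${image}"
  # ...then push it under that name.
  run docker push "${REGISTRY}/${image}"
done
```

Tagging does not copy the image; it only adds another name that points to the same image ID, which you can confirm with `docker images`.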

(vii) Before we create our cluster and deploy, we need to change our .yaml files to use the new docker images that we created. In `webapp_db_deployment_k8s.yaml` change the name of the image from `webapp:db` to `gcr.io/ac295datascience/webapp:db` (substitute your own project ID for `ac295datascience`). <br/>

In `task1_deployment_k8s.yaml` change the name of the image from `task1:frontend` to `gcr.io/ac295datascience/task1:frontend` <br/>
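You can make this change in a text editor. If you prefer to script it, the following sketch uses `sed`; the `set_image` and `update_manifest` helpers are illustrative names. It assumes each manifest contains an unquoted line of the form `image: webapp:db`, and it skips files that are not present.

```shell
PROJECT_ID="${PROJECT_ID:-ac295datascience}"   # replace with your own project ID

# Print FILE with "image: OLD" replaced by "image: NEW".
# Usage: set_image FILE OLD NEW
set_image() {
  sed "s|image:[[:space:]]*${2}[[:space:]]*\$|image: ${3}|" "$1"
}

# Rewrite FILE in place (via a temporary copy) and show the resulting image line.
update_manifest() {
  file="$1"
  if [ -f "$file" ]; then
    set_image "$file" "$2" "$3" > "${file}.tmp" && mv "${file}.tmp" "$file"
    grep -n "image:" "$file"
  else
    echo "skipping $file (not found)"
  fi
}

update_manifest webapp_db_deployment_k8s.yaml webapp:db "gcr.io/${PROJECT_ID}/webapp:db"
update_manifest task1_deployment_k8s.yaml task1:frontend "gcr.io/${PROJECT_ID}/task1:frontend"
```

Running it a second time leaves the files unchanged, because the pattern only matches the short image name directly after `image:`.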

(viii) `gcloud container clusters create exercise2-cluster --num-nodes 2` Create the cluster. 

(ix) Deploy `webapp_configmap`, `webapp_db_deployment_k8s.yaml`, and `task1_deployment_k8s.yaml`. <br/>


(x) Get the external IP URL from `kubectl get all` and test it. You can also check Google Cloud console->Kubernetes Engine page to see your clusters.  My url looks like [this](https://console.cloud.google.com/kubernetes/list?project=ac295datascience)

(xi) **DELETE** the cluster `gcloud container clusters delete exercise2-cluster` 
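Steps (viii) to (xi) can be sketched as a two-phase script. The `CLEANUP` switch, the `run` helper, and the `webapp_configmap.yaml` file name are assumptions made for illustration. By default the script only prints the commands; set `DRY_RUN=0` to execute them, and run it again with `CLEANUP=1` once you have taken your screenshots.

```shell
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

CLUSTER="exercise2-cluster"
CLEANUP="${CLEANUP:-0}"   # set CLEANUP=1 to delete the cluster instead of creating it

if [ "$CLEANUP" = "1" ]; then
  # Clusters are billed while they exist, so delete when finished.
  run gcloud container clusters delete "$CLUSTER"
else
  run gcloud container clusters create "$CLUSTER" --num-nodes 2
  for manifest in webapp_configmap.yaml \
                  webapp_db_deployment_k8s.yaml \
                  task1_deployment_k8s.yaml; do
    run kubectl apply -f "$manifest"
  done
  # The external IP can take a short while to appear in this listing.
  run kubectl get all
fi
```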

**Submit Screenshots:**
(i) `kubectl get all`  <br/> 
(ii) Output and URL on your browser

**Example submission:** 
![image](images/image16.png)

![image](images/image17.png)

![image](images/image18.png)


###  <font color=red>Remember to delete your virtual machines/clusters.</font>

__P1.3.4 Submission__

![image.png](images/14.png)
![image.png](images/15.png)
![image.png](images/16.png)
