## Homework

In this homework, we'll deploy the credit card prediction model from homework 5.
We already have a Docker image for this model, so we'll use it to
deploy the model to Kubernetes.


## Building the image

Clone the course repo if you haven't:

```
git clone https://github.com/alexeygrigorev/mlbookcamp-code.git
```

Go to the `course-zoomcamp/cohorts/2022/05-deployment/homework` folder and 
execute the following:


```bash
docker build -t zoomcamp-model:v001 .
```

> **Note:** If you have trouble building the image, you can
> use the image we built and published to Docker Hub:
> `svizor42/zoomcamp-model:v001`

In [1]:
!cd ../05-deployment/homework && docker build -t zoomcamp-model:v001 .

[1A[1B[0G[?25l[+] Building 0.0s (0/1)                                                         
[?25h[1A[0G[?25l[+] Building 0.1s (4/9)                                                         
[34m => [internal] load build definition from Dockerfile                       0.0s
[0m[34m => => transferring dockerfile: 302B                                       0.0s
[0m[34m => [internal] load .dockerignore                                          0.0s
[0m[34m => => transferring context: 2B                                            0.0s
[0m[34m => [internal] load metadata for docker.io/svizor/zoomcamp-model:3.9.12-s  0.0s
[0m[34m => CACHED [1/5] FROM docker.io/svizor/zoomcamp-model:3.9.12-slim          0.0s
[0m => [internal] load build context                                          0.1s
 => => transferring context: 27B                                           0.1s
 => [2/5] RUN pip install pipenv                                           0.1s
[?25h[1A[1A[1A[1A[1A

### Question 1

Run it to test that it's working locally:

```bash
docker run -it --rm -p 9696:9696 zoomcamp-model:v001
```

Then, in another terminal, run the `q6_test.py` script:

```bash
python q6_test.py
```

You should see this:

```
{'get_card': True, 'get_card_probability': <value>}
```

Here `<value>` is the probability of getting a credit card. You need to choose the right one.

* 0.289
* 0.502
* 0.769
* 0.972

Now you can stop the container running in Docker.

In [14]:
!python q6_test.py

{'get_card': True, 'get_card_probability': 0.7692649226628628}


Answer: 0.769
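The script prints the full probability, while the options are rounded to three decimals. A minimal sketch of matching the two, using the value from the output above (the helper name `closest_option` is illustrative and not part of the homework files):

```python
# Options listed in Question 1 and the probability printed by q6_test.py above.
OPTIONS = [0.289, 0.502, 0.769, 0.972]
probability = 0.7692649226628628


def closest_option(value, options):
    """Return the option with the smallest absolute distance to `value`."""
    return min(options, key=lambda option: abs(option - value))


answer = closest_option(probability, OPTIONS)
print(answer)  # 0.769
```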

### Question 2

What's the version of `kind` that you have? 

Use `kind --version` to find out.


### Creating a cluster

Now let's create a cluster with `kind`:

```bash
kind create cluster
```

And check with `kubectl` that it was successfully created:

```bash
kubectl cluster-info
```

In [2]:
!kind --version

kind version 0.17.0


In [8]:
!kubectl cluster-info

Kubernetes control plane is running at https://127.0.0.1:38121
CoreDNS is running at https://127.0.0.1:38121/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.


### Question 3

What's the smallest deployable computing unit that we can create and manage 
in Kubernetes (`kind` in our case)?

* Node
* Pod
* Deployment
* Service


Answer: Pod

### Question 4

Now let's test if everything works. Use `kubectl` to get the list of running services.

What's the `Type` of the service that is already running there?

* ClusterIP
* NodePort
* LoadBalancer
* ExternalName

In [9]:
!kubectl get svc

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   28s


Answer: ClusterIP

### Question 5

To be able to use the docker image we previously created (`zoomcamp-model:v001`),
we need to register it with `kind`.

What's the command we need to run for that?

* `kind create cluster`
* `kind build node-image`
* `kind load docker-image`
* `kubectl apply`

In [10]:
# Doing this now will save some time
!kind load docker-image zoomcamp-model:v001

Image: "" with ID "sha256:b647ce8124835a218fe8da6b971adfa75f6b58b91d6def1b9bb4e5a6c4f0e7bc" not yet present on node "kind-control-plane", loading...


Answer: `kind load docker-image`

### Question 6

Now let's create a [deployment config](./deployment.yaml)

In [11]:
!kubectl apply -f deployment.yaml

deployment.apps/credit-card created


In [12]:
!kubectl get pods

NAME                           READY   STATUS    RESTARTS   AGE
credit-card-6f988cd849-84b2z   1/1     Running   0          10s


Answer: 9696
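For reference, a Deployment manifest for this model could look roughly like the sketch below. It is written as a Python dictionary and dumped to JSON, which `kubectl apply -f` accepts as well as YAML. The deployment name, image, and port come from the steps above. The `app` label, the container name, and the resource values are assumptions for illustration and may differ from the actual `deployment.yaml`.

```python
import json

# Hypothetical manifest: the label, container name, and resource values are assumed.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "credit-card"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "credit-card"}},
        "template": {
            "metadata": {"labels": {"app": "credit-card"}},
            "spec": {
                "containers": [
                    {
                        "name": "credit-card",
                        # Image loaded into the cluster with `kind load docker-image`.
                        "image": "zoomcamp-model:v001",
                        "resources": {
                            "requests": {"memory": "64Mi", "cpu": "100m"},
                            "limits": {"memory": "128Mi", "cpu": "200m"},
                        },
                        # Port the model server listens on inside the container.
                        "ports": [{"containerPort": 9696}],
                    }
                ]
            },
        },
    },
}

manifest_json = json.dumps(deployment, indent=2)
print(manifest_json)
```

The CPU request is worth keeping in any real manifest: the autoscaler used later in this homework expresses CPU utilization as a percentage of the requested CPU.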

### Question 7

Now let's create a [service config](./service.yaml)

In [13]:
!kubectl apply -f service.yaml 

service/credit-card created


In [15]:
!kubectl get svc

NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
credit-card   ClusterIP   10.96.140.170   <none>        80/TCP    82s
kubernetes    ClusterIP   10.96.0.1       <none>        443/TCP   5m6s


Answer: credit-card
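A Service manifest matching the output above (name `credit-card`, type `ClusterIP`, port 80) might look like the following sketch. The `app: credit-card` selector and the sample pod labels are assumptions, and the real `service.yaml` may differ. The small `selects` helper is illustrative: it mimics how an equality-based selector matches a pod only when every selector key and value appears in the pod's labels.

```python
import json

# Hypothetical manifest: the selector label is assumed, not taken from service.yaml.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "credit-card"},
    "spec": {
        "type": "ClusterIP",
        "selector": {"app": "credit-card"},
        # Port 80 on the service is routed to port 9696 in the pods.
        "ports": [{"port": 80, "targetPort": 9696}],
    },
}


def selects(service_manifest, pod_labels):
    """Return True if every selector key/value is present in the pod labels."""
    selector = service_manifest["spec"]["selector"]
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Example pod labels (assumed values for illustration).
pod_labels = {"app": "credit-card", "pod-template-hash": "6f988cd849"}
print(selects(service, pod_labels))
print(json.dumps(service, indent=2))
```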

### Testing the service

We can test our service locally by forwarding port 9696 on our computer
to port 80 on the service:

```bash
kubectl port-forward service/<Service name> 9696:80
```

Run `q6_test.py` (from homework 5) once again to verify that everything is working.
You should get the same result as in Question 1.

In [18]:
# The port is being forwarded in another terminal
!python q6_test.py

{'get_card': True, 'get_card_probability': 0.7692649226628628}
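Because the port-forward runs in a separate terminal, the test script fails with a connection error if it starts before the tunnel is up. A small sketch, using only the standard library, that checks whether anything is listening on the local port first (the helper name `port_is_open` is illustrative):

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9696 is the local port used in the port-forward command above.
    if port_is_open("localhost", 9696):
        print("Port 9696 is reachable, q6_test.py can be run")
    else:
        print("Nothing is listening on port 9696, is the port-forward running?")
```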


### Autoscaling

Now we're going to use a [HorizontalPodAutoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) 
(HPA for short) that automatically updates a workload resource (such as our deployment), 
with the aim of automatically scaling the workload to match demand.

Use the following command to create the HPA:

In [34]:
!kubectl autoscale deployment credit-card --name credit-card-hpa --cpu-percent=20 --min=1 --max=3

horizontalpodautoscaler.autoscaling/credit-card-hpa autoscaled


You can check the current status of the new HPA by running:

In [35]:
!kubectl get hpa

NAME              REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
credit-card-hpa   Deployment/credit-card   <unknown>/20%   1         3         1          3s


The output should look similar to this:

```bash
NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
credit-card-hpa   Deployment/credit-card   1%/20%    1         3         1          27s
```

The `TARGETS` column shows the average CPU consumption across all the Pods controlled by the corresponding deployment, next to the target value.
Current CPU consumption is close to 0% because no clients are sending requests to the server.
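The Kubernetes documentation describes the scaling rule as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, bounded by the configured minimum and maximum, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). A simplified sketch of that rule with the settings used here (20% target, 1 to 3 replicas) follows. The real controller also handles missing metrics, pod readiness, and stabilization windows, which this sketch ignores, and the function name is illustrative.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu=20,
                     min_replicas=1, max_replicas=3, tolerance=0.1):
    """Simplified HPA rule: scale proportionally to current/target utilization."""
    ratio = current_cpu / target_cpu
    # Skip scaling when utilization is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the --min and --max values given to `kubectl autoscale`.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(1, 1))   # 1: idle service stays at the minimum
print(desired_replicas(1, 45))  # 3: ceil(1 * 45 / 20) = 3
print(desired_replicas(3, 5))   # 1: load dropped, scale back down
```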

> **Note:** In case the HPA instance doesn't run properly, try to install the latest Metrics Server release
> from the `components.yaml` manifest:
> ```bash
> kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
> ```

In [36]:
!kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged


## Increase the load

Let's see how the autoscaler reacts to an increasing load. To do this, we can slightly modify the existing
[`q6_test.py`](./q6_test.py) script by putting the statement that sends the request to the credit-card service into a loop.

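```python
from time import sleep  # needed for sleep(); place it with the other imports

# `url` and `client` are already defined earlier in q6_test.py
while True:
    sleep(0.1)  # short pause between requests
    response = requests.post(url, json=client).json()
    print(response)
```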

Now you can run this script.
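A single loop with `sleep(0.1)` sends at most about ten requests per second, which may not be enough to push CPU usage above the 20% target. One possible variation is to send requests from several threads at once. In the sketch below, `run_load` and its parameters are illustrative names. `send` is any zero-argument callable, so in practice it would wrap the `requests.post(url, json=client)` call from `q6_test.py`.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send, workers=4, duration=1.0, pause=0.0):
    """Call `send()` repeatedly from `workers` threads for `duration` seconds.

    Returns the total number of completed calls.
    """
    deadline = time.monotonic() + duration
    counter_lock = threading.Lock()
    total = 0

    def worker():
        nonlocal total
        while time.monotonic() < deadline:
            send()
            with counter_lock:
                total += 1
            if pause:
                time.sleep(pause)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
    # Re-raise any exception that happened inside a worker thread.
    for future in futures:
        future.result()
    return total


if __name__ == "__main__":
    # Stand-in for the real request so the sketch runs without a server.
    # With the service forwarded: send = lambda: requests.post(url, json=client)
    completed = run_load(lambda: time.sleep(0.01), workers=4, duration=0.5)
    print(f"completed {completed} calls")
```

Keeping `send` injectable means the same function can be tried with a stand-in first and pointed at the forwarded service afterwards.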

### Question 8 (optional)

Run the `kubectl get hpa credit-card-hpa --watch` command to monitor how the autoscaler performs.
Within a minute or so, you should see a higher CPU load, and then more replicas.
What was the maximum number of replicas during this test?


* 1
* 2
* 3
* 4

> **Note:** It may take a few minutes for the number of replicas to stabilize. Since the load is not controlled
> in any way, the final number of replicas may differ from the initial one.

In [37]:
!kubectl get hpa credit-card-hpa --watch

NAME              REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
credit-card-hpa   Deployment/credit-card   <unknown>/20%   1         3         1          10s
^C


Answer: 1 (could not get anything to change)
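In the output above, `TARGETS` stays at `<unknown>/20%`, which means the HPA has no CPU reading for the pods, so it cannot scale no matter how much load is sent. Typical causes are a Metrics Server that is not reporting yet or containers without CPU requests. On `kind`, Metrics Server is often reported to need the `--kubelet-insecure-tls` argument, though that has not been verified for this setup. A small sketch for spotting this state by parsing the `TARGETS` value, assuming the format shown in this notebook (the function name is illustrative):

```python
def parse_targets(targets):
    """Split an HPA TARGETS value such as '1%/20%' into (current, target).

    A side that reads '<unknown>' is returned as None.
    """
    def to_percent(text):
        text = text.strip()
        if text == "<unknown>":
            return None
        return int(text.rstrip("%"))

    current, target = targets.split("/")
    return to_percent(current), to_percent(target)


# The two TARGETS values that appear in this notebook.
for value in ["<unknown>/20%", "1%/20%"]:
    current, target = parse_targets(value)
    if current is None:
        status = "no CPU metrics available yet"
    else:
        status = f"{current}% of a {target}% target"
    print(f"{value}: {status}")
```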