## Homework

In this homework, we'll deploy the credit card prediction model from homework 5. We already have a Docker image for this model, and we'll use it to deploy the model to Kubernetes.

### Building the image

Clone the course repo if you haven't:

```
git clone https://github.com/alexeygrigorev/mlbookcamp-code.git
```

Go to the `course-zoomcamp/cohorts/2022/05-deployment/homework` folder and execute the following:

```
docker build -t zoomcamp-model:v001 .
```

   
> Note: If you have trouble building the image, you can use the image we built and published to Docker Hub: `svizor42/zoomcamp-model:v001`


### Question 1

Run it to test that it's working locally:

```
docker run -it --rm -p 9696:9696 zoomcamp-model:v001
```

And in another terminal, execute the `q6_test.py` file:

```
python q6_test.py
```

You should see this:

```
{'get_card': True, 'get_card_probability': <value>}
```

Here `<value>` is the probability of getting a credit card. You need to choose the right one.

- 0.289
    
- 0.502
    
- 0.769

- 0.972
    
Now you can stop the container running in Docker.

In [2]:
import requests


url = "http://localhost:9696/predict"

client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
response = requests.post(url, json=client).json()

print(response)

{'get_card': True, 'get_card_probability': 0.7692649226628628}

Answer: `0.769`
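
As a small sanity check, the raw probability returned by the service can be matched to the nearest answer option. The sketch below is only illustrative; the helper name `closest_option` is hypothetical and not part of the homework files.

```python
def closest_option(probability, options):
    """Return the option with the smallest absolute distance to probability."""
    return min(options, key=lambda option: abs(option - probability))


# Probability taken from the response shown above.
options = [0.289, 0.502, 0.769, 0.972]
print(closest_option(0.7692649226628628, options))
```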

You need to install:

- kubectl - https://kubernetes.io/docs/tasks/tools/ (you might already have it - check before installing)
 
- kind - https://kind.sigs.k8s.io/docs/user/quick-start/
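
To check whether the tools are already installed before downloading anything, one option is to look them up on `PATH` from Python. This is a minimal sketch using only the standard library; the helper name `find_tools` is made up for illustration.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_tools(["kubectl", "kind"]).items():
    status = path if path else "not found, install it first"
    print(f"{name}: {status}")
```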

In [5]:
!kubectl

kubectl controls the Kubernetes cluster manager.

 Find more information at:
https://kubernetes.io/docs/reference/kubectl/overview/

Basic Commands (Beginner):
  create        Create a resource from a file or from stdin.
  expose        Take a replication controller, service, deployment or pod and
expose it as a new Kubernetes Service
  run           Run a particular image on the cluster
  set           Set specific features on objects

Basic Commands (Intermediate):
  explain       Documentation of resources
  get           Display one or many resources
  edit          Edit a resource on the server
  delete        Delete resources by filenames, stdin, resources and names, or by
resources and label selector

Deploy Commands:
  rollout       Manage the rollout of a resource
  scale         Set a new size for a Deployment, ReplicaSet or Replication
Controller
  autoscale     Auto-scale a Deployment, ReplicaSet, StatefulSet, or
ReplicationController

Cluster Mana

In [7]:
!brew install kind

Running `brew update --auto-update`...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
bindgen                    libdivide                  pluto
btrfs-progs                libemf2svg                 pomsky
cdsclient                  libgrapheme                proxsuite
cmctl                      libunibreak                python-lsp-server
cntb                       license-eye                rnr
conda-zsh-completion       llama                      ruff
corrosion                  macpine                    socket_vmnet
d2                         mariadb@10.8               tart
graphqxl                   mariadb@10.9               temporal
hotbuild                   markdownlint-cli2          tut
huggingface-cli            muon                       typos-cli
hysteria                   node@18                    valijson
joker                      pandemics                  vhs
kubevious                  

In [8]:
!kind

kind creates and manages local Kubernetes clusters using Docker container 'nodes'

Usage:
  kind [command]

Available Commands:
  build       Build one of [node-image]
  completion  Output shell completion code for the specified shell (bash, zsh or fish)
  create      Creates one of [cluster]
  delete      Deletes one of [cluster]
  export      Exports one of [kubeconfig, logs]
  get         Gets one of [clusters, nodes, kubeconfig]
  help        Help about any command
  load        Loads images into nodes
  version     Prints the kind CLI version

Flags:
  -h, --help              help for kind
      --loglevel string   DEPRECATED: see -v instead
  -q, --quiet             silence all stderr output
  -v, --verbosity int32   info log verbosity, higher value produces more output
      --version           version for kind

Use "kind [command] --help" for more information about a command.


### Question 2

What's the version of kind that you have?

Use `kind --version` to find out.

In [9]:
!kind --version

kind version 0.17.0


### Creating a cluster

Now let's create a cluster with `kind`:

```
kind create cluster
```

And check with `kubectl` that it was successfully created:

```
kubectl cluster-info
```

In [10]:
!kind create cluster

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.25.3) 🖼

In [11]:
!kubectl cluster-info

Kubernetes control plane is running at https://127.0.0.1:64550
CoreDNS is running at https://127.0.0.1:64550/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.


### Question 3

What's the smallest deployable computing unit that we can create and manage in Kubernetes (kind in our case)?

- Node

- Pod

- Deployment

- Service

Answer: `Pod`

### Question 4

Now let's test if everything works. Use `kubectl` to get the list of running services.

What's the `Type` of the service that is already running there?

- ClusterIP

- NodePort

- LoadBalancer

- ExternalName

In [12]:
!kubectl get services

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   111s


Answer: `ClusterIP`
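
If you want to read the `TYPE` column programmatically rather than by eye, the table printed by `kubectl get services` can be split on whitespace. The sketch below assumes that no cell contains spaces, which holds for the output shown above but is not guaranteed in general; `parse_kubectl_table` is a hypothetical helper name.

```python
def parse_kubectl_table(text):
    """Parse whitespace-separated kubectl output into a list of dicts."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Output copied from the cell above.
output = """
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   111s
"""

for service in parse_kubectl_table(output):
    print(service["NAME"], service["TYPE"])
```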

### Question 5

To be able to use the docker image we previously created (zoomcamp-model:v001), we need to register it with kind.

What's the command we need to run for that?

- kind create cluster

- kind build node-image

- kind load docker-image

- kubectl apply

In [14]:
!kind load docker-image zoomcamp-model:v001

Answer: `kind load docker-image`

### Question 6

Now let's create a deployment config (e.g. `deployment.yaml`):
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: credit-card
spec:
  selector:
    matchLabels:
      app: credit-card
  replicas: 1
  template:
    metadata:
      labels:
        app: credit-card
    spec:
      containers:
      - name: credit-card
        image: <Image>
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"            
          limits:
            memory: <Memory>
            cpu: <CPU>
        ports:
        - containerPort: <Port>
```
Replace `<Image>`, `<Memory>`, `<CPU>`, `<Port>` with the correct values.

What is the value for `<Port>`?

Apply this deployment using the appropriate command and get a list of running Pods. You can see one running Pod.

In [None]:
!kubectl apply -f deployment.yaml
!kubectl get deployment
!kubectl get pod
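
Before applying a manifest, it can help to confirm that no `<...>` placeholders were left unfilled. The following is a rough sketch, not part of the homework: the helper name `find_placeholders` and the regular expression are assumptions chosen to match the placeholder style used in the templates here.

```python
import re

# Matches anything of the form <...> on a single line, e.g. <Image> or <Port>.
PLACEHOLDER = re.compile(r"<[^<>\n]+>")


def find_placeholders(manifest_text):
    """Return every <...> placeholder still present in the manifest text."""
    return PLACEHOLDER.findall(manifest_text)


# Fragment of the deployment template above, with nothing filled in yet.
sample = """
        image: <Image>
          limits:
            memory: <Memory>
            cpu: <CPU>
        - containerPort: <Port>
"""
print(find_placeholders(sample))
```

An empty list means every placeholder in the text has been replaced.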

### Question 7

Let's create a service for this deployment (`service.yaml`):

```
apiVersion: v1
kind: Service
metadata:
  name: <Service name>
spec:
  type: LoadBalancer
  selector:
    app: <???>
  ports:
  - port: 80
    targetPort: <PORT>
```

Fill it in. What do we need to write instead of `<???>`?

Apply this config file.

Answer: `credit-card`
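
The reason is that a Service routes traffic to the Pods whose labels contain every key/value pair in its selector, and the Pod template in the deployment config carries the label `app: credit-card`. The sketch below mimics that equality-based matching in plain Python; it is a simplified illustration, and `selector_matches` is a hypothetical name, not a Kubernetes API.

```python
def selector_matches(selector, pod_labels):
    """True if every key/value pair in the selector appears in the Pod labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Label taken from the Pod template in the deployment config.
pod_labels = {"app": "credit-card"}

print(selector_matches({"app": "credit-card"}, pod_labels))
print(selector_matches({"app": "something-else"}, pod_labels))
```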

In [15]:
!kubectl apply -f service.yaml
!kubectl get service

service/credit-card-service created
NAME                  TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
credit-card-service   LoadBalancer   10.96.194.18   <pending>     80:32727/TCP   0s
kubernetes            ClusterIP      10.96.0.1      <none>        443/TCP        11m


### Testing the service

We can test our service locally by forwarding the port 9696 on our computer to the port 80 on the service:

```
kubectl port-forward service/<Service name> 9696:80
```

Run `q6_test.py` (from homework 5) once again to verify that everything is working. You should get the same result as in Question 1.

### Autoscaling

Now we're going to use a [HorizontalPodAutoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) (HPA for short) that automatically updates a workload resource (such as our deployment), with the aim of automatically scaling the workload to match demand.

Use the following command to create the HPA:

```
kubectl autoscale deployment credit-card --name credit-card-hpa --cpu-percent=20 --min=1 --max=3
```

You can check the current status of the new HPA by running:

```
kubectl get hpa
```

The output should look similar to this:

```
NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
credit-card-hpa   Deployment/credit-card   1%/20%    1         3         1          27s
```

The `TARGETS` column shows the average CPU consumption across all the Pods controlled by the corresponding deployment. Current CPU consumption is close to 0% because no clients are sending requests to the server.
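
To get a feel for how the HPA turns these numbers into a replica count, here is a simplified sketch of the scaling rule described in the Kubernetes documentation: the desired count is the current count scaled by the ratio of current to target utilization, rounded up, then clamped to the configured minimum and maximum. The sketch deliberately ignores the tolerance band and stabilization behaviour of the real controller, and the 60% and 90% load figures are made-up inputs, not measurements.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Simplified HPA rule: scale proportionally, then clamp to [min, max]."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Settings mirror the HPA created above: target 20%, min 1, max 3.
print(desired_replicas(1, 1, 20, 1, 3))    # idle service at 1% utilization
print(desired_replicas(1, 60, 20, 1, 3))   # hypothetical 60% utilization
print(desired_replicas(2, 90, 20, 1, 3))   # hypothetical 90%, capped by max
```

Note that CPU utilization is measured relative to the CPU request of each Pod (`100m` in the deployment config), so a 20% target corresponds to roughly 20 millicores per Pod.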

> Note: If the HPA doesn't work properly, try installing the latest Metrics Server release from the `components.yaml` manifest:

```
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

### Increase the load

Let's see how the autoscaler reacts to increasing load. To do this, we can slightly modify the existing `q6_test.py` script by putting the statement that sends the request to the credit-card service into a loop.

```
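from time import sleep  # add this import at the top of q6_test.py

# Send requests continuously, pausing briefly between them.
while True:
    sleep(0.1)
    response = requests.post(url, json=client).json()
    print(response)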
```

Now you can run this script.
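
An endless loop has to be interrupted by hand, and a single dropped connection will stop it with an exception. As an alternative, the sketch below runs for a fixed duration and counts failures instead of crashing. It is one possible variant, not the script from the course: the names `generate_load` and `post_prediction`, the 5-second timeout, and the default pause are all assumptions.

```python
import time


def generate_load(send, duration_s, pause_s=0.1):
    """Call send() repeatedly for duration_s seconds; return (ok, failed) counts."""
    ok = failed = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        try:
            send()
            ok += 1
        except Exception:
            # A dropped connection should not stop the load test.
            failed += 1
        time.sleep(pause_s)
    return ok, failed


def post_prediction():
    """Send one request to the forwarded service (URL and payload from above)."""
    import requests  # imported lazily so generate_load has no third-party dependency

    url = "http://localhost:9696/predict"
    client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
    requests.post(url, json=client, timeout=5).raise_for_status()
```

With the port-forward running, calling `generate_load(post_prediction, duration_s=120)` would send requests for about two minutes and return how many succeeded and how many failed.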

In [17]:
!kubectl port-forward service/credit-card-service 9696:80

In [None]:
import requests


url = "http://localhost:9696/predict"

client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}
response = requests.post(url, json=client).json()

print(response)

In [None]:
!kubectl autoscale deployment credit-card --name credit-card-hpa --cpu-percent=20 --min=1 --max=3
!kubectl get hpa # on macOS, for live feedback: watch kubectl get hpa

In [None]:
from time import sleep
import requests

url = "http://localhost:9696/predict"

client = {"reports": 0, "share": 0.245, "expenditure": 3.438, "owner": "yes"}

while True:
    sleep(0.01)
    response = requests.post(url, json=client).json()

### Question 8 (optional)

Run the `kubectl get hpa credit-card-hpa --watch` command to monitor how the autoscaler performs. Within a minute or so, you should see a higher CPU load, followed by more replicas. What was the maximum number of replicas during this test?

- 1

- 2

- 3

- 4

> Note: It may take a few minutes for the number of replicas to stabilize. Since the load is not controlled in any way, the final number of replicas may differ from the initial one.

Answer: `3`