# Homework

In this homework, we'll deploy the Bank Marketing model from homework 5. We already have a Docker image for this model, and we'll use it to deploy the model to Kubernetes.

## Building the image

In [1]:
# Clone the course repo if you haven't:

# !git clone https://github.com/DataTalksClub/machine-learning-zoomcamp.git

In [2]:
# !docker build -t zoomcamp-model:3.11.5-hw10 machine-learning-zoomcamp/cohorts/2024/05-deployment/homework

# Note: If you have trouble building the image, you can use the image we built and published to Docker Hub: docker pull svizor/zoomcamp-model:3.11.5-hw10

## Question 1

Run it to test that it's working locally:
```
docker run -it --rm -p 9696:9696 zoomcamp-model:3.11.5-hw10
```

And in another terminal, run the q6_test.py file:
```
python q6_test.py
```

You should see this:
```
{'has_subscribed': True, 'has_subscribed_probability': <value>}
```

Here <value> is the probability of getting a subscription. You need to choose the right one.

- 0.287
- 0.530
- 0.757
- 0.960

Now you can stop the container running in Docker.

### Answer 1: 0.757

## Installing kubectl and kind

You need to install:

- kubectl - https://kubernetes.io/docs/tasks/tools/ (you might already have it - check before installing)
- kind - https://kind.sigs.k8s.io/docs/user/quick-start/

## Question 2

What's the version of kind that you have?

To find out, run:

```
kind --version
```

## Creating a cluster

Now let's create a cluster with kind:
```
kind create cluster
```

And check with kubectl that it was successfully created:
```
kubectl cluster-info
```


In [13]:
!kind --version 

kind version 0.17.0


## Question 3

What's the smallest deployable computing unit that we can create and manage in Kubernetes (kind in our case)?

- Node
- Pod
- Deployment
- Service

**Pod**: A Pod is the smallest deployable unit in Kubernetes. It represents one or more containers that share storage and network resources, together with a specification for how to run them. You can create and manage Pods directly.

**Node**: A Node is a physical or virtual machine where Pods run. While Nodes are part of the Kubernetes infrastructure, they are not a "deployable unit" you manage for running workloads.

**Deployment**: A Deployment is a higher-level abstraction that manages a set of Pods. It ensures the desired number of Pod replicas are running and helps with updates and rollbacks.

**Service**: A Service is an abstraction that exposes a set of Pods to enable communication (e.g., load balancing and discovery). It's not a computing unit itself but a way to expose Pods.
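To make the distinction concrete, here is a minimal sketch of a standalone Pod manifest, built as a Python dictionary and serialized to JSON (kubectl accepts JSON as well as YAML). The Pod name `subscription-pod` and the helper `build_pod_manifest` are hypothetical and chosen only for illustration. The image and port are the ones used elsewhere in this homework.

```python
import json

def build_pod_manifest(name, image, container_port):
    """Return a minimal Pod manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": container_port}],
                }
            ]
        },
    }

# "subscription-pod" is a hypothetical name chosen for illustration.
pod = build_pod_manifest("subscription-pod", "zoomcamp-model:3.11.5-hw10", 9696)
pod_json = json.dumps(pod, indent=2)
print(pod_json)
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` would create a single Pod with no Deployment behind it, so nothing would recreate the Pod if it were deleted. That is the practical reason the rest of this homework uses a Deployment instead.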

### Answer 3: Pod

## Question 4

Now let's test if everything works. Use kubectl to get the list of running services.

What's the Type of the service that is already running there?

- NodePort
- ClusterIP
- ExternalName
- LoadBalancer

In [3]:
!kubectl get services

NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
engagement-service     LoadBalancer   10.96.36.40     <pending>     80:30087/TCP   13h
kubernetes             ClusterIP      10.96.0.1       <none>        443/TCP        13h
subscription-service   LoadBalancer   10.96.178.136   <pending>     80:30555/TCP   69m


### Answer 4: ClusterIP
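The same information can be read programmatically. The sketch below assumes the JSON structure produced by `kubectl get services -o json` (a top-level `items` list whose entries carry `metadata.name` and `spec.type`). The sample payload is written by hand and trimmed to those two fields, so it is not captured output, and `service_types` is a hypothetical helper name.

```python
import json

def service_types(services_json):
    """Map each service name to its spec.type from `kubectl get services -o json` output."""
    payload = json.loads(services_json)
    return {
        item["metadata"]["name"]: item["spec"]["type"]
        for item in payload["items"]
    }

# Hand-written sample limited to the fields read above (not captured output).
sample = json.dumps({
    "items": [
        {"metadata": {"name": "kubernetes"}, "spec": {"type": "ClusterIP"}},
        {"metadata": {"name": "subscription-service"}, "spec": {"type": "LoadBalancer"}},
    ]
})

types = service_types(sample)
print(types["kubernetes"])
```

The built-in `kubernetes` service is the entry this question asks about, and its type is ClusterIP.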

## Question 5

To be able to use the docker image we previously created (zoomcamp-model:3.11.5-hw10), we need to register it with kind.

What's the command we need to run for that?

- kind create cluster
- kind build node-image
- kind load docker-image
- kubectl apply

### Answer 5: kind load docker-image
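For completeness, the full command also takes the image name as an argument. The sketch below only builds the argument list, so it can be inspected before anything is executed. `kind_load_command` is a hypothetical helper, and the optional `--name` flag is only needed when the cluster is not the default one.

```python
import shlex

def kind_load_command(image, cluster_name=None):
    """Build the argument list for `kind load docker-image`."""
    command = ["kind", "load", "docker-image", image]
    if cluster_name is not None:
        # --name selects a cluster other than the default one called "kind".
        command += ["--name", cluster_name]
    return command

command = kind_load_command("zoomcamp-model:3.11.5-hw10")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```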

## Question 6

Now let's create a deployment config (e.g. deployment.yaml):

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: subscription
spec:
  selector:
    matchLabels:
      app: subscription
  replicas: 1
  template:
    metadata:
      labels:
        app: subscription
    spec:
      containers:
      - name: subscription
        image: zoomcamp-model:3.11.5-hw10
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "1024Mi"
            cpu: "400m"
        ports:
        - containerPort: 9696
```

What is the value for <Port>, that is, the containerPort? ==> 9696
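The `resources` section of the config uses Kubernetes quantity suffixes: `m` means thousandths of a CPU core, and `Mi` means mebibytes. The following sketch converts the values from this deployment into plain numbers. It handles only the suffixes listed in the code, whereas Kubernetes accepts more, and `parse_cpu` and `parse_memory` are hypothetical helper names.

```python
def parse_cpu(quantity):
    """Convert a CPU quantity such as "100m" or "1" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def parse_memory(quantity):
    """Convert a memory quantity with a Ki/Mi/Gi suffix to bytes."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # plain number of bytes

cpu_request = parse_cpu("100m")
cpu_limit = parse_cpu("400m")
memory_request = parse_memory("64Mi")
memory_limit = parse_memory("1024Mi")
print(cpu_request, cpu_limit, memory_request, memory_limit)
```

So the container requests a tenth of a core and 64 MiB of memory, and is capped at 0.4 cores and 1 GiB.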

Apply this deployment using the appropriate command and get a list of running Pods. You can see one running Pod.


In [4]:
!kubectl apply -f deployment.yaml

deployment.apps/subscription configured


## Question 7

Let's create a service for this deployment (service.yaml):

```
apiVersion: v1
kind: Service
metadata:
  name: subscription-service
spec:
  type: LoadBalancer
  selector:
    app: subscription
  ports:
    - port: 80
      targetPort: 9696

```




In [5]:
!kubectl apply -f service.yaml

service/subscription-service unchanged


## Testing the service

We can test our service locally by forwarding port 9696 on our computer to port 80 on the service:

```
kubectl port-forward service/<Service name> 9696:80
```

Run q6_test.py (from homework 5) once again to verify that everything is working. You should get the same result as in Question 1.

In [17]:
!kubectl get all | grep pod/subscription

pod/subscription-65669bc8c7-vn5x4       1/1     Running            0          86m


In [9]:
!python machine-learning-zoomcamp/cohorts/2024/05-deployment/homework/q6_test.py 

{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
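The raw probability printed above can be matched to the Question 1 options by picking the closest one. A small sketch of that check, where `closest_option` is a hypothetical helper name:

```python
def closest_option(value, options):
    """Return the option with the smallest absolute difference from value."""
    return min(options, key=lambda option: abs(option - value))

probability = 0.756743795240796  # value printed by q6_test.py above
options = [0.287, 0.530, 0.757, 0.960]
answer = closest_option(probability, options)
print(answer)
```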


## Autoscaling

Now we're going to use a HorizontalPodAutoscaler (HPA for short), which automatically updates a workload resource (such as our deployment) to scale it to match demand.

Use the following command to create the HPA:

```
kubectl autoscale deployment subscription --name subscription-hpa --cpu-percent=20 --min=1 --max=3
```

You can check the current status of the new HPA by running:

```
kubectl get hpa
```

The output should look similar to this:
```
NAME               REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
subscription-hpa   Deployment/subscription   1%/20%    1         3         1          27s
```

The TARGETS column shows the average CPU utilization across all the Pods controlled by the corresponding deployment, followed by the target value. Current CPU consumption is close to 0% because no clients are sending requests to the server.
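The percentages are relative to the CPU request of each Pod (100m in our deployment), so the 20% target corresponds to about 20 millicores per Pod. The scaling decision itself follows the rule given in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the current metric to the target, rounded up. The sketch below implements only that core rule plus clamping to the min/max bounds. It deliberately omits the tolerance band and the stabilization behavior of the real controller, and `desired_replicas` is a hypothetical function name.

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Core HPA rule: ceil(current * currentMetric / targetMetric), clamped to [min, max]."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

# Idle state from the table above: 1% measured against a 20% target.
idle = desired_replicas(1, 1, 20, min_replicas=1, max_replicas=3)

# Hypothetical load: 50% average utilization on a single replica.
busy = desired_replicas(1, 50, 20, min_replicas=1, max_replicas=3)

print(idle, busy)
```

In the idle case the rule yields one replica. In the hypothetical 50% case it yields three, which is also the configured maximum.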

Note: If the HPA doesn't work properly, try installing the latest Metrics Server release from the components.yaml manifest:

```
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```


In [10]:
!kubectl autoscale deployment subscription --name subscription-hpa --cpu-percent=20 --min=1 --max=3

Error from server (AlreadyExists): horizontalpodautoscalers.autoscaling "subscription-hpa" already exists


In [11]:
!kubectl get hpa

NAME               REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
subscription-hpa   Deployment/subscription   1%/20%    1         3         1          56m


## Increase the load

Let's see how the autoscaler reacts to increasing load. To do this, we can slightly modify the existing q6_test.py script by putting the statement that sends the request to the subscription service into a loop.

```
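from time import sleep

# url and client are defined earlier in q6_test.py
while True:
    sleep(0.1)  # short pause between requests
    response = requests.post(url, json=client).json()
    print(response)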
```

Now you can run this script.
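An open-ended `while True` loop has to be interrupted by hand. As a sketch of a bounded alternative, the loop can be wrapped in a function that takes the number of requests and the function that sends one request. `run_load` and `fake_send` are hypothetical names. In real use, the sender would wrap the `requests.post(url, json=client).json()` call from q6_test.py, and the fake is used here only so the example runs without a cluster.

```python
import time

def run_load(send, n_requests, delay_seconds=0.1):
    """Call send() n_requests times, pausing before each call; return the responses."""
    responses = []
    for _ in range(n_requests):
        time.sleep(delay_seconds)
        responses.append(send())
    return responses

def fake_send():
    # Stand-in for requests.post(url, json=client).json(); no network needed.
    return {"has_subscribed": True}

results = run_load(fake_send, n_requests=3, delay_seconds=0.0)
print(len(results))
```

Lowering `delay_seconds` or running several copies of the script in parallel is one way to push CPU utilization higher if the autoscaler does not react.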

In [12]:
!python machine-learning-zoomcamp/cohorts/2024/05-deployment/homework/q6_test_update.py

{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
{'has_subscribed': True, 'has_subscribed_probability': 0.756743795240796}
...

## Question 8 (optional)

Run the following command to monitor how the autoscaler performs:

```
kubectl get hpa subscription-hpa --watch
```

Within a minute or so, you should see a higher CPU load, followed by more replicas.

What was the maximum number of replicas during this test?

- 1
- 2
- 3
- 4

Note: It may take a few minutes for the number of replicas to stabilize. Since the load is not controlled in any way, the final number of replicas may differ from the initial one.
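Rather than reading the watch output by eye, the maximum can be extracted from saved lines. The sketch below assumes the column layout shown in the `kubectl get hpa` output earlier, with REPLICAS as the second-to-last column. `max_replicas_seen` is a hypothetical helper name, and the sample rows are copied from the outputs above.

```python
def max_replicas_seen(watch_lines):
    """Return the largest REPLICAS value in `kubectl get hpa --watch` output lines."""
    highest = 0
    for line in watch_lines:
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "NAME":
            continue
        # REPLICAS is the second-to-last column, just before AGE.
        replicas = fields[-2]
        if replicas.isdigit():
            highest = max(highest, int(replicas))
    return highest

# Sample rows copied from the `kubectl get hpa` output shown earlier.
sample = [
    "NAME               REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE",
    "subscription-hpa   Deployment/subscription   1%/20%    1         3         1          27s",
    "subscription-hpa   Deployment/subscription   1%/20%    1         3         1          56m",
]
print(max_replicas_seen(sample))
```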

### Answer 8: 1