## Homework

> Note: sometimes your answer might not match one of the options exactly. That's fine. 
Select the option that's closest to your solution.

The goal of this homework is to familiarize you with BentoML and how to build and test an ML production service.

## Background

You are a new recruit at ACME Corp. Your manager is emailing you about your first assignment.


## Email from your manager

Good morning recruit! It's good to have you here! I have an assignment for you. I have a data scientist who has built
a credit risk model in a Jupyter notebook. I need you to run the notebook, save the model with BentoML, and see
how big the model is. If it's greater than a certain size, I'm going to have to request additional resources from
our infra team. Please let me know how big it is.

Thanks,

Mr McManager

## Question 1

* Install BentoML
* What's the version of BentoML you installed? **1.0.7**
* Use `--version` to find out 

In [1]:
!pip install bentoml

Collecting bentoml
  Downloading bentoml-1.0.7-py3-none-any.whl (858 kB)
Collecting cattrs>=22.1.0
  Downloading cattrs-22.2.0-py3-none-any.whl (35 kB)
Collecting circus
  Downloading circus-0.17.2-py3-none-any.whl (204 kB)
Collecting deepmerge
  Downloading deepmerge-1.1.0-py3-none-any.whl (8.5 kB)
Collecting fs
  Downloading fs-2.4.16-py2.py3-none-any.whl (135 kB)
Collecting opentelemetry-api>=1.9.0
  Downloading opentelemetry_api-1.13.0-py3-none-any.whl (50 kB)
Collecting opentelemetry-instrumentation==0.33b0
  Downloading opentelemetry_instrumentation-0.33b0-py3-none-any.whl (23 kB)
Collecting opentelemetry-instrumentation-aiohttp-client==0.33b0
  Downloading opentelemetry_instrumentation_aiohttp_

In [2]:
import bentoml
bentoml.__version__

'1.0.7'

In [3]:
!bentoml --version

bentoml, version 1.0.7


## Question 2

Run the notebook that contains the XGBoost model from module 6 (the previous module) and save the model with BentoML. To make it easier for you, we have prepared this [notebook](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/07-bentoml-production/code/train.ipynb).


Approximately how big is the saved BentoML model? The size can vary slightly depending on your local development environment.
Choose the size closest to your model.

* 924kb
* 724kb
* **114kb**
* 8kb

In [32]:
!bentoml models list

 Tag                           Module           Size        Creation Time
 credit_risk_model:ywdyugcur…  bentoml.xgboost  197.77 KiB  2022-10-25 23:08:00
 credit_risk_model:mzfzizcur…  bentoml.xgboost  197.77 KiB  2022-10-25 22:58:11
 mlzoomcamp_homework:jsi67fs…  bentoml.sklearn  5.82 KiB    2022-10-14 20:18:43
 mlzoomcamp_homework:qtzdz3s…  bentoml.sklearn  5.79 KiB    2022-10-14 02:12:14
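
To cross-check the size reported by `bentoml models list`, you can approximate the on-disk size of a saved model by summing the file sizes under its directory. The sketch below is a minimal example under stated assumptions: it assumes the model store lives at `~/bentoml/models` with one directory per model name and one sub-directory per version, and the helper name `dir_size_kib` is hypothetical, not part of BentoML. BentoML may compute its reported size differently, so small deviations are expected.

```python
from pathlib import Path


def dir_size_kib(path: Path) -> float:
    """Return the total size of all regular files under `path`, in KiB."""
    total_bytes = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
    return total_bytes / 1024


# Assumed default location of the BentoML model store; adjust if yours differs.
model_store = Path.home() / "bentoml" / "models"

if model_store.exists():
    # Assumed layout: <store>/<model name>/<version>/...
    for name_dir in sorted(p for p in model_store.iterdir() if p.is_dir()):
        for version_dir in sorted(p for p in name_dir.iterdir() if p.is_dir()):
            size = dir_size_kib(version_dir)
            print(f"{name_dir.name}:{version_dir.name}  {size:.2f} KiB")
```

If the store is somewhere else on your machine, point `model_store` at that directory instead.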


## Another email from your manager

Great job recruit! Looks like I won't be having to go back to the procurement team. Thanks for the information.

However, I just got word from one of the teams that's using one of our ML services and they're saying our service is "broken"
and they're trying to blame our model. I looked at the data they're sending and it's completely bogus. I don't want them
to send bad data to us and blame us for our models. Could you write a pydantic schema for the data that they should be sending?
That way next time it will tell them it's their data that's bad and not our model.

Thanks,

Mr McManager

## Question 3

Say you have the following data that you're sending to your service:

```json
{
  "name": "Tim",
  "age": 37,
  "country": "US",
  "rating": 3.14
}
```

What would the pydantic class look like? You can name the class `UserProfile`.

In [5]:
from pydantic import BaseModel


class UserProfile(BaseModel):
    name: str
    age: int
    country: str
    rating: float
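
To see how such a schema rejects bad data, here is a small sketch that validates one well-formed and one malformed payload. The helper `validate_profile` and the `bad` payload are hypothetical and only for illustration. In a BentoML 1.0 service, the class would typically be passed to the input descriptor as `JSON(pydantic_model=UserProfile)`, which is not shown here.

```python
from pydantic import BaseModel, ValidationError


class UserProfile(BaseModel):
    name: str
    age: int
    country: str
    rating: float


def validate_profile(payload: dict) -> bool:
    """Return True if `payload` matches the UserProfile schema."""
    try:
        UserProfile(**payload)
        return True
    except ValidationError as err:
        # The error lists every offending field, which is what the sender should see.
        print(err)
        return False


good = {"name": "Tim", "age": 37, "country": "US", "rating": 3.14}
# Hypothetical bad payload: `age` is not a number and `rating` is missing.
bad = {"name": "Tim", "age": "thirty-seven", "country": "US"}

print(validate_profile(good))
print(validate_profile(bad))
```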

## Email from your CEO

Good morning! I hear you're the one to go to if I need something done well! We've got a new model that a big client
needs deployed ASAP. I need you to build a service with it and test it against the old model and make sure that it performs
better, otherwise we're going to lose this client. All our hopes are with you!

Thanks,

CEO of Acme Corp

## Question 4

We've prepared a model for you that you can import using:

```bash
curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel.bentomodel
bentoml models import coolmodel.bentomodel
```

What version of scikit-learn was this model trained with?

* **1.1.1**
* 1.1.2
* 1.1.3
* 1.1.4
* 1.1.5
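
Once the model is imported, `bentoml models get <tag>` prints its metadata as coloured YAML, which is hard to read when captured in a notebook. The sketch below strips the ANSI colour codes and looks for a `scikit-learn: <version>` line. The helper name `framework_version` is hypothetical, and the `sample` string is an assumed layout for illustration, not captured from a real run.

```python
import re
from typing import Optional

# Matches ANSI colour/control sequences such as "\x1b[38;2;249;38;114m" or "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def framework_version(cli_output: str, package: str = "scikit-learn") -> Optional[str]:
    """Find '<package>: <version>' in the CLI output, ignoring ANSI styling."""
    plain = ANSI_ESCAPE.sub("", cli_output)
    match = re.search(rf"^\s*{re.escape(package)}:\s*(\S+)", plain, flags=re.MULTILINE)
    return match.group(1) if match else None


# Illustrative text only: an assumed layout, not real command output.
sample = (
    "\x1b[38;2;249;38;114mname\x1b[0m: mlzoomcamp_homework\n"
    "context:\n"
    "  framework_name: sklearn\n"
    "  framework_versions:\n"
    "    scikit-learn: 1.1.1\n"
)
print(framework_version(sample))
```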

In [6]:
!curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel.bentomodel

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1724  100  1724    0     0   1586      0  0:00:01  0:00:01 --:--:--  1584


In [7]:
!bentoml models import coolmodel.bentomodel

Model(tag="mlzoomcamp_homework:qtzdz3slg6mwwdu5") imported


In [8]:
!bentoml models get mlzoomcamp_homework:qtzdz3slg6mwwdu5

name: mlzoomcamp_homework
version: qtzdz3slg6mwwdu5
module: bentoml.sklearn
labels: {}
options

## Question 5 

Create a bento out of this scikit-learn model. The output type for this endpoint should be `NumpyNdarray()`

Send this array to the Bento:

```
[[6.4,3.5,4.5,1.2]]
```

You can use curl or the Swagger UI. What value does it return? 

* 0
* **1**
* 2
* 3

(Make sure scikit-learn is installed in your environment.)

In [27]:
%%writefile service.py
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray


tag = "mlzoomcamp_homework:qtzdz3slg6mwwdu5"
model_ref = bentoml.sklearn.get(tag)
model_runner = model_ref.to_runner()

svc = bentoml.Service("classifier", runners=[model_runner])


@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    result = model_runner.predict.run(input_series)
    return result

Overwriting service.py


In [25]:
!bentoml serve service.py:svc --reload

  "class": algorithms.Blowfish,
2022-10-26T00:26:27+0530 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service.py:svc" can be accessed at http://localhost:3000/metrics.
2022-10-26T00:26:28+0530 [INFO] [cli] Starting development HTTP BentoServer from "service.py:svc" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
  "class": algorithms.Blowfish,
2022-10-26 00:26:29 circus[25212] [INFO] Loading the plugin...
2022-10-26 00:26:29 circus[25212] [INFO] Endpoint: 'tcp://127.0.0.1:42519'
2022-10-26 00:26:29 circus[25212] [INFO] Pub/sub: 'tcp://127.0.0.1:45235'
2022-10-26T00:26:29+0530 [INFO] [observer] Watching directories: ['/home/elite/Documents/ML-Zoomcamp-2022/homework', '/home/elite/bentoml/models']
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
2022-10-26T00:26:36+0530 [INFO] [dev_api_server:classifier] 127.0.0.1:54910 (scheme=http,method=GET,path=/,type=,length=) (status=200,type=text/html; charset=utf-8,length=2859) 1.118m
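
As an alternative to curl or the Swagger UI, the array can be sent from Python. This is a minimal sketch using only the standard library. It assumes the service is running on `http://localhost:3000` and that the endpoint is named `classify`, as in the service above. The helper names `build_classify_request` and `classify` are hypothetical.

```python
import json
import urllib.request


def build_classify_request(rows, base_url="http://localhost:3000"):
    """Build a POST request that carries `rows` as a JSON array to /classify."""
    return urllib.request.Request(
        url=f"{base_url}/classify",
        data=json.dumps(rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(rows, base_url="http://localhost:3000"):
    """Send `rows` to the running service and return the decoded JSON response."""
    request = build_classify_request(rows, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())


# With the server running, call: classify([[6.4, 3.5, 4.5, 1.2]])
```

The request is built separately from the network call so the payload can be inspected without a running server.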

## Question 6

Make sure to serve your bento with `--production` for this question.

Install locust using:

```bash
pip install locust
```

Use the following locust file: [locustfile.py](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/cohorts/2022/07-bento-production/locustfile.py)

Make sure it points at your bento's endpoint (in case you didn't name your endpoint "classify").

<img src="resources/classify-endpoint.png">

Configure 100 users with ramp time of 10 users per second. Click "Start Swarming" and ensure that it is working.

Now download a second model with this command:

```bash
curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel
```

Or you can download with this link as well:
[https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel](https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel)

Now import the model:

```bash
bentoml models import coolmodel2.bentomodel
```

Update your bento's runner tag and test with both models. Which model allows more traffic (more throughput) as you ramp up the traffic?

**Hint 1**: Remember to restart your bento service after changing the model tag. Use Ctrl-C to stop the service between trials.

**Hint 2**: Increase the number of concurrent users to see which one has higher throughput
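
One way to switch models between trials without editing `service.py` each time is to read the tag from an environment variable. This is only a sketch: the variable name `MODEL_TAG` and the helper `resolve_model_tag` are hypothetical, and the two tags are the ones printed by `bentoml models import` in this homework.

```python
import os

# Tags printed by `bentoml models import` for the two homework models.
FIRST_MODEL_TAG = "mlzoomcamp_homework:qtzdz3slg6mwwdu5"
SECOND_MODEL_TAG = "mlzoomcamp_homework:jsi67fslz6txydu5"


def resolve_model_tag(default: str = FIRST_MODEL_TAG) -> str:
    """Read the model tag from the (hypothetical) MODEL_TAG environment variable."""
    return os.environ.get("MODEL_TAG", default)


# In service.py the hard-coded tag would then become:
#     model_ref = bentoml.sklearn.get(resolve_model_tag())
print(resolve_model_tag())
```

You would still stop the service with Ctrl-C, set `MODEL_TAG` to the other tag, and start the service again before the next load test.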

Which model has better performance at higher volumes?

* The first model
* **The second model**

In [28]:
!pip install locust

Defaulting to user installation because normal site-packages is not writeable
Collecting locust
  Downloading locust-2.12.2-py3-none-any.whl (823 kB)
Collecting Flask-BasicAuth>=0.2.0
  Downloading Flask-BasicAuth-0.2.0.tar.gz (16 kB)
  Preparing metadata (setup.py) ... done
Collecting gevent>=20.12.1
  Downloading gevent-22.10.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
Collecting geventhttpclient>=2.0.2
  Downloading geventhttpclient-2.0.8-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (104 kB)

In [33]:
%%writefile locustfile.py

from locust import task
from locust import between
from locust import HttpUser

sample = [[6.4,3.5,4.5,1.2]]

class MLZoomUser(HttpUser):
    """
    Usage:
        Start locust load testing client with:
            locust -H http://localhost:3000
        Open browser at http://0.0.0.0:8089, adjust desired number of users and spawn
        rate for the load test from the Web UI and start swarming.
    """

    @task
    def classify(self):
        self.client.post("/classify", json=sample)

    wait_time = between(0.01, 2)

Writing locustfile.py


In [35]:
!locust -H http://localhost:3000

It's not high enough for load testing, and the OS didn't allow locust to increase it by itself.
See https://github.com/locustio/locust/wiki/Installation#increasing-maximum-number-of-open-files-limit for more info.
[2022-10-26 00:44:18,702] the-python-ninja/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
[2022-10-26 00:44:18,715] the-python-ninja/INFO/locust.main: Starting Locust 2.12.2
KeyboardInterrupt
2022-10-25T19:16:16Z
[2022-10-26 00:46:16,552] the-python-ninja/INFO/locust.main: Shutting down (exit code 0)
Type     Name  # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
         Aggregated       0     0(0.00%) |      0       0       0      0 |    0.00        0.00

Response time percentiles (approximated)
Type     Name      

In [29]:
!curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1728  100  1728    0     0   1560      0  0:00:01  0:00:01 --:--:--  1560


In [31]:
!bentoml models import coolmodel2.bentomodel

Model(tag="mlzoomcamp_homework:jsi67fslz6txydu5") imported


In [None]:
!bentoml serve service.py:svc --reload --production

  "class": algorithms.Blowfish,
2022-10-26T00:49:22+0530 [INFO] [cli] Environ for worker 0: set CPU thread count to 4
2022-10-26T00:49:22+0530 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service.py:svc" can be accessed at http://localhost:3000/metrics.
2022-10-26T00:49:23+0530 [INFO] [cli] Starting production HTTP BentoServer from "service.py:svc" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
2022-10-26T00:49:43+0530 [INFO] [runner:mlzoomcamp_homework:1] _ (scheme=http,method=POST,path=/predict,type=application/octet-stream,length=412) (status=200,type=application/vnd.bentoml.NdarrayContainer,length=155) 14.278ms (trace=9927de484d89e646688c0ee9d6a33cca,span=71c0fa6246399a59,sampled=0)
2022-10-26T00:49:43+0530 [INFO] [runner:mlzoomcamp_homework:1] _ (scheme=http,method=POST,path=/predict,type=application/octet-stream,length=412) (status=200,type=application/vnd.bentoml.Ndar

## Email from marketing

Hello ML person! I hope this email finds you well. I've heard there's this cool new ML model called Stable Diffusion.
I hear if you give it a description of a picture it will generate an image. We need a new company logo and I want it
to be fierce but also cool, think you could help out?

Thanks,

Mike Marketer

## Question 7 (optional)

Go to this Bento deployment of Stable Diffusion: http://54.176.205.174/ (or deploy it yourself)

Use the txt2image endpoint and update the prompt to: "A cartoon dragon with sunglasses".
Don't change the seed; it should be 0 by default.

What is the resulting image?

### #1
<img src="resources/dragon1.jpeg">

### #2 
<img src="resources/dragon2.jpeg">

### #3 
<img src="resources/dragon3.jpeg">

### #4
<img src="resources/dragon4.jpeg">

## Submit the results

* Submit your results here: https://forms.gle/Hh9FWy6LGXk3wJYs8
* You can submit your solution multiple times. In this case, only the last submission will be used 
* If your answer doesn't match options exactly, select the closest one


## Deadline

The deadline for submitting is **24 October 2022 (Monday), 23:00 CEST (Berlin time)**. 

After that, the form will be closed.