# 7. Production-Ready Machine Learning (BentoML)
## Homework

### Background

You are a new recruit at ACME corp. Your manager is emailing you about your first assignment.

### Email from your manager

Good morning recruit! It's good to have you here! I have an assignment for you. I have a data scientist who's built
a credit risk model in a Jupyter notebook. I need you to run the notebook, save the model with BentoML, and see
how big the model is. If it's greater than a certain size, I'm going to have to request additional resources from
our infra team. Please let me know how big it is.

Thanks,

Mr McManager

## Question 1

* Install BentoML
* What's the version of BentoML you installed?
* Use `--version` to find out

### Solution steps:

Enter in Terminal:

```bash
pip install bentoml
```
```bash
bentoml --version
```

In [1]:
! bentoml --version

bentoml, version 1.0.7.post41+gac8e68b


**Answer 1:** The version of installed BentoML is **1.0.7**.
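
The CLI reports a development build (`1.0.7.post41+gac8e68b`), while the answer quotes only the release part. As a minimal sketch of that reduction, the helper below keeps the leading dotted numeric segment of a version string. The name `base_version` is hypothetical and not part of BentoML.

```python
import re


def base_version(version: str) -> str:
    """Return the leading dotted numeric part of a version string."""
    # Match digits separated by dots and stop at the first non-numeric segment.
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return match.group(0)


print(base_version("1.0.7.post41+gac8e68b"))  # 1.0.7
```

The installed version string itself can also be read from Python with `importlib.metadata.version("bentoml")`.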

## Question 2

Run the notebook that contains the XGBoost model from module 6 (the previous module) and save the XGBoost model with BentoML. To make it easier for you, we have prepared this [notebook](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/07-bentoml-production/code/train.ipynb).

How big approximately is the saved BentoML model? Size can slightly vary depending on your local development environment.
Choose the size closest to your model.

* 924kb
* 724kb
* 114kb
* 8kb

### Solution steps:

We follow the [Tutorial: Intro to BentoML](https://docs.bentoml.org/en/latest/tutorial.html).


The model is built and trained in the notebook [train.ipynb](./train.ipynb).

When we run the notebook, the model is saved through the BentoML API into its model store (a local directory managed by BentoML).

In the terminal, run:

```bash
bentoml models list
```

We can see information about saved models. The size is 116.25 KiB.

In [2]:
! bentoml models list

 Tag                           Module           Size        Creation Time
 credit_risk_model:zssgr7cr6…  bentoml.xgboost  116.25 KiB  2022-10-22 13:17:57 


**Answer 2:** The saved BentoML model is 116.25 KiB, so the closest option is **114kb**.
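
As a rough cross-check of the reported size, the sketch below sums the files under a directory and picks the closest answer option. It assumes that the size shown by `bentoml models list` is the total size of the model's directory in the model store. The helper names are hypothetical.

```python
import os

# Answer options from the question, in the units used there.
OPTIONS_KB = [924, 724, 114, 8]


def directory_size_kib(path: str) -> float:
    """Sum the sizes of all files under `path` and return the total in KiB."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes / 1024


def closest_option(size: float, options: list) -> int:
    """Return the option with the smallest absolute distance to `size`."""
    return min(options, key=lambda option: abs(option - size))


# 116.25 KiB is the size reported by `bentoml models list` above.
print(closest_option(116.25, OPTIONS_KB))  # 114
```

With BentoML installed, the model directory could be located with `bentoml.models.get("credit_risk_model:latest").path` and passed to `directory_size_kib`.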

## Another email from your manager

Great job recruit! Looks like I won't be having to go back to the procurement team. Thanks for the information.

However, I just got word from one of the teams that's using one of our ML services and they're saying our service is "broken"
and they're trying to blame our model. I looked at the data they're sending and it's completely bogus. I don't want them
to send bad data to us and blame us for our models. Could you write a pydantic schema for the data that they should be sending?
That way next time it will tell them it's their data that's bad and not our model.

Thanks,

Mr McManager

## Question 3

Say you have the following data that you're sending to your service:

```json
{
  "name": "Tim",
  "age": 37,
  "country": "US",
  "rating": 3.14
}
```

What would the pydantic class look like? You can name the class `UserProfile`.

### Solution steps:

Enter in Terminal:    
```bash
pip3 install pydantic
```

Then add the following lines to `service.py`:

```python
from pydantic import BaseModel


class UserProfile(BaseModel):
    name: str
    age: int
    country: str
    rating: float
```

**Answer 3:**

```python
class UserProfile(BaseModel):
    name: str
    age: int
    country: str
    rating: float
```
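
To illustrate how such a schema tells callers that their data is bad, here is a minimal sketch that validates one good and one bad payload. The helper `validate_profile` and the bad payload are made up for illustration. Only `BaseModel` and `ValidationError` come from pydantic.

```python
from pydantic import BaseModel, ValidationError


class UserProfile(BaseModel):
    name: str
    age: int
    country: str
    rating: float


def validate_profile(payload: dict):
    """Return (profile, None) on success or (None, error_text) on failure."""
    try:
        return UserProfile(**payload), None
    except ValidationError as exc:
        # The error text names every field that failed validation.
        return None, str(exc)


good, good_error = validate_profile(
    {"name": "Tim", "age": 37, "country": "US", "rating": 3.14}
)

# Hypothetical bad request: `age` is not a number and `rating` is missing.
bad, bad_error = validate_profile(
    {"name": "Tim", "age": "not a number", "country": "US"}
)
print(bad_error)
```

In a BentoML 1.0 service, the same class can be attached to an endpoint through `bentoml.io.JSON(pydantic_model=UserProfile)`, so invalid requests are rejected before they reach the model.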

## Email from your CEO

Good morning! I hear you're the one to go to if I need something done well! We've got a new model that a big client
needs deployed ASAP. I need you to build a service with it and test it against the old model and make sure that it performs
better, otherwise we're going to lose this client. All our hopes are with you!

Thanks,

CEO of Acme Corp

## Question 4

We've prepared a model for you that you can import using:

```bash
curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel.bentomodel
bentoml models import coolmodel.bentomodel
```

What version of scikit-learn was this model trained with?

* 1.1.1
* 1.1.2
* 1.1.3
* 1.1.4
* 1.1.5

### Solution steps:

Download the model:

In [3]:
!curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel.bentomodel

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1724  100  1724    0     0   1267      0  0:00:01  0:00:01 --:--:--  1268


Now import the model:

In [4]:
!bentoml models import coolmodel.bentomodel

Model(tag="mlzoomcamp_homework:qtzdz3slg6mwwdu5") imported.


To see information about the saved models (drop the leading `!` when running the command in a terminal):

In [5]:
!bentoml models list

 Tag                           Module           Size        Creation Time
 credit_risk_model:zssgr7cr6…  bentoml.xgboost  116.25 KiB  2022-10-22 13:17:57 
 mlzoomcamp_homework:qtzdz3s…  bentoml.sklearn  5.79 KiB    2022-10-13 23:42:14 


Then we copy the full tag (name and version) of the model we just imported:
```mlzoomcamp_homework:qtzdz3slg6mwwdu5```

To view the details of this model:

In [6]:
!bentoml models get mlzoomcamp_homework:qtzdz3slg6mwwdu5

name: mlzoomcamp_homework
version: qtzdz3slg6mwwdu5
module: bentoml.sklearn
labels: {}
options: {}
metadata: {}
context:
  framework_name: …

**Answer 4:** The model was trained with scikit-learn, version **1.1.1**.
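
The same information can be read from Python instead of scanning the CLI output. The sketch below assumes that a BentoML 1.0 model object exposes the recorded library versions as `info.context.framework_versions`, a mapping keyed by package name such as `scikit-learn`. The helper name `trained_with` is hypothetical.

```python
def trained_with(model, package: str = "scikit-learn") -> str:
    """Return the version of `package` recorded in a model's context.

    Assumes `model.info.context.framework_versions` is a dict-like mapping
    from package name to version string.
    """
    versions = dict(model.info.context.framework_versions)
    if package not in versions:
        raise KeyError(f"{package!r} is not recorded; found {sorted(versions)}")
    return versions[package]


# Usage against a real model store (requires bentoml and the imported model):
# import bentoml
# model = bentoml.models.get("mlzoomcamp_homework:qtzdz3slg6mwwdu5")
# print(trained_with(model))
```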

## Question 5 

Create a bento out of this scikit-learn model. The output type for this endpoint should be `NumpyNdarray()`.

Send this array to the Bento:

```
[[6.4,3.5,4.5,1.2]]
```

You can use curl or the Swagger UI. What value does it return? 

* 0
* 1
* 2
* 3

(Make sure your environment has Scikit-Learn installed) 

### Solution steps:

Let's create a Python file, [service.py](./service.py). Services are the core components of BentoML, where the serving logic is defined.

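```python
import bentoml
from bentoml.io import NumpyNdarray

# Load the imported scikit-learn model from the local BentoML model store.
model_ref = bentoml.sklearn.get("mlzoomcamp_homework:qtzdz3slg6mwwdu5")

# Wrap the model in a runner so BentoML can schedule inference calls.
model_runner = model_ref.to_runner()

svc = bentoml.Service("mlzoomcamp_homework", runners=[model_runner])


# Both the request body and the response are NumPy arrays.
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def classify(vector):
    prediction = await model_runner.predict.async_run(vector)
    print(prediction)  # log the prediction on the server side

    return prediction
```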

To run the BentoML server for our new service in development mode, open a terminal in the folder that contains `service.py` and run:

```bash
bentoml serve service.py:svc --reload
```

Now our service is running.

Open http://127.0.0.1:3000 in a browser and send the prediction request `[[6.4,3.5,4.5,1.2]]` from the web UI.

Or use curl:

In [7]:
!curl -X POST -H "content-type: application/json" --data "[[6.4,3.5,4.5,1.2]]" http://127.0.0.1:3000/classify
    

[1]

**Answer 5:** **1**.
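
The same request can also be sent from Python. This is a minimal sketch using only the standard library. It assumes the service is running locally on port 3000 with the endpoint named `classify`, and that the array is sent as a JSON body. The function names are hypothetical.

```python
import json
import urllib.request


def build_classify_request(rows, url="http://127.0.0.1:3000/classify"):
    """Build a POST request that carries `rows` as a JSON array."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(rows, url="http://127.0.0.1:3000/classify"):
    """Send `rows` to the running service and return the decoded response."""
    with urllib.request.urlopen(build_classify_request(rows, url)) as response:
        return json.loads(response.read())


# Building the request does not need a running server; sending it does.
request = build_classify_request([[6.4, 3.5, 4.5, 1.2]])
# print(classify([[6.4, 3.5, 4.5, 1.2]]))
```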

## Question 6

Make sure to serve your bento with `--production` for this question.

Install locust using:

```bash
pip install locust
```

Use the following locust file: [locustfile.py](locustfile.py)

Ensure that it is pointed at your bento's endpoint (In case you didn't name your endpoint "classify").

Configure 100 users with ramp time of 10 users per second. Click "Start Swarming" and ensure that it is working.

Now download a second model with this command:

```bash
curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel
```

Or you can download with this link as well:
[https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel](https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel)

Now import the model:

```bash
bentoml models import coolmodel2.bentomodel
```

Update your bento's runner tag and test with both models. Which model allows more traffic (more throughput) as you ramp up the traffic?

**Hint 1**: Remember to turn off and turn on your bento service between changing the model tag. Use Ctrl-C to close the service in between trials.

**Hint 2**: Increase the number of concurrent users to see which one has higher throughput

Which model has better performance at higher volumes?

* The first model
* The second model

### Solution steps:

We installed locust using:

```bash
brew install locust
```

Then we run the BentoML server in production mode:
```bash
bentoml serve --production -q --host localhost
```

Then we start Locust:
```bash
locust -H http://localhost:3000
```

Open the web interface at http://0.0.0.0:8089 in a browser.

Configure 100 users with a ramp-up rate of 10 users per second.

Then repeat the test with 200 users at 50 users per second, then 300/50, 500/50, and finally 1000/50.

For model ```mlzoomcamp_homework:qtzdz3slg6mwwdu5``` we got the following performance:

In [8]:
from IPython.display import Image

In [9]:
Image(url= "./locus-stats/Statistics_model1.png")

In [10]:
Image(url= "./locus-stats/Charts_model1.png")

Turn off our bento service by pressing Ctrl-C.

Let's test the second model.

We downloaded the second model with this command in Terminal:

```bash
curl -O https://s3.us-west-2.amazonaws.com/bentoml.com/mlzoomcamp/coolmodel2.bentomodel
```

Then import the model:

```bash
bentoml models import coolmodel2.bentomodel
```

Model(tag="mlzoomcamp_homework:jsi67fslz6txydu5") imported.

In [11]:
!bentoml models list

[1m [0m[1mTag                         [0m[1m [0m[1m [0m[1mModule         [0m[1m [0m[1m [0m[1mSize      [0m[1m [0m[1m [0m[1mCreation Time      [0m[1m [0m
 credit_risk_model:zssgr7cr6…  bentoml.xgboost  116.25 KiB  2022-10-22 13:17:57 
 mlzoomcamp_homework:jsi67fs…  bentoml.sklearn  5.82 KiB    2022-10-14 17:48:43 
 mlzoomcamp_homework:qtzdz3s…  bentoml.sklearn  5.79 KiB    2022-10-13 23:42:14 


Let's update the bento's runner tag in the file `service.py`:

```model_ref = bentoml.sklearn.get("mlzoomcamp_homework:jsi67fslz6txydu5")```
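
Editing the tag by hand works, but switching back and forth between trials is easy to get wrong. One possible alternative, shown here only as a sketch, is to read the tag from an environment variable. The variable name `MODEL_TAG` and the helper `resolve_model_tag` are hypothetical, and the service still has to be restarted between trials.

```python
import os

# Tag of the first model, used when no override is given.
DEFAULT_TAG = "mlzoomcamp_homework:qtzdz3slg6mwwdu5"


def resolve_model_tag(env=os.environ, default=DEFAULT_TAG):
    """Return the tag from the MODEL_TAG variable, or `default` if unset."""
    tag = env.get("MODEL_TAG", "").strip()
    return tag or default


# In service.py this would replace the hard-coded tag:
# model_ref = bentoml.sklearn.get(resolve_model_tag())
```

The second model could then be served with `MODEL_TAG=mlzoomcamp_homework:jsi67fslz6txydu5 bentoml serve --production -q --host localhost`.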

Then we run the BentoML server in production mode again:

```bash
bentoml serve --production -q --host localhost
```

Then we start Locust:

```bash
locust -H http://localhost:3000
```

Open the web interface at http://0.0.0.0:8089 in a browser.

Configure 100 users with a ramp-up rate of 10 users per second.

Then repeat the test with 200 users at 50 users per second, then 300/50, 500/50, and finally 1000/50.

For model ```mlzoomcamp_homework:jsi67fslz6txydu5``` we got the following performance:

In [12]:
Image(url= "./locus-stats/Statistics_model2.png")

In [13]:
Image(url= "./locus-stats/Charts_model2.png")

**Answer 6:** **The second model** has better performance at higher volumes.