# BentoML - Production ready machine learning

* BentoML: open source model serving library
* Module 6: developed a model for credit approval
    * What needs to be done next?
    * How can the model be used by people?
    * This can be done as a webservice
        * Module 5: wrap into flask app
        * This works well in development, but real-world use brings additional requirements, especially when many more people are going to use the service
    * Goal of this module: 
        * Build and deploy an ML model at scale
        * Customize your service to fit your use case
        * Make your service production ready
* What is 'Production ready'?
    * Scalability
    * Operational efficiency
    * Repeatability (CI/CD)
    * Flexibility
    * Resiliency
    * Ease of use
        

"BentoML makes it easy to **create** and **package** your ML service for production"

# 7.2 Building a Prediction Service
* Use the model from module 6 of the 2022 course (copied from GitHub)
* BentoML saves the model in the way recommended for each framework and version
* See end of notebook: module6_2022.ipynb

In [1]:
!pip install bentoml


In [2]:
!bentoml --version

bentoml, version 1.0.7


```python
import bentoml

# `model` (trained XGBoost booster) and `dv` (DictVectorizer) come from the notebook
bentoml.xgboost.save_model(
    "credit_risk_model",
    model,
    custom_objects={"dictVectorizer": dv},
)
```

* ```custom_objects``` lets us save additional objects that the model needs, in this case the dictionary vectorizer

Output:
```Model(tag="credit_risk_model:4kf5u7coewdndaoi", path="/home/frauke/bentoml/models/credit_risk_model/4kf5u7coewdndaoi/")```

* Each call to ```save_model``` creates a new unique tag (the model name plus a generated version)
* The model is saved at a specific path, shown in the output
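
The tag in the output has the form `name:version`, and the saved model can be fetched again by that tag. The following is a minimal sketch, assuming BentoML 1.0.x. `split_tag` and `load_saved_model` are hypothetical helper names introduced here for illustration; `bentoml.xgboost.get` and `custom_objects` belong to the library.

```python
from typing import Tuple


def split_tag(tag: str) -> Tuple[str, str]:
    """Split a BentoML tag of the form 'name:version' into its two parts."""
    name, _, version = tag.partition(":")
    # Assumption: a tag without an explicit version refers to the newest one.
    return name, version or "latest"


def load_saved_model(tag: str):
    """Fetch the stored model entry and the DictVectorizer saved alongside it."""
    # Imported here so that split_tag works even without bentoml installed.
    import bentoml

    model_ref = bentoml.xgboost.get(tag)
    dv = model_ref.custom_objects["dictVectorizer"]
    return model_ref, dv


print(split_tag("credit_risk_model:4kf5u7coewdndaoi"))
```

Passing `"credit_risk_model:latest"` to `load_saved_model` should return the most recently saved version rather than a pinned one.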

### Create a Service

* The service is defined in ```service.py```
* Start the service from the terminal: ```bentoml serve service.py:svc```
* The service then runs at ```localhost:3000```
* Adding the reload flag, ```bentoml serve service.py:svc --reload```, restarts the service automatically whenever we change the code
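
Once the service is running, it can be called over HTTP. The sketch below uses only the standard library. The route `/classify` and the payload fields are assumptions, since the notes do not show `service.py`; the route has to match the API function defined there, and the service is assumed to reply with JSON.

```python
import json
import urllib.request

# Assumption: service.py defines an API function called "classify".
SERVICE_URL = "http://localhost:3000/classify"


def build_request(payload: dict, url: str = SERVICE_URL) -> urllib.request.Request:
    """Create a JSON POST request for the prediction service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload: dict, url: str = SERVICE_URL) -> dict:
    """Send the payload to the running service and decode the JSON reply."""
    with urllib.request.urlopen(build_request(payload, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service started via `bentoml serve service.py:svc`, a call such as `predict({"seniority": 3, "home": "owner", "age": 26, "income": 125.0})` (a hypothetical applicant) would return the service's answer as a dictionary.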

# 7.3 Deploying your Prediction Service

* BentoML provides a command-line tool:

```bentoml models list```: lists all models saved.

Output:

| Tag |                                Module |          Size |       Creation Time |
|-----|---------------------------------------|---------------|---------------------|
| credit_risk_model:4kf5u7coewdndaoi | bentoml.xgboost | 195.66 KiB | 2022-10-17 16:13:33 |
| credit_risk_model:fxjqrmcoekdndaoi | bentoml.xgboost | 195.27 KiB | 2022-10-17 15:47:02 |
    
```bentoml models get credit_risk_model:fxjqrmcoekdndaoi```: gives detailed information about the model

```
name: credit_risk_model
version: fxjqrmcoekdndaoi
module: bentoml.xgboost
labels: {}
options:
  model_class: Booster
metadata: {}
context:
  framework_name: xgboost
  framework_versions:
    xgboost: 1.6.2
  bentoml_version: 1.0.7
  python_version: 3.8.3
signatures:
  predict:
    batchable: false
api_version: v2
creation_time: '2022-10-17T13:47:02.302464+00:00'
```

### How to build our bento

* We need to create a 'bentofile.yaml'
* See documentation for complete list of possible parameters: https://docs.bentoml.org/en/latest/concepts/bento.html
* It describes not only the service and its model, but also the environment the service needs
* Build the bento by running ```bentoml build``` in the terminal
* Output: ```Successfully built Bento(tag="credit_risk_classifier:tcr675covor57p7e")```
* Going to ```~/bentoml/bentos/credit_risk_classifier/tcr675covor57p7e``` we can see the files stored in the bento
![bento1.png](bento1.png)
* A Dockerfile is generated automatically (it can be customized)
* A bento is a standardized way of bundling everything needed for an ML service in one place
* If we containerize it, we have a single image to deploy
* To actually containerize it, go back to the folder where service.py and bentofile.yaml are stored, then run in the terminal: ```bentoml containerize credit_risk_classifier:tcr675covor57p7e```
* Output: ```Successfully built docker image for "credit_risk_classifier" with tags "credit_risk_classifier:tcr675covor57p7e"
To run your newly built Bento container, pass "credit_risk_classifier:tcr675covor57p7e" to "docker run". For example: "docker run -it --rm -p 3000:3000 credit_risk_classifier:tcr675covor57p7e serve --production".```
* Run ```docker run -it --rm -p 3000:3000 credit_risk_classifier:tcr675covor57p7e``` to start the container. Then we can go to ```localhost:3000``` to see our service
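
The notes do not show the `bentofile.yaml` itself. The sketch below writes a minimal one from Python so that its contents are visible in one place. The keys `service`, `labels`, `include` and `python.packages` are standard build options; the label and the package list are assumptions for this project, and `write_bentofile` is a hypothetical helper.

```python
from pathlib import Path

# Minimal build configuration; every value below is an assumption for this example.
BENTOFILE = """\
service: "service.py:svc"  # entry point: <file>:<Service object>
labels:
  owner: example-team      # hypothetical label
include:
  - "*.py"                 # files copied into the bento
python:
  packages:                # packages installed in the bento's environment
    - xgboost
    - scikit-learn
    - pydantic
"""


def write_bentofile(directory: str = ".") -> Path:
    """Write bentofile.yaml into the given directory and return its path."""
    path = Path(directory) / "bentofile.yaml"
    path.write_text(BENTOFILE)
    return path
```

After calling `write_bentofile()` in the folder that contains `service.py`, running `bentoml build` there should pick the file up.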

# 7.4 Sending, Receiving and Validating Data

* As it stands, the service returns a response even when the input data is invalid, for example when a field is missing or misnamed. To prevent this, we validate the input with the ```pydantic``` library
* A list of input and output descriptors can be found in the documentation: https://docs.bentoml.org/en/latest/reference/api_io_descriptors.html
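
A minimal sketch of such a validation schema follows. The field names are a hypothetical subset of a credit application and are not taken from these notes. In `service.py`, the class would be handed to the JSON input descriptor as `JSON(pydantic_model=CreditApplication)`.

```python
from pydantic import BaseModel, ValidationError


class CreditApplication(BaseModel):
    # Hypothetical subset of fields; the real model expects more.
    seniority: int
    home: str
    age: int
    income: float


def is_valid(payload: dict) -> bool:
    """Return True if the payload contains every required, correctly named field."""
    try:
        CreditApplication(**payload)
    except ValidationError:
        return False
    return True


complete = {"seniority": 3, "home": "owner", "age": 26, "income": 125.0}
misnamed = {"seniority": 3, "home": "owner", "agee": 26, "income": 125.0}

print(is_valid(complete))  # all required fields present
print(is_valid(misnamed))  # "age" is missing because the key is misspelled
```

The first call prints `True` and the second `False`: a misspelled key leaves the required field `age` unset, so validation fails before the data ever reaches the model.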