# BentoML Example: H2O Classification



**BentoML makes moving trained ML models to production easy:**

* Package models trained with **any ML framework** and reproduce them for model serving in production
* **Deploy anywhere** for online API serving or offline batch serving
* High-Performance API model server with *adaptive micro-batching* support
* Central hub for managing models and deployment process via Web UI and APIs
* Modular and flexible design making it *adaptable to your infrastructure*

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, and to enable teams to deliver prediction services in a fast, repeatable, and scalable way.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.


This notebook demonstrates how to use BentoML to __turn an H2O model into a Docker image containing a REST API server__ serving this model, as well as how to distribute your model as a command line tool or a pip-installable PyPI package.


In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
!pip install -q bentoml "h2o>=3.24.0.2"

In [2]:
import h2o
import bentoml

h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: java version "9.0.1"; Java(TM) SE Runtime Environment (build 9.0.1+11); Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
  Starting server from /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpi0_3wcnb
  JVM stdout: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpi0_3wcnb/h2o_bozhaoyu_started_from_python.out
  JVM stderr: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpi0_3wcnb/h2o_bozhaoyu_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.


0,1
H2O cluster uptime:,02 secs
H2O cluster timezone:,America/Los_Angeles
H2O data parsing timezone:,UTC
H2O cluster version:,3.24.0.2
H2O cluster version age:,"1 year, 5 months and 5 days !!!"
H2O cluster name:,H2O_from_python_bozhaoyu_yjtyb8
H2O cluster total nodes:,1
H2O cluster free memory:,4 Gb
H2O cluster total cores:,8
H2O cluster allowed cores:,8


This showcase uses prostate cancer data and tries to find an algorithm that predicts a particular stage of the disease. The dataset was collected at the Ohio State University Comprehensive Cancer Center and includes demographic and medical data for each of the 380 patients, as well as a label indicating whether the patient's tumor has already penetrated the prostatic capsule. Capsule penetration is a clear sign of advanced cancer and also helps the doctor decide on biopsy and treatment methods.

In this showcase, a deep learning algorithm is used to classify each patient's tumor as 'penetrating prostatic capsule' or 'not penetrating prostatic capsule'.

# Prepare Dataset & Model Training

In [3]:
prostate = h2o.import_file(path="https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
prostate.describe()

Parse progress: |█████████████████████████████████████████████████████████| 100%
Rows:380
Cols:9




| | ID | CAPSULE | AGE | RACE | DPROS | DCAPS | PSA | VOL | GLEASON |
|---|---|---|---|---|---|---|---|---|---|
| type | int | int | int | int | int | int | real | real | int |
| mins | 1.0 | 0.0 | 43.0 | 0.0 | 1.0 | 1.0 | 0.3 | 0.0 | 0.0 |
| mean | 190.5 | 0.4026315789473684 | 66.03947368421049 | 1.0868421052631572 | 2.2710526315789488 | 1.1078947368421048 | 15.408631578947375 | 15.812921052631573 | 6.3842105263157904 |
| maxs | 380.0 | 1.0 | 79.0 | 2.0 | 4.0 | 2.0 | 139.7 | 97.6 | 9.0 |
| sigma | 109.84079387914127 | 0.4910743389630552 | 6.527071269173311 | 0.3087732580252793 | 1.0001076181502861 | 0.3106564493514939 | 19.99757266856046 | 18.347619967271175 | 1.0919533744261092 |
| zeros | 0 | 227 | 0 | 3 | 0 | 0 | 0 | 167 | 2 |
| missing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1.0 | 0.0 | 65.0 | 1.0 | 2.0 | 1.0 | 1.4 | 0.0 | 6.0 |
| 1 | 2.0 | 0.0 | 72.0 | 1.0 | 3.0 | 2.0 | 6.7 | 0.0 | 7.0 |
| 2 | 3.0 | 0.0 | 70.0 | 1.0 | 1.0 | 2.0 | 4.9 | 0.0 | 6.0 |


In [4]:
# import the deep learning estimator module
from h2o.estimators.deeplearning import H2ODeepLearningEstimator
# transform the target variable into a factor
prostate["CAPSULE"] = prostate["CAPSULE"].asfactor()
# construct and define the estimator object 
model = H2ODeepLearningEstimator(activation = "Tanh", hidden = [10, 10, 10], epochs = 100)
# train the model on the whole prostate dataset
model.train(x = list(set(prostate.columns) - set(["ID","CAPSULE"])), y ="CAPSULE", training_frame = prostate)
model.show()

deeplearning Model Build progress: |██████████████████████████████████████| 100%
Model Details
H2ODeepLearningEstimator :  Deep Learning
Model Key:  DeepLearning_model_python_1600823263720_1


ModelMetricsBinomial: deeplearning
** Reported on train data. **

MSE: 0.13576334084363104
RMSE: 0.36846077246245773
LogLoss: 0.41941284208572555
Mean Per-Class Error: 0.1859289971495206
AUC: 0.8930638334629005
pr_auc: 0.8454837877810907
Gini: 0.786127666925801
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.6488881940477385: 


| | 0.0 | 1.0 | Error | Rate |
|---|---|---|---|---|
| 0 | 196.0 | 31.0 | 0.1366 | (31.0/227.0) |
| 1 | 36.0 | 117.0 | 0.2353 | (36.0/153.0) |
| Total | 232.0 | 148.0 | 0.1763 | (67.0/380.0) |


Maximum Metrics: Maximum metrics at their respective thresholds



| metric | threshold | value | idx |
|---|---|---|---|
| max f1 | 0.6488882 | 0.7774086 | 147.0 |
| max f2 | 0.2166894 | 0.8598351 | 236.0 |
| max f0point5 | 0.8142729 | 0.8149406 | 108.0 |
| max accuracy | 0.7475682 | 0.8289474 | 133.0 |
| max precision | 0.9959869 | 1.0 | 0.0 |
| max recall | 0.0157314 | 1.0 | 344.0 |
| max specificity | 0.9959869 | 1.0 | 0.0 |
| max absolute_mcc | 0.7475682 | 0.6406792 | 133.0 |
| max min_per_class_accuracy | 0.5132417 | 0.7929515 | 168.0 |
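
As a sanity check on the training metrics, the headline numbers can be recomputed by hand from the confusion matrix above. The sketch below uses only the four counts printed in that matrix; the variable names are illustrative and not part of the H2O API.

```python
# Counts copied from the confusion matrix (rows = actual, columns = predicted).
tn, fp = 196, 31   # actual class 0: correctly / incorrectly classified
fn, tp = 36, 117   # actual class 1: incorrectly / correctly classified

total = tn + fp + fn + tp
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Overall error rate and the average of the two per-class error rates.
error_rate = (fp + fn) / total
mean_per_class_error = (fp / (tn + fp) + fn / (fn + tp)) / 2

print(f"rows={total} precision={precision:.4f} recall={recall:.4f}")
print(f"f1={f1:.7f} error_rate={error_rate:.4f}")
print(f"mean_per_class_error={mean_per_class_error:.7f}")
```

The F1 value agrees with the `max f1` row, and the two error figures agree with the `Total` row of the confusion matrix and the reported `Mean Per-Class Error`. All of these are measured on the training data, so they say little about performance on unseen patients.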


Gains/Lift Table: Avg response rate: 40.26 %, avg score: 45.81 %



0,1,2,3,4,5,6,7,8,9,10,11,12,13
,group,cumulative_data_fraction,lower_threshold,lift,cumulative_lift,response_rate,score,cumulative_response_rate,cumulative_score,capture_rate,cumulative_capture_rate,gain,cumulative_gain
,1,0.0105263,0.9934707,2.4836601,2.4836601,1.0,0.9951197,1.0,0.9951197,0.0261438,0.0261438,148.3660131,148.3660131
,2,0.0210526,0.9857142,2.4836601,2.4836601,1.0,0.9898245,1.0,0.9924721,0.0261438,0.0522876,148.3660131,148.3660131
,3,0.0315789,0.9806336,2.4836601,2.4836601,1.0,0.9837304,1.0,0.9895582,0.0261438,0.0784314,148.3660131,148.3660131
,4,0.0421053,0.9765042,2.4836601,2.4836601,1.0,0.9784319,1.0,0.9867766,0.0261438,0.1045752,148.3660131,148.3660131
,5,0.05,0.9711804,2.4836601,2.4836601,1.0,0.9731642,1.0,0.9846273,0.0196078,0.1241830,148.3660131,148.3660131
,6,0.1,0.9362154,2.3529412,2.4183007,0.9473684,0.9512556,0.9736842,0.9679415,0.1176471,0.2418301,135.2941176,141.8300654
,7,0.15,0.9218727,1.6993464,2.1786492,0.6842105,0.9289272,0.8771930,0.9549367,0.0849673,0.3267974,69.9346405,117.8649237
,8,0.2,0.8981038,2.4836601,2.2549020,1.0,0.9116475,0.9078947,0.9441144,0.1241830,0.4509804,148.3660131,125.4901961
,9,0.3,0.7975257,1.8954248,2.1350763,0.7631579,0.8490849,0.8596491,0.9124379,0.1895425,0.6405229,89.5424837,113.5076253
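
The `lift` and `capture_rate` columns above can be reproduced from the class counts. This is a minimal sketch assuming the usual definitions (lift is a group's response rate divided by the overall response rate, capture rate is the share of all positives that fall in the group); the function names are illustrative only.

```python
# Class counts taken from the confusion matrix: 153 positives out of 380 rows.
positives, total = 153, 380
avg_response_rate = positives / total  # reported as 40.26 %


def lift(response_rate: float) -> float:
    """Response rate of a group relative to the overall response rate."""
    return response_rate / avg_response_rate


def capture_rate(positives_in_group: int) -> float:
    """Share of all positive cases that fall into the group."""
    return positives_in_group / positives


lift_group_1 = lift(1.0)          # every row in group 1 is a positive
lift_group_6 = lift(0.9473684)    # response_rate of group 6
capture_group_1 = capture_rate(4)  # group 1 holds about 1.05 % of 380 rows, i.e. 4 rows

print(f"{lift_group_1:.7f} {lift_group_6:.7f} {capture_group_1:.7f}")
```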



Scoring History: 


0,1,2,3,4,5,6,7,8,9,10,11,12,13
,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_logloss,training_r2,training_auc,training_pr_auc,training_lift,training_classification_error
,2020-09-22 18:07:49,0.000 sec,,0.0,0,0.0,,,,,,,
,2020-09-22 18:07:50,1.478 sec,27338 obs/sec,10.0,1,3800.0,0.4305173,0.5562144,0.2293961,0.7870202,0.7074640,2.4836601,0.2815789
,2020-09-22 18:07:50,1.699 sec,113772 obs/sec,100.0,10,38000.0,0.3684608,0.4194128,0.4355410,0.8930638,0.8454838,2.4836601,0.1763158


Variable Importances: 


| variable | relative_importance | scaled_importance | percentage |
|---|---|---|---|
| PSA | 1.0 | 1.0 | 0.2030484 |
| VOL | 0.7475824 | 0.7475824 | 0.1517954 |
| GLEASON | 0.7474829 | 0.7474829 | 0.1517752 |
| DPROS | 0.7206053 | 0.7206053 | 0.1463177 |
| AGE | 0.6067061 | 0.6067061 | 0.1231907 |
| RACE | 0.5727540 | 0.5727540 | 0.1162968 |
| DCAPS | 0.5298041 | 0.5298041 | 0.1075759 |
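
The `percentage` column of the variable importances is the relative importance normalized to sum to one. The following sketch recomputes it from the printed `relative_importance` values; the dictionary is copied from the output above, not read from the model object.

```python
# relative_importance values copied from the Variable Importances output.
relative_importance = {
    "PSA": 1.0,
    "VOL": 0.7475824,
    "GLEASON": 0.7474829,
    "DPROS": 0.7206053,
    "AGE": 0.6067061,
    "RACE": 0.5727540,
    "DCAPS": 0.5298041,
}

# Normalize so that the shares of all variables add up to 1.
total_importance = sum(relative_importance.values())
percentage = {
    name: value / total_importance
    for name, value in relative_importance.items()
}

for name, share in percentage.items():
    print(f"{name:8s} {share:.7f}")
```

Read this way, PSA accounts for roughly 20 % of the total importance, and no single feature dominates the model.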


In [5]:
predictions=model.predict(prostate)
predictions.show()

deeplearning prediction progress: |███████████████████████████████████████| 100%


predict,p0,p1
0,0.838943,0.161057
0,0.750357,0.249643
0,0.964811,0.0351892
0,0.748095,0.251905
0,0.988347,0.0116525
1,0.0284198,0.97158
0,0.537292,0.462708
0,0.927079,0.0729207
0,0.733022,0.266978
0,0.93077,0.0692299
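
The `predict` column is a thresholded version of `p1`. The sketch below assumes that H2O assigns binomial labels using the max-F1 threshold reported in the model summary (0.6488881940477385) rather than 0.5; treat that as an assumption to verify against the H2O documentation for your version. The `p1` values are copied from the table above.

```python
# Threshold from "Confusion Matrix (Act/Pred) for max f1" in the model summary.
# Assumption: H2O labels a row as class 1 when p1 reaches this threshold.
MAX_F1_THRESHOLD = 0.6488881940477385

# p1 column of the first ten predictions shown above.
p1_values = [
    0.161057, 0.249643, 0.0351892, 0.251905, 0.0116525,
    0.97158, 0.462708, 0.0729207, 0.266978, 0.0692299,
]

labels = [1 if p1 >= MAX_F1_THRESHOLD else 0 for p1 in p1_values]
print(labels)
```

For these ten rows the result matches the `predict` column. A different operating point, for example one favoring recall, could be obtained by thresholding `p1` yourself.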


## Define BentoService for model serving

In [6]:
%%writefile h2o_model_service.py
import pandas as pd
import h2o
import bentoml
from bentoml.frameworks.h2o import H2oModelArtifact
from bentoml.adapters import DataframeInput

@bentoml.artifacts([H2oModelArtifact('model')])
@bentoml.env(
    pip_packages=['pandas', 'h2o==3.24.0.2'],
    conda_channels=['h2oai'],
    conda_dependencies=['h2o==3.24.0.2']
)
class H2oModelService(bentoml.BentoService):

    @bentoml.api(input=DataframeInput(), batch=True)
    def predict(self, df):     
        hf = h2o.H2OFrame(df)
        predictions = self.artifacts.model.predict(hf)
        return predictions.as_data_frame()

Overwriting h2o_model_service.py


## Save BentoService to file archive

In [7]:
# 1) import the custom BentoService defined above
from h2o_model_service import H2oModelService

# 2) `pack` it with required artifacts
bento_svc = H2oModelService()
bento_svc.pack('model', model)

# 3) save your BentoService
saved_path = bento_svc.save()
print(saved_path)

[2020-09-22 18:11:22,120] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
[2020-09-22 18:11:22,858] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..


  normalized_version,
no previously-included directories found matching 'e2e_tests'
no previously-included directories found matching 'tests'
no previously-included directories found matching 'benchmark'


UPDATING BentoML-0.9.0rc0+3.gcebf2015/bentoml/_version.py
set BentoML-0.9.0rc0+3.gcebf2015/bentoml/_version.py to '0.9.0.pre+3.gcebf2015'
[2020-09-22 18:11:26,714] INFO - BentoService bundle 'H2oModelService:20200922181122_181C0D' saved to: /Users/bozhaoyu/bentoml/repository/H2oModelService/20200922181122_181C0D
/Users/bozhaoyu/bentoml/repository/H2oModelService/20200922181122_181C0D
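
The log line above shows that the bundle directory ends in `<service name>/<version>`, which corresponds to the `name:version` tag used by the CLI commands later in this notebook. A small sketch of that mapping, assuming the repository layout seen in this output (the helper `bento_tag_from_path` is hypothetical, not a BentoML function):

```python
from pathlib import PurePosixPath


def bento_tag_from_path(saved_path: str) -> str:
    """Derive a 'name:version' tag from a '<repo>/<name>/<version>' path."""
    path = PurePosixPath(saved_path)
    return f"{path.parent.name}:{path.name}"


# Path copied from the save() output above.
tag = bento_tag_from_path(
    "/Users/bozhaoyu/bentoml/repository/H2oModelService/20200922181122_181C0D"
)
print(tag)
```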


## REST API Model Serving


To start a REST API model server with the BentoService saved above, use the `bentoml serve` command:

In [8]:
!bentoml serve {saved_path}

[2020-09-22 18:22:13,513] INFO - Starting BentoML API server in development mode..
[2020-09-22 18:22:14,814] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
--------------------------  ---------------------------------------------------
H2O cluster uptime:         14 mins 30 secs
H2O cluster timezone:       America/Los_Angeles
H2O data parsing timezone:  UTC
H2O cluster version:        3.24.0.2
H2O cluster version age:    1 year, 5 months and 5 days !!!
H2O cluster name:           H2O_from_python_bozhaoyu_yjtyb8
H2O cluster total nodes:    1
H2O cluster free memory:    4.000 Gb
H2O cluster total cores:    8
H2O cluster allowed cores:  8
H2O cluster status:         locked, healthy
H2O connection url:         http://localhost:54321
H2O connection proxy:
H2O internal secu

If you are running this notebook from Google Colab, you can start the dev server with the `--run-with-ngrok` option to gain access to the API endpoint via a public endpoint managed by [ngrok](https://ngrok.com/):

In [None]:
!bentoml serve H2oModelService:latest --run-with-ngrok

#### Send prediction request to REST API server

Run the following command in a terminal to make an HTTP request to the API server:
```bash
# $'...' makes bash turn each \n into a real newline between CSV rows
curl -i \
--header "Content-Type: text/csv" \
--request POST \
--data $'ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON\n1,0,65,1,2,1,1.4,0,6\n2,0,72,1,3,2,6.7,0,7\n' \
localhost:5000/predict
```
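
The same request can be made from Python. This is a minimal sketch using only the standard library; it assumes the dev server from the previous step is listening on `localhost:5000`, and the helper names `build_predict_request` and `send_predict_request` are illustrative.

```python
import urllib.request

# Same three CSV lines as in the curl example, separated by real newlines.
CSV_PAYLOAD = (
    "ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON\n"
    "1,0,65,1,2,1,1.4,0,6\n"
    "2,0,72,1,3,2,6.7,0,7\n"
)


def build_predict_request(url: str = "http://localhost:5000/predict"):
    """Build (but do not send) a POST request carrying the CSV payload."""
    return urllib.request.Request(
        url,
        data=CSV_PAYLOAD.encode("utf-8"),
        headers={"Content-Type": "text/csv"},
        method="POST",
    )


def send_predict_request(request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_predict_request()
# With the API server running, uncomment to send the request:
# print(send_predict_request(request))
```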


## Containerize model server with Docker


One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out the Docker containerization feature.

If you already have Docker configured, simply run the following command to produce a Docker container image serving the H2oModelService prediction service created above:

In [None]:
!bentoml containerize H2oModelService:latest

In [None]:
!docker run -p 5000:5000 h2omodelservice

## Load saved BentoService

`bentoml.load` is the API for loading a BentoML packaged model in Python:

In [10]:
import bentoml
import pandas as pd

# Load saved BentoService archive from file directory
loaded_bento_svc = bentoml.load(saved_path)

# Access the predict function of loaded BentoService
df = pd.read_csv("https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
loaded_bento_svc.predict(df)

Checking whether there is an H2O instance running at http://localhost:54321 . connected.


0,1
H2O cluster uptime:,17 mins 42 secs
H2O cluster timezone:,America/Los_Angeles
H2O data parsing timezone:,UTC
H2O cluster version:,3.24.0.2
H2O cluster version age:,"1 year, 5 months and 5 days !!!"
H2O cluster name:,H2O_from_python_bozhaoyu_yjtyb8
H2O cluster total nodes:,1
H2O cluster free memory:,4.000 Gb
H2O cluster total cores:,8
H2O cluster allowed cores:,8


Parse progress: |█████████████████████████████████████████████████████████| 100%
deeplearning prediction progress: |███████████████████████████████████████| 100%


Unnamed: 0,predict,p0,p1
0,0,0.838943,0.161057
1,0,0.750357,0.249643
2,0,0.964811,0.035189
3,0,0.748095,0.251905
4,0,0.988347,0.011653
5,1,0.028420,0.971580
6,0,0.537292,0.462708
7,0,0.927079,0.072921
8,0,0.733022,0.266978
9,0,0.930770,0.069230


## Launch inference job from CLI

The BentoML CLI supports loading and running a packaged model directly from the command line. With the DataframeInput adapter, the CLI command can read input DataFrame data from a CLI argument or from local CSV or JSON files:

In [11]:
!wget https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv
!bentoml run H2oModelService:latest predict \
    --input-file prostate.csv

[2020-09-22 18:25:34,616] INFO - Getting latest version H2oModelService:20200922181122_181C0D
[2020-09-22 18:25:35,660] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
--------------------------  ---------------------------------------------------
H2O cluster uptime:         17 mins 51 secs
H2O cluster timezone:       America/Los_Angeles
H2O data parsing timezone:  UTC
H2O cluster version:        3.24.0.2
H2O cluster version age:    1 year, 5 months and 5 days !!!
H2O cluster name:           H2O_from_python_bozhaoyu_yjtyb8
H2O cluster total nodes:    1
H2O cluster free memory:    4.000 Gb
H2O cluster total cores:    8
H2O cluster allowed cores:  8
H2O cluster status:         locked, healthy
H2O connection url:         http://localhost:54321
H2O connection proxy:
H2O in
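
For a quick test that does not need the full dataset, a small CSV input file can be written by hand. The sketch below uses the first two rows of the dataset; the file name `sample_input.csv` and the helper `write_sample_input` are arbitrary choices for illustration.

```python
import csv
from pathlib import Path

# Column names and the first two rows of prostate.csv, as shown earlier.
COLUMNS = ["ID", "CAPSULE", "AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"]
ROWS = [
    [1, 0, 65, 1, 2, 1, 1.4, 0, 6],
    [2, 0, 72, 1, 3, 2, 6.7, 0, 7],
]


def write_sample_input(path: str) -> Path:
    """Write a header row plus the sample rows to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(ROWS)
    return Path(path)


sample_file = write_sample_input("sample_input.csv")
print(sample_file.read_text())
```

The resulting file could then be passed to the command above with `--input-file sample_input.csv` in place of `prostate.csv`.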

# Deployment Options

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
- [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
- [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
- [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
- [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
- [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
- [Azure container instance Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
- [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:
- [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
- [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
- [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
- [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
- [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)