# BentoML Example: Fast AI with Tabular data

BentoML is an open-source framework for machine learning **model serving**, aiming to **bridge the gap between Data Science and DevOps.**

Data scientists can use BentoML to package models trained with any ML framework and reproduce them for serving in production. BentoML manages packaged models in the BentoML format and lets DevOps deploy them as online API serving endpoints or offline batch inference jobs on any cloud platform.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.


This notebook is based on lesson 4 of fastai v1's course v3. It trains a model that predicts a salary range (`<50k` or `>=50k`) from the tabular data in fastai's `ADULT_SAMPLE` dataset.



In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [3]:
!pip install -q -U 'fastai<=1.0.61'



In [4]:
from fastai.tabular import *

## Prepare Training Data

In [5]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')

In [6]:
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]
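To make the three preprocessing steps in `procs` more concrete, here is a minimal pure-Python sketch of what each one does conceptually. It is not fastai's implementation: the helper names and toy values are hypothetical, and details such as the median fill, the reserved code 0, and the sample standard deviation are assumptions about the default behaviour.

```python
import statistics


def fill_missing(values):
    """Replace None with the median of the observed values.

    Also returns a boolean flag column, similar in spirit to the
    `education-num_na` column that appears in the batch preview.
    """
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    filled = [median if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


def categorify(values):
    """Map each distinct label to an integer code starting at 1.

    Code 0 is kept free for labels that are not in the vocabulary (assumption).
    """
    vocab = {label: i + 1 for i, label in enumerate(sorted(set(values)))}
    return [vocab.get(v, 0) for v in values], vocab


def normalize(values):
    """Scale a numeric column to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(v - mean) / (std + 1e-7) for v in values]


# Toy columns with made-up values (not taken from adult.csv)
education_num = [9.0, None, 13.0, 10.0]
workclass = ["Private", "Local-gov", "Private", "Self-emp-not-inc"]

filled, education_num_na = fill_missing(education_num)
codes, vocab = categorify(workclass)
scaled = normalize(filled)

print(filled)            # missing entry replaced by the median
print(education_num_na)  # True where the value was missing
print(codes)             # integer codes for the categorical column
```

This is why the batch preview shows standardized values for `age`, `fnlwgt`, and `education-num` rather than the raw numbers.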

In [7]:
test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names, cont_names=cont_names)

In [8]:
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())
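The `split_by_idx` call holds out rows 800 to 999 for validation and trains on everything else. The sketch below shows that index logic in plain Python; the function is a hypothetical stand-in rather than fastai's code, and the row count of 2000 is an arbitrary toy value.

```python
def split_by_idx(n_rows, valid_idx):
    """Return (train indices, validation indices) for a dataset of n_rows rows."""
    valid = set(valid_idx)
    train = [i for i in range(n_rows) if i not in valid]
    return train, sorted(valid)


# n_rows=2000 is a toy value chosen for illustration only
train_idx, valid_idx = split_by_idx(n_rows=2000, valid_idx=range(800, 1000))

print(len(train_idx), len(valid_idx))
```

Note that the test set built earlier with `df.iloc[800:1000]` covers the same rows as this validation split.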

In [9]:
data.show_batch(rows=10)

workclass,education,marital-status,occupation,relationship,race,education-num_na,age,fnlwgt,education-num,target
Private,Assoc-voc,Married-civ-spouse,Adm-clerical,Other-relative,White,False,-1.2891,-0.4098,0.3599,<50k
Private,Some-college,Never-married,Other-service,Own-child,White,False,-1.509,-1.2826,-0.0312,<50k
Local-gov,12th,Married-civ-spouse,Adm-clerical,Wife,White,False,-0.6294,-1.3385,-0.8135,<50k
Private,HS-grad,Married-civ-spouse,Other-service,Wife,White,False,-0.1163,-0.2701,-0.4224,>=50k
Private,Bachelors,Never-married,Exec-managerial,Not-in-family,White,False,-0.6294,0.1592,1.1422,<50k
?,10th,Divorced,?,Not-in-family,White,False,2.4491,0.7739,-1.5958,<50k
Private,HS-grad,Married-civ-spouse,Transport-moving,Own-child,Other,False,-0.9959,-0.7288,-0.4224,<50k
Self-emp-not-inc,HS-grad,Married-spouse-absent,Craft-repair,Not-in-family,White,False,0.6166,-0.5219,-0.4224,<50k
Local-gov,Assoc-acdm,Married-civ-spouse,Protective-serv,Husband,White,False,-0.776,-0.1371,0.7511,<50k
Self-emp-not-inc,Bachelors,Married-civ-spouse,Exec-managerial,Own-child,White,False,0.6166,-0.6752,1.1422,<50k


## Model Training

In [10]:
learn = tabular_learner(data, layers=[200,100], metrics=accuracy)

In [11]:
learn.fit(1, 1e-2)

epoch,train_loss,valid_loss,accuracy,time
0,0.362504,0.399898,0.835,00:04


In [12]:
row = df.iloc[0]  # sample input data for testing

learn.predict(row)

(Category tensor(0), tensor(0), tensor([0.5268, 0.4732]))
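The tuple returned by `learn.predict` holds the predicted category, its class index, and the per-class probabilities. As a rough illustration of how the three relate, the sketch below recovers the label from the probabilities using plain lists. The class order `['<50k', '>=50k']` is an assumption, chosen because it is consistent with index 0 mapping to `<50k` for this row later in the notebook.

```python
# Assumed class order; index 0 -> '<50k' matches the prediction for this row
classes = ["<50k", ">=50k"]

# Probabilities copied from the output above
probs = [0.5268, 0.4732]

# The predicted class is the index with the highest probability
pred_idx = max(range(len(probs)), key=probs.__getitem__)
pred_label = classes[pred_idx]
confidence = probs[pred_idx]

print(pred_idx, pred_label, confidence)
```

With only one epoch of training, the two probabilities are close, so this particular prediction is not a confident one.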

## Create BentoService for model serving

In [13]:
%%writefile tabular_csv.py

from bentoml import env, api, artifacts, BentoService
from bentoml.frameworks.fastai import FastaiModelArtifact
from bentoml.adapters import DataframeInput


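# Pip dependencies to install in the serving environment
@env(pip_packages=['fastai'])
# The trained fastai learner is packed under the name 'model'
@artifacts([FastaiModelArtifact('model')])
class FastaiTabularModel(BentoService):

    # batch=True: `df` may hold several rows; return one result per row
    @api(input=DataframeInput(), batch=True)
    def predict(self, df):
        results = []
        for _, row in df.iterrows():
            # learn.predict returns (category, class index, probabilities)
            prediction = self.artifacts.model.predict(row)
            results.append(prediction[0].obj)  # keep only the label, e.g. '<50k'
        return results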

Overwriting tabular_csv.py
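The `predict` API above receives a batch of rows and must return one result per row, in the same order. The following sketch exercises that contract without fastai or BentoML installed. `StubLearner` and `StubCategory` are hypothetical stand-ins for the packed learner and its category object, and the stub's decision rule is arbitrary.

```python
class StubCategory:
    """Hypothetical stand-in for a fastai Category; only `.obj` is used."""

    def __init__(self, obj):
        self.obj = obj


class StubLearner:
    """Hypothetical stand-in for the packed learner (arbitrary rule, not a real model)."""

    def predict(self, row):
        label = ">=50k" if row["education-num"] >= 13 else "<50k"
        # Mirror the (category, class index, probabilities) tuple shape
        return StubCategory(label), None, None


def predict_batch(model, rows):
    """Same loop as the service's predict method: one label per input row."""
    results = []
    for row in rows:
        prediction = model.predict(row)
        results.append(prediction[0].obj)
    return results


rows = [{"education-num": 12.0}, {"education-num": 14.0}]
labels = predict_batch(StubLearner(), rows)

print(labels)
```

Because the learner is called once per row, throughput scales linearly with batch size; that is acceptable for a demo but worth revisiting for high-traffic services.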


## Save BentoService to file archive

In [14]:
# 1) import the custom BentoService defined above
from tabular_csv import FastaiTabularModel

# 2) `pack` it with required artifacts
svc = FastaiTabularModel()
svc.pack('model', learn)

# 3) save your BentoService
saved_path = svc.save()

[2020-09-22 16:38:32,325] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
[2020-09-22 16:38:33,176] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..


[2020-09-22 16:38:36,944] INFO - BentoService bundle 'FastaiTabularModel:20200922163833_30289D' saved to: /Users/bozhaoyu/bentoml/repository/FastaiTabularModel/20200922163833_30289D


## REST API Model Serving


To start a REST API model server with the BentoService saved above, use the `bentoml serve` command:

In [15]:
!bentoml serve FastaiTabularModel:latest

[2020-09-22 16:56:10,329] INFO - Getting latest version FastaiTabularModel:20200922163833_30289D
[2020-09-22 16:56:10,330] INFO - Starting BentoML API server in development mode..
[2020-09-22 16:56:11,014] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
 * Serving Flask app "FastaiTabularModel" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
[2020-09-22 16:56:23,936] INFO - {'service_name': 'FastaiTabularModel', 'service_version': '20200922163833_30289D', 'api': 'predict', 'task': {'data': {}, 'task_id': 'd93bf027-f1db-4eef-bff9-c60e96d394ba', 'batch': 1, 'http_headers': (('Host', 'localhost:5000'), ('User-Agent', 'curl/7.65.3'), ('Accept', '*/*'), ('Content-Type', 'application/json'), ('Content-Length', '370'))}, 'result': {'data': '["

If you are running this notebook in Google Colab, you can start the dev server with the `--run-with-ngrok` option to gain access to the API through a public endpoint managed by [ngrok](https://ngrok.com/):

In [None]:
!bentoml serve FastaiTabularModel:latest --run-with-ngrok

### Send prediction request to the REST API server

#### JSON Request

```bash
curl -X POST \
  http://localhost:5000/predict \
  -H 'Content-Type: application/json' \
  -d '[{
  "age": 49,
  "workclass": "Private",
  "fnlwgt": 101320,
  "education": "Assoc-acdm",
  "education-num": 12.0,
  "marital-status": "Married-civ-spouse",
  "occupation": "",
  "relationship": "Wife",
  "race": "White",
  "sex": "Female",
  "capital-gain": 0,
  "capital-loss": 1902,
  "hours-per-week": 40,
  "native-country": "United-States",
  "salary": ">=50k"
}]'
```
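The same JSON request can be sent from Python. Here is a minimal sketch using only the standard library; `build_predict_request` is a hypothetical helper name, and the URL assumes the dev server started above is listening on `localhost:5000`. The actual network call is left commented out so the snippet can run without a server.

```python
import json
import urllib.request


def build_predict_request(records, url="http://localhost:5000/predict"):
    """Build a POST request whose body is a JSON list of row records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same record as in the curl example
record = {
    "age": 49,
    "workclass": "Private",
    "fnlwgt": 101320,
    "education": "Assoc-acdm",
    "education-num": 12.0,
    "marital-status": "Married-civ-spouse",
    "occupation": "",
    "relationship": "Wife",
    "race": "White",
    "sex": "Female",
    "capital-gain": 0,
    "capital-loss": 1902,
    "hours-per-week": 40,
    "native-country": "United-States",
    "salary": ">=50k",
}

request = build_predict_request([record])

# Uncomment once the API server is running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

The body is a list so that several rows can be scored in a single request.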

#### CSV Request

```bash
curl -X POST "http://127.0.0.1:5000/predict" \
    -H "Content-Type: text/csv" \
    --data-binary @test.csv
```
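The CSV request expects a local `test.csv` file, whose contents are not shown in this notebook. As a sketch, the snippet below builds CSV text with a header row from a list of records using the standard library. The layout (a header row with the `adult.csv` column names) is an assumption, and `records_to_csv` is a hypothetical helper.

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Same values as in the JSON example (assumed column layout)
records = [
    {
        "age": 49,
        "workclass": "Private",
        "fnlwgt": 101320,
        "education": "Assoc-acdm",
        "education-num": 12.0,
        "marital-status": "Married-civ-spouse",
        "occupation": "",
        "relationship": "Wife",
        "race": "White",
        "sex": "Female",
        "capital-gain": 0,
        "capital-loss": 1902,
        "hours-per-week": 40,
        "native-country": "United-States",
        "salary": ">=50k",
    }
]

csv_text = records_to_csv(records)
print(csv_text)

# To create the file used by the curl command:
# with open("test.csv", "w", newline="") as f:
#     f.write(csv_text)
```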

## Containerize model server with Docker


One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out the Docker containerization feature.

If you already have Docker configured, run the following command to produce a Docker image serving the `FastaiTabularModel` prediction service created above:

In [16]:
!bentoml containerize FastaiTabularModel:latest

[2020-09-22 16:56:43,434] INFO - Getting latest version FastaiTabularModel:20200922163833_30289D
Found Bento: /Users/bozhaoyu/bentoml/repository/FastaiTabularModel/20200922163833_30289D
Tag not specified, using tag parsed from BentoService: 'fastaitabularmodel:20200922163833_30289D'
Building Docker image fastaitabularmodel:20200922163833_30289D from FastaiTabularModel:latest
root in create archive /Users/bozhaoyu/bentoml/repository/FastaiTabularModel/20200922163833_30289D ['Dockerfile', 'FastaiTabularModel', 'FastaiTabularModel/__init__.py', 'FastaiTabularModel/__pycache__', 'FastaiTabularModel/__pycache__/tabular_csv.cpython-37.pyc', 'FastaiTabularModel/artifacts', 'FastaiTabularModel/artifacts/__init__.py', 'FastaiTabularModel/artifacts/model.pkl', 'FastaiTabularModel/bentoml.yml', 'FastaiTabularModel/tabular_csv.py', 'MANIFEST.in', 'README.md', 'bentoml-init.sh', 'bentoml.yml', 'bundled_pip_dependencies', 'bundled_pip

Collecting scipy
  Downloading scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl (25.9 MB)
Collecting beautifulsoup4
  Downloading beautifulsoup4-4.9.1-py3-none-any.whl (115 kB)
Collecting nvidia-ml-py3
  Downloading nvidia-ml-py3-7.352.0.tar.gz (19 kB)
Collecting pyyaml
  Downloading PyYAML-5.3.1.tar.gz (269 kB)
Collecting matplotlib
  Downloading matplotlib-3.3.2-cp37-cp37m-manylinux1_x86_64.whl (11.6 MB)
Collecting Pillow
  Downloading Pillow-7.2.0-cp37-cp37m-manylinux1_x86_64.whl (2.2 MB)
Collecting torchvision
  Downloading torchvision-0.7.0-cp37-cp37m-manylinux1_x86_64.whl (5.9 MB)
Collecting fastprogress>=0.2.1
  Downloading fastprogress-1.0.0-py3-none-any.whl (12 kB)
Collecting spacy>=2.0.18; python_version < "3.8"
  Downloading spacy-2.3.2-cp37-cp37m-manylinux1_x86_64.whl (9.9 MB)
Collectin

Collecting importlib-metadata>=0.20; python_version < "3.8"
  Downloading importlib_metadata-2.0.0-py2.py3-none-any.whl (31 kB)
Collecting zipp>=0.5
  Downloading zipp-3.2.0-py3-none-any.whl (5.1 kB)
Building wheels for collected packages: nvidia-ml-py3, pyyaml, bottleneck, future
  Building wheel for nvidia-ml-py3 (setup.py): started
  Building wheel for nvidia-ml-py3 (setup.py): finished with status 'done'
  Created wheel for nvidia-ml-py3: filename=nvidia_ml_py3-7.352.0-py3-none-any.whl size=19191 sha256=9133f42be2d5c905262136fe2f11fafeee49d48405dfeccf433b1a1f6880596c
  Stored in directory: /tmp/pip-ephem-wheel-cache-3pc2f4g3/wheels/df/99/da/c34f202dc8fd1dffd35e0ecf1a7d7f8374ca05fbcbaf974b83
  Building wheel for pyyaml (setup.py): started
  Building wheel for pyyaml (setup.py): finished with status 'done'
  Created wheel for pyyaml: filename=PyYAML-5.3.1-cp37-cp37m-linux_x86_64.w

Building wheels for collected packages: BentoML
  Building wheel for BentoML (PEP 517): started
  Building wheel for BentoML (PEP 517): finished with status 'done'
  Created wheel for BentoML: filename=BentoML-0.9.0rc0+3.gcebf2015-py3-none-any.whl size=3064091 sha256=762aa6ea85795b1fa82fe6196527c7db0fe6e17da0a48a128cbc2e5b2f846d2d
  Stored in directory: /root/.cache/pip/wheels/a0/45/41/62152db705af4ff47e7a3d6abf6247986eef4aa1b94a58d3b9
Successfully built BentoML
Installing collected packages: BentoML
  Attempting uninstall: BentoML
    Found existing installation: BentoML 0.9.0rc0
    Uninstalling BentoML-0.9.0rc0:
      Successfully uninstalled BentoML-0.9.0rc0
Successfully installed BentoML-0.9.0rc0+3.gcebf2015
 ---> e2f758c32fe8
Step 10/15 : ENV PORT 5000
 ---> Running in bcf40c69a9b0
 ---> 3e6f27372f5d
Step 11/15 : EXPOSE $PORT

In [17]:
!docker run -p 5000:5000 fastaitabularmodel:20200922163833_30289D

[2020-09-23 00:01:08,992] INFO - Starting BentoML API server in production mode..
[2020-09-23 00:01:09,478] INFO - get_gunicorn_num_of_workers: 3, calculated by cpu count
[2020-09-23 00:01:09 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-09-23 00:01:09 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)
[2020-09-23 00:01:09 +0000] [1] [INFO] Using worker: sync
[2020-09-23 00:01:09 +0000] [12] [INFO] Booting worker with pid: 12
[2020-09-23 00:01:09 +0000] [13] [INFO] Booting worker with pid: 13
[2020-09-23 00:01:09 +0000] [14] [INFO] Booting worker with pid: 14
^C
[2020-09-23 00:01:12 +0000] [1] [INFO] Handling signal: int
[2020-09-23 00:01:12 +0000] [14] [INFO] Worker exiting (pid: 14)
[2020-09-23 00:01:12 +0000] [13] [INFO] Worker exiting (pid: 13)
[2020-09-23 00:01:12 +0000] [12] [INFO] Worker exiting (pid: 12)


## Load saved BentoService

`bentoml.load` is the API for loading a BentoML packaged model in Python:

In [20]:
from bentoml import load

svc = load(saved_path)
print(svc.predict(df.iloc[0:1]))

['<50k']


## Launch inference job from CLI

The BentoML CLI supports loading and running a packaged model directly from the command line. With the `DataframeInput` adapter, the CLI command can read input DataFrame data from a CLI argument or from local CSV or JSON files:

In [None]:
!bentoml run FastaiTabularModel:latest predict \
    --input https://raw.githubusercontent.com/bentoml/gallery/master/fast-ai/salary-range-prediction/test.csv

## Deployment Options

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
- [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
- [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
- [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
- [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
- [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
- [Azure Container Instances Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
- [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team operating a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:
- [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
- [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
- [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
- [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
- [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)

