# BentoML Clipper Deployment Guide

[Clipper](http://clipper.ai/) is a low-latency prediction serving system for machine learning.

It provides a powerful way to orchestrate ML model containers and supports features such as [micro-batching](https://www.usenix.org/system/files/conference/nsdi17/nsdi17-crankshaw.pdf), which is critical for building low-latency online model serving systems.
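
To make the micro-batching idea concrete, here is a minimal sketch that groups queued queries into fixed-size batches so a model can score several inputs in one call. It is only an illustration of the concept: the function name `micro_batches` and its `max_batch_size` parameter are hypothetical, and Clipper's own batching logic (described in the linked paper) is more sophisticated than this.

```python
def micro_batches(queries, max_batch_size):
    """Yield lists of at most max_batch_size queries, preserving order."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batch = []
    for query in queries:
        batch.append(query)
        if len(batch) == max_batch_size:
            # The batch is full: hand it to the model as one call.
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


pending = [[5.1, 3.5, 1.4, 0.2], [6.5, 3.0, 5.8, 2.2], [5.9, 3.0, 4.2, 1.5]]
batches = list(micro_batches(pending, max_batch_size=2))
```

Each element of `batches` could then be passed to a single `predict` call instead of issuing one call per query.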

BentoML makes it easier to build custom containers that can be deployed to Clipper. Users can add Clipper-specific API handlers to a prediction service created with BentoML and deploy it to a Clipper cluster. In this guide, we will demonstrate how to deploy a scikit-learn model to Clipper using BentoML.

In [2]:
%reload_ext autoreload
%autoreload 2

In [None]:
!pip install bentoml clipper_admin
!pip install pandas scikit-learn

Train an Iris classifier model:

In [3]:
from sklearn import svm
from sklearn import datasets

clf = svm.SVC()
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)



SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

BentoML provides handler types that are specific to Clipper: ```ClipperBytesHandler```, ```ClipperIntsHandler```, ```ClipperFloatsHandler```, ```ClipperDoublesHandler```, and ```ClipperStringsHandler```. Each corresponds to one of the input types that Clipper supports.
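
The correspondence between handlers and input types can be sketched as a small lookup table. The handler names are the ones listed above; the input-type strings on the left are an assumption based on the names Clipper accepts when registering an application (this guide itself only uses `'floats'`), and the helper `handler_name_for` is hypothetical.

```python
# Handler names come from the list above. The input-type keys are assumed to
# match the strings Clipper accepts for an application's input type.
CLIPPER_HANDLER_BY_INPUT_TYPE = {
    "bytes": "ClipperBytesHandler",
    "integers": "ClipperIntsHandler",
    "floats": "ClipperFloatsHandler",
    "doubles": "ClipperDoublesHandler",
    "strings": "ClipperStringsHandler",
}


def handler_name_for(input_type):
    """Return the name of the BentoML handler for a Clipper input type."""
    try:
        return CLIPPER_HANDLER_BY_INPUT_TYPE[input_type]
    except KeyError:
        raise ValueError("unsupported Clipper input type: %r" % (input_type,))


chosen = handler_name_for("floats")
```

The Iris features are floating-point measurements, which is why the service in this guide uses `ClipperFloatsHandler`.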

Apart from using a Clipper-specific handler, everything else is the same as defining a regular BentoService class:

In [4]:
%%writefile iris_classifier.py
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import PickleArtifact
from bentoml.handlers import DataframeHandler, ClipperFloatsHandler

@artifacts([PickleArtifact('model')])
@env(pip_dependencies=["scikit-learn"])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)
    
    @api(ClipperFloatsHandler)
    def predict_clipper(self, inputs):
        return self.artifacts.model.predict(inputs)

Overwriting iris_classifier.py


In [5]:
# 1) import the custom BentoService defined above
from iris_classifier import IrisClassifier

# 2) `pack` it with required artifacts
svc = IrisClassifier()
svc.pack('model', clf)

# 3) save packed BentoService as archive
saved_path = svc.save()

running sdist
running egg_info
writing requirements to BentoML.egg-info/requires.txt
writing BentoML.egg-info/PKG-INFO
writing top-level names to BentoML.egg-info/top_level.txt
writing dependency_links to BentoML.egg-info/dependency_links.txt
writing entry points to BentoML.egg-info/entry_points.txt
reading manifest file 'BentoML.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'


no previously-included directories found matching 'examples'
no previously-included directories found matching 'tests'
no previously-included directories found matching 'docs'


writing manifest file 'BentoML.egg-info/SOURCES.txt'
running check
creating BentoML-0.4.9+7.g429b9ec.dirty
creating BentoML-0.4.9+7.g429b9ec.dirty/BentoML.egg-info
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/archive
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/artifact
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/cli
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/clipper
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/configuration
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/deployment
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/deployment/sagemaker
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/deployment/serverless
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/handlers
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/migrations
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/migrations/versions
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/proto
creating BentoML-0.4.9+7.g429b9ec.dirty/bentoml/repository
creating B

copying bentoml/server/__init__.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/bento_api_server.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/bento_sagemaker_server.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/gunicorn_server.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/metrics.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/utils.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server
copying bentoml/server/static/swagger-ui-bundle.js -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server/static
copying bentoml/server/static/swagger-ui.css -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/server/static
copying bentoml/utils/__init__.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/utils
copying bentoml/utils/cloudpickle.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/utils
copying bentoml/utils/hybirdmethod.py -> BentoML-0.4.9+7.g429b9ec.dirty/bentoml/utils
copying be

In [6]:
# Test the clipper handler directly with list of floats as input
svc.predict_clipper([X[0]])

array([0])

### Deploy BentoService bundle to Clipper cluster

The sample code below assumes you have Docker installed and running. It starts a local Clipper cluster using Docker.

In [7]:
from clipper_admin import ClipperConnection, DockerContainerManager
cl = ClipperConnection(DockerContainerManager())
cl.start_clipper()

19-11-13:15:43:33 INFO     [docker_container_manager.py:184] [default-cluster] Starting managed Redis instance in Docker
19-11-13:15:43:37 INFO     [docker_container_manager.py:276] [default-cluster] Metric Configuration Saved at /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/tmp_V3qv1.yml
19-11-13:15:43:38 INFO     [clipper_admin.py:162] [default-cluster] Clipper is running
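
A local cluster keeps running in Docker until it is stopped, so it can help to pair startup with teardown. The sketch below wraps a connection in a context manager. The helper name `running_clipper` is hypothetical, and it assumes the connection object exposes a `stop_all()` method alongside `start_clipper()`; check this against the `clipper_admin` version you have installed.

```python
from contextlib import contextmanager


@contextmanager
def running_clipper(conn):
    """Start a Clipper cluster and make sure it is stopped afterwards."""
    conn.start_clipper()
    try:
        yield conn
    finally:
        # Assumption: stop_all() tears down the containers that were started.
        conn.stop_all()


# Usage with a real connection (requires Docker and clipper_admin):
#
#   from clipper_admin import ClipperConnection, DockerContainerManager
#   with running_clipper(ClipperConnection(DockerContainerManager())) as cl:
#       ...  # register applications, deploy models, send queries
```

The rest of this guide keeps the cluster running across cells, so the wrapper is more useful in scripts and tests than in the notebook itself.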


In [8]:
# Register a Clipper application; the BentoService model will be linked to it later
cl.register_application('bentoml-test', 'floats', 'default_pred', 100000)

19-11-13:15:43:58 INFO     [clipper_admin.py:236] [default-cluster] Application bentoml-test was successfully registered
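
The four positional arguments in the call above are the application name, the input type, the default output returned when no prediction is available in time, and the latency objective in microseconds, so `100000` corresponds to 100 ms. The sketch below spells this out with a hypothetical helper, `application_settings`, that takes the objective in milliseconds. The keyword names are assumed to match the `register_application` signature in `clipper_admin`.

```python
def application_settings(name, input_type, default_output, slo_ms):
    """Build keyword arguments for register_application from an SLO in ms."""
    return {
        "name": name,
        "input_type": input_type,
        "default_output": default_output,
        # Clipper expects the latency objective in microseconds.
        "slo_micros": int(slo_ms * 1000),
    }


settings = application_settings("bentoml-test", "floats", "default_pred", slo_ms=100)
# With a live connection `cl`, this is equivalent to the call above:
#   cl.register_application(**settings)
```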


Now you can deploy the saved BentoService bundle using this Clipper connection and BentoML's ```bentoml.clipper.deploy_bentoml``` API, which first builds a Clipper model Docker image containing your BentoService and then deploys it to the cluster:

In [9]:
from bentoml.clipper import deploy_bentoml

clipper_model_name, clipper_model_version = deploy_bentoml(cl, saved_path, "predict_clipper")

[2019-11-13 15:45:49,772] INFO - Step 1/10 : FROM clipper/python-closure-container:0.4.1
[2019-11-13 15:45:49,775] INFO - 

[2019-11-13 15:45:49,777] INFO -  ---> e9b89c285ef8

[2019-11-13 15:45:49,780] INFO - Step 2/10 : COPY . /container
[2019-11-13 15:45:49,782] INFO - 

[2019-11-13 15:45:50,162] INFO -  ---> f402705036ff

[2019-11-13 15:45:50,164] INFO - Step 3/10 : WORKDIR /container
[2019-11-13 15:45:50,166] INFO - 

[2019-11-13 15:45:50,265] INFO -  ---> Running in 2de084367c25

[2019-11-13 15:45:50,590] INFO -  ---> ba98224ca802

[2019-11-13 15:45:50,592] INFO - Step 4/10 : RUN pip install --upgrade numpy && pip install -r /container/bento/requirements.txt
[2019-11-13 15:45:50,595] INFO - 

[2019-11-13 15:45:50,756] INFO -  ---> Running in 400dba1836e1

[2019-11-13 15:45:51,556] INFO - DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop s

[2019-11-13 15:46:07,324] INFO - Collecting grpcio (from bentoml==0.4.9->-r /container/bento/requirements.txt (line 1))

[2019-11-13 15:46:08,219] INFO -   Downloading https://files.pythonhosted.org/packages/0c/47/35cc9f6fd43f8e5ed74fcc6dd8a0cb2e89c118dd3ef7a8ff25e65bf0909f/grpcio-1.25.0-cp27-cp27mu-manylinux2010_x86_64.whl (2.4MB)

[2019-11-13 15:46:08,572] INFO - Collecting cerberus (from bentoml==0.4.9->-r /container/bento/requirements.txt (line 1))

[2019-11-13 15:46:08,605] INFO -   Downloading https://files.pythonhosted.org/packages/90/a7/71c6ed2d46a81065e68c007ac63378b96fa54c7bb614d653c68232f9c50c/Cerberus-1.3.2.tar.gz (52kB)

[2019-11-13 15:46:08,823] INFO - Collecting tabulate (from bentoml==0.4.9->-r /container/bento/requirements.txt (line 1))

[2019-11-13 15:46:08,858] INFO -   Downloading https://files.pythonhosted.org/packages/66/d4/977fdd5186b7cdbb7c43a7aac7c5e4e0337a84cb802e154616f3cfc84563/tabulate-0.8.5.tar.gz (45kB)

[2019-11-13 15:46:09,079] INFO - Collecting humanfr

[2019-11-13 15:46:18,058] INFO -   Downloading https://files.pythonhosted.org/packages/ac/aa/063eca6a416f397bd99552c534c6d11d57f58f2e94c14780f3bbf818c4cf/monotonic-1.5-py2.py3-none-any.whl

[2019-11-13 15:46:18,099] INFO - Collecting Mako (from alembic->bentoml==0.4.9->-r /container/bento/requirements.txt (line 1))

[2019-11-13 15:46:18,148] INFO -   Downloading https://files.pythonhosted.org/packages/b0/3c/8dcd6883d009f7cae0f3157fb53e9afb05a0d3d33b3db1268ec2e6f4a56b/Mako-1.1.0.tar.gz (463kB)

[2019-11-13 15:46:18,538] INFO - Collecting python-editor>=0.3 (from alembic->bentoml==0.4.9->-r /container/bento/requirements.txt (line 1))

[2019-11-13 15:46:18,570] INFO -   Downloading https://files.pythonhosted.org/packages/55/a0/3c0ba1c10f2ca381645dd46cb7afbb73fddc8de9f957e1f9e726a846eabc/python_editor-1.0.4-py2-none-any.whl


[2019-11-13 15:46:18,608] INFO - Collecting docutils<0.16,>=0.10 (from botocore<1.14.0,>=1.13.17->boto3->bentoml==0.4.9->-r /container/bento/requirements.txt (line 1)

[2019-11-13 15:46:40,390] INFO - Building wheels for collected packages: BentoML

[2019-11-13 15:46:40,393] INFO -   Building wheel for BentoML (setup.py): started

[2019-11-13 15:46:40,930] INFO -   Building wheel for BentoML (setup.py): finished with status 'done'

[2019-11-13 15:46:40,932] INFO -   Stored in directory: /root/.cache/pip/wheels/da/67/b3/c91c998ab11d6af43b3c59901aedd6356bc515fe408b4b5a96

[2019-11-13 15:46:40,966] INFO - Successfully built BentoML

[2019-11-13 15:46:41,295] INFO - Installing collected packages: BentoML

[2019-11-13 15:46:41,296] INFO -   Found existing installation: BentoML 0.4.9

[2019-11-13 15:46:41,316] INFO -     Uninstalling BentoML-0.4.9:

[2019-11-13 15:46:41,358] INFO -       Successfully uninstalled BentoML-0.4.9

[2019-11-13 15:46:41,485] INFO - Successfully installed BentoML-0.4.9+7.g429b9ec.dirty

You should consider upgrading via the 'pip install --upgrade pip' command.
[2019-11-13 15:46:42,460] INFO -  ---> 863da1bf0896

[2019-11-13

19-11-13:15:46:45 INFO     [docker_container_manager.py:409] [default-cluster] Found 0 replicas for irisclassifier-predict-clipper:20191113154121-e7d3ce. Adding 1
19-11-13:15:46:46 INFO     [clipper_admin.py:724] [default-cluster] Successfully registered model irisclassifier-predict-clipper:20191113154121-e7d3ce
19-11-13:15:46:46 INFO     [clipper_admin.py:642] [default-cluster] Done deploying model irisclassifier-predict-clipper:20191113154121-e7d3ce.


List all models in your Clipper cluster:

In [10]:
cl.get_all_models()

[u'irisclassifier-predict-clipper:20191113154121-e7d3ce']
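
Each entry in the list above has the form `name:version`. A minimal sketch for splitting such identifiers, assuming that format holds for every entry (the helper `split_model_id` is hypothetical, not part of `clipper_admin`):

```python
def split_model_id(model_id):
    """Split a 'name:version' identifier into a (name, version) tuple."""
    name, sep, version = model_id.partition(":")
    if not sep:
        raise ValueError("expected 'name:version', got %r" % (model_id,))
    return name, version


models = ["irisclassifier-predict-clipper:20191113154121-e7d3ce"]
parsed = [split_model_id(m) for m in models]
```

For the model deployed in this guide, the two parts should correspond to the `clipper_model_name` and `clipper_model_version` values returned by `deploy_bentoml`.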

Link this model to the bentoml-test application created above:

In [11]:
cl.link_model_to_app('bentoml-test', clipper_model_name)

19-11-13:15:47:05 INFO     [clipper_admin.py:303] [default-cluster] Model irisclassifier-predict-clipper is now linked to application bentoml-test


Now you can test sending a prediction request to your Clipper application:

In [12]:
import json

import requests

# Address of the Clipper query frontend
addr = cl.get_query_addr()

# POST a single feature vector to the application's predict endpoint
response = requests.post(
    "http://%s/%s/predict" % (addr, 'bentoml-test'),
    headers={"Content-type": "application/json"},
    data=json.dumps({
        'input': [6.5, 3.0, 5.8, 2.2]
    }))

result = response.json()
if response.status_code != requests.codes.ok:
    # The request itself failed; show the error payload
    print(result)
elif result["default"]:
    # Clipper fell back to the application's default output
    print('A default prediction was returned.')
    print(result)
else:
    print('Prediction Returned:', result)

('Prediction Returned:', {u'default': False, u'output': 2, u'query_id': 0})
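
The `output` field is the class index predicted by the model. A minimal sketch for turning a response like the one above into a readable label, assuming the class order of scikit-learn's Iris dataset (`setosa`, `versicolor`, `virginica`); the helper `describe_prediction` is hypothetical:

```python
# Class order of iris.target_names in scikit-learn's Iris dataset.
IRIS_CLASS_NAMES = ("setosa", "versicolor", "virginica")


def describe_prediction(result):
    """Map a Clipper response dict to a species name or a default notice."""
    if result.get("default"):
        # Clipper returned the application's default output, not a prediction.
        return "default output: %s" % (result.get("output"),)
    return IRIS_CLASS_NAMES[int(result["output"])]


label = describe_prediction({"default": False, "output": 2, "query_id": 0})
```

For the response shown above, `label` is `"virginica"`.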
