# Serve Machine Learning Models with Clipper

This notebook walks you through how to serve machine learning models with [Clipper](http://clipper.ai/). Specifically, we cover:

- Model training
- Clipper cluster creation
- App creation & model deployment
- Model query
- Model replacement
- Model versioning (update & rollback)
- Model replication

Jupyter notebooks are great for demos, but in production you might want to refactor the code into Python scripts!

## Model training

We have to train a model before we can serve it. In this section we train a decision tree regressor on the Boston housing dataset.

In this demo we keep the model in memory, but in real-world applications we might spend some time tuning several models and then serialize the best one to storage. We can then load that model back into memory when serving.
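
The save-and-load round trip mentioned above can be sketched with the standard library's `pickle` module. This is only a minimal sketch: `statistics.NormalDist` stands in for a fitted estimator so the example runs without scikit-learn, and the helper names `save_model` and `load_model` as well as the file name are hypothetical. A fitted scikit-learn estimator can be passed through the same two helpers.

```python
import os
import pickle
import tempfile
from statistics import NormalDist


def save_model(model, path):
    """Serialize a fitted model object to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously serialized model back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for "the best model": a distribution fitted to three toy targets
best_model = NormalDist.from_samples([10.0, 20.0, 30.0])

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "best_model.pkl")
    save_model(best_model, model_path)
    restored_model = load_model(model_path)

print(restored_model == best_model)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.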

In [None]:
from sklearn import datasets
from sklearn.tree import DecisionTreeRegressor

# load data (note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release)
boston = datasets.load_boston()
X = boston.data[:500]
y = boston.target[:500]

# train a regressor
tree_v1 = DecisionTreeRegressor(random_state=2018)
tree_v1.fit(X, y)

## Clipper cluster creation

In [None]:
# First we need to import Clipper
from clipper_admin import ClipperConnection, DockerContainerManager
from clipper_admin.deployers.python import deploy_python_closure

In [None]:
# Create a Clipper connection
clipper_conn = ClipperConnection(DockerContainerManager())

In [None]:
# Start a Clipper cluster or connect to a running one
import requests
try:
    clipper_conn.start_clipper()
except requests.exceptions.HTTPError:
    clipper_conn.connect()

## App creation & model deployment

In [None]:
# Register an app called 'boston'. This creates a REST endpoint
clipper_conn.register_application(name="boston", input_type="doubles",
                                  default_output="-1.0", slo_micros=100000)

In [None]:
# Access the trained model via closure capture
def predict(inputs):
    global model
    pred = model.predict(inputs)
    return [str(p) for p in pred]

In [None]:
# Point to the tree model
model = tree_v1
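
The `predict` function above looks up the global `model` each time it is called, so rebinding `model` in the notebook changes what the function returns locally. One alternative, shown here only as a sketch, is a factory that binds each function to one specific model. The names `make_predict` and `ConstantModel` are hypothetical, and `ConstantModel` is a stand-in for a fitted estimator so the example runs on its own.

```python
def make_predict(model):
    """Return a predict function bound to one specific model.

    The returned function follows the same contract as `predict` above:
    it takes a batch of inputs and returns one string per input.
    """
    def predict(inputs):
        return [str(p) for p in model.predict(inputs)]
    return predict


class ConstantModel:
    """Hypothetical stand-in for a fitted estimator; always predicts `value`."""

    def __init__(self, value):
        self.value = value

    def predict(self, inputs):
        return [self.value for _ in inputs]


# Two independent functions, each holding on to its own model
predict_a = make_predict(ConstantModel(21.5))
predict_b = make_predict(ConstantModel(30.0))

batch = [[0.1, 0.2], [0.3, 0.4]]
print(predict_a(batch))
print(predict_b(batch))
```

With this pattern there is no shared variable to rebind, so it is harder to deploy the wrong model by accident.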

In [None]:
# Deploy the 'predict' function as a model
deploy_python_closure(clipper_conn, name="tree-model",
                      version=1, input_type="doubles", func=predict)

In [None]:
# Route requests for the application 'boston' to the model 'tree-model'
clipper_conn.link_model_to_app(app_name="boston", model_name="tree-model")

## Model query

We can now query the deployed model using either [curl](https://curl.haxx.se/) or Python, as shown here:

In [None]:
import json
inputs = boston.data[-1] # use the last data point for query
headers = {"Content-type": "application/json"}
requests.post("http://localhost:1337/boston/predict", headers=headers,
              data=json.dumps({"input": list(inputs)})).json()

See the **'output'** field in the response? That's the prediction made by our model.
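
When reading the response in code, it can help to tell a real prediction apart from the app's `default_output`, which Clipper returns when no model answers within the latency objective. The sketch below assumes the response JSON carries `output` and `default` fields. The helper name `extract_prediction` and both sample responses are hypothetical, hand-written values, not real outputs.

```python
def extract_prediction(response_json):
    """Return the prediction as a float, or None if the default output was used."""
    if response_json.get("default", False):
        return None
    return float(response_json["output"])


# Hand-written sample responses (illustrative values only)
model_response = {"query_id": 0, "output": 21.6, "default": False}
fallback_response = {
    "query_id": 1,
    "output": -1.0,
    "default": True,
    "default_explanation": "example explanation text",
}

print(extract_prediction(model_response))
print(extract_prediction(fallback_response))
```

In the notebook, the same helper could be applied to the dictionary returned by `.json()` in the cell above.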

## Model replacement

Machine learning models are rarely static. Instead, data science tends to be an iterative process, with new and improved models being developed over time.

Say we found that our tree model is overfitting, and we heard that there is a better algorithm called random forest. We A/B tested both and found that the random forest outperforms the tree. We'd love to replace the existing tree model with a new random forest model.

Note: I don't do any evaluation here; this section simply demos model replacement.
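
For readers who do want to compare two models before replacing one, a minimal sketch of a held-out comparison by mean squared error might look like the following. All numbers are made up for illustration and are not results from this notebook. The function and variable names are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up held-out targets and predictions, for illustration only
y_holdout = [24.0, 21.0, 30.0, 18.0]
tree_predictions = [26.0, 19.0, 33.0, 18.0]
forest_predictions = [25.0, 20.0, 31.0, 18.0]

tree_mse = mean_squared_error(y_holdout, tree_predictions)
forest_mse = mean_squared_error(y_holdout, forest_predictions)

# The model with the lower held-out error is the candidate to deploy
better_model = "forest" if forest_mse < tree_mse else "tree"
print(tree_mse, forest_mse, better_model)
```

In practice the held-out rows must not be part of the data passed to `fit`, otherwise the comparison favors whichever model memorizes the training set best.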

In [None]:
# we train a simple random forest model first
from sklearn.ensemble import RandomForestRegressor
forest_v1 = RandomForestRegressor(random_state=2018, n_estimators=10, n_jobs=-1)
forest_v1.fit(X, y)

In [None]:
# replace the model in the closure function
model = forest_v1

In [None]:
# Deploy the 'predict' function as a model (to a new container); notice the name change
deploy_python_closure(clipper_conn, name="forest-model",
                      version=1, input_type="doubles", func=predict)

In [None]:
# Register a new app called 'boston-new', since an app can be linked to only one model
clipper_conn.register_application(name="boston-new", input_type="doubles",
                                  default_output="-1.0", slo_micros=100000)

In [None]:
# Route requests for the application 'boston-new' to the model 'forest-model'
clipper_conn.link_model_to_app(app_name="boston-new", model_name="forest-model")

In [None]:
# query the new model; notice the endpoint change
requests.post("http://localhost:1337/boston-new/predict", headers=headers,
              data=json.dumps({"input": list(inputs)})).json()

The random forest now predicts a different house price.

## Model update

Say we heard that including more trees in our random forest is likely to give better predictions. We tried it and found that this is indeed the case, so we decide to deploy version 2 of our random forest model.

In [None]:
# train another random forest model with more trees
forest_v2 = RandomForestRegressor(random_state=2018, n_estimators=100, n_jobs=-1)
forest_v2.fit(X, y)

In [None]:
# replace the model in the closure function
model = forest_v2

In [None]:
# Deploy the 'predict' function as a model (to a new container); notice the version change
deploy_python_closure(clipper_conn, name="forest-model",
                      version=2, input_type="doubles", func=predict)

In [None]:
# query the new model: by default the query is routed to the latest version
requests.post("http://localhost:1337/boston-new/predict", headers=headers,
              data=json.dumps({"input": list(inputs)})).json()

## Model rollback

Say that after some time we find that random forest model v2 is inferior to v1 on real-world data, and we decide to roll back to version 1.

In [None]:
# rollback
clipper_conn.set_model_version(name='forest-model', version='1')

In [None]:
# notice the prediction is the same as that of forest-model v1
requests.post("http://localhost:1337/boston-new/predict", headers=headers,
              data=json.dumps({"input": list(inputs)})).json()
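
The update and rollback behavior seen above (a newly deployed version becomes active, and `set_model_version` switches back to an older one) can be mimicked with a toy in-memory registry. This is only a mental model under those assumptions, not Clipper's implementation. The class `ToyModelRegistry` and its methods are hypothetical.

```python
class ToyModelRegistry:
    """In-memory stand-in for version routing; not Clipper's implementation."""

    def __init__(self):
        self._versions = {}  # model name -> {version string: predict function}
        self._active = {}    # model name -> currently active version string

    def deploy(self, name, version, func):
        """Store a new version and make it the active one."""
        version = str(version)
        self._versions.setdefault(name, {})[version] = func
        self._active[name] = version

    def set_model_version(self, name, version):
        """Switch the active version to one that was deployed earlier."""
        version = str(version)
        if version not in self._versions.get(name, {}):
            raise KeyError(f"{name} has no version {version}")
        self._active[name] = version

    def predict(self, name, inputs):
        """Route the query to the active version of the named model."""
        return self._versions[name][self._active[name]](inputs)


registry = ToyModelRegistry()
registry.deploy("forest-model", 1, lambda inputs: ["from v1" for _ in inputs])
registry.deploy("forest-model", 2, lambda inputs: ["from v2" for _ in inputs])

after_update = registry.predict("forest-model", [[0.0]])
registry.set_model_version("forest-model", "1")  # roll back
after_rollback = registry.predict("forest-model", [[0.0]])
print(after_update, after_rollback)
```

Because older versions stay registered, a rollback is just a pointer change and needs no retraining or redeployment.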

## Model replication

Many machine learning models are computationally expensive, and a single instance of the model may not meet the throughput demands of a serving workload. To increase prediction throughput, you can add replicas of a model.

In [None]:
# add replicas to increase throughput
clipper_conn.set_num_replicas('forest-model', num_replicas=10, version='1')
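
To pick a value for `num_replicas`, a rough back-of-the-envelope estimate can be a starting point. The sketch below assumes each replica serves one query at a time with a fixed latency, and it ignores batching, queueing, and network overhead, so treat the result as a first guess to validate with load testing. The function name `replicas_needed` and the workload numbers are hypothetical.

```python
import math


def replicas_needed(target_qps, latency_ms):
    """Estimate the replicas required to sustain `target_qps` queries per second.

    Assumes each replica handles one query at a time and every query takes
    `latency_ms` milliseconds; batching and queueing are ignored.
    """
    if target_qps <= 0 or latency_ms <= 0:
        raise ValueError("target_qps and latency_ms must be positive")
    return math.ceil(target_qps * latency_ms / 1000)


# Hypothetical workload numbers, for illustration only
estimate_even = replicas_needed(target_qps=200, latency_ms=50)
estimate_rounded_up = replicas_needed(target_qps=150, latency_ms=50)
print(estimate_even, estimate_rounded_up)
```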

## Stop Clipper

In [None]:
# clipper_conn.stop_all()