## Dynamic Inference:
Inference is the process of using a pre-trained model to make predictions on unseen data.
Dynamic inference means making those predictions on demand, through a server.
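
To make "predictions on demand, using a server" concrete before introducing any serving framework, here is a minimal sketch built only on the Python standard library. Everything in it is an illustrative assumption rather than part of the tutorial's stack: the stand-in `predict` function simply averages its inputs, and the `/predict` path, the JSON field names, and the helper names `query` and `demo` are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


def predict(xs):
    # Stand-in for a trained model: returns the mean of the input features.
    return sum(xs) / len(xs)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, e.g. {"input": [1.0, 2.0, 3.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"output": predict(payload["input"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # Silence per-request logging for the demo.


def query(port, features):
    # Send one prediction request and decode the JSON reply.
    req = Request(
        "http://127.0.0.1:%d/predict" % port,
        data=json.dumps({"input": features}).encode(),
        headers={"Content-Type": "application/json"},
    )
    opener = build_opener(ProxyHandler({}))  # Ignore any configured HTTP proxy.
    with opener.open(req, timeout=5) as resp:
        return json.loads(resp.read())


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return query(server.server_address[1], [1.0, 2.0, 3.0])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` starts the server, sends one request, and shuts the server down again. A production serving system adds what this sketch leaves out, such as batching, latency objectives, and model versioning.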

The tutorial below demonstrates how to serve our Lending Club model trained earlier using a low-latency prediction serving system called **clipper** ([docs](http://clipper.ai/), [implementation](https://github.com/ucbrise/clipper)). **clipper** can be hosted on your favorite cloud provider or on-premises.

In [1]:
import logging, xgboost as xgb, numpy as np
from sklearn.metrics import mean_absolute_error
import joblib
import pandas as pd
from datetime import datetime
import pickle
import time
import matplotlib.pyplot as plt
plt.show(block=True)

from clipper_admin import ClipperConnection, DockerContainerManager
clipper_conn = ClipperConnection(DockerContainerManager())
print("Start Clipper...")
clipper_conn.start_clipper()
print("Register Clipper application...")
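# Arguments: application name, input type, default output, latency SLO in microseconds (100000 = 100 ms).
clipper_conn.register_application(name='xgboost-airlines', input_type='doubles',
    default_output='default_pred', slo_micros=100000)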

18-08-16:22:43:45 INFO     [docker_container_manager.py:151] [default-cluster] Starting managed Redis instance in Docker


Start Clipper...


18-08-16:22:43:49 INFO     [docker_container_manager.py:229] [default-cluster] Metric Configuration Saved at /private/var/folders/kv/w56d6z9j4c79zvw8c8jsn6hw0000gn/T/tmp74l30e1z.yml
18-08-16:22:43:50 INFO     [clipper_admin.py:138] [default-cluster] Clipper is running
18-08-16:22:43:50 INFO     [clipper_admin.py:215] [default-cluster] Application xgboost-airlines was successfully registered


Register Clipper application...


In [2]:
training_examples = pd.read_pickle("../data/processed/airlines_training_examples.pkl")
with open("../data/processed/airlines_training_targets.pkl", 'rb') as f1:
    training_targets = pickle.load(f1)
test_examples = pd.read_pickle("../data/processed/airlines_test_examples.pkl")

def get_train_points():
    return training_examples.values.tolist()

def get_test_points(start_row_index, end_row_index):
    return test_examples.iloc[start_row_index:end_row_index].values.tolist()

def get_test_point(row_index):
    return test_examples.iloc[row_index].tolist()

In [3]:
# Create a training matrix.
dtrain = xgb.DMatrix(get_train_points(), label=training_targets)
# We then create parameters, watchlist, and specify the number of rounds
# This is code that we use to build our XGBoost Model, and your code may differ.
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
watchlist = [(dtrain, 'train')]
num_round = 2
bst = xgb.train(param, dtrain, num_round, watchlist)

[0]	train-error:0.378541
[1]	train-error:0.368341


In [4]:
def predict(xs):
    result = bst.predict(xgb.DMatrix(xs))
    return result 
# make predictions
predictions = predict(test_examples.values)
print("Predict instances in test set using custom defined scoring function...")
predictions

Predict instances in test set using custom defined scoring function...


array([0.8778308 , 0.8778308 , 0.86350435, ..., 0.86350435, 0.8778308 ,
       0.8778308 ], dtype=float32)

In [5]:
from clipper_admin.deployers import python as python_deployer
# We specify which packages to install in the pkgs_to_install arg.
# For example, if we wanted to install xgboost and psycopg2, we would use
# pkgs_to_install = ['xgboost', 'psycopg2']
print("Deploy predict function closure using Clipper...")
python_deployer.deploy_python_closure(clipper_conn, name='xgboost-model', version=1,
    input_type="doubles", func=predict, pkgs_to_install=['xgboost'])

18-08-16:22:43:52 INFO     [deployer_utils.py:41] Saving function to /var/folders/kv/w56d6z9j4c79zvw8c8jsn6hw0000gn/T/tmpessyhz0pclipper
18-08-16:22:43:52 INFO     [deployer_utils.py:51] Serialized and supplied predict function
18-08-16:22:43:52 INFO     [python.py:192] Python closure saved
18-08-16:22:43:52 INFO     [python.py:206] Using Python 3.6 base image
18-08-16:22:43:52 INFO     [clipper_admin.py:467] [default-cluster] Building model Docker image with model data from /var/folders/kv/w56d6z9j4c79zvw8c8jsn6hw0000gn/T/tmpessyhz0pclipper


Deploy predict function closure using Clipper...


18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster] Step 1/3 : FROM clipper/python36-closure-container:develop
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster]  ---> 0fac6e6e8242
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster] Step 2/3 : RUN apt-get -y install build-essential && pip install xgboost
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster]  ---> Using cache
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster]  ---> 761b4e2e5cea
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster] Step 3/3 : COPY /var/folders/kv/w56d6z9j4c79zvw8c8jsn6hw0000gn/T/tmpessyhz0pclipper /model/
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster]  ---> e271dde65415
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster] Successfully built e271dde65415
18-08-16:22:43:53 INFO     [clipper_admin.py:472] [default-cluster] Successfully tagged default-cluster-xgboost-model:1
18-08

In [6]:
print("Link Clipper connection to model application...")
clipper_conn.link_model_to_app('xgboost-airlines', 'xgboost-model')

18-08-16:22:44:01 INFO     [clipper_admin.py:277] [default-cluster] Model xgboost-model is now linked to application xgboost-airlines


Link Clipper connection to model application...


In [7]:
import requests, json
# Get Address
addr = clipper_conn.get_query_addr()
print("Model predict for a single instance via Python requests POST request & parse response...")

# Post Query
response = requests.post(
     "http://%s/%s/predict" % (addr, 'xgboost-airlines'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': get_test_point(0)
     }))
result = response.json() 
result

Model predict for a single instance via Python requests POST request & parse response...


{'query_id': 0, 'output': 0.8778308, 'default': False}
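
The response dictionary above carries a `default` flag. When it is `True`, `output` holds the application's default output (`'default_pred'` in this notebook) instead of a model prediction. A minimal sketch of a helper that makes the distinction explicit follows; the name `extract_prediction` is hypothetical and the field layout is assumed to match the response shown above.

```python
def extract_prediction(result):
    """Return the model output as a float, or None if the default output was served."""
    if result.get("default", False):
        # The default output was returned, e.g. because no prediction
        # arrived within the latency SLO.
        return None
    return float(result["output"])
```

With the `result` from the cell above, `extract_prediction(result)` yields the numeric score; a fallback response yields `None`, so callers cannot mistake the placeholder string for a prediction.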

In [8]:
import requests, json, numpy as np
print("Model predict for a single instance via Python requests POST request...")
headers = {"Content-type": "application/json"}
requests.post("http://localhost:1337/xgboost-airlines/predict", headers=headers, data=json.dumps({"input": get_test_point(0)})).json()

Model predict for a single instance via Python requests POST request...


{'query_id': 1, 'output': 0.8778308, 'default': False}

In [9]:
import requests, json, numpy as np
print("Model predict for a batch of instances via Python requests POST request...")
headers = {"Content-type": "application/json"}
requests.post("http://localhost:1337/xgboost-airlines/predict", headers=headers, data=json.dumps({"input_batch": get_test_points(0,2)})).json()

Model predict for a batch of instances via Python requests POST request...


{'batch_predictions': [{'query_id': 2, 'output': 0.8778308, 'default': False},
  {'query_id': 3, 'output': 0.8778308, 'default': False}]}
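
Single and batch requests differ only in the payload key (`input` versus `input_batch`) and in the shape of the reply. The sketch below builds either payload and flattens a batch reply into a list of outputs. The helper names `build_payload` and `batch_outputs` are hypothetical, and the reply layout is assumed to match the output shown above.

```python
import json


def build_payload(rows):
    """Serialize one feature vector, or a list of feature vectors, to a JSON request body."""
    # A list whose first element is itself a list is treated as a batch.
    is_batch = bool(rows) and isinstance(rows[0], (list, tuple))
    key = "input_batch" if is_batch else "input"
    return json.dumps({key: rows})


def batch_outputs(reply):
    """Collect the 'output' field of each entry in a batch reply, in order."""
    return [item["output"] for item in reply["batch_predictions"]]
```

For example, `build_payload(get_test_points(0, 2))` produces the same body as the `json.dumps` call in the cell above.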

In [10]:
get_test_point(0)
print("Model predict for a single instance via curl...")
!curl -X POST --header "Content-Type:application/json" -d '{"input": [16.0, 1995.0, 1.0, 1.0, 257.0, 1670.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}' 127.0.0.1:1337/xgboost-airlines/predict

Model predict for a single instance via curl...
{"query_id":4,"output":"default_pred","default":true,"default_explanation":"Failed to retrieve a prediction response within the specified latency SLO"}
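
The response above is the application's default output: the query did not return within the latency SLO given when the application was registered (100000 microseconds, that is, 100 ms). One possible way for a client to cope is to retry a few times. The sketch below assumes a caller-supplied `send` function (hypothetical) that performs the POST and returns the decoded JSON; the retry count and delay are arbitrary illustrations, not Clipper recommendations.

```python
import time


def predict_with_retry(send, payload, retries=3, delay_s=0.5):
    """Call send(payload) until the reply is not a default output, or retries run out."""
    reply = send(payload)
    for _ in range(retries):
        if not reply.get("default", False):
            break
        time.sleep(delay_s)  # Wait briefly before asking again.
        reply = send(payload)
    return reply
```

If every attempt falls back to the default, the last default reply is returned unchanged, so the caller can still read its `default_explanation` field.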

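The curl request above fell back to the default output because no prediction was returned within the latency SLO. To investigate, inspect the running instance or the Clipper logs: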

In [11]:
# todo: insert link to clipper troubleshooting
# clipper_conn.inspect_instance()
# clipper_conn.get_clipper_logs()

In [12]:
print("Shutting down Clipper connection.")
clipper_conn.stop_all()

Shutting down Clipper connection.


18-08-16:23:17:52 INFO     [clipper_admin.py:1278] [default-cluster] Stopped all Clipper cluster and all model containers


In [13]:
# remove all containers (docker rm only removes stopped containers):
!docker rm $(docker ps -a -q)

32d06b2599e6
52087d45eecc
b145c468af6a
502fa77bee03
2c859fbc9709
13877d0f7df0


In [None]:
!docker ps

In [None]:
# stop all containers:
# docker kill $(docker ps -q)

# remove all containers
# !docker rm $(docker ps -a -q)

# remove all docker images
# docker rmi $(docker images -q)