# Installing Packages on Your Clipper Container

*This tutorial assumes that you have already installed `clipper_admin` and its dependencies.*

Sometimes you may want to install extra packages (such as XGBoost) in your model containers. Clipper supports this by letting you specify the names of packages to install with pip when the container is built. This tutorial walks through that process using XGBoost, which must be installed this way.

XGBoost must also be installed on the machine running this notebook (and on any machine you use to deploy an XGBoost model to Clipper).
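
Before going further, it can help to confirm that the packages you plan to use are importable locally. The following is a minimal sketch, assuming Python 3; `missing_packages` is a hypothetical helper, not part of `clipper_admin`. Note that it checks import names, which can differ from the names you pass to pip.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found locally."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


# An empty list means every listed package is importable on this machine.
print(missing_packages(["json", "xgboost"]))
```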

In [1]:
import numpy as np
# Define utility function to generate test inputs for models
def get_test_point():
    return [np.random.randint(255) for _ in range(784)]

In [2]:
import os
import sys
import requests
import json
sys.path.insert(0, "../../clipper_admin/")
from clipper_admin.deployers.deployer_utils import save_python_function
sys.path.insert(0, "../../integration-tests/")
# create_docker_connection returns a valid Clipper connection.
from test_utils import (create_docker_connection, BenchmarkException, headers)
app_name = "xgboost-test"
model_name = "xgboost-model"
try:
    # Create a clipper connection.
    clipper_conn = create_docker_connection(
        cleanup=True, start_clipper=True)
    # Register the application with a name, input type, default output,
    # and latency objective (100000 microseconds, i.e. 100 ms)
    clipper_conn.register_application(app_name, "integers",
                                     "default_pred", 100000)
    # Get address
    addr = clipper_conn.get_query_addr()
    # Check that it is up
    response = requests.post(
        "http://%s/%s/predict" % (addr, app_name),
        headers=headers,
        data=json.dumps({
            'input': get_test_point()
        }))
    result = response.json()
    if response.status_code != requests.codes.ok:
        print("Error: %s" % response.text)
        raise BenchmarkException("Error creating app %s" % app_name)
except Exception as e:
    raise e

18-03-08:21:45:41 INFO     [test_utils.py:64] Creating DockerContainerManager
18-03-08:21:46:04 INFO     [clipper_admin.py:1206] Stopped all Clipper cluster and all model containers
18-03-08:21:46:04 INFO     [test_utils.py:80] Starting Clipper
18-03-08:21:46:04 INFO     [docker_container_manager.py:106] Starting managed Redis instance in Docker
18-03-08:21:46:08 INFO     [clipper_admin.py:111] Clipper is running
18-03-08:21:46:09 INFO     [clipper_admin.py:186] Application xgboost-test was successfully registered


Clipper is now running in Docker, with an application registered as `xgboost-test`.
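
The health check in the cell above posts a JSON body with an `input` field to the application's `predict` endpoint. As a rough sketch of how that request is put together, the hypothetical helper below builds the URL and body without sending anything; the address `localhost:1337` is an assumption used only for illustration, so use the value returned by `get_query_addr()` in practice.

```python
import json


def build_predict_request(addr, app_name, input_vector):
    """Return the URL and JSON body for a predict request (nothing is sent)."""
    url = "http://%s/%s/predict" % (addr, app_name)
    body = json.dumps({"input": input_vector})
    return url, body


# Hypothetical address and a tiny input, for illustration only.
url, body = build_predict_request("localhost:1337", "xgboost-test", [0, 1, 2])
print(url)
print(body)
```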

Let's build our XGBoost Model. Your code may be different depending on your use case.

In [3]:
import xgboost as xgb
version = 1
# Create a training matrix
dtrain = xgb.DMatrix(get_test_point(), label=[0])
# Creating parameters
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
watchlist = [(dtrain, 'train')]
num_round = 2
bst = xgb.train(param, dtrain, num_round, watchlist)

[0]	train-error:0
[1]	train-error:0


Now that we have an XGBoost model, `bst`, we need a predict function for the container to call. This function must be defined in the same scope as the model, because it refers to `bst` directly and the model has to be captured along with it.

In [4]:
def predict(xs):
    return [str(bst.predict(xgb.DMatrix(xs)))]
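
To illustrate why the function has to be defined alongside the model, here is a small sketch that uses a stub in place of XGBoost. `StubModel` and `make_predict` are hypothetical names introduced for this example. The sketch returns one string per input row, which is an assumption about how batches are handled; the notebook's `predict` instead wraps the whole prediction array in a single string.

```python
class StubModel(object):
    """Hypothetical stand-in for a trained model; always predicts 0.5."""

    def predict(self, rows):
        return [0.5 for _ in rows]


def make_predict(model):
    # The inner function closes over `model`, so anything that serializes
    # the function must capture the model together with it.
    def predict(xs):
        return [str(p) for p in model.predict(xs)]
    return predict


predict_fn = make_predict(StubModel())
outputs = predict_fn([[1, 2, 3], [4, 5, 6]])
print(outputs)
```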

We need to serialize the function so that it can be copied onto the container. We can do this using a method called `save_python_function` from `clipper_admin.deployers.deployer_utils`.

In [6]:
serialization_dir = save_python_function(model_name, predict)
# Don't be concerned if you get a large output with a bunch of packages. This is normal.

18-03-08:21:46:19 INFO     [deployer_utils.py:49] Saving function to /tmp/clipper/tmpxpJMco
18-03-08:21:46:24 INFO     [deployer_utils.py:58] Anaconda environment found. Verifying packages.
18-03-08:21:46:27 INFO     [deployer_utils.py:158] The following packages in your conda environment aren't available in the linux-64 conda channel the container will use:
ca-certificates==2018.1.18=0, clangdev==5.0.0=default_0, llvmdev==5.0.0=default_0, openmp==5.0.0=0, openssl==1.0.2n=0, xgboost==0.7.post3=py27_1, alabaster==0.7.10=py27h9dd7d6e_0, anaconda-client==1.6.5=py27hc13fba8_0, anaconda==custom=py27h2cfa9e9_0, anaconda-navigator==1.6.9=py27h103b016_0, anaconda-project==0.8.0=py27h9e3d455_0, appnope==0.1.0=py27hb466136_0, appscript==1.0.1=py27h451298e_1, asn1crypto==0.22.0=py27h61af4a7_1, astroid==1.5.3=py27h96f3fd4_0, astropy==2.0.2=py27h87cc2bd_4, babel==2.5.0=py27h7311c9e_0, backports==1.0=py27hb4f9756_1, backports.functools_lru_cache==1.4=py27h2aca819_1, backports.shutil_get_terminal_siz

18-03-08:21:46:27 INFO     [deployer_utils.py:67] Supplied environment details
18-03-08:21:46:36 INFO     [deployer_utils.py:79] Supplied local modules
18-03-08:21:46:36 INFO     [deployer_utils.py:85] Serialized and supplied predict function


Next, we specify the base image to build our container from, then build and deploy the model image. This is where we list the packages we want to install.

In [7]:
# Use this base_image whenever you deploy models whose packages are installed this way.
base_image = 'clipper/python-closure-container:develop'
# This is the usual way to deploy a model, with one addition: the optional
# pkgs_to_install argument. To install several packages, such as xgboost and
# psycopg2, pass them all in the list: pkgs_to_install=['xgboost', 'psycopg2']
clipper_conn.build_and_deploy_model(model_name, version, "integers",
    serialization_dir, base_image, pkgs_to_install=['xgboost'])

18-03-08:21:48:01 INFO     [clipper_admin.py:405] Building model Docker image with model data from /tmp/clipper/tmpxpJMco
18-03-08:21:49:07 INFO     [clipper_admin.py:409] Pushing model Docker image to xgboost-model:1
18-03-08:21:49:08 INFO     [docker_container_manager.py:243] Found 0 replicas for xgboost-model:1. Adding 1
18-03-08:21:49:09 INFO     [clipper_admin.py:583] Successfully registered model xgboost-model:1
18-03-08:21:49:09 INFO     [clipper_admin.py:501] Done deploying model xgboost-model:1.
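
Conceptually, installing packages at build time amounts to adding a `pip install` step on top of the base image. The sketch below shows one way such a Dockerfile could be generated. It is an illustration under stated assumptions, not Clipper's actual build code: `dockerfile_for` and the `model_data` path are hypothetical.

```python
import json


def dockerfile_for(base_image, pkgs_to_install=None):
    """Build Dockerfile text that optionally pip-installs extra packages."""
    lines = ["FROM %s" % base_image,
             "COPY model_data /model/"]  # hypothetical path to the serialized model
    if pkgs_to_install:
        # Exec-form RUN passes the arguments to pip without shell interpretation.
        lines.append("RUN " + json.dumps(["pip", "install"] + list(pkgs_to_install)))
    return "\n".join(lines) + "\n"


print(dockerfile_for("clipper/python-closure-container:develop", ["xgboost"]))
```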


In [8]:
clipper_conn.link_model_to_app(app_name, model_name)

18-03-08:21:49:09 INFO     [clipper_admin.py:229] Model xgboost-model is now linked to application xgboost-test


Now, let's get some predictions.

In [13]:
num_preds = 25
num_defaults = 0
addr = clipper_conn.get_query_addr()
for i in range(num_preds):
    response = requests.post(
        "http://%s/%s/predict" % (addr, app_name),
        headers=headers,
        data=json.dumps({
            'input': get_test_point()
        }))
    result = response.json()
    if response.status_code == requests.codes.ok and result["default"]:
        num_defaults += 1
        print('A default prediction was returned.')
    elif response.status_code != requests.codes.ok:
        print(result)
        raise BenchmarkException(response.text)
    else:
        print('Prediction Returned:', result)

('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 26})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 27})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 28})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 29})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 30})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 31})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 32})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 33})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 34})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 35})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 36})
('Prediction Returned:', {u'default': False, u'output': [0.5], u'query_id': 37})
('Prediction Returned:', {u'
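
When many queries are sent, it can be easier to summarize the responses than to read each one. Below is a minimal sketch, assuming each response is a parsed JSON dict with a boolean `default` field as in the output above. `summarize` is a hypothetical helper, and the second sample entry is made up to show what a default response might look like.

```python
def summarize(results):
    """Count how many responses were defaults versus real model predictions."""
    defaults = sum(1 for r in results if r["default"])
    return {"total": len(results),
            "defaults": defaults,
            "model": len(results) - defaults}


sample = [
    {"default": False, "output": [0.5], "query_id": 26},
    # Hypothetical default response, for illustration only.
    {"default": True, "output": "default_pred", "query_id": 27},
]
summary = summarize(sample)
print(summary)
```

A nonzero `defaults` count would suggest that some queries did not get a model prediction back within the latency objective.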

This concludes this tutorial!