# Clipper TensorFlow Proof of Concept
In this notebook, we show how to serve a TensorFlow model using Clipper. The first part demonstrates how to launch a serving service with the general-purpose TensorFlow container that Clipper provides. The second part shows how to launch a serving service with your own customized container. Part 2 reuses the same TensorFlow model, but the approach extends easily to DyNet, LightGBM, and other platforms.

Before we start, make sure Clipper is installed in your environment. I suggest installing from source rather than from [the wheel file on PyPI](https://pypi.python.org/pypi/clipper_admin/0.2.0), since the PyPI release is somewhat outdated. If you don't know how to install Clipper from source, see Part 0 below; otherwise, skip it.

# Part 0: Install Clipper from Source
```bash
git clone https://github.com/ucbrise/clipper.git
cd clipper/clipper_admin
pip install -r requirements.txt
pip install -e .
```

# Part 1: TensorFlow Model Serving Directly
This part is a proof of concept for serving a TensorFlow model using Clipper. We use a simple logistic regression model written in TensorFlow; other TensorFlow models can be served in a similar way.

## Dataset

The dataset contains 104 small images, each 28 x 28 pixels. Each image is labeled either 0 or 1.

In [43]:
from __future__ import absolute_import, division, print_function
import os
import sys
import time
import json
import requests
import numpy as np

def objective(y, pos_label):
    # prediction objective
    if y == pos_label:
        return 1
    else:
        return 0


def data_transformation(train_path, pos_label):
    trainData = np.genfromtxt(train_path, delimiter=',', dtype=int)
    records = trainData[:, 1:] 
    labels = trainData[:, :1] 
    transformedlabels = [objective(ele, pos_label) for ele in labels]
    return (records, transformedlabels)

(X_train, y_train) = data_transformation("train.data", 3)
print("There are %s images with each size equals to %s." % (len(X_train), len(X_train[0])))
print("The label of the first image in the original dataset is %s.\nThe label of the last image in the original dataset is %s." % (y_train[0], y_train[-1]))

There are 104 images with each size equals to 784.
The label of the first image in the original dataset is 0.
The label of the last image in the original dataset is 1.
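
The loading code above implies a simple file layout: each row of `train.data` is a comma-separated list whose first value is the original class label and whose remaining 784 values are pixel intensities. As a minimal sketch of that layout without NumPy (the helper name `parse_row` and the sample row are made up for illustration):

```python
def parse_row(line, pos_label):
    """Split one CSV row into (pixels, binary_label).

    Assumes the first field is the original class label and the rest are
    pixel values, mirroring what data_transformation does per row.
    """
    fields = [int(value) for value in line.strip().split(",")]
    label, pixels = fields[0], fields[1:]
    # One-vs-rest: the chosen positive class becomes 1, everything else 0.
    return pixels, 1 if label == pos_label else 0


# A shortened, made-up row: label 3 followed by four pixel values.
pixels, label = parse_row("3,0,128,255,7", pos_label=3)
print(pixels, label)
```

A real row would carry 784 pixel values instead of four; the split between label and pixels is the same.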


## Train

In [2]:
import tensorflow as tf
 
def train_logistic_regression(X_train, y_train):
    tf.reset_default_graph()
    sess = tf.Session()
    
    x = tf.placeholder(tf.float32, [None, X_train.shape[1]], name="pixels")
    y_labels = tf.placeholder(tf.int32, [None], name="labels")
    y = tf.one_hot(y_labels, depth=2)

    W = tf.Variable(tf.zeros([X_train.shape[1], 2]), name="weights")
    b = tf.Variable(tf.zeros([2]), name="biases")
    y_hat = tf.matmul(x, W) + b 

    pred = tf.argmax(tf.nn.softmax(y_hat), 1, name="predict_class")  # Softmax

    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=y_hat, labels=y))
    train = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    accuracy = tf.reduce_mean(
        tf.cast(tf.equal(tf.argmax(y_hat, 1), tf.argmax(y, 1)), tf.float32))
    sess.run(tf.global_variables_initializer())
    for i in range(5000):
        sess.run(train, feed_dict={x: X_train, y_labels: y_train})
        if i % 1000 == 0:
            print('Cost , Accuracy')
            print(sess.run(
                [loss, accuracy], feed_dict={
                    x: X_train,
                    y_labels: y_train
                })) 
    return sess

sess = train_logistic_regression(X_train, y_train)

  from ._conv import register_converters as _register_converters


Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

Cost , Accuracy
[6659.0625, 0.63461536]
Cost , Accuracy
[0.0, 1.0]
Cost , Accuracy
[0.0, 1.0]
Cost , Accuracy
[0.0, 1.0]
Cost , Accuracy
[0.0, 1.0]


## Option 1: Save as Checkpoint File

In [6]:
saver = tf.train.Saver()
ckp_prefix = "tf_checkpoint_file/model.ckpt"
save_path = saver.save(sess, ckp_prefix)

## Option 2: Save as SavedModel Format

In [7]:
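# Remove the export folder if it exists; SavedModelBuilder fails on an existing directory
import shutil
export_dir = "frozen_graph/export_dir"
shutil.rmtree(export_dir, ignore_errors=True)
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
builder.save()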

INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: frozen_graph/export_dir/saved_model.pb


'frozen_graph/export_dir/saved_model.pb'

In [8]:
# close the session
sess.close()

## Start Clipper
To start a Clipper instance, use the `clipper_admin` tool.

In [44]:
from clipper_admin import ClipperConnection, DockerContainerManager

clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.stop_all()
# Start Clipper. Running this command for the first time will
# download several Docker containers, so it may take some time.
clipper_conn.start_clipper()

18-04-06:16:45:00 INFO     [clipper_admin.py:1192] Stopped all Clipper cluster and all model containers
18-04-06:16:45:00 INFO     [docker_container_manager.py:106] Starting managed Redis instance in Docker
18-04-06:16:45:03 INFO     [clipper_admin.py:114] Clipper is running


Several containers are now running in the background.
![pic1](./pic/clipper_start.png "pic1")


## Register an Application

In [45]:
# Register an application called "tf-lr-app". This will create
# a prediction REST endpoint at http://localhost:1337/tf-lr-app/predict
# here 1337 is the default port inside Clipper.
clipper_conn.register_application(name="tf-lr-app",
                                  input_type="integers",
                                  default_output="rabbit",
                                  slo_micros=100000)
# Inspect Clipper to see the registered apps
clipper_conn.get_all_apps()

18-04-06:16:47:13 INFO     [clipper_admin.py:189] Application tf-lr-app was successfully registered


[u'tf-lr-app']

## Define a Predict Function
Now we need to define a predict function so that our logistic regression model can serve incoming requests.

In [46]:
# The tensor names ('pixels:0', 'predict_class:0') were set when the network was defined; end users never see them.
# The predict function takes a list of input images and returns a list of strings,
# one prediction per input image.
def predict(sess, inputs):
    preds = sess.run('predict_class:0', feed_dict={'pixels:0': inputs})
    return [str(p) for p in preds]
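
Because `predict` only needs an object with a `run` method, its input/output contract can be checked locally before deploying. The sketch below uses a stand-in session (`FakeSession` is a hypothetical test double, not part of Clipper or TensorFlow) that labels an image 1 when its mean pixel value exceeds 127:

```python
class FakeSession(object):
    """Hypothetical stand-in for tf.Session, for local checks only."""

    def run(self, fetches, feed_dict):
        # Mimic the real call shape: one fetch name, one fed tensor name.
        assert fetches == 'predict_class:0'
        images = feed_dict['pixels:0']
        # Arbitrary rule for illustration: bright images map to class 1.
        return [1 if sum(img) / float(len(img)) > 127 else 0 for img in images]


def predict(sess, inputs):
    preds = sess.run('predict_class:0', feed_dict={'pixels:0': inputs})
    return [str(p) for p in preds]


batch = [[0] * 784, [255] * 784]
outputs = predict(FakeSession(), batch)
print(outputs)  # one string per input image
```

The point of the check is the shape of the result: a list of strings with the same length as the input batch.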

## Deploy
To deploy the TensorFlow model in Clipper, we use the predefined `deploy_tensorflow_model` deployer. As shown below, there are several options: you can deploy your predict function using checkpoint files, SavedModel format files, or a TensorFlow session restored from either of them.

In [47]:
app_name = "tf-lr-app"
model_name = "tf-lr-model"
from clipper_admin.deployers.tensorflow import deploy_tensorflow_model
# option 1: deploy the predict function using SavedModel format files
deploy_tensorflow_model(clipper_conn,
                        model_name,
                        version=1,
                        input_type="integers",
                        func=predict,
                        tf_sess_or_saved_model_path="frozen_graph/export_dir")
# option 2: deploy the predict function using checkpoint files
# Note: in this case tf_sess_or_saved_model_path should be the folder containing the checkpoint files (tf_checkpoint_file), not the checkpoint prefix (tf_checkpoint_file/model.ckpt)
'''
deploy_tensorflow_model(clipper_conn,
                        model_name,
                        version=2,
                        input_type="integers",
                        func=predict,
                        tf_sess_or_saved_model_path="tf_checkpoint_file")
'''
# option 3: restore sess from ckp file
# TODO
# option 4: restore sess from SavedModel
# TODO

# link the app with your deployed model
clipper_conn.link_model_to_app(app_name, model_name)

18-04-06:16:56:52 INFO     [deployer_utils.py:49] Saving function to /tmp/clipper/tmpm1MHcG
18-04-06:16:57:13 INFO     [deployer_utils.py:79] Supplied local modules
18-04-06:16:57:13 INFO     [deployer_utils.py:85] Serialized and supplied predict function
18-04-06:16:57:13 INFO     [tensorflow.py:250] TensorFlow model copied to: tfmodel 
18-04-06:16:57:15 INFO     [clipper_admin.py:391] Building model Docker image with model data from /tmp/clipper/tmpm1MHcG
18-04-06:16:57:20 INFO     [clipper_admin.py:395] Pushing model Docker image to tf-lr-model:1
18-04-06:16:57:21 INFO     [docker_container_manager.py:243] Found 0 replicas for tf-lr-model:1. Adding 1
18-04-06:16:57:21 INFO     [clipper_admin.py:569] Successfully registered model tf-lr-model:1
18-04-06:16:57:21 INFO     [clipper_admin.py:487] Done deploying model tf-lr-model:1.
18-04-06:16:57:22 INFO     [clipper_admin.py:232] Model tf-lr-model is now linked to application tf-lr-app


As you can see, Clipper has started a new container named `tf-lr-model_1-37073` in the background.
![pic2](./pic/after_deploy.png "pic2")

## Query Predictions
For simplicity, we'll generate random images and send them as prediction requests. It should work!

In [48]:
headers = {'Content-type': 'application/json'}

def get_test_point():
    return [np.random.randint(255) for _ in range(784)]

def test_model(clipper_conn, app, version):
    time.sleep(25)
    num_preds = 25
    num_defaults = 0 
    addr = clipper_conn.get_query_addr()
    print(addr)
    for i in range(num_preds):
        response = requests.post(
            "http://%s/%s/predict" % (addr, app),
            headers=headers,
            data=json.dumps({
                'input': get_test_point()
            })) 
        result = response.json()
        print(result)
        if response.status_code == requests.codes.ok and result["default"]:
            num_defaults += 1
        elif response.status_code != requests.codes.ok:
            print(result)
            # Fail fast on a non-200 response
            raise RuntimeError(response.text)

    if num_defaults > 0:
        print("Error: %d/%d predictions were default" % (num_defaults,
                                                         num_preds))

# If you run this from a different process, reconnect to the Clipper instance first:
# clipper_conn.connect()
test_model(clipper_conn, "tf-lr-app", 1)

localhost:1337
{u'default': False, u'output': 1, u'query_id': 0}
{u'default': False, u'output': 1, u'query_id': 1}
{u'default': False, u'output': 1, u'query_id': 2}
{u'default': False, u'output': 1, u'query_id': 3}
{u'default': False, u'output': 1, u'query_id': 4}
{u'default': False, u'output': 1, u'query_id': 5}
{u'default': False, u'output': 1, u'query_id': 6}
{u'default': False, u'output': 1, u'query_id': 7}
{u'default': False, u'output': 1, u'query_id': 8}
{u'default': False, u'output': 1, u'query_id': 9}
{u'default': False, u'output': 1, u'query_id': 10}
{u'default': False, u'output': 1, u'query_id': 11}
{u'default': False, u'output': 1, u'query_id': 12}
{u'default': False, u'output': 1, u'query_id': 13}
{u'default': False, u'output': 1, u'query_id': 14}
{u'default': False, u'output': 1, u'query_id': 15}
{u'default': False, u'output': 1, u'query_id': 16}
{u'default': False, u'output': 1, u'query_id': 17}
{u'default': False, u'output': 1, u'query_id': 18}
{u'default': False, u'outp
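
Each response above carries three fields: `default`, `output`, and `query_id`. A `default` of `True` means Clipper returned the application's default output (here `"rabbit"`) rather than a model prediction. A minimal sketch of building a request body and interpreting a response, with no network access (`build_payload` and `interpret` are hypothetical helper names):

```python
import json


def build_payload(pixels):
    """Serialize one image as the JSON body sent to /<app>/predict."""
    return json.dumps({'input': list(pixels)})


def interpret(result):
    """Return the model output, or None if Clipper fell back to the default."""
    if result['default']:
        return None
    return result['output']


body = build_payload([0] * 784)
print(len(json.loads(body)['input']))

# Response dicts shaped like the output above; the second one is made up.
print(interpret({'default': False, 'output': 1, 'query_id': 0}))
print(interpret({'default': True, 'output': 'rabbit', 'query_id': 1}))
```

Treating a default response as "no prediction" keeps the fallback value from being mistaken for a class label.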

# Part 2: Customized TensorFlow Model Serving
In this section, we show how to launch a serving service with your own Docker image, as you would need to do, for example, to serve a DyNet model. For consistency, we still use the TensorFlow logistic regression example from above.

## Predict Service
Instead of just writing a predict function, you now need to write a prediction service. Clipper communicates with your predict container over RPC. Clipper's Python [RPC implementation](https://github.com/ucbrise/clipper/blob/develop/containers/python/rpc.py) has a very simple interface, as the example below shows. We'll use checkpoint files as input to start the predict service.

In [39]:
# tf_lr_container.py
from __future__ import print_function
import os
import sys
import rpc  # provided by the base image (https://hub.docker.com/r/clipper/py-rpc/); build your container on top of it
import numpy as np
import tensorflow as tf

class TFLRContainer(rpc.ModelContainerBase):
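    def __init__(self, path):
        self.sess = tf.Session('', tf.Graph())
        with self.sess.graph.as_default():
            saver = tf.train.import_meta_graph(path + '.meta')
            # Load the trained variable values; importing the meta graph only rebuilds the graph
            saver.restore(self.sess, path)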

    def predict_ints(self, inputs):
        preds = self.sess.run('predict_class:0', feed_dict = {'pixels:0': inputs})
        return [str(pred) for pred in preds]


def service():
    print('Starting TensorFlow LR container')
    model_name = os.environ["CLIPPER_MODEL_NAME"] # passed to the container at startup
    model_version = os.environ["CLIPPER_MODEL_VERSION"] # passed to the container at startup
    ip = "127.0.0.1"
    port = 7000

    input_type = "ints"
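    model_dir_path = os.environ["CLIPPER_MODEL_PATH"] # "/model" by default
    model_files = os.listdir(model_dir_path)
    assert len(model_files) >= 2
    # os.listdir order is arbitrary, so derive the checkpoint prefix from the .meta file
    meta_files = [f for f in model_files if f.endswith(".meta")]
    assert len(meta_files) >= 1
    fname = os.path.splitext(meta_files[0])[0]
    full_fname = os.path.join(model_dir_path, fname)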
    print(full_fname)
    model = TFLRContainer(full_fname)

    # start rpc service using TFLRContainer
    rpc_service = rpc.RPCService()
    rpc_service.start(model, ip, port, model_name, model_version, input_type)
    

if __name__ == "__main__":
    service()

NameError: name 'rpc' is not defined

## Entry Point
Next, let's create an entry point for our container.

```sh
#!/usr/bin/env sh
# tf_lr_container_entry.sh

IMPORT_ERROR_RETURN_CODE=3

echo "Attempting to run TensorFlow container without installing any dependencies"
echo "Contents of /model"
ls /model/

/bin/bash -c "exec python /container/tf_lr_container.py"
if [ $? -eq $IMPORT_ERROR_RETURN_CODE ]; then
  echo "Running TensorFlow container without installing dependencies fails"
  echo "Will install dependencies and try again"
  conda install -y --file /model/conda_dependencies.txt
  pip install -r /model/pip_dependencies.txt
  /bin/bash -c "exec python /container/tf_lr_container.py"
fi
```
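
The entry point retries only when the Python process exits with code 3. The container script shown earlier does not set that code itself, so for the retry branch to trigger, an `ImportError` would have to be translated into it. One way this could look, as a sketch (the wrapper name `run_with_import_guard` is hypothetical):

```python
import sys

IMPORT_ERROR_RETURN_CODE = 3  # must match the value in tf_lr_container_entry.sh


def run_with_import_guard(main):
    """Run main(); exit with code 3 if a dependency is missing."""
    try:
        main()
    except ImportError as err:
        print("Missing dependency: %s" % err)
        sys.exit(IMPORT_ERROR_RETURN_CODE)


def service():
    # Placeholder for the real service(); imports would happen in here
    # so that a missing package surfaces inside the guard.
    print("service started")


if __name__ == "__main__":
    run_with_import_guard(service)
```

With this pattern, third-party imports such as `tensorflow` would need to happen inside the guarded function rather than at module top level, since a top-level failure exits before the guard runs.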

## DockerFile
Finally, we can write our Dockerfile :)

```dockerfile
# TensorFlowLRDockerfile
FROM clipper/py-rpc:09dfc97

COPY python_container_conda_deps.txt /lib/

RUN conda config --set ssl_verify no \
  && conda install -y -c anaconda cloudpickle=0.5.2 \
  && conda install -y --file /lib/python_container_conda_deps.txt \
  && conda install -y tensorflow

COPY tf_lr_container.py tf_lr_container_entry.sh /container/

CMD ["/container/tf_lr_container_entry.sh"]

# vim: set filetype=dockerfile:
```

In [42]:
# Now you can build the container and push it to your Docker registry
!docker build -t xunzhang/tf_lr_container:latest -f TensorFlowLRDockerfile .


## Deploy
We will use `clipper_conn.build_and_deploy_model` to deploy the service. The Docker container will load and reconstruct the model from the serialized model checkpoint when the container is started.

In [None]:
clipper_conn.build_and_deploy_model(
    name=model_name,
    version=3,
    input_type="ints",
    model_data_path=os.path.abspath("tf_checkpoint_file"),
    base_image="xunzhang/tf_lr_container:latest",
    num_replicas=1
)