# Table of Contents
<div class="toc" style="margin-top: 1em;">
   <ul class="toc-item" id="toc-level0">
      <li>
         <span><a href="#API-Overview" data-toc-modified-id="API-Overview-1"><span class="toc-item-num">1.&nbsp;&nbsp;</span>API Overview</a></span>
         <ul class="toc-item">
            <li><span><a href="#Context" data-toc-modified-id="Context-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Context</a></span></li>
            <li><span><a href="#Creating-a-ClipperConnection" data-toc-modified-id="Create-a-ClipperConnection-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Creating a ClipperConnection</a></span></li>
            <li><span><a href="#Starting-Clipper" data-toc-modified-id="Starting-Clipper-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Starting Clipper</a></span></li>
            <li>
               <span><a href="#Deploying-a-model" data-toc-modified-id="Deploying-a-model-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Deploying a model</a></span>
               <ul class="toc-item">
                  <li>
                     <span><a href="#Creating-a-model" data-toc-modified-id="Create-the-model-1.4.1"><span class="toc-item-num">1.4.1&nbsp;&nbsp;</span>Creating a model</a></span>
                  </li>
                  <li><span><a href="#Deploying-to-Clipper" data-toc-modified-id="Deploying-to-Clipper-1.4.2"><span class="toc-item-num">1.4.2&nbsp;&nbsp;</span>Deploying to Clipper</a></span></li>
                  <li><span><a href="#A-Note-About-Types-[Optional]" data-toc-modified-id="A-Note-About-Types-[Optional]-1.4.3"><span class="toc-item-num">1.4.3&nbsp;&nbsp;</span>A Note About Types [Optional]</a></span></li>
               </ul>
            </li>
            <li><span><a href="#Registering-an-Application" data-toc-modified-id="Registering-an-Application-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Registering an Application</a></span></li>
            <li><span><a href="#Inspecting-Clipper" data-toc-modified-id="Inspecting-Clipper-1.6"><span class="toc-item-num">1.6&nbsp;&nbsp;</span>Inspecting Clipper</a></span></li>
            <li><span><a href="#Updating-the-Model" data-toc-modified-id="Updating-the-Model-1.7"><span class="toc-item-num">1.7&nbsp;&nbsp;</span>Updating the Model</a></span></li>
            <li><span><a href="#Adding-Model-Replicas" data-toc-modified-id="Adding-Model-Replicas-1.8"><span class="toc-item-num">1.8&nbsp;&nbsp;</span>Adding Model Replicas</a></span></li>
         </ul>
      </li>
      <li>
         <span><a href="#Example-Application---Image-Classification" data-toc-modified-id="Example-Application---Image-Classification"><span class="toc-item-num">2.&nbsp;&nbsp;</span>Example Application - Image Classification</a></span>
      </li>
      <li><span><a href="#Example-Application---Custom-Docker-Containers" data-toc-modified-id="Example-Application---Custom-Docker-Containers"><span class="toc-item-num">3.&nbsp;&nbsp;</span>Example Application - Custom Docker Containers</a></span></li>
      <li><span><a href="#Restarting-Clipper" data-toc-modified-id="Restarting-Clipper"><span class="toc-item-num">4.&nbsp;&nbsp;</span>Restarting Clipper</a></span></li>
   </ul>
</div>

<a id='api_overview'></a>
## API Overview

In the first part of this exercise, you will explore how to create and interact with a Clipper cluster. The primary way of managing Clipper is with the Clipper Admin Python tool. This tutorial will walk you through all the things you can do with the Clipper Admin tool as well as explain what happens within Clipper when you issue each command. You can find the complete API documentation for the Clipper Admin tool on our website: <http://docs.clipper.ai>.

**Goal:** Be familiar with how to create and manage a Clipper cluster, and understand what happens when you issue Clipper admin commands.

### Context
The Clipper Admin tool is distributed through Pip. You can install it with `pip install clipper_admin`, but it has already been installed in this notebook for you.

Clipper uses Docker containers as its deployment mechanism. A running Clipper cluster consists of a collection of Docker containers communicating with each other over the network. As you issue commands against Clipper, you are communicating with these containers as well as creating new ones or destroying existing ones. As you explore the Clipper API throughout this exercise, we will illustrate how each command affects the cluster state.

The main API for interacting with Clipper is exposed via a [`ClipperConnection`](http://docs.clipper.ai/en/develop/#clipper-connection) object. This is your handle to a Clipper cluster (this collection of Docker containers). It can be used to start, stop, inspect, and modify the cluster.

In order to create a `ClipperConnection` object, you must provide it with a [`ContainerManager`](http://docs.clipper.ai/en/develop/#container-managers) object. While Docker is becoming an increasingly standard mechanism for deploying applications, there are many different tools for managing a Docker cluster. These tools broadly fall into the category of *Container Orchestration frameworks*. Some popular examples are [Kubernetes](https://kubernetes.io/), [Docker Swarm](https://docs.docker.com/engine/swarm/), and [DC/OS](https://dcos.io/). One of the reasons we run Clipper in Docker containers is to make the system as general as possible and support many different deployment scenarios. Within the Clipper Admin, we abstract away all of the Docker container-specific commands behind the `ContainerManager` interface. The `ClipperConnection` object makes Clipper-specific decisions about how to issue commands, and then makes any changes to the Docker configuration (for example, to launch a container for a newly deployed model) through the `ContainerManager`. To support different container orchestration frameworks that manage Docker containers in different ways, we create different implementations of the `ContainerManager` interface.
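
The division of labor described above can be sketched as a small interface. This is only an illustration of the pattern, not Clipper's actual class definitions: `SketchContainerManager`, `InMemoryManager`, `SketchConnection`, and their method names are all hypothetical.

```python
from abc import ABC, abstractmethod


class SketchContainerManager(ABC):
    """Hypothetical stand-in for the ContainerManager interface."""

    @abstractmethod
    def start_container(self, image):
        """Launch a container from `image` and return an identifier for it."""


class InMemoryManager(SketchContainerManager):
    """Toy backend that only records what it was asked to start."""

    def __init__(self):
        self.started = []

    def start_container(self, image):
        self.started.append(image)
        return "container-%d" % len(self.started)


class SketchConnection:
    """Makes the cluster-level decision, then delegates to the manager."""

    def __init__(self, manager):
        self.manager = manager

    def deploy_model(self, name, version):
        # Cluster-level decision: which image a model maps to
        # (the "name:version" scheme is an assumption for this sketch).
        image = "%s:%s" % (name, version)
        return self.manager.start_container(image)


conn = SketchConnection(InMemoryManager())
container_id = conn.deploy_model("flowercat", 1)
```

Swapping `InMemoryManager` for another subclass would change how containers are launched without touching `SketchConnection`, which is the point of the abstraction.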

Clipper currently provides two `ContainerManager` implementations: the `DockerContainerManager` and the `KubernetesContainerManager`. In this exercise, you will be using the `DockerContainerManager`, which runs Clipper directly on your local Docker instance. This `ContainerManager` is particularly useful for trying out Clipper without needing to set up an enterprise-grade container orchestration framework. The `DockerContainerManager` is not recommended for production use cases.

### Creating a ClipperConnection
To start Clipper, you must first create a [`ClipperConnection`](http://docs.clipper.ai/en/develop/#clipper-connection) object with the type of `ContainerManager` you want to use. In this case, you will be using the `DockerContainerManager`.

In [None]:
from clipper_admin import ClipperConnection, DockerContainerManager
clipper_conn = ClipperConnection(DockerContainerManager())

### Starting Clipper
Now that you have a `ClipperConnection` object, you can start a Clipper cluster.

The following command will start 3 Docker containers:
1. The Query Frontend: The Query Frontend container listens for incoming prediction requests and schedules and routes them to the deployed models.
2. The Management Frontend: The Management Frontend container manages and updates the cluster's internal configuration state, such as tracking which models are deployed and which application endpoints have been registered.
3. A Redis instance: Redis is used to persistently store Clipper's internal configuration state. By default, Redis is started on port 6380 instead of the standard Redis default port 6379 to avoid collisions with any Redis instances that are already running.

<img src="notebook-images/start_clipper.png" width="50%" height="50%" />

> ***NOTE:*** *Because Docker must download the Docker images from the internet (if they are not already cached) before it can start the containers, the first time you run this command it may take a few minutes to complete. After the images have been downloaded, they are cached and future invocations of this command complete much more quickly. Thanks for your patience.*

If you try to start more than one Clipper cluster at once on the same host, the second execution of the command will fail because, by default, the second cluster will try to bind to the same ports as the first one. If you run into problems with the exercise and want to start over, you can completely stop the cluster with [`clipper_conn.stop_all()`](http://docs.clipper.ai/en/v0.3.0/clipper_connection.html#clipper_admin.ClipperConnection.stop_all_model_containers).
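
If you suspect a port collision, you can check whether something is already listening on a port before starting the cluster. The helper below is a minimal sketch using only the Python standard library; `port_is_free` is a hypothetical name and is not part of the Clipper Admin tool. It checks only the Redis port 6380 mentioned above.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if no process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # i.e. when something is already listening on the port.
        return sock.connect_ex((host, port)) != 0


# 6380 is the port Clipper uses for Redis by default (see above).
redis_port_free = port_is_free(6380)
```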

In [None]:
clipper_conn.start_clipper()
clipper_addr = clipper_conn.get_query_addr()

### Deploying a model
At its most basic, a trained model is just a function that takes some input and produces some output. As a result, one way to think about Clipper is as a function server. While these functions are often complex models, Clipper is not restricted to serving machine learning models.

Many machine learning models are trained in commonly used machine learning frameworks such as [Scikit-Learn](http://scikit-learn.org/), [TensorFlow](https://www.tensorflow.org/), [PyTorch](https://pytorch.org/), etc. To support these common use cases, Clipper offers a wide variety of deployers for common machine learning frameworks to simplify deployment. Furthermore, any frameworks that produce models that can be pickled -- this includes Scikit-Learn and XGBoost -- do not even require a special deployer and can be deployed directly with the standard Clipper Python closure deployer.

In this exercise, you'll start by deploying a Scikit-Learn model using the Python closure deployer.

#### Creating a model
You will start by creating and training a three-class logistic regression classifier.

Complete the TODOs in the next cells to build your model!

In [None]:
# The code to train the model and produce the graph comes from this sklearn example:
# http://scikit-learn.org/stable/auto_examples/linear_model/plot_iris_logistic.html
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model, datasets
%matplotlib inline

np.random.seed(5)

iris = datasets.load_iris()
X = iris.data[:, :2]  # We only take the first two features.
Y = iris.target

h = .02  # Step size in the mesh

model = linear_model.LogisticRegression(C=1e5)

Now that you've initialized your model, it's time to train it.

In [None]:
model.fit(X, Y)

In [None]:
# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())
plt.title('3 Way Classifier')
plt.show()

#### Clipper's Model Deployers

One of the goals of Clipper is to make it simple to deploy and maintain machine-learning models in production. The prediction interface that models must implement is very simple, consisting of a single function. And the use of Docker makes it easy to include all of a model's dependencies in a self-contained environment. However, deploying a new type of model still entails writing and debugging a new model container and creating a Docker image.
To make the model deployment process even simpler, Clipper provides a library of model deployers for common types of models. If your model can be deployed with one of these deployers, you no longer need to write a model container, create a Docker image, or even figure out how to save a model. Instead, you provide your trained model directly to the model deployer function within your Python process. The model deployer takes care of saving the model and building a Docker image that is compatible with your model type.

Clipper provides model deployers for many common ML packages including PySpark, PyTorch, TensorFlow, etc. In addition, Clipper provides a Python model deployer that can deploy arbitrary Python functions as long as they can be pickled. Finally, if none of these fit your needs, you can always write a custom model container that can execute arbitrary code.

To keep the base images small, Clipper model containers install only the required dependencies to ensure that a basic model will run. However, if your model requires custom Python packages installed, you can specify these additional packages to the model deployer and it will automatically use Pip to install them inside your model's Docker container. You will use this to install Scikit-Learn when you deploy the flower model.

For more information about model deployers, check out the [documentation](http://docs.clipper.ai/en/v0.3.0/model_deployers.html).

#### Define a prediction function

Now you have a model that takes an array of length 2 as input -- a sepal length and a sepal width -- and returns a flower label. Before you can deploy the model using the Python model deployer, you need to wrap that model in a function that conforms to Clipper's prediction interface.

To improve performance during inference, many machine learning models exploit opportunities for data parallelism in the inference process. Because of this, Clipper tries to provide multiple inputs at once to a deployed model. Therefore, models deployed to Clipper must have a function interface that takes a list of inputs as an argument and returns a list of predictions as strings. Returning predictions as strings provides a lot of flexibility over what your models can return. Commonly, models in Clipper will return either a single number (such as a label or score) or JSON containing a richer representation of the model output (for example, by including confidence estimates of predicted labels).
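
To make the interface concrete, here is a minimal sketch of a prediction function that returns one JSON string per input. The scoring rule `toy_scores` is a made-up stand-in for a trained model and assumes non-negative inputs; only the shape of the interface (list in, list of strings out) comes from the description above.

```python
import json


def toy_scores(x):
    """Hypothetical scoring rule standing in for a trained model."""
    total = sum(x)
    return {"small": 1.0 / (1.0 + total), "large": total / (1.0 + total)}


def predict_json(inputs):
    """Take a list of inputs; return one JSON string per input."""
    outputs = []
    for x in inputs:
        scores = toy_scores(x)
        label = max(scores, key=scores.get)
        outputs.append(json.dumps({"label": label, "confidence": scores[label]}))
    return outputs


predictions = predict_json([[0.1, 0.2], [3.0, 4.0]])
```

Each element of `predictions` is a string, so a client can parse it with `json.loads` to recover both the label and its confidence.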

Define a `predict_flowers` function below that takes a list of inputs and returns their predicted labels as strings.

In [None]:
def get_label_from_class(l):
    # Iris class indices: 0 = setosa, 1 = versicolor, 2 = virginica
    if l == 0:
        return 'Setosa'
    elif l == 1:
        return 'Versicolour'
    else:
        return 'Virginica'


def predict_flowers(flowers):
    labels = "" # TODO: Use the model to predict labels
    return [get_label_from_class(l) for l in labels]

**Solution:**

```python
def predict_flowers(flowers):
    labels = model.predict(flowers)
    return [get_label_from_class(l) for l in labels]
```

Execute the following cell to test your function:

In [None]:
predictions = predict_flowers([X[0], X[101]])
assert predictions[0] == 'Setosa'
assert all(p in ('Setosa', 'Versicolour', 'Virginica') for p in predictions)

Now you can use the Python model deployer to deploy your model to Clipper.

In [None]:
from clipper_admin.deployers import python as python_deployer
python_deployer.deploy_python_closure(
    clipper_conn,
    name="flowercat",  # The name of the model in Clipper
    version=1,  # A unique identifier to assign to this model.
    input_type="floats",  # The type of data the model function expects as input
    func=predict_flowers, # The model function to deploy
    pkgs_to_install=['scipy', 'scikit-learn'] # Packages to install in the new container. Must be a list
)

Clipper deploys each model in its own Docker container. After deploying the model, Clipper uses the DockerContainerManager to start a container for this model and create an RPC connection with the Clipper query frontend, as illustrated below (the changes to the cluster are highlighted in red).

> *Once again, Clipper must download a Docker image from the internet the first time this command is run.*

<img src="notebook-images/deploy_model.png" width="50%" height="50%" />

If you list the Clipper containers, you can see the container running your flowercat model.

In [None]:
!docker ps --filter label=ai.clipper.container.label

#### A Note About Types [Optional]

When you deploy models and register applications, you must specify the input type that the model or application expects. The type that you specify has implications for how Clipper manages input serialization and deserialization. From the user's perspective, the input type affects the behavior of Clipper in two places. First, applications reject requests whose "input" field in the JSON body holds a value of the wrong type. Second, the deployed model function is called with a list of inputs of the specified type.

The input type can be one of the following types:

* "ints": The value of the "input" field in a request must be a JSON list of ints. The model function will be called with a list of numpy arrays of type numpy.int.
* "floats": The value of the "input" field in a request must be a JSON list of doubles. The model function will be called with a list of numpy arrays of type numpy.float32.
* "doubles": The value of the "input" field in a request must be a JSON list of doubles. The model function will be called with a list of numpy arrays of type numpy.float64.
* "bytes": The value of the "input" field in a request must be a Base64 encoded string. The model function will be called with a list of numpy arrays of type numpy.int8.
* "strings": The value of the "input" field in a request must be a string. The model function will be called with a list of strings.
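
As a rough illustration of the client side of these rules, the sketch below builds a request body for each input type. `make_request_body` is a hypothetical helper written for this tutorial, not a function provided by Clipper.

```python
import base64
import json


def make_request_body(input_type, value):
    """Build the JSON body for a single-input request of the given type."""
    if input_type == "bytes":
        # Raw bytes are sent as a Base64-encoded string.
        encoded = base64.b64encode(value).decode("ascii")
    elif input_type == "strings":
        encoded = str(value)
    elif input_type == "ints":
        encoded = [int(v) for v in value]
    elif input_type in ("floats", "doubles"):
        encoded = [float(v) for v in value]
    else:
        raise ValueError("unknown input type: %r" % input_type)
    return json.dumps({"input": encoded})


float_body = make_request_body("floats", [5.1, 3.5])
bytes_body = make_request_body("bytes", b"\x00\x01\x02")
```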

### Registering an Application
You've now deployed a model to Clipper, but you don't have any way to query it yet. Instead of automatically creating a REST endpoint when you deploy a model, Clipper introduces a layer of indirection: the application. Clients query a specific application in Clipper, and the application routes the query to the correct model. One advantage of this choice is it decouples endpoint versioning from model versioning, enabling data scientists to rapidly iterate and update models without stopping or modifying a live application.

A single Clipper cluster can have many applications registered and many models deployed at once.

When you register an application you configure certain elements of the application's behavior. These include:

* The name to give the REST endpoint.
* The input type that the application expects (Clipper will ensure applications only route requests to models with matching input types).
* The latency service level objective (SLO) specified in microseconds. Clipper will manage how it schedules and routes queries for an application based on the specified latency SLO. For example, Clipper uses the specified SLO to configure its batching behavior to balance maximizing resource utilization by using larger batches with ensuring that SLOs can be met. In addition, Clipper will respond to requests by the end of the specified SLO with a default response if it has not received a prediction back from the model yet.
* The default output: Clipper will respond with the default output to requests if a real prediction isn't available by the end of the service level objective.
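
The interaction between the latency SLO and the default output can be pictured with a small, self-contained sketch. This is not how Clipper is implemented; `predict_with_slo` and `slow_model` are hypothetical names, and a thread with a timeout merely stands in for Clipper's scheduling.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def predict_with_slo(predict_fn, x, slo_micros, default_output):
    """Return predict_fn's answer for x, or default_output if the SLO expires."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(predict_fn, [x])
    try:
        # The SLO is given in microseconds; result() expects seconds.
        return future.result(timeout=slo_micros / 1e6)[0]
    except TimeoutError:
        return default_output
    finally:
        # Do not block on the late prediction.
        executor.shutdown(wait=False)


def slow_model(inputs):
    time.sleep(0.2)  # Simulate a model that takes 200 ms per batch
    return ["Setosa" for _ in inputs]


late = predict_with_slo(slow_model, [5.1, 3.5], slo_micros=50000, default_output="Default")
on_time = predict_with_slo(slow_model, [5.1, 3.5], slo_micros=1000000, default_output="Default")
```

With a 50 ms SLO the simulated model is too slow and the caller receives the default output; with a 1 s SLO it receives the real prediction.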

<img src="notebook-images/register_app.png" width="50%" height="50%" />

Register an application to query your classifier:

In [None]:
clipper_conn.register_application(
    name="flowercat-app",
    input_type="floats",
    default_output="Default",
    slo_micros=100000)

When you register an application with Clipper, it creates a REST endpoint for that application:

```
URL: /<app_name>/predict
Method: POST
Data Params: {"input": <input>}
```

If you want to send several inputs at once, you can batch them into a single request by providing a JSON object with the key `input_batch`. Note that this is separate from Clipper's internal batching mechanism, which is used regardless of whether the inputs arrive in a single REST request or in separate requests.

```
URL: /<app_name>/predict
Method: POST
Data Params: {"input_batch": <input_batch>}
```
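
The two request shapes differ only in the JSON key. The sketch below builds the URL and body for either one; `build_predict_request` is a hypothetical helper, and `localhost:1337` is a placeholder address used only for illustration.

```python
import json


def build_predict_request(clipper_addr, app_name, inputs, batch=False):
    """Return (url, body) for a single-input or batched predict request."""
    url = "http://%s/%s/predict" % (clipper_addr, app_name)
    key = "input_batch" if batch else "input"
    return url, json.dumps({key: inputs})


# "localhost:1337" is a placeholder address for this sketch.
url, body = build_predict_request(
    "localhost:1337", "flowercat-app", [[5.1, 3.5], [5.8, 2.7]], batch=True)
```

You could then send the request with `requests.post(url, headers={"Content-type": "application/json"}, data=body)`.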

Try querying the newly created application.

In [None]:
import requests, json
response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'flowercat-app'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': list(X[0]),
     }))
result = response.json()
result

You should see that your application returned the default output of "Default". This is because even though you have deployed a model and registered an application, you have not told Clipper to route requests from the "flowercat-app" application to the "flowercat" model.

You do this by linking the model to the application.
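
The effect of linking can be pictured with a toy routing table. This sketch only mimics the behavior described above (default output until a model is linked); `ToyRouter` is hypothetical and is not Clipper's implementation.

```python
class ToyRouter:
    """Illustrative app-to-model routing table (not Clipper's implementation)."""

    def __init__(self):
        self.apps = {}    # app name -> {"default": str, "model": str or None}
        self.models = {}  # model name -> prediction function

    def register_application(self, name, default_output):
        self.apps[name] = {"default": default_output, "model": None}

    def deploy_model(self, name, func):
        self.models[name] = func

    def link_model_to_app(self, app_name, model_name):
        self.apps[app_name]["model"] = model_name

    def predict(self, app_name, x):
        app = self.apps[app_name]
        if app["model"] is None:
            # No linked model: fall back to the default output.
            return app["default"]
        return self.models[app["model"]]([x])[0]


router = ToyRouter()
router.register_application("flowercat-app", default_output="Default")
router.deploy_model("flowercat", lambda xs: ["Setosa" for _ in xs])
before = router.predict("flowercat-app", [5.1, 3.5])
router.link_model_to_app("flowercat-app", "flowercat")
after = router.predict("flowercat-app", [5.1, 3.5])
```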

<img src="notebook-images/link_model.png" width="50%" height="50%" />

In [None]:
clipper_conn.link_model_to_app(app_name="flowercat-app", model_name="flowercat")

When you query the "flowercat-app" endpoint again, Clipper should return the predicted flower label. Try it with your own input.

In [None]:
response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'flowercat-app'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': "", # TODO: Specify an input
     }))
result = response.json()
result

**Solution:**

```python
response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'flowercat-app'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': list(X[0]),
     }))
result = response.json()
result
```

### Inspecting Clipper
The ClipperConnection object has several methods to inspect various aspects of the Clipper cluster.

You can list all of the applications.

In [None]:
clipper_conn.get_all_apps(verbose=True)

Or all of the models.

In [None]:
clipper_conn.get_all_models(verbose=True)

You can also fetch the raw container logs from all of the Clipper Docker containers. The command will print the paths to the log files for further examination. You can figure out which logs belong to which container based on the unique Docker container ID in the log filename.

In [None]:
clipper_conn.get_clipper_logs()

### Updating the Model
Machine learning models are rarely static. Instead, data science tends to be an iterative process, with new and improved models being developed over time. Clipper supports this workflow by letting you deploy new versions of models. If you look back to where you linked your flowercat model to the application, you'll see that there is no mention of versioning in that method call. Instead, when a new version of a model is deployed, Clipper will automatically start routing requests to the new version.

Create a new version of the "flowercat" model that returns the probabilities that an input is in each class instead.

> *Hint: Check out the [Scikit-Learn Logistic Regression documentation](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.predict_proba).*

In [None]:
def predict_flower_probabilities(flowers):
    # TODO: Return the predicted probabilities.
    return ""

**Solution:**

```python
def predict_flower_probabilities(flowers):
    return model.predict_proba(flowers)
```

In [None]:
predict_flower_probabilities([X[0]])

Deploy this new version of the function as version "2". For this application, you are using a numeric versioning scheme. But Clipper simply treats versions as unique string identifiers, so you could use other versioning schemes (such as Git hashes or semantic versioning). Versions don't even have to be ordered; Clipper just tracks the currently active version.

<img src="notebook-images/update_model.png" width="50%" height="50%" />


In [None]:
python_deployer.deploy_python_closure(
    clipper_conn,
    name="flowercat",
    version=2, # Note that you are specifying a new version
    input_type="floats",
    func=predict_flower_probabilities, # The new predict function
    pkgs_to_install=['scipy', 'scikit-learn']
)

In [None]:
response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'flowercat-app'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': list(X[0]),
     }))
result = response.json()
result

Sometimes the "new and improved" model is not actually improved. If you deploy a model that isn't working well, you can roll back to any previous version. This just changes which version of the model the application routes requests to.

<img src="notebook-images/rollback_version.png" width="50%" height="50%" />


In [None]:
clipper_conn.set_model_version(name="flowercat", version=1)

In [None]:
response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'flowercat-app'),
     headers={"Content-type": "application/json"},
     data=json.dumps({
         'input': list(X[0]),
     }))
result = response.json()
result
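
The versioning behavior in this section (deploying makes a version active, rolling back only changes which version is active) can be summarized in a few lines. The class below is a hypothetical illustration, not Clipper's implementation, and the version string `"a1b2c3d"` is a made-up example of a non-numeric identifier.

```python
class ToyModelVersions:
    """Illustrative version tracker (not Clipper's implementation)."""

    def __init__(self):
        self.versions = {}  # version id (string) -> prediction function
        self.active = None

    def deploy(self, version, func):
        # Versions are opaque identifiers; deploying one makes it active.
        self.versions[str(version)] = func
        self.active = str(version)

    def set_model_version(self, version):
        if str(version) not in self.versions:
            raise KeyError("unknown version: %s" % version)
        self.active = str(version)

    def predict(self, inputs):
        return self.versions[self.active](inputs)


flowercat = ToyModelVersions()
flowercat.deploy(1, lambda xs: ["label" for _ in xs])
flowercat.deploy("a1b2c3d", lambda xs: ["probabilities" for _ in xs])
newest = flowercat.predict([[5.1, 3.5]])
flowercat.set_model_version(1)
rolled_back = flowercat.predict([[5.1, 3.5]])
```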

### Adding Model Replicas
Many machine learning models are computationally expensive and a single instance of the model may not meet the throughput demands of a serving workload. To increase prediction throughput, you can horizontally scale the application by creating additional model replicas. This creates additional Docker containers running the same model. Clipper will automatically load-balance requests across the set of available model replicas.
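
As a rough picture of what load balancing across replicas means, the sketch below spreads requests over four replicas in round-robin order. Round robin is an assumption made for illustration; this tutorial does not specify which policy Clipper actually uses, and the replica names are made up.

```python
import itertools
from collections import Counter


def round_robin(replicas, num_requests):
    """Assign each request to the next replica in turn."""
    order = itertools.cycle(replicas)
    return [next(order) for _ in range(num_requests)]


replicas = ["flowercat-replica-%d" % i for i in range(4)]
assignments = Counter(round_robin(replicas, 12))
```

Here each of the four replicas receives three of the twelve requests, so per-replica load drops as replicas are added.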

Set the number of replicas for the currently active version (1) of the "flowercat" model to 4.

<img src="notebook-images/add_replicas.png" width="50%" height="50%" />

In [None]:
clipper_conn.set_num_replicas("flowercat", num_replicas=4)

If you list the Clipper Docker containers, you should now see four containers based on the image "flowercat:1".

In [None]:
!docker ps --filter label=ai.clipper.container.label

If you want to reduce the number of replicas of a model to free up hardware resources, you can use the same command.

Set the number of replicas for "flowercat" back to 1.

<img src="notebook-images/set_replicas.png" width="50%" height="50%" />

In [None]:
clipper_conn.set_num_replicas("flowercat", num_replicas=1)

In [None]:
!docker ps --filter label=ai.clipper.container.label

## Example Application - Image Classification

In the second part of this exercise, you will deploy a pre-trained [SqueezeNet](https://arxiv.org/abs/1602.07360) model that you will download from the PyTorch model zoo to classify images.

You will create an application that labels images from the ImageNet dataset.

Some sample images have already been downloaded for you.

### Creating an application

For this tutorial, create an application named "squeezenet-classifier". Note that Clipper allows you to create the application before deploying any models.

In [None]:
app_name = "squeezenet-classsifier"
default_output = "default"

clipper_conn.register_application(
    name=app_name,
    input_type="bytes",
    default_output=default_output,
    slo_micros=10000000)

When you list the applications registered with Clipper, you should see the newly registered "squeezenet-classifier" application show up.

In [None]:
clipper_conn.get_all_apps()

### Start serving

As soon as you register the application, the REST endpoint is live, even though no models have been linked yet. However, in this exercise you will wait until you've deployed and linked a model before querying the application.

### Import the pretrained model from the PyTorch model zoo

In [None]:
from torchvision import models, transforms
model = models.squeezenet1_1(pretrained=True)


### Deploying the PyTorch Model

Unlike the Scikit-Learn model in [Section 1](#API-Overview), PyTorch models cannot just be pickled and loaded. Instead, they must be saved using PyTorch's native serialization API. Because of this, you cannot use the generic Python model deployer to deploy the model to Clipper. Instead, you will use the Clipper PyTorch deployer to deploy it. The Docker container will load and reconstruct the model from the serialized model checkpoint when the container is started.

> *Once again, Clipper must download this Docker image from the internet, so this may take a minute. Thanks for your patience.*

We'll start by creating a predict function. Before we get into it, though, let us discuss metrics.

### Metrics
Pay special attention to the metrics, which are a new feature in Clipper. By default, Clipper provides certain metrics, such as batch size, and it is also possible to add user-defined metrics. Clipper uses Prometheus to track these metrics, so the Prometheus dashboard can be seen at http://localhost:9090/. It is also easy to connect Prometheus to Grafana, as we will do later in this tutorial. For this example, we will add a metric that tracks the batch size passed to the predict function to illustrate Clipper's adaptive batching.
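
Besides the dashboard, Prometheus exposes an HTTP API for instant queries at `/api/v1/query`. The sketch below only builds such a URL; `prometheus_query_url` is a hypothetical helper, and the metric name `batch_size` is a placeholder, since the name under which Clipper exports the metric may differ.

```python
from urllib.parse import urlencode


def prometheus_query_url(base_url, expression):
    """Build a URL for Prometheus's instant-query HTTP endpoint."""
    return "%s/api/v1/query?%s" % (base_url.rstrip("/"), urlencode({"query": expression}))


# The metric name is a placeholder; the exported name may differ.
url = prometheus_query_url("http://localhost:9090/", "batch_size")
```

Fetching this URL with `requests.get(url).json()` would return the current value of the queried expression, provided the metric exists.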

#### Preprocessing
Before making the predict function, we must download the labels for the dataset and specify the preprocessing for the images. The code for this cell comes from this [tutorial](http://blog.outcome.io/pytorch-quick-start-classifying-an-image/).

In [None]:
# First we define the preprocessing on the images:
normalize = transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)
preprocess = transforms.Compose([
   transforms.Resize(256),  # Resize replaces the deprecated transforms.Scale
   transforms.CenterCrop(224),
   transforms.ToTensor(),
   normalize
])

# Then we download the labels:
labels = {int(key):value for (key, value)
          in requests.get('https://s3.amazonaws.com/outcome-blog/imagenet/labels.json').json().items()}

### Predict function
Now we must define a predict function to deploy. We send each image as bytes, so we must first deserialize the image before we can perform any operations on it.

In [None]:
import clipper_admin.metrics as metrics

def predict_torch_model(n, imgs):
    import base64
    import io
    import os
    import PIL.Image
    import tempfile
    import torch
    metrics.add_metric("batch_size", 'Gauge', 'Batch size passed to PyTorch predict function.')
    metrics.report_metric('batch_size', len(imgs))

    img_tensors = []
    for img in imgs:
        # Write the raw image bytes to a temporary file
        tmp = tempfile.NamedTemporaryFile('wb', delete=False, suffix='.jpg')
        tmp.write(io.BytesIO(img).getvalue())
        tmp.close()

        # Pre-process the image and add a batch dimension
        img_tensor = preprocess(PIL.Image.open(tmp.name, 'r'))
        img_tensor.unsqueeze_(0)
        img_tensors.append(img_tensor)

        # Delete the temporary file
        os.unlink(tmp.name)

    # Pass the whole batch to the model
    img_batch = torch.cat(img_tensors)
    with torch.no_grad():
        fc_out = n(img_batch)

    # Parse the result: map each output's argmax to its label
    img_labs = [labels[out.data.numpy().argmax()] for out in fc_out]

    return img_labs

In [None]:
from clipper_admin.deployers import pytorch as pytorch_deployer
pytorch_deployer.deploy_pytorch_model(
    clipper_conn,
    name="",  # TODO: Give the model a name
    version=1,
    input_type="bytes",
    func=predict_torch_model,
    pytorch_model=None,  # TODO: Pass the model to the deployer
)

**Solution:**

```python
from clipper_admin.deployers import pytorch as pytorch_deployer
pytorch_deployer.deploy_pytorch_model(
    clipper_conn,
    name="pytorch-model",
    version=1,
    input_type="bytes",
    func=predict_torch_model,
    pytorch_model=model,
)
```

In [None]:
# Reuse the app_name variable defined when the application was registered
clipper_conn.link_model_to_app(app_name=app_name, model_name="pytorch-model")

To visualize our metrics, we are going to connect Grafana to our Prometheus server. First, we must start up a Grafana Docker container. We will start it on port 3000.

In [None]:
!docker run -d -p 3000:3000 grafana/grafana


Before we open up the Grafana Dashboard, we need to cover how to add Prometheus as a data source. When you go to Grafana, you will be asked to log in. You can use the username `admin` and password `admin` to log in.

Next, click the green `Add Datasource` button, which will take you to the page shown in the image below.

<img src="notebook-images/grafana.png" width="50%" height="50%" />

Give the data source a name, such as `Clipper Metrics`, then set `Type` to `Prometheus`, which you can select from the drop-down menu. For the `URL` field under the `HTTP` tab, run the following code block and copy and paste the generated URL.

In [None]:
import requests

this_ip = requests.get('http://ip.42.pl/raw').text

print(f"""
    When adding Data Source, please set HTTP->URL field to the following:
    http://{this_ip}:9090

    This is the location of Clipper's Prometheus instance.
""")
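
Before pasting the URL into Grafana, it can help to confirm that Prometheus answers at that address. The sketch below builds a query URL for the Prometheus HTTP API (`/api/v1/query`). The helper name `prometheus_query_url` and the example host are assumptions for illustration; `batch_size` is the metric that the dashboard imported later in this section plots.

```python
from urllib.parse import urlencode


def prometheus_query_url(host, expr, port=9090):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return "http://{}:{}/api/v1/query?{}".format(host, port, urlencode({"query": expr}))


# Example with a placeholder host; substitute the address printed above.
url = prometheus_query_url("localhost", "batch_size")
print(url)

# To actually run the query (requires a reachable Prometheus server):
#   import requests
#   print(requests.get(url, timeout=5).json()["status"])
```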

You can also adjust the scrape interval so that Grafana queries Prometheus more often. When you are done, your setup page should look like the image below.

<img src="notebook-images/grafana_w_info.png" width="50%" height="50%" />

When you click `Save and Test`, the page should show that the data source was successfully added, like so:

<img src="notebook-images/grafana_test_succeeded.png" width="50%" height="50%" />

Next, click the plus button in the sidebar and select `Dashboard`:

<img src="notebook-images/grafana_add_dashboard.png" />

This will take you to the following screen, where you should click `New Dashboard` in the upper left of the screen:

<img src="notebook-images/grafana_new_dashboard.png" width="50%" height="50%" />

This takes you to the screen below, where you should click `Import Dashboard` in the lower-right box and copy the contents of the next (hidden) cell into the `paste JSON` box.

<img src="notebook-images/grafana_import_dashboard.png" width="50%" height="50%" />



```json
{
  "__inputs": [
    {
      "name": "DS_CLIPPER_METRIC",
      "label": "Clipper Metric",
      "description": "",
      "type": "datasource",
      "pluginId": "prometheus",
      "pluginName": "Prometheus"
    }
  ],
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "5.2.4"
    },
    {
      "type": "panel",
      "id": "graph",
      "name": "Graph",
      "version": "5.0.0"
    },
    {
      "type": "datasource",
      "id": "prometheus",
      "name": "Prometheus",
      "version": "5.0.0"
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_CLIPPER_METRIC}",
      "fill": 1,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "legend": {
        "alignAsTable": false,
        "avg": true,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "batch_size",
          "format": "time_series",
          "hide": false,
          "instant": true,
          "interval": "1s",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeShift": null,
      "title": "Panel Title",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    }
  ],
  "refresh": "5s",
  "schemaVersion": 16,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "ClipperDashboard",
  "uid": "gk1YWkomk",
  "version": 1
}
```
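
If you want to check what the dashboard will plot before importing it, the JSON can be inspected with a few lines of Python. This is a small sketch, not part of the original notebook: `dashboard_json` holds a trimmed-down copy of the dashboard above, and `summarize_panels` is a hypothetical helper.

```python
import json

# Trimmed-down copy of the dashboard JSON above (only the fields read below).
dashboard_json = """
{
  "title": "ClipperDashboard",
  "refresh": "5s",
  "panels": [
    {
      "title": "Panel Title",
      "type": "graph",
      "targets": [{"expr": "batch_size", "refId": "A"}]
    }
  ]
}
"""


def summarize_panels(dashboard):
    """Return (panel title, panel type, [query expressions]) for each panel."""
    return [
        (panel["title"], panel["type"], [t["expr"] for t in panel.get("targets", [])])
        for panel in dashboard.get("panels", [])
    ]


dashboard = json.loads(dashboard_json)
for title, kind, exprs in summarize_panels(dashboard):
    print("{} ({}): {}".format(title, kind, ", ".join(exprs)))
```

The dashboard's only panel is a graph of the `batch_size` metric.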

In [None]:
import base64

In [None]:
import requests, json
req_json = json.dumps({
        "input":
        base64.b64encode(open('images/cat.jpg', "rb").read()).decode() # bytes to unicode
    })

response = requests.post(
     "http://%s/%s/predict" % (clipper_addr, 'squeezenet-classsifier'),
     headers={"Content-type": "application/json"},
     data=req_json)
response.json()
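
The request body above is JSON with a single `input` field that holds the image as base64 text, since raw bytes cannot be embedded in JSON directly. The sketch below factors that step into a hypothetical helper, `make_bytes_request`, and checks that decoding recovers the original bytes. The predict function receives raw bytes, which suggests that Clipper performs this decoding for `bytes` inputs; treat that as an assumption rather than a documented guarantee.

```python
import base64
import json


def make_bytes_request(raw_bytes):
    """Serialize raw bytes into the JSON body used for a `bytes` input."""
    return json.dumps({"input": base64.b64encode(raw_bytes).decode()})


# Round trip with placeholder bytes instead of a real image file
raw = b"\x89PNG placeholder bytes"
body = make_bytes_request(raw)
recovered = base64.b64decode(json.loads(body)["input"])
print(recovered == raw)
```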

Note that running a large model on a CPU is slow, so these requests may take a while to return.

It's important to note that, since we are sending requests in series, there is never more than one request in the system at a time, so the system will not experience natural batching. However, we will still observe batching thanks to Clipper's batching exploration algorithm, which adaptively batches requests in order to find the batch size that maximizes throughput and utilization while still meeting the latency SLO for a specific model container.
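
To see natural batching as well, several requests have to be in flight at once. One way to do that from a notebook is a thread pool. The sketch below is an illustration under assumptions, not part of the original tutorial: `send_concurrently` is a hypothetical helper, and `send_one` stands in for any function that sends a single request (for example, a wrapper around the `requests.post` call shown earlier).

```python
from concurrent.futures import ThreadPoolExecutor


def send_concurrently(send_one, payloads, max_workers=8):
    """Call send_one(payload) for every payload using a pool of threads.

    Results are returned in the same order as `payloads`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_one, payloads))


# Stand-in sender so the example runs without a Clipper cluster
def fake_send(payload):
    return {"echo": payload}


results = send_concurrently(fake_send, ["a", "b", "c"])
print(results)
```

Against a live cluster, `fake_send` would be replaced by a function that posts one JSON body to the application's `predict` endpoint and returns the parsed response.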

In [None]:
from IPython.display import display, Image
# Blue is my cat's name. The last one is an image of my cat.
images = ['images/cat.jpg', 'images/dog.jpg', 'images/duck.jpg', 'images/blue.jpg']
for img in images:
    print('Output for', img)
    display(Image(img, width=200, height=200))
    req_json = json.dumps({
            "input":
            base64.b64encode(open(img, "rb").read()).decode() # bytes to unicode
        })

    response = requests.post(
         "http://%s/%s/predict" % (clipper_addr, 'squeezenet-classsifier'),
         headers={"Content-type": "application/json"},
         data=req_json)
    print(response.json())

## Example Application - Custom Docker Containers

In this final example, we will go over how to create a custom Docker container. Custom Docker containers are useful when a model container needs additional dependencies installed or, as in our example, when those dependencies must be installed outside of Python.

For this example, we will use the YoloV3 real-time object detection system to draw bounding boxes on some images. Unlike the models in our previous examples, YoloV3 does not have a Python API and must instead be downloaded and compiled using Make. To get the C source files, we clone the darknet repo (in this case, a special fork made for this project that returns the bounding box coordinates in addition to drawing the boxes) and then compile it using Make. In addition, since we are now dealing with a C executable rather than a Python API, our predict function will look different.

To build our custom Docker container, we need to create a Dockerfile. This Dockerfile is called `PyDarknetDockerfile` and is included in the repo. For your convenience, we include the whole file here:
```dockerfile
FROM clipper/python36-closure-container:0.3

# Install Git
RUN apt-get update \
    && apt-get install -y git

# Install cURL
RUN apt-get install -y curl

# Clone Darknet Repo
RUN git clone https://github.com/RehanSD/darknet.git /tmp/darknet
RUN mv /tmp/darknet/* .

# Make Darknet Project
RUN make -j4

#Download Weights
RUN curl -o yolov3-tiny.weights https://pjreddie.com/media/files/yolov3-tiny.weights
```

In the following sections, we will break down each part of the file.

The first line of the file:
```dockerfile
FROM clipper/python36-closure-container:0.3
```
is required to build the container: it specifies a parent image, which provides the basic requirements Clipper needs to be able to use the resulting container.

The next couple of lines:
```dockerfile
# Install Git
RUN apt-get update \
    && apt-get install -y git

# Install cURL
RUN apt-get install -y curl
```
install the basic dependencies we need in the container: git to clone the repo, and curl to download the pretrained YOLO model weights.

The last few lines:
```dockerfile
# Clone Darknet Repo
RUN git clone https://github.com/RehanSD/darknet.git /tmp/darknet
RUN mv /tmp/darknet/* .

# Make Darknet Project
RUN make -j4

#Download Weights
RUN curl -o yolov3-tiny.weights https://pjreddie.com/media/files/yolov3-tiny.weights
```
clone the darknet repo, build it with Make, and then download the pretrained model weights. We are using `yolov3-tiny` because it offers a good trade-off between accuracy and inference speed on a CPU. (For a production use case, you would run it on a GPU!)

After writing this Dockerfile, all we have to do is build the image like so:

In [None]:
!docker build -t clipper/darknet-yolov3-container -f PyDarknetDockerfile .

### Predict Function
Now we build the predict function. We must first deserialize the image that we serialized in our request, as we did in the PyTorch example, and then write it to a file so that darknet can run on it. Since darknet is a C executable, we must call it using the subprocess API. It is not strictly necessary to print out the output of calling darknet, but it is useful to have the output for the sake of debugging.

In [None]:
import base64
import io
import os
import subprocess
def yolo_pred(imgs):
    import base64
    import io
    import os
    import tempfile
    import subprocess

    num_imgs = len(imgs)
    ret_coords = []
    predict_procs = []
    file_names = []
    
    # First, we save the images to file
    for i in range(num_imgs):
        # Create a temp file to write to
        tmp = tempfile.NamedTemporaryFile('wb', delete=False, suffix='.jpg')
        tmp.write(io.BytesIO(imgs[i]).getvalue())
        tmp.close()
        file_names.append(tmp.name)
    
    # Second, we call ./darknet executable to detect objects in images.
    #   This is done in parallel.
    for file_name in file_names:
        process = subprocess.Popen(
            ['./darknet',
             'detector',
             'test',
             './cfg/coco.data',
             './cfg/yolov3-tiny.cfg',
             './yolov3-tiny.weights',
             file_name,
             '-json',
             '-dont_show',
             # No shell is involved here, so a '>' redirect would reach darknet as a
             # literal argument; the output is read from the stdout pipe instead.
             '-ext_output'],
            stdout=subprocess.PIPE)
        predict_procs.append(process)
        
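    # Lastly, we wait for all processes to finish and return the stdout of each one.
    # communicate() drains the pipe while waiting, so a child that writes a lot
    # of output cannot block on a full pipe buffer.
    for process in predict_procs:
        stdout, _ = process.communicate()
        ret_coords += [' '.join(line.decode() for line in io.BytesIO(stdout))]

    # Clean up the temp files now that darknet is done with them
    for file_name in file_names:
        os.unlink(file_name)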

    return ret_coords
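
The launch-then-collect pattern in `yolo_pred` can be tried in isolation. The sketch below is an illustration, not the tutorial's code: it starts a few child Python processes (stand-ins for `./darknet`) and gathers their output with `communicate()`, which reads the pipe while waiting for each process to exit. `run_in_parallel` is a hypothetical helper name.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command first, then collect each one's stdout as text."""
    procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE) for cmd in commands]
    outputs = []
    for proc in procs:
        stdout, _ = proc.communicate()  # waits for exit and drains the pipe
        outputs.append(stdout.decode())
    return outputs


# Stand-in commands: each child process prints one line
commands = [[sys.executable, "-c", "print('image {}')".format(i)] for i in range(3)]
outputs = run_in_parallel(commands)
print(outputs)
```

Because all processes are started before any output is collected, they run concurrently, and the results come back in launch order.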

Note that we register a new application for this model rather than reusing the previous one, because the semantics of the model's output have changed: it returns bounding boxes, not just labels.

In [None]:
from clipper_admin.deployers import python as python_deployer
python_deployer.deploy_python_closure(
    clipper_conn,
    name="yolov3",  # The name of the model in Clipper
    version=1,  # A unique identifier to assign to this model.
    input_type="bytes",  # The type of data the model function expects as input
    func=yolo_pred, # The model function to deploy
    base_image='clipper/darknet-yolov3-container'
)

In [None]:
clipper_conn.register_application(
    name="darknet-app",
    input_type="bytes",
    default_output="Default",
    slo_micros=10000000 # 10 seconds
)

In [None]:
clipper_conn.link_model_to_app(app_name="darknet-app", model_name="yolov3")

In [None]:
import requests, json
url = "http://%s/darknet-app/predict" % clipper_addr
req_json = json.dumps({
    "input":
    base64.b64encode(open('images/dog.jpg', "rb").read()).decode() # bytes to unicode
})
headers = {'Content-type': 'application/json'}
r = requests.post(url, headers=headers, data=req_json)

# Let's see what YoloV3-tiny returns
print(r.json())
print()
print(r.json()['output'])

In [None]:
# Let's plot the result

def _get_bounding_boxes(output_string):
    """Yield a Rectangle patch for each detection in the output string"""
    import itertools
    import re
    import matplotlib.patches as patches
    # \d+ (rather than a fixed digit count) also matches 100% confidence and
    # coordinates of 1000 pixels or more
    bbox_regex = re.compile(r" ([a-z]+): \d+%\n Left: (\d+), Bottom: (\d+), Right: (\d+), Top: (\d+)")
    matched = bbox_regex.findall(output_string)
    
    # Cycle through the colors so that detections beyond the seventh are still drawn
    colors = itertools.cycle(['b', 'g', 'r', 'c', 'm', 'y', 'k'])
    for c, m in zip(colors, matched):
        cat, left, bottom, right, top = m
        left, bottom, right, top = int(left), int(bottom), int(right), int(top)
        yield patches.Rectangle((left, bottom), right-left, top-bottom, linewidth=1, facecolor='none', edgecolor=c, label=cat)

def plot_bbox(im_name, output_string):
    """Plot the image with bbox overlay"""
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    from PIL import Image
    import numpy as np
    
    fig,ax = plt.subplots(1, figsize=(10,8))
    ax.imshow(np.array(Image.open(im_name)))
    for p in _get_bounding_boxes(output_string):
        ax.add_patch(p) 
    plt.title("Prediction result for image: {}".format(im_name))
    fig.legend()

def predict_and_plot(im_name):
    """Make a prediction and plot image with bbox"""
    url = "http://%s/darknet-app/predict" % clipper_addr
    req_json = json.dumps({
        "input":
        base64.b64encode(open(im_name, "rb").read()).decode() # bytes to unicode
    })
    headers = {'Content-type': 'application/json'}
    r = requests.post(url, headers=headers, data=req_json)
    print(r.json())
    plot_bbox(im_name, r.json()['output'])

plot_bbox('images/dog.jpg', r.json()['output'])
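
The parsing step can be checked without a running model. The sketch below applies a similar regular expression to a made-up output string. The sample text is an assumption about darknet's output format, inferred from the pattern in `_get_bounding_boxes`, and `parse_boxes` is a hypothetical helper.

```python
import re

# Similar to the pattern in _get_bounding_boxes, written with \d+
BBOX_REGEX = re.compile(
    r" ([a-z]+): \d+%\n Left: (\d+), Bottom: (\d+), Right: (\d+), Top: (\d+)"
)


def parse_boxes(output_string):
    """Return a list of (label, left, bottom, right, top) tuples."""
    return [
        (label, int(left), int(bottom), int(right), int(top))
        for label, left, bottom, right, top in BBOX_REGEX.findall(output_string)
    ]


# Made-up sample in the format the pattern expects
sample = (
    " dog: 81%\n Left: 130, Bottom: 540, Right: 310, Top: 220\n"
    " bicycle: 58%\n Left: 160, Bottom: 450, Right: 560, Top: 120\n"
)
print(parse_boxes(sample))
```

Working with plain tuples like these makes it easy to filter or count detections before plotting them.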

Feel free to experiment with other images under `images/detection/*.jpg`.

In [None]:
!ls images/detection/

In [None]:
# im_name = "images/detection/*.jpg"
im_name = "images/detection/cityscapes-3.jpg"
predict_and_plot(im_name)

## Stopping Clipper
If you run into issues and want to completely stop Clipper, you can do this by calling [`ClipperConnection.stop_all()`](http://docs.clipper.ai/en/latest/#clipper_admin.ClipperConnection.stop_all).

In [None]:
clipper_conn.stop_all()

When you list all the Docker containers a final time, you should see that all of the Clipper containers have been stopped.

In [None]:
!docker ps --filter label=ai.clipper.container.label

You can now call `clipper_conn.start_clipper()` again without running into errors.

Please continue with Part 2 of the tutorial [here](2-Pong-Game.ipynb)!