A low-latency prediction-serving system
simon-mo and dcrankshaw Multi tenancy support (#503)
Latest commit 3c5a1cc Jun 21, 2018

README.md

Clipper

What is Clipper?

Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks. Learn more about Clipper and view documentation at our website http://clipper.ai.

What does Clipper do?

  • Clipper simplifies integration of machine learning techniques into user-facing applications by providing a simple standard REST interface for prediction and feedback across a wide range of commonly used machine learning frameworks. Clipper makes product teams happy.

  • Clipper simplifies model deployment and helps reduce common bugs by using the same tools and libraries used in model development to render live predictions. Clipper makes data scientists happy.

  • Clipper improves throughput and ensures reliable millisecond latencies by introducing adaptive batching, caching, and straggler mitigation techniques. Clipper makes the infra-team less unhappy.

  • Clipper improves prediction accuracy by introducing state-of-the-art bandit and ensemble methods to intelligently select and combine predictions and achieve real-time personalization across machine learning frameworks. Clipper makes users happy.

Quickstart

Note: This quickstart works with the latest version of the code. For a quickstart that works with the released version of Clipper available on PyPI, go to our website.

This quickstart requires Docker and supports Python 2.7, 3.5, and 3.6.

Start a Clipper Instance and Deploy a Model

Install Clipper

You can either install Clipper directly from GitHub:

pip install git+https://github.com/ucbrise/clipper.git@develop#subdirectory=clipper_admin

or by cloning Clipper and installing directly from the file system:

pip install -e </path/to/clipper_repo>/clipper_admin

Start a local Clipper cluster

First start a Python interpreter session.

$ python

# Or start one with iPython
$ conda install ipython
$ ipython

Create a ClipperConnection object and start Clipper. Running this command for the first time downloads several Docker images, so it may take some time.

from clipper_admin import ClipperConnection, DockerContainerManager
clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.start_clipper()
17-08-30:15:48:41 INFO     [docker_container_manager.py:95] Starting managed Redis instance in Docker
17-08-30:15:48:43 INFO     [clipper_admin.py:105] Clipper still initializing.
17-08-30:15:48:44 INFO     [clipper_admin.py:107] Clipper is running

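Register an application called "hello-world". This will create a prediction REST endpoint at http://localhost:1337/hello-world/predict. The slo_micros argument sets the application's latency objective in microseconds, so 100000 corresponds to 100 ms.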

clipper_conn.register_application(name="hello-world", input_type="doubles", default_output="-1.0", slo_micros=100000)
17-08-30:15:51:42 INFO     [clipper_admin.py:182] Application hello-world was successfully registered

Inspect Clipper to see the registered apps

clipper_conn.get_all_apps()
[u'hello-world']

Define a simple model that just returns the sum of each feature vector. Note that the prediction function takes a list of feature vectors as input and returns a list of strings.

def feature_sum(xs):
    return [str(sum(x)) for x in xs]
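
To make the input/output contract concrete, here is a minimal sketch that runs the function locally on a small batch before deploying it. The helper `check_prediction_contract` is hypothetical and not part of clipper_admin; it only checks what the text above states, namely that the function takes a list of feature vectors and returns a list of strings, one per input.

```python
def feature_sum(xs):
    # One string per feature vector, as in the quickstart
    return [str(sum(x)) for x in xs]


def check_prediction_contract(func, batch):
    """Hypothetical helper (not part of clipper_admin).

    Calls func on a batch of inputs and verifies that it returns
    a list containing exactly one string per input.
    """
    outputs = func(batch)
    if not isinstance(outputs, list):
        raise TypeError("prediction function must return a list")
    if len(outputs) != len(batch):
        raise ValueError("expected one output per input")
    if not all(isinstance(o, str) for o in outputs):
        raise TypeError("every output must be a string")
    return outputs


batch = [[1.0, 2.0, 3.0], [0.5, 0.25]]
print(check_prediction_contract(feature_sum, batch))  # ['6.0', '0.75']
```

A local check like this can catch shape or type mistakes before a model image is built.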

Import the python deployer package

from clipper_admin.deployers import python as python_deployer

Deploy the "feature_sum" function as a model. Note that the application and the model must have the same input type (here, "doubles").

python_deployer.deploy_python_closure(clipper_conn, name="sum-model", version=1, input_type="doubles", func=feature_sum)
17-08-30:15:59:56 INFO     [deployer_utils.py:50] Anaconda environment found. Verifying packages.
17-08-30:16:00:04 INFO     [deployer_utils.py:150] Fetching package metadata .........
Solving package specifications: .

17-08-30:16:00:04 INFO     [deployer_utils.py:151]
17-08-30:16:00:04 INFO     [deployer_utils.py:59] Supplied environment details
17-08-30:16:00:04 INFO     [deployer_utils.py:71] Supplied local modules
17-08-30:16:00:04 INFO     [deployer_utils.py:77] Serialized and supplied predict function
17-08-30:16:00:04 INFO     [python.py:127] Python closure saved
17-08-30:16:00:04 INFO     [clipper_admin.py:375] Building model Docker image with model data from /tmp/python_func_serializations/sum-model
17-08-30:16:00:05 INFO     [clipper_admin.py:378] Pushing model Docker image to sum-model:1
17-08-30:16:00:07 INFO     [docker_container_manager.py:204] Found 0 replicas for sum-model:1. Adding 1
17-08-30:16:00:07 INFO     [clipper_admin.py:519] Successfully registered model sum-model:1
17-08-30:16:00:07 INFO     [clipper_admin.py:447] Done deploying model sum-model:1.

Tell Clipper to route requests for the "hello-world" application to the "sum-model"

clipper_conn.link_model_to_app(app_name="hello-world", model_name="sum-model")
17-08-30:16:08:50 INFO     [clipper_admin.py:224] Model sum-model is now linked to application hello-world

Your application is now ready to serve predictions

Query Clipper for predictions

Now that you've deployed your first model, you can start requesting predictions at the REST endpoint that Clipper created for your application: http://localhost:1337/hello-world/predict

With cURL:

$ curl -X POST --header "Content-Type:application/json" -d '{"input": [1.1, 2.2, 3.3]}' 127.0.0.1:1337/hello-world/predict

With Python:

import json
import numpy as np
import requests
headers = {"Content-type": "application/json"}
payload = json.dumps({"input": list(np.random.random(10))})  # random 10-element feature vector
requests.post("http://localhost:1337/hello-world/predict", headers=headers, data=payload).json()
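
If you would rather not depend on requests or numpy, the same request can be assembled with the standard library. This is a sketch under assumptions: it targets Python 3, `build_predict_request` is a hypothetical helper rather than a Clipper API, and the host and port defaults are simply the ones used in this quickstart. It builds the request without sending it, so you can inspect the URL and body first.

```python
import json
from urllib import request


def build_predict_request(app_name, features, host="localhost", port=1337):
    """Hypothetical helper: build (but do not send) a prediction request."""
    # Endpoint layout taken from the quickstart: /<app_name>/predict
    url = "http://{}:{}/{}/predict".format(host, port, app_name)
    # The request body is a JSON object with a single "input" field
    body = json.dumps({"input": [float(v) for v in features]}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request("hello-world", [1.1, 2.2, 3.3])
print(req.full_url)  # http://localhost:1337/hello-world/predict
```

To send it against a running Clipper instance, pass `req` to `request.urlopen` and decode the response body as JSON.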

Clean up

If you closed the Python REPL you were using to start Clipper, you will need to start a new Python REPL and create another connection to the Clipper cluster. If you still have the Python REPL session active from earlier, you can re-use your existing ClipperConnection object.

Create a new connection. If you still have the Python REPL from earlier, you can skip this step.

from clipper_admin import ClipperConnection, DockerContainerManager
clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.connect()

Stop all Clipper Docker containers

clipper_conn.stop_all()
17-08-30:16:15:38 INFO     [clipper_admin.py:1141] Stopped all Clipper cluster and all model containers

Contributing

To file a bug or request a feature, please file a GitHub issue. Pull requests are welcome. Additional help and instructions for contributors can be found on our website at http://clipper.ai/contributing.

The Team

You can contact us at clipper-dev@googlegroups.com

Acknowledgements

This research is supported in part by DHS Award HSHQDC-16-3-00083, DOE Award SN10040 DE-SC0012463, NSF CISE Expeditions Award CCF-1139158, and gifts from Ant Financial, Amazon Web Services, CapitalOne, Ericsson, GE, Google, Huawei, Intel, IBM, Microsoft and VMware.