A flexible, high-performance serving system for machine learning models
Switch branches/tags
Nothing to show
Clone or download
Pull request Compare This branch is 5 commits ahead, 2277 commits behind tensorflow:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
tensorflow @ 0c3f116
tensorflow_serving
tf_models @ 044a5f5
tools
.gitignore
.gitmodules
AUTHORS
CONTRIBUTING.md
LICENSE
README.md
RELEASE.md
WORKSPACE
myra
myra-build.sh
myra_serving
zlib.BUILD

README.md

#TensorFlow Serving

Build Status

TensorFlow Serving is an open-source software library for serving machine learning models. It deals with the inference aspect of machine learning, taking models after training and managing their lifetimes, providing clients with versioned access via a high-performance, reference-counted lookup table.

Multiple models, or indeed multiple versions of the same model, can be served simultaneously. This flexibility facilitates canarying new versions, non-atomically migrating clients to new models or versions, and A/B testing experimental models.

The primary use-case is high-performance production serving, but the same serving infrastructure can also be used in bulk-processing (e.g. map-reduce) jobs to pre-compute inference results or analyze model performance. In both scenarios, GPUs can substantially increase inference throughput. TensorFlow Serving comes with a scheduler that groups individual inference requests into batches for joint execution on a GPU, with configurable latency controls.

TensorFlow Serving has out-of-the-box support for TensorFlow models (naturally), but at its core it manages arbitrary versioned items (servables) with pass-through to their native APIs. In addition to trained TensorFlow models, servables can include other assets needed for inference such as embeddings, vocabularies and feature transformation configs, or even non-TensorFlow-based machine learning models.

The architecture is highly modular. You can use some parts individually (e.g. batch scheduling) or use all the parts together. There are numerous plug-in points; perhaps the most useful ways to extend the system are: (a) creating a new type of servable; (b) creating a custom source of servable versions.

If you'd like to contribute to TensorFlow Serving, be sure to review the contribution guidelines.

**We use GitHub issues for tracking requests and bugs.

Download and Setup

See install instructions.

##Tutorials

##For more information