
Best practices for using Tensorflow Serving in production #40

Closed
viksit opened this issue Apr 14, 2016 · 5 comments

viksit commented Apr 14, 2016

I have some questions about real-world, production deployments of this system.

  • Is there a way to package the C++, Python, and proto files together via Bazel? Some sort of "uberjar"-style container with a manifest? At the moment, installing grpc, grpcio, and Bazel, compiling everything, and managing a bunch of pre-built libs is quite time-consuming, and there's no clear way to distribute a TF Serving system. Ideally, I'd take this packaged "thing" and use upstart or daemontools to daemonize it. Is there a better way?
  • Incidentally, the compiled dependencies come to almost 1.5G, with TF taking up most of it. Is there a way to make this footprint "thinner"? Here's the breakdown:
```
3.5M    ./tensorflow_serving/session_bundle
3.5M    ./tensorflow_serving
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccudnn___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccudart___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccufft___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccublas___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
36K     ./_solib_local
1.2M    ./external/jpeg_archive
396K    ./external/zlib_archive
716K    ./external/png_archive
17M     ./external/grpc
5.7M    ./external/re2
1.2G    ./external/tf
8.0M    ./external/boringssl_git
1.3G    ./external
216M    ./mycode/inference_server
216M    ./mycode
1.5G    .
```
  • Are there any best practices out there for statusz/healthz-style interfaces for this server? I realize this might be more of a gRPC question, but getting some insights/examples from people who've deployed TF Serving in production would be useful.

Lastly, the example folders contain a bunch of pb2.py files which the tutorials don't seem to cover - for instance, installing grpc and protobuf3 for C++/Python, and how to use protoc to compile the .proto files into a service definition. It would be good to have this documented for those who aren't familiar with grpc; a sketch of the generation step follows below.
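
For reference, a minimal sketch of how such pb2.py files can be generated with grpcio-tools (`pip install grpcio-tools`); the proto path and import root are illustrative and assume you run from the repository root:

```python
# Hedged sketch: regenerate Python message classes and gRPC stubs
# from a .proto file using grpcio-tools. The proto path is illustrative.
from grpc_tools import protoc

protoc.main([
    'grpc_tools.protoc',        # argv[0], conventionally the program name
    '-I.',                      # import root for .proto files
    '--python_out=.',           # emit *_pb2.py message classes
    '--grpc_python_out=.',      # emit gRPC service stubs
    'tensorflow_serving/apis/prediction_service.proto',
])
```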

nfiedel commented Apr 15, 2016

Good questions. I'll try to answer each:

Regarding packaging, the anticipated use-case of TensorFlow Serving is that a production user would have a separate training pipeline and serving system. The handoff from training to serving only requires an export (see session_bundle/exporter.py). So in that use case, you shouldn't need to bundle any Python files or (aside from the export itself) any proto files.
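
For illustration, a minimal export sketch patterned on the mnist export tutorial of this era; the toy graph, signature names, and export path are placeholders, not a prescribed recipe:

```python
import tensorflow as tf
from tensorflow_serving.session_bundle import exporter

# Toy graph standing in for a trained model.
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, w) + b, name='y')

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    saver = tf.train.Saver(sharded=True)
    model_exporter = exporter.Exporter(saver)
    model_exporter.init(
        sess.graph.as_graph_def(),
        named_graph_signatures={
            'inputs': exporter.generic_signature({'images': x}),
            'outputs': exporter.generic_signature({'scores': y})})
    # Writes a versioned subdirectory, e.g. /tmp/model/00000001.
    model_exporter.export('/tmp/model', tf.constant(1), sess)
```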

Regarding a thinner footprint, most of this is standard TensorFlow dependencies. There are some efforts, e.g. mobile/Android support, to shrink deployed package sizes. At the moment, TensorFlow Serving is focused on server-side usage where resources are less scarce, but we're open to feedback on use-cases.

For statusz/healthz-style interfaces, I think this is a hybrid of a gRPC question and a TensorFlow Serving one. I'll answer for TensorFlow Serving, where we are thinking about adding something like a /servablez that would show the status of each model. This is still early thinking and not yet prioritized, so both feedback and contributions are welcome. You can see some related groundwork in recent additions to servable_state_monitor.
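
In the meantime, one workaround (a hedged sketch, not TF Serving functionality) is a small healthz sidecar that reports whether the model server's gRPC port is accepting connections; both ports below are assumptions about your deployment:

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

GRPC_ADDR = ('localhost', 9000)  # assumed tensorflow_model_server port

class HealthzHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Checks only TCP reachability, not model readiness; a real
        # /servablez would need per-model state from the server itself.
        try:
            socket.create_connection(GRPC_ADDR, timeout=1).close()
            status, body = 200, b'ok\n'
        except OSError:
            status, body = 503, b'unavailable\n'
        self.send_response(status)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(body)

if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 8080), HealthzHandler).serve_forever()
```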

nfiedel commented Apr 15, 2016

(feel free to re-open)

nfiedel closed this as completed Apr 15, 2016

viksit commented Apr 15, 2016

Thanks for the answers, @nfiedel - some comments. (P.S. I can't reopen this issue for some reason.)

  • For questions 1 and 2 - right now I do have two pipelines: one to train and export, and one to build the server itself.

I was wondering if there is a way to link only tensorflow.so and bundle it with my inference server binary, rather than shipping the 1.2G ./external/tf directory.

This is discussed in tensorflow/tensorflow#695 - perhaps it can be leveraged?

  • For 3 - makes sense, thanks.

nfiedel commented Apr 15, 2016

Hi @viksit,
I filed an issue to track making the footprint smaller. I'm experimenting with build options locally now (e.g. -c opt), and we can look at adding a shared library as well.
Thanks!
Noah

waichee commented Oct 22, 2016

@viksit @nfiedel are the external dependencies actually required at runtime? The tensorflow_model_server binary built with -c opt, copied out of the Docker container and run alone on a new host, seems to be sufficient to bring up the server.
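
One way to verify such a standalone binary is serving: a hedged smoke-test sketch using the gRPC beta API that the era's example clients used. The port, model name, input key, and tensor shape are all assumptions that must match your export, and field names varied across early releases:

```python
from grpc.beta import implementations
import tensorflow as tf

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

# Assumed server address; adjust to your deployment.
channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'mnist'  # must match the served model's name
request.inputs['images'].CopyFrom(
    tf.contrib.util.make_tensor_proto([[0.0] * 784], shape=[1, 784]))

print(stub.Predict(request, 10.0))  # 10-second deadline
```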
