
Best practices for using Tensorflow Serving in production #40

Closed
viksit opened this issue Apr 14, 2016 · 5 comments

viksit commented Apr 14, 2016

I have some questions about real-world, production deployments of this system.

  • Is there a way to package the C++, Python, and proto files together via Bazel? Some sort of "uberjar"-style container with a manifest? At the moment, installing grpc, grpcio, and Bazel, compiling everything, and managing a bunch of pre-built libs is quite time-consuming, and there's no clear way to distribute a TF Serving system. Ideally, I'd take this packaged "thing" and use upstart or daemontools to daemonize it. Is there a better way?
  • Incidentally, the compiled dependencies come to almost 1.5G, with TF taking up most of it. Is there a way to make this footprint "thinner"? Here's the breakdown:
```
3.5M    ./tensorflow_serving/session_bundle
3.5M    ./tensorflow_serving
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccudnn___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccudart___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccufft___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
8.0K    ./_solib_local/_U@tf_S_Sthird_Uparty_Sgpus_Scuda_Ccublas___Uexternal_Stf_Sthird_Uparty_Sgpus_Scuda_Slib64
36K     ./_solib_local
1.2M    ./external/jpeg_archive
396K    ./external/zlib_archive
716K    ./external/png_archive
17M     ./external/grpc
5.7M    ./external/re2
1.2G    ./external/tf
8.0M    ./external/boringssl_git
1.3G    ./external
216M    ./mycode/inference_server
216M    ./mycode
1.5G    .
```
  • Are there any best practices out there for statusz/healthz-style interfaces for this server? I realize this might be more of a gRPC question, but getting some insights/examples from people who've deployed TF Serving in production would be useful.

Lastly, the example folders contain a bunch of pb2.py files which the tutorials don't seem to cover - for instance, installing grpc and protobuf3 for C++/Python, and how to use protoc to compile the .proto files into a service definition. It would be good to have this documented for those who aren't familiar with grpc; a sketch of the generation step follows below.
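
For reference, a minimal sketch of how such pb2.py files can be generated with grpcio-tools (`pip install grpcio-tools`); the proto path and import root are illustrative and assume you run from the repository root:

```python
# Hedged sketch: regenerate Python message classes and gRPC stubs
# from a .proto file using grpcio-tools. The proto path is illustrative.
from grpc_tools import protoc

protoc.main([
    'grpc_tools.protoc',        # argv[0], conventionally the program name
    '-I.',                      # import root for .proto files
    '--python_out=.',           # emit *_pb2.py message classes
    '--grpc_python_out=.',      # emit gRPC service stubs
    'tensorflow_serving/apis/prediction_service.proto',
])
```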

nfiedel commented Apr 15, 2016

Good questions. I'll try to answer each:

Regarding packaging, the anticipated use-case of TensorFlow Serving is that a production user would have a separate training pipeline and serving system. The handoff from training to serving only requires an export (see session_bundle/exporter.py). So in that use case, you shouldn't need to bundle any Python files or (aside from the export itself) any proto files.
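
For illustration, a minimal export sketch patterned on the mnist export tutorial of this era; the toy graph, signature names, and export path are placeholders, not a prescribed recipe:

```python
import tensorflow as tf
from tensorflow_serving.session_bundle import exporter

# Toy graph standing in for a trained model.
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, w) + b, name='y')

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    saver = tf.train.Saver(sharded=True)
    model_exporter = exporter.Exporter(saver)
    model_exporter.init(
        sess.graph.as_graph_def(),
        named_graph_signatures={
            'inputs': exporter.generic_signature({'images': x}),
            'outputs': exporter.generic_signature({'scores': y})})
    # Writes a versioned subdirectory, e.g. /tmp/model/00000001.
    model_exporter.export('/tmp/model', tf.constant(1), sess)
```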

Regarding a thinner footprint, most of this is standard TensorFlow dependencies. There are some efforts, e.g. mobile/Android support, to shrink deployed package sizes. At the moment, TensorFlow Serving is focused on server-side usage where resources are less scarce, but we're open to feedback on use-cases.

For statusz/healthz-style interfaces, I think this is a hybrid of a gRPC question and a TensorFlow Serving one. I'll answer for TensorFlow Serving, where we are thinking about adding something like a /servablez that would show the status of each model. This is still early thinking and not yet prioritized, so both feedback and contributions are welcome. You can see some related groundwork in recent additions to servable_state_monitor.
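
In the meantime, one workaround (a hedged sketch, not TF Serving functionality) is a small healthz sidecar that reports whether the model server's gRPC port is accepting connections; both ports below are assumptions about your deployment:

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

GRPC_ADDR = ('localhost', 9000)  # assumed tensorflow_model_server port

class HealthzHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Checks only TCP reachability, not model readiness; a real
        # /servablez would need per-model state from the server itself.
        try:
            socket.create_connection(GRPC_ADDR, timeout=1).close()
            status, body = 200, b'ok\n'
        except OSError:
            status, body = 503, b'unavailable\n'
        self.send_response(status)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(body)

if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 8080), HealthzHandler).serve_forever()
```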

nfiedel commented Apr 15, 2016

(feel free to re-open)

nfiedel closed this as completed Apr 15, 2016

viksit commented Apr 15, 2016

Thanks for the answers, @nfiedel - some comments. (P.S. I can't reopen this issue for some reason.)

  • For questions 1 and 2 - right now I do have two pipelines: one to train and export, and one to build the server itself.

I was wondering if there is a way to link only tensorflow.so and bundle it with my inference server binary, rather than shipping the 1.2G ./external/tf directory.

This is discussed in tensorflow/tensorflow#695 - perhaps it can be leveraged?

  • For 3 - makes sense, thanks.

nfiedel commented Apr 15, 2016

Hi @viksit,
I filed an issue to track making the footprint smaller. I'm experimenting with build options locally now (e.g. -c opt), and we can look at adding a shared library as well.
Thanks!
Noah

waichee commented Oct 22, 2016

@viksit @nfiedel are the external dependencies actually required at runtime? The tensorflow_model_server binary built with -c opt, copied out of the Docker container and run alone on a new host, seems to be sufficient to bring up the server.
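
One way to verify such a standalone binary is serving: a hedged smoke-test sketch using the gRPC beta API that the era's example clients used. The port, model name, input key, and tensor shape are all assumptions that must match your export, and field names varied across early releases:

```python
from grpc.beta import implementations
import tensorflow as tf

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

# Assumed server address; adjust to your deployment.
channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'mnist'  # must match the served model's name
request.inputs['images'].CopyFrom(
    tf.contrib.util.make_tensor_proto([[0.0] * 784], shape=[1, 784]))

print(stub.Predict(request, 10.0))  # 10-second deadline
```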
