
What's wrong with the tensorflow serving? #478

Closed
mahnunchik opened this issue Jun 13, 2017 · 10 comments
Labels
type:performance Performance Issue

Comments

@mahnunchik
Contributor

What's wrong with the tensorflow serving?

mnist example

The basic example could not be compiled without a patch: remove the line

from tensorflow.contrib.image.python.ops.single_image_random_dot_stereograms import single_image_random_dot_stereograms

from bazel-bin/tensorflow_serving/example/mnist_saved_model.runfiles/org_tensorflow/tensorflow/contrib/image/__init__.py
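The workaround can be applied with a one-line sed over the generated file, assuming the `bazel-bin` path quoted above (a sketch of the patch, not an official fix):

```shell
# Hypothetical patch: delete the offending import from the generated __init__.py.
# Adjust INIT_PY to match your bazel-bin layout; guarded so it is a no-op
# if the file has not been generated yet.
INIT_PY=bazel-bin/tensorflow_serving/example/mnist_saved_model.runfiles/org_tensorflow/tensorflow/contrib/image/__init__.py
if [ -f "$INIT_PY" ]; then
  sed -i '/single_image_random_dot_stereograms/d' "$INIT_PY"
fi
```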

inception example

Inception example is incredibly slow 😭

It is about 10 times slower than the Python implementation from this tutorial.

@chrisolston
Contributor

Sorry you're having trouble.

Re: the compilation issue, we'll take a look this week.

Re: the performance issue, please see #456.

@mahnunchik
Contributor Author

Hi @chrisolston

bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.2 speeds up tensorflow_model_server about 3 times, but it is still 3 times slower than the basic Python implementation 😢
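Whether those `--copt` flags help at all depends on the host CPU; on Linux one can check which of the targeted instruction sets the machine actually supports before rebuilding (a quick sanity check, not part of the original thread):

```shell
# List which SIMD extensions targeted by the bazel --copt flags this CPU
# supports. /proc/cpuinfo is Linux-specific; "|| true" keeps the pipeline
# from failing on CPUs that support none of them.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' \
  | grep -E '^(avx|avx2|fma|sse4_2)$' | sort -u || true
```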

@chrisolston
Contributor

It would be useful to determine whether the slowdown is in: (a) gRPC layer, (b) TF-Serving, (c) TF-Core.

For (a), one can imagine an experiment that bypasses that layer.

For (b), it's a pretty thin layer and we've benchmarked it and have not found any bottlenecks. But that was a while back so maybe we need to re-benchmark if we determine the problem lies in (b) and not in (a) or (c).

For (c), the TF-Core C++ and Python Session::Run() implementations differ. We could rule out (c) by repeating your experiment with just TF-Core C++ vs. Python (no TF-Serving layer).
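The experiment for (c) can be sketched as a small timing loop around bare Session.run() calls; comparing these numbers against client-side timings of the same model behind tensorflow_model_server localizes the overhead. The `sess`, `fetches`, and `feed_dict` arguments are assumed to come from an already-loaded model; the helper names are illustrative, not from any TF-Serving API:

```python
import time

def summarize_latencies(samples):
    """Return (mean, p50, p99) in milliseconds from per-call durations in seconds."""
    xs = sorted(samples)
    n = len(xs)
    mean = sum(xs) / n
    p50 = xs[n // 2]
    p99 = xs[min(n - 1, int(n * 0.99))]
    return tuple(round(v * 1000.0, 3) for v in (mean, p50, p99))

def benchmark_session_run(sess, fetches, feed_dict, iters=100):
    # Time raw TF-Core Session.run() calls, with no gRPC or TF-Serving in
    # the loop. If these latencies match the serving numbers, the slowdown
    # is in (c); if they are much lower, suspect (a) or (b).
    samples = []
    for _ in range(iters):
        start = time.time()
        sess.run(fetches, feed_dict=feed_dict)
        samples.append(time.time() - start)
    return summarize_latencies(samples)
```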

@haiy

haiy commented Jun 16, 2017

Hi @chrisolston, below is the Python code I use to load the model directly into a session; maybe it helps. Using the Java API is almost the same. BTW, I would appreciate it if you could give some advice about issue #458. Thanks a lot.

import tensorflow as tf

def load_tf_model(model_path):
    # Start a session and load the SavedModel tagged for serving into it.
    sess = tf.Session()
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_path)
    return sess

@chrisolston
Contributor

Closing due to lack of activity. If anybody winds up doing benchmarks to pinpoint the bottlenecks (whether in gRPC, TF-Serving or TF-Core) please post your data here.

@qinhaocheng

For (a), one can imagine an experiment that bypasses that layer.

@chrisolston can you give an example of how to bypass the gRPC layer?
If not using the gRPC layer, how can I get the results from model_server?
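One way to bypass the gRPC layer entirely is to not run model_server at all: load the SavedModel in the client process (as in the snippet earlier in this thread) and call Session.run() directly. A sketch, using the TF 1.x API; `model_path` and the `serving_default` signature name are assumptions about your export:

```python
def predict_in_process(model_path, inputs):
    # Bypass gRPC and TF-Serving: load the SavedModel into this process
    # and run it through TF-Core directly. "inputs" maps signature input
    # names to numpy arrays.
    import tensorflow as tf  # TF 1.x API, as used elsewhere in this thread

    with tf.Session(graph=tf.Graph()) as sess:
        meta = tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], model_path)
        sig = meta.signature_def['serving_default']
        feed = {sig.inputs[k].name: v for k, v in inputs.items()}
        fetches = {k: t.name for k, t in sig.outputs.items()}
        return sess.run(fetches, feed_dict=feed)
```

Timing this against the same requests sent through model_server isolates how much latency the gRPC + TF-Serving layers add.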

@ydp
Contributor

ydp commented Apr 10, 2018

@mahnunchik Hi, sorry to bother you, but did you solve your problem of C++ and Python performing differently?

@mahnunchik
Contributor Author

@ydp no 😞

@thesillystudent

In my case, TF Serving is 3x slower than loading the model via Python.

@pharrellyhy

I use the code below to parse the result, which is really slow... Does anyone know a better approach? Thanks!

import numpy as np

result_future = stub.Predict.future(request, 5.0)  # 5-second timeout
scores = np.array(result_future.result().outputs['classification_score'].float_val).reshape(-1, 2)
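The slow part is usually iterating the repeated `float_val` field element by element at the Python level; converting it in one bulk call is much cheaper (`tf.make_ndarray` on the whole output TensorProto is another one-shot option). A sketch, assuming a 2-class score tensor as in the snippet above; the helper name is illustrative:

```python
import numpy as np

def scores_from_float_val(float_val, num_classes=2):
    # np.asarray copies the repeated float field in bulk, avoiding a
    # Python-level loop over individual values.
    return np.asarray(float_val, dtype=np.float32).reshape(-1, num_classes)

# usage with the response from the snippet above:
# scores = scores_from_float_val(
#     result_future.result().outputs['classification_score'].float_val)
```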

@misterpeddy added the type:performance Performance Issue label on Nov 18, 2019