TF Serving performance #456
Comments
I've also observed the same difference. With JNI the average request cost is around 10-15 ms, but with Serving it is 200-300 ms.
One experience report: I got a 10x speedup by compiling TF Serving with bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.2 //tensorflow_serving/model_servers:tensorflow_model_server The odd thing is that I'm not sure these flags matter for plain TensorFlow, but they do specifically for TF Serving.
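A note on what those options mean: they are GCC instruction-set flags passed through Bazel's --copt, which let TensorFlow's vectorized kernels use SIMD instructions instead of generic x86 code. A binary built with them raises an illegal-instruction error on CPUs that lack the extensions, so it is worth checking host support first. A minimal POSIX-shell sketch, using a hypothetical sample flags string for illustration (on a real Linux host you would read the "flags" line from /proc/cpuinfo instead):

```shell
# Meaning of each build option:
#   -c opt        : Bazel's optimized (release) compilation mode
#   -mavx, -mavx2 : 128/256-bit SIMD vector instructions
#   -mfma         : fused multiply-add (a*b+c in one instruction)
#   -msse4.2      : SSE 4.2 vector instructions
# Check which extensions a CPU advertises, given its flags line.
# The string below is a made-up sample, not a real machine's output:
flags="fpu vme sse4_1 sse4_2 avx avx2 fma"
summary=""
for ext in avx avx2 fma sse4_2; do
  case " $flags " in
    *" $ext "*) summary="$summary $ext:yes" ;;
    *)          summary="$summary $ext:no"  ;;
  esac
done
echo "$summary"
```

If any extension reports "no" on the target machine, drop the corresponding --copt flag from the bazel build command rather than risk a crash at startup.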
I confirm that compiling TensorFlow Serving with these options resolves the problem.
@chrisolston @kirilg @kamei86i Can anyone describe what these options mean? |
Hi,
I'm building an application that uses both the Java JNI API (https://www.tensorflow.org/api_docs/java/reference/org/tensorflow/package-summary) and the TensorFlow Serving Java gRPC interface, and I need to switch between these two modes during normal usage of my application.
I get identical classifications with both modes; however, I'm seeing a performance difference in favor of the Java JNI interface. With the same model and the same test cases, I measured that TF Serving via blocking gRPC calls is 20-25 times slower than the JNI calls. Is there anything I can do to improve TF Serving performance without using batching?
The model I'm using is a CNN built with Keras 2 and TensorFlow 1.1.
I built the TensorFlow model server with the latest release, 0.5.1, of TensorFlow Serving.