
Tensorflow 1.4 C++ API considerably slower than Python #15552

Closed
tensorfreitas opened this issue Dec 21, 2017 · 10 comments
Labels
stat:awaiting response Status - Awaiting response from author

Comments

@tensorfreitas

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • TensorFlow installed from (source or binary): source with all optimizations
  • TensorFlow version (use command below): 1.4
  • Python version: 3.5.2
  • Bazel version (if compiling from source): 0.8.1
  • GCC/Compiler version (if compiling from source): 5.4.0
  • CUDA/cuDNN version: 8.0 / 6.0
  • GPU model and memory: GTX960M

Describe the problem

I was trying to run several models and evaluate their performance with different batch sizes in Python and C++, and noticed that the C++ API version is considerably slower than the Python one. Both were compiled with the same optimizations and with CUDA support.

When I predict the output of a single 256x256 image in Python it takes 0.5 seconds, but the same prediction through the TensorFlow C++ API takes 1.7 seconds. Note that in Python I was using a non-deployed model (without freezing and transforming the graph), whereas in C++ I applied those transformations.

Does anyone know why this is happening? Is it because of the frozen and transformed graph?

I always thought the C++ API would be at least as fast as the Python version.


tensorfreitas commented Dec 21, 2017

I will leave here some of the times for the predictions with different batch sizes:

| Batch Size | Python (s) | C++ (s) |
| ---------- | ---------- | ------- |
| 1          | 0.5        | 1.7     |
| 32         | 0.6        | 1.8     |
| 128        | 0.9        | 2.2     |

@michaelisard

Can you try to repro using the same model from both Python and C++, to narrow down the sources of differences?

@michaelisard michaelisard added the stat:awaiting response Status - Awaiting response from author label Dec 21, 2017

tensorfreitas commented Dec 22, 2017

The difference in time between the optimized and unoptimized models is on the order of 100 ms in C++. The model is the same as the one used in Python, so the problem persists. I will try a different model and post the results today.


tensorfreitas commented Dec 22, 2017

I tried the Inception example and noticed that the C++ version performed better than the Python one. But with a simple model (a few convolutional layers plus batch normalization and dropout layers), the frozen model optimized via the graph transform tool has a considerably slower execution time.

The code is very similar to the one used in https://www.tensorflow.org/tutorials/image_recognition.

Am I doing something wrong?

@tensorfreitas

After running some warm-up runs through the session, I was able to massively improve the C++ performance.

Problem solved.

csytracy commented Feb 6, 2018

Could you please let us know what kind of warm up runs you did to speed it up? Thank you so much.

@tensorfreitas

Hi @csytracy. I did some warm-up runs similar to the benchmark_model code:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/benchmark/benchmark_model.cc

@abhigoku10

@Goldesel23 What technique did you use to increase the C++ speed? Could you please share those tips?
Thanks in advance

@tensorfreitas

If you follow the code in the samples you will get good speed. You just need to understand that the first batches will be slower. In my application I simply run some warm-up runs at startup, right after the session initialization.

@tangjie77wd

@Goldesel23 Could you share more details about the warm-up runs you used?
