
C++ api runs much slower than Python API (compile flags) #3471

Closed
lingz opened this issue Jul 22, 2016 · 6 comments
Labels
stat:awaiting response Status - Awaiting response from author

Comments

@lingz
Contributor

lingz commented Jul 22, 2016

Running my graph in Python takes only 6 seconds for one batch, but running the identical batch on the same graph (frozen with freeze_graph) through the C++ API takes 80 seconds. I'm guessing this 13x slowdown probably comes from using the wrong compiler flags during compilation. This is all running on CPU only.

I'm loading the graph the same way as in the label_image example.

I took a look at #2721 and added the -mavx compiler flag, which roughly doubled the speed, but it's still about 13x slower than Python.

The graph is mostly a large multi-layered regular RNN, with some feedforward layers as well.

Any ideas on how to get it to the same speed as Python? Is there somewhere I can see what flags TensorFlow was compiled with when installed from source?
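As a side note on picking flags: one quick way to see which SIMD instruction sets the CPU actually supports (and hence which `--copt` values are candidates) is to inspect /proc/cpuinfo on Linux. A sketch; the exact feature names reported vary by CPU:

```shell
# Print the SIMD-related feature names this CPU reports (Linux only).
# Any that appear can be passed to bazel, e.g. --copt=-mavx or --copt=-msse4.2.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(avx|avx2|fma|sse4_2)$' | sort -u
```

If a feature is missing from this list, passing the corresponding -m flag will produce a binary that crashes with an illegal-instruction error on that machine.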

Environment info

Operating System: Ubuntu 14.04, Linux 64-bit

Installed version of CUDA and cuDNN: None (CPU Only)
(please attach the output of ls -l /path/to/cuda/lib/libcud*):

If installed from binary pip package, provide:

  1. Which pip package you installed.
    Linux 64 Bit CPU Python 3.5
  2. The output from python -c "import tensorflow; print(tensorflow.__version__)".
    0.9.0

If installed from source, provide

  1. The commit hash (git rev-parse HEAD)
  2. The output of bazel version

Steps to reproduce

  1. Create graph in python
  2. freeze_graph.py
  3. Load graph in C++

What have you tried?

  1. adding -mavx C flag

Logs or other output that would be helpful

(If logs are large, please upload as attachment).

@concretevitamin
Contributor

Did you use the -c opt flag? I.e.

bazel -c opt --copt=-mavx build <...>

@concretevitamin concretevitamin added the stat:awaiting response Status - Awaiting response from author label Jul 22, 2016
@lingz
Contributor Author

lingz commented Jul 22, 2016

Ah, amazing: the run time went from 2 minutes to 4.5 seconds. However, note that you must pass the flags after the build keyword:

bazel build -c opt --copt=-mavx <...>

Maybe this should be added to documentation somewhere?

@lingz lingz closed this as completed Jul 22, 2016
@concretevitamin
Contributor

Right. "-c opt" means optimized build.


@lingz lingz changed the title C++ api runs much slower than Python API (what compile flags do I need?) C++ api runs much slower than Python API (compile flags) Nov 25, 2016
@venuktan

venuktan commented Aug 2, 2017

@lingz I am trying to run the exported .pb file in C++ and getting errors.
The .pb file works in Python but not in C++.
I am feeding the input as a cv::Mat:

cv::Mat frameC, dest;
static TF_Operation *placeholder = TF_GraphOperationByName(graph, "batch:0");
static TF_Operation *output_op = TF_GraphOperationByName(graph, "probability/class_idx");

for (;;)
{
    if (!capture.read(frameC))
    {
        std::cerr << "Failed to grab frame" << std::endl;
        continue;
    }
    cv::resize(frameC, dest, cv::Size(inputWidth, inputHeight));

    // TF_NewTensor expects a raw data pointer, so pass the Mat's
    // underlying buffer (dest.data), not the Mat object itself; the
    // Mat must also hold float data (CV_32F) to match TF_FLOAT.
    TF_Tensor *tensor = TF_NewTensor(TF_FLOAT, dims, 4, dest.data, size,
                                     &deallocator, nullptr);
    csession.SetInputs({{placeholder, tensor}});
    std::chrono::steady_clock::time_point beginRun = std::chrono::steady_clock::now();
    csession.Run(s);
}

Can you point me to how you ran it in C++?
Thank you

@jimaldon

I have an independent project that uses a Makefile and the TensorFlow shared object file instead of bazel to build. What's the g++ equivalent of -c opt here?

@CasiaFan

CasiaFan commented May 2, 2018

In my case, adding all the optimization options available for my CPU during the bazel build works for me:
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 //tensorflow:libtensorflow_cc.so
