
Recognize digits model (tensorflow version) on TensorRT #8790

Closed
sidgoyal78 opened this issue Mar 6, 2018 · 5 comments
Labels: 预测 (formerly "Inference"; covers C-API inference questions, etc.)

Comments

@sidgoyal78 (Contributor):

In order to compare the performance of the C++ inference framework with that of TensorRT (#8671), we need to get the TensorFlow version of the recognize-digits model running with TensorRT. (For now we must use TensorFlow's model, since we don't yet have a Fluid-to-TensorRT converter.)

@sidgoyal78 sidgoyal78 self-assigned this Mar 6, 2018
@sidgoyal78 sidgoyal78 added the 预测 原名Inference,包含Capi预测问题等 label Mar 6, 2018
@sidgoyal78 (Contributor, Author) commented Mar 7, 2018:

Update: I managed to get the TensorFlow model corresponding to the recognize_digits example (essentially using @dzhwinter's example: https://github.com/dzhwinter/benchmark/blob/master/tensorflow/mnist.py) and to run it successfully with TensorRT.

I modified the sample Python examples provided in the TensorRT container (https://devblogs.nvidia.com/tensorrt-container/) to do the following:

  • Train a recognize-digits model using TensorFlow in Python.
  • Save the model
  • Use the parser in TensorRT to convert this into a UFF format.
  • Use python bindings of TensorRT to read and run this model.
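
The steps above can be sketched with the TensorRT 3.x Python API that ships in NVIDIA's container. This is a hedged sketch, not the author's actual code (which was never posted): the node names "images"/"logits", the function name, and the workspace size are assumptions modeled on the TensorRT 3 container examples, and the module paths may differ in later TensorRT releases.

```python
def build_engine_from_frozen_graph(pb_path, max_batch=256):
    # Imports kept inside the function: these modules are only available
    # inside NVIDIA's TensorRT 3.x container (hypothetical environment).
    import uff
    import tensorrt as trt
    from tensorrt.parsers import uffparser

    logger = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

    # Step 3: convert the frozen TensorFlow graph to the UFF format.
    # "logits" is an assumed output-node name for the MNIST graph.
    uff_model = uff.from_tensorflow_frozen_model(pb_path, ["logits"])

    # Describe the network I/O to the UFF parser (CHW input for 28x28 MNIST).
    parser = uffparser.create_uff_parser()
    parser.register_input("images", (1, 28, 28), 0)
    parser.register_output("logits")

    # Step 4: build an engine that the TensorRT Python bindings can execute.
    engine = trt.utils.uff_to_trt_engine(
        logger, uff_model, parser, max_batch, 1 << 20)
    parser.destroy()
    return engine
```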

I will put the code up somewhere (after figuring out the license/acknowledgement aspect, since I have modified code that Nvidia classifies as "commercial").

Meanwhile, below are the results:

batch size | time (ms)
-----------|----------
1          | 0.084
2          | 0.114
8          | 0.087
32         | 0.131
128        | 0.145
256        | 0.148

The GPU that was used for profiling is a GeForce GTX 1080 Ti.

@Xreki (Contributor) commented Mar 8, 2018:

Thanks.

The running time increases with the batch size, which is a little strange.

@sidgoyal78 (Contributor, Author):

I think the reason is that in the current benchmarking code I included the time for memory allocation on the device, as well as the time taken to transfer the input array (from the Python runtime) to the device.

@sidgoyal78 (Contributor, Author):

@Xreki: I have updated the timings. (I basically removed the cudaMalloc and memcpy parts and timed only the execution.) Now I think the values make more sense.
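
The measurement scheme described here (allocate buffers and copy inputs once, then time only the execution step) can be sketched as a small, engine-agnostic helper. This is an illustrative sketch, not the issue's benchmark code; `run_inference` is a hypothetical zero-argument callable that launches inference on buffers prepared before timing starts.

```python
import time

def time_execution(run_inference, warmup=10, iters=100):
    """Return the mean execution time in ms, excluding cudaMalloc/memcpy.

    `run_inference` must only launch the already-prepared inference call;
    device allocation and host-to-device transfer happen before this
    function is entered, so they never appear in the measurement.
    """
    for _ in range(warmup):  # discard warmup runs (caches, JIT, autotuning)
        run_inference()
    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1000.0  # mean milliseconds per run
```

With a GPU engine, `run_inference` would wrap the engine's execute call and a stream synchronize, so the timer measures completed work rather than just kernel launches.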

@Xreki (Contributor) commented Mar 12, 2018:

So, did you compare Fluid with TensorRT, and what are the results?

@Xreki Xreki added this to Performance Tuning (DOING) in Inference Framework Apr 3, 2018