
TF-TRT vs UFF-TensorRT #341

Closed
PythonImageDeveloper opened this issue Jan 17, 2020 · 2 comments
Labels
Framework: TensorFlow · question (Further information is requested)

Comments


PythonImageDeveloper commented Jan 17, 2020

I have found that we can optimize a TensorFlow model in several ways. Please correct me if I am mistaken.

1- Using TF-TRT. This API is developed by TensorFlow and integrates TensorRT into TensorFlow; it is imported as:
from tensorflow.python.compiler.tensorrt import trt_convert as trt
This API can be applied to any TensorFlow model (new or old) without conversion errors, because if the API doesn't support some new layer, it simply excludes that layer from the TensorRT engines; the layer stays in the TensorFlow graph and runs on TensorFlow. Right?
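For reference, a minimal sketch of this workflow (TF 1.x `TrtGraphConverter` API, current as of early 2020; the SavedModel paths are illustrative):

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel with TF-TRT. Ops that TensorRT does not support
# are left as TensorFlow ops instead of failing the conversion.
converter = trt.TrtGraphConverter(
    input_saved_model_dir='./saved_model',   # illustrative path
    precision_mode='FP16')
converter.convert()
converter.save('./saved_model_trt')
```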

2- Using TensorRT directly. This API is developed by NVIDIA and is independent of the TensorFlow library (not integrated into TensorFlow); it is imported as:
import tensorrt as trt
To use this API, we must first convert the TensorFlow graph to UFF with the uff converter and then parse the UFF graph into this API.
In this case, if the TensorFlow graph has unsupported layers, we must write a plugin or custom code for those layers, right?
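For reference, a minimal sketch of the UFF route (TensorRT 6/7 Python API; the frozen-graph file name, node names, and input shape below are illustrative):

```python
import uff
import tensorrt as trt

# 1. Convert the frozen TensorFlow graph to a serialized UFF model.
uff_model = uff.from_tensorflow_frozen_model(
    'frozen_inference_graph.pb', output_nodes=['NMS'])

# 2. Parse the UFF model and build a TensorRT engine. Unlike TF-TRT,
#    parsing fails on unsupported layers unless a plugin provides them.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network() as network, \
        trt.UffParser() as parser:
    parser.register_input('Input', (3, 300, 300))  # CHW shape, illustrative
    parser.register_output('NMS')
    parser.parse_buffer(uff_model, network)
    builder.max_workspace_size = 1 << 30
    builder.fp16_mode = True
    engine = builder.build_cuda_engine(network)
```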

3- I don't understand why, when we work with TensorFlow models, we would use the UFF converter and then TensorRT when we can use the TF-TRT API directly, right? If so, have you tested models optimized by these two methods to see whether they reach the same performance? What is the advantage of the UFF converter method?

I have some questions about the two cases above:
4- I converted ssd_mobilenet_v2 using both cases. In case 1 I achieved a slight improvement in speed, but in case 2 I achieved a bigger improvement. Why?
My guess is that in case 1 the API only converts the precision (FP32 to FP16) and fuses layers where possible, but in case 2 the graph is first cleaned up by the UFF converter (redundant nodes such as Asserts and Identity are removed) and only then converted to a TensorRT graph. Right?

5- When we convert trained model files (.ckpt, .meta, ...) to a frozen inference graph (.pb file), aren't those layers removed from the graph? Or are only the loss states, optimizer states, etc. removed?
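For context, freezing a checkpoint is typically done like this (TF 1.x; the checkpoint path and output node name are illustrative). Freezing folds variables into constants and drops nodes not needed to compute the listed outputs (e.g., the loss and optimizer subgraphs), but inference-path nodes such as Identity and Assert can remain:

```python
import tensorflow as tf

with tf.Session() as sess:
    # Restore the graph and weights from the checkpoint files.
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    saver.restore(sess, 'model.ckpt')
    # Keep only what is reachable from the listed output nodes and
    # replace variables with constants.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['detection_output'])
with tf.gfile.GFile('frozen_inference_graph.pb', 'wb') as f:
    f.write(frozen.SerializeToString())
```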

rmccorm4 added the Framework: TensorFlow and question labels on Jan 22, 2020
rmccorm4 (Collaborator) commented Jan 22, 2020

Hi @PythonImageDeveloper,

There is an additional route: use tf2onnx (https://github.com/onnx/tensorflow-onnx) to convert TensorFlow -> ONNX, and then use the TensorRT API to convert ONNX -> TensorRT. This is likely better supported than UFF -> TensorRT, since the UFF parser will be deprecated in the future.
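A minimal sketch of that ONNX route (the file names, tensor names, and opset below are illustrative):

```python
# First convert the frozen graph with tf2onnx (run in a shell):
#   python -m tf2onnx.convert --graphdef frozen_inference_graph.pb \
#       --inputs Input:0 --outputs NMS:0 --opset 11 --output model.onnx
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

# Then parse the ONNX file and build a TensorRT engine.
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network(EXPLICIT_BATCH) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser:
    with open('model.onnx', 'rb') as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
    builder.max_workspace_size = 1 << 30
    engine = builder.build_cuda_engine(network)
```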

I don't know, when we work with Tensorflow models, Why we use UFF converter then TensorRT, we can use directly TF-TRT API, right?

In general, I believe TensorRT will outperform TF-TRT if you can convert your entire model as is. However, when there are unsupported operations, TF-TRT will fall back to TF ops as you said, which makes it a bit easier to use, whereas you'd have to implement plugin layers if only using TRT.

I don't think there are any up-to-date performance comparisons between pure TRT and TF-TRT. Similarly, I don't know how the performance stacks up between using fallback TF ops vs. TRT plugin layers for unsupported ops.

@jkjung-avt

jkjung-avt/tensorrt_demos#43 (comment)
