Extremely slow. Is facenet's Inception Resnet v1 supported with TensorRT? #8

congphase · 2019-10-22T08:56:47Z

@JerryJiaGit
Hello, I replaced face.py, facenet.py, and detect_face.py as you advised, but when I run facenet's predict.py with the 2 frozen models from https://github.com/davidsandberg/facenet/wiki#pre-trained-models, it runs extremely slowly (it prints a lot of output and then hangs), so I have to stop it. I am running on a Jetson Nano. Sorry for not providing the output, my bad.

I have searched and suspect that the Inception Resnet v1 network architecture has some layers and ops that TensorRT does not support. I'm not sure how to handle this, so please give me some advice. Thank you a lot!!

JerryJiaGit · 2019-10-23T00:43:21Z

Thanks for trying this on Nano. I don't have a Nano, so I can't test this for you. But I did test on Xavier, and the result there is as expected.

First, neither Inception Resnet v1 nor MTCNN has full layer support, but TRT can leave the unsupported layers to TensorFlow. You can check the logs: https://github.com/JerryJiaGit/facenet_trt/blob/master/log_xavier_trt5.txt
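
One way to see how much of a converted graph actually ended up in TensorRT is to count op types. The sketch below assumes that converted segments appear as nodes with op type TRTEngineOp (the name TF-TRT uses) and that everything else stays in TensorFlow. The helper name and the demo nodes are hypothetical; with a real converted graph you would pass graph_def.node instead.

```python
from collections import Counter, namedtuple


def summarize_conversion(nodes):
    """Return (trt_engine_count, remaining_tf_op_counts).

    Each node only needs an `.op` attribute, as the nodes of a
    TensorFlow GraphDef have.
    """
    ops = Counter(node.op for node in nodes)
    trt_engines = ops.pop("TRTEngineOp", 0)  # segments handed to TensorRT
    return trt_engines, ops


# Stand-in nodes for illustration only (not taken from the real model).
Node = namedtuple("Node", ["name", "op"])
demo_nodes = [
    Node("input", "Placeholder"),
    Node("TRTEngineOp_0", "TRTEngineOp"),
    Node("resize", "ResizeBilinear"),
    Node("TRTEngineOp_1", "TRTEngineOp"),
]

engines, leftover = summarize_conversion(demo_nodes)
print("TensorRT segments:", engines)
print("Ops left in TensorFlow:", dict(leftover))
```

A graph with many small segments and many leftover ops spends more time crossing between TensorFlow and TensorRT, which may explain part of a slowdown.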

Second, initialization needs a few minutes to convert layers, so check your log to see whether it hangs at init or is extremely slow during face recognition.
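
If the log alone does not make this clear, timing the two phases separately is a quick check. This is a minimal sketch, not code from the repository: init_fn and infer_fn are hypothetical placeholders for whatever loads and converts the model and whatever processes one frame, and the sleep-based stand-ins exist only so the example runs by itself.

```python
import time


def time_phases(init_fn, infer_fn, frames):
    """Time one-off initialization separately from per-frame inference."""
    start = time.perf_counter()
    model = init_fn()
    init_seconds = time.perf_counter() - start

    per_frame = []
    for frame in frames:
        start = time.perf_counter()
        infer_fn(model, frame)
        per_frame.append(time.perf_counter() - start)
    return init_seconds, per_frame


# Hypothetical stand-ins so the sketch is self-contained.
def fake_init():
    time.sleep(0.05)  # pretend this is model loading and TRT conversion
    return "model"


def fake_infer(model, frame):
    time.sleep(0.001)  # pretend this is detection and recognition


init_s, frame_s = time_phases(fake_init, fake_infer, range(5))
mean_frame = sum(frame_s) / len(frame_s)
print(f"init: {init_s:.3f}s, mean per frame: {mean_frame:.4f}s")
```

A long init with fast frames points at conversion time; slow frames point at the inference path itself.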

Third, I suggest you first try a modern NVIDIA dGPU to make sure the code itself works. The Nano's Tegra has a Maxwell-architecture GPU; please check the links below for some suggestions:

https://devtalk.nvidia.com/default/topic/1057812/jetson-nano/optimize-tf-trt-models-on-jetson-nano-to-improve-inference-timing-and-efficiency/

https://devtalk.nvidia.com/default/topic/1051546/jetson-nano/optimizing-tf-trt-load-time/

https://jkjung-avt.github.io/tf-trt-on-nano/

congphase · 2019-10-23T01:44:19Z

Thank you for the quick reply, I really appreciate it :) I'll look into it and update here when there are any improvements.

congphase · 2019-10-25T02:23:37Z

Hello @JerryJiaGit

I currently don't have time to test the above suggestions, but I have a small question that I hope you can help with. Is it right that these are TF-TRT methods rather than pure TensorRT? As far as I know, the pure TensorRT method converts a TensorFlow frozen trained model to a .uff file, which is then used to create a TensorRT engine that the code drives through the API from a direct import tensorrt (not import tensorflow.contrib.tensorrt). Pure TensorRT is assumed to be faster than TF-TRT, so why does everyone, including you, still use TF-TRT? Why not use pure TensorRT? Does TF-TRT have additional advantages?

JerryJiaGit · 2019-10-25T07:32:52Z

The purpose is to get a perf improvement while minimizing code changes.
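
To illustrate why the code change is small, here is a rough sketch of a TF-TRT conversion step using tensorflow.contrib.tensorrt from TensorFlow 1.x. It is not necessarily how facenet_trt does it: both helper names are hypothetical, and the argument values (FP16, a 32 MB workspace, a minimum segment size of 3) are assumptions chosen for illustration.

```python
def trt_conversion_kwargs(precision_mode="FP16", max_batch_size=1):
    """Build arguments for create_inference_graph (illustrative values)."""
    return {
        "max_batch_size": max_batch_size,
        "max_workspace_size_bytes": 1 << 25,  # 32 MB, an assumption
        "precision_mode": precision_mode,
        "minimum_segment_size": 3,  # skip very small TensorRT segments
    }


def convert_with_tf_trt(frozen_graph_def, output_names, **overrides):
    """Return a GraphDef in which supported subgraphs run through TensorRT.

    Requires TensorFlow 1.x built with TensorRT support, so the import
    is done lazily inside the function.
    """
    import tensorflow.contrib.tensorrt as trt

    kwargs = trt_conversion_kwargs()
    kwargs.update(overrides)
    return trt.create_inference_graph(
        input_graph_def=frozen_graph_def,
        outputs=output_names,
        **kwargs
    )


print(trt_conversion_kwargs())
```

The returned GraphDef is imported and run with the usual TensorFlow session code, so the rest of the pipeline stays as it was. A pure TensorRT engine would instead need its own input/output handling and would not have TensorFlow to fall back on for unsupported layers.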

biyuehuang · 2020-06-08T07:18:33Z

Hi @JerryJiaGit, thank you for sharing! I used your code and it runs on NV Xavier successfully. However, when I run ./contributed/real_time_face_recognition.py I get only 4 FPS. Do you know of a TensorRT engine method to improve speed?
Input video: 1x 720p@30fps
SVM classifier trained on 7 persons, 11 images in total, using python3 ./src/classifier.py TRAIN.

JerryJiaGit · 2020-06-30T01:42:29Z

4 FPS is too low. Did you try a different L4T image for your Xavier? I encountered a similar low-performance issue with a very old image.
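
To check which L4T release a board is running, one option is to read /etc/nv_tegra_release. The sketch below assumes the first line of that file has the usual "# R32 (release), REVISION: 4.3, ..." shape; the helper names are hypothetical and the sample line is illustrative, not taken from a real board.

```python
import re
from pathlib import Path

RELEASE_FILE = Path("/etc/nv_tegra_release")


def parse_l4t_release(text):
    """Extract (major, revision), e.g. ('32', '4.3'), or None if absent."""
    match = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+([\d.]+)", text)
    return match.groups() if match else None


def read_l4t_release(path=RELEASE_FILE):
    """Parse the release file if it exists (it will not on non-Jetson hosts)."""
    return parse_l4t_release(path.read_text()) if path.exists() else None


# Illustrative sample line; the field values are made up.
sample = "# R32 (release), REVISION: 4.3, GCID: 0, BOARD: t186ref, EABI: aarch64"
print(parse_l4t_release(sample))
print(read_l4t_release())
```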

congphase closed this as completed Apr 20, 2021
