Extremely slow. Is facenet's Inception Resnet v1 supported with TensorRT? #8

Closed
congphase opened this issue Oct 22, 2019 · 6 comments

Comments

@congphase

@JerryJiaGit
Hello, I replaced face.py, facenet.py, and detect_face.py as you advised, but when I run facenet's predict.py with the two frozen models from https://github.com/davidsandberg/facenet/wiki#pre-trained-models, it runs extremely slowly: it prints a lot of output and then hangs, so I have to stop it. I am running on a Jetson Nano. Sorry for not providing the output.

From what I have read, I suspect that the Inception Resnet v1 architecture has some layers and ops that TensorRT does not support. I'm not sure how to handle this, so any advice would be appreciated. Thank you very much!

@JerryJiaGit
Owner

Thanks for trying this on the Nano. I don't have a Nano, so I can't test this for you, but I did test on Xavier and the results were as expected.

First, neither Inception Resnet v1 nor MTCNN has full layer support, but the TRT integration falls back to TensorFlow for the unsupported layers. You can check the log: https://github.com/JerryJiaGit/facenet_trt/blob/master/log_xavier_trt5.txt
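One way to see how much of a graph was actually converted is to count node types after conversion. The sketch below assumes the converted graph's op types have already been collected into a plain list; `summarize_trt_coverage` and `sample_ops` are hypothetical names for illustration, not part of facenet_trt. It relies on TF-TRT replacing each converted subgraph with one `TRTEngineOp` node, so everything else in the list still runs in TensorFlow.

```python
from collections import Counter


def summarize_trt_coverage(op_types):
    """Split a converted graph's op types into TensorRT engines and TensorFlow fallbacks.

    Returns (engine_nodes, fallback_nodes, fallback_histogram).
    """
    counts = Counter(op_types)
    # TF-TRT replaces each converted subgraph with a single TRTEngineOp node.
    engine_nodes = counts.pop("TRTEngineOp", 0)
    # Every remaining node is still executed by TensorFlow.
    fallback_nodes = sum(counts.values())
    return engine_nodes, fallback_nodes, counts


# Hypothetical op list for illustration only. With a real converted GraphDef
# it would come from: [node.op for node in graph_def.node]
sample_ops = ["Placeholder", "TRTEngineOp", "Conv2D", "Relu", "TRTEngineOp", "Conv2D"]
engines, fallbacks, histogram = summarize_trt_coverage(sample_ops)
print(engines, fallbacks, histogram.most_common(3))
```

Note that a small number of engine nodes is not bad in itself, since one engine can absorb many original layers; the fallback histogram is the more useful signal for spotting unsupported ops.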

Second, initialization needs a few minutes to convert the layers, so check your log to see whether it hangs during init or is extremely slow during face recognition.
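If the log does not make the distinction obvious, timing the two phases separately can help. This is a minimal sketch, not code from this repository: `time_phases`, `fake_init`, and `fake_infer` are hypothetical stand-ins, and in practice the init callable would wrap model loading and conversion while the inference callable would wrap one recognition pass.

```python
import time


def time_phases(init_fn, infer_fn, inputs):
    """Time one-off initialization separately from each inference call.

    Returns (init_seconds, [seconds_per_call, ...]).
    """
    start = time.perf_counter()
    model = init_fn()
    init_seconds = time.perf_counter() - start

    per_call = []
    for item in inputs:
        start = time.perf_counter()
        infer_fn(model, item)
        per_call.append(time.perf_counter() - start)
    return init_seconds, per_call


# Stand-ins for illustration only: a slow "conversion" and a fast "inference".
def fake_init():
    time.sleep(0.05)
    return object()


def fake_infer(model, frame):
    time.sleep(0.001)


init_seconds, per_call = time_phases(fake_init, fake_infer, inputs=[0, 1, 2])
print("init: %.3f s" % init_seconds)
print("mean inference: %.4f s" % (sum(per_call) / len(per_call)))
```

A long init with short per-call times points at conversion cost, while the reverse points at the recognition path itself.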

Third, I suggest first trying a modern NVIDIA dGPU to make sure the code itself works. The Nano's Tegra has a Maxwell-architecture GPU; please check the links below for some suggestions:

@congphase
Author

congphase commented Oct 23, 2019

Thank you for the quick reply, I really appreciate it :) I'll look into it and update here when there is any improvement.

@congphase
Author

Hello @JerryJiaGit

I don't currently have time to test the suggestions above, but I have a small question that I hope you can help with. Am I right that these methods use TF-TRT, which is not pure TensorRT? As far as I know, the pure TensorRT method converts a frozen TensorFlow model to a .uff file, which is then used to build a TensorRT engine that is driven directly through the tensorrt API (import tensorrt, not import tensorflow.contrib.tensorrt). Pure TensorRT is assumed to be faster than TF-TRT, so why does everyone, including you, still use TF-TRT? Why not use pure TensorRT? Does TF-TRT have additional advantages?

@JerryJiaGit
Owner

The purpose is to get a performance improvement while minimizing code changes.

@biyuehuang

biyuehuang commented Jun 8, 2020

Hi @JerryJiaGit, thank you for sharing! I ran your code on an NVIDIA Xavier successfully. However, when I run ./contributed/real_time_face_recognition.py I get only 4 FPS. Do you know of a TensorRT engine method to improve the speed?
Input video: 1x 720p@30fps
SVM classifier trained on 7 people, 11 images in total, using python3 ./src/classifier.py TRAIN.
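For scale, the two frame rates quoted here (30 fps input, 4 FPS measured) can be turned into per-frame times. This is only arithmetic on those two numbers; `frame_time_ms` is a hypothetical helper, and nothing below is measured on a device.

```python
def frame_time_ms(fps):
    """Milliseconds available (or spent) per frame at a given frame rate."""
    return 1000.0 / fps


budget_ms = frame_time_ms(30)    # time per frame needed to keep up with the input
measured_ms = frame_time_ms(4)   # time per frame implied by the reported 4 FPS
slowdown = measured_ms / budget_ms

print("budget per frame:   %.1f ms" % budget_ms)
print("measured per frame: %.1f ms" % measured_ms)
print("slowdown factor:    %.1fx" % slowdown)
```

In other words, the pipeline spends about 250 ms on each frame against a budget of roughly 33 ms, so it keeps up with only about one input frame in every seven or eight.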

@JerryJiaGit
Owner

4 FPS is too low. Did you try a different L4T image for your Xavier? I ran into a similar low-performance issue with a very old image.
