
slow inference on jetson tx2 #23

Closed · IbrahimBond opened this issue Dec 9, 2019 · 9 comments

@IbrahimBond

I have tested this demo on a Jetson TX2, and the inference speed is 22 FPS. I expected better performance on a TX2 than on a Jetson Nano. Do you have any insights for achieving better results? And what are the expected speeds on a TX2?

Has anyone tried it on a TX2, and what were the results?

Thanks

@jkjung-avt (Owner)

Which demo are you referring to? Is it the SSD one?

Please also specify which versions of JetPack and TensorFlow you are using.

@IbrahimBond (Author)

I am referring to the SSD one. I am using TensorFlow 1.14 and JetPack 4.2 with TensorRT 5.

I have tried trt_ssd_async.py for inference and managed to get 25 FPS, but I think this is slow for an optimized model on a Jetson TX2.

@jkjung-avt (Owner)

That indeed seems too slow. Unfortunately, I don't currently have a TX2 to verify it on.

Did you notice any suspicious warnings or errors when you built the TensorRT engine and ran inference?
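
If nothing obvious showed up, you could reload the engine with an INFO-level TensorRT logger so that all warnings get printed. Just a rough sketch (the .bin filename is a placeholder for whichever engine file you built):

    # Rough sketch: deserialize the engine with an INFO-level logger so
    # TensorRT prints its warnings/info messages (TensorRT 5 Python API).
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.INFO)

    with open('TRT_ssd_mobilenet_v2_coco.bin', 'rb') as f, \
         trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    print('engine loaded OK:', engine is not None)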

@IbrahimBond (Author)

It seems the conversion went smoothly, and it does improve performance by almost 50%: I tested the model without TensorRT optimization and it ran at 12-14 FPS.

I am really disappointed with the mobilenetv2 model; I thought it would do better than yolov3-tiny. I am currently achieving 30 FPS with yolov3-tiny (416x416), which is better than the optimized mobilenetv2 (300x300) model.

Do you have any idea why the mobilenetv2 model would not perform at the numbers published by TensorFlow?

I have also tested mobilenetv3, and it performs similarly to mobilenetv2 (14 FPS).
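
(For context, by "without TensorRT optimization" I mean plain TF session inference on the frozen graph, roughly like this sketch; the .pb path is a placeholder and the node names are the standard TF object-detection API ones.)

    # Rough sketch of plain TF 1.14 inference on a frozen SSD graph
    # (no TensorRT), i.e. the unoptimized baseline.
    import numpy as np
    import tensorflow as tf

    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')

    with tf.Session(graph=graph) as sess:
        img = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # dummy 300x300 frame
        boxes, scores = sess.run(
            ['detection_boxes:0', 'detection_scores:0'],
            feed_dict={'image_tensor:0': img})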

@jkjung-avt (Owner)

@IbrahimBond (Author)

Thank you, this is very informative. Then maybe in my case I am better off using the SSD Inception model. What do you think?

@jkjung-avt (Owner)

I think it's worth digging into why the FPS on your TX2 is not better than my test result on the Nano.

  1. Have you tested both USB webcam input and image/video file input? Do you observe a big difference in FPS between these 2 cases?

  2. If it doesn't trouble you too much, could you profile the code on the TX2 and provide the log to me? I might be able to spot problems by looking at the profiler output. Reference: https://jkjung-avt.github.io/optimize-mtcnn/
     For example, run trt_ssd.py with the following command for, say, 60 seconds, then copy and paste the profiler output.

     $ python3 -m cProfile -s cumtime trt_ssd.py --model ssd_mobilenet_v1_coco --image --filename test.jpg
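
Alternatively, a quick manual timing would tell us whether frame grabbing or inference dominates on the TX2. A minimal sketch (detect() here is just a stand-in for the SSD detection call in trt_ssd.py):

    # Minimal timing sketch: separate frame-grab time from inference time.
    import time
    import cv2

    cap = cv2.VideoCapture(0)  # USB webcam; pass a file path to compare
    grab_t = infer_t = 0.0
    frames = 0
    while frames < 300:
        t0 = time.time()
        ok, frame = cap.read()
        if not ok:
            break
        t1 = time.time()
        boxes = detect(frame)  # stand-in for trt_ssd's detection step
        t2 = time.time()
        grab_t += t1 - t0
        infer_t += t2 - t1
        frames += 1
    print('avg grab: %.1f ms, avg infer: %.1f ms'
          % (1000 * grab_t / frames, 1000 * infer_t / frames))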

@IbrahimBond (Author)

I am off until Monday; I'll get back to you then.

@jkjung-avt (Owner)

Any update? Otherwise, I'll close this issue due to inactivity.
