TensorRT supported detection networks #6
It's possible that it would work, but we haven't tested it. Currently, the build_detection_graph method provided in this repository is tested only against the listed models. That said, for similar meta-architectures (SSD), configurations with different feature extractors may work. The feature extractors registered with the tensorflow/models repository are listed here. You would need to update the object detection configuration proto to select the desired feature extractor. In theory, the TensorRT integration in TensorFlow should support any model, since operations not supported by TensorRT fall back to native TensorFlow. That said, there may be caveats. Please let me know if you run into issues.
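For reference, the usual workflow with this repository looks roughly like the sketch below (the model name is illustrative; see the README for the exact API):

```python
from tf_trt_models.detection import download_detection_model, build_detection_graph

# Download one of the listed, tested models (name is illustrative).
config_path, checkpoint_path = download_detection_model('ssd_mobilenet_v1_coco')

# Build a frozen inference graph from the pipeline config and checkpoint.
frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path
)
```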
Thank you for the answer. I will report back here after I try Faster R-CNN with different backbones.
I was able to convert "faster_rcnn_resnet101_coco"; however, in order to use it you should modify the config file to use a fixed input image size. Modify lines 4-8:
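Lines 4-8 of that config hold the image_resizer block; the change presumably looks like the sketch below (the fixed dimensions are illustrative and should match your model):

```
# Before (default faster_rcnn_resnet101_coco pipeline config):
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

# After (fixed input size; 600x1024 is an illustrative choice):
image_resizer {
  fixed_shape_resizer {
    height: 600
    width: 1024
  }
}
```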
@jaybdub-nv, @bezero: when I tried to convert "faster_rcnn_resnet50_coco" with TF-TRT on the TX2, I ran into a few other issues. I wonder how you got around them. Any help/suggestion is highly appreciated.
@jkjung-avt I also had memory issues. I solved them by closing my browser, since it was consuming memory (if possible, close all idle applications that use memory). If I am not wrong, the scripts in this repo work with max_batch_size=1, so try to work with single images. For batch sizes greater than 1, TX2 memory might not be sufficient.
@bezero Thanks for the reply. But closing the web browser and all other applications on the TX2 did not solve the OOM issue for me. I also used single-image input for faster_rcnn_resnet50. I had to reduce the number of proposals/detections in the model config to some very small numbers to get around it.
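For reference, the relevant knobs in a Faster R-CNN pipeline config look roughly like this (the reduced values below are illustrative, not the exact numbers used above):

```
# Cap the number of region proposals from the first stage (default is 300).
first_stage_max_proposals: 20

# Cap final detections in the second-stage non-max suppression.
second_stage_post_processing {
  batch_non_max_suppression {
    score_threshold: 0.3
    iou_threshold: 0.6
    max_detections_per_class: 10
    max_total_detections: 20
  }
  score_converter: SOFTMAX
}
```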
I am able to run faster_rcnn_resnet50_coco, which is included in the list of supported models, but I don't seem to be getting any speedup, which makes me skeptical that any subgraphs are being optimized at all. In order to get it to run, I used the following call to build the graph (along with the other code included in the Jupyter example): trt_graph = trt.create_inference_graph(...). I am wondering if anyone has had success in speeding up any form of Faster R-CNN, and if so, could you share some insight into what settings need to be adjusted or how to get the graph conversion to work correctly?
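For context, a typical TF 1.x tf.contrib.tensorrt invocation looks like the sketch below; the specific argument values are assumptions, not necessarily the ones used in the comment above:

```python
import tensorflow.contrib.tensorrt as trt

# frozen_graph and output_names come from build_detection_graph (see above).
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,      # frozen TF GraphDef to optimize
    outputs=output_names,              # names of the output nodes
    max_batch_size=1,                  # single-image batches on the TX2
    max_workspace_size_bytes=1 << 25,  # scratch memory TensorRT may use
    precision_mode='FP16',             # the TX2 benefits from half precision
    minimum_segment_size=50            # skip tiny subgraphs not worth converting
)
```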
I shared my test results on the Jetson TX2 developer forum: https://devtalk.nvidia.com/default/topic/1037019/jetson-tx2/tensorflow-object-detection-and-image-classification-accelerated-for-nvidia-jetson/post/5288250/#5288250 Note that I had to reduce the number of region proposals in the Faster R-CNN models, otherwise they ran too slowly. All the code I used for testing can be found in my GitHub repository: https://github.com/jkjung-avt/tf_trt_models
I am facing the following error while trying to get the Faster R-CNN model running on TensorRT. I have tried changing the resizer as per @bezero's comment above, but it doesn't help yet. Any pointers would be highly appreciated.
.cc:724] Can't determine the device, constructing an allocator at device 0
@bezero @jkjung-avt Did you run your Faster R-CNN models in the Jupyter notebook? The notebook code works fine for me with the SSD models, but if I try the Faster R-CNN models I get an error. Can either of you reproduce this issue? I'm using TensorFlow 1.12 and TensorRT 5.0.
I haven't managed to get the faster_rcnn_resnet50 model to work with TensorFlow 1.12.0 and TensorRT. Previously I got it to work using TensorFlow 1.8.0, with some tweaks. Details are all in my GitHub repository: https://github.com/jkjung-avt/tf_trt_models/blob/master/data/faster_rcnn_resnet50_egohands.config
Hi @jkjung-avt, I used your config file: https://github.com/jkjung-avt/tf_trt_models/blob/master/data/faster_rcnn_inception_v2_egohands.config to train a model on my dataset (13 classes) and then tried to convert it to TRT, but still got an error.
@CharlieXie, try setting remove_assert to False; I recall that's how I got rid of the problem previously. https://github.com/NVIDIA-AI-IOT/tf_trt_models/blob/master/tf_trt_models/detection.py#L108
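Concretely, assuming build_detection_graph exposes the flag as the linked line suggests, the call would look something like this sketch (paths are placeholders):

```python
from tf_trt_models.detection import build_detection_graph

# Keep the Assert ops in the graph instead of stripping them
# (remove_assert defaults to True in detection.py).
frozen_graph, input_names, output_names = build_detection_graph(
    config='path/to/pipeline.config',  # placeholder path
    checkpoint='path/to/model.ckpt',   # placeholder path
    remove_assert=False
)
```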
@jkjung-avt, I am using your tf_trt_models. When I run python3 camera_tf_trt.py --image --filename=xxx --model=faster_rcnn_resnet50_coco --build, I get an error. When I run the same command without --build, there is no error, but the detection result is not good.
@xiaowenhe The segmentation fault could be caused by an out-of-memory issue. You could use tegrastats to monitor JTX2 memory usage and try to confirm whether that's the case. As for the bad detection results from the TF-TRT-optimized faster_rcnn_resnet50_coco model, I'm not exactly sure what the problem is; there could be many causes.
@jkjung-avt, thank you! But I am not using a TX2; I want to test on another machine first and then move to the TX2. From the GPU screenshot, only 5285 MB is used.
I cannot get better performance using the TensorRT-optimized graph. Can someone tell me why? After optimizing the frozen graph, I even get a bigger model.
@hoangtuanvu What do you mean by not being able to optimize? TensorRT optimizes your frozen model for inference, which does not mean that you get a smaller model. Did you compare inference time before and after TensorRT optimization?
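One way to compare is to time the frozen graph before and after conversion; a minimal TF 1.x sketch (tensor names, input shape, and dtype are placeholders that depend on your model):

```python
import time
import numpy as np
import tensorflow as tf

def average_inference_time(graph_def, input_name, output_names,
                           shape=(1, 300, 300, 3), runs=50):
    """Average per-run inference time for a frozen GraphDef (TF 1.x)."""
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name='')
    # Detection graphs usually take a uint8 image_tensor; adjust to your model.
    image = np.random.randint(0, 256, size=shape, dtype=np.uint8)
    with tf.Session(graph=graph) as sess:
        fetches = [graph.get_tensor_by_name(n + ':0') for n in output_names]
        feed = {graph.get_tensor_by_name(input_name + ':0'): image}
        for _ in range(5):  # warm-up runs, excluded from timing
            sess.run(fetches, feed_dict=feed)
        start = time.time()
        for _ in range(runs):
            sess.run(fetches, feed_dict=feed)
    return (time.time() - start) / runs

# e.g. compare average_inference_time(frozen_graph, 'image_tensor', output_names)
# against average_inference_time(trt_graph, 'image_tensor', output_names)
```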
@bezero I used TensorRT to optimize the frozen graph, but I did not get better inference speed. I am currently working on person detection.
I'm having a similar situation to @atyshka: no improvement whatsoever. The only difference after generating an "optimized" graph is that with every frame I get the warning "Engine buffer is full". Has anyone figured out how to deal with this?
Although I ran the detection demo using ssd_mobilenet_v1_coco.pb, I found that if I used FP16 in…
@hoangtuanvu I am facing an issue while running inference on a TensorFlow object detection model (242 MB). I have TF 1.13 and TensorRT 5.1.2. Below are the log details:
[10663.666441] [12648] 1000 12648 6243614 1507814 3582 12 0 0 python3
Any suggestion or feedback is appreciated.
Hi @zhucheng725, do you have any update on this? Thanks.
Hello, not yet.
I tried to run faster_rcnn_inception_v2 and got the following error. Does anyone have any clue about this? Any suggestion or advice would definitely help me continue my learning by understanding these concepts. Thanks.
InvalidArgumentError: node BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Slice (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) has inputs from different frames. The input node BatchMultiClassNonMaxSuppression/map/while/Reshape_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) is in frame 'BatchMultiClassNonMaxSuppression/map/while/while_context'. The input node BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Slice/begin (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) is in frame ''.
I'm having the same issue. Is there any update on this one? What is the meaning of this error anyway?
If I am not wrong, the error states that the system does not have enough memory to run the faster_rcnn_inception_v2 model.
Thanks for the quick answer! I'm running a different model using TRT and my memory is normal during execution. Do you know the meaning of "has inputs from different frames" in the error message?
Hi, I'm using TF-TRT on Windows 10, with TensorFlow GPU 2.10.0 and TensorRT 7.2.3 on CUDA 11.2 and cuDNN 8.1.0. I hit the same error while building the TRT engine for inference. Do you know how to deal with it? Thanks a lot for your reply.
Hi,
TensorRT is known to support ResNet for classification tasks. Does it also support detection networks with a ResNet backbone? Which modules does TensorRT not support?
Thanks in advance.