`ultralytics 8.1.46` add TensorRT 10 support #9516

Burhan-Q · 2024-04-02T18:12:37Z

Working to add support for TensorRT >=10 now that it is in early access (see the TensorRT 10 documentation for reference). This appends code to support the changes in the latest version while retaining the code used with prior versions of TensorRT.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improved support for NVIDIA TensorRT exports in Ultralytics YOLOv8 models.

📊 Key Changes

- Added a new test for exporting YOLO models to the NVIDIA TensorRT format.
- Updated the Ultralytics package version to 8.1.46.
- Implemented support for exporting models to TensorRT, including dynamic input sizes and FP16 precision.
- Added compatibility with the TensorRT 10 changes, including modifications to workspace settings and serialized network building.
- Enhanced error handling and logging for TensorRT model context creation to guide users on version compatibility.
- Adjusted the TensorRT export and model loading code to support TensorRT versions both below 10 and at or above 10, focusing on memory management and dynamic input handling.
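
The workspace and serialized-build changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `trt_major`, `set_workspace`, and `build_serialized` are hypothetical helper names, and the `trt` module is passed in as an argument so the snippet does not import TensorRT itself. It assumes that `set_memory_pool_limit` and `build_serialized_network` are the TensorRT 10 paths, and that `max_workspace_size` and `build_engine` are the paths for older releases.

```python
def trt_major(trt):
    """Return the major version of the given tensorrt module, e.g. 10 for '10.0.0b6'."""
    return int(trt.__version__.split(".")[0])


def set_workspace(trt, config, gib):
    """Set the builder workspace limit (in GiB) with the API the installed version offers."""
    nbytes = int(gib * (1 << 30))
    if trt_major(trt) >= 10:
        # Assumption: TensorRT 10 expects the memory-pool API for the workspace limit
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, nbytes)
    else:
        # Older releases expose the limit as a plain attribute
        config.max_workspace_size = nbytes
    return nbytes


def build_serialized(trt, builder, network, config):
    """Return a serialized engine that can be written to an .engine file."""
    if trt_major(trt) >= 10:
        # Builds and serializes in one step
        return builder.build_serialized_network(network, config)
    # Older path: build an engine object first, then serialize it
    return builder.build_engine(network, config).serialize()
```

Passing the module in keeps the version branch in one place and makes both paths easy to exercise with stand-in objects when TensorRT is not installed.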

🎯 Purpose & Impact

- Enhanced Exporting: YOLO models can be exported to TensorRT, leveraging NVIDIA's optimizations for faster inference on compatible hardware. 🚀
- Broader Compatibility: Model export is compatible with a wider range of TensorRT versions, including the latest release, so users with newer NVIDIA GPUs can benefit from improved performance without compatibility issues. 🔧
- Dynamic and Precision Support: With dynamic input sizes and FP16 precision, models can handle varying image sizes and run faster on supported devices, which makes them more flexible and efficient for real-time applications. 🔄
- Error Handling and Guidance: Enhanced error messages help users diagnose compatibility issues between exported models and their TensorRT versions, making troubleshooting easier. 🛠️

These updates are aimed at improving the user experience by ensuring compatibility with newer technologies and optimizing model performance, marking a significant step forward in deploying Ultralytics YOLO models in production environments.

only modifies existing code to keep prior functionalities in place

codecov · 2024-04-02T18:16:16Z

Codecov Report

Attention: Patch coverage is 45.00%, with 33 lines in your changes missing coverage. Please review.

Project coverage is 76.48%. Comparing base (ea03db9) to head (ba20133).

Files	Patch %	Lines
ultralytics/nn/autobackend.py	34.88%	28 Missing ⚠️
ultralytics/engine/exporter.py	68.75%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #9516      +/-   ##
==========================================
+ Coverage   76.12%   76.48%   +0.35%     
==========================================
  Files         121      121              
  Lines       15294    15327      +33     
==========================================
+ Hits        11643    11723      +80     
+ Misses       3651     3604      -47

Flag	Coverage Δ
Benchmarks	`35.97% <1.66%> (-0.08%)`	⬇️
GPU	`39.61% <45.00%> (+1.59%)`	⬆️
Tests	`71.32% <1.66%> (-0.16%)`	⬇️


Burhan-Q · 2024-04-02T21:21:36Z

Not happy with how this is at present, but it's a reasonable starting point that doesn't impact inference speeds. When testing TRT 9 exports in a TRT 10 environment, the TRT 9 model would not load, and `None` was returned when `model.create_execution_context()` was called. The same occurs when attempting to use TRT 10 exports in a TRT 9 environment (not really surprising).
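
One way to surface this failure clearly is to check for `None` and raise a descriptive error. The following is a minimal sketch, not the code in this PR: `create_context` is a hypothetical helper, and the only TensorRT call it assumes is `create_execution_context()` on the deserialized engine.

```python
def create_context(engine, trt_version):
    """Create an execution context, raising a readable error if none is returned."""
    # The engine itself may be None if deserialization failed
    context = engine.create_execution_context() if engine is not None else None
    if context is None:
        raise RuntimeError(
            f"Could not create a TensorRT execution context with TensorRT {trt_version}. "
            "The engine may have been exported with a different TensorRT version; "
            "try re-exporting it in this environment."
        )
    return context
```

Called as `create_context(model, trt.__version__)`, this turns a silent `None` into an error that points at the likely version mismatch.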

TensorRT python API docs

Need to do more investigation into TRT version compatibility.

glenn-jocher · 2024-04-03T12:12:49Z

@Burhan-Q got it!

You can also check the version like this instead of using try/except blocks:

if check_version(trt.__version__, "<10.0.0"):
    # logic here

Or you could return the Major version like this:

trt_major_version = parse_version(trt.__version__)[0]
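
For readers without the Ultralytics helpers at hand, the same idea can be sketched with the standard library alone. This is an illustration only: `parse_major` and `is_trt10` are hypothetical names that stand in for, and are not, the `parse_version` and `check_version` utilities mentioned above.

```python
import re


def parse_major(version):
    """Return the leading integer of a version string such as '10.0.0b6'."""
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group())


def is_trt10(version):
    """True when the version string denotes TensorRT 10 or newer."""
    return parse_major(version) >= 10
```

In practice this would be called with `trt.__version__`. Pre-release suffixes such as `b6` or `.post12.dev1` do not affect the result, because only the leading digits are read.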

Burhan-Q

@glenn-jocher I saw you added some updates, I was thinking about using the walrus operator to assign is_trt_10

_ = (is_trt_10 := check_version(trt.__version__, ">=10.0.0"))

Thought this could be really clean. I'll let you make the call on how to design it tho. Let me know if you want me to test anything on Windows or Linux

glenn-jocher · 2024-04-07T22:40:21Z

@Burhan-Q oh hey buddy, := is nice but not supported until Python 3.9 I think, so for 3.8 compatibility for the rest of the year or so we can't introduce these. Next year though!

glenn-jocher · 2024-04-07T22:42:41Z

@Burhan-Q ok I've added a bunch of updates from @ZouJiu1 including for AutoBackend forward call. Can you review to make sure I didn't break anything? If it looks all good I think we can merge this for our next release.

Burhan-Q · 2024-04-07T22:44:21Z

My bad, I thought I read it was supported in 3.8 as well

glenn-jocher · 2024-04-07T22:46:42Z

@Burhan-Q yeah actually maybe you're right, I got myself mixed up. So even though we're officially 3.7 deprecated we still see a high amount of workflows out there running 3.7 per our analytics so we have to try to stay 3.7-friendly for the time being.

Burhan-Q · 2024-04-09T13:26:20Z

@glenn-jocher removed excess conditionals from Autobackend and now using methods that are available in new and old versions of TensorRT

glenn-jocher · 2024-04-09T14:49:18Z

@Burhan-Q oh this looks much nicer! But are you sure we didn't need a more significant if-else in the AutoBackend forward()?

glenn-jocher · 2024-04-09T15:16:52Z

@Burhan-Q issue here

glenn-jocher · 2024-04-09T16:34:54Z

@Burhan-Q I've added a TensorRT export and inference test in 3017bad for this PR so I don't have to run the above code manually.

I guess after this PR merges we can move to --slow if it takes too much time, but we definitely want to test this as right now there are no tests/CI for it.

Burhan-Q · 2024-04-09T16:37:23Z

@glenn-jocher sounds good. Looks like the older TensorRT versions require use of the get_bindings methods, so I'm going to add these back in, as you suggested, using a single if-else statement.
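
A single if-else of this kind might look like the sketch below. It is not the code from this PR: `io_tensor_names` is a hypothetical helper, and it assumes the engine exposes either the name-based tensor API (`num_io_tensors`, `get_tensor_name`) or the older binding API (`num_bindings`, `get_binding_name`).

```python
def io_tensor_names(engine):
    """List input/output tensor names using whichever API the engine exposes."""
    if hasattr(engine, "num_io_tensors"):
        # Name-based tensor API found on newer TensorRT releases
        return [engine.get_tensor_name(i) for i in range(engine.num_io_tensors)]
    # Older binding-index API
    return [engine.get_binding_name(i) for i in range(engine.num_bindings)]
```

Feature detection with `hasattr` avoids parsing version strings and covers intermediate releases that may expose both interfaces.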

Burhan-Q · 2024-04-09T17:03:58Z

The oldest version of TensorRT I could install for Python 3.10 is tensorrt-8.5.3.1. Does it make sense to continue supporting anything older? I'm going to test with Python 3.8 to see whether it allows an older version of TensorRT.

pip install tensorrt==8.4.*

ERROR: Could not find a version that satisfies the requirement tensorrt==8.4.* 
(from versions: 0.0.1.dev5, 0.0.1, 8.5.1.7, 8.5.2.2, 8.5.3.1, 8.6.0, 8.6.1, 8.6.1.post1, 
9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, 9.0.1.post12.dev4, 
9.1.0.post11.dev4, 9.1.0.post12.dev4, 9.2.0.post11.dev5, 9.2.0.post12.dev5, 
9.3.0.post11.dev1, 9.3.0.post12.dev1, 10.0.0b6)

Burhan-Q · 2024-04-09T17:55:45Z

@glenn-jocher it appears that the oldest version of TensorRT that can be downloaded and installed with Python 3.8 is 8.5.1.7. Even with 8.5.3.1 on Python 3.10, there seems to be an issue with numpy: TensorRT uses np.bool, which was deprecated in numpy==1.20, and the numpy version might be pinned by another dependency.

Burhan-Q · 2024-04-09T19:48:03Z

@glenn-jocher CI is passing now and I was able to test with TensorRT 8.5.3.1 on python 3.8 (after forcing some old dependencies) using the latest commit. It looks like versions of TensorRT 8.4 can still be downloaded from the GitHub repo, but not installed from pip. According to the release notes:

APIs deprecated before TensorRT 8.4 will be removed in TensorRT 9.0.

so I guess this means that some components of the code are deprecated during each release. These are maintained for some time before being completely removed.

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com> Co-authored-by: UltralyticsAssistant <web@ultralytics.com> Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com> Co-authored-by: 九是否随意的称呼 <1069679911@qq.com>

add support for TensorRT 10

0995ecd

only modifies existing code to keep prior functionalities in place

Burhan-Q added enhancement New feature or request dependencies Dependency-related topics labels Apr 2, 2024

Burhan-Q linked an issue Apr 2, 2024 that may be closed by this pull request

Make compatible with TensorRT 10 release #9510

Closed

2 tasks

Burhan-Q and others added 2 commits April 2, 2024 17:15

add TensorRT10 compatability

4a4c68f

Auto-format by https://ultralytics.com/actions

079aa43

Burhan-Q mentioned this pull request Apr 2, 2024

Make compatible with TensorRT 10 release #9510

Closed

2 tasks

Merge branch 'main' into trt10

6850472

pnthai88 mentioned this pull request Apr 4, 2024

Multiple problem with YOLOv8 (Export .engine) - (Slow predict with best.pt or last.pt) #9556

Closed

1 task

glenn-jocher added 3 commits April 6, 2024 21:34

Merge branch 'main' into trt10

70c2f6b

Merge branch 'main' into trt10

bc21f77

Merge branch 'main' into trt10

02c4b99

glenn-jocher mentioned this pull request Apr 7, 2024

Add support to exporting or inference with TensorRT 10.0.0b6, and fix bugs when exporting a tensorrt file .engine with flag half=True or when inference #9840

Merged

glenn-jocher changed the title ~~Include TensorRT 10 support~~ Add TensorRT 10 support Apr 7, 2024

Merge branch 'main' into trt10

94ea376

glenn-jocher marked this pull request as ready for review April 7, 2024 21:44

Update exporter.py

8418063

Burhan-Q commented Apr 7, 2024

View reviewed changes

glenn-jocher and others added 2 commits April 8, 2024 00:36

Updates

32b2445

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>

Add support to exporting or inference with TensorRT 10.0.0b6, and fix…

1f553c2

… bugs when exporting a tensorrt file .engine with flag half=True or when inference (#9840) Co-authored-by: UltralyticsAssistant <web@ultralytics.com> Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

Update __init__.py

d0a088d

glenn-jocher changed the title ~~Add TensorRT 10 support~~ ultralytics 8.1.46 add TensorRT 10 support Apr 7, 2024

Merge branch 'main' into trt10

a090b75

glenn-jocher and others added 4 commits April 9, 2024 14:46

Refactor constants out of for loop for speed

2c03f83

Update exporter.py

d371949

refactored to use unified methods

5a5e61f

Merge branch 'trt10' of https://github.com/ultralytics/ultralytics in…

df6a3ff

…to trt10

remove redundant line

050b278

remove self. as assigned automatically

cd18b7b

glenn-jocher added 2 commits April 9, 2024 18:34

Add TensorRT export test

3017bad

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>

Merge remote-tracking branch 'origin/trt10' into trt10

0e694cb

Update test_cuda.py

d263946

Burhan-Q added 5 commits April 9, 2024 13:56

refactored to include older interface for tensorrt<8.6

e221ff5

Merge branch 'trt10' of https://github.com/ultralytics/ultralytics in…

3bab6b4

…to trt10

refactor to include path for installs with tensorrt<8.6

529293f

fix dynamic where self. is needed, align else, rearrange flow

117a011

correct for error with TensorRT 8.4.3.1 without method `get_tensor_sh…

1f49b5d

…ape`

glenn-jocher added 4 commits April 10, 2024 16:05

delete redundant trt10 definition

1ef7818

Align variable names in Exporter and Autobackend

9b96f5b

Merge branch 'main' into trt10

ba20133

move test to slow

1fba2f3

glenn-jocher merged commit 4ffd6ee into main Apr 10, 2024
10 checks passed

glenn-jocher deleted the trt10 branch April 10, 2024 16:07

glenn-jocher mentioned this pull request Apr 10, 2024

Tensorrt Mix Precision or INT8 conversion, mix precision almost same size and speed with INT8, but better precision, the converted model have good detection result with mix precision. #9941

Closed
