ultralytics 8.1.46 add TensorRT 10 support #9516

Merged
merged 41 commits into from Apr 10, 2024

Burhan-Q
Member

@Burhan-Q Burhan-Q commented Apr 2, 2024

Working to add TensorRT >=10 support now that it's in early access. TensorRT 10 documentation reference. This appends code to support the latest version's changes while retaining the code paths used with prior versions of TensorRT.

  • update ultralytics/engine/exporter.py
    • tested on Windows 10 with TensorRT 10.0.0b6, Ultralytics 8.1.42, PyTorch 2.2.0, ONNX 1.15.0
    • model.export(format="engine")
    • model.export(format="engine", half=True, dynamic=True)
    • model.export(format="engine", half=True, dynamic=True, simplify=True, workspace=5, imgsz=320)
    • verify compatibility with TensorRT <10 is still functional
  • update ultralytics/AutoBackend
    • Standard export
    • half + dynamic export
    • half + dynamic + simplify + imgsz=320 export
    • verify still compatible with older TensorRT model exports
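
The three export configurations in the checklist above can be driven from a single loop. This is a minimal sketch, assuming only a model object that exposes `export(**kwargs)` as in the calls listed; `EXPORT_CONFIGS` and `run_exports` are hypothetical names for illustration and are not part of Ultralytics.

```python
# Export argument sets copied from the PR checklist above.
EXPORT_CONFIGS = [
    {"format": "engine"},
    {"format": "engine", "half": True, "dynamic": True},
    {"format": "engine", "half": True, "dynamic": True, "simplify": True, "workspace": 5, "imgsz": 320},
]


def run_exports(model, configs=EXPORT_CONFIGS):
    """Call model.export() once per configuration and collect the return values."""
    results = []
    for kwargs in configs:
        results.append(model.export(**kwargs))
    return results
```

With Ultralytics and TensorRT installed, a loaded YOLO model could be passed as `model`. That path needs a GPU environment and was not exercised for this sketch.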

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improved support for NVIDIA TensorRT exports in Ultralytics YOLOv8 models.

📊 Key Changes

  • Added a new test for exporting YOLO models to the NVIDIA TensorRT format.
  • Updated Ultralytics package version to 8.1.46.
  • Implemented support for exporting models to TensorRT, including dynamic input sizes and FP16 precision.
  • Added compatibility with the latest TensorRT version 10 changes, including modifications to workspace settings and serialized network building.
  • Enhanced the error handling and logging for TensorRT model context creation to guide users on version compatibility.
  • Adjusted TensorRT export and model loading code to support both TensorRT versions below and above 10, focusing on memory management and dynamic input handling.
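
The "workspace settings and serialized network building" change can be sketched as a pair of version branches. This is an illustration under stated assumptions, not the code merged in this PR: `configure_workspace` and `pick_build_method` are hypothetical helpers, and `workspace_pool` stands in for the value that would be `trt.MemoryPoolType.WORKSPACE` when TensorRT is installed. The attribute and method names (`max_workspace_size`, `set_memory_pool_limit`, `build_engine`, `build_serialized_network`) are the TensorRT Python API names for the older and newer paths.

```python
def configure_workspace(config, workspace_gib, is_trt10, workspace_pool=None):
    """Set the builder workspace limit with the API matching the TensorRT generation."""
    workspace_bytes = int(workspace_gib * (1 << 30))  # GiB -> bytes
    if is_trt10:
        # Newer API: the limit is set per memory pool.
        config.set_memory_pool_limit(workspace_pool, workspace_bytes)
    else:
        # Older API: a single attribute on the builder config.
        config.max_workspace_size = workspace_bytes
    return workspace_bytes


def pick_build_method(builder, is_trt10):
    """Return the builder method to call for the given TensorRT generation."""
    return builder.build_serialized_network if is_trt10 else builder.build_engine
```

Note that the two build methods return different things (a serialized engine versus an engine object), so the caller still has to handle the result differently when writing the `.engine` file.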

🎯 Purpose & Impact

  • Enhanced Exporting: This update enables more efficient use of the YOLO model by allowing it to be exported to TensorRT, thereby leveraging NVIDIA's optimizations for faster inference on compatible hardware. 🚀
  • Broader Compatibility: The changes ensure model export is compatible with a wider range of TensorRT versions, including the latest release. This means users with newer NVIDIA GPUs can harness improved performance without compatibility issues. 🔧
  • Dynamic and Precision Support: By supporting dynamic input sizes and FP16 precision, models can handle varying image sizes more effectively and run faster on supported devices, making them more flexible and efficient for real-time applications. 🔄
  • Error Handling and Guidance: Enhanced error messages help users diagnose and understand compatibility issues between the exported models and their TensorRT versions, making troubleshooting easier. 🛠️

These updates are aimed at improving the user experience by ensuring compatibility with newer technologies and optimizing model performance, marking a significant step forward in deploying Ultralytics YOLO models in production environments.

only modifies existing code to keep prior functionalities in place
@Burhan-Q Burhan-Q added enhancement New feature or request dependencies Dependency-related topics labels Apr 2, 2024
@Burhan-Q Burhan-Q linked an issue Apr 2, 2024 that may be closed by this pull request
2 tasks

codecov bot commented Apr 2, 2024

Codecov Report

Attention: Patch coverage is 45.00000%, with 33 lines in your changes missing coverage. Please review.

Project coverage is 76.48%. Comparing base (ea03db9) to head (ba20133).

Files                            Patch %   Lines
ultralytics/nn/autobackend.py    34.88%    28 Missing ⚠️
ultralytics/engine/exporter.py   68.75%    5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9516      +/-   ##
==========================================
+ Coverage   76.12%   76.48%   +0.35%     
==========================================
  Files         121      121              
  Lines       15294    15327      +33     
==========================================
+ Hits        11643    11723      +80     
+ Misses       3651     3604      -47     
Flag         Coverage Δ
Benchmarks   35.97% <1.66%> (-0.08%) ⬇️
GPU          39.61% <45.00%> (+1.59%) ⬆️
Tests        71.32% <1.66%> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@Burhan-Q
Member Author

Burhan-Q commented Apr 2, 2024

Not happy with how this is at present, but it's a reasonable starting point that doesn't impact inference speeds. When testing TRT 9 exports in a TRT 10 environment, the TRT 9 model would not load, and model.create_execution_context() would return None. The same occurs when attempting to use TRT 10 exports in a TRT 9 environment (not really surprising).
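
One way to surface the mismatch described here is to check the returned context and fail with a message that names the runtime version. This is a sketch under assumptions: `create_context_or_raise` is a hypothetical helper, `engine` is any object with a `create_execution_context()` method (as on a deserialized TensorRT engine), and `runtime_version` would be `trt.__version__` in a real environment.

```python
def create_context_or_raise(engine, runtime_version):
    """Create an execution context, raising a descriptive error if none is returned."""
    context = engine.create_execution_context()
    if context is None:
        # Per the observation above, a None context points to an engine built
        # with a different TensorRT version than the one currently installed.
        raise RuntimeError(
            "Could not create a TensorRT execution context. The engine may have been "
            f"exported with a TensorRT version different from the installed {runtime_version}; "
            "re-export the model in this environment."
        )
    return context
```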

TensorRT python API docs

Need to do more investigation into TRT version compatibility.

@glenn-jocher
Member

@Burhan-Q got it!

You can also check the version like this instead of using try/except blocks:

if check_version(trt.__version__, "<10.0.0"):
    # logic here

Or you could return the Major version like this:

trt_major_version = parse_version(trt.__version__)[0]
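
For readers without the Ultralytics helpers at hand, the major-version idea can be shown with the standard library alone. This is a stand-in sketch, not the `check_version` or `parse_version` implementation referenced above; `major_version` is a hypothetical name, and the sample strings are versions that appear in the pip listing later in this thread.

```python
import re


def major_version(version):
    """Return the leading integer of a version string such as '10.0.0b6'."""
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError(f"No leading version number in {version!r}")
    return int(match.group(1))


# Branch once on the major version instead of wrapping calls in try/except.
is_trt10 = major_version("10.0.0b6") >= 10
```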

@glenn-jocher glenn-jocher marked this pull request as ready for review April 7, 2024 21:44
Member Author

@Burhan-Q Burhan-Q left a comment

@glenn-jocher I saw you added some updates, I was thinking about using the walrus operator to assign is_trt_10

_ = (is_trt_10 := check_version(trt.__version__, ">=10.0.0"))

Thought this could be really clean. I'll let you make the call on how to design it tho. Let me know if you want me to test anything on Windows or Linux
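
For context on the syntax question: assignment expressions (`:=`) were added in Python 3.8 by PEP 572, so they are a syntax error on Python 3.7. A plain assignment gives the same result on any version. In the sketch below, `is_major_at_least` is a hypothetical stand-in for the version check, and the hard-coded string is a placeholder for `trt.__version__`.

```python
def is_major_at_least(version, minimum_major):
    """Stand-in version check: compare only the leading major number."""
    return int(version.split(".")[0]) >= minimum_major


runtime_version = "10.0.0b6"  # placeholder for trt.__version__

# Plain assignment: same result as the walrus form, and valid on Python 3.7.
is_trt_10 = is_major_at_least(runtime_version, 10)

# Walrus form (Python 3.8+ only), mainly useful when assigning inside a condition:
#     if (is_trt_10 := is_major_at_least(runtime_version, 10)):
#         ...
```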

glenn-jocher and others added 2 commits April 8, 2024 00:36
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
… bugs when exporting a tensorrt file .engine with flag half=True or when inference (#9840)

Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
@glenn-jocher
Member

@Burhan-Q oh hey buddy, := is nice but not supported until Python 3.9 I think, so for 3.8 compatibility for the rest of the year or so we can't introduce these. Next year though!

@glenn-jocher
Member

@Burhan-Q ok I've added a bunch of updates from @ZouJiu1 including for AutoBackend forward call. Can you review to make sure I didn't break anything? If it looks all good I think we can merge this for our next release.

@glenn-jocher glenn-jocher changed the title Add TensorRT 10 support ultralytics 8.1.46 add TensorRT 10 support Apr 7, 2024
@Burhan-Q
Member Author

Burhan-Q commented Apr 7, 2024

My bad, I thought I read it was supported in 3.8 as well

@glenn-jocher
Member

@Burhan-Q yeah actually maybe you're right, I got myself mixed up. So even though we're officially 3.7 deprecated we still see a high amount of workflows out there running 3.7 per our analytics so we have to try to stay 3.7-friendly for the time being.

@Burhan-Q
Member Author

Burhan-Q commented Apr 9, 2024

@glenn-jocher removed excess conditionals from AutoBackend; it now uses methods that are available in both new and old versions of TensorRT

@glenn-jocher
Member

@Burhan-Q oh this looks much nicer! But are you sure we didn't need a more significant if-else in AutoBackend forward()?

@glenn-jocher
Member

@Burhan-Q issue here

Screenshot 2024-04-09 at 17 15 51

@glenn-jocher
Member

glenn-jocher commented Apr 9, 2024

@Burhan-Q I've added a TensorRT export and inference test in 3017bad for this PR so I don't have to run the above code manually.

I guess after this PR merges we can move to --slow if it takes too much time, but we definitely want to test this as right now there are no tests/CI for it.

@Burhan-Q
Member Author

Burhan-Q commented Apr 9, 2024

@glenn-jocher sounds good. Looks like the older TensorRT versions require use of the get_bindings so I'm going to add these back in like you suggested using a singular if-else statement
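
A single if-else of the kind described could look like the following. This is a sketch under assumptions rather than the merged AutoBackend code: `io_tensor_names` is a hypothetical helper, and it assumes the engine object exposes `num_bindings` and `get_binding_name` on older TensorRT versions, and `num_io_tensors` and `get_tensor_name` on TensorRT 10.

```python
def io_tensor_names(engine):
    """List input/output tensor names using whichever API the engine object offers."""
    # Assumption: the legacy binding attributes are absent on TensorRT 10 engines.
    is_trt10 = not hasattr(engine, "num_bindings")
    if is_trt10:
        return [engine.get_tensor_name(i) for i in range(engine.num_io_tensors)]
    return [engine.get_binding_name(i) for i in range(engine.num_bindings)]
```

Detecting the API by attribute keeps the branch in one place and avoids parsing version strings at inference time.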

@Burhan-Q
Member Author

Burhan-Q commented Apr 9, 2024

The oldest version of TensorRT I could install for Python 3.10 is tensorrt-8.5.3.1. Does it make sense to continue support for anything older? I'm going to test with Python 3.8 to see if it will allow for an older version of TensorRT.

pip install tensorrt==8.4.*

ERROR: Could not find a version that satisfies the requirement tensorrt==8.4.* 
(from versions: 0.0.1.dev5, 0.0.1, 8.5.1.7, 8.5.2.2, 8.5.3.1, 8.6.0, 8.6.1, 8.6.1.post1, 
9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, 9.0.1.post12.dev4, 
9.1.0.post11.dev4, 9.1.0.post12.dev4, 9.2.0.post11.dev5, 9.2.0.post12.dev5, 
9.3.0.post11.dev1, 9.3.0.post12.dev1, 10.0.0b6)

@Burhan-Q
Member Author

Burhan-Q commented Apr 9, 2024

@glenn-jocher it appears that the oldest version of TensorRT that can be downloaded and installed with Python 3.8 is 8.5.1.7. Even with 8.5.3.1 on Python 3.10, there seems to be an issue with NumPy: TensorRT uses np.bool, which was deprecated in numpy==1.20, and the NumPy version might be pinned by another dependency.

@Burhan-Q
Member Author

Burhan-Q commented Apr 9, 2024

@glenn-jocher CI is passing now and I was able to test with TensorRT 8.5.3.1 on python 3.8 (after forcing some old dependencies) using the latest commit. It looks like versions of TensorRT 8.4 can still be downloaded from the GitHub repo, but not installed from pip. According to the release notes:

APIs deprecated before TensorRT 8.4 will be removed in TensorRT 9.0.

so I guess this means that some components of the code are deprecated during each release. These are maintained for some time before being completely removed.

@glenn-jocher glenn-jocher merged commit 4ffd6ee into main Apr 10, 2024
10 checks passed
@glenn-jocher glenn-jocher deleted the trt10 branch April 10, 2024 16:07
hmurari pushed a commit to hmurari/ultralytics that referenced this pull request Apr 17, 2024
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: δΉζ˜―ε¦ιšζ„ηš„η§°ε‘Ό <1069679911@qq.com>
Labels
dependencies Dependency-related topics enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Make compatible with TensorRT 10 release
4 participants