TensorRT v10.0.0

asfiyab-nvidia released this 03 Apr 21:45
· 9 commits to release/10.0 since this release

Key Features and Updates:

  • Samples changes
    • Added a sample showcasing weight-stripped engines.
    • Added a sample demonstrating the use of custom tactics with IPluginV3.
    • Added a sample to showcase plugins with data-dependent output shapes, using IPluginV3.
  • Parser changes
    • Added a new class IParserRefitter that can be used to refit a TensorRT engine with the weights of an ONNX model.
    • kNATIVE_INSTANCENORM is now set to ON by default.
    • Added support for IPluginV3 interfaces from TensorRT.
    • Added support for INT4 quantization.
    • Added support for the reduction attribute in ScatterElements.
    • Added support for wrap padding mode in Pad.
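
The last two parser items refer to behavior defined by the ONNX operator specifications rather than by TensorRT itself. As a rough illustration, the pure-Python sketch below mimics the 1-D case of the ScatterElements reduction attribute and the Pad wrap mode. The function names scatter_elements_1d and pad_wrap_1d are hypothetical helpers invented for this example; they are not part of the TensorRT or ONNX APIs, and the sketch says nothing about how the parser implements these operators.

```python
from typing import Callable, Dict, List, Sequence

# Reduction modes named in the ONNX ScatterElements specification.
# Each function combines the current output value with an incoming update.
_REDUCERS: Dict[str, Callable[[float, float], float]] = {
    "none": lambda old, new: new,  # plain overwrite
    "add": lambda old, new: old + new,
    "mul": lambda old, new: old * new,
    "max": lambda old, new: max(old, new),
    "min": lambda old, new: min(old, new),
}


def scatter_elements_1d(
    data: Sequence[float],
    indices: Sequence[int],
    updates: Sequence[float],
    reduction: str = "none",
) -> List[float]:
    """Hypothetical 1-D reference: out[indices[i]] = reduce(out[indices[i]], updates[i])."""
    if len(indices) != len(updates):
        raise ValueError("indices and updates must have the same length")
    reduce_fn = _REDUCERS[reduction]
    out = list(data)
    for idx, upd in zip(indices, updates):
        if not -len(out) <= idx < len(out):
            raise IndexError(f"index {idx} is out of range")
        # Assumption: updates are applied in order. With reduction="none",
        # ONNX leaves the result for duplicate indices unspecified.
        out[idx] = reduce_fn(out[idx], upd)
    return out


def pad_wrap_1d(data: Sequence[float], before: int, after: int) -> List[float]:
    """Hypothetical 1-D reference for wrap padding: the input repeats cyclically."""
    n = len(data)
    if n == 0:
        raise ValueError("cannot wrap-pad an empty sequence")
    return [data[(i - before) % n] for i in range(before + n + after)]


print(scatter_elements_1d([1.0, 2.0, 3.0, 4.0], [1, 1, 3], [10.0, 20.0, 30.0], "add"))
# prints [1.0, 32.0, 3.0, 34.0]
print(pad_wrap_1d([1, 2, 3], before=2, after=2))
# prints [2, 3, 1, 2, 3, 1, 2]
```

With reduction set to add, both updates aimed at index 1 accumulate instead of the later one replacing the earlier one. Wrap padding treats the input as periodic, so the padded values are copies of existing elements rather than constants.
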
  • Plugin changes
    • Added a new plugin that complies with the ONNX ScatterElements operator.
    • The TensorRT plugin library no longer has a load-time link dependency on cuBLAS or cuDNN libraries.
    • Plugins that relied on cuBLAS/cuDNN handles passed through IPluginV2Ext::attachToContext() now use cuBLAS/cuDNN resources initialized by the plugin library itself, which dynamically loads the required cuBLAS/cuDNN library. Plugins that independently initialized their own cuBLAS/cuDNN resources now also load the required library dynamically. If the respective library is not discoverable through the library path(s), these plugins will not work.
    • bertQKVToContextPlugin: Version 2 of this plugin now supports head sizes less than or equal to 32.
    • reorgPlugin: Added a version 2 which implements IPluginV2DynamicExt.
    • disentangledAttentionPlugin: Fixed a kernel bug.
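
Because the affected plugins now depend on cuBLAS/cuDNN being loadable at run time, it can help to check up front whether the dynamic loader can find those libraries. The sketch below is one possible way to do that with Python's standard ctypes module. The library file names in ASSUMED_LIBRARIES are assumptions chosen for illustration (the release notes do not state which names or versions the plugin library looks for), and can_dlopen and check_libraries are hypothetical helpers, not TensorRT APIs.

```python
import ctypes
from typing import Dict, Iterable

# Assumption: example Linux sonames only. Replace them with the names that
# match the CUDA and cuDNN versions actually installed on your system.
ASSUMED_LIBRARIES = ("libcublas.so.12", "libcudnn.so.8")


def can_dlopen(name: str) -> bool:
    """Return True if the platform's dynamic loader can load `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


def check_libraries(names: Iterable[str] = ASSUMED_LIBRARIES) -> Dict[str, bool]:
    """Map each library name to whether it could be loaded."""
    return {name: can_dlopen(name) for name in names}


if __name__ == "__main__":
    for name, ok in check_libraries().items():
        status = "found" if ok else "NOT found on the library path"
        print(f"{name}: {status}")
```

A library reported as not found here would likely also be missing when the plugin library attempts its own dynamic load, although the plugin library's exact lookup logic may differ from this check.
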
  • Demo changes
    • HuggingFace demos have been removed. To accelerate Large Language Model inference with TensorRT, please use TensorRT-LLM.
  • Updated tooling
    • Polygraphy v0.49.9
    • ONNX-GraphSurgeon v0.5.1
    • TensorRT Engine Explorer v0.1.8
  • Build Containers
    • RedHat/CentOS 7.x are no longer officially supported starting with TensorRT 10.0. The corresponding container has been removed from TensorRT-OSS.