TensorRT RTX Execution Provider ABI Package (trt-rtx-ep-abi) v0.1.0

onnxruntime TensorRT RTX EP ABI (trt-rtx-ep-abi)

The NVIDIA TensorRT RTX Execution Provider (EP) ABI package is a standalone plugin (onnxruntime_providers_nv_tensorrt_rtx.dll / libonnxruntime_providers_nv_tensorrt_rtx.so) that implements the ORT EP ABI interfaces introduced in ORT 1.23.0. It is designed specifically for NVIDIA RTX GPUs and optimized for client-centric use cases.

Unlike the in-tree EP bundled with ONNX Runtime, this ABI package is built and versioned independently — it does not need to be built together with ONNX Runtime.

The TensorRT RTX EP leverages NVIDIA's TensorRT for RTX engine to accelerate ONNX models on RTX GPUs. It supports RTX GPUs based on Ampere and later architectures (NVIDIA GeForce RTX 30xx and above).

Benefits:

Small package footprint: At just under 200 MB, the package keeps resource usage low on end-user systems.
Faster model compile and load times: Leverages just-in-time compilation to build RTX hardware-optimized engines on end-user devices in seconds.
Portability: Seamlessly use cached models across multiple RTX GPUs.
Independent versioning: Ship EP updates without requiring a new ONNX Runtime build.

Feature Support

CUDA Graph — Reduced CPU overhead through CUDA graph capture and replay.
EP Context Model — Save and reload pre-compiled engine context for faster session initialization.
Runtime Cache — Cache compiled engines across sessions to avoid redundant compilation.
GPU I/O Binding — Bind inputs and outputs directly on GPU memory to eliminate host-device copies.
Optimized Memory Management — Efficient GPU memory allocation and reuse.
Intermediate Memory Sharing Across Sessions — Multiple inference sessions running on the same device can share intermediate GPU memory buffers, significantly reducing overall GPU memory consumption when hosting multiple models concurrently.
Device Compatibility API (IHV Support) — Implements the ORT Device Compatibility API (GetHardwareDeviceSupportStatus in OrtEpFactory). The EP reports machine-readable compatibility verdicts for each hardware device, including structured incompatibility reasons (driver version, unsupported device, missing features, policy blocks, runtime conflicts) via OrtEpDeviceCompatibilityDetails. GetEpDevices() only returns devices that are truly supported under current system conditions.
V2 Device-Based EP API — Uses the modern SessionOptionsAppendExecutionProvider_V2 API with device enumeration.
Secure DLL Loading — Delay-loaded TensorRT RTX DLLs with production signature verification (Windows).
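
Several of the features above are switched on through settings. The sketch below gathers them in one place. It reuses the `nv_runtime_cache_path` and `enable_cuda_graph` provider options shown in the Getting Started examples, and it uses ONNX Runtime's generic `ep.context_enable` / `ep.context_file_path` session config keys for the EP Context Model. That this plugin honors those generic keys in the usual way is an assumption, and `build_ep_settings` is a hypothetical helper, not part of the package.

```python
def build_ep_settings(cache_dir=None, cuda_graph=False, context_model_path=None):
    """Return (provider_options, session_config) dicts of string settings.

    Hypothetical helper: it only assembles dictionaries and does not touch
    ONNX Runtime, so it can be inspected or logged before a session is created.
    """
    provider_options = {}
    if cache_dir is not None:
        # Runtime Cache: reuse compiled engines across sessions.
        provider_options["nv_runtime_cache_path"] = str(cache_dir)
    if cuda_graph:
        # CUDA Graph: capture and replay to reduce CPU overhead.
        provider_options["enable_cuda_graph"] = "1"

    session_config = {}
    if context_model_path is not None:
        # EP Context Model: generic ORT session config keys (assumed to apply here).
        session_config["ep.context_enable"] = "1"
        session_config["ep.context_file_path"] = str(context_model_path)
    return provider_options, session_config
```

In use, `provider_options` would be passed as the second argument to `session_options.add_provider_for_devices(...)`, and each `session_config` pair would be applied with `session_options.add_session_config_entry(key, value)`.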

Known Limitations

D3D12 Interop is not supported — The EP ABI package does not currently support Direct3D 12 resource interop. Applications requiring D3D12 tensor sharing or D3D12 fence synchronization should use the in-tree TensorRT RTX EP bundled with ONNX Runtime instead.

Dependencies

Component	Minimum Version
ONNX Runtime SDK	1.24.0+
CUDA Toolkit	12.9+
TensorRT RTX SDK	1.4.x.x
CMake	3.15+
Visual Studio (Windows)	2019 or 2022
GCC / Clang (Linux)	C++17-capable
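
A quick way to catch a mismatched ONNX Runtime before registering the plugin is to compare the installed version against the minimum in the table above. The following is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers, and the comparison looks only at the leading numeric components, ignoring any pre-release or build suffix.

```python
import re

MIN_ORT_VERSION = "1.24.0"  # minimum ONNX Runtime version from the Dependencies table


def parse_version(version: str) -> tuple:
    """Return the leading dotted-numeric part of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = MIN_ORT_VERSION) -> bool:
    """True if `installed` is at least `minimum` (numeric components only)."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that "1.24" compares equal to "1.24.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

With ONNX Runtime installed, the check would read `meets_minimum(onnxruntime.__version__)`.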

Getting Started

Run via Python

import onnxruntime as ort

# 1. Register the EP plugin library (libonnxruntime_providers_nv_tensorrt_rtx.so on Linux)
ort.register_execution_provider_library(
    "NvTensorRTRTXExecutionProvider",
    "onnxruntime_providers_nv_tensorrt_rtx.dll")

# 2. Discover the TensorRT RTX EP device
ep_devices = ort.get_ep_devices()
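trt_device = None
for ep_device in ep_devices:
    if ep_device.ep_name == "NvTensorRTRTXExecutionProvider":
        trt_device = ep_device
        break
if trt_device is None:
    raise RuntimeError("No TensorRT RTX EP device found; check the GPU and driver")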

# 3. Add EP device to session options with provider options
session_options = ort.SessionOptions()
session_options.add_provider_for_devices(
    [trt_device],
    {"nv_runtime_cache_path": "./cache"})

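# 4. Create session and run inference
# input_data is a NumPy array matching the model's input; "input" is a placeholder name
session = ort.InferenceSession("model.onnx", sess_options=session_options)
result = session.run(None, {"input": input_data})  # None returns all outputs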

# 5. Cleanup: delete session before unregistering
del session
ort.unregister_execution_provider_library("NvTensorRTRTXExecutionProvider")
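
The register / use / unregister sequence above can be wrapped so that the library is unregistered even when something fails in between. This is a sketch rather than part of the package: `registered_ep` is a hypothetical helper, and it takes the `onnxruntime` module as a parameter so that the only calls it relies on are the three used in the example above.

```python
from contextlib import contextmanager

EP_NAME = "NvTensorRTRTXExecutionProvider"


@contextmanager
def registered_ep(ort_module, library_path, ep_name=EP_NAME):
    """Register the plugin, yield its first matching EP device, always unregister.

    `ort_module` is expected to be the imported `onnxruntime` module.
    """
    ort_module.register_execution_provider_library(ep_name, library_path)
    try:
        device = next(
            (d for d in ort_module.get_ep_devices() if d.ep_name == ep_name),
            None,
        )
        if device is None:
            raise RuntimeError(f"no device reported for {ep_name}")
        yield device
    finally:
        # Runs on success and on error, so the library never stays registered.
        ort_module.unregister_execution_provider_library(ep_name)
```

Inside the `with registered_ep(ort, "onnxruntime_providers_nv_tensorrt_rtx.dll") as trt_device:` block, the session should still be created and deleted (`del session`) before the block exits, matching step 5 above.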

Run via C++

#include <onnxruntime_cxx_api.h>

#include <cstring>    // strcmp
#include <stdexcept>  // std::runtime_error

OrtApi const& ortApi = Ort::GetApi();
Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "MyApp");
Ort::SessionOptions session_options;

// 1. Register the EP plugin library (libonnxruntime_providers_nv_tensorrt_rtx.so on Linux)
// Ort::ThrowOnError turns a non-null OrtStatus* into an Ort::Exception and releases it.
Ort::ThrowOnError(ortApi.RegisterExecutionProviderLibrary(
    env, "NvTensorRTRTXExecutionProvider",
    ORT_TSTR("onnxruntime_providers_nv_tensorrt_rtx.dll")));

// 2. Enumerate available EP devices and find TensorRT RTX
const OrtEpDevice* const* ep_devices = nullptr;
size_t num_ep_devices = 0;
Ort::ThrowOnError(ortApi.GetEpDevices(env, &ep_devices, &num_ep_devices));

const OrtEpDevice* trt_device = nullptr;
for (size_t i = 0; i < num_ep_devices; i++) {
    if (strcmp(ortApi.EpDevice_EpName(ep_devices[i]),
              "NvTensorRTRTXExecutionProvider") == 0) {
        trt_device = ep_devices[i];
        break;
    }
}
if (trt_device == nullptr) {
    throw std::runtime_error("No TensorRT RTX EP device found");
}

// 3. Append the EP with provider options
const char* keys[]   = {"enable_cuda_graph"};
const char* values[] = {"1"};
Ort::ThrowOnError(ortApi.SessionOptionsAppendExecutionProvider_V2(
    session_options, env, &trt_device, 1,
    keys, values, 1));

// 4. Create session
Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);

See the C++ examples for full samples demonstrating EP device selection, device tensors, CUDA stream interop, and EP context model workflows.

Build from Source

Clone the Repository

git clone https://github.com/NVIDIA/TensorRT-RTX-EP-ABI.git
cd TensorRT-RTX-EP-ABI

Build

Windows

cmake -B build -G "Visual Studio 17 2022" -A x64 `
      -DONNXRUNTIME_ROOT="C:\SDK\onnxruntime-win-x64-1.24.0" `
      -DTRT_RTX_ROOT="C:\SDK\TensorRT-RTX-1.4.x.x"
cmake --build build --config Release

Linux

cmake -B build \
      -DONNXRUNTIME_ROOT=/path/to/onnxruntime \
      -DTRT_RTX_ROOT=/path/to/tensorrt-rtx
cmake --build build

The output library is at:

Windows: build\Release\onnxruntime_providers_nv_tensorrt_rtx.dll
Linux: build/libonnxruntime_providers_nv_tensorrt_rtx.so
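
When a script needs to register a freshly built plugin, the two output locations above can be resolved per platform. A minimal sketch follows. `built_library_path` is a hypothetical helper, and the `Release` subdirectory reflects the Visual Studio build shown above, so other generators or configurations may place the file elsewhere.

```python
import sys
from pathlib import PurePosixPath, PureWindowsPath

WINDOWS_LIB = "onnxruntime_providers_nv_tensorrt_rtx.dll"
LINUX_LIB = "libonnxruntime_providers_nv_tensorrt_rtx.so"


def built_library_path(build_dir="build", platform=sys.platform):
    """Return the expected plugin path for a build tree on the given platform."""
    if platform.startswith("win"):
        # Visual Studio is a multi-config generator: output lands under Release.
        return PureWindowsPath(build_dir) / "Release" / WINDOWS_LIB
    return PurePosixPath(build_dir) / LINUX_LIB
```

The result can be converted with `str(...)` and passed as the library path when registering the plugin.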

Documentation

For detailed documentation on features, execution provider options, and API usage, see the official ONNX Runtime documentation:

NVIDIA TensorRT RTX Execution Provider

Package Contents

Libraries

File	Description
`onnxruntime_providers_nv_tensorrt_rtx.dll`	TensorRT RTX EP ABI plugin library
`tensorrt_rtx_1_4.dll`	TensorRT RTX 1.4 inference engine
`tensorrt_onnxparser_rtx_1_4.dll`	TensorRT RTX 1.4 ONNX model parser
`tensorrt_plugins.dll`	TensorRT custom layer plugins
`onnxruntime_providers_nv_tensorrt_rtx.pdb`	Debug symbols (for diagnostics)

Documentation & Legal

File	Description
`LICENSE`	Apache 2.0 license
`Privacy.md`	Privacy and data collection statement
`ThirdPartyNotices.txt`	Microsoft / ONNX Runtime third-party notices
`TRT_RTX_Acknowledgements.txt`	NVIDIA TensorRT RTX third-party acknowledgements

Contributions

Contributors to this release of TensorRT RTX EP ABI:
@gaugarg-nv, @keshavv27, @gedoensmax, @anujj, @ishwar-raut1, @umangb-09, @thevishalagarwal, @anskumar01, @theHamsta
