TensorRT Lean cannot deserialize engine built in full TensorRT (ReformatRunner error)

Summary

I am unable to load a .trt engine with the tensorrt_lean runtime. Deserialization fails with a ReformatRunner error, even though the setup is inference-only and uses a CUDA runtime container.

Environment

Docker base image: nvidia/cuda:13.1.2-cudnn-runtime-ubuntu24.04
TensorRT version: 10.16.1.11 (tensorrt_lean)
Python: 3.12
GPU: NVIDIA GPU (CUDA enabled container)
Engine format: .trt (prebuilt outside container)

Installed packages

tensorrt_lean
numpy

Issue description

When trying to deserialize a TensorRT engine using:

import tensorrt_lean as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

runtime = trt.Runtime(TRT_LOGGER)
with open("trt_weights_f16/mos.trt", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

The following error occurs:

[TRT] [E] IRuntime::deserializeCudaEngine: Error Code 1: Internal Error
Unexpected call to stub loadRunner for ReformatRunner

As a result, engine is None and inference cannot proceed.
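To make the failure explicit instead of carrying a None engine forward, the loading step can be wrapped in a small helper. This is only a minimal sketch: `load_engine` is a hypothetical helper name, and it assumes nothing beyond the runtime object exposing `deserialize_cuda_engine`, as `trt.Runtime` does in the snippet above.

```python
from pathlib import Path


def load_engine(runtime, engine_path):
    """Deserialize an engine file and fail loudly if the runtime rejects it.

    `runtime` is any object with a `deserialize_cuda_engine(bytes)` method,
    for example `tensorrt_lean.Runtime(logger)`.
    `load_engine` itself is a hypothetical helper, not part of TensorRT.
    """
    path = Path(engine_path)
    if not path.is_file():
        raise FileNotFoundError(f"Engine file not found: {path}")

    data = path.read_bytes()
    if not data:
        raise ValueError(f"Engine file is empty: {path}")

    # deserialize_cuda_engine returns None on failure instead of raising.
    engine = runtime.deserialize_cuda_engine(data)
    if engine is None:
        raise RuntimeError(
            f"Deserialization failed for {path} ({len(data)} bytes); "
            "see the TensorRT logger output for the underlying error."
        )
    return engine
```

With the setup above it would be called as `load_engine(trt.Runtime(TRT_LOGGER), "trt_weights_f16/mos.trt")`. Since the runtime returns None for this engine, the call would raise the RuntimeError, which rules out a wrong path or an empty file as the cause.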

Expected behavior

The engine should deserialize successfully and allow inference using:

context = engine.create_execution_context()
context.execute_async_v3(...)
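For context, `execute_async_v3` needs a device address registered for every I/O tensor before it is called. Below is a minimal sketch of enumerating those tensors. `describe_io_tensors` is a hypothetical helper name, and the sketch assumes the TensorRT 10 name-based tensor API (`num_io_tensors`, `get_tensor_name`, `get_tensor_mode`, `get_tensor_shape`) is available on the deserialized engine.

```python
def describe_io_tensors(engine):
    """Return a list of (name, mode, shape) tuples, one per I/O tensor.

    `engine` is expected to behave like a TensorRT ICudaEngine.
    `describe_io_tensors` is a hypothetical helper, not part of TensorRT.
    """
    tensors = []
    for index in range(engine.num_io_tensors):
        name = engine.get_tensor_name(index)
        mode = engine.get_tensor_mode(name)          # input or output
        shape = tuple(engine.get_tensor_shape(name))  # dims as a plain tuple
        tensors.append((name, mode, shape))
    return tensors
```

Each name returned here would then be passed to `context.set_tensor_address(name, device_ptr)` before calling `context.execute_async_v3(stream_handle)`.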

Actual behavior

Engine fails during deserialization
ReformatRunner stub error is triggered
engine is None
No inference possible

What I tried

Switching to an inference-only CUDA runtime image
Using tensorrt_lean instead of full TensorRT
Minimal Python inference script (no PyCUDA, no training dependencies)
Verifying engine file path and loading logic

Key observation

The engine was built using a full TensorRT environment, and fails when loaded with tensorrt_lean.

It seems that tensorrt_lean does not support certain internal runners (e.g., ReformatRunner) required by the engine.
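One thing I have not verified yet: the TensorRT documentation describes a version-compatibility builder flag, and as I understand it the lean runtime is meant to load engines built with that flag. If that applies here, the rebuild in the full TensorRT environment might look roughly like the sketch below. `enable_lean_compatibility` is a hypothetical helper name; `trt_module` is assumed to be the full `tensorrt` package and `config` an `IBuilderConfig` obtained from `builder.create_builder_config()`.

```python
def enable_lean_compatibility(trt_module, config):
    """Mark a builder config as version compatible before building the engine.

    Assumptions: `trt_module` is the full `tensorrt` package (not
    tensorrt_lean) and `config` comes from `builder.create_builder_config()`.
    Returns True if the flag is set on the config afterwards.
    """
    flag = trt_module.BuilderFlag.VERSION_COMPATIBLE
    config.set_flag(flag)
    return config.get_flag(flag)
```

When building with `trtexec`, the corresponding option should be `--versionCompatible`. Whether this is sufficient to avoid the ReformatRunner stub error is exactly what I am unsure about.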

Question

Is there a compatibility requirement between the following?

The TensorRT engine build environment
The TensorRT Lean runtime

Specifically:

Are engines built with full TensorRT incompatible with the Lean runtime?
Is there a required “Lean-compatible engine export” workflow?

Additional context

This setup is intended for inference-only deployment; the goal is to use a minimal runtime container without the full TensorRT SDK.
