Description
TensorRT appears to produce different results from ONNX Runtime for ONNX Cast from float to int32 when the input contains special floating-point values.
For the same ONNX model and input, ONNX Runtime converts NaN, +Inf, and -Inf to INT32_MIN. TensorRT instead converts NaN to 0, +Inf to INT32_MAX, and -Inf to INT32_MIN.
This appears to be a TensorRT execution issue for ONNX Cast from float to int32 with special floating-point values.
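For context, the ONNX Cast specification leaves float-to-int conversion of NaN and infinities undefined, so the result is effectively platform-defined. On x86 CPUs the truncating conversion instruction returns the "integer indefinite" value INT32_MIN for NaN and out-of-range inputs, which matches the ONNX Runtime output reported here. A NumPy sketch of the CPU-side behavior (assumes an x86 host; other architectures may differ):

```python
import numpy as np

# Special float32 values whose int32 conversion is undefined by the ONNX spec.
x = np.array([np.nan, np.inf, -np.inf], dtype=np.float32)

# On x86, the truncating float->int32 conversion returns INT32_MIN
# ("integer indefinite") for NaN and for out-of-range inputs.
with np.errstate(invalid="ignore"):
    cpu_cast = x.astype(np.int32)

print(cpu_cast.tolist())  # on x86: [-2147483648, -2147483648, -2147483648]
```

This is the same result ONNX Runtime's CPU execution provider produces for the model below.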
Environment
TensorRT Version: 10.16.1.11
NVIDIA GPU: N/A / not detected by nvidia-smi
NVIDIA Driver Version: N/A / nvidia-smi failed
CUDA Version: N/A / nvcc not found
CUDNN Version: N/A / torch.backends.cudnn.version() returned None
Operating System: Linux 6.17.0-20-generic x86_64, glibc 2.39
Python Version (if applicable): Python 3.11.15
Tensorflow Version (if applicable): N/A
PyTorch Version (if applicable): N/A
Baremetal or Container (if so, version): Baremetal / non-Docker environment (/proc/1/cgroup: 0::/init.scope)
Additional package versions:
ONNX Version: 1.21.0
ONNX Runtime Version: 1.25.1
Relevant Files
Model link: N/A
The ONNX model is generated inline by the minimal reproducible script below.
Steps To Reproduce
Commands or scripts:
import numpy as np
import onnx
import onnxruntime as ort
from onnx import helper, TensorProto
from _trt_helper import build_engine_from_onnx, run_engine

# Minimal ONNX model: a single Cast node from float32 to int32 on a length-5 vector.
X = helper.make_tensor_value_info("x", TensorProto.FLOAT, [5])
Y = helper.make_tensor_value_info("y", TensorProto.INT32, [5])
g = helper.make_graph(
    [helper.make_node("Cast", ["x"], ["y"], to=int(TensorProto.INT32))],
    "g",
    [X],
    [Y],
)
m = helper.make_model(g, opset_imports=[helper.make_opsetid("", 18)])
m.ir_version = 10
ob = m.SerializeToString()

# Input containing the special floating-point values NaN, +Inf, and -Inf.
x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)

# Reference result from ONNX Runtime on CPU.
ort_y = ort.InferenceSession(
    ob,
    providers=["CPUExecutionProvider"],
).run(["y"], {"x": x})[0]

# TensorRT result via the local helper shipped with this repro.
eng, _ = build_engine_from_onnx(ob)
trt_y = run_engine(
    eng,
    {"x": x},
    ["y"],
    [(5,)],
    [np.int32],
)["y"]

print("ORT:", ort_y.tolist())
print("TRT:", trt_y.tolist())

# ONNX Runtime maps NaN to INT32_MIN; TensorRT maps NaN to 0.
assert int(ort_y[1]) == np.iinfo(np.int32).min
assert int(trt_y[1]) == 0
Have you tried the latest release?: Yes, reproduced with TensorRT 10.16.1.11.
Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system: Not attached. The issue is reproducible from the self-contained Python script above.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
Yes. ONNX Runtime runs the same model, but TensorRT produces different integer results for NaN and +Inf.
Actual output:
ORT: [-2, -2147483648, 7, -2147483648, -2147483648]
TRT: [-2, 0, 7, 2147483647, -2147483648]
TensorRT converts NaN to 0 and +Inf to 2147483647, while ONNX Runtime converts both to -2147483648.
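The TensorRT output appears consistent with the saturating float-to-int conversion that CUDA hardware performs (NaN converts to 0, and out-of-range values clamp to INT32_MAX / INT32_MIN), whereas ONNX Runtime's CPU result matches x86 truncating-conversion semantics. A NumPy emulation of the saturating behavior (my sketch, not TensorRT code; `saturating_cast` is a hypothetical helper name):

```python
import numpy as np

INT32_MIN = np.iinfo(np.int32).min
INT32_MAX = np.iinfo(np.int32).max


def saturating_cast(x: np.ndarray) -> np.ndarray:
    """Emulate CUDA-style saturating float->int32 conversion:
    NaN -> 0, values above INT32_MAX clamp to INT32_MAX,
    values below INT32_MIN clamp to INT32_MIN."""
    # Work in float64: float32 cannot represent INT32_MAX exactly,
    # so clamping in float32 would round 2147483647 up to 2147483648.0.
    x64 = x.astype(np.float64)
    out = np.where(np.isnan(x64), 0.0, np.clip(x64, INT32_MIN, INT32_MAX))
    return np.trunc(out).astype(np.int32)


x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)
print(saturating_cast(x).tolist())  # [-2, 0, 7, 2147483647, -2147483648]
```

This reproduces the TRT row of the actual output above, which suggests TensorRT is emitting the hardware conversion as-is rather than matching ONNX Runtime's (equally unspecified) CPU behavior.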