Incorrect Cast result of TensorRT 10.16.1.11 when running ONNX Cast float-to-int32 on GPU #4775

@ALinrunrun

Description

TensorRT appears to produce different results from ONNX Runtime for ONNX Cast from float to int32 when the input contains special floating-point values.

For the same ONNX model and input, ONNX Runtime converts NaN, +Inf, and -Inf to INT32_MIN. TensorRT instead converts NaN to 0 and +Inf to INT32_MAX; only its -Inf result (INT32_MIN) agrees with ONNX Runtime.

This appears to be a TensorRT execution issue for ONNX Cast from float to int32 with special floating-point values.
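For reference, the ONNX Runtime numbers match what an x86 CPU's truncating float-to-int conversion produces: the SSE conversion instruction returns INT32_MIN for NaN and any out-of-range value. A quick NumPy check (platform-dependent; the printed values below assume x86):

import numpy as np

x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)
with np.errstate(invalid="ignore"):  # newer NumPy warns on this cast
    print(x.astype(np.int32))
# Typical x86 output: [-2 -2147483648  7 -2147483648 -2147483648]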

Environment

TensorRT Version: 10.16.1.11

NVIDIA GPU: N/A / not detected by nvidia-smi

NVIDIA Driver Version: N/A / nvidia-smi failed

CUDA Version: N/A / nvcc not found

CUDNN Version: N/A / torch.backends.cudnn.version() returned None

Operating System: Linux 6.17.0-20-generic x86_64, glibc 2.39

Python Version (if applicable): Python 3.11.15

Tensorflow Version (if applicable): N/A

PyTorch Version (if applicable): N/A

Baremetal or Container (if so, version): Baremetal / non-Docker environment (/proc/1/cgroup: 0::/init.scope)

Additional package versions:

ONNX Version: 1.21.0
ONNX Runtime Version: 1.25.1

Relevant Files

Model link: N/A

The ONNX model is generated inline by the minimal reproducible script below.

Steps To Reproduce

Commands or scripts:

import numpy as np
import onnxruntime as ort
from onnx import helper, TensorProto
from _trt_helper import build_engine_from_onnx, run_engine  # local helper; a sketch follows the script

# Single-op model: Cast(float32 -> int32) on a 5-element vector.
X = helper.make_tensor_value_info("x", TensorProto.FLOAT, [5])
Y = helper.make_tensor_value_info("y", TensorProto.INT32, [5])

g = helper.make_graph(
    [helper.make_node("Cast", ["x"], ["y"], to=int(TensorProto.INT32))],
    "g",
    [X],
    [Y],
)

m = helper.make_model(g, opset_imports=[helper.make_opsetid("", 18)])
m.ir_version = 10
ob = m.SerializeToString()

# Inputs cover a finite negative, NaN, a finite positive, +Inf, and -Inf.
x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)

# Reference result from ONNX Runtime on CPU.
ort_y = ort.InferenceSession(
    ob,
    providers=["CPUExecutionProvider"],
).run(["y"], {"x": x})[0]

# Same model and input through TensorRT.
eng, _ = build_engine_from_onnx(ob)
trt_y = run_engine(
    eng,
    {"x": x},
    ["y"],
    [(5,)],
    [np.int32],
)["y"]

print("ORT:", ort_y.tolist())
print("TRT:", trt_y.tolist())

# ONNX Runtime maps NaN to INT32_MIN; TensorRT maps it to 0.
assert int(ort_y[1]) == np.iinfo(np.int32).min
assert int(trt_y[1]) == 0
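
The script depends on a local _trt_helper module that is not shown. Purely for completeness, here is a minimal sketch of what equivalent helpers could look like; it assumes TensorRT 10's tensor-address execution API and pycuda for device buffers, and is a hypothetical stand-in, not the reporter's actual helper:

# hypothetical _trt_helper sketch (assumption, not the reporter's module)
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine_from_onnx(onnx_bytes):
    builder = trt.Builder(LOGGER)
    network = builder.create_network(0)  # TRT 10 networks are explicit-batch
    parser = trt.OnnxParser(network, LOGGER)
    if not parser.parse(onnx_bytes):
        raise RuntimeError(parser.get_error(0))
    serialized = builder.build_serialized_network(network, builder.create_builder_config())
    engine = trt.Runtime(LOGGER).deserialize_cuda_engine(serialized)
    return engine, serialized

def run_engine(engine, inputs, output_names, output_shapes, output_dtypes):
    context = engine.create_execution_context()
    stream = cuda.Stream()
    keepalive, outputs = [], {}
    for name, arr in inputs.items():
        buf = cuda.mem_alloc(arr.nbytes)
        cuda.memcpy_htod(buf, np.ascontiguousarray(arr))
        context.set_tensor_address(name, int(buf))
        keepalive.append(buf)
    for name, shape, dtype in zip(output_names, output_shapes, output_dtypes):
        host = np.empty(shape, dtype=dtype)
        buf = cuda.mem_alloc(host.nbytes)
        context.set_tensor_address(name, int(buf))
        keepalive.append(buf)
        outputs[name] = (host, buf)
    context.execute_async_v3(stream.handle)
    stream.synchronize()
    for host, buf in outputs.values():
        cuda.memcpy_dtoh(host, buf)
    return {name: host for name, (host, buf) in outputs.items()}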

Have you tried the latest release?: Yes, reproduced with TensorRT 10.16.1.11.

Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system: Not attached; the issue is reproducible with the self-contained Python script above.

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

Yes. ONNX Runtime runs the same model, but TensorRT produces different integer results for NaN and +Inf.

Actual output:

ORT: [-2, -2147483648, 7, -2147483648, -2147483648]
TRT: [-2, 0, 7, 2147483647, -2147483648]

TensorRT converts NaN to 0 and +Inf to 2147483647, while ONNX Runtime converts both to -2147483648.
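
By contrast, the TensorRT numbers are consistent with a saturating conversion, which is what CUDA's device-side float-to-int conversion performs: NaN maps to 0 and out-of-range values clamp to the int32 limits. A hedged sketch of that semantics (an assumption about what TensorRT appears to do, not confirmed from its source):

import math

def saturating_float_to_int32(v: float) -> int:
    # Hypothetical model of the observed TensorRT behavior (matches CUDA's
    # saturating float->int conversion): NaN -> 0, out-of-range values clamp
    # to the int32 limits, finite in-range values truncate toward zero.
    if math.isnan(v):
        return 0
    lo, hi = -2**31, 2**31 - 1
    if v <= lo:
        return lo
    if v >= hi:
        return hi
    return math.trunc(v)

x = [-2.5, float("nan"), 7.7, float("inf"), float("-inf")]
print([saturating_float_to_int32(v) for v in x])
# -> [-2, 0, 7, 2147483647, -2147483648], i.e. the TRT row above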

Labels: Module:ONNX (issues relating to ONNX usage and import)