Incorrect Cast result of TensorRT 10.16.1.11 when running ONNX Cast float-to-int32 on GPU #4775

@ALinrunrun

Description

TensorRT appears to produce different results from ONNX Runtime for ONNX Cast from float to int32 when the input contains special floating-point values.

For the same ONNX model and input, ONNX Runtime converts NaN, +Inf, and -Inf to INT32_MIN. TensorRT instead converts NaN to 0 and +Inf to INT32_MAX; only its -Inf result (INT32_MIN) agrees with ONNX Runtime.

This appears to be a TensorRT execution issue for ONNX Cast from float to int32 with special floating-point values.
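For reference, the ONNX Runtime numbers match what an x86 CPU's truncating float-to-int conversion produces: the SSE conversion instruction returns INT32_MIN for NaN and any out-of-range value. A quick NumPy check (platform-dependent; the printed values below assume x86):

import numpy as np

x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)
with np.errstate(invalid="ignore"):  # newer NumPy warns on this cast
    print(x.astype(np.int32))
# Typical x86 output: [-2 -2147483648  7 -2147483648 -2147483648]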

Environment

TensorRT Version: 10.16.1.11

NVIDIA GPU: N/A / not detected by nvidia-smi

NVIDIA Driver Version: N/A / nvidia-smi failed

CUDA Version: N/A / nvcc not found

CUDNN Version: N/A / torch.backends.cudnn.version() returned None

Operating System: Linux 6.17.0-20-generic x86_64, glibc 2.39

Python Version (if applicable): Python 3.11.15

Tensorflow Version (if applicable): N/A

PyTorch Version (if applicable): N/A

Baremetal or Container (if so, version): Baremetal / non-Docker environment (/proc/1/cgroup: 0::/init.scope)

Additional package versions:

ONNX Version: 1.21.0
ONNX Runtime Version: 1.25.1

Relevant Files

Model link: N/A

The ONNX model is generated inline by the minimal reproducible script below.

Steps To Reproduce

Commands or scripts:

import numpy as np
import onnxruntime as ort
from onnx import helper, TensorProto
from _trt_helper import build_engine_from_onnx, run_engine  # local helper; a sketch follows the script

# Single-op model: Cast(float32 -> int32) on a 5-element vector.
X = helper.make_tensor_value_info("x", TensorProto.FLOAT, [5])
Y = helper.make_tensor_value_info("y", TensorProto.INT32, [5])

g = helper.make_graph(
    [helper.make_node("Cast", ["x"], ["y"], to=int(TensorProto.INT32))],
    "g",
    [X],
    [Y],
)

m = helper.make_model(g, opset_imports=[helper.make_opsetid("", 18)])
m.ir_version = 10
ob = m.SerializeToString()

# Inputs cover a finite negative, NaN, a finite positive, +Inf, and -Inf.
x = np.array([-2.5, np.nan, 7.7, np.inf, -np.inf], dtype=np.float32)

# Reference result from ONNX Runtime on CPU.
ort_y = ort.InferenceSession(
    ob,
    providers=["CPUExecutionProvider"],
).run(["y"], {"x": x})[0]

# Same model and input through TensorRT.
eng, _ = build_engine_from_onnx(ob)
trt_y = run_engine(
    eng,
    {"x": x},
    ["y"],
    [(5,)],
    [np.int32],
)["y"]

print("ORT:", ort_y.tolist())
print("TRT:", trt_y.tolist())

# ONNX Runtime maps NaN to INT32_MIN; TensorRT maps it to 0.
assert int(ort_y[1]) == np.iinfo(np.int32).min
assert int(trt_y[1]) == 0
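
The script depends on a local _trt_helper module that is not shown. Purely for completeness, here is a minimal sketch of what equivalent helpers could look like; it assumes TensorRT 10's tensor-address execution API and pycuda for device buffers, and is a hypothetical stand-in, not the reporter's actual helper:

# hypothetical _trt_helper sketch (assumption, not the reporter's module)
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine_from_onnx(onnx_bytes):
    builder = trt.Builder(LOGGER)
    network = builder.create_network(0)  # TRT 10 networks are explicit-batch
    parser = trt.OnnxParser(network, LOGGER)
    if not parser.parse(onnx_bytes):
        raise RuntimeError(parser.get_error(0))
    serialized = builder.build_serialized_network(network, builder.create_builder_config())
    engine = trt.Runtime(LOGGER).deserialize_cuda_engine(serialized)
    return engine, serialized

def run_engine(engine, inputs, output_names, output_shapes, output_dtypes):
    context = engine.create_execution_context()
    stream = cuda.Stream()
    keepalive, outputs = [], {}
    for name, arr in inputs.items():
        buf = cuda.mem_alloc(arr.nbytes)
        cuda.memcpy_htod(buf, np.ascontiguousarray(arr))
        context.set_tensor_address(name, int(buf))
        keepalive.append(buf)
    for name, shape, dtype in zip(output_names, output_shapes, output_dtypes):
        host = np.empty(shape, dtype=dtype)
        buf = cuda.mem_alloc(host.nbytes)
        context.set_tensor_address(name, int(buf))
        keepalive.append(buf)
        outputs[name] = (host, buf)
    context.execute_async_v3(stream.handle)
    stream.synchronize()
    for host, buf in outputs.values():
        cuda.memcpy_dtoh(host, buf)
    return {name: host for name, (host, buf) in outputs.items()}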

Have you tried the latest release?: Yes, reproduced with TensorRT 10.16.1.11.

Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system: Not attached; the issue is reproducible with the self-contained Python script above.

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

Yes. ONNX Runtime runs the same model, but TensorRT produces different integer results for NaN and +Inf.

Actual output:

ORT: [-2, -2147483648, 7, -2147483648, -2147483648]
TRT: [-2, 0, 7, 2147483647, -2147483648]

TensorRT converts NaN to 0 and +Inf to 2147483647, while ONNX Runtime converts both to -2147483648.
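
By contrast, the TensorRT numbers are consistent with a saturating conversion, which is what CUDA's device-side float-to-int conversion performs: NaN maps to 0 and out-of-range values clamp to the int32 limits. A hedged sketch of that semantics (an assumption about what TensorRT appears to do, not confirmed from its source):

import math

def saturating_float_to_int32(v: float) -> int:
    # Hypothetical model of the observed TensorRT behavior (matches CUDA's
    # saturating float->int conversion): NaN -> 0, out-of-range values clamp
    # to the int32 limits, finite in-range values truncate toward zero.
    if math.isnan(v):
        return 0
    lo, hi = -2**31, 2**31 - 1
    if v <= lo:
        return lo
    if v >= hi:
        return hi
    return math.trunc(v)

x = [-2.5, float("nan"), 7.7, float("inf"), float("-inf")]
print([saturating_float_to_int32(v) for v in x])
# -> [-2, 0, 7, 2147483647, -2147483648], i.e. the TRT row above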

Labels: Module:ONNX (issues relating to ONNX usage and import)