
[Bug] [Relay] [Torch] [ONNX] Robustness of Cast operator accepting NaN values #17081

Open
shaoyuyoung opened this issue Jun 11, 2024 · 0 comments
Labels: needs-triage, type: bug

shaoyuyoung commented Jun 11, 2024

Description

Here is a single op: Cast
(screenshot: a model graph containing a single Cast op)

In TVM, when Cast receives a NaN value, it outputs False.

However, in PyTorch, it outputs True.

In PyTorch and ONNX, Cast maps nonzero values (including NaN) to True and zero values to False.
The evidence is here: https://onnx.ai/onnx/operators/onnx__Cast.html#l-onnx-doc-cast
(screenshot: the ONNX Cast documentation describing the float-to-bool conversion rule)

I am unsure how the Cast op is defined in TVM. However, if its semantics differ from those of other frameworks/compilers (e.g., PyTorch and ONNX), the final results will be inconsistent with those frameworks/compilers in more complex scenarios (i.e., models containing more ops).
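For reference, a quick sanity check (not part of the repro below) of the expected float-to-bool semantics; both PyTorch and NumPy treat NaN as a nonzero value and cast it to True:

import numpy as np
import torch

x = torch.tensor([float("nan"), 0.0, 2.5])
print(x.to(torch.bool))                            # tensor([ True, False,  True])
print(np.array([np.nan, 0.0, 2.5]).astype(bool))   # [ True False  True]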

Code to repro

import numpy as np
import numpy.testing as npt
import onnx
import torch
import torch.nn as nn

import tvm
from tvm import relay


class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

    def forward(self, input_tensor):
        # Single Cast op: convert the input tensor to bool
        cast_output = input_tensor.to(torch.bool)

        return cast_output


model = Model()
input_tensor = torch.tensor([float('nan')])

# Reference output from PyTorch: NaN casts to True
torch_output = model(input_tensor).numpy()

# Export the PyTorch model to ONNX
torch.onnx.export(
    model,
    input_tensor,
    "test.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=14,
    do_constant_folding=True,
)
onnx_model = onnx.load("test.onnx")

target = "llvm"

shape_dict = {"input": input_tensor.shape}

# Import the ONNX model into Relay
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

dev = tvm.cpu(0)
# Build and run with the Relay graph executor
with tvm.transform.PassContext(opt_level=4):
    executor = relay.build_module.create_executor(
        "graph", mod, dev, target, params
    ).evaluate()

inputs = {"input": tvm.nd.array(input_tensor.numpy())}

tvm_output = executor(**inputs).numpy()

# Compare: PyTorch gives [True], TVM gives [False]
npt.assert_allclose(torch_output, tvm_output, rtol=1e-5, atol=1e-8)

Error log

AssertionError: 
Not equal to tolerance rtol=1e-05, atol=1e-08

Mismatched elements: 1 / 1 (100%)
 x: array([ True])
 y: array([False])
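To help narrow this down, here is a minimal sketch (my own addition, not part of the original repro) that calls relay.cast directly and bypasses the ONNX frontend; it is only an assumption that this isolates whether the discrepancy comes from the frontend conversion or from the Cast lowering itself:

import numpy as np
import tvm
from tvm import relay

# Relay function casting float32 -> bool
x = relay.var("x", shape=(1,), dtype="float32")
func = relay.Function([x], relay.cast(x, "bool"))
mod = tvm.IRModule.from_expr(func)

# PyTorch/ONNX semantics would give [True] for a NaN input
result = (
    relay.create_executor("graph", mod=mod, device=tvm.cpu(0), target="llvm")
    .evaluate()(np.array([np.nan], dtype="float32"))
)
print(result)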

Environment & Version

Ubuntu 20
TVM d1ac1c0

cc @KJlaccHoeUM9l @shingjan @yelite
