[Bug] TVM model gives different results when run multiple times #10545
Comments
Make sure you get more basic examples working first, and post a complete, working repro script; what you showed is not really executable. This is probably a general usage question rather than a bug, so please post your question to the forum.

Thanks for your attention.

You can use https://gist.github.com rather than pasting your code on GitHub.

Thanks for your help, I'll try!

The gist code: https://gist.github.com/qingcd/6d9d228d92a7b6d09732a6070473a229
After converting my PyTorch model to a TVM model, it gives different results when run multiple times with the same input. The CUDA target format is ptx. If the target format is changed back to cubin, the problem goes away.
Expected behavior
The result of multiple runs using the same input should stay the same; the sample script should print:
max abs diff is: 0
Actual behavior
max abs diff is: 7.818208
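For reference, a minimal sketch of the check behind this number (the helper name `max_abs_diff` and the `run_once` callable are hypothetical, not from the original script; the actual script below simply runs the TVM module twice on one input and compares the outputs element-wise):

```python
import numpy as np

def max_abs_diff(run_once, inpt, n_runs=2):
    """Run the same computation n_runs times on identical input and
    return the largest element-wise deviation; 0 means deterministic."""
    outs = [run_once(inpt) for _ in range(n_runs)]
    return max(np.abs(outs[0] - o).max() for o in outs[1:])
```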
Environment
GPU: RTX 2070
nvcc: Cuda compilation tools, release 11.1, V11.1.74
Nvidia Driver Version: 470.86
system: Linux shukun-desktop 5.13.0-27-generic #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
TVM commit: 0c836b7
Steps to reproduce
Change the target_fmt from cubin to ptx in python/tvm/contrib/nvcc.py:
```python
@tvm._ffi.register_func
def tvm_callback_cuda_compile(code):
    """use nvcc to generate fatbin code for better optimization"""
    ptx = compile_cuda(code, target_format="fatbin")  # change "fatbin" to "ptx" to reproduce
    return ptx
```
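If you prefer not to edit the TVM source tree, the same callback can likely be overridden from the repro script itself; a sketch, assuming `tvm.register_func(..., override=True)` replaces the callback registered at import time:

```python
import tvm
from tvm.contrib.nvcc import compile_cuda

# Sketch: re-register the CUDA compile callback with override=True so the
# edit to python/tvm/contrib/nvcc.py is not needed (assumption: the
# user-registered callback takes precedence over the built-in one).
@tvm.register_func("tvm_callback_cuda_compile", override=True)
def tvm_callback_cuda_compile(code):
    return compile_cuda(code, target_format="ptx")  # "fatbin" avoids the bug
```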
Run this code (truncated here; the full script is in the gist above):
```python
import math
import numpy as np
import torch
import torch.nn.functional as F
from torch import nn

import tvm
from tvm import relay
from tvm.contrib import graph_executor


class BatchActivateConvLayer(nn.Module):
    def __init__(
        self, channel_in, growth_rate, bottleneck_size_basic_factor, drop_ratio=0.8
    ):
        ...  # body elided; see the gist for the full definition


class DenseBlock(nn.Module):
    def __init__(
        self,
        current_block_layers_number,
        channel_in,
        growth_rate,
        bottleneck_size_basic_factor,
        drop_ratio=0.8,
    ):
        ...  # body elided; see the gist for the full definition


class DenseNet(nn.Module):
    def __init__(
        self,
        growth_rate=24,
        block_config=(2, 2),
        compression=0.5,
        num_init_features=24,
        bottleneck_size_basic_factor=2,
        drop_rate=0,
        num_classes=2,
        small_inputs=True,
        rnn_units=512,
    ):
        super(DenseNet, self).__init__()
        ...  # body elided; see the gist for the full definition


def run_tvm_module(module, inpt):
    module.set_input(0, inpt)
    module.run()
    tvm.cuda().sync()
    res = module.get_output(0).numpy()
    return res


if __name__ == "__main__":
    model = DenseNet()
    model.eval()
    model_jit = torch.jit.trace(model, example_inputs=torch.randn((4, 2, 64, 64, 64)))
    print("finish gen trace model")
```