RuntimeError: CUDA error: device-side assert triggered #121

Open
987410 opened this issue Jun 17, 2022 · 1 comment


987410 commented Jun 17, 2022

Hello,
When I add traced_cpu = torch.jit.trace(model, images.clone().detach()) to convert the model, I get the error shown after this code:
model.load_state_dict(self.exp.get_epoch_model(epoch))
model = model.to(self.device)
model.eval()
if on_val:
    dataloader = self.get_val_dataloader()
else:
    dataloader = self.get_test_dataloader()
test_parameters = self.cfg.get_test_parameters()
predictions = []
self.exp.eval_start_callback(self.cfg)
with torch.no_grad():
    for idx, (images, _, _) in enumerate(tqdm(dataloader)):
        images = images.to(self.device)
        import pdb
        pdb.set_trace()
        traced_cpu = torch.jit.trace(model, images.clone().detach())
        torch.jit.save(traced_cpu, "laneATT.pth")
        output = model(images, **test_parameters)

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [24,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [26,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [28,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [29,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [30,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
0%| | 0/1 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/media/disk_8t/tvm/venvs/laneATT/lib/python3.9/site-packages/torch/jit/_trace.py", line 443, in run_mod_and_filter_tensor_outputs
    outs = wrap_retval(mod(*_clone_inputs(inputs)))
RuntimeError: CUDA error: device-side assert triggered

How can I fix this? Thanks.

@lucastabelini
Owner

I have never worked with PyTorch's JIT tracing, so I can't really help you.
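
For anyone hitting the same log: the assertion printed from IndexKernel.cu is a bounds check, index >= -sizes[i] && index < sizes[i], so some index used during the traced forward pass falls outside the dimension it indexes. The log does not say which tensor or which index is involved. Below is a minimal sketch of that same condition in plain Python, for checking index values by hand while stopped in the debugger. The function name out_of_bounds and the sample data are hypothetical and not part of LaneATT or PyTorch.

```python
def out_of_bounds(indices, size):
    """Return (position, value) pairs that violate -size <= index < size.

    This mirrors the condition in the assertion message; `indices` is any
    iterable of Python ints and `size` is the length of the indexed dimension.
    """
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not (-size <= idx < size)
    ]


# Hypothetical data: a dimension of length 10 and a few candidate indices.
# -10 is still valid (it addresses element 0); 10 and -11 are not.
bad = out_of_bounds([0, 9, -10, 10, -11], size=10)
print(bad)
```

Device-side asserts are reported asynchronously, so the Python traceback may not point at the operation that actually failed. Setting the environment variable CUDA_LAUNCH_BLOCKING=1, or running the same trace with the model and images on the CPU, usually produces an error closer to the offending indexing operation.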
