Unknown error with CUDA #62

zzhat0706 · 2020-02-15T11:50:11Z

First of all, thank you for your great work!
When I tried to run the sphere-to-dolphin deformation tutorial, I hit an unexpected error while loading the mesh vertices onto the device, which is set to cuda:0. Here is the error log:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp line=50 error=30 : unknown error
Traceback (most recent call last):
File "dolphin.py", line 33, in <module>
faces_idx = faces.verts_idx.to(device)
File "/home/jormungandr/anaconda3/envs/pytorch3d/lib/python3.6/site-packages/torch/cuda/__init__.py", line 197, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp:50

I tried running rmmod on nvidia and nvidia-uvm, but each command failed with one of these errors:
rmmod: ERROR: Module nvidia_uvm is not currently loaded
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset
I also rebooted once, but nothing changed.
And my environment is as follows:
Pytorch : 1.4
Python : 3.6.10
CUDA : 10.0 by nvcc(runtime)
cuDNN : 7.0
OS : Ubuntu 18.04

nikhilaravi · 2020-02-16T16:55:18Z

@zzhat0706 I am assuming you built PyTorch3D from a local clone? Are you able to run the tutorial if you change the device to cpu? Are you able to run other PyTorch code on the GPU, i.e. without PyTorch3D? This looks like an issue with PyTorch, not PyTorch3D.

There's an issue on the PyTorch repo that references this problem. Did you check it? pytorch/pytorch#17108
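
One way to separate a PyTorch problem from a PyTorch3D problem is to probe CUDA without importing PyTorch3D at all. The snippet below is a minimal sketch, not an official diagnostic: `cuda_report` is a hypothetical helper name, and it assumes that moving a small tensor to `cuda:0` goes through the same lazy CUDA initialization that fails in the traceback above.

```python
def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup without PyTorch3D."""
    report = {
        "torch": None,            # installed PyTorch version, if any
        "built_with_cuda": None,  # CUDA version PyTorch was built against
        "cuda_usable": False,     # True only if a tensor reached the GPU
        "detail": "",             # device name on success, error text otherwise
    }
    try:
        import torch
    except ImportError as err:
        report["detail"] = f"import failed: {err}"
        return report

    report["torch"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds

    try:
        # .to("cuda:0") initializes CUDA lazily, like the tutorial's .to(device).
        torch.zeros(1).to("cuda:0")
        report["cuda_usable"] = True
        report["detail"] = torch.cuda.get_device_name(0)
    except Exception as err:  # RuntimeError or AssertionError, depending on the build
        report["detail"] = str(err)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_usable` comes back False here, the failure is independent of PyTorch3D and the PyTorch issue linked above is the better place to look.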

zzhat0706 · 2020-02-17T02:29:05Z

@nikhilaravi Thanks for your answers!
Before PyTorch3D, I had already run other PyTorch code such as CycleGAN and W-GAN. I installed PyTorch3D from Anaconda Cloud, not from a local clone.
I haven't tried the CPU yet; I'll check it later.

shersoni610 · 2020-02-19T06:16:02Z

Hello,

I did all the things mentioned above but I still get the error:

RuntimeError Traceback (most recent call last)
in
23
24 # Create a textures object
---> 25 tex = Textures(verts_uvs=verts_uvs, faces_uvs=faces_uvs, maps=texture_image)
26
27 # Create a meshes object with textures

~/Disk/Software/Anaconda3/envs/pytorch3d/lib/python3.7/site-packages/pytorch3d/structures/textures.py in __init__(self, maps, faces_uvs, verts_uvs, verts_rgb)
118
119 if self._faces_uvs_padded is not None:
--> 120 self._num_faces_per_mesh = faces_uvs.gt(-1).all(-1).sum(-1).tolist()
121
122 def clone(self):

RuntimeError: CUDA error: no kernel image is available for execution on the device

shersoni610 · 2020-02-19T06:17:24Z

I downgraded the Nvidia driver but the error persists:

shersoni610 · 2020-02-19T06:20:16Z

Surprisingly, all the tests in the test folder passed, but the notebook tutorials fail with the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device

bottler · 2020-02-19T11:27:35Z

I can't see the name of the GPU you are using due to truncation. ("GeForce GTX TIT...") Is it a TITAN X? If it is one of the other TITANs, then I think it will have compute capability 3.5 and so need a local build of pytorch.
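
The compute-capability point can be sketched in a few lines. A prebuilt binary only contains GPU kernels for the architectures it was compiled for, and "no kernel image is available" is what CUDA reports when none of them fit the device. The check below is a simplified illustration, not PyTorch's own logic: `parse_arch`, `has_kernel_for`, and `ILLUSTRATIVE_ARCHS` are hypothetical names, the arch list is made up for the example rather than taken from any PyTorch release, and the rule ignores PTX code that the driver could JIT-compile for newer GPUs.

```python
# Illustrative arch list only; NOT the real list of any PyTorch package.
ILLUSTRATIVE_ARCHS = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75"]


def parse_arch(arch):
    """Turn an arch string such as 'sm_35' into a (major, minor) tuple."""
    digits = arch.split("_", 1)[1]
    return int(digits[:-1]), int(digits[-1])


def has_kernel_for(capability, compiled_archs):
    """Return True if some compiled arch can run on the given device.

    Binary kernels built for X.y run on devices of capability X.z with z >= y
    (same major version only). PTX forward compatibility is not modelled.
    """
    major, minor = capability
    for arch in compiled_archs:
        arch_major, arch_minor = parse_arch(arch)
        if arch_major == major and minor >= arch_minor:
            return True
    return False


if __name__ == "__main__":
    # A compute capability 3.5 card against the illustrative list.
    print(has_kernel_for((3, 5), ILLUSTRATIVE_ARCHS))
    # A compute capability 5.2 card against the same list.
    print(has_kernel_for((5, 2), ILLUSTRATIVE_ARCHS))
```

On a working install, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` tuple for the first GPU, which can be passed in as `capability`.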

shersoni610 · 2020-02-19T21:42:17Z

Hello, it is a GTX TITAN Black card. Ubuntu 18.04 now offers many Nvidia drivers. Which driver version is required to build pytorch3d (440, 435, 410)?

bottler · 2020-02-20T11:42:08Z

I don't think the driver matters, but you might as well use the latest driver which is working with your GPU. If nvidia-smi is working then you probably have a working driver (although I am not sure about this).

The problem is that the pytorch packages on conda (except old versions, which we don't support) will not work with your GPU. You will need to set up a new conda environment and follow the instructions at https://github.com/pytorch/pytorch#from-source for your GPU. I suggest you check out the branch v1.4. I think you will then need to install torchvision from source as well, and then install pytorch3d from source.

(If all this sounds too hard, and you just want to get a feel for the tutorials, and you are not expecting that you will be using pytorch3d much on your computer, maybe you can run them on colab instead. Alternatively, if you install just pytorch3d from github, then you may be able to run the tutorial entirely on the CPU - just change device = torch.device("cuda:0") to device = torch.device("cpu") in the tutorial.)
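
The CPU fallback can also be made automatic instead of editing the tutorial by hand. This is a rough sketch under assumptions: `pick_device_name` and `torch_cuda_probe` are hypothetical helpers, and the probe assumes that running one tiny operation on `cuda:0` is enough to surface errors like the ones in this thread, which may not hold for every kernel.

```python
def pick_device_name(probe):
    """Return 'cuda:0' if probe() runs without raising, otherwise 'cpu'."""
    try:
        probe()
        return "cuda:0"
    except Exception:
        # Any failure (missing torch, driver error, missing kernels) means CPU.
        return "cpu"


def torch_cuda_probe():
    """Raise if a trivial CUDA computation cannot be completed."""
    import torch

    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available")
    # .item() copies the result back to the host, which forces the kernel to
    # actually run and any asynchronous CUDA error to be raised here.
    (torch.ones(1, device="cuda:0") + 1).item()


if __name__ == "__main__":
    name = pick_device_name(torch_cuda_probe)
    print(name)
    # In the tutorial this would replace the hard-coded line:
    #   device = torch.device(name)
```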

nikhilaravi · 2020-02-24T06:15:49Z

@zzhat0706, @shersoni610 were you able to resolve this installation issue? If so, please share what you did here for others to replicate!

zzhat0706 · 2020-03-09T09:03:10Z

@nikhilaravi Sorry for the late reply, I eventually chose to run on nvidia-docker, then everything just worked out perfectly!
Thx for your great work again!

vilhub mentioned this issue Feb 15, 2020

Cuda runtime error during mesh optimization #63

Closed

nikhilaravi self-assigned this Feb 18, 2020

nikhilaravi added the "question" label (Further information is requested) Feb 24, 2020

zzhat0706 closed this as completed Mar 9, 2020

rohitdavas mentioned this issue Jul 4, 2020

Not compiled with gpu support error while installing on system. #257

Closed
