Unknown error with CUDA #62
Comments
@zzhat0706 I assume you built PyTorch3D from a local clone? Can you run your code if you change the device to cpu? Can you run other PyTorch code on the GPU, i.e. without PyTorch3D? This looks like an issue with PyTorch rather than PyTorch3D. There is an issue on the PyTorch repo that references this problem: pytorch/pytorch#17108. Did you check it?
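One way to answer the questions above is to run a tiny tensor operation with plain PyTorch and see whether it succeeds on the GPU. The following is a minimal sketch, not an official diagnostic; the helper name `probe_device` is hypothetical, and it only uses `torch.cuda.is_available()` and a basic tensor op.

```python
def probe_device(preferred="cuda:0"):
    """Return (device_name, error_message) after trying a tiny tensor op.

    Hypothetical helper: it uses plain PyTorch only, so a failure here
    points at PyTorch or the driver setup rather than at PyTorch3D.
    """
    try:
        import torch  # imported lazily so the probe also runs without PyTorch
    except ImportError as exc:
        return "cpu", f"PyTorch is not importable: {exc}"
    try:
        if not torch.cuda.is_available():
            return "cpu", "torch.cuda.is_available() returned False"
        # Allocate on the GPU and force a kernel launch plus a copy back.
        x = torch.ones(4, device=preferred)
        if (x * 2).sum().item() != 8.0:
            return "cpu", "GPU computation returned an unexpected value"
    except RuntimeError as exc:
        # CUDA initialisation and kernel errors surface as RuntimeError.
        return "cpu", f"CUDA operation failed: {exc}"
    return preferred, None


device_name, error = probe_device()
print("usable device:", device_name)
if error is not None:
    print("reason for fallback:", error)
```

If the probe falls back to cpu, the same failure should be reproducible without importing PyTorch3D at all.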
@nikhilaravi Thanks for your answers!
Hello, I did all the things mentioned above but I still get the error:
RuntimeError Traceback (most recent call last)
~/Disk/Software/Anaconda3/envs/pytorch3d/lib/python3.7/site-packages/pytorch3d/structures/textures.py in __init__(self, maps, faces_uvs, verts_uvs, verts_rgb)
RuntimeError: CUDA error: no kernel image is available for execution on the device
I downgraded the Nvidia driver but the error persists.
Surprisingly, all the tests in the test folder passed. But the notebook tutorials fail with an error.
I can't see the name of the GPU you are using because it is truncated ("GeForce GTX TIT..."). Is it a TITAN X? If it is one of the other TITANs, then I think it has compute capability 3.5 and so needs a local build of PyTorch.
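The "no kernel image is available" error generally means the installed binary contains no compiled kernels for the GPU's architecture. A rough sketch of that check follows. The helper names and the example architecture list are hypothetical, the exact-match comparison is a conservative simplification, and `torch.cuda.get_arch_list()` is only present in more recent PyTorch releases, so the live check is guarded.

```python
def arch_tag(capability):
    """Turn a (major, minor) compute capability into an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def binary_has_kernels(capability, arch_list):
    """Conservative check: is the exact architecture in the build's list?"""
    return arch_tag(capability) in arch_list


# Hypothetical list, for illustration only. It is NOT a statement about
# which architectures any particular PyTorch package ships.
example_arch_list = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(binary_has_kernels((3, 5), example_arch_list))  # False
print(binary_has_kernels((7, 0), example_arch_list))  # True

# Live check, skipped when PyTorch, a GPU, or get_arch_list() is missing.
try:
    import torch

    if torch.cuda.is_available() and hasattr(torch.cuda, "get_arch_list"):
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print(torch.cuda.get_device_name(0), cap, binary_has_kernels(cap, archs))
except (ImportError, RuntimeError) as exc:
    print("live check skipped:", exc)
```

`torch.cuda.get_device_name(0)` also prints the full GPU name, which avoids the truncation seen in the nvidia-smi table.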
Hello,
It is a GTX TITAN Black card.
Ubuntu 18.04 now offers many Nvidia drivers (440, 435, 410). Which of them is required to build PyTorch3D?
I don't think the driver matters, but you might as well use the latest driver that works with your GPU. If nvidia-smi is working then you probably have a working driver (although I am not sure about this). The problem is that you cannot install PyTorch from conda for your GPU (except old versions which we don't support). You will need to set up a new conda environment, and follow the instructions at (If all this sounds too hard, and you just want to get a feel for the tutorials, and you are not expecting that you will be using
@zzhat0706, @shersoni610 were you able to resolve this installation issue? If so, please share what you did here for others to replicate!
@nikhilaravi Sorry for the late reply. I eventually chose to run on nvidia-docker, and then everything worked perfectly!
First of all, thank you for your great work!
When I try to run the tutorial that deforms a sphere into a dolphin, I get an unexpected error when moving the mesh vertices to the device, which is set to cuda:0. Here is the error log:
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp line=50 error=30 : unknown error
Traceback (most recent call last):
  File "dolphin.py", line 33, in <module>
    faces_idx = faces.verts_idx.to(device)
  File "/home/jormungandr/anaconda3/envs/pytorch3d/lib/python3.6/site-packages/torch/cuda/__init__.py", line 197, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp:50
I tried running rmmod on nvidia and nvidia-uvm, but each command failed with one of these errors:
rmmod: ERROR: Module nvidia_uvm is not currently loaded
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset
I rebooted once, but nothing changed.
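The "in use by" message means another module depends on nvidia, so dependent modules have to be removed first. As a rough sketch, the order can be derived from the "used by" column of /proc/modules (the same data lsmod prints). The helper names and the sample text below are hypothetical, and running processes such as a display server can still hold a module even when the order is right.

```python
def parse_used_by(proc_modules_text):
    """Map each module name to the list of modules that use it.

    Expects the /proc/modules layout: name, size, refcount, users, state, ...
    where users is '-' or a comma-terminated list such as 'nvidia_drm,'.
    """
    used_by = {}
    for line in proc_modules_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        name, users = fields[0], fields[3]
        used_by[name] = [] if users == "-" else [u for u in users.split(",") if u]
    return used_by


def unload_order(target, used_by):
    """List the target and everything that uses it, dependents first."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for user in used_by.get(name, []):
            visit(user)  # anything using this module must go first
        order.append(name)

    visit(target)
    return order


# Hypothetical sample; on a real machine read the text from /proc/modules.
sample = """
nvidia_uvm 1000 0 - Live 0x0
nvidia_drm 1000 3 - Live 0x0
nvidia_modeset 1000 1 nvidia_drm, Live 0x0
nvidia 1000 2 nvidia_uvm,nvidia_modeset, Live 0x0
"""
print(unload_order("nvidia", parse_used_by(sample)))
# ['nvidia_uvm', 'nvidia_drm', 'nvidia_modeset', 'nvidia']
```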
And my environment is as follows:
PyTorch : 1.4
Python : 3.6.10
CUDA : 10.0 by nvcc(runtime)
cuDNN : 7.0
OS : Ubuntu 18.04
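The CUDA version reported by nvcc can differ from the one a PyTorch binary was built against, so it may help to report what PyTorch itself sees. A minimal sketch, assuming only `torch.__version__`, `torch.version.cuda` and `torch.backends.cudnn.version()`; the function name `collect_versions` is hypothetical.

```python
import platform


def collect_versions():
    """Gather version info as seen from inside the Python environment."""
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
        "torch": None,
    }
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this environment
    info["torch"] = torch.__version__
    # CUDA version the PyTorch build targets; None for CPU-only builds.
    info["torch_cuda"] = torch.version.cuda
    # cuDNN version PyTorch can load, or None if cuDNN is unavailable.
    info["cudnn"] = torch.backends.cudnn.version()
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

PyTorch also ships a fuller report via `python -m torch.utils.collect_env`.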