What to do if CUDA doesn't work

I've been having considerable difficulty getting CUDA to work, and I've found very few resources on what to do when it doesn't.

I'm on Ubuntu 16.04. CUDA is installed, cuDNN is in place, and all the CUDA samples run properly. Nothing went wrong during installation, and the output of `nvidia-smi` looks fine.

However, `torch.cuda.is_available()` returns False, and `torch.Tensor().cuda()` gives:
`THCudaCheck FAIL file=/py/conda-bld/pytorch_1493681908901/work/torch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 272, in __new__
    _lazy_init()
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 85, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /py/conda-bld/pytorch_1493681908901/work/torch/lib/THC/THCGeneral.c:66`
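
For anyone trying to reproduce this, here is a minimal sketch of a report that gathers the same facts in one place. The function name `cuda_report` is hypothetical, and `torch.version.cuda` is assumed to exist only on more recent PyTorch releases, so it is read defensively. The sketch only collects information and does not fix anything.

```python
import importlib
import shutil


def cuda_report():
    """Collect basic facts about the local PyTorch/CUDA setup as a dict."""
    report = {
        # Path to the nvidia-smi binary, or None if it is not on PATH.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this interpreter.
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    # CUDA version the binary was built against; None on CPU-only builds.
    # Read defensively because older releases may not expose it (assumption).
    report["built_with_cuda"] = getattr(
        getattr(torch, "version", None), "cuda", None
    )
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value!r}")
```

Pasting the printed output into a report like this one makes it easier to tell a CPU-only build apart from a driver or runtime problem.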

Is there a troubleshooting document I should be looking at? Can anyone help me here? I've been searching in circles for quite some time.

The oddest thing is that it worked yesterday, but when I tried again today it didn't. That suggests an environment variable issue or something similar, but I haven't found any way of telling what the environment variables are supposed to be, so I'm at a loss.
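
One way to compare a working session against a broken one is to dump the CUDA-related environment variables in each and diff the output. The sketch below does that. The variable list is an assumption about what commonly matters and is not exhaustive, and `cuda_env_snapshot` is a hypothetical helper name.

```python
import os

# Assumed list of variables that commonly affect CUDA; not exhaustive.
CUDA_ENV_VARS = ("CUDA_VISIBLE_DEVICES", "CUDA_HOME", "LD_LIBRARY_PATH", "PATH")
PATH_LIKE = ("LD_LIBRARY_PATH", "PATH")


def cuda_env_snapshot(environ=None):
    """Return the CUDA-related variables from `environ` (default: os.environ)."""
    environ = os.environ if environ is None else environ
    snapshot = {}
    for name in CUDA_ENV_VARS:
        value = environ.get(name)  # None means the variable is unset
        if value is not None and name in PATH_LIKE:
            # Keep only the path entries that mention cuda or nvidia.
            value = [
                entry
                for entry in value.split(os.pathsep)
                if "cuda" in entry.lower() or "nvidia" in entry.lower()
            ]
        snapshot[name] = value
    return snapshot


if __name__ == "__main__":
    for name, value in cuda_env_snapshot().items():
        print(f"{name} = {value!r}")
```

Note that `CUDA_VISIBLE_DEVICES` set to an empty string hides every GPU from the process, which is different from the variable being unset, so the snapshot keeps those two cases distinct.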


What to do if CUDA doesn't work #1778
