What to do if CUDA doesn't work #1778

@natetyoung

I've been having considerable difficulty getting CUDA to work, and I've found very few resources on what to do when it doesn't.

I'm on Ubuntu 16.04. CUDA is installed, cuDNN is in place, and all the CUDA samples run properly. Nothing failed during installation, and the output of nvidia-smi looks fine.

However, torch.cuda.is_available() returns False, and torch.Tensor().cuda() gives:
THCudaCheck FAIL file=/py/conda-bld/pytorch_1493681908901/work/torch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 272, in __new__
    _lazy_init()
  File "/home/nate/miniconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 85, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /py/conda-bld/pytorch_1493681908901/work/torch/lib/THC/THCGeneral.c:66
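
For anyone trying to reproduce or compare, the checks above can be gathered into one report. This is only a minimal sketch: the function name `collect_cuda_report` and the report keys are hypothetical, and it assumes nothing beyond `torch.cuda.is_available()` and `torch.__version__`. It looks up `torch.version.cuda` defensively, since older releases may not expose it.

```python
import importlib
import shutil


def collect_cuda_report():
    """Gather basic facts about the PyTorch/CUDA setup without raising."""
    report = {
        "torch_installed": False,
        # Only checks that the nvidia-smi binary is on PATH, not that it succeeds.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }
    try:
        torch = importlib.import_module("torch")
    except Exception as exc:  # missing package or a broken shared library
        report["import_error"] = repr(exc)
        return report

    report["torch_installed"] = True
    report["torch_version"] = getattr(torch, "__version__", None)
    # CUDA version the binary was built against, if this release exposes it.
    report["built_with_cuda"] = getattr(getattr(torch, "version", None), "cuda", None)
    try:
        report["cuda_available"] = bool(torch.cuda.is_available())
    except Exception as exc:  # record the failure instead of crashing
        report["cuda_available"] = False
        report["cuda_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in collect_cuda_report().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the traceback makes it easier to tell a driver-level failure apart from a CPU-only or mismatched PyTorch build.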

Is there some troubleshooting document I should be looking at? Can anyone help me here? I've been searching in circles for quite some time.

The oddest thing is that it worked yesterday, but I woke up today and it doesn't. That suggests an environment variable issue or something similar, but I haven't found any way of telling what the environment variables are supposed to be, so I'm at a loss.
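
One way to test the environment-variable theory is to snapshot the relevant variables in a session where CUDA works and compare them with a session where it does not. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the variable list (`CUDA_VISIBLE_DEVICES`, `CUDA_HOME`, `LD_LIBRARY_PATH`, `PATH`) is a guess at the usual suspects, not an authoritative set.

```python
import os

# Hypothetical shortlist of variables that commonly affect CUDA discovery;
# extend it as needed for a particular setup.
CUDA_RELATED_VARS = ("CUDA_VISIBLE_DEVICES", "CUDA_HOME", "LD_LIBRARY_PATH", "PATH")


def cuda_env_snapshot(environ=None):
    """Return the CUDA-related variables from environ (default: os.environ)."""
    env = os.environ if environ is None else environ
    # Unset variables are reported as None so they show up in a diff.
    return {name: env.get(name) for name in CUDA_RELATED_VARS}


def diff_snapshots(before, after):
    """Return {name: (before_value, after_value)} for variables that differ."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    for name, value in cuda_env_snapshot().items():
        print(f"{name}={value!r}")
```

For example, a snapshot saved from a working shell can be passed as `before` and the current one as `after`. An empty `CUDA_VISIBLE_DEVICES` is worth looking for in particular, since it hides every GPU from the process.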
