Open
Labels: module: build, module: cuda, triaged
Description
🐛 Describe the bug
This is on a fresh install of Ubuntu 22.04 in WSL. All I did before running this was run apt-get upgrade, then wget and run the Linux Miniconda installer. I did not download any CUDA packages from NVIDIA, and I did not install the Ubuntu nvidia-cuda-toolkit package.
conda create -n torch pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
(skip install text)
conda activate torch
python test.py
test.py:
import torch
print(f"{torch.cuda.is_available()=}")
f = torch.nn.Conv2d(3, 8, 3, device="cuda")
X = torch.randn(2, 3, 4, 4, device="cuda")
Y = X @ X
print(f"{Y.shape=}")
print("matrix multiply works")
Y = f(X)
print(f"{Y.shape=}")
print("Conv2d works")
Console output:
torch.cuda.is_available()=True
Y.shape=torch.Size([2, 3, 4, 4])
matrix multiply works
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
Please make sure libcudnn_cnn_infer.so.8 is in your library path!
Aborted
The shared objects are where I would expect them to be:
$ ls $CONDA_PREFIX/lib/python3.10/site-packages/torch/lib
libc10_cuda.so libcudnn_adv_train.so.8 libcudnn_ops_train.so.8 libtorch_cpu.so libtorch_cuda.so
libc10.so libcudnn_cnn_infer.so.8 libcudnn.so.8 libtorch_cuda_cpp.so libtorch_global_deps.so
libcaffe2_nvrtc.so libcudnn_cnn_train.so.8 libcupti-53b4cc5d.so.11.3 libtorch_cuda_cu.so libtorch_python.so
libcudnn_adv_infer.so.8 libcudnn_ops_infer.so.8 libshm.so libtorch_cuda_linalg.so libtorch.so
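The error message names libcuda.so, which is not among the files in the listing above (that library normally ships with the GPU driver, not with PyTorch). The following is a minimal diagnostic sketch, not a fix, for checking whether the dynamic loader can open it from inside the conda environment. The helper `can_load` is a hypothetical name introduced here, and the directory `/usr/lib/wsl/lib` is an assumption about where WSL2 exposes the Windows driver's libraries; it may differ on other setups.

```python
import ctypes
import os


def can_load(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


# Assumption: WSL2 exposes the driver's user-mode libraries in this directory.
WSL_LIB_DIR = "/usr/lib/wsl/lib"

# The cuDNN error asks for the unversioned name; the versioned one is checked for comparison.
for lib_name in ("libcuda.so", "libcuda.so.1"):
    print(f"{lib_name}: loadable={can_load(lib_name)}")

# Show which libcuda files, if any, exist in the assumed WSL directory.
if os.path.isdir(WSL_LIB_DIR):
    print(sorted(f for f in os.listdir(WSL_LIB_DIR) if f.startswith("libcuda")))
else:
    print(f"{WSL_LIB_DIR} not found")

# Show the search path the loader was given through the environment, if any.
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
```

If `libcuda.so.1` loads but `libcuda.so` does not, that would suggest the unversioned name is simply not on the loader's search path, which would be consistent with the message above.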
This error does not happen when running PyTorch on the Windows system that hosts this WSL container. On Windows, everything works fine.
This is a different bug from #73487. They report PyTorch not finding CUDA, while what I've found is PyTorch crashing on a call to, e.g., a forward pass of a Conv2d layer, with an error saying it's unable to load the libcudnn shared object.
Versions
$ python collect_env.py
Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35
Python version: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 516.94
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py310h7f8727e_0
[conda] mkl_fft 1.3.1 py310hd6ae3a3_0
[conda] mkl_random 1.2.2 py310h00e6091_0
[conda] numpy 1.23.1 py310h1794996_0
[conda] numpy-base 1.23.1 py310hcba007f_0
[conda] pytorch 1.12.1 py3.10_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.12.1 py310_cu113 pytorch
[conda] torchvision 0.13.1 py310_cu113 pytorch