Conda Pytorch (Pytorch channel) in WSL2 Ubuntu can't find libcudnn shared objects #85773

@mawright

🐛 Describe the bug

This is on a fresh install of Ubuntu 22.04 in WSL. All I did before running this was an apt-get upgrade, then wget and run the Linux Miniconda installer. I did not install any CUDA packages from Nvidia or the Ubuntu nvidia-cuda-toolkit package.

conda create -n torch pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

(skip install text)

conda activate torch
python test.py

test.py:

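import torch

# Confirm that PyTorch can see the GPU at all.
print(f"{torch.cuda.is_available()=}")

f = torch.nn.Conv2d(3, 8, 3, device="cuda")
X = torch.randn(2, 3, 4, 4, device="cuda")

# Batched matrix multiply on the GPU; this step succeeds.
Y = X @ X
print(f"{Y.shape=}")
print("matrix multiply works")

# Conv2d forward pass; this is the call that aborts.
Y = f(X)
print(f"{Y.shape=}")
print("Conv2d works")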

Console output:

torch.cuda.is_available()=True
Y.shape=torch.Size([2, 3, 4, 4])
matrix multiply works
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
Please make sure libcudnn_cnn_infer.so.8 is in your library path!
Aborted
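Note that the loader error names libcuda.so (the driver library), not a libcudnn file. A minimal sketch for checking this directly, independent of PyTorch: it asks the dynamic loader to open a library by name and reports whether that succeeds. The helper name `can_load` is hypothetical, and the two library names tried are an assumption about what is worth comparing (the unversioned name from the error message and the versioned `libcuda.so.1`).

```python
import ctypes


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        # Raised when the library cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    # Assumption: comparing the unversioned and versioned driver library names
    # shows whether only the bare "libcuda.so" name is missing from the search path.
    for name in ("libcuda.so", "libcuda.so.1"):
        status = "loadable" if can_load(name) else "NOT loadable"
        print(f"{name}: {status}")
```

If `libcuda.so` is reported as not loadable in the same shell where test.py aborts, that would be consistent with the error above being a library search path problem rather than a missing cuDNN file.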

The shared objects are where I would expect them to be:

$ ls $CONDA_PREFIX/lib/python3.10/site-packages/torch/lib
libc10_cuda.so           libcudnn_adv_train.so.8  libcudnn_ops_train.so.8    libtorch_cpu.so          libtorch_cuda.so
libc10.so                libcudnn_cnn_infer.so.8  libcudnn.so.8              libtorch_cuda_cpp.so     libtorch_global_deps.so
libcaffe2_nvrtc.so       libcudnn_cnn_train.so.8  libcupti-53b4cc5d.so.11.3  libtorch_cuda_cu.so      libtorch_python.so
libcudnn_adv_infer.so.8  libcudnn_ops_infer.so.8  libshm.so                  libtorch_cuda_linalg.so  libtorch.so
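The listing above contains the cuDNN libraries but no libcuda.so, which is normally supplied by the GPU driver rather than by the conda packages. A rough sketch for looking for it elsewhere is below. The helper name `find_shared_objects` is hypothetical, and `/usr/lib/wsl/lib` is an assumption: it is where WSL2 typically exposes the Windows driver's libraries, but it may differ on a given setup.

```python
import os
from pathlib import Path


def find_shared_objects(directories, pattern="libcuda.so*"):
    """Return sorted paths matching pattern in each existing directory."""
    hits = []
    for directory in directories:
        path = Path(directory)
        if path.is_dir():
            hits.extend(sorted(str(match) for match in path.glob(pattern)))
    return hits


if __name__ == "__main__":
    # Directories currently on the loader's search path via the environment.
    candidates = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    # Assumption: typical location of driver libraries inside WSL2.
    candidates.append("/usr/lib/wsl/lib")
    for hit in find_shared_objects(candidates):
        print(hit)
```

If the file only turns up in a directory that is not on LD_LIBRARY_PATH, that would suggest the loader simply is not searching there.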

This error does not happen when running PyTorch on the Windows host of this WSL instance. On Windows, everything works fine.

This is a different bug from #73487. That issue reports PyTorch not finding CUDA, whereas here PyTorch finds CUDA but crashes on a call to, e.g., the forward pass of a Conv2d layer, with an error saying it is unable to load a libcudnn shared object.

Versions

$ python collect_env.py
Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 516.94
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchvision==0.13.1
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               11.3.1               h2bc3f7f_2
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.4.0           h06a4308_640
[conda] mkl-service               2.4.0           py310h7f8727e_0
[conda] mkl_fft                   1.3.1           py310hd6ae3a3_0
[conda] mkl_random                1.2.2           py310h00e6091_0
[conda] numpy                     1.23.1          py310h1794996_0
[conda] numpy-base                1.23.1          py310hcba007f_0
[conda] pytorch                   1.12.1          py3.10_cuda11.3_cudnn8.3.2_0    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torchaudio                0.12.1              py310_cu113    pytorch
[conda] torchvision               0.13.1              py310_cu113    pytorch

cc @malfet @seemethere @ngimel

Metadata

Assignees

No one assigned

    Labels

    module: build (Build system issues)
    module: cuda (Related to torch.cuda, and CUDA support in general)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
