libcusparse.so.10 error when importing #1092

Closed
3 of 4 tasks
legohyl opened this issue Apr 3, 2020 · 9 comments

Comments

@legohyl

legohyl commented Apr 3, 2020

📚 Installation

Environment

  • OS: Ubuntu 16.04
  • Python version: 3.6.8
  • PyTorch version: 1.4.0
  • CUDA/cuDNN version: 10.1
  • GCC version: 5.4.0
  • How did you try to install PyTorch Geometric and its extensions (pip, source): pip
  • Any other relevant information:

Utilized the following commands:
pip install torch-scatter==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-sparse==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-cluster==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-spline-conv==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-geometric

Checklist

  • [x] I followed the installation guide.
  • I cannot find my error message in the FAQ.
  • I set up CUDA correctly and can compile CUDA code via nvcc.
  • I have cloned the repository and tried a manual installation from source.
  • I do have multiple CUDA versions on my machine.

Additional context

Hi there, I had torch_geometric 1.4.1 installed and decided to upgrade the package to the latest version. It then raised an error related to SparseTensor, and following the forums I upgraded torch-sparse as well. Now, after upgrading the packages, I can't import torch_geometric: I get the error OSError: libcusparse.so.10: cannot open shared object file: No such file or directory.

The package used to work for me; nothing changed in my installation other than upgrading torch_geometric and then torch-sparse. Also, my LD_LIBRARY_PATH is /usr/local/cuda/lib64: which I think is correct?

I'm using pyenv to manage python versions and I'm also using the virtualenv wrapper for pyenv.
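
One way to double-check the LD_LIBRARY_PATH question above is to scan each directory on that path for the library file. The following is a minimal sketch, assuming a Linux-style colon-separated path; the helper name `find_shared_library` is made up for illustration and is not part of PyTorch or PyTorch Geometric.

```python
import os


def find_shared_library(name, search_path=None):
    """Return every directory on a colon-separated search path that contains `name`.

    `find_shared_library` is a hypothetical helper for illustration only.
    If `search_path` is None, the LD_LIBRARY_PATH environment variable is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries, such as the one produced by a trailing ':'.
        if not directory:
            continue
        if os.path.exists(os.path.join(directory, name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Prints the directories (possibly none) that contain the library.
    print(find_shared_library("libcusparse.so.10"))
```

An empty result does not prove the library cannot be loaded, since the dynamic loader also consults other locations such as the ldconfig cache. Likewise, a hit only shows that a file with that name exists, not that it matches the CUDA version the installed wheels were built for.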

@rusty1s
Member

rusty1s commented Apr 3, 2020

Are libcusparse.so and libcusparse.so.10 included in /usr/local/cuda/lib64?

@legohyl
Author

legohyl commented Apr 3, 2020

Yup, both are included there. I see both files in the folder.

@rusty1s
Member

rusty1s commented Apr 3, 2020

What does nvcc --version output?

@legohyl
Author

legohyl commented Apr 4, 2020

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

Oh wow, it says CUDA 10.0. But oddly, the PyTorch 1.4.0 I have installed is the CUDA 10.1 build. Let me downgrade my PyTorch and all the torch_geometric libraries and try again.
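
The mismatch spotted here can also be checked programmatically by comparing the release that nvcc reports with the CUDA version the installed PyTorch build was compiled against. This is a rough sketch: `parse_nvcc_release` and `torch_cuda_version` are illustrative names, the regular expression assumes the "release X.Y" wording seen in the output above, and `torch.version.cuda` is the attribute PyTorch exposes for its build-time CUDA version.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract the toolkit release (e.g. '10.0') from `nvcc --version` output.

    Hypothetical helper; returns None if no 'release X.Y' string is found.
    """
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def torch_cuda_version():
    """Return the CUDA version PyTorch was built against, or None.

    None is returned when torch is not installed or is a CPU-only build.
    """
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 10.0, V10.0.130"
    print("nvcc toolkit:", parse_nvcc_release(sample))
    print("torch build: ", torch_cuda_version())
```

If the two values differ, the extension wheels (torch-scatter, torch-sparse, and so on) chosen to match PyTorch may look for libraries from a toolkit version that is not the one installed locally.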

@legohyl
Author

legohyl commented Apr 4, 2020

Yup, it works now that I force-installed torch 1.4.0 built for CUDA 10.0. No idea why pip install torch torchvision kept installing the CUDA 10.1 version even though nvcc --version reports CUDA 10.0.

legohyl closed this as completed Apr 4, 2020

@THHHomas

@legohyl I have the same problem. How do I install torch with CUDA 10.0? When I run pip install torch==1.4.0, the CUDA 10.1 build of torch is installed.

@legohyl
Author

legohyl commented Apr 23, 2020

Hey @THHHomas, are you using torch==1.4.0+cu100? torch==1.4.0 defaults to CUDA 10.1. If that still doesn't work, download the .whl directly from https://download.pytorch.org/whl/torch_stable.html and install from it.

Just search for 1.4.0 and download the cu100 wheel.
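
The tag convention used in this advice (CUDA 10.0 becomes cu100) can be sketched as a small helper that builds the pip arguments. The function names `cuda_wheel_tag` and `torch_pip_args` are invented for illustration. Whether a wheel with a given tag is actually published is an assumption to verify against the index page linked above.

```python
# URL taken from the comment above; it is passed to pip's -f (--find-links) option.
TORCH_STABLE_INDEX = "https://download.pytorch.org/whl/torch_stable.html"


def cuda_wheel_tag(cuda_version):
    """Turn a CUDA version such as '10.0' into a wheel tag such as 'cu100'."""
    major, minor = cuda_version.split(".")[:2]
    return "cu{}{}".format(major, minor)


def torch_pip_args(torch_version, cuda_version):
    """Build pip arguments that pin torch to a specific CUDA build.

    Hypothetical helper: it only formats strings and does not check
    that the requested wheel exists.
    """
    requirement = "torch=={}+{}".format(torch_version, cuda_wheel_tag(cuda_version))
    return ["install", requirement, "-f", TORCH_STABLE_INDEX]


if __name__ == "__main__":
    # Prints: pip install torch==1.4.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
    print("pip " + " ".join(torch_pip_args("1.4.0", "10.0")))
```

Pinning the suffix explicitly matters because pip selects a wheel from the package index and does not inspect the locally installed CUDA toolkit.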

@THHHomas

@legohyl Thanks a lot. I found the correct version at https://download.pytorch.org/whl/torch_stable.html. The problem is solved.

@abd96

abd96 commented Jan 4, 2021

I have the same problem:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

TensorFlow seems to find my GPU and list it, but it can't find or load libcudnn.so.7. Here is the detailed output:

2021-01-04 23:00:17.213321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-01-04 23:00:17.255077: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-04 23:00:17.255559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2021-01-04 23:00:17.255755: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-01-04 23:00:17.256877: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-01-04 23:00:17.257862: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-01-04 23:00:17.258059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-01-04 23:00:17.259068: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-01-04 23:00:17.259616: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-01-04 23:00:17.259717: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-01-04 23:00:17.259727: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries.
Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-01-04 23:00:17.259933: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-01-04 23:00:17.265186: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3393140000 Hz
2021-01-04 23:00:17.265564: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8534000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-04 23:00:17.265584: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-01-04 23:00:17.266689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-04 23:00:17.266700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]

TensorFlow version:

tensorflow-estimator 2.2.0
tensorflow-gpu 2.2.0

Help me!!!!!!
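
The log above shows every CUDA library loading except libcudnn.so.7, and the same kind of check can be reproduced outside TensorFlow by asking the dynamic loader directly. This is a minimal sketch using `ctypes.CDLL` from the Python standard library; `try_load` is an illustrative name, and the library names are the ones that appear in the log.

```python
import ctypes


def try_load(library_name):
    """Try to open a shared library; return (ok, error_message).

    Hypothetical helper: it relies on the loader's normal search rules
    (LD_LIBRARY_PATH, ldconfig cache, and so on) and nothing else.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    for name in ("libcudart.so.10.1", "libcusparse.so.10", "libcudnn.so.7"):
        ok, error = try_load(name)
        print(name, "OK" if ok else "FAILED: " + error)
```

cuDNN is distributed separately from the CUDA toolkit, so a toolkit installation that provides libcusparse.so.10 does not necessarily provide libcudnn.so.7.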
