Could not load dynamic library 'libcusolver.so.10' - TF-2.4.0RC, Cuda,CudNN, RTX 3080 #44777
Comments
I built TensorFlow with CUDA 11.1.1 and cuDNN 8.0.5 myself today and it works just fine. Comparing the loaded libraries, there are some differences: On my system; In your case we find: Could not load dynamic library 'libcusolver.so.10', while it should be looking for libcusolver.so.11. Maybe you accidentally built TensorFlow against some leftover files from an old CUDA installation? Try to completely delete CUDA first (delete /usr/local/cuda*), then do a fresh CUDA 11.1.1 install, then install cuDNN 8.0.5, then install TensorRT. Then during ./configure, ensure that the libraries it finds are indeed the correct ones, and ensure you enter CUDA 11, cuDNN 8 and TensorRT 7, rather than the defaults 10, 7, and 6. Hope this helps! :) Also, even though it should not be the root cause of this problem, clean up your LD_LIBRARY_PATH; it should not contain the same path so many times. EDIT: Oh my bad, I did not notice you didn't build from source. I am unsure why the tf-nightly build is built against libcusolver.so.10. You could try to compile it yourself. It takes a bit more time, but works for me. |
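The LD_LIBRARY_PATH cleanup mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name `dedupe_search_path` is made up for this example, and removing duplicates tidies the variable but is not expected to fix the missing library on its own.

```python
import os


def dedupe_search_path(value: str, sep: str = ":") -> str:
    """Return `value` with empty and repeated entries removed, order preserved."""
    seen = set()
    kept = []
    for entry in value.split(sep):
        # Skip empty entries (e.g. from a trailing ':') and anything already kept.
        if entry and entry not in seen:
            seen.add(entry)
            kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    # Print a cleaned copy of the current value; nothing is modified in place.
    print(dedupe_search_path(os.environ.get("LD_LIBRARY_PATH", "")))
```

The printed value could then be exported from the shell profile in place of the repeated one.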
Do you encounter the same issues with 2.4.0 RC0 or RC1? |
Yup @ion-elgreco |
@Amokstakov Look for the library in cuda 10.1 or cuda 10.2 folders in your installation directory and then add it to the path. |
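A rough way to do that search from Python, assuming CUDA lives under /usr/local/cuda*/lib64 (the layout mentioned elsewhere in this thread; yours may differ). The function name `find_library` is hypothetical.

```python
import glob
import os


def find_library(stem, directories):
    """Return sorted paths whose file name starts with `stem` in the given directories."""
    matches = []
    for directory in directories:
        if os.path.isdir(directory):
            # glob.escape guards against special characters in the directory name.
            pattern = os.path.join(glob.escape(directory), stem + "*")
            matches.extend(glob.glob(pattern))
    return sorted(set(matches))


if __name__ == "__main__":
    # Search the loader path plus any /usr/local/cuda* installs (assumed layout).
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    dirs += glob.glob("/usr/local/cuda*/lib64")
    for path in find_library("libcusolver.so", dirs):
        print(path)
```

If only libcusolver.so.11 shows up, the installed CUDA is newer than the one this TensorFlow build was linked against.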
I have the same problem with tf-nightly, CUDA 11.1.1 and cudnn 8.0.5 on Ubuntu. Edit: |
@nilskk I clicked the link to the comment you added, and I am not quite sure what to do? |
Can you also post your setup, including the CUDA and cuDNN versions? |
System information
I think sudo ln creates a link from libcusolver.so.11 (which is not yet supported) to libcusolver.so.10, so that tensorflow finds it. But I don't have much knowledge on that part. Just guessing and trying. For me it worked and my RTX3080 now runs Tensorflow. |
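For illustration, here is roughly what such a link amounts to, written as a Python sketch. The helper `link_soname` is hypothetical, and results reported in this thread are mixed, since the loader ends up with a different library version than TensorFlow was built against.

```python
import os


def link_soname(lib_dir, existing="libcusolver.so.11", wanted="libcusolver.so.10"):
    """Create `wanted` in lib_dir as a symlink to `existing` and return the link path."""
    target = os.path.join(lib_dir, existing)
    link = os.path.join(lib_dir, wanted)
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if os.path.lexists(link):
        # Something with that name is already there; leave it untouched.
        return link
    # Relative link, so it keeps working if the directory is moved as a whole.
    os.symlink(existing, link)
    return link
```

Writing into a system library directory usually needs root, which is why the shell version is run with sudo.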
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you. |
I made a soft link from libcusolver.so.10 to libcusolver.so.11. Now it seems to work, but I don't know whether it really works or whether it will cause problems later on. |
2020-11-24 15:54:49.479877: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0 |
Previously, it reported the error: Could not load dynamic library 'libcusolver.so.10' |
+1 experiencing this issue on tf-nightly 11/30/2020 and tensorflow 2.4 RC3 Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory Will CUDA 11.1 be supported in TF 2.4? |
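The dlerror text in that message suggests the library is opened through the system dynamic loader, so a similar lookup can be tried without TensorFlow. A minimal sketch using Python's ctypes; the helper name `can_dlopen` is made up, and it only approximates what TensorFlow does internally.

```python
import ctypes


def can_dlopen(name):
    """Return True if the dynamic loader can open `name`, False otherwise."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    for lib in ("libcusolver.so.10", "libcusolver.so.11"):
        print(lib, "loadable" if can_dlopen(lib) else "NOT loadable")
```

Running this in the same shell as TensorFlow shows which of the two names the current LD_LIBRARY_PATH can actually resolve.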
Closing as stale. Please reopen if you'd like to work on this further. |
So what was the solution? I am having the same issue and can't seem to figure out what to do |
I am also having this issue with a brand new installation of TF2.4 - was this left hanging somehow? I think a solution might be to install cuda 10 and add libcusolver to your path.. Symlinking libcusolver.so.11 to .so.10 does NOT work, though this issue gives it as a solution: What finally worked for me was installing CUDA11.0 - seems any higher versions are not supported. |
Hmm thanks! I ended up moving back to 2.3.1 and everything works like a breeze... |
The only workaround that works consistently with 2.4 seems to be to uninstall 11.2 and install 11.0. Nothing else resolves the "Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory" error. This is not a tensorflow bug -- it is an NVIDIA bug. |
One workaround is to install libcusolver-11-0 side by side with cuda-11-1 or cuda-11-2 and tweak your LD_LIBRARY_PATH. For example, to get TensorFlow 2.4 to work with CUDA 11.2:
|
I was able to solve the issue by creating the missing hard link |
Worked for me on Debian Testing. cd /usr/lib/x86_64-linux-gnu |
this also worked for me... is this dangerous in some way? i mean, it's loading a different version than it expects, no? |
This appears to still be an issue, but maybe everyone is finding different ways around the problem. Running an strace showed me that tensorflow is trying to find "libcusolver.so.10" under "~/.local", and not in its (probably expected) location, since none of the other shared libs are here. Could have something to do with the fact that I installed tensorflow via a "local" install not system-level install. After I put soft link here everything worked fine: While troubleshooting I found a different, but maybe already reported issue. keras will complain if I try to run tf-nightly. It doesn't seem to recognize that the tf-nightly version is above its required version. |
Is this because TF 2.4 is still based on libcusolver.so.10? Will this problem go away with newer versions of TF? |
This works perfectly. Just make sure LD_LIBRARY_PATH points exactly to where libcusolver.so.11 is present, then make the hard link... Check your LD_LIBRARY_PATH. Thanks |
For me, this does not work. Making a link to the tensorflow library instead of $LD_LIBRARY_PATH solves my problem.
In my case
Either soft or hard link works. (CUDA 11.2, GTX970, debian 10) |
It works for me. |
@cameronjacobson good catch!
System: cuda 11.2, cuDNN 8.2, GTX1080, pop_OS 20.04 (ubuntu 20.04) |
|
I have CUDA 11.1 and cuDNN 8.0.5 installed, whereas the TensorFlow tested build configurations list CUDA 11.0 for TF 2.4. That is probably the reason. If you are using Conda, a workaround is to:
|
Under Ubuntu 18.04 with TensorFlow 2.4.0, I needed to link the libcusolver in my Python 3.8 virtual environment managed by tox: ln -s /usr/local/cuda/lib64/libcusolver.so.11 .tox/py38-cuda/lib/python3.8/site-packages/tensorflow/python/libcusolver.so.10 |
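Since the right directory depends on how TensorFlow was installed (system-wide, ~/.local, a virtualenv, tox), it can help to ask Python where the package lives before placing a link. A small sketch, assuming the tensorflow/python subdirectory used in the command above; `package_subdir` is a hypothetical helper, and it does not import TensorFlow itself.

```python
import importlib.util
import os


def package_subdir(package, *parts):
    """Return the install directory of `package` joined with `parts`, or None if not installed."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    # For a regular package the first search location is its directory.
    root = list(spec.submodule_search_locations)[0]
    return os.path.join(root, *parts)


if __name__ == "__main__":
    # Prints e.g. .../site-packages/tensorflow/python for the active interpreter.
    print(package_subdir("tensorflow", "python"))
```

Run it with the same interpreter that raises the error, so the path matches the environment that is actually in use.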
Reference: tensorflow/tensorflow#44777 (comment)
Command and error message:
(softlearning) root@d552e69dd7c6:~/softlearning# python
Python 3.8.18 (default, Sep 11 2023, 13:40:15)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-11-29 13:31:22.648469: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
>>> tf.config.experimental.list_physical_devices('GPU')
2023-11-29 13:31:25.436965: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2023-11-29 13:31:25.439436: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-11-29 13:31:25.495281: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-29 13:31:25.495745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: Quadro RTX 6000 computeCapability: 7.5
coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.64GiB deviceMemoryBandwidth: 625.94GiB/s
2023-11-29 13:31:25.495831: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-11-29 13:31:25.506591: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-11-29 13:31:25.506717: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2023-11-29 13:31:25.508412: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-11-29 13:31:25.508899: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-11-29 13:31:25.509175: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /root/.mujoco/mujoco200_linux/bin:/root/.mujoco/mujoco200/bin:/root/.mujoco/mjpro150/bin:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/lib/nvidia-000
2023-11-29 13:31:25.510741: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-11-29 13:31:25.510973: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-11-29 13:31:25.510997: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
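In long logs like this one, the failing library is easy to miss among the "Successfully opened" lines. A small sketch that pulls out only the names from the "Could not load dynamic library" warnings; the regular expression is based on the message format shown in this thread, and `missing_libraries` is a hypothetical helper name.

```python
import re

# Matches the warning text emitted by dso_loader.cc as quoted in this thread.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return library names reported as not loadable, in order, without repeats."""
    found = []
    for name in MISSING_LIB.findall(log_text):
        if name not in found:
            found.append(name)
    return found
```

Feeding it the captured stderr of a Python session gives a short list of exactly which .so names still need to be provided.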
Thank you so much!! I could not find libcusolver.so.10 and libcudnn.so.8 after installing TensorFlow 2.4.0 and CUDA 11.3 (which works with torch). I installed cuDNN via apt but it still failed. Thank you again for your reply, which solved my problems! Remember that packages installed by conda can only be found with |
System information
Describe the problem
Downloading everything per instructions, all GPU libs are being read except the one in the title. No idea why.
2020-11-11 14:48:06.269458: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-11-11 14:48:06.269897: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2020-11-11 14:48:06.303715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-11 14:48:06.304046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1724] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.71GHz coreCount: 68 deviceMemorySize: 9.78GiB deviceMemoryBandwidth: 707.88GiB/s
2020-11-11 14:48:06.304061: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2020-11-11 14:48:06.305025: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2020-11-11 14:48:06.305052: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2020-11-11 14:48:06.305398: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2020-11-11 14:48:06.305501: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2020-11-11 14:48:06.305583: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: 
/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64:/usr/lib/cuda/include:/usr/lib/cuda/lib64: 2020-11-11 
14:48:06.305805: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2020-11-11 14:48:06.305877: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2020-11-11 14:48:06.305883: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1761] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-11-11 14:48:06.306061: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-11 14:48:06.306326: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-11-11 14:48:06.306336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1265] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-11 14:48:06.306340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1271]