ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory #26182

Closed
gian1312 opened this Issue Feb 27, 2019 · 44 comments

gian1312 commented Feb 27, 2019

Please make sure that this is a build/installation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:build_template

System information

  • OS: Linux Mint
  • TensorFlow installed from: Anaconda, pip install tensorflow-gpu
  • CUDA/cuDNN version: 9.0/7.5
  • GPU: 1080 Ti

I used TensorFlow with GPU support last year and wanted to set it up again. I got it running on my Windows 10 partition, but on my Mint partition I always get the following error:
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory.
I thought TF needs CUDA 9.0 and not 10.0?

The error occurs if I execute the following code.

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

ppwwyyxx commented Feb 27, 2019

  • Latest TensorFlow supports CUDA 8-10 and cuDNN 6-7.
  • Each TensorFlow binary has to be used with the versions of CUDA and cuDNN it was built with. If they don't match, you have to change either the TensorFlow binary or the Nvidia software.
  • Official tensorflow-gpu binaries (the ones downloaded by pip or conda) are built with CUDA 9.0 and cuDNN 7 since TF 1.5, and with CUDA 10.0 and cuDNN 7 since TF 1.13. This is stated in the release notes. You have to use the matching version of CUDA if you use the official binaries.
  • If you would rather not change your Nvidia software, you can:
    (1) Use a different version of TensorFlow.
    (2) Use non-official binaries built by others, e.g. https://github.com/mind/wheels/releases, https://github.com/hadim/docker-tensorflow-builder#builds,
    https://github.com/inoryy/tensorflow-optimized-wheels
    (3) Build the binaries yourself from source with your version of the Nvidia software.
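The pairing rule in the list above can be written down as a small lookup. This is only a sketch of the two pairings stated in this comment (CUDA 9.0 from TF 1.5, CUDA 10.0 from TF 1.13); the function name `official_cuda_for` is made up for illustration, and versions outside that range are deliberately left unanswered.

```python
def official_cuda_for(tf_version):
    """Return the CUDA version the official tensorflow-gpu binary was built with.

    Only encodes the two pairings stated in the comment above;
    anything else returns None instead of guessing.
    """
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if major != 1:
        return None  # not covered by the comment above
    if minor >= 13:
        return "10.0"
    if minor >= 5:
        return "9.0"
    return None


if __name__ == "__main__":
    for version in ("1.12.0", "1.13.1"):
        print(version, "->", official_cuda_for(version))
```

If the CUDA version on the machine differs from the value returned for the installed TensorFlow version, one of the two has to change.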

jvishnuvardhan commented Feb 27, 2019

@gian1312 I think it is looking for a CUDA 10 file. The error is due to a mismatch in CUDA versions. The best approach is to install TF from a clean state. Please follow @ppwwyyxx's suggestion and pick the combination that fits your needs (TF 1.12 with CUDA 9.0, or TF 1.13 with CUDA 10.0). Please uninstall Python and TensorFlow and then follow the instructions to install TF fresh. Please let me know how it progresses. Thanks!
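The error message itself names the file the binary wants, so the expected version can be read off it. A minimal sketch follows; the helper name `missing_library` is hypothetical, and reading the suffix as the CUDA release is an assumption that holds for the libcublas.so.10.0 case in this thread but need not hold for every library.

```python
import re


def missing_library(error_message):
    """Extract (library name, version suffix) from a loader error message.

    Returns None if the message does not look like a missing shared object.
    """
    match = re.search(
        r"(lib[\w+-]+)\.so\.([\d.]+): cannot open shared object file",
        error_message,
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    message = ("ImportError: libcublas.so.10.0: cannot open shared object "
               "file: No such file or directory")
    print(missing_library(message))
```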

rhinsall commented Feb 27, 2019

identical problem here.

clean installation of Nvidia drivers, CUDA 10.1 and TF

libcublas.so.10.0 error as soon as TF is called.

Ubuntu 18.04.2 LTS; Also Anaconda install of Python 3.7 (is the anaconda install relevant?); 2070

jvishnuvardhan commented Feb 27, 2019

@rhinsall Which TF version are you trying to install? Could you install CUDA 10, or correctly reference the CUDA 10.1 path in cuDNN? Thanks

OmnipotentEntity commented Feb 28, 2019

It does not seem possible to install Tensorflow with default packaging on Ubuntu 18.04. You have to either build TF from scratch, which requires sourcing an older version of bazel than is available through the default repositories, or manually install specific versions of nvidia drivers and libraries.

None of the linked wheels from upthread are yet built against CUDA 10.1.

gian1312 commented Feb 28, 2019

Thanks a lot. I relied on the website and hadn't realised that a new version came out a few days ago. I am sorry. I downgraded to 1.12. Now my graphics card is found with the code mentioned above.

Sadly, the code (an example from a lecture I attend) that runs perfectly fine on my Windows installation (30 s) takes 6 min on my Linux installation and puts the CPU under load. Is there a workaround to force TensorFlow to use the GPU?

rhinsall commented Feb 28, 2019

> @rhinsall Which TF version are you trying to install? Could you install CUDA 10, or correctly reference the CUDA 10.1 path in cuDNN? Thanks

I'll come home much later and report the exact numbers and paths - but it's a fresh install, downloaded yesterday, CUDA 10.1 per Nvidia's instructions and TF clean install using PIP & Python 3.7

fabricatedmath commented Mar 2, 2019

@rhinsall
I just found this out myself, not sure if it's common knowledge, but got around this by doing

conda install cudatoolkit
conda install cudnn

I have cuda-10.1 installed on my box, this installed a local conda-only cuda-10.0. Obviously this is to just keep tensorflow working while waiting for better support.
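A rough way to confirm that the interpreter in the active environment can now find the libraries is to ask the dynamic loader directly. This is a sketch, not part of the original workaround: `can_load` is a hypothetical helper, and `libcudnn.so.7` is assumed to be the cuDNN 7 file name that goes with the cuDNN version mentioned earlier in the thread.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Run this with the same Python interpreter that imports TensorFlow.
    for name in ("libcublas.so.10.0", "libcudnn.so.7"):
        status = "loadable" if can_load(name) else "NOT found by the loader"
        print(name, status)
```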

rhinsall commented Mar 2, 2019

Excellent advice. Immediate rescue. Thank you very much fabricatedmath.

jvishnuvardhan commented Mar 4, 2019

@gian1312 That is strange. There is a guide on using GPUs here. Using those instructions you can force TF to use a GPU. Sometimes it is better to uninstall and reinstall TF. Please let me know how it progresses. If the issue is resolved, please close the ticket. Thanks!

ivineetm007 commented Mar 9, 2019

Hi,
I am having a similar problem, so I created a new conda environment and installed tensorflow-gpu as follows:
conda install tensorflow-gpu
Collecting package metadata: done
Solving environment: done

Package Plan

environment location: /home/lasii/anaconda3/envs/drunk2

added / updated specs:
- tensorflow-gpu

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
_tflow_select-2.1.0        |              gpu           2 KB  defaults
absl-py-0.4.1              |           py35_0         144 KB  defaults
astor-0.7.1                |           py35_0          43 KB  defaults
cupti-9.2.148              |                0         1.7 MB  defaults
gast-0.2.0                 |           py35_0          15 KB  defaults
grpcio-1.12.1              |   py35hdbcaa40_0         1.7 MB  defaults
libprotobuf-3.6.0          |       hdbcaa40_0         4.1 MB  defaults
markdown-2.6.11            |           py35_0         104 KB  defaults
mkl_fft-1.0.6              |   py35h7dd41cf_0         149 KB  defaults
mkl_random-1.0.1           |   py35h4414c95_1         362 KB  defaults
numpy-1.15.2               |   py35h1d66e8a_0          47 KB  defaults
numpy-base-1.15.2          |   py35h81de0dd_0         4.2 MB  defaults
protobuf-3.6.0             |   py35hf484d3e_0         615 KB  defaults
six-1.11.0                 |           py35_1          21 KB  defaults
tensorboard-1.10.0         |   py35hf484d3e_0         3.3 MB  defaults
tensorflow-1.10.0          |gpu_py35hd9c640d_0           3 KB  defaults
tensorflow-base-1.10.0     |gpu_py35had579c0_0       190.6 MB  defaults
tensorflow-gpu-1.10.0      |       hf154084_0           2 KB  defaults
termcolor-1.1.0            |           py35_1           7 KB  defaults
------------------------------------------------------------
                                       Total:       207.1 MB

The following NEW packages will be INSTALLED:

_tflow_select pkgs/main/linux-64::_tflow_select-2.1.0-gpu
absl-py pkgs/main/linux-64::absl-py-0.4.1-py35_0
astor pkgs/main/linux-64::astor-0.7.1-py35_0
blas pkgs/main/linux-64::blas-1.0-mkl
cudatoolkit pkgs/main/linux-64::cudatoolkit-9.2-0
cudnn pkgs/main/linux-64::cudnn-7.3.1-cuda9.2_0
cupti pkgs/main/linux-64::cupti-9.2.148-0
gast pkgs/main/linux-64::gast-0.2.0-py35_0
grpcio pkgs/main/linux-64::grpcio-1.12.1-py35hdbcaa40_0
intel-openmp pkgs/main/linux-64::intel-openmp-2019.1-144
libgfortran-ng pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
libprotobuf pkgs/main/linux-64::libprotobuf-3.6.0-hdbcaa40_0
markdown pkgs/main/linux-64::markdown-2.6.11-py35_0
mkl pkgs/main/linux-64::mkl-2018.0.3-1
mkl_fft pkgs/main/linux-64::mkl_fft-1.0.6-py35h7dd41cf_0
mkl_random pkgs/main/linux-64::mkl_random-1.0.1-py35h4414c95_1
numpy pkgs/main/linux-64::numpy-1.15.2-py35h1d66e8a_0
numpy-base pkgs/main/linux-64::numpy-base-1.15.2-py35h81de0dd_0
protobuf pkgs/main/linux-64::protobuf-3.6.0-py35hf484d3e_0
six pkgs/main/linux-64::six-1.11.0-py35_1
tensorboard pkgs/main/linux-64::tensorboard-1.10.0-py35hf484d3e_0
tensorflow pkgs/main/linux-64::tensorflow-1.10.0-gpu_py35hd9c640d_0
tensorflow-base pkgs/main/linux-64::tensorflow-base-1.10.0-gpu_py35had579c0_0
tensorflow-gpu pkgs/main/linux-64::tensorflow-gpu-1.10.0-hf154084_0
termcolor pkgs/main/linux-64::termcolor-1.1.0-py35_1
werkzeug pkgs/main/linux-64::werkzeug-0.14.1-py35_0
After installation, I just imported tensorflow and got this error:

Traceback (most recent call last):
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/lasii/anaconda3/envs/drunk2/lib/python3.5/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/lasii/anaconda3/envs/drunk2/lib/python3.5/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/lasii/anaconda3/envs/drunk2/lib/python3.5/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/lasii/anaconda3/envs/drunk2/lib/python3.5/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

I just started using GitHub. Please tell me if I am posting improperly.

codexponent commented Mar 9, 2019

@ivineetm007 , Can you check the CUDA version!

ivineetm007 commented Mar 9, 2019

@codexponent
It's 9.2.
Conda installed it automatically while installing tensorflow-gpu.

codexponent commented Mar 9, 2019

I think you should update your CUDA version to 10 as well.
This link will help you
Link: https://www.nvidia.com/Download/index.aspx?lang=en-us

ivineetm007 commented Mar 9, 2019

@codexponent
I installed CUDA 10.0 in conda with
conda install -c fragcolor cuda10.0

Now there are two CUDA packages in the conda environment's package list:
cudatoolkit 9.2
cuda 10.0

But the same error occurs when importing tensorflow.

codexponent commented Mar 9, 2019

@ivineetm007, can you run nvidia-smi and check the header of the table? I am sure that you need to update CUDA by downloading the Nvidia driver from their website.

ivineetm007 commented Mar 9, 2019

@codexponent
Header:
NVIDIA-SMI 396.54 Driver Version: 396.54

I am working on a PC in college which is allotted to two or three students. I am not sure whether installing CUDA from a download would affect the other conda environments.
A little history...
I am using the code at this link:
(https://github.com/DevendraPratapYadav/gsoc18_RedHenLab/tree/master/video_processing_pipeline)
There, the setup is done with conda. Two weeks ago, TensorFlow was working perfectly while running the above code.
But someone updated conda on the PC. Now I am getting the libcublas.so.10.0 error.

codexponent commented Mar 9, 2019

@ivineetm007, if this is not your PC, I suggest you don't update it, as that might break other environments that work with CUDA 9. Do one thing: create a new environment, install TensorFlow with a specific version number,
pip install tensorflow==1.10.0, and then test some very simple code such as adding two numbers (tf.add). See whether this runs or not.

ivineetm007 commented Mar 9, 2019

@codexponent
I tried your suggestion. It worked fine. Then I tried to install tensorflow-gpu and Keras with
conda install -y -c anaconda tensorflow-gpu==1.7.0
conda install -y keras
Now I am getting this error:
AttributeError: module 'tensorflow.python.training.checkpointable' has no attribute 'CheckpointableBase'
I followed the solution for this error from the link
(https://github.com/tensorflow/tensorflow/issues/20499l)
which suggested reinstalling.
I think some other version of tensorflow-gpu will work.

codexponent commented Mar 9, 2019

@ivineetm007, try doing the same thing while opening a TF session on the GPU. This link may help.
Link: https://www.tensorflow.org/guide/using_gpu

Another solution: Don't install anything from conda, just install from pip
Steps:

  1. Create a fresh environment
  2. pip install tensorflow==1.12.0
  3. pip install tensorflow-gpu==1.12.0
  4. pip install keras==2.1.3
    If there is anything you want to install from conda, first check whether it is available from pip. If it is not:
    Let's say that your environment is named my_env_1.
    After activating that environment, type which conda.
    If this gives a path inside your created environment (...\my_env_1...), then you can install the other essential packages. If this gives (..\...), then type pip install conda and install the other essential packages afterwards. (Be sure to check again by typing which conda.)
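The "which conda" check in the steps above can also be scripted. The sketch below is illustrative only: `resolves_inside` is a hypothetical helper, and using `sys.prefix` as the environment path assumes the script is run with the environment's own Python.

```python
import os
import shutil
import sys


def resolves_inside(executable_path, env_prefix):
    """Return True if executable_path lies somewhere under env_prefix."""
    if executable_path is None:
        return False  # the executable was not found on PATH at all
    exe = os.path.abspath(executable_path)
    prefix = os.path.abspath(env_prefix)
    return os.path.commonpath([exe, prefix]) == prefix


if __name__ == "__main__":
    conda_path = shutil.which("conda")  # same lookup as `which conda`
    print("conda:", conda_path)
    print("inside active environment:", resolves_inside(conda_path, sys.prefix))
```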

lipingbj commented Mar 10, 2019

Same problem. My CUDA version is 10.1, but the libcublas.so.10.0 file is not in the lib64 directory. I am installing tensorflow-gpu with the command 'pip install tensorflow-gpu'.

lipingbj commented Mar 10, 2019

> Same problem. My CUDA version is 10.1, but the libcublas.so.10.0 file is not in the lib64 directory. I am installing tensorflow-gpu with the command 'pip install tensorflow-gpu'.

It seems that this libcublas version was removed in CUDA 10.

codexponent commented Mar 10, 2019

@lipingbj, did you update the CUDA version with a conda command or through the official Nvidia site? I think installing from the actual site might help to get those .so files.
Link: https://www.nvidia.com/Download/index.aspx?lang=en-us

RazorBladeQuant commented Mar 13, 2019

@lipingbj I had a similar issue when pushing an upgrade to TensorFlow code that calls 200 SageMakers in parallel. I solved it by pinning numpy to numpy==1.14.5 and tensorflow-gpu to 1.12.0. If you would like, I can paste the Dockerfile I created to ensure it works.

mostafaelhoushi commented Mar 14, 2019

> Same problem. My CUDA version is 10.1, but the libcublas.so.10.0 file is not in the lib64 directory. I am installing tensorflow-gpu with the command 'pip install tensorflow-gpu'.
>
> It seems that this libcublas version was removed in CUDA 10.

After installing CUDA 10, I found libcublas.so.10 under /usr/lib/x86_64-linux-gnu/.
So you need to add /usr/lib/x86_64-linux-gnu/ to your library path by calling:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/

Also, since TensorFlow is looking for libcublas.so.10.0 rather than libcublas.so.10 (without the trailing .0), you need to create a symlink:

ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10.0
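For a symlink like this to help, the link name has to be the file TensorFlow asks for (libcublas.so.10.0) and the target has to be the file that exists (libcublas.so.10). The sketch below spells that out; `link_versioned_name` is a hypothetical helper. It only satisfies the file-name lookup, and whether a binary built for one CUDA release works with a library from another is not established in this thread.

```python
import os


def link_versioned_name(lib_dir, existing="libcublas.so.10",
                        wanted="libcublas.so.10.0"):
    """Create lib_dir/wanted as a symlink to lib_dir/existing if it is missing.

    Returns the path of the link. Raises FileNotFoundError when the existing
    library is absent, so a wrong directory is caught early.
    """
    target = os.path.join(lib_dir, existing)
    link = os.path.join(lib_dir, wanted)
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if not os.path.lexists(link):
        # Same argument order as `ln -s target link`.
        os.symlink(target, link)
    return link
```

Calling `link_versioned_name("/usr/lib/x86_64-linux-gnu")` would need write access to that directory, which usually means running as root.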

mostafaelhoushi commented Mar 14, 2019

Please look at the instructions here after installing CUDA 10:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup

priyakansal commented Mar 15, 2019

Hi all,

I am facing the same issue, but my problem is a little different. I am able to install and import tensorflow-gpu on my local machine as well as when building the docker container; everything works fine. But when I build my docker image from a Dockerfile with docker-compose-up...build, I get this error.
Please help me out; I really don't know why this happens when building the docker image.

dattran2346 commented Mar 19, 2019

After installing CUDA, you need to export $PATH and $LD_LIBRARY_PATH. TensorFlow relies on these environment variables to locate the libraries. For example, if you installed CUDA under /usr/local/, you can add this to your .zshrc or .bashrc (depending on the shell you are using):

CUDA_VERSION=10.0

export PATH=/usr/local/cuda-$CUDA_VERSION/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-$CUDA_VERSION/lib64:$LD_LIBRARY_PATH

This trick can be used to change the version of cuda you want to use.
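The same switch can be made per process instead of per shell. A sketch, assuming the /usr/local/cuda-<version> layout from the snippet above; `cuda_env` is a hypothetical helper. The variables are passed to a child process here because the loader reads LD_LIBRARY_PATH when a process starts.

```python
import os
import subprocess
import sys


def cuda_env(cuda_version, base_env=None, install_root="/usr/local"):
    """Return a copy of the environment with the chosen CUDA toolkit put first."""
    env = dict(os.environ if base_env is None else base_env)
    cuda_home = os.path.join(install_root, "cuda-" + cuda_version)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib64")):
        entry = os.path.join(cuda_home, subdir)
        current = env.get(var, "")
        # Prepend, as in the export lines above, so this version wins.
        env[var] = entry + (os.pathsep + current if current else "")
    return env


if __name__ == "__main__":
    # Try the import in a child process that sees the CUDA 10.0 paths first.
    result = subprocess.run(
        [sys.executable, "-c", "import tensorflow"],
        env=cuda_env("10.0"),
    )
    print("import succeeded" if result.returncode == 0 else "import failed")
```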

loretoparisi commented Mar 19, 2019

@mostafaelhoushi I created the symlink, but this does not do the trick:

ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10
root@b55736f184ff:/notebooks# python3.6 -c "import tensorflow as tf; print(tf.__version__);"
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

loretoparisi commented Mar 19, 2019

@dattran2346 This should work when CUDA 10 is already installed, but if you start from older Docker images, you may have an older version installed:

root@b55736f184ff:/notebooks# echo  $CUDA_VERSION
9.0.176
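A check like this can be done up front in a container. The sketch assumes the image sets a CUDA_VERSION environment variable, as the one above does; other images may not, and both helper names are hypothetical.

```python
import os


def cuda_major_minor(version_string):
    """Reduce a version such as '9.0.176' to its 'major.minor' part ('9.0')."""
    if not version_string:
        return None
    parts = version_string.strip().split(".")
    if len(parts) < 2:
        return None
    return ".".join(parts[:2])


def image_matches(required, env=None):
    """Return True if CUDA_VERSION agrees with `required` on major.minor."""
    env = os.environ if env is None else env
    found = cuda_major_minor(env.get("CUDA_VERSION"))
    return found is not None and found == cuda_major_minor(required)


if __name__ == "__main__":
    print("CUDA_VERSION:", os.environ.get("CUDA_VERSION"))
    print("matches CUDA 10.0:", image_matches("10.0"))
```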

mostafaelhoushi commented Mar 20, 2019

> @mostafaelhoushi I did the symlink but this does not do the trick:
> ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Did you make sure you installed CUDA10.0? Or which version is installed?

priyakansal commented Mar 20, 2019

@dattran2346 @mostafaelhoushi
Do I also need to install CUDA and cuDNN during the build of the docker image, like this:

conda install -c fragcolor cuda10.0

priyakansal commented Mar 20, 2019

@dattran2346
I have exported the paths as you suggested, but it is still not working...

dattran2346 commented Mar 20, 2019

  • Nvidia-docker only gives us access to the driver; we have to install cudatoolkit and cudnn ourselves. cudatoolkit and cudnn are just dynamic libraries, and their location depends on where we installed them.

  • If we install tensorflow-gpu with pip (inside conda or not), tensorflow-gpu will look at $PATH and $LD_LIBRARY_PATH to find cudatoolkit and cudnn at runtime. With this approach, we need to install cudatoolkit and cudnn beforehand and export the paths. We also need to ensure that the versions of cudatoolkit, cudnn and tensorflow are compatible. Check version compatibility here

  • If we install tensorflow-gpu with conda, conda will also install the appropriate versions of cudatoolkit and cudnn. With this approach, we do not need to install cudatoolkit and cudnn beforehand.

@priyakansal, you may need to conda uninstall cuda10.0 and run conda install tensorflow-gpu.
@loretoparisi, maybe try a lower version of tensorflow, or use conda to install, or even upgrade your CUDA version 🤔

Ps 1: For installing cudatoolkit and cudnn, I found this guide very useful.
Ps 2: Installing cudatoolkit and cudnn from the runfile puts the libraries in /usr/local/, while installing from the .deb file puts them in /usr/lib/x86_64-linux-gnu/, so your $PATH and $LD_LIBRARY_PATH need to change accordingly. Installing cudatoolkit and cudnn with conda puts the libraries in ~/miniconda3/envs/<name>/lib, and you do not need to export anything.
Ps 3: What if I have installed cudatoolkit and cudnn myself and also install tensorflow-gpu using conda? tensorflow-gpu will use the libraries installed within the conda environment.

Hope this helps,
Correct me if I'm wrong 😄
Cheers
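The three install locations from Ps 2 can be checked in one go. This is a sketch under the assumption that the runfile install uses /usr/local/cuda-<version>/lib64 (as in the earlier export lines) and that the conda environment path is available in the CONDA_PREFIX variable; both function names are hypothetical.

```python
import os


def candidate_lib_dirs(cuda_version="10.0", conda_prefix=None):
    """List the directories named above where the CUDA libraries may live."""
    dirs = [
        "/usr/local/cuda-" + cuda_version + "/lib64",  # runfile install
        "/usr/lib/x86_64-linux-gnu",                   # .deb install
    ]
    if conda_prefix:
        dirs.append(os.path.join(conda_prefix, "lib"))  # conda install
    return dirs


def dirs_containing(filename, directories):
    """Return the directories that actually contain `filename`."""
    return [d for d in directories
            if os.path.exists(os.path.join(d, filename))]


if __name__ == "__main__":
    dirs = candidate_lib_dirs(conda_prefix=os.environ.get("CONDA_PREFIX"))
    print(dirs_containing("libcublas.so.10.0", dirs))
```

An empty list suggests that the matching toolkit version is not installed in any of these places.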

priyakansal commented Mar 20, 2019

@dattran2346
Thank you so much for such a detailed explanation.
Even when I run conda install tensorflow-gpu, it does not work; however, I have not tried it with conda uninstall cuda10.0. The problem is that I also want to install tensorflow-serving-api-gpu, which is not available through conda install, so I need to install it using pip, and when I install it I get the same error.
Please note that I am doing all this inside docker. On my local machine (Ubuntu), everything works fine.

loretoparisi commented Mar 20, 2019

What I did was this https://gist.github.com/loretoparisi/4a096fc3625f60403c8734de9660cbcc

add-apt-repository ppa:jonathonf/python-3.6
apt-get update && apt-get install -y python3.6
curl https://bootstrap.pypa.io/get-pip.py > get-pip.py
python3.6 get-pip.py
pip3 uninstall tensorflow-gpu
pip3.6 install tensorflow-gpu==1.12.0
python3.6 -c "import tensorflow as tf; print(tf.__version__);"

Basically you will get Python 3.6, CUDA 9 and TF 1.12.0. We have to remove TF-GPU 1.13.0 and then install TF-GPU 1.12.0.

priyakansal commented Mar 20, 2019

@loretoparisi
Hi,
Sorry, I am a bit new to docker. When I use some image, either or < tensorflow-serving: latest -gpu>, with docker run or nvidia-docker run and import the TensorFlow-related packages, everything works fine. But when I build my custom image with anaconda as the base image using the docker-compose or nvidia-docker-compose build command, it does not work.

mostafaelhoushi commented Mar 20, 2019

> @dattran2346
> I have exported the paths as you suggested, but it is still not working...

Can you try searching for the missing file libcublas.so.10.0 on your file system, e.g. by using

find / -name "libcublas.so.10.0"

Once you find the path, add its directory to the LD_LIBRARY_PATH environment variable.
If you can't find it, then you probably need to install the correct version.

priyakansal commented Mar 21, 2019

@mostafaelhoushi
When I run the command find / -name "libcublas.so.10.0",
the output is:

/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/host/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/97cb0c942535cde4622f53bf094251cd1aef1cfc744e8ddda1472ee691f87618/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/2fb234250d278545f55a004fcd436b4cba5e847c40503b990ffe800f3b440cb5/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/c704b6be3bc1a5d25119fa46216a4e64f872d8001d8bed6d40930f6420ffb091/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/usr/local/cuda-10.0/lib64/libcublas.so.10.0
@priyakansal

priyakansal commented Mar 21, 2019

@mostafaelhoushi
When I run the command find / -name "libcublas.so.10.0"
the output is:

/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/host/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/97cb0c942535cde4622f53bf094251cd1aef1cfc744e8ddda1472ee691f87618/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/2fb234250d278545f55a004fcd436b4cba5e847c40503b990ffe800f3b440cb5/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/c704b6be3bc1a5d25119fa46216a4e64f872d8001d8bed6d40930f6420ffb091/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/usr/local/cuda-10.0/lib64/libcublas.so.10.0
@mostafaelhoushi

mostafaelhoushi commented Mar 21, 2019

OK. I see libcublas.so.10.0 is found in /usr/local/cuda-10.0/lib64/.
Try running this command:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64/

and then run your script again from the same shell.

NOTE: I see the library is also found in your Docker overlay directories. I am not familiar with Docker, so maybe someone else can help there. Try the above command first and see whether it works.
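
To check whether the export took effect, something like the following sketch may help. It assumes Python 3 on Linux; dirs_containing is a hypothetical helper name, not part of TensorFlow or CUDA. It lists which LD_LIBRARY_PATH entries contain the library and then asks the dynamic loader to open it. Note that the loader also consults the ldconfig cache and default system directories, so the library can still load even when no LD_LIBRARY_PATH entry contains it.

```python
import ctypes
import os


def dirs_containing(lib_name, search_path):
    """Return the directories in a colon-separated search path that contain lib_name.

    Hypothetical helper for illustration only.
    """
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries (for example from a leading ':' in the variable).
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    lib = "libcublas.so.10.0"
    path = os.environ.get("LD_LIBRARY_PATH", "")
    found = dirs_containing(lib, path)
    print("LD_LIBRARY_PATH entries containing", lib, ":", found or "none")

    # Ask the dynamic loader directly; OSError means it cannot be opened.
    try:
        ctypes.CDLL(lib)
        print(lib, "can be loaded by this process")
    except OSError as err:
        print(lib, "cannot be loaded:", err)
```

Run it from the same shell in which you ran the export. The variable is read when the process starts, so changing os.environ inside an already running Python session does not change where the loader looks.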

@gian1312 gian1312 closed this Mar 22, 2019

@littlehome-eugene

littlehome-eugene commented Mar 24, 2019

This happened to me when I installed cuda-10.1 instead of cuda-10.0; downgrading to 10.0 fixed it.
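
As a rough illustration of why the exact CUDA version matters, here is a small sketch that encodes only what was stated earlier in this thread: the official tensorflow-gpu binaries are built with cuda 9.0 since TF 1.5 and with cuda 10.0 since TF 1.13. The function names are hypothetical, and the mapping deliberately covers TF 1.x only; check the release notes for anything else.

```python
def expected_cuda_for_tf1(tf_version):
    """Return the CUDA version the official TF 1.x binary expects, or None if unknown.

    Hypothetical helper; the mapping comes from the versions quoted in this thread.
    """
    parts = tf_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if major != 1:
        return None  # outside the range discussed here
    if minor >= 13:
        return "10.0"
    if minor >= 5:
        return "9.0"
    return None


def expected_cublas_soname(tf_version):
    """Return the libcublas file name the binary will try to open, or None."""
    cuda = expected_cuda_for_tf1(tf_version)
    return None if cuda is None else "libcublas.so." + cuda


if __name__ == "__main__":
    for version in ("1.12.0", "1.13.1"):
        print(version, "->", expected_cublas_soname(version))
```

So a TF 1.13 binary looks specifically for libcublas.so.10.0, and having only another CUDA release installed does not satisfy it.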

@priyakansal

priyakansal commented Mar 24, 2019

@littlehome-eugene
But I am already using cuda-10.0.
By the way, have you done this with Docker?
