
OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory #1666

Closed
nosound2 opened this issue Feb 20, 2020 · 19 comments · Fixed by #1668

@nosound2
🐛 Bug

I tried to update torch-xla-nightly but broke it; now I get the error
OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory

To Reproduce

Steps to reproduce the behavior:

Updating torch-xla-nightly finishes with an error:

/usr/share/torch-xla-nightly/pytorch/xla$ ./scripts/update_nightly_torch_wheels.sh

Cloning into 'vision'...
fatal: Remote branch v0.5.0a0+07cbb46 not found in upstream origin

Then I tried to update torchvision with just

git clone https://github.com/pytorch/vision.git 
cd vision/
python setup.py install

but then I get the error

OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory

I get the same error when I try to run any of my models.

Appreciate any advice, thank you

@AVancans

I had the same issue. Not sure if the way I solved it is correct, but it worked:
pip install mkl
Then find where pip installed the library (user site, env, or system) and add that directory to the following env var:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/library/
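
For reference, a minimal sketch of that workaround as shell commands (the search paths below are assumptions; use whatever location the find actually reports):

# install MKL into the currently active environment
pip install mkl
# locate the shared object in the usual install prefixes
find ~/.local /anaconda3 /usr -name 'libmkl_intel_lp64.so' 2>/dev/null
# point the dynamic loader at the directory reported above
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/library/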

@nosound2
Author

Cool, it worked, thank you @AVancans.

In my case the path was
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/anaconda3/envs/torch-xla-nightly/lib
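
To make the export survive new shells, one option (a sketch, assuming a bash login on the VM) is to append it to ~/.bashrc:

# persist the workaround for future interactive shells
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/anaconda3/envs/torch-xla-nightly/lib' >> ~/.bashrc
source ~/.bashrc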

@dlibenzi
Collaborator

I think this came up a few weeks ago and was supposed to have been fixed, @jysohn23.

@chris-clem

I have the same issue and cannot resolve it with the above-mentioned method.

@jysohn23
Collaborator

Hi @nosound2 @chris-clem, this issue has been fixed, and you should be able to pick up that fix if you recreate your GCE VM. The fix was baked into the GCE VM image.

@chris-clem

Hi @nosound2 @chris-clem, this issue has been fixed, and you should be able to pick up that fix if you recreate your GCE VM. The fix was baked into the GCE VM image.

I just recreated the VM and it is still there.

@jysohn23
Collaborator

@chris-clem Did you just create a fresh GCE VM with our GCE Images and then re-install the latest nightly? Mind explaining what you did?

@chris-clem

Sure:

  1. Create new GCE VM with the XLA image
  2. ssh onto VM with gcloud
  3. cd into /usr/share/torch-xla-nightly/pytorch/xla to run
    . ./scripts/update_nightly_torch_wheels.sh

Then the connection closes:

++ '[' '' '!=' torch-xla-nightly ']'
++ conda activate torch-xla-nightly
++ '[' 2 -lt 1 ']'
++ local cmd=activate
++ shift
++ case "$cmd" in
++ _conda_activate torch-xla-nightly
++ '[' -n '' ']'
++ local ask_conda
+++ PS1='\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
+++ /anaconda3/bin/conda shell.posix activate torch-xla-nightly
++ ask_conda='PS1='\''(torch-xla-nightly) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '\''
\export CONDA_DEFAULT_ENV='\''torch-xla-nightly'\''
\export CONDA_EXE='\''/anaconda3/bin/conda'\''
\export CONDA_PREFIX='\''/anaconda3/envs/torch-xla-nightly'\''
\export CONDA_PROMPT_MODIFIER='\''(torch-xla-nightly) '\''
\export CONDA_PYTHON_EXE='\''/anaconda3/bin/python'\''
\export CONDA_SHLVL='\''1'\''
\export PATH='\''/anaconda3/envs/torch-xla-nightly/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games'\''
\. "/anaconda3/envs/torch-xla-nightly/etc/conda/activate.d/env_vars.sh"'
++ eval 'PS1='\''(torch-xla-nightly) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '\''
\export CONDA_DEFAULT_ENV='\''torch-xla-nightly'\''
\export CONDA_EXE='\''/anaconda3/bin/conda'\''
\export CONDA_PREFIX='\''/anaconda3/envs/torch-xla-nightly'\''
\export CONDA_PROMPT_MODIFIER='\''(torch-xla-nightly) '\''
\export CONDA_PYTHON_EXE='\''/anaconda3/bin/python'\''
\export CONDA_SHLVL='\''1'\''
\export PATH='\''/anaconda3/envs/torch-xla-nightly/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games'\''
\. "/anaconda3/envs/torch-xla-nightly/etc/conda/activate.d/env_vars.sh"'
+++ PS1='(torch-xla-nightly) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
+++ export CONDA_DEFAULT_ENV=torch-xla-nightly
+++ CONDA_DEFAULT_ENV=torch-xla-nightly
+++ export CONDA_EXE=/anaconda3/bin/conda
+++ CONDA_EXE=/anaconda3/bin/conda
+++ export CONDA_PREFIX=/anaconda3/envs/torch-xla-nightly
+++ CONDA_PREFIX=/anaconda3/envs/torch-xla-nightly
+++ export 'CONDA_PROMPT_MODIFIER=(torch-xla-nightly) '
+++ CONDA_PROMPT_MODIFIER='(torch-xla-nightly) '
+++ export CONDA_PYTHON_EXE=/anaconda3/bin/python
+++ CONDA_PYTHON_EXE=/anaconda3/bin/python
+++ export CONDA_SHLVL=1
+++ CONDA_SHLVL=1
+++ export PATH=/anaconda3/envs/torch-xla-nightly/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+++ PATH=/anaconda3/envs/torch-xla-nightly/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+++ . /anaconda3/envs/torch-xla-nightly/etc/conda/activate.d/env_vars.sh
++++ export LD_LIBRARY_PATH=/anaconda3/envs/torch-xla-nightly/lib/
++++ LD_LIBRARY_PATH=/anaconda3/envs/torch-xla-nightly/lib/
++ _conda_hashr
++ case "$_CONDA_SHELL_FLAVOR" in
++ hash -r
+++ dirname -bash
dirname: invalid option -- 'b'
Try 'dirname --help' for more information.
++ /update_torch_wheels.sh
-bash: /update_torch_wheels.sh: No such file or directory
Connection to 34.90.143.84 closed.
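
The "dirname: invalid option -- 'b'" line above suggests the script resolves its own location from $0, which is "-bash" when the script is sourced instead of executed, so it ends up looking for /update_torch_wheels.sh at the filesystem root. A minimal sketch of a source-safe way to resolve the script directory in bash (illustrative only, not necessarily the actual fix in #1668):

# works whether the script is executed or sourced
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
"$SCRIPT_DIR/update_torch_wheels.sh" "$@"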

@jysohn23
Collaborator

@chris-clem I'll try to reproduce this, but in the meantime could you use a nightly GCE VM, since that should have all the latest bits?

@chris-clem

@jysohn23 How do I create a nightly VM?

I see the following options for the images and I choose the last one (PyTorch/XLA):
[screenshot: the available GCE VM image options]

@jysohn23
Collaborator

Yeah that one and then just use the conda environment torch-xla-nightly.

@chris-clem

Yeah that one and then just use the conda environment torch-xla-nightly.

I use it and still get the error:

Traceback (most recent call last):
  File "/tmp/pycharm_project_164/deepfake_detection_challenge/lightning/train.py", line 7, in <module>
    import torch
  File "/anaconda3/envs/torch-xla-nightly/lib/python3.6/site-packages/torch/__init__.py", line 124, in <module>
    _load_global_deps()
  File "/anaconda3/envs/torch-xla-nightly/lib/python3.6/site-packages/torch/__init__.py", line 82, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/anaconda3/envs/torch-xla-nightly/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory
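
As a quick diagnostic (a sketch; the path below is inferred from the traceback, and libtorch_global_deps.so is the library that _load_global_deps appears to be opening), you can ask the loader which dependencies it cannot resolve and whether MKL is visible at all:

# list any unresolved shared-library dependencies
ldd /anaconda3/envs/torch-xla-nightly/lib/python3.6/site-packages/torch/lib/libtorch_global_deps.so | grep 'not found'
# check whether the MKL library is on the loader's search path
ldconfig -p | grep libmkl_intel_lp64 || echo 'libmkl_intel_lp64.so not visible to ldconfig'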

@jysohn23
Collaborator

Yeah, but that's only after you try to update the wheels, right? (I just created a fresh one and it works fine without updating anything after creation.) You shouldn't need to update if you're already using torch-xla-nightly. I'll look into that issue separately, but for now use the nightly environment without updating; it should already have the latest bits.

@jysohn23
Collaborator

Once the above PR is merged, that script should make its way into tomorrow's GCE VM images, so you can use it from there. Alternatively, just git pull from /usr/share/torch-xla-nightly/pytorch/xla.
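
For reference, a sketch of that alternative (assuming the torch-xla-nightly conda env is already active):

cd /usr/share/torch-xla-nightly/pytorch/xla
git pull
# re-run the updated wheel script pulled in by the fix
./scripts/update_nightly_torch_wheels.sh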

@chris-clem

The problem could also have been PyCharm Remote Deployment. If I run my script on the VM directly, I do not get the OSError.
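
One possible explanation (an assumption, not something confirmed in this thread): remote run tools may start the interpreter without sourcing the shell profile or conda's activate.d scripts, so LD_LIBRARY_PATH never gets set. A sketch of a workaround is to point the remote configuration at a small wrapper instead of python directly:

#!/bin/bash
# illustrative wrapper: set the loader path, then hand off to the env's python
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/anaconda3/envs/torch-xla-nightly/lib
exec /anaconda3/envs/torch-xla-nightly/bin/python "$@"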

@julien-c

julien-c commented May 6, 2020

Same here with the VSCode remote debugger (but the workaround above worked).

@YuxianMeng

The problem could also have been PyCharm Remote Deployment. If I run my script on the VM directly, I do not get the OSError.

Hi, I encountered the same issue, did you solve it?

@Borda

Borda commented Jul 3, 2020

A similar issue is being worked on in Lightning-AI/pytorch-lightning#2460.

@ShoufaChen

ShoufaChen commented Jan 15, 2021

pip install mkl

solved my issue.

Update:
I encountered this problem when I installed a custom-built PyTorch dist using pip.
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses
works.
