Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so #720

Closed
jamesjrg opened this issue Mar 23, 2016 · 26 comments

@jamesjrg

jamesjrg commented Mar 23, 2016

I see there was already a very similar issue here, the only difference being the other issue mentions libmkl_avx.so, not libmkl_avx2.so:

#698

The other issue has been closed because someone said "I tried importing numpy and scipy, which did not fail indicating this is probably fixed", however this error did not occur immediately on importing the Python libraries, only when trying to run certain functions. I don't understand the details of how Python links to shared native libraries, but perhaps it only fully loads libraries when you actually call the native functions, not just when you import the module?
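One way to check the "loaded on first call, not on import" guess is to look at which shared objects are mapped into the Python process before and after calling a numerical function. The sketch below is only an illustration, not part of the original report: it assumes Linux (it reads /proc/self/maps), and the function names are made up for this example.

```python
import os


def mapped_libraries(maps_text, needle="mkl"):
    """Return sorted basenames of mapped files whose name contains `needle`.

    `maps_text` is the content of /proc/<pid>/maps, whose lines look like:
    address perms offset dev inode [pathname]
    """
    names = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6:  # anonymous mappings have no pathname field
            name = os.path.basename(fields[5].strip())
            if needle in name:
                names.add(name)
    return sorted(names)


def current_mapped_libraries(needle="mkl"):
    """Linux only (assumption): list matching libraries mapped into this process."""
    with open("/proc/self/maps") as handle:
        return mapped_libraries(handle.read(), needle)
```

Calling current_mapped_libraries() once right after the import and once after a function that does real linear algebra should show whether extra libmkl_*.so files appear only at call time.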

In that other issue people say it is a problem on Linux but not on Mac. That is the same experience as we have here - it worked fine on a Mac but not on Linux x64.

I used the LD_DEBUG environment variable to view linking errors, and amongst the thousands of lines of debug output I noticed this:

30051: /home/james/anaconda3/envs/elevate.jobtitles/lib/python3.4/site-packages/scipy/special/../../../../libmkl_avx2.so: error: symbol lookup error: undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)

This occurred when loading libmkl_core.so, which in turn loaded libmkl_avx2.so:

30051: file=/home/james/anaconda3/envs/elevate.jobtitles/lib/python3.4/site-packages/scipy/special/../../../../libmkl_avx2.so [0]; dynamically loaded by /home/james/anaconda3/envs/elevate.jobtitles/lib/python3.4/site-packages/scipy/special/../../../../libmkl_core.so [0]

One guess is that maybe libmkl_core.so is expecting a different version of libmkl_avx2.so which had an extra symbol defined.
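Since the relevant line is buried in thousands of lines of LD_DEBUG output, a small filter can help pull out the undefined-symbol errors. This is a rough sketch, not a general parser: it assumes the log was saved to a text file and that the error lines have the same shape as the one quoted above. The function name is hypothetical.

```python
import os
import re

# Matches lines such as:
#   30051: /path/to/libmkl_avx2.so: error: symbol lookup error: undefined symbol: NAME (fatal)
_UNDEFINED = re.compile(
    r"^\s*\d+:\s+(?P<library>\S+): error: symbol lookup error: "
    r"undefined symbol: (?P<symbol>\S+)",
    re.MULTILINE,
)


def undefined_symbols(ld_debug_text):
    """Return (library basename, symbol) pairs for each undefined-symbol error."""
    return [
        (os.path.basename(match.group("library")), match.group("symbol"))
        for match in _UNDEFINED.finditer(ld_debug_text)
    ]
```

If the same library shows up with an undefined symbol in a failing environment but not in a working one, that would support the guess that the two MKL libraries come from different builds.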

@ilanschnell
Contributor

Thanks for the error report. Which versions of scipy and MKL do you have installed? What is the output of:

$ conda list

?

@jamesjrg
Author

mkl 11.3.1 0 defaults
...
numpy 1.10.0 py34_0 defaults
...
scikit-learn 0.17.1 np110py34_0 defaults
scipy 0.17.0 np110py34_2 defaults
...

@gauss256

I have run into the same problem and can add some more detail. I have set up nearly identical environments for two users. One user has the problem and the other does not.

I can invoke the error as simply as by running this command:

$ python -c 'from skimage.io import imread'
Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.

I've compared library package versions, reinstalled various packages like scikit-image, etc. The error remains.
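To run the same one-line reproducer against several environments, the check can be scripted. The following is a minimal sketch under stated assumptions: the helper name is made up, and it simply looks for the error text in the combined stdout/stderr of a child interpreter, since it is not established here which stream MKL writes to.

```python
import subprocess
import sys

MKL_FATAL = "Intel MKL FATAL ERROR"


def triggers_mkl_error(python_exe, statement):
    """Run `statement` with the given interpreter and report whether the
    MKL fatal error text appears in its output."""
    result = subprocess.run(
        [python_exe, "-c", statement],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams; either may carry the message
        universal_newlines=True,
    )
    return MKL_FATAL in result.stdout
```

For example, triggers_mkl_error(sys.executable, "from skimage.io import imread") tests the current environment; passing the path of another environment's python binary tests that one instead.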

@gauss256

It is working for me now, but it is hard to know what I did to resolve it. I just kept doing things like clean, install/remove nomkl, install/remove scipy, etc. Finally something clicked and it worked.

@csoja
Contributor

csoja commented Jun 15, 2016

Is anyone still having this issue? I don't believe we were able to reproduce the error - but we suspect this may have been fixed with mkl 11.3.3 as that update happened ~May 13th. I am going to go ahead and close the issue - but will reopen if anyone identifies they still have the problem with mkl 11.3.3.

csoja closed this as completed Jun 15, 2016

@atabakd

atabakd commented Aug 24, 2016

Have the same issue after fresh installation of Anaconda3-4.1.1-Linux-x86_64 on Ubuntu 16.04
mkl 11.3.3 0
numpy 1.11.1 py35_0
scikit-learn 0.17.1 np111py35_2
scipy 0.17.1 np111py35_1

@jamesjrg
Author

The problem stopped occurring here after upgrading to numpy 1.11.1 and scipy 0.17.1
(Linux x64 Mint rosa and Ubuntu 14.04)

@poquirion

I do have the problem here, ubuntu 16.04 and conda 4.1.11

scikit-learn              0.17.1              np111py27_2  
scipy                     0.18.0              np111py27_0  
numpy                     1.11.1                   py27_0
mkl                       11.3.3                        0

@poquirion

poquirion commented Sep 7, 2016

It seems to be related to the package installation order. I reinstalled numpy with

conda install  -f  numpy

and the problem was gone.

@victoriastuart

I had the same issue/error:

Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so

when running a word2vec Python script in my TensorFlow virtual environment (Anaconda). MKL was installed:

conda list | grep -i mkl
    mkl                       11.3.3                        0    defaults
    mkl-service               1.1.2                    py27_2    defaults

Per user gauss256's suggestion, I could trigger that MKL error in my TensorFlow virtual environment,

python -c 'from skimage.io import imread'

but that command did not trigger a warning in my Python 2.7, Python 3.5 or Theano virtual environments.

Per user poquirion's suggestion I reinstalled numpy in my tf-env

conda install  -f  numpy

and that MKL error disappeared (thank you!). :-D

FYR, I am working on an Arch Linux 64-bit system with an Intel Core i7-4790 CPU @ 3.60 GHz ...

@carolinux

I did the following (as listed above) and the problem was gone.

conda install  -f  numpy

It's a bit inconvenient to have to care about installation order, because the reason I went for Anaconda is exactly so that I don't have to worry about things like this. Also, I am not sure if this will always work in the future. I.e., what if another numpy version becomes available and then -f installs a version that also has issues?

@kris-at-ata

Reinstalling numpy using
conda install -f numpy
also worked for me. Thank you.
I was trying to run the GaussianProcessRegressor demos from sklearn: http://scikit-learn.org/stable/auto_examples/gaussian_process/plot_gpr_noisy.html

@masdeval

Works for me too!

@zhenv5

zhenv5 commented Jun 15, 2017

works for me too.

@ilanschnell
Contributor

I'm glad that the problem appears to have been resolved.

@dartdog

dartdog commented Jun 18, 2017

Nope, the solution fails for me. I run the default Anaconda 3.6 env and I get Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.
Using Ubuntu 16.04, I just updated anaconda, created a new py3.6 env (and installed anaconda in it), and installed TF 1.2 (not even imported for this, though).
I reinstalled numpy as suggested (conda install -f numpy) and it still fails with the same error. I happen to be using this public notebook: https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-zillow-prize (the data is private though) :-( It fails at about cell 25.
The new numpy install plan says: numpy: 1.13.0-py36_blas_openblas_200 conda-forge [blas_openblas]

@dartdog

dartdog commented Jun 18, 2017

Oddly, a different env with "substantially" the same packages runs the exact same notebook and code fine. I now have two similar envs, one that has the issue and one that does not, and the notebook is the same in both. If anyone can tell me how to provide a file comparing the two environments, we might be able to determine what causes this.
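One possible way to produce such a comparison is to save the output of conda list from each environment to a text file and diff the two. The sketch below assumes the usual whitespace-separated name/version/build layout seen elsewhere in this thread; the function names are hypothetical.

```python
def parse_conda_list(text):
    """Map package name -> set of 'version build ...' strings.

    A set is used because a package can be listed more than once,
    e.g. once from conda and once from pip.
    """
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines and the comment header
        packages.setdefault(fields[0], set()).add(" ".join(fields[1:]))
    return packages


def diff_environments(text_a, text_b):
    """Return {name: (entries_in_a, entries_in_b)} for packages that differ."""
    a, b = parse_conda_list(text_a), parse_conda_list(text_b)
    return {
        name: (sorted(a.get(name, ())), sorted(b.get(name, ())))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Packages that are identical in both environments are dropped, so the result only shows candidates worth inspecting.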

@dartdog

dartdog commented Jun 18, 2017

This might help! this version works: (my older env py36)
numpy 1.12.1 py36_0
numpy 1.12.1
This version throws the error: (My newer env py36j)
numpy 1.13.0 py36_blas_openblas_200 [blas_openblas] conda-forge
numpy 1.13.0

@mmderakhshani

I have got the same issue with these package versions:

mkl                       2017.0.1                      0  
mkl-service               1.1.2                    py35_3 
numpy                     1.12.1                   py35_0  
numpy                     1.12.1                    <pip>

and import skimage.io as io caused this error! Any solution? I have tried conda install -f numpy, but it did not change anything. The problem persists :)

@d-chambers

I am in the same boat as @MOHAMMAD-PY, having the issue not fixed by force installing numpy.

I followed this SO post to fix the problem by switching from MKL to BLAS (I think)

@dartdog

dartdog commented Aug 10, 2017

FWIW, I suspect that the issue is using a non-conda build of numpy with a conda Python, e.g. if you somehow pip install or upgrade numpy (or it gets done for you by another installer). The version may look the same, but it is not compiled for MKL.

So for me the solution has been to carefully remove any numpy reference in the environment and install only the conda-provided versions (even if that means going back a revision).
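A quick way to spot this situation is to look for packages that conda list reports twice, once as a conda build and once as a pip install, as in the numpy listing posted earlier in the thread. This is a sketch only: it assumes pip-installed entries are marked with `<pip>` (as in that listing) or with a `pypi` channel, and the function name is made up.

```python
# Markers assumed to identify pip-installed rows in `conda list` output.
PIP_MARKERS = {"<pip>", "pypi"}


def pip_conda_duplicates(conda_list_text):
    """Return names of packages listed both as a conda build and as a pip install."""
    sources = {}
    for line in conda_list_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines and the comment header
        kind = "pip" if PIP_MARKERS & set(fields[2:]) else "conda"
        sources.setdefault(fields[0], set()).add(kind)
    return sorted(name for name, kinds in sources.items() if kinds == {"pip", "conda"})
```

If numpy appears in the result, numpy.show_config() can then be used to see which BLAS/LAPACK build information the imported numpy actually reports.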

@johangithub

Same issue here. I executed conda install -f numpy and my scikit-learn works, but my tf is now broken. python -c 'import tensorflow' throws the error below:

RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jh/anaconda3/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/home/jh/anaconda3/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/jh/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/jh/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/jh/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/jh/anaconda3/lib/python3.6/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/home/jh/anaconda3/lib/python3.6/imp.py", line 342, in load_dynamic
    return _load(spec)
SystemError: initialization of _pywrap_tensorflow_internal raised unreported exception

@dartdog

dartdog commented Aug 24, 2017

See my message above: you need to carefully remove old pip numpys. Even if you have conda-installed ones, a simple uninstall leaves artifacts.

@pavelkomarov

pavelkomarov commented Jan 10, 2018

I had this same issue using scikit-learn 0.19 and numpy 1.13.3 when running MLPRegressor (and also with a package called pyearth running an algorithm called MARS). I believe the root of the problem was that our python is part of an Anaconda install, but scikit-learn and numpy were installed via pip, and their expectations for mkl must not agree.

Unfortunately my framework is managed by some dedicated company admins, not by me, so I haven't gotten my guy to try recompiling numpy yet. But I was able to find a workaround based on this thread: adding the following line to my ~/.bashrc causes the problem to disappear.

export LD_PRELOAD=/path/to/anaconda/lib/libmkl_def.so:/path/to/anaconda/lib/libmkl_avx.so:/path/to/anaconda/lib/libmkl_core.so:/path/to/anaconda/lib/libmkl_intel_lp64.so:/path/to/anaconda/lib/libmkl_intel_thread.so:/path/to/anaconda/lib/libiomp5.so

It's super hacky, and I'd be lying if I said I knew exactly what it's doing (but this is helpful), so I'm hoping a recompile of numpy is a cleaner fix. But at least it works.
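Because a typo in any of those paths would silently weaken the workaround, it may help to build the LD_PRELOAD value programmatically and check that every file exists first. The sketch below is an illustration under assumptions: the library list is copied from the export line above, the helper name is hypothetical, and /path/to/anaconda/lib remains a placeholder for the real install location.

```python
import os

# Library names taken from the LD_PRELOAD workaround described above.
MKL_LIBRARIES = [
    "libmkl_def.so",
    "libmkl_avx.so",
    "libmkl_core.so",
    "libmkl_intel_lp64.so",
    "libmkl_intel_thread.so",
    "libiomp5.so",
]


def build_ld_preload(lib_dir, names=MKL_LIBRARIES):
    """Return a colon-separated LD_PRELOAD value, failing if any file is missing."""
    paths = [os.path.join(lib_dir, name) for name in names]
    missing = [path for path in paths if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError("missing libraries: " + ", ".join(missing))
    return ":".join(paths)
```

The returned string has to be exported in the shell (or passed in the environment of a child process) before Python starts; setting it in os.environ from inside an already running interpreter does not affect that process.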

@eungbean

eungbean commented Oct 25, 2018

Works for me! Thank you!

@JhusiJeremy

Aug 22, 2019. Reinstalling numpy still works! Thanks!
