Linking problem with atlas on OS X #1247

Closed
amueller opened this issue Oct 18, 2012 · 27 comments

@amueller
Member

See mailing list.

@ogrisel
Member

ogrisel commented Oct 18, 2012

To me there is no issue to fix in scikit-learn: you need to build sklearn against the same blas lib as numpy and scipy.

If you build numpy / scipy / scikit-learn with the default build environment of OSX (python / clang / accelerate framework) [1] then everything works fine and all tests pass on OSX 10.8.

[1] this is what happens when you do python setup.py install on the 3 projects.

@amueller
Member Author

I was not sure whether all three projects need to be built against the same BLAS. But I guess it makes sense.

@cdeil
Contributor

cdeil commented Oct 23, 2012

@amueller @ogrisel I am having a similar problem with missing ATLAS symbols in sklearn, although I think in my case numpy, scipy and sklearn were all linked against Accelerate:
http://trac.macports.org/ticket/36696

I didn't have this problem a few weeks ago, my guess would be that it was introduced by the recent update to scipy 0.11 in Macports?

If you have Macports, could you please check if you can reproduce the issue?
(I hope the problem is my setup and that sklearn is not broken for all Macports users at the moment.)

@amueller
Member Author

Can you find out what k_means.so was linked against?

@cdeil
Contributor

cdeil commented Oct 23, 2012

@amueller You mean _k_means.so?

$ otool -L /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/_k_means.so
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/_k_means.so:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)

I gave some more info on what numpy / scipy / sklearn is linked against in the Macports ticket.
Let me know what else is needed to identify the problem.

@ogrisel
Member

ogrisel commented Oct 23, 2012

I have built numpy / scipy / scikit-learn from sources (using the setup.py files) against Accelerate (on OSX 10.8) without any issue myself:

$ otool -L coding/scikit-learn/sklearn/linear_model/cd_fast.so
coding/scikit-learn/sklearn/linear_model/cd_fast.so:
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)

$ otool -L coding/scikit-learn/sklearn/cluster/_k_means.so 
coding/scikit-learn/sklearn/cluster/_k_means.so:
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)
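
The `otool -L` listings in this thread differ in one line: the working builds list vecLib's libBLAS.dylib as a dependency, while the failing MacPorts build does not. A minimal Python sketch for checking that automatically follows. The function names are made up for illustration, and the parser assumes only the line layout shown in the listings above.

```python
import re

def parse_otool_libs(otool_output):
    """Return the dependency paths from `otool -L` text, skipping the header line."""
    libs = []
    for line in otool_output.splitlines():
        # Dependency lines are indented and followed by "(compatibility version ...".
        match = re.match(r"\s+(\S.*?)\s+\(compatibility version", line)
        if match:
            libs.append(match.group(1))
    return libs

def links_blas_dylib(otool_output):
    """True if libBLAS.dylib is recorded as a direct dependency."""
    return any(p.endswith("/libBLAS.dylib") for p in parse_otool_libs(otool_output))

# Dependency lines copied from the failing MacPorts build earlier in this thread.
failing = (
    "_k_means.so:\n"
    "    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)\n"
    "    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate"
    " (compatibility version 1.0.0, current version 4.0.0)\n"
)
print(links_blas_dylib(failing))  # False
```

To use it on a real file, the text could be obtained with `subprocess.check_output(["otool", "-L", path])` and decoded before parsing.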

The python interpreter itself has been installed using homebrew:

$ which python
/usr/local/bin/python
$ ls -l /usr/local/bin/python
lrwxr-xr-x  1 ogrisel  admin  33 17 oct 14:01 /usr/local/bin/python -> ../Cellar/python/2.7.3/bin/python

I assume that it would also work with the default python from the system but I prefer not to install custom python packages on it.

I have not tried macports because I am quite happy with homebrew already.

@cdeil
Contributor

cdeil commented Oct 23, 2012

@ogrisel According to the build log (https://gist.github.com/3938458), my sklearn was built against the Accelerate BLAS:

blas_opt_info:
  FOUND:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3)]
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']

Why does my _k_means.so contain a reference to an ATLAS symbol then?

$ nm /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/_k_means.so | grep ddot
                 U _ATL_ddot
000000000000ad00 T _cblas_ddot
$ nm /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib | grep ddot
000000000006a062 T _cblas_ddot
00000000000164d6 T _ddot
00000000000164d6 T _ddot_

Note that I do have the atlas @3.10.0_1+gcc45 port installed, maybe this is incorrectly used in the build for some reason?
Is the -DNO_ATLAS_INFO=3 option correct in my case?
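
In the `nm` output above, `_cblas_ddot` is defined inside the extension (type `T`) while `_ATL_ddot` is undefined (type `U`). One reading, not confirmed at this point in the thread, is that a static ATLAS `libcblas.a` was copied into the extension without the library that provides the `ATL_*` routines. An undefined symbol is not a problem by itself; it only needs to be provided by one of the libraries the file depends on. A small Python sketch for listing such symbols from `nm` text (the function name is hypothetical):

```python
def undefined_symbols(nm_output, prefix=""):
    """Return symbols that `nm` marks as undefined ('U'), optionally filtered by prefix."""
    found = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries carry no address: only the type letter and the name.
        if len(fields) == 2 and fields[0] == "U" and fields[1].startswith(prefix):
            found.append(fields[1])
    return found

# The two lines reported for _k_means.so above.
sample = (
    "                 U _ATL_ddot\n"
    "000000000000ad00 T _cblas_ddot\n"
)
print(undefined_symbols(sample, prefix="_ATL_"))  # ['_ATL_ddot']
```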

@amueller amueller reopened this Oct 23, 2012
@amueller
Member Author

@cdeil I'll have a look later if @ogrisel didn't figure it out by then ;)
for the future: please open a new issue, that might make it easier to keep track.

@ogrisel
Member

ogrisel commented Oct 23, 2012

I don't have time to dig deeper now but indeed there is probably a bug in one (or all) of our setup.py.

@amueller
Member Author

I refactored that so that now the bug is in only one function ;)

@amueller
Member Author

According to line 1347 in the gist, the linker flag is just -lcblas and -L/opt/local/lib.
How does the linker decide which library to link against for -lcblas?

@amueller
Member Author

I am a bit confused as to why you have NO_ATLAS_INFO=3.
This is the code that sets the value:

        if sys.platform=='darwin' and not os.environ.get('ATLAS',None):
            args = []
            link_args = []
            if get_platform()[-4:] == 'i386':
                intel = 1
            else:
                intel = 0
            if os.path.exists('/System/Library/Frameworks/Accelerate.framework/'):
                if intel:
                    args.extend(['-msse3'])
                else:
                    args.extend(['-faltivec'])
                args.extend([
                    '-I/System/Library/Frameworks/vecLib.framework/Headers'])
                link_args.extend(['-Wl,-framework','-Wl,Accelerate'])
            elif os.path.exists('/System/Library/Frameworks/vecLib.framework/'):
                if intel:
                    args.extend(['-msse3'])
                else:
                    args.extend(['-faltivec'])
                args.extend([
                    '-I/System/Library/Frameworks/vecLib.framework/Headers'])
                link_args.extend(['-Wl,-framework','-Wl,vecLib'])
            if args:
                self.set_info(extra_compile_args=args,
                              extra_link_args=link_args,
                              define_macros=[('NO_ATLAS_INFO',3)])
                return

Do you have any idea why it didn't find accelerate?

@cdeil
Contributor

cdeil commented Oct 23, 2012

On my machine:

In [10]: sys.platform=='darwin' and not os.environ.get('ATLAS',None)
Out[10]: True
In [11]: os.path.exists('/System/Library/Frameworks/Accelerate.framework/')
Out[11]: True

and thus there will be something in args and self.set_info will be executed at the end.

The code says: "if Accelerate is there, set NO_ATLAS_INFO to 3".
Is that what it should do?

@amueller
Member Author

oh sorry. I misread the code. You are completely right.

@cdeil
Contributor

cdeil commented Oct 23, 2012

Note that at the end of line 1341 there is also: -Wl,-framework -Wl,Accelerate

Without a closer look I don't know which cblas (Macports or Accelerate) is then actually chosen by the linker:

$ find /opt/local/lib -name '*cblas*'
/opt/local/lib/libcblas.a
/opt/local/lib/libgslcblas.0.dylib
/opt/local/lib/libgslcblas.a
/opt/local/lib/libgslcblas.dylib
/opt/local/lib/libgslcblas.la
/opt/local/lib/libptcblas.a
$ find /System -name '*cblas*'
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers/cblas.h
/System/Library/Frameworks/vecLib.framework/Versions/A/Headers/cblas.h
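
To make the question of which cblas gets picked concrete, here is a simplified Python model of how `-lcblas` could be resolved against a list of `-L` directories. The search rule (directories tried in order, shared library preferred over static archive within each directory) is an assumption for illustration; the real linker has more rules and default directories. The function name is hypothetical, and the file set is taken from the `find` output above.

```python
import os.path
import posixpath

def resolve_library(name, search_dirs, exists=os.path.exists):
    """Return the first file this simplified model picks for -l<name>, or None."""
    for directory in search_dirs:
        # Assumed rule: within one directory, try the dylib before the archive.
        for suffix in (".dylib", ".a"):
            candidate = posixpath.join(directory, "lib" + name + suffix)
            if exists(candidate):
                return candidate
    return None

# Files reported under /opt/local/lib by the `find` command above.
macports_files = {
    "/opt/local/lib/libcblas.a",
    "/opt/local/lib/libgslcblas.0.dylib",
    "/opt/local/lib/libgslcblas.a",
    "/opt/local/lib/libgslcblas.dylib",
    "/opt/local/lib/libptcblas.a",
}
dirs = ["/opt/local/lib", "build/temp.macosx-10.8-x86_64-2.7"]
print(resolve_library("cblas", dirs, exists=macports_files.__contains__))
# /opt/local/lib/libcblas.a
```

Under this model, `-L/opt/local/lib` is enough for the static MacPorts `libcblas.a` to be chosen, which would match the `_ATL_ddot` reference seen earlier.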

@amueller
Member Author

I guess we should get rid of -L/opt/local/lib then?

@amueller
Member Author

I think you have LIBRARY_PATH='/opt/local/lib' in your environment variables. (line 718)
That confuses the linker, I would guess. Could you try to set it empty?
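
A quick way to see whether such a variable is set in the environment that runs the build is sketched below in Python. `LIBRARY_PATH` is the variable named above; `LD_LIBRARY_PATH` and `DYLD_LIBRARY_PATH` are included on the assumption that they are worth ruling out at the same time. The function name is hypothetical.

```python
import os

LINKER_ENV_VARS = ("LIBRARY_PATH", "LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH")

def linker_path_settings(environ=None):
    """Return the library-path variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in LINKER_ENV_VARS if environ.get(name)}

# Example with an explicit mapping instead of the live environment.
print(linker_path_settings({"LIBRARY_PATH": "/opt/local/lib", "HOME": "/Users/me"}))
# {'LIBRARY_PATH': '/opt/local/lib'}
```

Calling `linker_path_settings()` with no argument inspects the current process environment; an empty result means none of the three variables is set.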

@cdeil
Contributor

cdeil commented Oct 23, 2012

When building sklearn outside Macports, I didn't have $LD_LIBRARY_PATH and $DYLD_LIBRARY_PATH set. The -L/opt/local/lib addition must come from python or numpy. For the Macports build the user environment is irrelevant, I have no control there.

Removing -L/opt/local/lib by hand from the linker command I get rid of the undefined symbol _ATL_ddot, but now _cblas_ddot is undefined:

$ nm build/temp.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.o | grep ddot
                 U _cblas_ddot

$ /usr/bin/clang -bundle -undefined dynamic_lookup -L/opt/local/lib build/temp.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.o -Lbuild/temp.macosx-10.8-x86_64-2.7 -lcblas -lm -o build/lib.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.so -Wl,-framework -Wl,Accelerate

$ nm build/lib.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.so | grep ddot
                 U _ATL_ddot
0000000000011940 T _cblas_ddot

$ /usr/bin/clang -bundle -undefined dynamic_lookup build/temp.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.o -Lbuild/temp.macosx-10.8-x86_64-2.7 -lcblas -lm -o build/lib.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.so -Wl,-framework -Wl,Accelerate

$ nm build/lib.macosx-10.8-x86_64-2.7/sklearn/cluster/_k_means.so | grep ddot
                 U _cblas_ddot

Can one of you try to reproduce the issue?
This should do it:

sudo port install py27-scikits-learn
# wait a bit until Macports installs gfortran, python, numpy, scipy, ...
export PYTHONPATH=/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
python -c 'import sklearn.cluster'

@amueller
Member Author

Sorry, no OS X here....

@ChrisBeaumont

I've hit the same issue (building scikit-learn from source). Any movement on this?

@ChrisBeaumont

Ok, I tried re-running all of the link commands, but removing all instances of -L/opt/local/lib. This seems to have coaxed the linker into using the system BLAS, and allows things to be imported.

My flavor of the issue:

python -c "import sklearn.cluster"

ImportError: dlopen(/Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so, 2): Symbol not found: _ATL_daxpy
  Referenced from: /Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so
  Expected in: flat namespace
 in /Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so
beaumont@beaumont-3:~$ otool -L  /Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so
/Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)

And the workaround:

cd scikit-learn
python workaround.py

cd
python -c "import sklearn.cluster" #ok
otool -L /Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so
/Users/beaumont/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/cd_fast.so:
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)

The contents of workaround.py are at https://gist.github.com/4498773
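
The re-linking step described above (re-running a link command without `-L/opt/local/lib`) can be sketched in Python as follows. This is not the content of the linked gist, only an illustration of the idea; the function name is hypothetical and it handles only the joined `-L<dir>` form seen in the build logs in this thread.

```python
import shlex

def drop_library_dir(command, unwanted_dir):
    """Split a link command into arguments and drop the -L<unwanted_dir> entries."""
    flag = "-L" + unwanted_dir.rstrip("/")
    return [arg for arg in shlex.split(command) if arg.rstrip("/") != flag]

# Shortened form of the link command quoted earlier in the thread.
original = (
    "/usr/bin/clang -bundle -undefined dynamic_lookup -L/opt/local/lib "
    "_k_means.o -lcblas -lm -o _k_means.so -Wl,-framework -Wl,Accelerate"
)
print(" ".join(drop_library_dir(original, "/opt/local/lib")))
```

The resulting argument list could be passed to `subprocess.check_call` to re-run the link step for each extension module.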

@vene
Member

vene commented May 28, 2013

I can reproduce this. I made two changes to my setup at the same time: 1) switched from installer-python to macports python, 2) switched from installer-scipy to the current git head (i.e., built from source instead of a binary release). I suppose the second one is at fault, but still the issue was with scikit-learn, and I needed @ChrisBeaumont's workaround.

I will update when I understand it better.

@vene
Member

vene commented May 29, 2013

Uninstalling macports atlas fixes this. I suppose we should find out where the -L/opt/local/lib comes from and remove it unless that's the atlas path found by config.
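
One place such a flag can come from is the set of build variables recorded when the Python interpreter itself was compiled, since extension builds reuse them. Whether that is the source here is not established in this thread. A minimal Python sketch for inspecting them; the helper names are hypothetical, while `sysconfig.get_config_var` is part of the standard library:

```python
import shlex
import sysconfig

def library_dirs(flags):
    """Extract the directories named by joined -L<dir> options in a flag string."""
    return [arg[2:] for arg in shlex.split(flags or "")
            if arg.startswith("-L") and len(arg) > 2]

def interpreter_library_dirs(var_names=("LDFLAGS", "LDSHARED", "BLDSHARED")):
    """Map each recorded build variable to the -L directories it contains."""
    return {name: library_dirs(sysconfig.get_config_var(name)) for name in var_names}

for name, dirs in interpreter_library_dirs().items():
    print(name, dirs)
```

If `/opt/local/lib` shows up for the interpreter in use, the flag is inherited from how that Python was built rather than from scikit-learn's own setup files.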

@gerigk

gerigk commented Jun 21, 2013

UPDATE:

I solved the issue; the ('NO_ATLAS_INFO', -1) entry put me on the right track.

Apparently I had a second BLAS/LAPACK installed via Ubuntu. I deleted those and reinstalled numpy/scipy/sklearn, and now everything works like a charm.


I have the same issue... but on Ubuntu.
Is there any known workaround on Ubuntu?

----> 3 from sklearn import svm

/usr/local/lib/python2.7/dist-packages/sklearn/svm/__init__.py in <module>()
     11 # License: New BSD, (C) INRIA 2010
     12 
---> 13 from .classes import SVC, NuSVC, SVR, NuSVR, OneClassSVM, LinearSVC
     14 from .bounds import l1_min_c
     15 from . import sparse, libsvm, liblinear, libsvm_sparse

/usr/local/lib/python2.7/dist-packages/sklearn/svm/classes.py in <module>()
      1 from .base import BaseLibLinear, BaseSVC, BaseLibSVM
      2 from ..base import RegressorMixin
----> 3 from ..linear_model.base import LinearClassifierMixin
      4 from ..feature_selection.selector_mixin import SelectorMixin
      5 

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/__init__.py in <module>()
     10 # complete documentation.
     11 
---> 12 from .base import LinearRegression
     13 
     14 from .bayes import BayesianRidge, ARDRegression

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/base.py in <module>()
     27 from ..utils.sparsefuncs import (csc_mean_variance_axis0,
     28                                  inplace_csc_column_scale)
---> 29 from .cd_fast import sparse_std
     30 
     31 

ImportError: /usr/local/lib/python2.7/dist-packages/sklearn/linear_model/cd_fast.so: undefined symbol: ATL_dcopy

the output of the build process

building 'sklearn.linear_model.cd_fast' extension
compiling C sources
C compiler: x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC

creating build/temp.linux-x86_64-2.7/sklearn/linear_model
compile options: '-DNO_ATLAS_INFO=-1 -Isklearn/src/cblas -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/atlas/include -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c'
x86_64-linux-gnu-gcc: sklearn/linear_model/cd_fast.c
In file included from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1728:0,
                 from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarrayobject.h:17,
                 from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/arrayobject.h:15,
                 from sklearn/linear_model/cd_fast.c:257:
/usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
In file included from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ufuncobject.h:311:0,
                 from sklearn/linear_model/cd_fast.c:258:
/usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/__ufunc_api.h:236:1: warning: ‘_import_umath’ defined but not used [-Wunused-function]
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/sklearn/linear_model/cd_fast.o -L/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.7 -lcblas -lm -o build/lib.linux-x86_64-2.7/sklearn/linear_model/cd_fast.so

and

Setting PTATLAS=ATLAS
  FOUND:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/local/atlas/lib']
    language = c
    define_macros = [('NO_ATLAS_INFO', -1)]
    include_dirs = ['/usr/local/atlas/include']
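
Comparing the two logs above, the detected configuration lists `ptf77blas`, `ptcblas` and `atlas`, while the link line passes only `-lcblas` and `-lm`. That mismatch may be why `ATL_dcopy` ends up unresolved, although the thread does not confirm it. A small Python sketch of the comparison, with hypothetical function names:

```python
import shlex

def linked_libraries(link_command):
    """Return the names passed via -l<name> in a link command, in order."""
    return [arg[2:] for arg in shlex.split(link_command)
            if arg.startswith("-l") and len(arg) > 2]

def missing_libraries(configured, link_command):
    """Libraries reported by the configuration but absent from the link command."""
    linked = set(linked_libraries(link_command))
    return [name for name in configured if name not in linked]

# Library-related part of the link line and the configuration shown above.
link_line = ("x86_64-linux-gnu-gcc -pthread -shared cd_fast.o "
             "-L/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.7 "
             "-lcblas -lm -o cd_fast.so")
configured = ["ptf77blas", "ptcblas", "atlas"]
print(linked_libraries(link_line))               # ['cblas', 'm']
print(missing_libraries(configured, link_line))  # ['ptf77blas', 'ptcblas', 'atlas']
```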

@amueller amueller modified the milestones: 0.15.1, 0.14 Jul 18, 2014
@amueller amueller modified the milestones: 0.16, 0.17 Sep 11, 2015
@amueller amueller modified the milestones: 0.17, 0.16 Sep 11, 2015
@amueller amueller removed this from the 0.17 milestone Sep 20, 2015
@amueller amueller modified the milestone: 0.19 Sep 29, 2016
@jnothman
Member

do we still need this open?

@jnothman jnothman modified the milestones: 0.20, 0.19 Jun 14, 2017
@cdeil
Contributor

cdeil commented Jun 14, 2017

Looks like I was one of the people with this issue in 2012. OK to close.
(I don't know if there's still an issue, but if I notice something in the future, I'll file a new ticket.)

@jnothman
Member

ha. I'm happy with that.
