arpack tests fail when installed on mac os x #2547

Closed
joernhees opened this issue Jun 7, 2013 · 34 comments · Fixed by #2684
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected prio-normal scipy.sparse.linalg

@joernhees
Contributor

I have run into this problem several times already with different installation methods. This time I used brew to install scipy:

brew tap homebrew/science
brew tap samueljohn/python
brew update
brew install python
pip install nose
brew install scipy

If you then run the tests the following errors occur: https://gist.github.com/joernhees/5733199

A shortened version here:

$ ipython
Python 2.7.5 (default, Jun 8 2013, 00:27:06)
Type "copyright", "credits" or "license" for more information.

IPython 0.13.2 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.

In [1]: import scipy

In [2]: scipy.test()
Running unit tests for scipy
NumPy version 1.7.1
NumPy is installed in /usr/local/lib/python2.7/site-packages/numpy
SciPy version 0.12.0
SciPy is installed in /usr/local/lib/python2.7/site-packages/scipy
Python version 2.7.5 (default, Jun 8 2013, 00:27:06) [GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
nose version 1.3.0
/usr/local/lib/python2.7/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.blas` is deprecated, use `scipy.linalg.blas` instead!
warnings.warn(depdoc, DeprecationWarning)
/usr/local/lib/python2.7/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.lapack` is deprecated, use `scipy.linalg.lapack` instead!
warnings.warn(depdoc, DeprecationWarning)
usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
./usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())
./usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py:230: DeprecationWarning: fromstring() is deprecated. Please call frombytes() instead.
image = Image.fromstring('L',shape,bytedata.tostring())

======================================================================
FAIL: test_arpack.test_symmetric_modes(True, <std-symmetric>, 'f', 2, 'LM', None, 0.5, <class 'scipy.sparse.csr.csr_matrix'>, None, 'normal')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose
verbose=verbose, header=header)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.00178814, atol=0.000357628
error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal
(mismatch 100.0%)
x: array([[ 2.38156418e-01, -6.75444982e+09],
[ -1.07853470e-01, -8.01245676e+09],
[ 1.24683023e-01, -5.19757686e+09],...
y: array([[ 2.38156418e-01, -5.70949789e+08],
[ -1.07853470e-01, -4.05829392e+08],
[ 1.24683023e-01, 6.25800146e+07],...

======================================================================
FAIL: test_arpack.test_symmetric_modes(True, <std-symmetric>, 'f', 2, 'LM', None, 0.5, <class 'scipy.sparse.csr.csr_matrix'>, None, 'buckling')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose
verbose=verbose, header=header)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.00178814, atol=0.000357628
error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling
(mismatch 100.0%)
x: array([[ 3.53755447e-01, -2.29114355e+04],
[ -1.60204595e-01, -6.65625445e+04],
[ 1.85203065e-01, -2.69012500e+04],...
y: array([[ 3.53755447e-01, -8.88255444e+05],
[ -1.60204595e-01, -2.39343354e+06],
[ 1.85203065e-01, -3.96842525e+04],...

======================================================================
FAIL: test_arpack.test_symmetric_modes(True, <std-symmetric>, 'f', 2, 'LM', None, 0.5, <class 'scipy.sparse.csr.csr_matrix'>, None, 'cayley')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose
verbose=verbose, header=header)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.00178814, atol=0.000357628
error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley
(mismatch 100.0%)
x: array([[ -2.38156418e-01, 1.04661597e+09],
[ 1.07853470e-01, 1.39930271e+09],
[ -1.24683023e-01, 9.56906461e+08],...
y: array([[ -2.38156418e-01, 7.63721281e+07],
[ 1.07853470e-01, 1.25169905e+08],
[ -1.24683023e-01, 2.91283130e+07],...

======================================================================
...
======================================================================
FAIL: test_arpack.test_symmetric_modes(True, <gen-symmetric>, 'd', 2, 'SA', None, 0.5, <function asarray at 0x10ad03e60>, None, 'cayley')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose
verbose=verbose, header=header)
File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13
error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley
(mismatch 100.0%)
x: array([[-0.36892684, -0.01935691],
[-0.26850996, -0.11053158],
[-0.40976156, -0.13223572],...
y: array([[-0.43633077, -0.01935691],
[-0.25161386, -0.11053158],
[-0.36756684, -0.13223572],...

----------------------------------------------------------------------
Ran 4831 tests in 54.169s

FAILED (KNOWNFAIL=11, SKIP=14, failures=63)
Out[2]: <nose.result.TextTestResult run=4831 errors=0 failures=63>

In [3]:
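The failing assertions above compare `A v` against `lambda v` for the eigenpairs returned by `eigsh` in shift-invert mode. A minimal sketch of that kind of check follows, assuming a small random symmetric matrix rather than the matrices used in `test_arpack.py`; the helper name `eigsh_residual` and its parameters are hypothetical and only serve this illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh


def eigsh_residual(dtype=np.float64, n=20, k=2, sigma=0.5, seed=0):
    """Largest entry of |A v - lambda v| for k eigenpairs found near sigma.

    Hypothetical helper: the random test matrix is an assumption,
    not the matrix used by scipy's own test suite.
    """
    rng = np.random.RandomState(seed)
    m = rng.rand(n, n)
    a = ((m + m.T) / 2).astype(dtype)  # symmetrize, then cast to the tested type

    # Shift-invert around sigma, as in the 'normal' mode failures above.
    w, v = eigsh(csr_matrix(a), k=k, sigma=sigma, which="LM", mode="normal")

    lhs = a.dot(v)  # A v, one column per eigenvector
    rhs = v * w     # lambda v, each column scaled by its eigenvalue
    return float(np.abs(lhs - rhs).max())


if __name__ == "__main__":
    for dt in (np.float32, np.float64):
        print(dt.__name__, eigsh_residual(dt))
```

On a healthy build the residual should be small relative to the precision of the chosen type; residuals many orders of magnitude larger would match the kind of mismatch shown in the log.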
@joernhees
Contributor Author

I actually found a workaround by using openblas (a bit slower is better than wrong IMO):

First make sure you have gfortran >= 4.7 (check with gfortran --version). If not, you probably followed one of the popular Mac OS X scipy install how-tos before. In my case it used brew's apple-gcc42, so brew unlink apple-gcc42 && brew install gfortran could solve that.

Then run:

brew install openblas
brew reinstall scipy --with-openblas

As mentioned in samueljohn/homebrew-python#12 no one wants to be responsible for these errors...
IMHO scipy should at least warn users in case a library is used which is known to cause errors of this magnitude.

@rgommers
Member

rgommers commented Jun 9, 2013

It's not "doesn't want to be responsible", it's simply quite hard to solve. Duplicate of gh-2372, so closing.

@rgommers rgommers closed this as completed Jun 9, 2013
@joernhees
Contributor Author

@rgommers I'm quite sure this isn't a dup of that; I'm running on Mac OS X and AFAIK it uses Accelerate.framework for LAPACK.

Also, I seem to judge the severity of this a bit differently: many of my researcher colleagues use Macs, and all of those I asked to run the scipy tests were surprised that these tests fail. I'm not 100% sure how bad the errors actually are, but I have a bad feeling that many scientific publications could depend on this without anyone ever noticing the errors.

What I suggest is not hard to solve: scipy should check and notify users if a library is used that is known to cause problems. You point out yourself that not using Accelerate and llvm is best in your opinion ( samueljohn/homebrew-python#12 ). Why not check for it?
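To make the suggestion concrete, here is a rough sketch of such a check. It assumes the build information is available as a dictionary shaped like the result of `get_info('lapack_opt')` from `numpy.distutils`, with link arguments under `extra_link_args`; that layout, and the helper names `uses_accelerate` and `warn_if_accelerate`, are assumptions made for illustration and not an existing scipy API.

```python
import warnings


def uses_accelerate(build_info):
    """Return True if a get_info()-style dict mentions Accelerate or vecLib.

    Hypothetical helper: the dict keys inspected here are an assumption
    about how the build configuration is reported.
    """
    args = list(build_info.get("extra_link_args", []))
    args += list(build_info.get("libraries", []))
    joined = " ".join(str(a) for a in args).lower()
    return "accelerate" in joined or "veclib" in joined


def warn_if_accelerate(build_info):
    """Emit a RuntimeWarning when the build appears to use Accelerate."""
    if uses_accelerate(build_info):
        warnings.warn(
            "BLAS/LAPACK appears to come from Accelerate/vecLib, "
            "which is known to trigger ARPACK test failures on OS X 10.8.",
            RuntimeWarning,
        )
        return True
    return False


if __name__ == "__main__":
    # Example dicts, made up for this sketch.
    print(uses_accelerate({"extra_link_args": ["-Wl,-framework", "-Wl,Accelerate"]}))
    print(uses_accelerate({"libraries": ["openblas"]}))
```

A string match like this is deliberately crude; it only shows that the notification itself is cheap, not how scipy would want to implement it.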

@pv
Member

pv commented Jun 10, 2013

Yes, disabling support for Accelerate completely could be sensible until this is resolved.

@pv pv reopened this Jun 10, 2013
@rgommers
Member

Sorry about linking the wrong ticket. It is a duplicate of gh-2256 as well as gh-2248.

I'm not sure about disabling Accelerate support - it will create a lot of build issues and a flood of tickets / complaints on the mailing list. Disabling this part of ARPACK functionality (everywhere or only on OS X?) is probably a better alternative. AFAIK it's only single precision that has a problem and that's not widely used.

I suggest continuing this discussion on gh-2248; I'll add that to the 0.13.0 Milestone so we can't just forget about this. Closing again as duplicate.

@pv
Member

pv commented Jun 10, 2013

@rgommers: I meant that we disable also Veclib/Accelerate detection. This is certainly doable.

The problem is that the "part of ARPACK functionality" in question is not precisely defined. It's not even only single precision, also apparently double precision fails (see 'd' in the above tests).

@rgommers
Member

I understood that that's what you meant. It's doable, but will create lots of issues for users. We'll be disabling the only BLAS/LAPACK shipped with the OS, forcing users to hunt for other binaries (which may have their own issues) or install a BLAS/LAPACK from source (very much nontrivial). So it's not a decision to take lightly.

OpenBLAS support in numpy.distutils is still shaky by the way: http://thread.gmane.org/gmane.comp.python.numeric.general/53541/focus=53547

@rgommers
Member

You're right by the way, missed the 'd'.

@rgommers
Member

It's well-defined enough - disable everything that's failing on OS X 10.8 right now.

@pv
Member

pv commented Jun 10, 2013

I think disabling is not a satisfactory solution, because: the tests show that something fails, we do not understand why. However, they do not show that the parts for which the tests pass function correctly, rather, they show they work in some test cases.

It will also break portability of Scipy code across platforms.

Moreover, the situation has apparently got worse with new OSX versions, as previously it used to be so that the problems were restricted to single precision. Next version of OSX --- more problems?

@rgommers
Member

Could be that OS X 10.9 has even more issues, we'll find out soon enough.

There's also the -ff2c solution still, gh-280. On that same PR David says that MKL has similar issues as Accelerate. OpenBLAS and GotoBLAS are not well supported by numpy.distutils. Bento support is better, but has 64-bit issues. So what are we going to recommend, use Netlib BLAS/LAPACK or ATLAS only?

I'm going to try one more time: all test failures are in the generalized-symmetric support, as added in gh-25. It was added for 0.10.0, and started giving issues soon after. Disabling generalized-symmetric support (a feature used by a tiny fraction of users) either on OS X or on all platforms would be a lower-impact change than completely removing support for Accelerate.

@jakevdp @cournape any opinion on this? Time to have a go at fixing it? I believe you both have an OS X 10.8 system:)

@jakevdp
Member

jakevdp commented Jun 11, 2013

I hate to admit it, but I still haven't been able to successfully build scipy from source on my mac. I use macports there, and have been doing all my scipy development work on my old linux machine 😬

Regarding generalized eigenvalue problems... I'd hate to completely remove it, because it works fine on most systems and there is probably someone out there who finds it useful. What if we move it to a category along the lines of "unsupported and untested: use at your own peril"? Is there any precedent for that?

@pv
Member

pv commented Jun 11, 2013

I am still not convinced that disabling features solves the issue --- we may certainly get fewer test failures, but this is just hiding the issue. We know that some calls to Accelerate fail to give correct results. What we do not know is whether some of these calls are also made in the usual eigenproblem case, in code paths not hit by the simple set of test cases.

Moreover, the problem is not limited to ARPACK --- remember that also scipy.linalg has had failures reported on OSX: http://scipy-user.10969.n7.nabble.com/Fwd-scipy-test-fails-for-0-11-0-on-OS-X-10-8-2-td4039.html

Basically, anything in Scipy that calls certain LAPACK/BLAS routines (which?) is suspect, and we cannot just go removing these features.

One possible fix is to go and try to plug all LAPACK FUNCTION calls even for double precision, as this is one place where the ABI incompatibility issue can come and bite.
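For readers unfamiliar with why FUNCTION calls are the risky spot: under the f2c calling convention a default REAL function returns a C double, while a caller compiled without -ff2c expects a 4-byte float. The sketch below only illustrates that size mismatch at the bit level using Python's struct module; the helper name `misread_double_as_float` is hypothetical, and real behaviour depends on the platform's register conventions.

```python
import struct


def misread_double_as_float(value):
    """Reinterpret the low 4 bytes of a little-endian double as a float.

    Hypothetical helper: mimics a caller that expects a 4-byte REAL
    while the callee actually produced an 8-byte double.
    """
    raw = struct.pack("<d", value)              # callee writes 8 bytes
    (as_float,) = struct.unpack("<f", raw[:4])  # caller reads only 4 of them
    return as_float


if __name__ == "__main__":
    for x in (1.5, 0.1, 3.0):
        print(x, "->", misread_double_as_float(x))
```

The value that comes back has nothing to do with the value that was returned, which is consistent with wildly wrong numbers instead of small rounding differences.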

@rgommers
Member

OK let's give it a try. Accelerate in numpy should be removed as well for the same reason then. Guess we'll need to write a good guide on how to compile with other BLAS/LAPACK libs and refer anyone who can't figure it out to HomeBrew.

I'll bring it up on the mailing list - maybe that triggers someone on OS X 10.8 to have a go at fixing things.

@rgommers
Member

We have a hosted 10.8 Mac Mini for testing by the way. I'll set up some different lapacks there. I can create an account for you as well if you want.

@pv
Member

pv commented Jun 11, 2013

Yes --- actually, I think it's overall a bit dangerous that Numpy finds Accelerate/Veclib but does not provide compiler flags that ensure the correct Fortran ABI is used.

Scipy can perhaps hack around this by avoiding FUNCTION calls, but this is not true for third-party modules that just want get_info('lapack_opt')

@ghost ghost assigned cournape Jun 11, 2013
@cournape
Member

I am a bit swamped at work ATM, but we can look at that during scipy conference.

@joernhees
Contributor Author

@rgommers i've summarized how to get scipy running from scratch here: http://joernhees.de/blog/2013/06/08/mac-os-x-10-8-scientific-python-with-homebrew/
Summarizing for scipy:

# set up some taps and update brew
brew tap homebrew/science # a lot of cool formulae for scientific tools
brew tap samueljohn/python # numpy, scipy
brew update && brew upgrade

# install a brewed python
brew install python

# install openblas (otherwise scipy's arpack tests will fail)
brew install openblas

# build numpy and scipy against openblas
brew install numpy --with-openblas
brew install scipy --with-openblas

That way all tests in scipy

Ran 4831 tests in 54.472s

OK (KNOWNFAIL=11, SKIP=14)

and numpy

Ran 4771 tests in 19.594s

OK (KNOWNFAIL=5, SKIP=6)

pass.

Thanks to @samueljohn it should be fairly simple to extract the needed compile options.

@pv
Member

pv commented Jun 11, 2013

@joernhees: This is BTW likely an incompatible Fortran ABI issue, so recompiling with -ff2c compilation and link flags may also fix the issue with Accelerate (I remember someone reporting that this worked). Environment variables FOPT, FFLAGS and LDFLAGS control these AFAIK.

Macports includes the ff2c for Octave, which probably has similar problems. On OSX Scipy includes essentially a workaround similar to dotwrp, which worked OK for previous OSX versions, but something apparently changed on 10.8 (and 10.7?).

@rgommers
Member

Thanks @joernhees.

@pv there's apparently also a new build issue with Accelerate (maybe came with OS X update), I've seen it reported in a couple of places now:
http://stackoverflow.com/questions/16269161/installing-issues-for-scipy-on-mac
http://charlesmartinreid.com/wiki/Scipy_vImage.h_Problem

@pv
Member

pv commented Jun 27, 2013

Ok, I just managed to get the tests to fail on OSX 10.8.4 + clang + gfortran 4.2, when using -ff2c. (@rgommers this is on the scipy test machine).

Moreover, I isolated a pure-Fortran example that fails when linked with Arpack --- everyone can try this out: https://gist.github.com/pv/5873146

So the Fortran ABI issue is partly a red herring, there's a real bug somewhere either in ARPACK or in Accelerate.
It sounds like the bug is in Accelerate, as the tests pass on other BLAS+LAPACK libraries. However, it may also be an ARPACK bug (e.g. depending on a certain level of rounding error --- but that sounds still unlikely if it works on MKL etc.), or, an incompatibility in the Fortran ABI that is not fixed by -ff2c (also this sounds unlikely, and it would then be an Accelerate bug).

It seems the plan of action for Scipy should be to drop support for Accelerate.

@mandli

mandli commented Jun 30, 2013

I tried to tease out an example as well but I think @pv did a better job of isolating the problem and I can confirm that the gist fails for me as well and that it is not fixed by -ff2c.

@pv
Member

pv commented Jul 1, 2013

Could one of you guys send a bug report to Apple, now that we have a reproducible Fortran-only test case? This issue affects everyone using ARPACK on OSX. (Also, I guess Apple has some recommended Fortran compiler, so maybe check with that too?) I don't use OSX, so I'd be happy if someone else does this.

@mandli

mandli commented Jul 1, 2013

Just filed it.

@pv
Member

pv commented Jul 1, 2013

Macports users can also verify that the issue exists on the macports-packaged arpack, and if yes, complain to macports, too.

@pv
Member

pv commented Jul 22, 2013

The ARPACK bug report is here: http://forge.scilab.org/index.php/p/arpack-ng/issues/1259/
There's some evidence appearing that it may also be a bug in ARPACK.

@mandli

mandli commented Jul 22, 2013

I added a pointer to the Apple bug report to that discussion.

@mandli

mandli commented Jul 31, 2013

I heard back from Apple on this issue and they appear to have fixed the problem in a build of 10.9. I can confirm that the gist now provides the correct answer to the problem.

Running tests on numpy and scipy now leads to success (thankfully) on the build I am testing on as well.

@samueljohn

That is good news, finally. So for 10.9, we can build scipy with Accelerate again and for 10.8 switching to openblas?

@joernhees
Contributor Author

Wow, thanks so much for all the effort and even getting this fixed upstream 👍
I like the suggestion to use openblas on 10.8 and building with Accelerate on 10.9 again.
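As a rough sketch of that rule of thumb, a build script could pick the BLAS from the OS X version. The function name `preferred_blas` and the returned strings are hypothetical, and the 10.9 cut-off simply mirrors the report above that Apple fixed the problem in a 10.9 build.

```python
import platform


def preferred_blas(mac_version):
    """Suggest a BLAS for a given OS X version string such as '10.8.4'.

    Hypothetical helper: returns 'openblas' before 10.9 and
    'accelerate' from 10.9 on, or None if no version is given.
    """
    if not mac_version:
        return None  # not running on OS X, nothing to suggest
    major_minor = tuple(int(p) for p in mac_version.split(".")[:2])
    return "accelerate" if major_minor >= (10, 9) else "openblas"


if __name__ == "__main__":
    # platform.mac_ver()[0] is the release string, empty on other systems.
    print(preferred_blas(platform.mac_ver()[0]))
```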

@mandli

mandli commented Jul 31, 2013

I am hopeful, I will try to continue to test numpy and scipy as new builds come out.

@pv
Member

pv commented Jul 31, 2013

For 10.8, you can also apply the bug fix discussed in the Arpack bug report. This issue turned out to be a bug in Arpack, which however only manifested for certain Lapack versions (one of them being the one shipped with Accelerate).

pv added a commit to pv/scipy-work that referenced this issue Aug 4, 2013
See discussion at http://forge.scilab.org/index.php/p/arpack-ng/issues/1259/

The Ritz vector purification step assumes workl(iq) still contains the
original Q matrix. This is however overwritten by the call to xGEQR2
earlier.

This patch fixes the issue by making a copy of the last row of the eigenvector
matrix, after it is recomputed after QR by xORM2R. The work space
WORKL(IW+NCV:IW+2*NCV) used for the TAU values of the QR decomposition is not
used later in the routine, and can be used to store a copy of the last row.

Fixes scipygh-2547 (provided you use -ff2c flag for compiling)

Thanks to Michael Wimmer for tracing the issue.
@rgommers
Member

rgommers commented Aug 6, 2013

gh-2684 merged. Thanks to @pv and @michaelwimmer for fixing this.

@samueljohn

Update: I added 0.13.0b1 to my tap and I can confirm that these issues are no longer reported (though I get a PIL-related failure where it cannot use zip). In case anyone wants to try: brew update && brew reinstall scipy --devel

Though I notice that a lot of tests are skipped (S) or known (K), so I wonder if the arpack stuff has just been marked.

7 participants