Add generalized ufunc linalg functions and make numpy.linalg use them #3220

Merged
merged 92 commits into numpy:master from pv/linalg-gu on Apr 17, 2013

4 participants
Owner

pv commented Apr 10, 2013

This is a followup to gh-2954 (list of new commits), which replaces numpy.linalg routines with the corresponding umath_linalg implementations, while preserving backward compatibility for ndim <= 2 inputs. The new code is all under numpy/linalg.
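
A rough illustration of the intended behavior (a sketch only, assuming a NumPy build that includes this change; the matrices are made-up example values): the last two axes are treated as matrices, and plain 2-D inputs behave as before.

```python
import numpy as np

# A stack of three 2x2 matrices: shape (3, 2, 2). Values are illustrative.
stack = np.array([
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
])

# Generalized (N-dim) calls: the last two axes are the matrix axes.
dets = np.linalg.det(stack)   # one determinant per matrix, shape (3,)
invs = np.linalg.inv(stack)   # one inverse per matrix, shape (3, 2, 2)

# Backward-compatible 2-D call: same value as the matching stacked entry.
det0 = np.linalg.det(stack[0])
```

The 2-D call goes through the same code path as the stacked one, which is why the existing test suites are a reasonable check for backward compatibility.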

Numpy's test suite and those of Scipy, Pandas, and Nipy pass with these changes, so there probably aren't obvious problems with backward compat.

This PR doesn't add new functions to numpy.linalg; it only replaces existing ones. That, and the other things still to do, are summarized in issue gh-3217.

Owner

charris commented Apr 10, 2013

Looks like it needs a rebase.

ovillellas and others added some commits Sep 21, 2012

@ovillellas @pv ovillellas inner1d and mat_mult implemented using blas. 07c0237
@ovillellas @pv ovillellas created a new module to hold linalg ufuncs. 3ce193b
@ovillellas @pv ovillellas refactored some code, make it cleaner overall and ready to reuse some…
… code from the matrix_multiply in other gufuncs
65bb3e7
@ovillellas @pv ovillellas det and slogdet working 1db3f9b
@ovillellas @pv ovillellas eigh and eigvalsh working cf94941
@ovillellas @pv ovillellas solve gufunc working 2a6d450
@ovillellas @pv ovillellas solve1 and inv working 67820dd
@ovillellas @pv ovillellas fixed possible issues with BLAS _copy (0 is not a valid incx/incy val…
…ue and the functions are not guaranteed to work with them).

also got cholesky working.
c1efc48
@ovillellas @pv ovillellas working eig and eigvals priority 2 functions. 234523c
@ovillellas @pv ovillellas svd implemented. Single output working. Multiple options not function…
…al due to a bug in the harness.
4c9f286
@ovillellas @pv ovillellas modified the code so it just used external definitions of blas/lapack…
… functions (as is made in linalg module). Changed some code so that it uses BLAS instead of cblas (the fortran functions) also in line with what it is done on linalg. Modified the matrix multiply code, made it simpler and adapted to use blas (it was using cblas with CblasRowMajor, that is not available in the fortran function.
7f3afca
@ovillellas @pv ovillellas lapack_lite for builds of umath_linalg without an optimized lapack in…
… the system.
c679f7b
@ovillellas @pv ovillellas added some single precision functions to f2c_lite.c that were missing…
… and needed by out library. It seems to work now on Linux.
c85e833
@ovillellas @pv ovillellas added plenty of simple functions (quadratic_form plus all the "inspir…
…ed from PDL" ufuncs). Only missing from "inspired from PDL" is matrix_multiply3.
26d6bef
@ovillellas @pv ovillellas added information about the contents of umath-linalg module c47ee06
@ovillellas @pv ovillellas fixed gufuncs so that they use the proper signature (mwiebe fix present) 05f9401
@ovillellas @pv ovillellas fixed a warning in f2c_lite.c for umath/lapack_lite
added chosolve and poinv, they need testing.
95eef39
@ovillellas @pv ovillellas poinv and chosolve working. Rebuilt lapack_lite to support them. Used…
… also a patched f2c to remove warnings.
2bb6c8e
@ovillellas @pv ovillellas updated umath_linalg_content.txt 861e694
@ovillellas @pv ovillellas wrote a wrapper module for umath_linalg. Named gufuncs_linalg (in pyt…
…hon).
747cd46
@ovillellas @pv ovillellas first iteration with tests. Incomplete and some failing. Just a start…
…. Some bugs already fixed.
ecfb93c
@ovillellas @pv ovillellas work on tests and related fixes. Getting things in shape to commit to…
… de-shaw patch
814add8
@ovillellas @pv ovillellas removed some wrappers that weren't needed with the harness fix, just …
…changed to assignment
f74546d
@ovillellas @pv ovillellas modified umath_linalg_content.txt to reflect changes. a0c09b8
@ovillellas @pv ovillellas fixed bug in matrix_multiply when using cdoubles f6aaecb
@ovillellas @pv ovillellas fixed the problem in eigvals (apparently) c355550
@ovillellas @pv ovillellas work in progress: proper tests for eig. 2276eaa
@ovillellas @pv ovillellas added tests for ufuncs in gufuncs_linalg (the ones based on pdl). Add…
…ed multiply4 in the wrapper, as it was missing
a75fb9e
@ovillellas @pv ovillellas updated the umath_linalg_content.txt adding a mention to the wrapper …
…code.
8817773
@ovillellas @pv ovillellas reverted matrix_dot in umath_gufuncs to matrix_multiply.
added some type-tests on test_gufuncs_linalg.py
3e77076
@ovillellas @pv ovillellas updated documentation 0132755
@ovillellas @pv ovillellas BLD: Windows build fixes + some tabs removed 367f333
@ovillellas @pv ovillellas STY: made sure that split strings had \ at the end d4945ea
@ovillellas @pv ovillellas updated api version, as one merge changed it. 866d230
@ovillellas @pv ovillellas fixed testdet test. It failed due to eigvails failing in single preci…
…sion and notifying the failure as nans.
76a1963
@ovillellas @pv ovillellas TST: fixed test for gufuncs_linalg Det e41e426
@ovillellas @pv ovillellas ENH: cholesky handling of _potrf failures (set result to nan) 2cc6fb4
@ovillellas @pv ovillellas ENH: eigh, eigvalsh set result to nan on LAPACK error (_ssyevd, _heevd) f2b0bdd
@ovillellas @pv ovillellas ENH: solve sets result to nan on LAPACK error (_gesv) fb2270d
@ovillellas @pv ovillellas ENH: inv sets result to nan on LAPACK error (_gesv) 7aa26c3
@ovillellas @pv ovillellas ENH: svd sets results to nan on LAPACK error (_gesdd) ec5020c
@ovillellas @pv ovillellas ENH: chosolve sets result to nan on LAPACK error (_potrf, _potrs) 170726c
@ovillellas @pv ovillellas DOC: Added docstring for eigh 43c25bb
@ovillellas @pv ovillellas DOC: Added docstring to eigvalsh 82976c0
@ovillellas @pv ovillellas DOC: added docstring to solve b75a0cb
@ovillellas @pv ovillellas DOC: added docstring for svd 87cd05f
@ovillellas @pv ovillellas DOC: Added docstring to chosolve 1f79b69
@ovillellas @pv ovillellas DOC: added docstring for poinv bca1bbe
@ovillellas @pv ovillellas DOC: Added notes on error handling. 6e352ad
@ovillellas @pv ovillellas MAINT: renamed umath_linalg module to _umath_linalg as it is internal. 24b727b
@ovillellas @pv ovillellas MAINT: renamed the file describing the gufuncs_linalg module e1c7ed4
@ovillellas @pv ovillellas MAINT: Rewrote the gufuncs_linalg_contents as a rst file and updated it. bbd674d
@ovillellas @pv ovillellas STY: PEP8 1ec2024
@ovillellas @pv ovillellas DOC: corrected documentation - arrays of functions -> arrays of matrices a2afc85
@ovillellas @pv ovillellas ENH: Added np.seterr handling of errors f48c134
@ovillellas @pv ovillellas BLD: python 3.x compile fix 2eeddab
@ovillellas @pv ovillellas BLD: Python3 build problem fixed fd435bc
@ovillellas @pv ovillellas DOC: Added doctests to docstring for fused operations 8556374
@ovillellas @pv ovillellas DOC: added more doctests cd6c20b
@ovillellas @pv ovillellas DOC: changed doctests for eig and eigh to be more robust 05ceb12
@ovillellas @pv ovillellas DOC: fixed doctest in poinv so that the example matrix is positive-de…
…finite.
8e8f247
@ovillellas @pv ovillellas BUG: fixed a bug in eig for complex numbers. Eigenvector results are …
…computed properly now.
7e3176f
@ovillellas @pv ovillellas FIX: problems with eig and eigvals. Enhanced tests. ad8b29b
@ovillellas @pv ovillellas ENH: added complex version for inner1d. Also added dotc1d 60f54b6
@ovillellas @pv ovillellas FIX: matrix_multiply now works when given a column matrix 5dc27ac
@ovillellas @pv ovillellas DOC: changed <NDIMS> to … in shape descriptions in docstrings 0afe276
@ovillellas @pv ovillellas MAINT: added from __future__ as suggested by charris 1f8efc0
@ovillellas @pv ovillellas BLD: reverted api version back to 8 e7a54da
@pv pv MAINT: mark gufuncs_linalg.py as a internal testing-only module 35f4b17
@pv pv ENH: linalg: use _umath_linalg for det() 5b0fead
@pv pv DOC: document the behavior of generalized N-dim linear algebra functions f0a78c7
@pv pv ENH: linalg: use _umath_linalg for slogdet() 87dc3f6
@pv pv ENH: linalg: use _umath_linalg for inv() 7d2fed6
@pv pv ENH: linalg: add helper routines for gufuncs 2e8b24e
@pv pv ENH: linalg: use _umath_linalg for solve() 04ad33e
@pv pv ENH: linalg: use _umath_linalg for cholesky() 2dd6405
@pv pv ENH: linalg: use _umath_linalg for eigvals() 15a9c3b
@pv pv ENH: linalg: use _umath_linalg for eigvalsh() 74e1477
@pv pv ENH: linalg: use _umath_linalg for eig() 1253d57
@pv pv ENH: linalg: use _umath_linalg for svd() bbdca51
@pv pv BUG: core/umath_linalg: ensure FP error status reflects LAPACK error …
…status
63a8aef
@pv pv ENH: linalg: use _umath_linalg for eigh() cc7b048
@pv pv TST: linalg: add tests for generalized linalg functions 9bfa19b
@pv pv MAINT: move umath_linalg under numpy/linalg and use the same lapack_lite
Also, link umath_linalg against the system BLAS/LAPACK if available.
9c00887
@pv pv MAINT: move gufuncs_linalg_contents.rst to the docstring of the module 1b27cb0
@pv pv BUG: linalg: fix Py3 syntax 20cc77a
Owner

pv commented Apr 10, 2013

Rebased.

Owner

njsmith commented Apr 10, 2013

Is there a test for the FP flags thing (63a8aef)? That seems like it could use a test.

For the double/single issue, does it work to use the ufunc dtype= argument? Theoretically this ought to let the internal ufunc machinery make any necessary promotions one-buffer-at-a-time, which is nicer than copying entire arrays.

Owner

pv commented Apr 10, 2013

@njsmith: test_linalg.py:test_reduced_rank fails if the flags aren't cleared (for SVD, but not all of the routines). Not sure how best to test that otherwise, as the issue is floating point errors (0/0 et al) inside LAPACK. Matrices full of inf cause those, but in that case LAPACK also signals a failure... I can't think of a way to directly test this.

The dtype keyword seems buggy for gufuncs, btw (gh-3222):

>>> import numpy as np
>>> a = np.array([[1.+2j,2+3j], [3+4j,4+5j]], dtype=np.cdouble)
>>> np.linalg._umath_linalg.eigvals(a, dtype=np.complex128)
__main__:1: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
__main__:1: ComplexWarning: Casting complex values to real discards the imaginary part
array([-0.37228132+0.j,  5.37228132+0.j])
>>> np.linalg._umath_linalg.eigvals(a)
array([-0.35670389-0.26307815j,  5.35670389+7.26307815j])
>>> b = np.array([[1.,2], [3,4]], dtype=np.float64)
>>> np.linalg._umath_linalg.eigvals(b, dtype=np.float64)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: No loop matching the specified signature was found for ufunc eigvals

The signature keyword however seems to work (but is much messier to use) --- does that also do the cast per-block?
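
For reference, a minimal sketch of loop selection with signature=. It uses the public np.matmul as a stand-in, on the assumption that it is a gufunc in the NumPy version at hand (true for recent releases), since _umath_linalg is internal. It only shows that the casting is left to the ufunc machinery instead of an explicit astype(); it says nothing about whether the cast happens per block.

```python
import numpy as np

# Single-precision inputs (illustrative values).
a32 = np.arange(8, dtype=np.float32).reshape(2, 2, 2)
b32 = np.ones((2, 2, 2), dtype=np.float32)

# Ask for the double-precision inner loop explicitly; the ufunc machinery
# casts the float32 inputs itself, with no astype() copy in user code.
out = np.matmul(a32, b32, signature="dd->d")

# Reference result computed with explicit up-front copies.
reference = np.matmul(a32.astype(np.float64), b32.astype(np.float64))
```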

I am not sure this change is needed, and it may even be dangerous. The function nan_@TYPE@_matrix sets FP errors, so it basically works as a way to detect problems. It is highly unlikely that other errors are raised by the FPU if BLAS/LAPACK is implemented properly.

The gufunc harness seems to clear the fp_errors before calling the kernel (Mark Wiebe pointed me to the code):
https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L2068

Executes the kernel:
https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L2090

And checks the errors:
https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L2126

It may be dangerous because the inner loop may be executed several times for a single Python call. With the code in this change, only the error from the last call is taken into account: each call clears any flag that might be present from a previous call before the harness checks the error condition.

Owner

pv replied Apr 11, 2013

It is required if we want to use the invalid FP flag for signalling LAPACK errors. This is not a hypothetical case; there are in fact input matrices even in Numpy's test suite that cause internal FP errors inside LAPACK.

However, you are correct in stating that only the last inner loop is taken into account this way. The error handling should perhaps be done in a completely different way. One possibility is to raise a Python exception inside the ufunc loop rather than misusing the FP exception mechanism; the only thing needed is some way to pass the desired exception status (extobj?) to the ufunc inner loop.

Yeah. The FP exception is used because the error handling chosen when implementing the gufuncs was to mark failures with NaNs. In that context it makes sense to raise the FP errors: in practice some results were set to NaN, and setting the flags just meant "there were some NaNs generated".

By the way, with this change all the fp exceptions caused inside LAPACK (and not reported as a LAPACK error, but just setting up the flags) will be lost, right?

The problem with using an exception is that it would throw away all the valid values: the code path goes through exception handling and the reference to the result matrix gets lost. In fact, this is one of the reasons to look for a better way to handle errors in gufuncs.

Owner

pv replied Apr 11, 2013

One work-around for the present would be to check the FP exception status first, and restore an existing invalid flag afterwards if necessary. This still throws away the FP status flags raised inside LAPACK --- however, I'm not sure that these have practical value as they have so far been ignored in Numpy/Scipy. Note that in test_linalg.py:test_reduced_rank, the SVD is valid and doesn't contain NaNs even though the invalid flag is raised (and a spurious warning is printed), so this might be an OK workaround for what is essentially a bug in LAPACK. What do you think?
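
A toy model of the two behaviors discussed here, in pure Python. Everything in it (FakeFPU, lapack_call, the loop functions) is hypothetical and only mimics the flag logic; it is not the actual C code in umath_linalg.

```python
class FakeFPU:
    """Toy stand-in for the floating-point status word (illustration only)."""

    def __init__(self):
        self.invalid = False


def lapack_call(fpu, ok, noisy=False):
    """Pretend LAPACK routine: may raise a spurious flag; returns 0 on success."""
    if noisy:
        fpu.invalid = True  # internal 0/0 that does not affect the result
    return 0 if ok else 1


def inner_loop_naive(fpu, ok):
    # The flag mirrors only this call's return code, wiping earlier errors.
    info = lapack_call(fpu, ok, noisy=True)
    fpu.invalid = info != 0


def inner_loop_restoring(fpu, ok):
    # Remember a flag raised by an earlier call and restore it afterwards.
    was_invalid = fpu.invalid
    info = lapack_call(fpu, ok, noisy=True)
    fpu.invalid = was_invalid or info != 0


def run(inner_loop, outcomes):
    fpu = FakeFPU()  # the harness clears the flags once, before the loops
    for ok in outcomes:
        inner_loop(fpu, ok)
    return fpu.invalid  # the harness checks the flag once, at the end


# First inner-loop call fails, second succeeds.
naive_flag = run(inner_loop_naive, [False, True])
restoring_flag = run(inner_loop_restoring, [False, True])
# No LAPACK failure at all: the spurious internal flag is discarded.
all_ok_flag = run(inner_loop_restoring, [True, True])
```

In this model the naive variant loses the failure from the first call, while the restoring variant keeps it and still drops the spurious flags raised inside the pretend LAPACK routine.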

The problem with dealing with errors in ufunc loops is twofold: (i) the caller does not pass information contained in the thread-local error object ("extobj") to the inner loop, and (ii) there is no way for the inner loop to signal that it wants to abort the computation immediately. One possible way to extend this would be to use the data parameter for passing in error handling flags and signaling termination. It is already used in some parts of the code to do similar things --- cf. NpyAuxData, which even has some void pointers reserved for future use.

I would add a (iii): there is no mechanism to report a partial success, that is, a way to flag the elements that failed to be solved properly while still returning the results for the items that were OK. That way user code could take corrective action only on the items that failed. That is a more complex problem, though. An exception is a really bad option in that case.
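
What partial success could look like from user code, as a sketch only: inv_with_mask is a hypothetical helper, not something in this PR, and it assumes a 3-D stack and that np.linalg.inv raises LinAlgError for an exactly singular matrix.

```python
import numpy as np


def inv_with_mask(stack):
    """Invert each matrix of a 3-D stack; failed items stay NaN (hypothetical helper)."""
    stack = np.asarray(stack, dtype=float)
    out = np.full_like(stack, np.nan)
    ok = np.zeros(stack.shape[0], dtype=bool)
    for i, m in enumerate(stack):
        try:
            out[i] = np.linalg.inv(m)
            ok[i] = True
        except np.linalg.LinAlgError:
            pass  # leave the NaNs in place and record the failure in the mask
    return out, ok


mats = np.array([
    [[2.0, 0.0], [0.0, 2.0]],  # invertible
    [[1.0, 2.0], [2.0, 4.0]],  # exactly singular
])
inverses, succeeded = inv_with_mask(mats)
```

The Python-level loop gives up the speed of the gufunc, which is the cost of not having such a mechanism in the inner loop itself.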

Owner

njsmith commented Apr 11, 2013

@pv: I think that dtype=X is just a shorthand for sig=<all X>? But I haven't checked.

pv added some commits Apr 12, 2013

@pv pv BUG: linalg: make umath_linalg to track errors from all inner loop it…
…erations

This ensures that the FP invalid flag always reflects the return code
from LAPACK. Fixes a bug in 63a8aef where umath_linalg raises a
warning only if the error occurs in the last iteration of the ufunc
inner loop.
1b3834d
@pv pv ENH: linalg: use signature= for internal casting rather than astype i…
…n linalg ufuncs
aa8fde0
@pv pv TST: linalg: test return types of generalized linalg routines fb9b5bd
Owner

pv commented Apr 12, 2013

Now using signature= for type selection, and doing some ugly stuff to make the FP invalid flag reflect the LAPACK error status from all iterations. It would be nice to get better error handling into ufuncs, but perhaps some ugliness can be tolerated before that.

pv added some commits Apr 13, 2013

@pv pv BUG: linalg: do not assume that GIL is enabled in xerbla_
With the new ufunc-based linalg, GIL is released in ufuncs, and needs to
be reacquired when raising errors in xerbla_.
68c186d
@pv pv BUG: linalg: fix LAPACK error handling in lapack_litemodule
If an exception is pending (raised from xerbla), the routines must
return NULL.
374e0b4
@pv pv TST: linalg: add tests for xerbla functionality (with and without GIL) 8eebee8
Owner

pv commented Apr 13, 2013

And then fixes for a GIL bug if a LAPACK error condition occurs.

Owner

charris commented Apr 14, 2013

@pv Do you feel this is in a state to go in?

Owner

pv commented Apr 14, 2013

I'm not aware of anything that breaks with this change (scipy, nipy, scikit-learn, pandas, statsmodels, networkx seem to suffer no regressions). Although this PR doesn't implement everything in the roadmap, it stands alone OK, and I think there's not more to add to it.

Should be advertised on the ML after (or before) merging.

Owner

charris commented Apr 15, 2013

The list post isn't getting much (any) response, but I'll wait a few more days.

@pv What about something like npymath, say npylapack_lite, for the combined library?

Owner

pv commented Apr 15, 2013

@charris: I don't see immediate use cases for npylapack_lite, as BLAS is widely available, if that's what you meant.

Owner

pv commented Apr 17, 2013

No further input seems to be coming, so maybe if there are no remaining concerns, we can merge and be done with it.

Owner

charris commented Apr 17, 2013

OK, in it goes.

@charris charris added a commit that referenced this pull request Apr 17, 2013

@charris charris Merge pull request #3220 from pv/linalg-gu
Add generalized ufunc linalg functions and make numpy.linalg use them
1975606

@charris charris merged commit 1975606 into numpy:master Apr 17, 2013

1 check passed

default The Travis build passed