ENH: estimate condition number for sparse matrices #21620

Open · wants to merge 8 commits into main
Conversation

@maxaehle (Contributor)
Reference issue

Closes gh-18969

What does this implement/fix?

Add

  • scipy.sparse.linalg.SuperLU.rinvnormest to compute the reciprocal of the norm of the inverse matrix, given an LU decomposition,
  • scipy.sparse.linalg.rinvnormest to compute the LU decomposition and call the method mentioned previously,
  • scipy.sparse.linalg.cond1est to compute the 1-norm condition number, calling the function mentioned previously and scipy.sparse.linalg.onenormest.

SuperLU.rinvnormest is implemented in the C layer and calls SuperLU's gscon function. rinvnormest calls scipy.sparse.linalg.splu and then the rinvnormest method of the resulting LU decomposition. cond1est is computed from scipy.sparse.linalg.rinvnormest and onenormest, rather than estimating the condition number itself in C.
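For illustration, a minimal usage sketch of the proposed API (rinvnormest and cond1est are the new functions from this PR, so the calls are hypothetical and their signatures may still change):

import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical imports of the functions added by this PR:
# from scipy.sparse.linalg import rinvnormest, cond1est

A = csc_matrix(np.diag([9.0, 0.2]))  # ||A||_1 = 9, ||A^{-1}||_1 = 5

# rinvnormest(A)  # would estimate 1/||A^{-1}||_1 = 0.2
# cond1est(A)     # would estimate kappa_1(A) = 9 * 5 = 45

# Dense reference for the same matrix:
np.linalg.cond(A.toarray(), p=1)  # 45.0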
@github-actions bot added the scipy.sparse.linalg, C/C++, Meson, and enhancement labels on Sep 24, 2024.
Review comment on the cond1est implementation (excerpt):

>>> np.linalg.cond(A.toarray(), p=1)
45.0
"""
return onenormest(A)/rinvnormest(A)
@maxaehle (Contributor, Author)

When A is singular, we currently get a RuntimeError: Factor is exactly singular. In contrast, np.linalg.cond gives a numpy.linalg.LinAlgError: Singular matrix. Shall we catch the RuntimeError and raise a LinAlgError instead?

Contributor

Changing the error type would require a deprecation cycle as it breaks backwards compatibility. I would leave it for a follow up PR.

@maxaehle (Contributor, Author)

To be more precise, I meant to catch the RuntimeError raised by the rinvnormest call in cond1est and to replace it with a LinAlgError. This would not break backwards compatibility, as cond1est is a new function?
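A minimal sketch of that suggestion (my own illustration, not code from the PR; rinvnormest is the new function proposed here):

from numpy.linalg import LinAlgError
from scipy.sparse.linalg import onenormest

def cond1est(A):
    try:
        # raises RuntimeError("Factor is exactly singular") for singular A
        rinv = rinvnormest(A)
    except RuntimeError as exc:
        # re-raise with the same error type dense linalg uses
        raise LinAlgError("Singular matrix") from exc
    return onenormest(A) / rinv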

Contributor

Ah right, it's a new function, so we can choose what we want. What I am not sure about is whether we should mix errors from dense and sparse linalg. There is a MatrixRankWarning from sparse, for example. Let me loop in our sparse experts @dschult and @perimosocordiae.

@maxaehle (Contributor, Author)

See #18969 for previous (and not yet completed) discussions on the API.

Also, otvam proposed another implementation on the Scientific Python Discourse forum that does not use SuperLU.

@lucascolley removed the Meson label on Sep 24, 2024.
@dschmitz89 (Contributor) left a comment

Hi @maxaehle, thanks for the PR. Out of curiosity: could you provide a performance comparison between the new function and np.linalg.cond for the same matrix in sparse and dense form?

The refguide and lint failures need to be fixed, but that should not be too difficult. Besides that, I have one remark about required tests (see below).

I cannot review the C code in depth, but I think that a few comments would be useful for long-term maintenance and review. SuperLU is very dense code; any pointers will help.

Comment on lines +121 to +123
if (!CHECK_SLU_TYPE(self->type)) {
PyErr_SetString(PyExc_ValueError, "unsupported data type");
return NULL;
Contributor

We need a test that this error is indeed raised. Same for the other ValueError below.

@maxaehle (Contributor, Author)

Some of the errors might be unreachable because the SuperLU decomposition would fail before the rinvnorm function is called, but I've added a check for the norm argument in d43b143.
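For reference, the requested kind of test could look roughly like this (a sketch only; it assumes rinvnormest grows a norm argument mirroring gscon's "1"/"I" choices, which is not settled):

import numpy as np
import pytest
from scipy.sparse import identity

def test_rinvnormest_invalid_norm():
    A = identity(5, dtype=np.float64, format="csc")
    # norm="2" is not supported by gscon, so a ValueError is expected
    with pytest.raises(ValueError):
        rinvnormest(A, norm="2")  # hypothetical signature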

@ilayn (Member) commented Sep 24, 2024

I think the terminology is drifting a bit here. The condition number with respect to inversion is defined as $\kappa(A) = \lVert A\rVert \cdot \lVert A^{-1}\rVert$, not the ratio of these quantities. And gscon returns $1/\kappa$, hence the name rcondest, the "reciprocal" of the estimate. Thus it is the inverse of cond1est.

So they are inverses of one another. I'm not sure the docstrings are correct in that regard. The onenormest proposed on Discourse is already in sparse.linalg, and another implementation was in dense linalg until recently:

import math
import random

import numpy as np


def _norm1est(A, m=1, t=2, max_iter=5):
    """Compute a lower bound for the 1-norm of 2D matrix A or its powers.

    Computing the 1-norm of the 8th or 10th power of a very large array is a
    very wasteful computation if we explicitly compute the actual power. The
    estimation exploits (in a nutshell) the following:

        (A @ A @ ... @ A) @ <thin array> = (A @ (A @ (... @ (A @ <thin array>))))

    And in fact all the rest is practically Ward's power method with ``t``
    starting vectors, hence, thin array and smarter selection of those vectors.
    Thus at some point ``expm``, which uses this function to scale-square, will
    switch to estimating when ``np.abs(A).sum(axis=0).max()`` becomes slower
    than the estimate (``linalg.norm`` is even slower). Currently the switch
    is chosen to be ``n=400``.

    Parameters
    ----------
    A : ndarray
        Input square array of shape (N, N).
    m : int, optional
        If it is different than one, then the m-th power of the matrix norm is
        computed.
    t : int, optional
        The number of columns of the internal matrix used in the iterations.
    max_iter : int, optional
        The number of total iterations to be performed. Problems that require
        more than 5 iterations are rarely reported in practice.

    Returns
    -------
    c : float
        The resulting 1-norm estimate of A.

    Notes
    -----
    Implements a SciPy adaptation of Algorithm 2.4 of [1], and the original
    Fortran code given in [2].

    The algorithm involves randomized elements and hence, if needed, the seed
    of the Python built-in "random" module can be set for reproducible results.

    References
    ----------
    .. [1] Nicholas J. Higham and Francoise Tisseur (2000), "A Block Algorithm
           for Matrix 1-Norm Estimation, with an Application to 1-Norm
           Pseudospectra." SIAM J. Matrix Anal. Appl. 21(4):1185-1201,
           :doi:`10.1137/S0895479899356080`
    .. [2] Sheung Hun Cheng, Nicholas J. Higham (2001), "Implementation for
           LAPACK of a Block Algorithm for Matrix 1-Norm Estimation",
           NA Report 393
    """
    # We skip the parallel column test for complex inputs
    real_A = np.isrealobj(A)
    n = A.shape[0]
    est_old = 0
    ind_hist = []
    S = np.zeros([n, 2*t], dtype=np.int8 if real_A else A.dtype)
    Y = np.empty([n, t], dtype=A.dtype)
    Y[:, 0] = A.sum(axis=1) / n
    # Higham and Tisseur assign random 1, -1 for initialization but they also
    # mention that it is arbitrary. Hence instead we use e_j to already start
    # the while loop. Also we don't use a temporary X but keep indices instead.
    if t > 1:
        cols = random.sample(population=range(n), k=t-1)
        Y[:, 1:t] = A[:, cols]
        ind_hist += cols

    for k in range(max_iter):
        if m >= 1:
            for _ in range(m-1):
                Y = A @ Y
        Y_sums = (np.abs(Y)).sum(axis=0)
        best_j = np.argmax(Y_sums)
        est = Y_sums[best_j]
        if est <= est_old:  # (1)
            est = est_old
            break
        # else:
        #     w = Y[:, best_j]
        est_old = est
        S[:, :t] = S[:, t:]
        if real_A:
            S[:, t:] = np.signbit(Y)
        else:
            S[:, t:].fill(1)
            mask = Y != 0.
            S[:, t:][mask] = Y[mask] / np.abs(Y[mask])

        if t > 1 and real_A:
            # (2)
            if ((S[:, t:].T @ S[:, :t]).max(axis=1) == n).all() and k > 0:
                break
            else:
                max_spin = math.ceil(n / t)
                for col in range(t):
                    curr_col = t + col
                    n_it = 0
                    while round(np.abs(S[:, col] @ S[:, :curr_col]).max()) == n:
                        S[:, col] = random.choices([1, -1], k=n)
                        n_it += 1
                        if n_it > max_spin:
                            break

        # (3)
        Z = A.conj().T @ S
        if m >= 1:
            for _ in range(m-1):
                Z = A.conj().T @ Z
        Z_sums = (np.abs(Z)).sum(axis=1)
        if np.argmax(Z_sums) == best_j:  # (4)
            break
        h_sorter = np.argsort(Z_sums)
        if all([x in ind_hist for x in h_sorter[:t]]):  # (5)
            break
        else:
            pick = random.choice(range(n))
            for _ in range(t):
                while pick in ind_hist:
                    pick = random.choice(range(n))
                ind_hist += [pick]
            Y = A[:, ind_hist[-t:]]

    # v = np.zeros_like(X[:, 0])  # just some equal size array
    # v[best_j] = 1
    return est  # , v, w
which can be rearranged to use matvecs and rmatvecs only.
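As a quick sanity check of the routine quoted above (my own example; assumes the imports and definition given there), the returned value is a lower bound on the exact 1-norm:

import numpy as np

rng = np.random.default_rng(1234)
B = rng.standard_normal((500, 500))
exact = np.abs(B).sum(axis=0).max()  # exact 1-norm: max absolute column sum
approx = _norm1est(B)                # randomized lower-bound estimate
assert approx <= exact + 1e-10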

@maxaehle (Contributor, Author)

I've added non-zero imaginary parts to the complex64 and complex128 test cases in 7e33770, and since then the complex test cases fail. I need to investigate this further.

@maxaehle (Contributor, Author)

I think the terminology is drifting a bit here.

I'm not sure which changes would be required. The new cond1est computes onenormest(A)/rinvnormest(A), which is $\lVert A\rVert_1$ divided by $1/\lVert A^{-1}\rVert_1$.
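In symbols (my own summary of the quantities in play):

$$\texttt{rinvnormest}(A) \approx \frac{1}{\lVert A^{-1}\rVert_1}, \qquad \texttt{cond1est}(A) = \frac{\texttt{onenormest}(A)}{\texttt{rinvnormest}(A)} \approx \lVert A\rVert_1\,\lVert A^{-1}\rVert_1 = \kappa_1(A).$$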

@maxaehle (Contributor, Author)

Here's a performance plot for 500x500 np.float64 matrices, produced with performance.py.txt on a 4-core system:

[performance plot]

For denser matrices (200x200 np.float64), the performance advantage is not as big:

[performance plot]

@ilayn (Member) commented Sep 26, 2024

I'm not sure which changes would be required. The new cond1est computes onenormest(A)/rinvnormest(A), which is $\lVert A\rVert_1$ divided by $1/\lVert A^{-1}\rVert_1$.

What I mean is that you can get cond1est directly as the reciprocal of the output of gscon.

Here is the docstring of dgscon

DGSCON estimates the reciprocal of the condition number of a general
real matrix A, in either the 1-norm or the infinity-norm, using
the LU factorization computed by DGSTRF.

So where does the inverse come in?

@maxaehle (Contributor, Author) commented Oct 1, 2024

dgscon takes an input argument anorm, which is the norm of the matrix A (not inverted). In our implementation of rinvnormest, we set anorm to 1.0. I think it would be a weird API to have the scipy sparse cond1est function ask for the norm of A as an input argument; it appears cleaner to have cond1est compute the norm of A and multiply by it.
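To make the two usages concrete, a Python-level sketch (gscon below stands in for SuperLU's dgscon, whose actual call happens in C; the wrapper signature is hypothetical):

from scipy.sparse.linalg import splu, onenormest

lu = splu(A)
# dgscon returns rcond = 1 / (anorm * ||A^{-1}||_1)

# (a) this PR: anorm = 1.0 gives 1/||A^{-1}||_1 ...
rinv = gscon(lu, anorm=1.0, norm="1")  # == rinvnormest(A)
# ... and the condition number follows at the Python level:
kappa1 = onenormest(A) / rinv          # == cond1est(A)

# (b) LAPACK-style alternative: pass the actual norm, take the reciprocal
# kappa1 = 1.0 / gscon(lu, anorm=onenormest(A), norm="1")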

@ilayn (Member) commented Oct 1, 2024

Yes, you are right. And that's why, typically, you slap a call to lacon2 before the gscon call to get an estimate (the un-SAVEd version of lacon). Basically SuperLU is following the LAPACK model of getting the condition number, with the difference that lacon2 only needs rmatvec/matvec operations for the estimate.

@maxaehle (Contributor, Author) commented Oct 1, 2024

Interesting, I wasn't aware of lacon2 and currently use SciPy's own onenormest. Are you suggesting one of these:

  • wrap SuperLU's lacon2 for Python and use it in cond1est (and potentially make it part of the public API),
  • call lacon2 in the C++ code and offer cond1est directly instead of rinvnormest,
  • or something else,

or are you ok with the current version of the API?

@ilayn (Member) commented Oct 1, 2024

Yes, the story is a bit convoluted. The one-norm estimation originally started from the needs of the matrix exponential. When the original expm was written, it was written for both the sparse and dense versions. In that algorithm you need to investigate the one-norm of an array repeatedly, hence for very large arrays 1-norm estimation (instead of calculating it exactly) becomes very important. Then we rewrote the dense version independently and broke that relationship between the sparse and dense expm.

Since you are wrapping SuperLU within C, I think the lacon2 way is more ergonomic for your current way of doing it, that is to say, just call lacon2 inside the function. A similar thing is also happening in the dense expm, but obviously here I use gemv since the dense array is available:

double
dnorm1est(double* A, int n)
{
    int from = 2*n, to = n, kase = 0, tempint, int1 = 1;
    int isave[3];
    double est, dbl1 = 1.0, dbl0 = 0.0;
    char* opA;
    double* work_arr = PyMem_RawMalloc(3*n*sizeof(double));
    if (!work_arr) { return -100; }
    int* iwork_arr = PyMem_RawMalloc(n*sizeof(int));
    if (!iwork_arr) { PyMem_RawFree(work_arr); return -101; }

    // 1-norm estimator by reverse communication
    // dlacon( n, v, x, isgn, est, kase )
    dlacn2_(&n, work_arr, &work_arr[n], iwork_arr, &est, &kase, isave);
    while (kase)
    {
        // select transpose/no-transpose as requested by reverse communication
        opA = (kase == 1 ? "T" : "N");
        // ping-pong between the workspace segments at offsets n and 2n
        tempint = from;
        from = to;
        to = tempint;
        dgemv_(opA, &n, &n, &dbl1, A, &n, &work_arr[from], &int1, &dbl0, &work_arr[to], &int1);
        dlacn2_(&n, work_arr, &work_arr[to], iwork_arr, &est, &kase, isave);
    }
    PyMem_RawFree(work_arr);
    PyMem_RawFree(iwork_arr);
    return est;
}

But I don't know if you have access to matvec/rmatvec methods at the C level. If that is not a problem, then option 2 is indeed much simpler in my opinion.

For option 1, I think onenormest is good enough for folks, historically speaking. But as you rightly pointed out, passing the norm value back to the C level seems a bit awkward in terms of public API. So I agree that option is a bit limping.

Labels: C/C++, enhancement, scipy.sparse.linalg

Successfully merging this pull request may close these issues: ENH: sparse.linalg: estimate condition number in SuperLU

4 participants