BUG: test failure in `test_x0_equals_Mb` with `bicgstab` #15533

h-vetinari · 2022-02-06T11:49:18Z

In conda-forge, we ran into new test failure for scipy 1.8, which appears only with MKL (which is available in cf only for x86), and only if the processor supports AVX512 (which azure CI only has for linux & windows), see conda-forge/scipy-feedstock#199.

To not further delay the release of 1.8, we skipped that test for now, but it should IMO be fixed, especially as such processors are becoming more and more common (in fact, it's getting harder and harder to purposefully catch a non-AVX512 windows CI agent on azure)

The failure is in test_x0_equals_Mb[bicgstab] and looks as follows:

=================================== FAILURES ===================================
_________________________ test_x0_equals_Mb[bicgstab] __________________________
[...]/lib/python3.8/site-packages/scipy/sparse/linalg/_isolve/tests/test_iterative.py:538: in test_x0_equals_Mb
    assert_equal(info, 0)
E   AssertionError: 
E   Items are not equal:
E    ACTUAL: -11
E    DESIRED: 0
        A          = <10x10 sparse matrix of type '<class 'numpy.complex64'>'
	with 19 stored elements in Compressed Sparse Row format>
        b          = array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
        case       = <nonsymposdef>
        info       = -11
        solver     = <function bicgstab at 0x7fc6545a4ee0>
        sup        = <numpy.testing._private.utils.suppress_warnings object at 0x7fc62555d040>
        tol        = 1e-08
        x          = array([0.        +0.j, 0.50000358+0.j, 1.25016941+0.j, 2.12653797+0.j,
       3.0674186 +0.j, 4.03716976+0.j, 5.0158146 +0.j, 6.00288608+0.j,
       7.        +0.j, 8.        +0.j])
        x0         = 'Mb'

Also, at the end of the test suite, some (seemingly delayed) log output appears that is perhaps relevant:


   Normal return from subroutine COBYLA

   NFVALS =   50   F = 2.485185E+01    MAXCV = 1.999965E-10
   X = 4.955358E+00   6.666553E-01

 NNLS quitting on iteration count.

The text was updated successfully, but these errors were encountered:

h-vetinari · 2022-02-06T12:13:07Z

Note: it's possible that this is an MKL bug. In that case. If people think that's the most realistic (and/or there's no way to workaround), then we can raise it with Intel some way, e.g. at https://github.com/conda-forge/intel_repack-feedstock/

h-vetinari · 2022-05-24T14:18:46Z

I think this might actually be showing up in the doctests... I had a failed CI run with

scipy.sparse.linalg.bicgstab
----------------------------

File "build/testenv/lib/python3.9/site-packages/scipy/sparse/linalg/_isolve/iterative.py", line 62, in bicgstab
Failed example:
    np.allclose(A.dot(x), b)
Expected:
    True
Got:
    False

which happens in the same bicgstab method, and could be flaky based on whether the assigned agent supports AVX512 or not.

h-vetinari · 2022-05-24T14:22:14Z

~~This should be remilestoned as 1.8.1 has been released...~~

Forgot that I can do that 😅
Done...

WarrenWeckesser · 2022-05-25T18:33:14Z

I think this might actually be showing up in the doctests...

@h-vetinari, that sporadic doctest failure should be fixed by #16276

h-vetinari · 2022-05-25T21:15:49Z

Ah, thanks a lot for tracking that down @WarrenWeckesser! Guess I was just fooled by seeing the failure affect the same exact function.

rgommers · 2022-07-05T08:57:15Z

No movement and not a blocker for 1.9.0, so bumping the milestone.

h-vetinari · 2023-01-27T12:48:51Z

I bumped the milestone for now, but I'd be interested in fixing this (because it's a persistent failure in conda-forge for MKL). The problem is that I just don't know that code at all, so I'd probably need some pointers.

I did have a look now nevertheless, and the error code -11 stands for:

*                BREAKDOWN: If parameters RHO or OMEGA become smaller
*                   than some tolerance, the program will terminate.
*                   Here we check against tolerance BREAKTOL.
*
*                  -10: RHO  .ls.  BREAKTOL: RHO and RTLD have become 
*                                       orthogonal.
*                  -11: OMEGA  .ls.  BREAKTOL: S and T have become 
*                                         orthogonal relative to T'*T.

ilayn · 2023-01-27T14:41:55Z

This is also interesting from my side since I also had some issues with this on OpenBLAS side. Is it possible to have the CPU type exposed to see whether they are Tiger Lake?

h-vetinari · 2023-01-27T15:40:37Z

Is it possible to have the CPU type exposed to see whether they are Tiger Lake?

Not quite as detailed, but we have the following that numpy picks up:

Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2,AVX512F,AVX512CD,AVX512_SKX
    not found = AVX512_KNL,AVX512_KNM,AVX512_CLX,AVX512_CNL,AVX512_ICL

In contrast, it passes when it's on an agent with the following:

Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
    not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL

I'm not sure how many types of agents there are on azure, but from the results it mostly looks like 2. I can also add some debug output to determine the processor type if that's desired/helpful.

tylerjereddy · 2023-05-29T01:54:05Z

Bumping milestone, branching is imminent (May 29th), and linalg-backend specific test failures are nothing new of course. If a fix is merged before I branch, feel free to adjust back.

ilayn · 2023-05-29T20:16:38Z

This is going to be fixed together with #18488 so 1.12 is fine. I'll make a round the issues that PR closes later.

ilayn · 2023-05-29T20:18:31Z

About the error, this is a breakdown case where the algorithm hits a point that can't proceed further due to numerical problems. But because the tests are/were too tight it was sporadic. Now all tests pass.

h-vetinari · 2023-05-29T21:56:15Z

Amazing news, thanks a lot!

h-vetinari · 2024-04-14T11:14:40Z

Essentially unchanged for 1.13.0:

	1.11.0	1.11.4	1.12.0	1.13.0
linux + x64 + blis + avx2	✔️	✔️	✔️	✔️
linux + x64 + blis + avx512	✔️	✔️	✔️	✔️
linux + x64 + mkl + avx2	✔️	✔️	❌	❌
linux + x64 + mkl + avx512	❌	❌	✔️	✔️
linux + x64 + netlib + avx2	✔️	✔️	✔️	✔️
linux + x64 + netlib + avx512	✔️	✔️	✔️	✔️
linux + x64 + openblas + avx2	✔️	✔️	✔️	✔️
linux + x64 + openblas + avx512	✔️	❌	✔️	✔️
linux + aarch64 + netlib	✔️	✔️	✔️	✔️
linux + aarch64 + openblas	✔️	✔️	❌	❌
osx + x64 + accelerate	✔️	✔️	✔️/❌	✔️
win + x64 + blis + avx2	✔️	❌	✔️	✔️
win + x64 + blis + avx512	✔️	❌	✔️	✔️
win + x64 + mkl + avx2	✔️	✔️	❌	❌
win + x64 + mkl + avx512	❌	✔️	✔️	✔️
win + x64 + openblas + avx2	✔️	✔️	✔️	✔️
win + x64 + openblas + avx512	✔️	✔️	✔️	✔️

ilayn · 2024-04-14T11:20:41Z

@h-vetinari Do you have any bandwidth or tools to play with the failing test and find anything that passes? The test numbers are really not important but the mechanism we are testing here.

h-vetinari · 2024-04-14T12:01:56Z

@h-vetinari Do you have any bandwidth or tools to play with the failing test and find anything that passes? The test numbers are really not important but the mechanism we are testing here.

If you could provide me with a patch that adds a couple of relevant debug-statements, I'm happy to throw it at our infrastructure and post the results. Would that help?

ilayn · 2024-04-14T12:17:37Z

Yes, give me some time and I'll provide something concrete.

h-vetinari · 2024-05-02T04:54:48Z

Gentle ping on some print statements that would help debug this. :)

h-vetinari · 2024-05-23T04:32:37Z

@ilayn, I'll fire up the BLAS testing again after 1.13.1 - do you think I could get some of those debug prints that would allow debugging this? Doesn't have to be perfect or complete, we can iterate...

h-vetinari · 2024-06-15T00:48:08Z

@ilayn, this is basically the last widely occurring issue in conda-forge across all platforms and BLAS flavours. I'd be very happy if we could debug it together. 🙃 Are you still willing/able to provide some basic debug statements so we can start digging?

ilayn · 2024-06-15T05:31:27Z

Yes so sorry about this. I ghosted more than once and I truly apologize. I guess something in my head is resisting to look at this code again somehow. I'll do it today.

ilayn · 2024-06-15T06:15:13Z

Is there any possibility to inject this instead of the current code? I basically sprinkled some print values to check where it goes awry

def bicgstab(A, b, x0=None, *, rtol=1e-5, atol=0., maxiter=None, M=None,
             callback=None):
    """Use BIConjugate Gradient STABilized iteration to solve ``Ax = b``.

    Parameters
    ----------
    A : {sparse matrix, ndarray, LinearOperator}
        The real or complex N-by-N matrix of the linear system.
        Alternatively, ``A`` can be a linear operator which can
        produce ``Ax`` and ``A^T x`` using, e.g.,
        ``scipy.sparse.linalg.LinearOperator``.
    b : ndarray
        Right hand side of the linear system. Has shape (N,) or (N,1).
    x0 : ndarray
        Starting guess for the solution.
    rtol, atol : float, optional
        Parameters for the convergence test. For convergence,
        ``norm(b - A @ x) <= max(rtol*norm(b), atol)`` should be satisfied.
        The default is ``atol=0.`` and ``rtol=1e-5``.
    maxiter : integer
        Maximum number of iterations.  Iteration will stop after maxiter
        steps even if the specified tolerance has not been achieved.
    M : {sparse matrix, ndarray, LinearOperator}
        Preconditioner for A.  The preconditioner should approximate the
        inverse of A.  Effective preconditioning dramatically improves the
        rate of convergence, which implies that fewer iterations are needed
        to reach a given error tolerance.
    callback : function
        User-supplied function to call after each iteration.  It is called
        as callback(xk), where xk is the current solution vector.

    Returns
    -------
    x : ndarray
        The converged solution.
    info : integer
        Provides convergence information:
            0  : successful exit
            >0 : convergence to tolerance not achieved, number of iterations
            <0 : parameter breakdown

    Examples
    --------
    >>> import numpy as np
    >>> from scipy.sparse import csc_matrix
    >>> from scipy.sparse.linalg import bicgstab
    >>> R = np.array([[4, 2, 0, 1],
    ...               [3, 0, 0, 2],
    ...               [0, 1, 1, 1],
    ...               [0, 2, 1, 0]])
    >>> A = csc_matrix(R)
    >>> b = np.array([-1, -0.5, -1, 2])
    >>> x, exit_code = bicgstab(A, b, atol=1e-5)
    >>> print(exit_code)  # 0 indicates successful convergence
    0
    >>> np.allclose(A.dot(x), b)
    True

    """
    A, M, x, b, postprocess = make_system(A, M, x0, b)
    bnrm2 = np.linalg.norm(b)
    print(f"Starting bnrm2 : {bnrm2}")
    atol, _ = _get_atol_rtol('bicgstab', bnrm2, atol, rtol)
    print(f"Starting atol: {atol}")
    if bnrm2 == 0:
        return postprocess(b), 0

    n = len(b)

    dotprod = np.vdot if np.iscomplexobj(x) else np.dot

    if maxiter is None:
        maxiter = n*10

    matvec = A.matvec
    psolve = M.matvec

    # These values make no sense but coming from original Fortran code
    # sqrt might have been meant instead.
    rhotol = np.finfo(x.dtype.char).eps**2
    omegatol = rhotol
    print(f"Starting paramtols rho, omega : {rhotol} / {omegatol}")
    # Dummy values to initialize vars, silence linter warnings
    rho_prev, omega, alpha, p, v = None, None, None, None, None

    r = b - matvec(x) if x.any() else b.copy()
    rtilde = r.copy()
    print(f"Starting residual: {r}")
    for iteration in range(maxiter):
        print(f"=============== Iteration {iteration} ========")
        if np.linalg.norm(r) < atol:  # Are we done?
            return postprocess(x), 0

        rho = dotprod(rtilde, r)
        print(f"Current rho: {rho}")
        if np.abs(rho) < rhotol:  # rho breakdown
            return postprocess(x), -10

        if iteration > 0:
            print(f"Omega check : {np.abs(omega)} < {omegatol}")
            if np.abs(omega) < omegatol:  # omega breakdown
                return postprocess(x), -11

            beta = (rho / rho_prev) * (alpha / omega)
            print(f"Current beta: {beta}")
            p -= omega*v
            p *= beta
            p += r
        else:  # First spin
            s = np.empty_like(r)
            p = r.copy()

        phat = psolve(p)
        v = matvec(phat)
        rv = dotprod(rtilde, v)
        print(f"Current rv : {rv}")
        if rv == 0:
            return postprocess(x), -11
        alpha = rho / rv
        print(f"Current alpha : {alpha}")
        r -= alpha*v
        s[:] = r[:]

        if np.linalg.norm(s) < atol:
            x += alpha*phat
            return postprocess(x), 0

        shat = psolve(s)
        t = matvec(shat)
        omega = dotprod(t, s) / dotprod(t, t)
        x += alpha*phat
        x += omega*shat
        r -= omega*t
        rho_prev = rho

        if callback:
            callback(x)

    else:  # for loop exhausted
        # Return incomplete progress
        return postprocess(x), maxiter

Somehow we have to inject -- --capture=sys to the test execution to get the results though. Not sure if you have that capacity. Conda is a bit of a mystery to me. But if you have a local machine to replicate then it would be awesome.

h-vetinari · 2024-06-16T04:17:09Z

Thank you! :)

Here's the output from the failing test (on CI agents without AVX512F/AVX512CD instructions, i.e. "only" AVX2) - identical on both linux & win:

----------------------------- Captured stdout call -----------------------------
Starting bnrm2 : 16.881943016134134
Starting atol: 1e-08
Starting paramtols rho, omega : 4.930380657631324e-32 / 4.930380657631324e-32
Starting residual: [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
=============== Iteration 0 ========
Current rho: 9.0
Current rv : 10.0
Current alpha : 0.9
=============== Iteration 1 ========
Current rho: -0.0399449035812673
Omega check : 0.3994490358126722 < 4.930380657631324e-32
Current beta: -0.010000000000000023
Current rv : -0.03595041322314066
Current alpha : 1.1111111111111085
=============== Iteration 2 ========
Current rho: -2.7755575615628914e-17
Omega check : 0.3825675630233262 < 4.930380657631324e-32
Current beta: 2.0180792053039126e-15
Current rv : 0.0

This then somehow leads to info = -11, c.f.

__________________ test_x0_equals_Mb[nonsymposdef-bicgstab] ___________________
[gw1] win32 -- Python 3.12.3 %PREFIX%\python.exe
..\_test_env\Lib\site-packages\scipy\sparse\linalg\_isolve\tests\test_iterative.py:520: in test_x0_equals_Mb
    assert info == 0
E   assert -11 == 0
        A          = <10x10 sparse matrix of type '<class 'numpy.float64'>'
	with 19 stored elements in Compressed Sparse Row format>
        b          = array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
        case       = <nonsymposdef-bicgstab>
        info       = -11
        rtol       = 1e-08
        x          = array([0.        , 0.52361589, 1.32356787, 2.15281624, 3.        ,
       4.        , 5.        , 6.        , 7.        , 8.        ])
        x0         = 'Mb'

ilayn · 2024-06-16T07:39:42Z

OK on my machine which always gets a pass on this test, I got

Starting bnrm2 : 16.881943016134134
Starting atol: 0.00016881943016134135
Starting paramtols rho, omega : 4.930380657631324e-32 / 4.930380657631324e-32
Starting residual: [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
=============== Iteration 0 ========
Current rho: 9.0
Current rv : 10.0
Current alpha : 0.9
=============== Iteration 1 ========
Current rho: -0.0399449035812674
Omega check : 0.3994490358126724 < 4.930380657631324e-32
Current beta: -0.010000000000000042
Current rv : -0.035950413223140916
Current alpha : 1.1111111111111032
=============== Iteration 2 ========
Current rho: -2.023390733630768e-16
Omega check : 0.3825675630233261 < 4.930380657631324e-32
Current beta: 1.4711864816975902e-14
Current rv : -4.96177387315249e-16
Current alpha : 0.40779583781096335
=============== Iteration 3 ========
Current rho: 1.9545000710325743e-17
Omega check : 0.4502470191790134 < 4.930380657631324e-32
Current beta: -0.08748787666655233
Current rv : 5.1525628801392976e-17
Current alpha : 0.3793258066905406
=============== Iteration 4 ========
Current rho: -2.512269613955742e-18
Omega check : 0.5300027929977077 < 4.930380657631324e-32
Current beta: -0.09199511928154214
Current rv : -2.6020852139652106e-18
Current alpha : 0.9654832210999723
=============== Iteration 5 ========
Current rho: 0.002706428175909658
Omega check : 0.49693326579407965 < 4.930380657631324e-32
Current beta: -2093037073226866.5
Current rv : 5867170329227.434
Current alpha : 4.612833826261236e-16
=============== Iteration 6 ========
Current rho: 0.0016019143903419333
Omega check : 0.571462224373975 < 4.930380657631324e-32
Current beta: 4.77774623675576e-16
Current rv : -0.00038850060507100736
Current alpha : -4.123325342180991
=============== Iteration 7 ========
Current rho: 0.002070293928641972
Omega check : 0.4902358823313919 < 4.930380657631324e-32
Current beta: -10.870141892415262
Current rv : -0.0004441575982355629
Current alpha : -4.6611696768586475
=============== Iteration 8 ========
Current rho: 0.0023866725790334187
Omega check : 0.47011447528419964 < 4.930380657631324e-32
Current beta: -11.430155103786058
Current rv : 1.2440308056737618e-15
Current alpha : 1918499580676.2253

In your case a small miracle happens and we hit rv exactly 0.0 which is quite suspicious for a dot product to be. Then quits because we will divide things with rv. That's why it is giving -11.

If I may, can I ask also to add the following lines just above the print(f"Current rv : {rv}") line

print("p", p)
print("phat", phat)
print("v", v)
print("rtilde", rtilde)

My suspicion is that one of these arrays become exactly 0.0

ilayn · 2024-06-16T07:59:54Z

The numbers are very funky on this one anyways. It has passed already in sufficient times in sufficient machines and we did not hear any screams about the rewrites so far.

So if need be we can add bicgstab to lists on lines 174 and 176 for this specific problem and have some peace of mind. If you want to remove the manual skip on your side.

h-vetinari · 2024-06-16T08:02:29Z

If I may, can I ask also to add the following lines just above the print(f"Current rv : {rv}") line

I restarted a new build with that.

The numbers are very funky on this one anyways. It has passed already in sufficient times in sufficient machines and we did not hear any screams about the rewrites so far.

This issue predates the rewrite, so it appears the rewrite was too good in that it kept bug-for-bug compatibility with the Fortran implementation, at least in this regard. ;-)

So if need be we can add bicgstab to lists on lines 174 and 176 for this specific problem and have some peace of mind. If you want to remove the manual skip on your side.

Can you clarify which file you're talking about?

ilayn · 2024-06-16T08:13:18Z

Can you clarify which file you're talking about?

Ah it is the test_iterative.py the test file itself has a way to turn on and off solvers for each case at the top. The lines 174-176 can have this solver to skip it for this particular case.

h-vetinari · 2024-06-16T10:05:50Z

With updated logging statements:

Starting bnrm2 : 16.881943016134134
Starting atol: 1.6881943016134135e-07
Starting paramtols rho, omega : 4.930380657631324e-32 / 4.930380657631324e-32
Starting residual: [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
=============== Iteration 0 ========
Current rho: 9.0
p [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
phat [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
v [ 0. -2. -1. -1. -1. -1. -1. -1. -1. -1.]
rtilde [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
Current rv : 10.0
Current alpha : 0.9
=============== Iteration 1 ========
Current rho: -0.0399449035812673
Omega check : 0.3994490358126722 < 4.930380657631324e-32
Current beta: -0.010000000000000023
p [ 0.          0.16289256  0.30545455 -0.05404959 -0.05404959 -0.05404959
 -0.05404959 -0.05404959 -0.05404959 -0.05404959]
phat [ 0.          0.16289256  0.30545455 -0.05404959 -0.05404959 -0.05404959
 -0.05404959 -0.05404959 -0.05404959 -0.05404959]
v [ 0.          0.32578512  0.44801653 -0.41355372 -0.05404959 -0.05404959
 -0.05404959 -0.05404959 -0.05404959 -0.05404959]
rtilde [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
Current rv : -0.03595041322314066
Current alpha : 1.1111111111111085
=============== Iteration 2 ========
Current rho: -2.7755575615628914e-17
Omega check : 0.3825675630233262 < 4.930380657631324e-32
Current beta: 2.0180792053039126e-15
p [ 0.00000000e+00 -4.72317791e-02 -1.23519843e-01  1.79353779e-02
  1.52816244e-01 -1.61601835e-16 -1.61601835e-16 -1.61601835e-16
 -1.61601835e-16 -1.61601835e-16]
phat [ 0.00000000e+00 -4.72317791e-02 -1.23519843e-01  1.79353779e-02
  1.52816244e-01 -1.61601835e-16 -1.61601835e-16 -1.61601835e-16
 -1.61601835e-16 -1.61601835e-16]
v [ 0.00000000e+00 -9.44635581e-02 -1.99807907e-01  1.59390599e-01
  2.87697110e-01 -1.52816244e-01 -1.61601835e-16 -1.61601835e-16
 -1.61601835e-16 -1.61601835e-16]
rtilde [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
Current rv : 0.0

ilayn · 2024-06-17T06:37:24Z

Since rv is the dot product of rtilde and v I can't understand how rv can get to 0.0. But maybe it is the truncated precision that numpy does not print.

This seems it boils down to a dot product difference between architectures because this is what I get

=============== Iteration 2 ========
Current rho: -2.023390733630768e-16
Omega check : 0.3825675630233261 < 4.930380657631324e-32
Current beta: 1.4711864816975902e-14
p [ 0.00000000e+00 -4.72317791e-02 -1.23519843e-01  1.79353779e-02
  1.52816244e-01 -7.78011861e-16 -7.78011861e-16 -7.78011861e-16
 -7.78011861e-16 -7.78011861e-16]
phat [ 0.00000000e+00 -4.72317791e-02 -1.23519843e-01  1.79353779e-02
  1.52816244e-01 -7.78011861e-16 -7.78011861e-16 -7.78011861e-16
 -7.78011861e-16 -7.78011861e-16]
v [ 0.00000000e+00 -9.44635581e-02 -1.99807907e-01  1.59390599e-01
  2.87697110e-01 -1.52816244e-01 -7.78011861e-16 -7.78011861e-16
 -7.78011861e-16 -7.78011861e-16]
rtilde [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
Current rv : -4.96177387315249e-16
Current alpha : 0.40779583781096335

So that rv is pretty much zero anyways so it is an accident that things go through on other architectures. Since the array is as well-conditioned (a bidiagonal array with positive diagonal) as it can be, I'll open a PR to skip this solver in that particular test and hopefully we can leave this behind 😄

h-vetinari · 2024-06-17T07:35:43Z

Since rv is the dot product of rtilde and v I can't understand how rv can get to 0.0

Looks like lots and lots of cancellation. Perhaps it really is perfectly orthogonal in this very well-conditioned example. But that makes me kind of wary to skip this test. Even if rv == 0 is incredibly lucky, it seems like a valid condition to hit, so we might want to change:

scipy/scipy/sparse/linalg/_isolve/iterative.py

Lines 256 to 259 in e984c95

    
           rv = dotprod(rtilde, v) 
        
           if rv == 0: 
        
               return postprocess(x), -11 
        
           alpha = rho / rv

to do something reasonable if other quantities (rho?) are small. Especially since the rho-condition sounds completely unattainable for me, as we're taking eps ** 2, which is tiiiiiiiny:

scipy/scipy/sparse/linalg/_isolve/iterative.py

Lines 223 to 226 in e984c95

    
           # These values make no sense but coming from original Fortran code 
        
           # sqrt might have been meant instead. 
        
           rhotol = np.finfo(x.dtype.char).eps**2 
        
           omegatol = rhotol

Since the array is as well-conditioned (a bidiagonal array with positive diagonal) as it can be, [...]

My conclusion is the opposite. Such a well-behaved example should not fail to converge, especially for an algorithm that advertises itself as stabilized.

ilayn · 2024-06-17T08:02:49Z

Yes but this method is known to breakdown in many cases for example relatively large imaginary parts compared to real parts or problems where search direction and rtilde comes very close.

Another type of system from the literature is

which is quite close to our example.

Because we don't use the randomized rtilde to start but use actual r this is sometimes unavoidable. It is still better than bicg but as you can see the number of solvers in sparse.linalg it is not the end of the story.

h-vetinari · 2024-07-08T07:13:39Z

Closing this as fixed by #20973

h-vetinari added the defect A clear bug or issue that prevents SciPy from being installed or used as expected label Feb 6, 2022

h-vetinari mentioned this issue Feb 6, 2022

TEST: 1.8.x + blas variants conda-forge/scipy-feedstock#199

Closed

tylerjereddy added the scipy.sparse.linalg label Feb 6, 2022

tylerjereddy added this to the 1.8.1 milestone Feb 6, 2022

h-vetinari mentioned this issue May 24, 2022

DEP: Deprecation follow-ups #16257

Merged

h-vetinari modified the milestones: 1.8.1, 1.9.0 May 24, 2022

h-vetinari mentioned this issue May 25, 2022

DOC: remove deprecated functions from refguide #16273

Merged

rgommers modified the milestones: 1.9.0, 1.10.0 Jul 5, 2022

h-vetinari mentioned this issue Jan 27, 2023

TEST: 1.9.x + blas variants conda-forge/scipy-feedstock#223

Closed

h-vetinari modified the milestones: 1.10.0, 1.11.0 Jan 27, 2023

h-vetinari mentioned this issue Jan 27, 2023

TEST: 1.10.x + blas variants conda-forge/scipy-feedstock#224

Closed

tylerjereddy modified the milestones: 1.11.0, 1.12.0 May 29, 2023

ilayn mentioned this issue May 29, 2023

MAINT:ENH:sparse.linalg: Rewrite iterative solvers in Python, remove FORTRAN code #18488

Merged

lucascolley modified the milestones: 1.13.0, 1.13.1 Apr 14, 2024

h-vetinari mentioned this issue Apr 14, 2024

BUG: Test failures from BLAS-flavour testing for 1.13 in conda-forge #20472

Closed

cbouss mentioned this issue Apr 23, 2024

update to 1.13.0 AnacondaRecipes/scipy-feedstock#33

Merged

h-vetinari modified the milestones: 1.13.1, 1.14.0, 1.15.0 May 23, 2024

ilayn mentioned this issue Jun 17, 2024

TST:sparse.linalg: Skip test due to sensitivity to numerical noise #20973

Merged

h-vetinari mentioned this issue Jun 25, 2024

TEST: 1.14.x + blas variants conda-forge/scipy-feedstock#280

Draft

h-vetinari closed this as completed Jul 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: test failure in `test_x0_equals_Mb` with `bicgstab` #15533

BUG: test failure in `test_x0_equals_Mb` with `bicgstab` #15533

h-vetinari commented Feb 6, 2022 •

edited

Loading

h-vetinari commented Feb 6, 2022 •

edited

Loading

h-vetinari commented May 24, 2022

h-vetinari commented May 24, 2022 •

edited

Loading

WarrenWeckesser commented May 25, 2022

h-vetinari commented May 25, 2022

rgommers commented Jul 5, 2022

h-vetinari commented Jan 27, 2023

ilayn commented Jan 27, 2023

h-vetinari commented Jan 27, 2023

tylerjereddy commented May 29, 2023

ilayn commented May 29, 2023

ilayn commented May 29, 2023

h-vetinari commented May 29, 2023

h-vetinari commented Apr 14, 2024

ilayn commented Apr 14, 2024

h-vetinari commented Apr 14, 2024

ilayn commented Apr 14, 2024

h-vetinari commented May 2, 2024

h-vetinari commented May 23, 2024

h-vetinari commented Jun 15, 2024

ilayn commented Jun 15, 2024

ilayn commented Jun 15, 2024 •

edited

Loading

h-vetinari commented Jun 16, 2024

ilayn commented Jun 16, 2024

ilayn commented Jun 16, 2024

h-vetinari commented Jun 16, 2024

ilayn commented Jun 16, 2024

h-vetinari commented Jun 16, 2024

ilayn commented Jun 17, 2024

h-vetinari commented Jun 17, 2024 •

edited

Loading

ilayn commented Jun 17, 2024

h-vetinari commented Jul 8, 2024

BUG: test failure in test_x0_equals_Mb with bicgstab #15533

BUG: test failure in test_x0_equals_Mb with bicgstab #15533

Comments

h-vetinari commented Feb 6, 2022 • edited Loading

h-vetinari commented Feb 6, 2022 • edited Loading

h-vetinari commented May 24, 2022

h-vetinari commented May 24, 2022 • edited Loading

WarrenWeckesser commented May 25, 2022

h-vetinari commented May 25, 2022

rgommers commented Jul 5, 2022

h-vetinari commented Jan 27, 2023

ilayn commented Jan 27, 2023

h-vetinari commented Jan 27, 2023

tylerjereddy commented May 29, 2023

ilayn commented May 29, 2023

ilayn commented May 29, 2023

h-vetinari commented May 29, 2023

h-vetinari commented Apr 14, 2024

ilayn commented Apr 14, 2024

h-vetinari commented Apr 14, 2024

ilayn commented Apr 14, 2024

h-vetinari commented May 2, 2024

h-vetinari commented May 23, 2024

h-vetinari commented Jun 15, 2024

ilayn commented Jun 15, 2024

ilayn commented Jun 15, 2024 • edited Loading

h-vetinari commented Jun 16, 2024

ilayn commented Jun 16, 2024

ilayn commented Jun 16, 2024

h-vetinari commented Jun 16, 2024

ilayn commented Jun 16, 2024

h-vetinari commented Jun 16, 2024

ilayn commented Jun 17, 2024

h-vetinari commented Jun 17, 2024 • edited Loading

ilayn commented Jun 17, 2024

h-vetinari commented Jul 8, 2024

BUG: test failure in `test_x0_equals_Mb` with `bicgstab` #15533

BUG: test failure in `test_x0_equals_Mb` with `bicgstab` #15533

h-vetinari commented Feb 6, 2022 •

edited

Loading

h-vetinari commented Feb 6, 2022 •

edited

Loading

h-vetinari commented May 24, 2022 •

edited

Loading

ilayn commented Jun 15, 2024 •

edited

Loading

h-vetinari commented Jun 17, 2024 •

edited

Loading