Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: sparse.linalg: Reimplement Krylov iterative methods #13638

Closed
wants to merge 3 commits into from
Closed

ENH: sparse.linalg: Reimplement Krylov iterative methods #13638

wants to merge 3 commits into from

Conversation

zhaog6
Copy link
Contributor

@zhaog6 zhaog6 commented Mar 2, 2021

Reimplemented Krylov iterative methods to improve the speed of the solution and added other Krylov methods such as improved-CGS, CGNR, and CGNE

Reference issue

Y. Saad. Iterative Methods for Sparse Linear Systems, PWS Publishing Company, Boston, MA, 1996; reprinted, SIAM, Philadelphia, 2003.

What does this implement/fix?

Reimplemented Krylov iterative methods (such as CG, CGS, BiCG, BICGSTAB and GMRES) and obtained a better performance (CPU time of all methods is nearly half of the original code implemented by scipy/sparse/linalg/isolve/iterative.py). Furthermore, this implementation also added other Krylov methods (such as improved-CGS, CGNR and CGNE)

Additional information

@tylerjereddy tylerjereddy added scipy.sparse.linalg enhancement A new feature or improvement labels Mar 3, 2021
restrt=None, atol=None, callback_type=None):
"""
Use Generalized Minimal RESidual iteration to solve ``Ax = b``.
def gmres(A, b, x0=None, restart=None, tol=1e-5, maxit=10000, dtype=None, sparams=None, M=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't gmres() a public SciPy function? I think we usually try to remain backwards compatible and avoid things like changing maxiter parameter name to maxit and removing an argument completely as was done for atol here.

Maybe a private function or wrapper is needed? I'm not a domain expert though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am trying to add new gmres() implementation to replace original implementation (CPU time have an improvement). It works well for Poisson equations and convection-diffusion equations in 2D on my server, but it reports error when I submit it to community. As you suggested, I should use maxiter rather than maxit, and atol also should be added. Thanks.

@ilayn
Copy link
Member

ilayn commented Mar 4, 2021

@zhaog6 You can also test these locally via python runtests.py -s sparse that would ease off the CI load.

No complaining just reminding 😃

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 4, 2021

@zhaog6 You can also test these locally via python runtests.py -s sparse that would ease off the CI load.

No complaining just reminding 😃

Thanks,

@zhaog6 You can also test these locally via python runtests.py -s sparse that would ease off the CI load.

No complaining just reminding 😃

Thanks for prompt reminding, I'm the github and scipy package newer and did't noticed the disturb of the emails and the CI load, sorry. I has found corresponding test functions file in tests/. I would write a test file and complete these tests locally. Thanks

@ilayn
Copy link
Member

ilayn commented Mar 4, 2021

No problem at all

@zoj613
Copy link
Contributor

zoj613 commented Mar 15, 2021

@zhaog6 I think this needs a rebase to fix the circleci failure due to missing pythran installation.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 16, 2021

@zhaog6 I think this needs a rebase to fix the circleci failure due to missing pythran installation.

Thank you so much, I'll try it according to your suggestion

@zhaog6 zhaog6 closed this Mar 16, 2021
@zhaog6 zhaog6 reopened this Mar 16, 2021
@zoj613
Copy link
Contributor

zoj613 commented Mar 16, 2021

it seems like these failures have nothing to do with your PR. @mdhaber what do you think?

@mdhaber
Copy link
Contributor

mdhaber commented Mar 16, 2021

I don't see the connection, so I'm re-running them.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 17, 2021

it seems like these failures have nothing to do with your PR. @mdhaber what do you think?

Thanks, I'm going to try submitting it again today

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 22, 2021

Dear @zoj613 @mdhaber

I have been resolved the conflicts with the base branch and push new commit, whereas it is strange that two checks were not successful. Could you check and test it again for me in github. Thanks a lot.

@zoj613
Copy link
Contributor

zoj613 commented Mar 23, 2021

@zhaog6 it seems like all tests pass. Could you add post benchmarks that show the performance difference between this PR and the main branch? I think it would help get more attention from maintainers and others to review the changes.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 24, 2021

@zhaog6 it seems like all tests pass. Could you add post benchmarks that show the performance difference between this PR and the main branch? I think it would help get more attention from maintainers and others to review the changes.

@zoj613 OK, I'll try to add benchmarks in sparse_linalg_solve.py to show the performance difference. Thanks.

@ilayn ilayn changed the title Reimplemented Krylov iterative methods to improve the speed of the so… ENH: sparse.linalg: Reimplement Krylov iterative methods Mar 24, 2021
@zoj613
Copy link
Contributor

zoj613 commented Mar 24, 2021

You can try follow something similar to this: https://github.com/scipy/scipy/blob/master/benchmarks/benchmarks/sparse_linalg_solve.py

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 25, 2021

You can try follow something similar to this: https://github.com/scipy/scipy/blob/master/benchmarks/benchmarks/sparse_linalg_solve.py

Dear @zoj613

Thanks for your reminding. I encountered an issue when running the benchmarks in sparse_linalg_solve.py.
I use the command python runtests.py --bench sparse_linalg_solve with the original codes and the updated codes respectively, the updated codes is only very slightly faster than the original codes. But the performance of the updated codes has significant improvement when I rearrange the code in a file benchmark_test.py as the following:

import numpy as np
import time
from numpy.testing import assert_equal
from scipy import linalg, sparse
from scipy.sparse.linalg import cg

def _create_sparse_poisson1d(n):
    # Make Gilbert Strang's favorite matrix
    # http://www-math.mit.edu/~gs/PIX/cupcakematrix.jpg
    P1d = sparse.diags([[-1]*(n-1), [2]*n, [-1]*(n-1)], [-1, 0, 1])
    assert_equal(P1d.shape, (n, n))
    return P1d

def _create_sparse_poisson2d(n):
    P1d = _create_sparse_poisson1d(n)
    P2d = sparse.kronsum(P1d, P1d)
    assert_equal(P2d.shape, (n*n, n*n))
    return P2d.tocsr()

n = 128
A = _create_sparse_poisson2d(n)
b = np.ones(n*n)
cpu = time.time()
(x, flag) = cg(A, b, tol=1e-8)
print("Elapsed time is {} s".format(cpu-time.time()))

and use the command python benchmark_test.py. Its performance summary is as follows (unit: second):

n Running Code CG CGS iCGS BiCG BiCGStab CGNR CGNE GMRES
128 Original 0.435598 1.32017 - 0.89738 0.829499 - - 9.01621
Updated 0.31725 0.45113 0.73977 0.45081 0.38642 3.28622 3.16411 3.50677
256 Original 2.51642 Failed - 5.34472 4.19092 - - 54.83283
Updated 1.41428 2.57847 2.04661 2.31034 1.87408 39.10662 36.86936 32.83502
512 Original 22.6428 Failed - 41.4629 28.5809 - - -
Updated 12.0016 29.9860 30.3437 19.2072 14.0439 169.810 164.289 -

I don't know the reason. Look forward to your kind reply and suggestions. Thanks.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 31, 2021

@zoj613 I known the reason behind this problem, It calls a single thread (BLAS/LAPACK thread) by default when using python3 runtests.py --bench sparse_linalg_solve, but when I use python3 benchmark_test.py, it calls all threads (BLAS/LAPACK) on local machine. So for a single thread, execute the command line python3 runtests.py --bench sparse_linalg_solve , the result is as follows in my machine (CPU @ 3.2GHz):

Original code (1 thread):

original_j1

Updated code (1 thread):

updated_j1


For multithreads
Original code (8 thread):

original_j8

Updated code (8 thread):

updated_j8


For 32 threads, the performance table is as follows in my server (CPU @ 2.2GHz) [unit: second]:

n Running Code CG CGS iCGS BiCG BiCGStab CGNR CGNE GMRES
128 Original 0.435598 1.32017 - 0.89738 0.829499 - - 9.01621
Updated 0.31725 0.45113 0.73977 0.45081 0.38642 3.28622 3.16411 3.50677
256 Original 2.51642 Failed - 5.34472 4.19092 - - 54.83283
Updated 1.41428 2.57847 2.04661 2.31034 1.87408 39.10662 36.86936 32.83502
512 Original 22.6428 Failed - 41.4629 28.5809 - - -
Updated 12.0016 29.9860 30.3437 19.2072 14.0439 169.810 164.289 -

@zoj613
Copy link
Contributor

zoj613 commented Mar 31, 2021

This PR looks promising to me based on the benchmark results. I think maintainers of scipy like @pv or @rgommers will give more informed feedback on these changes.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zhaog6. The benchmarks look promising, and there's a lot to unpack in this PR. It would be great if you could focus first on undoing the backwards incompatible changes and formatting-only changes, so that the actual functional changes to gmres and the other existing functions becomes reviewable.

If you do want to keep any changes to keywords (atol, quiet, debug, callback_type), you should explain why the changes are necessary. Please be aware that the bar for breaking backwards compatibility is high, so it's unlikely we'll want to remove keywords.

We may want to do that first,

@@ -270,10 +270,12 @@ def gcrotmk(A, b, x0=None, tol=1e-5, maxiter=1000, M=None, callback=None,
raise ValueError("Invalid value for 'truncate': %r" % (truncate,))

if atol is None:
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unintentional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your kind suggestions, the comment was added in debugging, and has been deleted. I'll submit it later.

@@ -947,7 +947,7 @@ def gmres_loose(A, b, tol):
"""
b = np.asarray(b)
min_tol = 1000 * np.sqrt(b.size) * np.finfo(b.dtype).eps
return gmres(A, b, tol=max(tol, min_tol), atol=0)
return gmres(A, b, tol=max(tol, min_tol), debug=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unintentional.

Copy link
Contributor Author

@zhaog6 zhaog6 Apr 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the option parameter debug=1 because for GMRES (include CG etc), one usually use the calculated result as residual when computing residuals, such as GMRES use the obtained g[k+1] rather than ||b - Ax|| (computing ||b - Ax|| will bring to a (more) matrix-vector multiplication and inner product/norm in each iteration). But in the case of singular preconditioner, g[k+1] is different from ||b - Ax|| (In fact, the case of singular preconditioner may not make sense in numerical computing). So I set debug=1 to computing residuals by ||b - Ax|| rather than g[k+1].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, seems like something you want to remove again then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The atol will be recovered to keep compatible. I'd like to ask if I can add a new named GMRES function while the original gmres(...) still remains

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unintentional.

Indeed, in arpack.py, the option parameter debug=1 is redundant and I'll remove it (No singular preconditioner is involved in this case), thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the new GMRES do differently? A separate function doesn't sound appealing, it'd be better to add a more sanely named keyword to gmres if possible.

Copy link
Contributor Author

@zhaog6 zhaog6 Apr 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @rgommers,

There is something wrong with inner iteration tolerance in GMRES with preconditioner (unpreconditioned GMRES is OK at present), and I'll try to debug the error. Before that, firstly I would like to make the first PR to submit iterative methods except for GMRES (such as CG, CGS, iCGS, BiCG, BiCGStab, CGNR and CGNE), All parameter options have been recovered and these algorithms are compatible with previous versions (the performance is still better than original codes). And then I'll try to fix the errors in preconditioned GMRES and complete the second PR. Thanks.

@@ -128,10 +128,12 @@ def lgmres(A, b, x0=None, tol=1e-5, maxiter=1000, M=None, callback=None,
raise ValueError("RHS must contain only finite numbers")

if atol is None:
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unintentional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it, thanks


# TODO check that method preserve shape and type
# TODO test both preconditioner methods


sysbit = struct.calcsize("P") * 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little unusual. Maybe you want something like is_64bit = np.dtype(np.intp).itemsize == 8?

Getting rid of import struct would be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. What I need is it, and I'll try to avoid importing external modules, thanks

@@ -557,36 +563,6 @@ def UT_solve(b):


class TestGMRES:
def test_callback(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on here with removing these tests?


import warnings
import numpy as np
import time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a left-over from development work, can you remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the time module was used to check performance in real time when I use SciPy to solve some physical field equations from PDEs, I'll remove it, thanks.

@@ -93,12 +94,14 @@ def _get_atol(tol, atol, bnrm2, get_residual, routine_name):
"""

if atol is None:
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unwanted change here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it is a redundant comment and I'll remove it, thanks

x, iter_, resid, info, ndx1, ndx2, sclr1, sclr2, ijob = \
revcom(b, x, work, iter_, resid, info, ndx1, ndx2, ijob)
if callback is not None and iter_ > olditer:
def bicg(A, b, x0=None, tol=1e-5, maxiter=10000, sparams=None, M=None, callback=None, quiet=0, debug=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks backwards incompatible, in particular the maxiter default change and the removal of atol. You can't just change behavior like that, it will break code for existing users.

Also, why would we want to add quiet and debug keywords? Seems undesired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if you could undo the formatting-only changes and remove the backwards compat breaks, then it'll be much clearer what you changed. Now the diff is basically a full rewrite of the function, which makes it hard to review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comments apply to the other function signatures you changed.

@@ -456,44 +728,28 @@ def gmres(A, b, x0=None, tol=1e-5, restart=None, maxiter=None, M=None, callback=

Other parameters
----------------
x0 : {array, matrix}
x0: {array, matrix}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original was correct, can you revert?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for other formatting changes here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll recover these.

A, M, x, b, postprocess = make_system(A, M, x0, b)

n = len(b)
def cgnr(A, b, x0=None, tol=1e-5, maxiter=None, M=None, callback=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changing the name from cg to cgnr is backward incompatible and it also looks like atol is no longer in the function's signature which will likely break existing code for users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the rule is simple: don't change any existing names, the only thing you can do is add new keywords at the end of a function signature.

Copy link
Contributor Author

@zhaog6 zhaog6 Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @zoj613 @rgommers
Hm, I am sorry for reordering the function names according to the history of these methods rather than original alphabet. I'll reorder the function names of iterative methods according to alphabet in the iterative.py.
CG is reimplemented (As your suggestions earlier, the option atol has not been removed to keep compatibility). The new keyword cgnr is an implementation for Conjugate Gradient Normal Residual, which is a variant of CG (it is used to solve asymmetric problems). I added references cited and will re-submit this code to get rid to the confusion. Thanks

A, M, x, b, postprocess = make_system(A, M, x0, b)

n = len(b)
assert A.shape[0] == A.shape[1] and A.shape[0] == len(b), "The size of the matrix and the right-hand side does not match."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert A.shape[0] == A.shape[1] and A.shape[0] == len(b), "The size of the matrix and the right-hand side does not match."
if not (A.shape[0] == A.shape[1] and A.shape[0] == len(b)):
raise ValueError("The size of the matrix and the right-hand side does not match.")

assert is generally regarded as not good practice because using python -OO strips them away, see : https://mail.python.org/pipermail/scipy-dev/2021-March/024590.html

I would suggest changing them in the other lines as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you a lot for sharing your suggestions and experiences in Python, it helps me a lot. I'll try to modify it.

z = M.matvec(r)
d = np.inner(rhat.conjugate(), z)
if d == 0.:
print("Warning: divide by zero in the next iteration")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is well as in other places where print occurs.

#======================
# CGNR
#======================
@set_docstring('Use Conjugate Gradient Normal Residual iteration to solve '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there is quite a bit of code duplication among the cg variants. Im wondering if using a unified function for all of them is better. Then access each particular variant through a keyword argument using cg? something like cg(..., method='nr') or similar. What do you think @rgommers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @zoj613 , @rgommers
For CGNR, it is not common for most numerical software that using cg as the function name and method=nr as the keyword. Although CGNR is a variant of CG, please note that pepole don't want to split CGNR into cg and nr because people think it is a method for symmetric positive definite problem (SPD) when it comes to CG. But it means that one will solve asymmetric problem when it comes to CGNR (both cg and nr are not separated). Maybe it's a good way to deal with it because it is simple, but no one is willing to split the entire function name cgnr (CGNR). Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see. Ignore my previous suggestion then.

Copy link
Contributor Author

@zhaog6 zhaog6 Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @zoj613 ,
I'll submit the other PR with a faster GMRES later (For GMRES, your suggestion has been accepted because it is only
other implement for accelerating). Because GMRES is relatively complicated, the code would be submitted individually for the convenience of review and comparison. Thanks.

zhaog6 and others added 3 commits April 17, 2021 16:23
fixed TypeError

fixed RuntimeErrors and warnings of divide by zero

fixed RuntimeErrors caused by complex numbers (GMRES supported complex number) and AssertionError

do some modifications with AssertionError

restore iterative.py to check error

add some fixes

use private function to test

Reimplemented faster Krylov subspace iterative methods

complete the faster Krylov iteration

Fixed running errors caused by Python version on github.com/scipy/scpy (for Python3.7 and CPython 3.9)

Fixed the bugs in original test_iterative.py (It should be set to ||b - Ax||/||b - Ax0|| when initial guess is nonzero). Sorry to bother again, but it still works well using Python 3.8 (Ubuntu 20.04.2) in my server

Modify the comment in scipy/sparse/linalg/__init__.py

More set_docstring() are defined in iterative.py (remove redundant function)

Deal with all of the error indents

Fixed Refguide Check errors and AssertionErrors caused by 32-bit system

Fixed compatibility with older versions

iCGS, CGNR and CGNE is new functions (no atol), no need compatibility of atol

The new method names is placed after the existing method names.
@zhaog6 zhaog6 requested a review from rgommers April 19, 2021 00:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.sparse.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants