Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify definition of preconditioner for sparse linear system solvers #5818

Open
anntzer opened this issue Feb 7, 2016 · 9 comments · May be fixed by #20517
Open

Clarify definition of preconditioner for sparse linear system solvers #5818

anntzer opened this issue Feb 7, 2016 · 9 comments · May be fixed by #20517
Labels
Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org good first issue Good topic for first contributor pull requests, with a relatively straightforward solution scipy.sparse.linalg

Comments

@anntzer
Copy link
Contributor

anntzer commented Feb 7, 2016

http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.sparse.linalg.cg.html (and other solvers too) currently defines the preconditioner M as "The preconditioner should approximate the inverse of A."
Wikipedia and MATLAB both seem to use the inverse definition (i.e. they solve M^-1.A.x=M^-1.y).
Both definitions seem to exist in the literature(?) but (assuming that the note is correct) I think it would be helpful to clarify which M is meant there ("Note: the definition of the preconditioner matches the definition in reference XYZ, and is the inverse of the definition in reference XYZ'.")

@ev-br
Copy link
Member

ev-br commented Mar 21, 2016

Figuring this out would definitely be very welcome.

@ev-br ev-br added Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org scipy.sparse.linalg labels Mar 21, 2016
@anntzer
Copy link
Contributor Author

anntzer commented Mar 21, 2016

I'm now pretty certain that the required M is indeed an approximation of the inverse of A (nothing else really makes sense).

@pv
Copy link
Member

pv commented Mar 21, 2016

All of scipy routines afaik assume M ~ A^-1, so the docs seem correct.

@mdhaber mdhaber added the good first issue Good topic for first contributor pull requests, with a relatively straightforward solution label May 12, 2021
@mdhaber
Copy link
Contributor

mdhaber commented May 12, 2021

Another good one for someone who is familiar with the math.

@gdcs92
Copy link

gdcs92 commented Jul 11, 2022

Hi! I've done some reading on the preconditioned CG and browsing of the source code, and would like to take this as my first issue.

The sentence "The preconditioner should approximate the inverse of A." seems to imply that we should have M ~ A^-1. This ends up being correct, but a bit confusing.

The implemented CG algorithm solves the left-preconditioned system (*), which in relevant works of the literature such as [1] and [2] is written as M^-1 Ax = M^-1 b. In this form, the convergence of the algorithm depends on the properties of M^-1 A instead of A. So, ideally M should be close to A in the sense that the eigenvalues of M^-1 A are clustered near 1 and || M^-1 A - Id ||_2 is small (see Lecture 40 at [3]).

However, since the CG iterations only require M in order to compute z = M^-1 r, what the implementation expects as its argument M is actually the operator P = M^-1. In this sense, it is correct to say that M ~ A^-1.

A note on the assertion (*) above: looking at the source code, it is not obvious at first glance that this is the case. The CG and other iterative algorithms are implemented using the reverse communications interface (described in [4]), which obscures the flow of execution a bit. But one can conclude this by inspecting the code and comparing with reference [2] (which is even referred to in [4] and in the source code of CGREVCOM.f).

References

[1] Saad, Yousef. "Iterative methods for sparse linear systems". Society for Industrial and Applied Mathematics, 2003.
[2] Barrett, Richard, et al. "Templates for the solution of linear systems: building blocks for iterative methods". Society for Industrial and Applied Mathematics, 1994.
[3] Trefethen, Lloyd N., and David Bau III. "Numerical linear algebra". Vol. 50. Siam, 1997.
[4] Dongarra, Jack, Victor Eijkhout, and Ajay Kalhan. "Reverse communication interface for linear algebra templates for iterative methods." UT, CS-95-291, May (1995).

@melissawm
Copy link
Contributor

@gdcs92 are you still interested in fixing the docs here? It's seems this is indeed the right explanation but I'd suggest making it a bit more concise for docs purposes and point to one (or 2) references instead.

@rogue-nix
Copy link

Hey @melissawm, is this issue still up to be resolved or has been taken up or assigned to someone. I would like to help out in case it remains to be resolved.

@melissawm
Copy link
Contributor

@rogue-nix There is no pull request for it so feel free to work on it 😄

@Afleloup
Copy link

Afleloup commented Apr 17, 2024

I investigated this issue a little bit. It seems like the explication given by @gdcs92 is correct. Usually, the preconditionned system is written M^-1 Ax = M^-1 b with M^-1 approximating A^-1 as it is meant in their previous comment. Because the algorithm only needs to perform operation with M^-1 it is passed as the M argument which is the inverse of the definition with other work in the litterature [1,2]. However, I feel like there is a bigger issue. In fact, the CG algorithm only works on hermitian positive definite matrix but M^-1A is not necessarily that. There is a workaround this issue if you force M^-1 to also be hermitian positive definite (see [2] for more info). This is not reflected in the precondition on M. I will make a PR soon trying to clarify that for the cg and other iterative solver.

References

[1] Trefethen, Lloyd N., and David Bau III. "Numerical linear algebra". Vol. 50. Siam, 1997
[2]Shewchuk, Jonathan R . An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994

Afleloup pushed a commit to Afleloup/scipy that referenced this issue Apr 18, 2024
    linear solver

    The argument M in iterative linear solver
    'bicg', 'bicgstab', 'cg', 'cgs' is not standard notation
    in comparison to the literature. It is the inverse of the
    preconditioner usually denoted M^-1. Their should be a
    precondition on M for the preconditioned conjugate gradient.
    It is now explicited in the documentation and linked
    to relevant references.
    Closes scipy#5818
Afleloup pushed a commit to Afleloup/scipy that referenced this issue Apr 18, 2024
    linear solver

    The argument M in iterative linear solver
    'bicg', 'bicgstab', 'cg', 'cgs' is not standard notation
    in comparison to the literature. It is the inverse of the
    preconditioner usually denoted M^-1. Their should be a
    precondition on M for the preconditioned conjugate gradient.
    It is now explicited in the documentation and linked
    to relevant references.
    Closes scipy#5818
Afleloup pushed a commit to Afleloup/scipy that referenced this issue Apr 19, 2024
    linear solver

    The argument M in iterative linear solver
    'bicg', 'bicgstab', 'cg', 'cgs' has different notations
    in the literature. Its usage has been clarified. Their should be a
    precondition on M for the preconditioned conjugate gradient.
    It is now explicited in the documentation and linked
    to relevant references.
    Closes scipy#5818
Afleloup pushed a commit to Afleloup/scipy that referenced this issue Apr 19, 2024
    linear solver

    The argument M in iterative linear solver
    'bicg', 'bicgstab', 'cg', 'cgs' has different notations
    in the literature. Its usage has been clarified. Their should be a
    precondition on M for the preconditioned conjugate gradient.
    It is now explicited in the documentation and linked
    to relevant references.
    Closes scipy#5818
Afleloup pushed a commit to Afleloup/scipy that referenced this issue Apr 19, 2024
    linear solver

    The argument M in iterative linear solver
    'bicg', 'bicgstab', 'cg', 'cgs' has different notations
    in the literature. Its usage has been clarified. Their should be a
    precondition on M for the preconditioned conjugate gradient.
    It is now explicited in the documentation and linked
    to relevant references.
    Closes scipy#5818
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org good first issue Good topic for first contributor pull requests, with a relatively straightforward solution scipy.sparse.linalg
Projects
None yet
8 participants