<a href="https://colab.research.google.com/github/stephenbeckr/convex-optimization-class/blob/master/Demos/ConjugateGradientDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Conjugate Gradient
... and related Krylov subspace methods

For solving $Ax=b$ and related problems (e.g., least-squares).  Use Krylov subspace methods if all of the following criteria are met:
1. $A$ is very large (let $A$ be $n\times n$)
2. The multiply $Ax$ can be done faster than $O(n^2)$, e.g.
  - $A$ is very sparse
  - $A$ is from a fast operator, like an FFT
3. $A$ is somewhat well-conditioned

Just how large, how well-conditioned, or how sparse $A$ needs to be depends on the problem, and there's no simple answer (other than to just try it).
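
To make criterion 2 concrete, here is a minimal sketch of a matrix-free solve, where $A$ is only available through a fast multiply. The circulant matrix, its first column `c`, the helper `matvec`, and the size `n` are all assumptions chosen for illustration (they are not part of the demo below); the multiply uses the FFT and costs $O(n\log n)$ instead of $O(n^2)$.

```python
import numpy as np
from scipy.linalg import circulant
from scipy.sparse.linalg import LinearOperator, cg

n = 1024
# First column of a symmetric circulant matrix (hypothetical example):
# 4 on the diagonal, 1 on the two cyclic neighbors.
c = np.zeros(n)
c[0], c[1], c[-1] = 4.0, 1.0, 1.0

# A circulant matrix is diagonalized by the DFT; its eigenvalues are fft(c).
# Here they equal 4 + 2*cos(2*pi*k/n), so they lie in [2, 6] and A is
# symmetric positive definite and well-conditioned.
eigs = np.fft.fft(c).real

def matvec(x):
    # Computes A @ x in O(n log n) without ever forming A
    x = np.ravel(x)
    return np.fft.ifft(eigs * np.fft.fft(x)).real

A_op = LinearOperator((n, n), matvec=matvec, dtype=np.float64)

rng = np.random.default_rng(0)
b = rng.normal(size=n)

x_cg, info = cg(A_op, b)   # info == 0 means CG reported convergence

# Check against a dense solve (only feasible because n is small here)
x_dense = np.linalg.solve(circulant(c), b)
rel_err = np.linalg.norm(x_cg - x_dense) / np.linalg.norm(x_dense)
print(f"CG info = {info}, relative difference from dense solve = {rel_err:.1e}")
```

CG is used here because this particular operator is symmetric positive definite; for a general fast operator, a method such as LSQR or GMRES would be the safer choice.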

APPM 5630 Advanced Convex Optimization, Spring 2021, Becker

In [50]:
import numpy as np
import scipy.sparse as sps
import scipy.sparse.linalg
import scipy.linalg as linalg
from numpy.linalg import norm
import time

In [65]:
n   = int(1e1)

rng   = np.random.default_rng()
# Help make A invertible by adding a multiple of the identity to it
A   = sps.random(n,n,density=0.1,format='csr') + .1*sps.eye(n)
b   = rng.normal(size=(n,1))

print('condition number is', np.linalg.cond( A.toarray() ) )


condition number is 124.04128581497764


In [24]:
A

<10x10 sparse matrix of type '<class 'numpy.float64'>'
	with 19 stored elements in Compressed Sparse Row format>

In [25]:
A.toarray()

array([[0.20235633, 0.        , 0.        , 0.        , 0.        ,
        0.41103886, 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.01      , 0.        , 0.8839626 , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.68334671, 0.        , 0.01      , 0.        , 0.35071518,
        0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.01      , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.01      ,
        0.        , 0.        , 0.53032062, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.01      , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.85850859,
        0.69137976, 0.01      , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.       

In [77]:
x = linalg.solve(A.toarray(),b)
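# xCG = sps.linalg.spsolve(A,b)  # alternative: sparse direct solver
# xCG, info = sps.linalg.minres(A,b,maxiter=1000) # not applicable: minres needs symmetric A
def lsqr(A,b):
  # sps.linalg.lsqr returns a 10-tuple; the solution is the first entry
  return sps.linalg.lsqr(A,b)[0]
xCG = lsqr(A,b)  # LSQR is mathematically equivalent to CG on the normal equations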
norm(x.ravel()-xCG.ravel())

1.4566525418517067e-08
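
Since $A$ is not symmetric, plain CG does not apply directly, which is why LSQR is used above. As a rough sketch of the connection, the following runs a hand-written conjugate gradient loop on the normal equations $A^TAx = A^Tb$ and compares it with `lsqr`. The function `conjugate_gradient`, its tolerance, and the test matrix are assumptions for illustration, not the course's reference implementation.

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import lsqr

def conjugate_gradient(matvec, b, tol=1e-10, maxiter=1000):
    """Textbook CG for a symmetric positive definite operator given by matvec."""
    x = np.zeros_like(b)
    r = b.copy()                 # residual b - A x, with x = 0
    p = r.copy()                 # first search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)        # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return x

n = 200
rng = np.random.default_rng(0)
A = sps.random(n, n, density=0.05, format='csr', random_state=0) + 10 * sps.eye(n)
b = rng.normal(size=n)

# Normal equations: A^T A is symmetric positive definite when A is invertible
x_cgnr = conjugate_gradient(lambda v: A.T @ (A @ v), A.T @ b)
x_lsqr = lsqr(A, b, atol=1e-12, btol=1e-12)[0]
x_dense = np.linalg.solve(A.toarray(), b)

diff_lsqr = np.linalg.norm(x_cgnr - x_lsqr) / np.linalg.norm(x_lsqr)
diff_dense = np.linalg.norm(x_cgnr - x_dense) / np.linalg.norm(x_dense)
print(f"CG on normal equations vs LSQR:  {diff_lsqr:.1e}")
print(f"CG on normal equations vs dense: {diff_dense:.1e}")
```

Forming the normal equations squares the condition number, so in practice LSQR is usually preferred over this explicit approach because it is more stable numerically.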

# Larger example

In [80]:
n   = int(5e3)
rng   = np.random.default_rng()
A   = sps.random(n,n,density=0.01,format='csr') + 10*sps.eye(n)
b   = rng.normal(size=(n,1))

print("Doing dense version")
tic = time.perf_counter()
x = linalg.solve(A.toarray(),b)
toc_dense = time.perf_counter() - tic

print('Now doing sparse version')
tic = time.perf_counter()
# xCG = sps.linalg.spsolve(A,b) # sparse direct solve: very slow here (probably due to fill-in), not recommended
xCG = lsqr(A,b) # nice and fast
toc_sparse = time.perf_counter() - tic

e = norm(x.ravel()-xCG.ravel())
print(f"n x n matrix with n={n:d}")
print(f"Took {toc_dense:.1f} sec for dense version (Gaussian elimination...)")
print(f"Took {toc_sparse:.1f} sec for sparse version (CG, LSQR, ...)")
print(f"Difference between versions {e:.1e}")

Doing dense version
Now doing sparse version
n x n matrix with n=5000
Took 3.2 sec for dense version (Gaussian elimination...)
Took 0.0 sec for sparse version (CG, LSQR, ...)
Difference between versions 6.8e-07
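
The `10*sps.eye(n)` shift in the example above likely keeps $A$ fairly well-conditioned, which helps the iterative solver finish quickly. As a rough illustration of criterion 3, the sketch below counts CG iterations on diagonal test matrices with increasing condition number. The helper `cg_iterations`, the diagonal test matrices, and the default SciPy tolerance are assumptions made for this example.

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import cg

def cg_iterations(kappa, n=500, seed=0):
    """Run SciPy's CG on a diagonal SPD matrix with condition number kappa."""
    rng = np.random.default_rng(seed)
    # Eigenvalues spread evenly over [1, kappa]
    A = sps.diags(np.linspace(1.0, kappa, n), format='csr')
    b = rng.normal(size=n)
    count = 0
    def callback(xk):
        # SciPy calls this once per CG iteration
        nonlocal count
        count += 1
    _, info = cg(A, b, callback=callback)
    return count, info

results = {kappa: cg_iterations(kappa) for kappa in (1e1, 1e2, 1e4)}
for kappa, (iters, info) in results.items():
    print(f"condition number {kappa:.0e}: {iters} CG iterations (info={info})")
```

The iteration count should grow with the condition number (classical bounds scale like its square root), which is why preconditioning matters for badly conditioned systems.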
