# Conjugate gradient algorithm

In [5]:
using LinearAlgebra
using Optim

Solve
$$
\min_x f(x) = \frac{1}{2} x^T A x + b^T x + a
$$
where $A \succ 0$. Setting $\nabla f(x) = Ax + b = 0$, this is equivalent to solving the linear system $Ax = -b$.
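As a quick numerical sanity check of this equivalence, the sketch below builds a small symmetric positive definite matrix and confirms that the solution of $Ax = -b$ zeroes the gradient $Ax + b$. The data `A_demo` and `b_demo` are illustrative assumptions, not values from the text.

```julia
using LinearAlgebra

# Illustrative data (assumed for this sketch): a small SPD matrix and a vector.
A_demo = [4.0 1.0; 1.0 3.0]
b_demo = [1.0, 2.0]

# Gradient of f(x) = 0.5*x'A*x + b'x when A is symmetric.
grad_demo(x) = A_demo*x + b_demo

# Solving the linear system A*x = -b gives the stationary point.
x_star = A_demo \ (-b_demo)

println(isposdef(A_demo))          # checks that A is positive definite
println(norm(grad_demo(x_star)))   # gradient norm at x_star, zero up to rounding
```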

Let us build the quadratic function associated with the problem above (the constant $a$ is dropped, since it does not affect the minimizer).

In [2]:
f = x -> 0.5*dot(x,A*x)+dot(b,x)

#1 (generic function with 1 method)

## A simple example

Adapted from https://www.rose-hulman.edu/~bryan/lottamath/congrad.pdf

Let
$$
A =
\begin{pmatrix}
3 & 1 & 0 \\
1 & 2 & 2 \\
0 & 2 & 4
\end{pmatrix}
$$
Consider the function to minimize
$$
f(x) = \frac{1}{2} x^TAx,
$$
and suppose that we have already computed
\begin{align*}
d_0 &= (1, 0, 0)\\
d_1 &= (1, -3, 0)\\
d_2 &= (-2, 6, -5).
\end{align*}

Let us check that $d_0$, $d_1$ and $d_2$ are $A$-conjugate.

In [2]:
A = [ 3.0 1 0 ; 1 2 2 ; 0 2 4]
d0 = [ 1.0 0 0 ]'
d1 = [ 1.0 -3.0 0.0 ]'
d2 = [ -2.0 6.0 -5.0]'

println("$(dot(d0, A*d1)) $(dot(d0, A*d2)) $(dot(d1, A*d2))")

0.0 0.0 0.0
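The directions above are given without derivation. One way such a set can be produced (an assumption on our part, not something the source states) is a Gram-Schmidt process in the inner product $\langle u, v \rangle_A = u^TAv$, applied to the standard basis. The helper `a_conjugate_basis` is a hypothetical name introduced for this sketch.

```julia
using LinearAlgebra

# Hypothetical helper: A-orthogonalize the columns of V by Gram-Schmidt
# with respect to the inner product u'A*v. Assumes A symmetric positive definite.
function a_conjugate_basis(A::AbstractMatrix, V::AbstractMatrix)
    dirs = Vector{Vector{Float64}}()
    for j in 1:size(V, 2)
        d = Vector{Float64}(V[:, j])
        for p in dirs
            # Remove the A-component of V[:, j] along the previous direction p.
            d -= (dot(p, A*V[:, j]) / dot(p, A*p)) * p
        end
        push!(dirs, d)
    end
    return dirs
end

A_ex = [3.0 1 0; 1 2 2; 0 2 4]
dirs = a_conjugate_basis(A_ex, Matrix{Float64}(I, 3, 3))

# Up to scaling, these match d0 = (1,0,0), d1 = (1,-3,0), d2 = (-2,6,-5).
println(dirs[1])
println(-3 .* dirs[2])
println(-5 .* dirs[3])
```

The scaling of each direction is irrelevant for conjugacy, which is why the rescaled vectors can be compared with the ones used in the text.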

In [4]:
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
3-element Array{Float64,1}:
 0.47108204270564347
 3.167449191108536
 5.361468766185826
vectors:
3×3 Array{Float64,2}:
 -0.325306   0.916757  0.231804
  0.822673   0.15351   0.547398
 -0.466246  -0.368771  0.804128

Take $x_0 = (1, 2, 3)$ as the starting point. Let us compute $x_1$, $x_2$ and $x_3$ with the conjugate gradient algorithm. Is $x_3$ optimal?

$$
\nabla f(x) = Ax
$$

In [5]:
x0 = [1; 2; 3.0]
-A*x0

3-element Array{Float64,1}:
  -5.0
 -11.0
 -16.0

In [6]:
f = x -> dot(x,A*x)  # omits the factor 1/2 of f; the minimizer is unchanged

#3 (generic function with 1 method)

We need to compute $\alpha_k$, $k = 0,1,2$, by solving
$$
\min_{\alpha} f(x_k + \alpha d_k)
$$

To obtain $\alpha_0$, we must minimize
\begin{align*}
f(x_0 + \alpha d_0) &= \frac{1}{2}
\left(\begin{pmatrix} 1 & 2 & 3\end{pmatrix} + \alpha \begin{pmatrix} 1 & 0 & 0\end{pmatrix} \right)
\begin{pmatrix}
3 & 1 & 0 \\
1 & 2 & 2 \\
0 & 2 & 4
\end{pmatrix}
\left(\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \alpha \begin{pmatrix} 1 \\0 \\0 \end{pmatrix} \right)
\\
& = \frac{1}{2}\begin{pmatrix} 1 + \alpha & 2 & 3 \end{pmatrix}
\begin{pmatrix}
3 & 1 & 0 \\
1 & 2 & 2 \\
0 & 2 & 4
\end{pmatrix}
\begin{pmatrix} 1 + \alpha \\ 2 \\ 3 \end{pmatrix}\\
& = \frac{1}{2}\begin{pmatrix} 1 + \alpha & 2 & 3 \end{pmatrix}
\begin{pmatrix} 5+3\alpha \\ 11+\alpha \\ 16 \end{pmatrix}\\
& = \frac{1}{2}
((1 + \alpha)(5+3\alpha) + 22+2\alpha + 48 ) \\
& = \frac{1}{2}
( 3\alpha^2 + 8\alpha + 5 + 70 + 2\alpha ) \\
& = \frac{3}{2}\alpha^2 + 5\alpha+\frac{75}{2}
\end{align*}
with respect to $\alpha$.

We can do so by finding the zero of the derivative with respect to $\alpha$, that is,
$$
\frac{d}{d\alpha} f(x+\alpha d) = 0,
$$
or

$$
d^T \nabla f(x+\alpha d) = 0
$$

Therefore, we must have

$$
3\alpha + 5 = 0
$$
Thus,
$$
\alpha_{0} = -\frac{5}{3}
$$
$$
x_1 = x_0 - \frac{5}{3} d_0 = \begin{pmatrix} -\frac{2}{3} \\ 2 \\ 3  \end{pmatrix}
$$

We can also compute $\alpha_0$ directly as
$$
\alpha_0 = - \frac{d_0^T\nabla f(x_0)}{d_0^TAd_0}
$$

In [7]:
x0 = [1 ; 2 ; 3.0]
∇f = x -> A*x

#5 (generic function with 1 method)

In [8]:
d0 = [1 ; 0 ; 0]
α0 = -dot(d0,∇f(x0))/dot(d0,A*d0)

-1.6666666666666667

In [9]:
x1 = x0+α0*d0

3-element Array{Float64,1}:
 -0.6666666666666667
  2.0
  3.0
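The optimality condition $d^T \nabla f(x+\alpha d) = 0$ can be checked numerically at the new point. The sketch below recomputes the step with a hypothetical helper `exact_step` (a name introduced here for illustration) and verifies that the gradient at $x_1$ is orthogonal to $d_0$.

```julia
using LinearAlgebra

# Hypothetical helper: exact line-search step for f(x) = 0.5*x'A*x along d,
# i.e. α = -d'∇f(x) / (d'A*d) with ∇f(x) = A*x.
exact_step(A, x, d) = -dot(d, A*x) / dot(d, A*d)

A_ex = [3.0 1 0; 1 2 2; 0 2 4]
x0_ex = [1.0, 2.0, 3.0]
d0_ex = [1.0, 0.0, 0.0]

α0_ex = exact_step(A_ex, x0_ex, d0_ex)
x1_ex = x0_ex + α0_ex*d0_ex

println(α0_ex)                    # the step length, -5/3
println(dot(d0_ex, A_ex*x1_ex))   # directional derivative at x1, zero up to rounding
```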

A line search from $x_1$ along the direction $d_1$ requires minimizing
\begin{align*}
f(x_1 + \alpha d_1) & = \frac{1}{2}\left(\begin{pmatrix} -\frac{2}{3} & 2 & 3 \end{pmatrix} + \alpha\begin{pmatrix} 1 & -3 & 0 \end{pmatrix} \right)\begin{pmatrix} 3 & 1 & 0 \\
1 & 2 & 2 \\
0 & 2 & 4 \end{pmatrix}\left(\begin{pmatrix}  -\frac{2}{3} \\ 2 \\ 3 \end{pmatrix} +  \alpha\begin{pmatrix} 1 \\ -3 \\ 0 \end{pmatrix} \right) \\
& =\frac{15}{2}\alpha^2 - 28\alpha + \frac{100}{3},
\end{align*}
whose minimum is attained at
$$
\alpha_1 = \frac{28}{15},
$$
giving
$$
x_2 = x_1 + \frac{28}{15}d_1 =
    \begin{pmatrix}
     \frac{6}{5} \\ \frac{-18}{5} \\ 3
    \end{pmatrix}.
$$

In [11]:
norm([-2/3; 2 ;3]), norm([6/5; -18/5 ; 3])

(3.6666666666666665, 4.8373546489791295)

In [12]:
α1 = -dot(d1,A*x1)/dot(d1,A*d1)

1.8666666666666665

In [13]:
28/15

1.8666666666666667

In [14]:
x2 = x1+α1*d1

3×1 Array{Float64,2}:
  1.1999999999999997
 -3.5999999999999996
  3.0

In [15]:
norm(x1), norm(x2)

(3.6666666666666665, 4.8373546489791295)

The final line search from $x_2$ along the direction $d_2$ requires minimizing
$$
f(x_2 + \alpha d_2) = 20 \alpha^2 - 24\alpha + \frac{36}{5},
$$
whose minimum is attained at
$$
\alpha_2 = \frac{3}{5},
$$
giving
$$
x_3 = x_2 + \frac{3}{5}d_2 =
    \begin{pmatrix}
     0 \\ 0 \\ 0
    \end{pmatrix},
$$
which is indeed the minimizer, since $\nabla f(x) = Ax$ vanishes only at $x = 0$ when $A \succ 0$.

Similarly, we can compute the new point as

In [16]:
α2 = -dot(d2,A*x2)/dot(d2,A*d2)
x3 = x2+α2*d2

3×1 Array{Float64,2}:
 -4.440892098500626e-16
  8.881784197001252e-16
 -4.440892098500626e-16
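The three line searches follow the same pattern, so they can be written as a loop. The function `conjugate_directions` below is a hypothetical helper sketched for this example; it assumes $f(x) = \frac{1}{2} x^TAx$ and directions that are already $A$-conjugate.

```julia
using LinearAlgebra

# Hypothetical helper: minimize 0.5*x'A*x by successive exact line searches
# along the given (assumed A-conjugate) directions.
function conjugate_directions(A, x0, dirs)
    x = float.(x0)
    for d in dirs
        α = -dot(d, A*x) / dot(d, A*d)   # exact step along d
        x = x + α*d
    end
    return x
end

A_ex = [3.0 1 0; 1 2 2; 0 2 4]
dirs_ex = [[1.0, 0.0, 0.0], [1.0, -3.0, 0.0], [-2.0, 6.0, -5.0]]

x3_ex = conjugate_directions(A_ex, [1, 2, 3], dirs_ex)
println(norm(x3_ex))   # close to zero: the minimizer is reached after 3 steps
```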

## A naive implementation

A first version of the conjugate gradient algorithm follows.

In [19]:
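function cg_quadratic(A:: Matrix, b:: Vector, x0:: Vector, trace:: Bool = false)
    # Conjugate gradient for min 0.5*x'A*x + b'x (i.e. A*x = -b), run for exactly n steps.
    n = length(x0)
    x = x0
    g = b+A*x   # gradient at x
    d = -g      # first direction: steepest descent
    if (trace)
        iter = [ x ]
        iterg = [ norm(g) ]
        iterd = [ norm(d) ]
    end

    for k = 1:n-1
        Ad = A*d
        normd = dot(d,Ad)     # curvature d'A*d (not a norm, despite the name)
        α = -dot(d,g)/normd   # exact line-search step
        x += α*d
        if (trace)
            # g and d are recorded before being updated below
            iter = [ iter; [x] ]
            iterg = [ iterg; norm(g)]
            iterd = [ iterd; norm(d) ]
        end
        g = b+A*x
        β = dot(g,Ad)/normd   # makes the new direction A-conjugate to d
        d = -g+β*d
    end

    # final step along the last direction
    normd = dot(d,A*d)
    α = -dot(d,g)/normd
    x += α*d
    if (trace)
        g = b+A*x # g must be equal to 0
        iter = [ iter; [x] ]
        iterg = [ iterg; norm(g)]
        iterd = [ iterd; norm(d) ]
        return x, iter, iterg, iterd
    end

    return x
end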

cg_quadratic (generic function with 2 methods)

Consider the simple example

In [18]:
A = [2 1; 1 2]
b = [1, 0]
A\(-b)

2-element Array{Float64,1}:
 -0.6666666666666666
  0.3333333333333333

We want to solve
$$
    \min_{x} f(x) = \frac{1}{2}x^TAx+b^Tx+c
$$

or, equivalently, since the constant $c$ does not change the minimizer,
$$
    c+\min_{x} \frac{1}{2}x^TAx+b^Tx
$$

In [19]:
cg_quadratic(A, b, [0, 0], true)

([-0.6666666666666666, 0.3333333333333333], [[0.0, 0.0], [-0.5, 0.0], [-0.6666666666666666, 0.3333333333333333]], [1.0, 1.0, 0.0], [1.0, 1.0, 0.5590169943749475])

What happens if $A$ is not positive definite?

In [20]:
A = [ 1 2 ; 2 1]
A\(-b)

2-element Array{Float64,1}:
  0.3333333333333333
 -0.6666666666666666

In [21]:
cg_quadratic(A, b, [0, 0], true)

([0.33333333333333326, -0.6666666666666666], [[0.0, 0.0], [-1.0, 0.0], [0.33333333333333326, -0.6666666666666666]], [1.0, 1.0, 1.1102230246251565e-16], [1.0, 1.0, 4.47213595499958])

In [22]:
det(A)

-3.0

In [23]:
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 -1.0
  3.0
vectors:
2×2 Array{Float64,2}:
 -0.707107  0.707107
  0.707107  0.707107

In [24]:
cg_quadratic(A, b, [1, 1], true)

([0.3333333333333335, -0.6666666666666667], [[1.0, 1.0], [-0.36986301369863006, -0.02739726027397249], [0.3333333333333335, -0.6666666666666667]], [5.0, 5.0, 2.220446049250313e-16], [5.0, 5.0, 0.9763790695367754])

In [25]:
f([1/3,-2/3])

-0.3333333333333333

In [26]:
f([0,0])

0

The conjugate gradient method finds the solution of the linear system, which is a first-order critical point of the function. Since $A$ has a negative eigenvalue, this point is a saddle point rather than a minimizer.
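Whether a critical point is a minimizer depends on the eigenvalues of $A$. As a small sketch (the names `A_ind`, `b_ind`, `q`, `λ_ind` and `u_ind` are ours), the matrix from this example can be tested with `isposdef`, and the objective can be evaluated along the eigenvector of the negative eigenvalue to see that it decreases without bound.

```julia
using LinearAlgebra

A_ind = [1.0 2.0; 2.0 1.0]     # the indefinite matrix used above
b_ind = [1.0, 0.0]
q(x) = 0.5*dot(x, A_ind*x) + dot(b_ind, x)

println(isposdef(A_ind))                  # false: A is not positive definite
λ_ind = eigvals(Symmetric(A_ind))         # eigenvalues in increasing order
println(λ_ind)

# Along the eigenvector u of the negative eigenvalue, q(t*u) behaves like
# 0.5*λ_ind[1]*t^2 for large t, so it keeps decreasing as t grows.
u_ind = eigvecs(Symmetric(A_ind))[:, 1]
println([q(t*u_ind) for t in (1.0, 10.0, 100.0)])
```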

In [27]:
∇f = x -> A*x+b

#7 (generic function with 1 method)

In [28]:
x = [1.0/3; -2.0/3]
∇f(x)

2-element Array{Float64,1}:
 0.0
 0.0

In [29]:
x = [1; 1]
∇f(x)

2-element Array{Int64,1}:
 4
 3

In [30]:
step= x -> x-α*∇f(x)

#9 (generic function with 1 method)

In [31]:
α = 10
dot(step(x),A*step(x))

6886

In [32]:
λ, u = eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 -1.0
  3.0
vectors:
2×2 Array{Float64,2}:
 -0.707107  0.707107
  0.707107  0.707107

In [33]:
u

2×2 Array{Float64,2}:
 -0.707107  0.707107
  0.707107  0.707107

In [36]:
x = u[:,1] # first eigenvector, associated with λ = -1
A*x

2-element Array{Float64,1}:
  0.7071067811865475
 -0.7071067811865475

In [38]:
1.0-norm(x)

1.1102230246251565e-16

In [39]:
α = 10
f = x -> 0.5*dot(x,A*x)+dot(b,x)
f(step(x))

-106.05992052357223

In [40]:
α = 1000
dot(step(x),A*step(x))+dot(b,x)

-1.417629483042249e6

In [41]:
f(x)

-1.2071067811865475

In [43]:
x = [1/3.0; -2/3]
f(x)

0.16666666666666666

In [45]:
cg_quadratic(A, b, x, true)

([NaN, NaN], [[0.3333333333333333, -0.6666666666666666], [NaN, NaN], [NaN, NaN]], [0.0, 0.0, NaN], [0.0, 0.0, NaN])

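We must include a test on $\nabla f(x_k)$! Here the starting point is already the critical point, so $g = 0$ and $d = 0$, and the step length evaluates to $0/0$.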
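A minimal sketch of the kind of guard this calls for is given below. The helper name `safe_step` and its tolerance are assumptions made for illustration; the point is only that the division is skipped when the gradient is numerically zero.

```julia
using LinearAlgebra

# Hypothetical helper: return a zero step when the gradient is (numerically) zero,
# instead of evaluating 0/0.
function safe_step(A, g, d; tol = 1e-12)
    norm(g) <= tol && return 0.0
    return -dot(d, g) / dot(d, A*d)
end

A_ind = [1.0 2.0; 2.0 1.0]
g_zero = [0.0, 0.0]

# Unguarded formula at a stationary point: 0/0.
α_raw = -dot(-g_zero, g_zero) / dot(-g_zero, A_ind*(-g_zero))
println(α_raw)                                  # NaN
println(safe_step(A_ind, g_zero, -g_zero))      # 0.0
```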

In [46]:
A = [ 1 2 ; 0 4 ]
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 1.0
 4.0
vectors:
2×2 Array{Float64,2}:
 1.0  0.5547
 0.0  0.83205

In [47]:
eigen(A*A')

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
  0.7917560805262003
 20.2082439194738
vectors:
2×2 Array{Float64,2}:
 -0.885022  0.465549
  0.465549  0.885022

In [48]:
A = [ 3 1; 1 2 ]
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 1.381966011250105
 3.618033988749895
vectors:
2×2 Array{Float64,2}:
  0.525731  -0.850651
 -0.850651  -0.525731

In [49]:
eigen(A*A')

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
  1.9098300562505257
 13.090169943749475
vectors:
2×2 Array{Float64,2}:
  0.525731  -0.850651
 -0.850651  -0.525731

A more complex example.

In [6]:
n = 500;
m = 600;
A = randn(n,m);
A = A * A';  # A is now a positive semi-definite matrix
A = A+I # A is positive definite
b = zeros(n)
for i = 1:n
  b[i] = randn()
end
x0 = zeros(n)

500-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

In [7]:
b1 = A\(-b)

500-element Array{Float64,1}:
 -0.01475579536436628
  0.036578110372583725
  0.01672625645651456
  0.034223626543637746
 -0.007132645995545274
  0.0033017990305317067
  0.03296712769649596
 -0.02661108652116706
  0.03752351178372709
  0.02445558975380169
 -0.0007835192227863454
 -0.014472887803839419
  0.013241825498163964
  ⋮
  0.018483288340867498
  0.004774186652439695
 -0.04323885571695603
  0.03132330755198638
 -0.02009264974762955
 -0.02873095994706678
  0.0036079613485544
  0.047772056885805186
  0.01514429356124351
  0.009231913597173314
 -0.012614998516105307
  0.01646393090341269

In [52]:
b2, iter, iterg, iterd = cg_quadratic(A, b, x0, true);

In [53]:
norm(b1-b2)

1.734523990726017e-15

In [54]:
iterg

501-element Array{Float64,1}:
 22.831483117055008
 22.831483117055008
 21.686547381952103
 18.47405790416008
 17.679856509828976
 15.479270961191991
 15.054833502002992
 13.814610447800193
 12.63503173040512
 11.52554330836719
  9.575193528243885
  8.789891097025146
  8.034678645994301
  ⋮
  1.4908635599264138e-13
  1.5371720924180976e-13
  1.4106127271118373e-13
  1.3383977941484951e-13
  1.379272568519458e-13
  1.4489099844039407e-13
  1.5035721853910728e-13
  1.5847713661217427e-13
  1.498716828183743e-13
  1.5549970131117184e-13
  1.4955984269052359e-13
  1.5018105908603662e-13

In [55]:
iterd

501-element Array{Float64,1}:
 22.831483117055008
 22.831483117055008
 29.910303425502924
 28.502776208269697
 31.52834999401924
 28.700365586087482
 31.04293159827266
 29.564980127202784
 27.772256332956076
 25.823717874841616
 20.232614304274787
 19.182387728850454
 17.928887193547382
  ⋮
  1.4708663821848665e-13
  1.5927685062878838e-13
  1.4176822762880018e-13
  1.3711902855630244e-13
  1.4273471264141864e-13
  1.4652378455710525e-13
  1.5095170394201812e-13
  1.6600005925472444e-13
  1.5576346396201643e-13
  1.558812612998778e-13
  1.541227411837348e-13
  1.509204050578899e-13

This works, but do we really need 500 iterations? We would be satisfied with a point close to the solution. We can measure the residual of the linear system
$$
r = b+Ax,
$$
which is nothing but the gradient of the objective function of the quadratic minimization problem.
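In practice the residual norm is compared with a tolerance, possibly relative to $\|b\|$. The sketch below uses the earlier 2×2 example and a hypothetical helper `residual_norm`; the relative scaling is our choice, not something prescribed by the text.

```julia
using LinearAlgebra

# Hypothetical helper: norm of the residual r = b + A*x of the system A*x = -b,
# which is also the gradient norm of 0.5*x'A*x + b'x.
residual_norm(A, b, x) = norm(b + A*x)

A_demo = [2.0 1.0; 1.0 2.0]
b_demo = [1.0, 0.0]

x_exact = A_demo \ (-b_demo)
x_rough = [-0.5, 0.0]           # first CG iterate from the 2×2 example above

println(residual_norm(A_demo, b_demo, x_exact))                   # zero up to rounding
println(residual_norm(A_demo, b_demo, x_rough) / norm(b_demo))    # 0.5
```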

In [56]:
iter

501-element Array{Array{Float64,1},1}:
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
 [-0.00017027618072256812, 0.0002731925457880143, 0.0010684115984472631, -0.0004017684535668579, -0.0010245312224726674, -0.0033353595759318296, -0.0011646300744951321, -0.002333894483727998, -0.0007008411809605268, -0.0016820699483587867  …  -0.00033178888615875105, -0.0012883893166555226, -0.0007292784898442888, -0.0004473856499263748, 0.00011689192235378118, 0.003198233491125432, -0.00021722337122564498, 0.0018325294753949153, -0.0006225107923252299, 0.00033662030259510396]
 [-3.825733040628996e-5, -0.0003223953427695385, 0.004278562464667181, -0.0011602153475444349, 0.00037342277557283945, -0.0033164358576595614, -0.00658636068216885, 1.0174343228482909e-5, -0.0019774710069156262, -0.003995304270415299  …  -0.0021765754771554506, -0.0037361475689225527, -0.0007448318703374449, -0.000944699994161559, -0.0013794303177228529, 0.004659504036842

We need to include a convergence test in the function.

In [20]:
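function cg_quadratic_tol(A:: Matrix, b:: Vector, x0:: Vector, trace:: Bool = false, tol = 1e-8)
    n = length(x0)
    x = x0
    if (trace)
        iter = [ x ]
    end
    g = b+A*x   # gradient, i.e. residual of A*x = -b
    d = -g
    k = 0
    
    tol2 = tol*tol   # compare squared norms to avoid a square root

    β = 0.0

    # stop once ||g|| <= tol, or after at most n+1 iterations
    while ((dot(g,g) > tol2) && (k <= n))
        Ad = A*d
        normd = dot(d,Ad)
        α = dot(g,g)/normd   # equals -dot(d,g)/normd with exact line searches
#        α = -dot(d,g)/normd
        x += α*d
        if (trace)
            # note: [ iter; x ] appends the entries of x one by one;
            # [ iter; [x] ] would append x as a single iterate
            iter = [ iter; x ]
        end
        g = b+A*x
        β = dot(g,Ad)/normd
        d = -g+β*d
        k += 1
    end

    if (trace)
        iter = [ iter; x ]
        return x, iter, k
    end

    return x, k
end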

cg_quadratic_tol (generic function with 3 methods)

In [58]:
x, iter, k = cg_quadratic_tol(A, b, x0, true)

([0.013689587798574444, -0.018853150047148094, 0.03181609788253195, 0.020181414253286397, 0.015775635994805418, -0.0017730089268754073, -0.06801379208995433, -0.0016406184374460314, 0.0017004913597049953, 0.00452904557032978  …  -0.010666867898507406, -0.029847313749673355, 0.014563686983419375, 0.023240255786818563, -0.004510311269642599, 0.015997107774918345, 0.004398135823719092, 0.033231357919214256, -0.0025155084716944457, -0.014162210399088179], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.00017027618072256812, 0.0002731925457880143, 0.0010684115984472631, -0.0004017684535668579, -0.0010245312224726674, -0.0033353595759318296, -0.0011646300744951321, -0.002333894483727998, -0.0007008411809605268  …  -0.010666867898507406, -0.029847313749673355, 0.014563686983419375, 0.023240255786818563, -0.004510311269642599, 0.015997107774918345, 0.004398135823719092, 0.033231357919214256, -0.0025155084716944457, -0.014162210399

The number of iterations is

In [59]:
k

192

Are we close to the solution?

In [60]:
norm(b1-x)

1.9379487536581979e-10

In [62]:
size(A)

(500, 500)

The iteration count, 192, is markedly smaller than the dimension of the problem, 500.

## Preconditioned conjugate gradient

If the condition number is equal to 1, we converge in one iteration.

Recall that the condition number of a positive definite matrix $A$ is given by
$$
\kappa(A) = \frac{\lambda_{\max}}{\lambda_{\min}}.
$$
$\kappa(A) = 1$ if and only if $A = \gamma I$ for some $\gamma > 0$. In this case
$$
A = \begin{pmatrix} \gamma & 0 & \cdots & 0 \\ 0 & \gamma & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \gamma \end{pmatrix}.
$$
Observe that $\lambda_{\max} = \lambda_{\min} = \gamma$.
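For a symmetric positive definite matrix, the eigenvalue ratio coincides with the 2-norm condition number returned by `cond`. The following sketch checks this on a small matrix and on a multiple of the identity; the value of `γ` is an illustrative assumption.

```julia
using LinearAlgebra

A_demo = [3.0 1.0; 1.0 2.0]               # small SPD matrix
λ_demo = eigvals(Symmetric(A_demo))        # sorted in increasing order
κ_demo = λ_demo[end] / λ_demo[1]

println(κ_demo)
println(cond(A_demo))                      # 2-norm condition number, same value for SPD A

γ = 2.5                                    # illustrative value of γ
κ_identity = cond(γ * Matrix{Float64}(I, 3, 3))
println(κ_identity)                        # condition number of γI
```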

The quadratic problem then becomes
$$
f(x) = \frac{1}{2}\gamma x^Tx + b^Tx.
$$

Its gradient is
$$
\nabla f(x) = \gamma x + b.
$$
It vanishes if
$$
x = -\frac{b}{\gamma}.
$$

Let $x_0$ be given. The conjugate gradient algorithm takes as first search direction $d_0 = -\nabla f(x_0) = -\gamma x_0 - b$.

We also have
$$
\alpha_0 = - \frac{d_0^T\nabla f(x_0)}{d_0^TAd_0} = \frac{\| d_0 \|^2}{\gamma \| d_0 \|^2} = \frac{1}{\gamma}.
$$

The first iterate is
$$
x_1 = x_0 + \alpha_0 d_0 = x_0 + \frac{1}{\gamma} (-\gamma x_0 - b) = -\frac{b}{\gamma},
$$
which is indeed the solution!

If the matrix $A$ is diagonal and all its diagonal entries are identical, a single step along the steepest descent direction reaches the global minimum.
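The one-step property can be reproduced numerically. In the sketch below, the values of `γ_demo`, `b_demo` and `x0_demo` are arbitrary illustrative choices; the computation follows the formulas for $d_0$, $\alpha_0$ and $x_1$ given above.

```julia
using LinearAlgebra

γ_demo = 4.0                     # illustrative value of γ
b_demo = [1.0, -2.0, 3.0]        # illustrative vector
x0_demo = [5.0, 5.0, 5.0]        # arbitrary starting point

g0_demo = γ_demo*x0_demo + b_demo        # gradient at x0 when A = γI
d0_demo = -g0_demo                       # steepest-descent direction
α0_demo = -dot(d0_demo, g0_demo) / (γ_demo*dot(d0_demo, d0_demo))   # equals 1/γ
x1_demo = x0_demo + α0_demo*d0_demo

println(α0_demo)                         # 0.25
println(norm(x1_demo + b_demo/γ_demo))   # distance to -b/γ, zero up to rounding
```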

A basic implementation of a preconditioned conjugate gradient algorithm follows, where $M$ is the inverse of the preconditioner to apply.

In [23]:
function pcg_quadratic_tol(A:: Matrix, b:: Vector, x0:: Vector, M:: Matrix,
                           trace:: Bool = false, tol = 1e-8)
    n = length(x0)
    x = x0
    if (trace)
        iter = [ x ]
    end
    g = b+A*x
    v = M*g     # preconditioned gradient
    d = -v
    k = 0
    
    tol2 = tol*tol

    β = 0.0

    gv = dot(g,v)   # g'M*g, used in the stopping test and in α and β
    while ((gv > tol2) && (k <= n))
#    while ((dot(g,g) > tol2) && (k <= n))
        Ad = A*d
        normd = dot(d,Ad)
        #gv = dot(g,v)
        α = gv/normd
        x += α*d
        if (trace)
            # note: [ iter; x ] appends the entries of x one by one
            iter = [ iter; x ]
        end
        g += α*Ad   # gradient update without recomputing b+A*x
        v = M*g
        gvold = gv
        gv = dot(g,v)
        β = gv/gvold
        d = -v+β*d
        k += 1
    end

    if (trace)
        iter = [ iter; x ]
        return x, iter, k
    end

    return x, k
end

pcg_quadratic_tol (generic function with 3 methods)
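The signature above requires `M` to be a dense `Matrix`, so a structured preconditioner such as a `Diagonal` would have to be converted first. The variant below is a sketch under our own assumptions: the name `pcg_sketch`, the keyword arguments, and the choice to store each iterate as a separate vector are not from the original. It accepts any `AbstractMatrix` for `A` and `M`.

```julia
using LinearAlgebra

# Sketch of a preconditioned CG for min 0.5*x'A*x + b'x, i.e. A*x = -b.
# M approximates inv(A); both A and M are assumed symmetric positive definite.
function pcg_sketch(A::AbstractMatrix, b::AbstractVector, x0::AbstractVector,
                    M::AbstractMatrix; tol = 1e-8, maxiter = length(x0))
    x = float.(x0)
    g = b + A*x            # gradient (residual)
    v = M*g                # preconditioned gradient
    d = -v
    gv = dot(g, v)
    iterates = [copy(x)]
    k = 0
    while gv > tol^2 && k < maxiter
        Ad = A*d
        α = gv / dot(d, Ad)
        x = x + α*d
        g = g + α*Ad       # residual update, avoids recomputing b + A*x
        v = M*g
        gv_new = dot(g, v)
        d = -v + (gv_new/gv)*d
        gv = gv_new
        k += 1
        push!(iterates, copy(x))
    end
    return x, iterates, k
end

A_demo = Matrix(Diagonal([1.0, 10.0, 100.0]))   # illustrative SPD matrix
b_demo = [1.0, 1.0, 1.0]
M_demo = Diagonal(1 ./ diag(A_demo))            # inverse diagonal, exact inverse here

x_pcg, its, k_pcg = pcg_sketch(A_demo, b_demo, zeros(3), M_demo)
println(k_pcg)                                  # 1: the preconditioned matrix is the identity
println(norm(x_pcg - A_demo \ (-b_demo)))
```

Keeping the preconditioner as a `Diagonal` makes `M*g` an elementwise scaling instead of a dense matrix-vector product.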

Let us first check that, without preconditioning, we obtain the same iterates.
Set $M$ to the identity:

In [8]:
M = zeros(n,n)+I
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.014755795355726473, 0.03657811037067297, 0.01672625645672698, 0.03422362653550164, -0.007132645988024869, 0.003301799029456351, 0.03296712769157697, -0.026611086508491748, 0.03752351175223354, 0.024455589755072817  …  -0.0432388557131727, 0.03132330756767234, -0.020092649762134724, -0.028730959942786922, 0.0036079613528365675, 0.04777205688339671, 0.015144293557222535, 0.009231913592940113, -0.012614998526054006, 0.016463930911307725], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 0.002454971689675096, 0.0018787612532824508, 0.0007650189467303909, 0.0010692603010837181, -0.0009856034267720784, -0.00042903143663104883, -0.0026990099908390367, 0.0015601625234068378, 0.001971225814925841  …  -0.0432388557131727, 0.03132330756767234, -0.020092649762134724, -0.028730959942786922, 0.0036079613528365675, 0.04777205688339671, 0.015144293557222535, 0.009231913592940113, -0.012614998526054006, 0.016463930911307725], 188)

In [9]:
k, norm(x-b1)

(188, 2.4708593085548377e-10)

We can compute the eigenvalues and condition number of $A$.

In [21]:
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
1000-element Array{Float64,1}:
 0.0028299563546996254
 0.006208210038904927
 0.015730444716104408
 0.017138534301330832
 0.0335070451445163
 0.0649919709707869
 0.06964595503229143
 0.08146943154009767
 0.09038106137735147
 0.10588808809377515
 0.10701001857806425
 0.11045181064063048
 0.128268572721848
 ⋮
 9.824558046352754
 9.839092537343337
 9.85389551670673
 9.874713256081098
 9.878592111942382
 9.895246970057734
 9.930168333591583
 9.932200069442763
 9.980157699368622
 9.98393047061021
 9.98403971099756
 9.987247647762787
vectors:
1000×1000 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0 

In [11]:
cond(A)

363.19295318445853

In [12]:
A

500×500 Array{Float64,2}:
 649.63       35.3486    -56.2037   …  -22.6582   -30.7703   -10.5041
  35.3486    573.027      -4.7047      -21.4998    24.0908    16.3941
 -56.2037     -4.7047    701.402        23.4452    41.1332    23.6554
  49.3961      9.08001    13.7468      -15.9837    -1.41378   -2.16623
   0.2819    -34.7654     11.5973      -19.1084    -6.44043    4.4232
  19.251       4.97156    37.2669   …   14.917     14.8889    34.7856
 -16.3865    -35.1627     20.47         13.5135   -26.0354    34.1079
   6.54232    19.4527    -35.6591       28.3743    13.9389    21.2144
 -13.8005     46.0532     23.0489      -13.1363    18.9361   -18.92
  46.6132     12.9965     -6.94255      -1.93375   14.7975    25.7285
 -15.0486     -1.75272   -24.6271   …   -7.3418    43.4084   -27.1134
  21.4708      6.31654     6.27091      37.6651    29.4554   -18.555
  22.5846     -7.69945    47.7306       -2.78425   13.7426   -41.2572
   ⋮                                ⋱                        
 -18

Let us try a simple preconditioner built from the inverse of the diagonal of the matrix $A$.

In [13]:
D = 1 ./diag(A)
M = Diagonal(D)

500×500 Diagonal{Float64,Array{Float64,1}}:
 0.00153934   ⋅           ⋅          …   ⋅           ⋅           ⋅ 
  ⋅          0.00174512   ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅          0.00142572      ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅          …   ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅          …   ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
  ⋅           ⋅           ⋅              ⋅           ⋅           ⋅ 
 ⋮                                   ⋱                          
  ⋅    

Unfortunately, it does not help in this case: the condition number barely changes (about 362.1 after preconditioning, against 363.2 before).

In [14]:
B = M*A
cond(B)

362.08572744482086
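A diagonal (Jacobi) preconditioner tends to pay off when the diagonal entries differ widely, which is not the case for the random matrix above. The sketch below uses a small illustrative matrix of our own choosing. It also compares eigenvalue ratios, since for the nonsymmetric product `M*A` the value returned by `cond` is a singular-value ratio, whereas conjugate gradient convergence is usually analyzed through the eigenvalues, which `M*A` shares with the symmetric matrix $D^{-1/2} A D^{-1/2}$.

```julia
using LinearAlgebra

# Illustrative SPD matrix whose diagonal entries differ by two orders of magnitude.
A_demo = [100.0 1.0; 1.0 1.0]
M_demo = Diagonal(1 ./ diag(A_demo))       # inverse of the diagonal of A

# Symmetrically scaled matrix D^(-1/2) * A * D^(-1/2).
Dh = Diagonal(1 ./ sqrt.(diag(A_demo)))
S_demo = Symmetric(Dh * A_demo * Dh)

λA = eigvals(Symmetric(A_demo))
λS = eigvals(S_demo)
λMA = sort(real.(eigvals(M_demo * A_demo)))

println(λA[end] / λA[1])           # eigenvalue ratio of A, about 101
println(λS[end] / λS[1])           # eigenvalue ratio after scaling, about 1.22
println(maximum(abs.(λS - λMA)))   # M*A and the scaled matrix share their eigenvalues
println(cond(M_demo * A_demo))     # singular-value ratio of M*A, a different number
```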

Consider another situation, in which $A$ is diagonal.

In [15]:
n = 1000;
A = zeros(n,n);
for i = 1:n
    A[i,i] = 10*rand()
end
b = zeros(n)
for i = 1:n
  b[i] = rand()
end
x0 = zeros(n)
cond(A)

3529.1172004039067

Since the minimizer solves $Ax = -b$, the solution we are looking for is the negative of

In [22]:
A\b

1000-element Array{Float64,1}:
 0.08058928643204467
 0.6037593942484356
 0.05123362510993889
 0.025288649268727047
 0.3157544051301201
 0.01633725032828882
 0.17723886956247462
 0.111806719707462
 0.037035774664397454
 0.0791549365847929
 0.10488968566308879
 0.1651211131034794
 0.25858107792817053
 ⋮
 0.03937631274640514
 0.005433645135135941
 0.03738742981976994
 0.01999842904935537
 0.15368638561669526
 0.021219216022764106
 0.007449347678908204
 4.706245551595791
 0.07777026560473264
 1.074720053901421
 0.19743189966931754
 0.026613908055054796

Without preconditioning, we obtain the following sequence of iterates.

In [24]:
M = zeros(n,n)+I
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.08058928645159152, -0.6037593940525089, -0.051233625105610456, -0.025288649264065852, -0.31575440498517776, -0.01633725034142313, -0.17723886973659486, -0.11180671976856532, -0.03703577466971331, -0.0791549365823525  …  -0.03738742990516486, -0.01999842904317544, -0.1536863855781329, -0.021219216014131147, -0.007449347683576828, -4.706245550636268, -0.07777026557251189, -1.0747200527303016, -0.19743189972183642, -0.026613908041785372], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.1591998630946615, -0.1790363932585973, -0.08175511399278433, -0.036779484109002406, -0.1072078142343552, -0.02485421862504237, -0.06146062714380509, -0.1406045318312565, -0.05817676259535404  …  -0.03738742990516486, -0.01999842904317544, -0.1536863855781329, -0.021219216014131147, -0.007449347683576828, -4.706245550636268, -0.07777026557251189, -1.0747200527303016, -0.19743189972183642, -0.026613908041785372], 209)

This is equivalent to the unpreconditioned version.

In [25]:
x, iter, k = cg_quadratic_tol(A, b, x0, true)

([-0.08058928645159154, -0.6037593940525088, -0.051233625105610414, -0.02528864926406584, -0.3157544049851781, -0.016337250341423127, -0.17723886973659483, -0.1118067197685653, -0.03703577466971329, -0.07915493658235254  …  -0.03738742990516489, -0.019998429043175437, -0.15368638557813294, -0.021219216014131154, -0.007449347683576829, -4.7062455506362655, -0.07777026557251188, -1.0747200527303018, -0.19743189972183622, -0.026613908041785365], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.1591998630946615, -0.1790363932585973, -0.08175511399278433, -0.036779484109002406, -0.1072078142343552, -0.02485421862504237, -0.06146062714380509, -0.1406045318312565, -0.05817676259535404  …  -0.03738742990516489, -0.019998429043175437, -0.15368638557813294, -0.021219216014131154, -0.007449347683576829, -4.7062455506362655, -0.07777026557251188, -1.0747200527303018, -0.19743189972183622, -0.026613908041785365], 209)

However, since $A$ is diagonal, an obvious diagonal preconditioner is $A^{-1}$ itself.

In [26]:
M = zeros(n,n)
for i = 1:n
    M[i,i] = 1/A[i,i]
end

The condition number of the preconditioned matrix is of course equal to 1.

In [27]:
cond(M*A)

1.0000000000000002

The theory then predicts that the preconditioned conjugate gradient converges in one iteration.

In [28]:
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.08058928643204467, -0.6037593942484356, -0.05123362510993888, -0.025288649268727044, -0.3157544051301201, -0.01633725032828882, -0.17723886956247462, -0.11180671970746199, -0.037035774664397454, -0.0791549365847929  …  -0.03738742981976994, -0.01999842904935537, -0.15368638561669526, -0.021219216022764106, -0.007449347678908204, -4.706245551595791, -0.07777026560473264, -1.074720053901421, -0.19743189966931754, -0.026613908055054793], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.08058928643204467, -0.6037593942484356, -0.05123362510993888, -0.025288649268727044, -0.3157544051301201, -0.01633725032828882, -0.17723886956247462, -0.11180671970746199, -0.037035774664397454  …  -0.03738742981976994, -0.01999842904935537, -0.15368638561669526, -0.021219216022764106, -0.007449347678908204, -4.706245551595791, -0.07777026560473264, -1.074720053901421, -0.19743189966931754, -0.026613908055054793], 1)

Consider now another example.

In [29]:
A = zeros(n,n)+3*I
for i = 1:n-1
    A[i,i+1] = 1.4
    A[i+1,i] = 1.4
end
A

1000×1000 Array{Float64,2}:
 3.0  1.4  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 1.4  3.0  1.4  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  1.4  3.0  1.4  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  1.4  3.0  1.4  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  1.4  3.0  1.4  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  1.4  3.0  1.4  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  1.4  3.0  1.4     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  1.4  3.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.4     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  

In [30]:
eigen(A)

Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
1000-element Array{Float64,1}:
 0.20001378984134793
 0.20005515922956107
 0.20012410775715758
 0.20022063474500001
 0.20034473924231072
 0.2004964200266737
 0.2006756756040492
 0.20088250420879217
 0.20111690380366315
 0.20137887207985267
 0.20166840645700262
 0.20198550408323332
 0.20233016183516794
 ⋮
 5.7980144959167665
 5.798331593542997
 5.798621127920147
 5.798883096196337
 5.799117495791208
 5.7993243243959505
 5.799503579973327
 5.79965526075769
 5.799779365255
 5.799875892242843
 5.799944840770439
 5.799986210158653
vectors:
1000×1000 Array{Float64,2}:
 -0.000140286  -0.00028057   -0.000420851  …   0.00028057   0.000140286
  0.00028057    0.000561129   0.000841665      0.000561129  0.00028057
 -0.000420851  -0.000841665  -0.0012624        0.000841665  0.000420851
  0.000561129   0.00112217    0.00168303       0.00112217   0.000561129
 -0.0007014    -0.00140263   -0.00210351       0.00140263   0.0007014
  0.000841

In [31]:
A\(-b)

1000-element Array{Float64,1}:
 -0.1878579826528576
 -0.1319601277848616
 -0.13048433889475186
  0.13707693450040037
 -0.2867389303131553
  0.11741379322106589
 -0.048310001679996474
 -0.22024618791185208
  0.04818692637871188
 -0.07833977533531897
 -0.2199692034457113
  0.030575576265437422
 -0.21307627292025264
  ⋮
 -0.3139680285633356
  0.3505938270597254
 -0.4716762523514729
  0.4877974502429737
 -0.6312099317782925
  0.41577184912386583
 -0.33039956088464967
  0.2496204545118851
 -0.575796657707367
  0.4536778726089102
 -0.5953990030474747
  0.19535096471120988

In [32]:
x, iter, k = cg_quadratic_tol(A, b, x0, true)

([-0.1878579826205031, -0.13196012801774878, -0.13048433838637943, 0.13707693407810365, -0.28673892986952537, 0.11741379250542462, -0.048310000964845314, -0.22024618867999027, 0.048186927163437664, -0.07833977603242326  …  -0.47167625272031055, 0.48779745064722413, -0.6312099322752445, 0.41577184963519326, -0.33039956140340154, 0.24962045501992164, -0.5757966582801719, 0.4536778731100891, -0.5953990032962349, 0.19535096475246563], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.14738746893501598, -0.16575215792709863, -0.07568900556607487, -0.03405050083704051, -0.09925315313029848, -0.023010072397639938, -0.05690033959701273, -0.1301718837225351, -0.053860132936572075  …  -0.47167625272031055, 0.48779745064722413, -0.6312099322752445, 0.41577184963519326, -0.33039956140340154, 0.24962045501992164, -0.5757966582801719, 0.4536778731100891, -0.5953990032962349, 0.19535096475246563], 56)

In [33]:
M = zeros(n,n)
for i = 1:n
    M[i,i] = 1/A[i,i]
end

In [34]:
cond(A)

28.997931666407833

In [35]:
cond(M*A)

28.997931666407865

In [36]:
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.18785798268923093, -0.1319601282031274, -0.13048433813493285, 0.1370769334138529, -0.28673892912704757, 0.1174137921522756, -0.0483100003408053, -0.22024618912668525, 0.04818692731023655, -0.07833977605609946  …  -0.47167625227992177, 0.4877974507827983, -0.6312099326639125, 0.4157718500086872, -0.3303995617627221, 0.24962045537858682, -0.5757966581990931, 0.453677872844035, -0.5953990031221373, 0.19535096466892013], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.14738746893501598, -0.1657521579270986, -0.07568900556607486, -0.034050500837040504, -0.09925315313029846, -0.023010072397639938, -0.056900339597012725, -0.1301718837225351, -0.05386013293657207  …  -0.47167625227992177, 0.4877974507827983, -0.6312099326639125, 0.4157718500086872, -0.3303995617627221, 0.24962045537858682, -0.5757966581990931, 0.453677872844035, -0.5953990031221373, 0.19535096466892013], 55)

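There is no advantage: the diagonal of $A$ is constant, so $M = \frac{1}{3} I$ merely rescales $A$ and leaves the condition number unchanged (55 iterations instead of 56).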
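For this tridiagonal Toeplitz matrix the eigenvalues are known in closed form, $\lambda_k = 3 + 2 \cdot 1.4 \cos\left(\frac{k\pi}{n+1}\right)$, $k = 1, \ldots, n$. The sketch below (variable names are ours) compares the formula with a numerical computation on a `SymTridiagonal` and shows that dividing by 3 does not change the eigenvalue ratio.

```julia
using LinearAlgebra

n_demo = 1000
T = SymTridiagonal(fill(3.0, n_demo), fill(1.4, n_demo - 1))

# Closed-form eigenvalues of a tridiagonal Toeplitz matrix with diagonal a and
# off-diagonal c: a + 2c*cos(kπ/(n+1)), k = 1, ..., n.
λ_formula = sort([3.0 + 2*1.4*cos(k*π/(n_demo + 1)) for k in 1:n_demo])
λ_numeric = eigvals(T)

println(maximum(abs.(λ_formula - λ_numeric)))     # agreement up to rounding
println(λ_numeric[end] / λ_numeric[1])            # condition number, about 29

# With M = I/3, M*T = T/3 has the same eigenvalue ratio.
println((λ_numeric[end]/3) / (λ_numeric[1]/3))
```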

In [38]:
M = inv(A)

1000×1000 Array{Float64,2}:
  0.490553      -0.336899       0.231373      …   5.26731e-164  -2.45808e-164
 -0.336899       0.721926      -0.4958           -1.12871e-163   5.26731e-164
  0.231373      -0.4958         0.831055          1.89193e-163  -8.82901e-164
 -0.158901       0.340503      -0.570747         -2.92543e-163   1.3652e-163
  0.109129      -0.233848       0.391974          4.37685e-163  -2.04253e-163
 -0.0749471      0.160601      -0.269198      …  -6.45353e-163   3.01165e-163
  0.0514717     -0.110297       0.184878          9.45214e-163  -4.411e-163
 -0.0353494      0.0757488     -0.126969         -1.38011e-162   6.44049e-163
  0.0242771     -0.0520223      0.0871993         2.01216e-162  -9.39006e-163
 -0.0166729      0.0357276     -0.0598862        -2.93166e-162   1.36811e-162
  0.0114505     -0.0245368      0.0411283     …   4.26996e-162  -1.99265e-162
 -0.00786389     0.0168512     -0.0282458        -6.21827e-162   2.90186e-162
  0.00540072    -0.011573       0.01939

In [41]:
using SparseArrays

sparse(A)

1000×1000 SparseMatrixCSC{Float64,Int64} with 2998 stored entries:
  [1   ,    1]  =  3.0
  [2   ,    1]  =  1.4
  [1   ,    2]  =  1.4
  [2   ,    2]  =  3.0
  [3   ,    2]  =  1.4
  [2   ,    3]  =  1.4
  [3   ,    3]  =  3.0
  [4   ,    3]  =  1.4
  [3   ,    4]  =  1.4
  [4   ,    4]  =  3.0
  [5   ,    4]  =  1.4
  [4   ,    5]  =  1.4
  ⋮
  [996 ,  996]  =  3.0
  [997 ,  996]  =  1.4
  [996 ,  997]  =  1.4
  [997 ,  997]  =  3.0
  [998 ,  997]  =  1.4
  [997 ,  998]  =  1.4
  [998 ,  998]  =  3.0
  [999 ,  998]  =  1.4
  [998 ,  999]  =  1.4
  [999 ,  999]  =  3.0
  [1000,  999]  =  1.4
  [999 , 1000]  =  1.4
  [1000, 1000]  =  3.0

In [42]:
sparse(M)

1000×1000 SparseMatrixCSC{Float64,Int64} with 1000000 stored entries:
  [1   ,    1]  =  0.490553
  [2   ,    1]  =  -0.336899
  [3   ,    1]  =  0.231373
  [4   ,    1]  =  -0.158901
  [5   ,    1]  =  0.109129
  [6   ,    1]  =  -0.0749471
  [7   ,    1]  =  0.0514717
  [8   ,    1]  =  -0.0353494
  [9   ,    1]  =  0.0242771
  [10  ,    1]  =  -0.0166729
  [11  ,    1]  =  0.0114505
  [12  ,    1]  =  -0.00786389
  ⋮
  [988 , 1000]  =  0.00540072
  [989 , 1000]  =  -0.00786389
  [990 , 1000]  =  0.0114505
  [991 , 1000]  =  -0.0166729
  [992 , 1000]  =  0.0242771
  [993 , 1000]  =  -0.0353494
  [994 , 1000]  =  0.0514717
  [995 , 1000]  =  -0.0749471
  [996 , 1000]  =  0.109129
  [997 , 1000]  =  -0.158901
  [998 , 1000]  =  0.231373
  [999 , 1000]  =  -0.336899
  [1000, 1000]  =  0.490553

In [39]:
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.1878579826528576, -0.1319601277848617, -0.13048433889475178, 0.1370769345004003, -0.28673893031315534, 0.1174137932210659, -0.04831000167999644, -0.22024618791185221, 0.04818692637871178, -0.07833977533531897  …  -0.471676252351473, 0.4877974502429738, -0.6312099317782924, 0.4157718491238658, -0.3303995608846496, 0.24962045451188508, -0.5757966577073671, 0.4536778726089104, -0.5953990030474748, 0.19535096471120994], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.1878579826528576, -0.1319601277848617, -0.13048433889475178, 0.1370769345004003, -0.28673893031315534, 0.1174137932210659, -0.04831000167999644, -0.22024618791185221, 0.04818692637871178  …  -0.471676252351473, 0.4877974502429738, -0.6312099317782924, 0.4157718491238658, -0.3303995608846496, 0.24962045451188508, -0.5757966577073671, 0.4536778726089104, -0.5953990030474748, 0.19535096471120994], 1)
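The single iteration reported above can be checked by hand. The derivation below is a sketch that assumes `pcg_quadratic_tol` applies the preconditioner as $v = Mg$, with $g_0 = Ax_0 + b$, $d_0 = -v_0$ and an exact line search; these update rules are assumptions about that function. With $M = A^{-1}$:

\begin{align*}
v_0 &= M g_0 = A^{-1} g_0, \qquad d_0 = -v_0,\\
\alpha_0 &= \frac{g_0^T v_0}{d_0^T A d_0} = \frac{g_0^T A^{-1} g_0}{g_0^T A^{-1} A A^{-1} g_0} = 1,\\
x_1 &= x_0 + \alpha_0 d_0 = x_0 - A^{-1}(A x_0 + b) = -A^{-1} b.
\end{align*}

The first step is therefore a Newton step and lands on the solution of $Ax = -b$. The price is visible in the outputs above: `sparse(M)` stores 1000000 entries, while `sparse(A)` stores only 2998.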

Consider now an example where the diagonal varies strongly, $A_{ii} = 2 + i^2$, so that $A$ is ill-conditioned.

In [43]:
n = 1000
A = zeros(n,n)+Diagonal([2+i*i for i=1:n])

1000×1000 Array{Float64,2}:
 3.0  0.0   0.0   0.0   0.0   0.0  …       0.0       0.0       0.0  0.0
 0.0  6.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0  11.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0  18.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0  27.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0  38.0  …       0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0  …       0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 ⋮                            ⋮    ⋱

In [44]:
for i = 1:n-1
    A[i,i+1] = 1
    A[i+1,i] = 1
end
A[n,1] = 1
A[1,n] = 1
cond(A)

372201.88311699365

In [45]:
κ = cond(A)
(sqrt(κ)-1)/(sqrt(κ)+1)

0.9967271248797949
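The quantity computed above is the contraction factor $\rho = (\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$ of the classical conjugate gradient bound $\|x_k - x^*\|_A \le 2\rho^k \|x_0 - x^*\|_A$. As a rough illustration, the sketch below turns this bound into an iteration count; the helper names `cg_rate` and `cg_iteration_bound` are hypothetical and are not part of the notebook.

```julia
# Contraction factor of the classical CG error bound in the A-norm.
cg_rate(κ) = (sqrt(κ) - 1) / (sqrt(κ) + 1)

# Smallest k such that 2 * ρ^k ≤ ε. This is an upper bound on the number of
# iterations needed, not a prediction of the observed count.
cg_iteration_bound(κ, ε) = ceil(Int, log(ε / 2) / log(cg_rate(κ)))

κ = 372201.88311699365            # cond(A) reported above
println(cg_rate(κ))
println(cg_iteration_bound(κ, 1e-8))
```

With a factor this close to 1 the bound only guarantees a reduction by $10^{-8}$ after several thousand iterations, more than the dimension $n = 1000$ at which the method terminates in exact arithmetic. This is the motivation for preconditioning.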

In [46]:
A

1000×1000 Array{Float64,2}:
 3.0  1.0   0.0   0.0   0.0   0.0  …       0.0       0.0       0.0  1.0
 1.0  6.0   1.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  1.0  11.0   1.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   1.0  18.0   1.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   1.0  27.0   1.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   1.0  38.0  …       0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   1.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0  …       0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 0.0  0.0   0.0   0.0   0.0   0.0          0.0       0.0       0.0  0.0
 ⋮                            ⋮    ⋱

In [47]:
A^(-1)

1000×1000 Array{Float64,2}:
  0.353263     -0.0597876     0.00546288   …   3.53969e-13  -3.53262e-7
 -0.0597876     0.179363     -0.0163886       -5.99071e-14   5.97875e-8
  0.00546288   -0.0163886     0.092869         5.4738e-15   -5.46287e-9
 -0.00030412    0.000912359  -0.00517004      -3.04728e-16   3.04119e-10
  1.12747e-5   -3.38241e-5    0.00019167       1.12972e-17  -1.12747e-11
 -2.96856e-7    8.90567e-7   -5.04654e-6   …  -2.97449e-19   2.96855e-13
  5.82243e-9   -1.74673e-8    9.89813e-8       5.83407e-21  -5.82242e-15
 -8.82347e-11   2.64704e-10  -1.49999e-9      -8.84111e-23   8.82346e-17
  1.06319e-12  -3.18958e-12   1.80743e-11      1.06532e-24  -1.06319e-18
 -1.04243e-14   3.12729e-14  -1.77213e-13     -1.04451e-26   1.04243e-20
  8.47552e-17  -2.54265e-16   1.44084e-15  …   8.49246e-29  -8.4755e-23
 -5.80538e-19   1.74161e-18  -9.86915e-18     -5.81699e-31   5.80537e-25
  3.39506e-21  -1.01852e-20   5.7716e-20       3.40185e-33  -3.39505e-27
  ⋮                        

In [48]:
M = zeros(n,n)+I
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.21331767327169832, -0.10295953980957633, -0.022077330804734632, -0.009287982958805029, -0.018515217071553302, -0.0011137106163345988, -0.004965379908881404, -0.01034542666420233, -0.002472839310021539, -0.004433456670584237  …  -2.6207318284094386e-7, -1.0589177085191018e-7, -6.375134046673504e-7, -1.0013845483786068e-7, -6.020231685386995e-8, -5.442191665420026e-7, -7.490742832624839e-7, -2.67345468061417e-7, -8.79049036196302e-7, 5.186863461059312e-7], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -2.202991217714471e-6, -2.4774870677209154e-6, -1.1313188003324916e-6, -5.089506919740452e-7, -1.4835306302255782e-6, -3.43930103266275e-7, -8.50485793147435e-7, -1.9456709495468142e-6, -8.050440156246742e-7  …  -2.6207318284094386e-7, -1.0589177085191018e-7, -6.375134046673504e-7, -1.0013845483786068e-7, -6.020231685386995e-8, -5.442191665420026e-7, -7.490742832624839e-7, -2.67345468061417e-7, -8.79049036196302e-7, 5.18686

In [50]:
M = zeros(n,n)
for i = 1:n
    M[i,i] = 1/A[i,i]
end
cond(A*M), cond(A)

(1.926360732450869, 372201.88311699365)
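One way to see why the diagonal scaling works so well here: with $D = \mathrm{diag}(A)$ and $M = D^{-1}$, the product $AM$ is similar to the symmetric matrix $D^{-1/2} A D^{-1/2}$, which has a unit diagonal and small off-diagonal entries. The sketch below checks this on a smaller instance of the same matrix (`n = 200` is an assumption made to keep it fast). The eigenvalue ratio it prints is not the same quantity as `cond(A*M)`, but it is the one that governs the preconditioned iteration.

```julia
using LinearAlgebra

# Smaller instance of the matrix built above: diagonal 2 + i^2,
# ones on the first off-diagonals and in the two corners.
n = 200
A = zeros(n, n) + Diagonal([2.0 + i * i for i = 1:n])
for i = 1:n-1
    A[i, i+1] = 1
    A[i+1, i] = 1
end
A[n, 1] = 1
A[1, n] = 1

# Symmetrically scaled matrix D^{-1/2} A D^{-1/2}; it shares its
# eigenvalues with A*M when M = D^{-1}.
Dhalf = Diagonal(1 ./ sqrt.(diag(A)))
S = Symmetric(Dhalf * A * Dhalf)
λ = eigvals(S)                    # real, sorted in increasing order

println(cond(A))
println(λ[end] / λ[1])
```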

In [51]:
M

1000×1000 Array{Float64,2}:
 0.333333  0.0       0.0        …  0.0         0.0       0.0
 0.0       0.166667  0.0           0.0         0.0       0.0
 0.0       0.0       0.0909091     0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0        …  0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0        …  0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 ⋮                              ⋱                        
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0

In [52]:
A*M

1000×1000 Array{Float64,2}:
 1.0       0.166667  0.0        …  0.0         0.0       9.99998e-7
 0.333333  1.0       0.0909091     0.0         0.0       0.0
 0.0       0.166667  1.0           0.0         0.0       0.0
 0.0       0.0       0.0909091     0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0        …  0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0        …  0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0       0.0
 ⋮                              ⋱                        
 0.0       0.0       0.0           0.0         0.0       0.0
 0.0       0.0       0.0           0.0         0.0   

In [53]:
x, iter, k = pcg_quadratic_tol(A, b, x0, M, true)

([-0.21609033426925445, -0.10004709265979911, -0.025186743274806068, -0.007188218503567845, -0.018307097261454637, -0.0024499279914479088, -0.0054227660835474084, -0.00988435772999367, -0.0031204961855510697, -0.004574025792436715  …  -2.4568273559223287e-7, -8.195326728476134e-8, -6.375254370756678e-7, -1.0013485784837126e-7, -6.024970378849954e-8, -5.239950641114213e-7, -7.47246773058922e-7, -2.797560265153407e-7, -8.793123132800291e-7, -3.141443128966133e-8], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.2040447230221838, -0.11473449336956834, -0.028577627802445247, -0.007856643741225954, -0.01526745363834863, -0.002514898431194379, -0.004633730205006008, -0.008191422289019035, -0.0026951031403350876  …  -2.4568273559223287e-7, -8.195326728476134e-8, -6.375254370756678e-7, -1.0013485784837126e-7, -6.024970378849954e-8, -5.239950641114213e-7, -7.47246773058922e-7, -2.797560265153407e-7, -8.793123132800291e-7, -3.141443

In [54]:
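function pcg_quadratic(A:: Matrix, b:: Vector, x0:: Vector, M:: Matrix,
                       trace:: Bool = false, tol = 1e-8)
    # Preconditioned conjugate gradient for min 0.5*x'A*x + b'x.
    # Here M approximates A and is applied through a linear solve (v = M\g).
    n = length(x0)
    x = x0
    if (trace)
        iter = [ x ]
    end
    g = b+A*x      # gradient at x
    v = M\g        # preconditioned gradient
    d = -v         # first search direction
    k = 0
    
    tol2 = tol*tol

    β = 0.0

    # Stop when g'*inv(M)*g <= tol^2 or after n+1 iterations.
    gv = dot(g,v)
    while ((gv > tol2) && (k <= n))
#    while ((dot(g,g) > tol2) && (k <= n))
        Ad = A*d
        normd = dot(d,Ad)
        #gv = dot(g,v)
        α = gv/normd   # exact line search along d
        x += α*d
        if (trace)
            # vcat appends the entries of x to iter one by one
            iter = [ iter; x ]
        end
        g += α*Ad      # gradient update without recomputing A*x
        v = M\g
        gvold = gv
        gv = dot(g,v)
        β = gv/gvold
        d = -v+β*d     # next direction, A-conjugate to the previous ones
        k += 1
    end

    if (trace)
        # the final iterate is appended once more
        iter = [ iter; x ]
        return x, iter, k
    end

    return x, k
end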

pcg_quadratic (generic function with 3 methods)
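As the outputs show, the trace returned by `pcg_quadratic` is a flat `Any` array, because `[ iter; x ]` concatenates the components of `x`. The following is a minimal sketch of a variant that keeps one vector per iterate and accepts any preconditioner object supporting `P \ g`. The name `pcg_sketch`, its keyword arguments and the right-hand side `b` in the usage lines are hypothetical; the updates follow the function above.

```julia
using LinearAlgebra

# Hypothetical variant of pcg_quadratic: P is any object supporting `P \ g`
# (a Diagonal, a Cholesky factorization, a dense matrix, ...).
function pcg_sketch(A, b, x0, P; tol = 1e-8, maxiter = length(x0))
    x = copy(x0)
    trace = [copy(x)]            # one vector per iterate
    g = b + A * x                # gradient of 0.5*x'A*x + b'x
    v = P \ g                    # preconditioned gradient
    d = -v
    gv = dot(g, v)
    k = 0
    while gv > tol^2 && k < maxiter
        Ad = A * d
        α = gv / dot(d, Ad)      # exact line search along d
        x += α * d
        push!(trace, copy(x))    # x stays a single element of the trace
        g += α * Ad
        v = P \ g
        gvold = gv
        gv = dot(g, v)
        d = -v + (gv / gvold) * d
        k += 1
    end
    return x, trace, k
end

# Usage on the 3×3 matrix from the beginning of the notebook,
# with an arbitrary right-hand side and a Jacobi preconditioner.
A = [3.0 1 0; 1 2 2; 0 2 4]
b = [1.0, 2.0, 3.0]
x, trace, k = pcg_sketch(A, b, zeros(3), Diagonal(diag(A)))
println(norm(A * x + b), "  ", length(trace), "  ", k)
```

Passing the preconditioner as a generic object lets the caller decide how the solve is carried out, which matters for the cost per iteration.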

In [57]:
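function ichol(A:: Matrix)
    # Builds a lower-triangular C with the sparsity pattern of tril(A).
    # A is read but never updated, so on exit C[k,k] = sqrt(A[k,k]) and
    # C[i,k] = A[i,k]/A[k,k] for the nonzero entries below the diagonal.

    n = size(A,1)
    C = zeros(n,n)+I
    
    for k=1:n
        C[k,k] = sqrt(A[k,k])
        for i=(k+1):n
            if (A[i,k] != 0)
                C[i,k] = A[i,k]/A[k,k]    
            end
        end
        for j=(k+1):n
            for i=j:n
                if (A[i,j] != 0)
                    # temporary value, overwritten once k reaches j
                    C[i,j] = A[i,j]-A[i,k]*A[j,k]
                end
            end
        end
    end

    return C
end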

ichol (generic function with 1 method)

In [55]:
C = cholesky(A)
C.L

1000×1000 LowerTriangular{Float64,Array{Float64,2}}:
 1.73205    ⋅         ⋅          ⋅          …     ⋅           ⋅            ⋅ 
 0.57735   2.38048    ⋅          ⋅                ⋅           ⋅            ⋅ 
 0.0       0.420084  3.28991     ⋅                ⋅           ⋅            ⋅ 
 0.0       0.0       0.303959   4.23174           ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.23631           ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0         …     ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0               ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0               ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0               ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0               ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0         …     ⋅           ⋅            ⋅ 
 0.0       0.0       0.0        0.0               ⋅           ⋅            ⋅ 
 0.0       

In [56]:
M = C.L*C.U

1000×1000 Array{Float64,2}:
 3.0   1.0          0.0          …       0.0       0.0   1.0
 1.0   6.0          1.0                  0.0       0.0   0.0
 0.0   1.0         11.0                  0.0       0.0   0.0
 0.0   0.0          1.0                  0.0       0.0  -8.67362e-19
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0          …       0.0       0.0   0.0
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0          …       0.0       0.0   0.0
 0.0   0.0          0.0                  0.0       0.0   1.2326e-32
 0.0   0.0          0.0                  0.0       0.0   0.0
 ⋮                               ⋱                      
 0.0   0.0          0.0                  0.0       0.0   0.0
 0.0   0.0          0.0                  0.0  

In [58]:
x, iter, k = pcg_quadratic(A, b, x0, M, true)

([-0.21609033426004487, -0.1000470926628109, -0.025186743284327757, -0.007188218492085147, -0.01830709724800843, -0.0024499279411004595, -0.005422766117388938, -0.009884357679153724, -0.003120496215819736, -0.004574025817711981  …  -2.4568273534966826e-7, -8.195326720421051e-8, -6.37525436446165e-7, -1.0013485774978416e-7, -6.024970372926838e-8, -5.239950635943015e-7, -7.472467723213303e-7, -2.797560262397645e-7, -8.793123124116823e-7, -3.1414433731610125e-8], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.21609033426004487, -0.1000470926628109, -0.025186743284327757, -0.007188218492085147, -0.01830709724800843, -0.0024499279411004595, -0.005422766117388938, -0.009884357679153724, -0.003120496215819736  …  -2.4568273534966826e-7, -8.195326720421051e-8, -6.37525436446165e-7, -1.0013485774978416e-7, -6.024970372926838e-8, -5.239950635943015e-7, -7.472467723213303e-7, -2.797560262397645e-7, -8.793123124116823e-7, -3.14144337

In [59]:
C = ichol(A)

1000×1000 Array{Float64,2}:
 1.73205   0.0       0.0        …    0.0           0.0          0.0
 0.333333  2.44949   0.0             0.0           0.0          0.0
 0.0       0.166667  3.31662         0.0           0.0          0.0
 0.0       0.0       0.0909091       0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0        …    0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0        …    0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 0.0       0.0       0.0             0.0           0.0          0.0
 ⋮                              ⋱                            
 0.0       0.0       0.0  

In [60]:
M=C*C'

1000×1000 Array{Float64,2}:
 3.0      0.57735    0.0        0.0       …       0.0       0.57735
 0.57735  6.11111    0.408248   0.0               0.0       0.111111
 0.0      0.408248  11.0278     0.301511          0.0       0.0
 0.0      0.0        0.301511  18.0083            0.0       0.0
 0.0      0.0        0.0        0.235702          0.0       0.0
 0.0      0.0        0.0        0.0       …       0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0      0.0        0.0        0.0       …       0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0      0.0        0.0        0.0               0.0       0.0
 ⋮                                        ⋱                 
 0.0      0.0        0.0        0.0               0.0       0.0
 0.0  

In [62]:
norm(M-A)

44.413128608038384
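The norm above indicates that `C*C'` is far from $A$. This appears to come from `ichol` dividing by `A[k,k]` instead of the pivot and never updating a working copy of the matrix. For comparison, here is a sketch of the textbook zero-fill incomplete Cholesky factorization, IC(0). The name `ichol0` is hypothetical, storage is dense for simplicity, and the square root may fail for matrices where IC(0) breaks down.

```julia
using LinearAlgebra

# Zero-fill incomplete Cholesky: entries of the factor are computed only
# where tril(A) is nonzero. Hypothetical helper, dense storage.
function ichol0(A::AbstractMatrix)
    n = size(A, 1)
    L = Matrix{Float64}(tril(A))       # working copy, updated in place
    for k = 1:n
        L[k, k] = sqrt(L[k, k])        # pivot uses the updated diagonal
        for i = (k+1):n
            if L[i, k] != 0
                L[i, k] /= L[k, k]     # scale column k by the pivot
            end
        end
        for j = (k+1):n
            for i = j:n
                if L[i, j] != 0        # keep the sparsity pattern of A
                    L[i, j] -= L[i, k] * L[j, k]
                end
            end
        end
    end
    return L
end

# Usage: for a tridiagonal matrix the exact factor has no fill-in,
# so IC(0) should coincide with the Cholesky factor.
T = Matrix(SymTridiagonal([2.0 + i * i for i = 1:6], ones(5)))
L = ichol0(T)
println(norm(L * L' - T))
```

On the matrix of this notebook the two corner entries create fill-in in the last row of the exact factor, so IC(0) is only an approximation there.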

In [63]:
x, iter, k = pcg_quadratic(A, b, x0, M, true)

([-0.21609033438712855, -0.1000470926186227, -0.02518674324365346, -0.007188218610322295, -0.018307096981803977, -0.0024499280377395954, -0.005422765870966465, -0.009884357610898847, -0.003120496235790187, -0.004574025813135988  …  -2.456827371951104e-7, -8.195326782012867e-8, -6.375254412348656e-7, -1.0013485850219587e-7, -6.02497041820625e-8, -5.239950675304856e-7, -7.472467779344176e-7, -2.797560283417163e-7, -8.793123190156756e-7, -3.141443486675703e-8], Any[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], -0.21083317406502208, -0.10584200646551416, -0.028157152437304954, -0.008201614420679933, -0.017206412772396002, -0.002740115834063671, -0.005216241771747509, -0.009264897238713165, -0.003034362106614687  …  -2.456827371951104e-7, -8.195326782012867e-8, -6.375254412348656e-7, -1.0013485850219587e-7, -6.02497041820625e-8, -5.239950675304856e-7, -7.472467779344176e-7, -2.797560283417163e-7, -8.793123190156756e-7, -3.1414434866

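An efficient implementation would store $A$ as a sparse matrix and compute $v$ with a dedicated routine (for example, triangular solves with a factor computed once) instead of solving the dense system `M\g` at every iteration.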
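A minimal sketch of what this could look like, under the assumption that the preconditioner is either the diagonal of $A$ or a factorization computed once before the iterations start. The vector `g` is a stand-in for a gradient, and the dense Cholesky factorization is used only for illustration.

```julia
using LinearAlgebra, SparseArrays

n = 1000
# Same matrix as above, stored sparsely: only the nonzeros are kept.
A = spdiagm(0 => [2.0 + i * i for i = 1:n], 1 => ones(n - 1), -1 => ones(n - 1))
A[n, 1] = 1.0
A[1, n] = 1.0
println(nnz(A))

g = A * ones(n)                        # stand-in for a gradient vector

# Jacobi: a Diagonal wrapper turns the solve into n divisions.
Mjac = Diagonal(Vector(diag(A)))
v1 = Mjac \ g

# Factorization-based preconditioner: factor once, then reuse the factors
# at every iteration (two triangular solves per application).
F = cholesky(Symmetric(Matrix(A)))
v2 = F \ g
```

Both `Mjac` and `F` avoid refactorizing a dense matrix each time $v$ is needed, which is what `M\g` does when `M` is a plain `Matrix`.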

In [64]:
using BenchmarkTools

In [65]:
@benchmark pcg_quadratic(A, b, x0, M, true)

BenchmarkTools.Trial: 
  memory estimate:  54.23 MiB
  allocs estimate:  7093
  --------------
  minimum time:     1.426 s (0.00% GC)
  median time:      2.661 s (0.00% GC)
  mean time:        2.285 s (0.54% GC)
  maximum time:     2.768 s (0.00% GC)
  --------------
  samples:          3
  evals/sample:     1

In [66]:
M = zeros(n,n)+I
@benchmark pcg_quadratic(A, b, x0, M, true)

BenchmarkTools.Trial: 
  memory estimate:  3.83 GiB
  allocs estimate:  1014019
  --------------
  minimum time:     9.287 s (10.24% GC)
  median time:      9.287 s (10.24% GC)
  mean time:        9.287 s (10.24% GC)
  maximum time:     9.287 s (10.24% GC)
  --------------
  samples:          1
  evals/sample:     1

In [67]:
@benchmark A\(-b)

BenchmarkTools.Trial: 
  memory estimate:  7.65 MiB
  allocs estimate:  5
  --------------
  minimum time:     91.050 ms (0.00% GC)
  median time:      169.786 ms (0.00% GC)
  mean time:        218.153 ms (0.66% GC)
  maximum time:     798.786 ms (4.14% GC)
  --------------
  samples:          23
  evals/sample:     1