# 12.5.2 Shifting the Inverse Power Method

With this notebook, we demonstrate how the Inverse Power Method can be accelerated by shifting the matrix.

<font color=red> Be sure to make a copy!!!! </font>

We start by creating a matrix with known eigenvalues and eigenvectors.

How do we do this?  
<ul>
  <li>
    We want a matrix that is not defective (that is, one with a full set of linearly independent eigenvectors), since otherwise the Shifted Inverse Power Method may not work.
  </li>
  <li>
    Hence, $ A = V \Lambda V^{-1} $ for some diagonal matrix $ \Lambda $ and nonsingular matrix $ V $.  The eigenvalues are then on the diagonal of $ \Lambda $ and the eigenvectors are the columns of $ V $.
    </li>
    <li>
    So, let's pick the eigenvalues for the diagonal of $ \Lambda $ and let's pick a random matrix $ V $ (in the hopes that it has linearly independent columns) and then let's see what happens.  
    </li>
    </ul>

<font color=red> Experiment by changing the eigenvalues!  What happens if you make the second entry on the diagonal equal to -4?  Or what if you set 2 to -1? </font>

In [11]:
using LinearAlgebra

Λ = Diagonal([4., 3., 2., 1.])

λ1 = Λ[1, 1]

V = rand(4, 4)

# normalize each column of V to have length one
for j in 1:4
    V[:, j] = V[:, j] / norm(V[:, j])
end

A = V * Λ * inv(V)

println("Λ = ")
Λ

Λ = 


4×4 Diagonal{Float64,Array{Float64,1}}:
 4.0   ⋅    ⋅    ⋅ 
  ⋅   3.0   ⋅    ⋅ 
  ⋅    ⋅   2.0   ⋅ 
  ⋅    ⋅    ⋅   1.0

In [12]:
println("V = ")
V

V = 


4×4 Array{Float64,2}:
 0.401898  0.413966  0.404508  0.661614
 0.827046  0.436008  0.390839  0.488107
 0.113078  0.552455  0.55678   0.160282
 0.376412  0.577341  0.61124   0.546195

In [13]:
println("A = ")
A

A = 


4×4 Array{Float64,2}:
 4.56531  2.82118  5.40654   -8.42643
 1.30818  6.17767  4.33174   -7.4828 
 7.64106  1.48573  9.88757  -13.1915 
 5.59424  2.8762   7.78272  -10.6306 
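
Before iterating, it can be reassuring to confirm that the construction $ A = V \Lambda V^{-1} $ really produces the intended eigenpairs. The following is a minimal sketch of such a check. It assumes a fixed, nonsingular `V_check` (a hypothetical choice made only so the result is reproducible; the notebook itself uses `V = rand(4, 4)`), and the names `Λ_check`, `V_check`, and `A_check` are introduced here so that the notebook's own variables are not overwritten.

```julia
using LinearAlgebra

# Hypothetical fixed V, chosen only so that the check is reproducible;
# the notebook itself draws V at random.
Λ_check = Diagonal([4.0, 3.0, 2.0, 1.0])
V_check = [2.0 1.0 0.0 0.0;
           1.0 2.0 1.0 0.0;
           0.0 1.0 2.0 1.0;
           0.0 0.0 1.0 2.0]
A_check = V_check * Λ_check * inv(V_check)

# Each column of V should satisfy A v_j = λ_j v_j, which is the same as A V = V Λ.
println("A V ≈ V Λ ?  ", A_check * V_check ≈ V_check * Λ_check)

# The computed eigenvalues should match the diagonal of Λ, up to ordering and roundoff.
println("eigenvalues: ", sort(real.(eigvals(A_check))))
```

The same two checks can be run on the random `A`, `V`, and `Λ` from the cells above; if `V` happens to be badly conditioned, the agreement will be correspondingly less tight.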

The idea is as follows:

The eigenvalues of $ A $ are $ \lambda_1, \ldots, \lambda_4 $ with

$$
\vert \lambda_1 \vert > \vert \lambda_2 \vert > \vert \lambda_3 \vert > \vert \lambda_4 \vert > 0
$$

and how fast the (unshifted) Inverse Power Method converges to the direction of $ v_4 $ depends on the ratio

$$
\left\vert \frac{\lambda_4}{\lambda_3} \right\vert .
$$
Now, if you pick a value $ \mu $ close to $ \lambda_4 $ and apply the Inverse Power Method to $ A - \mu I $ (which is known as shifting the matrix, or its spectrum, by $ \mu $), you can greatly improve the ratio, which becomes
$$
\left\vert \frac{\lambda_4-\mu}{\lambda_3-\mu} \right\vert .
$$
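
The reason the shift changes the ratio can be spelled out in one step. If $ A v_i = \lambda_i v_i $ and $ \mu $ is not an eigenvalue of $ A $, then

$$
( A - \mu I ) v_i = ( \lambda_i - \mu ) v_i
\quad \mbox{and hence} \quad
( A - \mu I )^{-1} v_i = \frac{1}{\lambda_i - \mu} v_i .
$$

So the shifted matrix has the same eigenvectors as $ A $, and each multiplication by $ ( A - \mu I )^{-1} $ scales the component in the direction of $ v_i $ by $ 1 / ( \lambda_i - \mu ) $. As a worked example with the eigenvalues chosen above ($ \lambda_3 = 2 $, $ \lambda_4 = 1 $) and $ \mu = 0.8 $:

$$
\left\vert \frac{\lambda_4}{\lambda_3} \right\vert = \frac{1}{2}
\quad \mbox{versus} \quad
\left\vert \frac{\lambda_4 - \mu}{\lambda_3 - \mu} \right\vert = \frac{0.2}{1.2} = \frac{1}{6} .
$$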

Try different values of $ \mu$.  What if you pick $ \mu \approx 2 $?  
What if you pick $ \mu = 0.8 $?
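
To get a feel for these experiments before running the iteration, one can tabulate the ratio for a few shifts. The sketch below is only an illustration: `shifted_ratio` is a hypothetical helper (not part of the notebook), and it assumes the ratio that governs convergence is the distance from $ \mu $ to the closest eigenvalue divided by the distance to the second closest one.

```julia
# Hypothetical helper (not part of the notebook): ratio that governs how fast
# shifted inverse iteration converges for a given spectrum and shift μ.
function shifted_ratio(eigenvalues, μ)
    d = sort(abs.(eigenvalues .- μ))   # distances from the shift, smallest first
    return d[1] / d[2]                 # closest distance over second closest
end

eigenvalues = [4.0, 3.0, 2.0, 1.0]
for μ in (0.0, 0.8, 1.9)
    println("μ = $μ: ratio = $(shifted_ratio(eigenvalues, μ))")
end
```

For these three shifts the ratios are roughly 0.5, 0.17, and 0.11. Note that for $ \mu = 1.9 $ the closest eigenvalue is $ 2 $, not $ 1 $, so the iteration would then head toward $ v_3 $ rather than $ v_4 $.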

In [14]:
# Pick a random starting vector and a value of μ
x = rand( 4 )
μ = 0.8

# We should really compute a factorization of A - μ * I, but let's be lazy and compute
# its inverse explicitly
Ainv = inv( A - μ * I )

for i in 1:10
    x = Ainv * x 
    
    # normalize x to length one
    x = x / sqrt( transpose( x ) * x )
    
    # Notice we compute the Rayleigh quotient with matrix A, not Ainv.  This works because
    # every eigenvector of A is also an eigenvector of Ainv = inv( A - μ * I )
    println("Rayleigh quotient with vector x: $(x'*A*x / (x'x))")
    println("inner product of x with v4     : $(x' * V[:, 4])  \n" )
end

Rayleigh quotient with vector x: 0.9535046611957441
inner product of x with v4     : 0.9994262074827158  

Rayleigh quotient with vector x: 1.008308950253941
inner product of x with v4     : 0.9999506104387562  

Rayleigh quotient with vector x: 1.0028995250926407
inner product of x with v4     : 0.9999978787132767  

Rayleigh quotient with vector x: 1.0006212033662651
inner product of x with v4     : 0.9999999281598289  

Rayleigh quotient with vector x: 1.0001160340626962
inner product of x with v4     : 0.9999999977974049  

Rayleigh quotient with vector x: 1.0000204734695066
inner product of x with v4     : 0.9999999999355833  

Rayleigh quotient with vector x: 1.0000035153220281
inner product of x with v4     : 0.9999999999981608  

Rayleigh quotient with vector x: 1.0000005952568074
inner product of x with v4     : 0.9999999999999483  

Rayleigh quotient with vector x: 1.0000001000613086
inner product of x with v4     : 0.9999999999999987  

Rayleigh quotient with vector x: 1.000

In the above, 
 <ul>
 <li>
 The Rayleigh quotient should converge to 1.0 (quickly if $ \mu \approx 1 $).
 </li>
 <li>
 The inner product of $ x $ and the last column of $ V $, $ v_{n} $, should converge to 1 or -1 since eventually $ x $ should be in the direction of $ v_{n} $ (or in the opposite direction).
 </li>
 </ul>
 
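 This time, if you change the "2" on the diagonal to "-1", the iteration still converges to $ v_{n} $: the eigenvalues of $ A - \mu I $ then include $ -1 - \mu $ and $ 1 - \mu $, and for $ \mu = 0.8 $ we have $ \vert -1 - \mu \vert = 1.8 $, which is much larger than $ \vert 1 - \mu \vert = 0.2 $.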

 You can check this by looking at $ ( I - V_R ( V_R^T V_R )^{-1} V_R^T ) x $, where $ V_R $ is the matrix with $ v_3 $ and $ v_4 $ as its columns, to see if the component of $ x $ orthogonal to $ {\cal C}( V_R ) $ converges to zero. This is seen in the following code block:


In [15]:
VR = V[:, 3:4]

4×2 Array{Float64,2}:
 0.404508  0.661614
 0.390839  0.488107
 0.55678   0.160282
 0.61124   0.546195

In [16]:
w = x - VR * inv(VR'VR) * VR' * x

4-element Array{Float64,1}:
  2.2393198406689407e-13
 -2.0349277818354494e-12
 -6.392109064279339e-13 
  1.7346124536743446e-12
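
The iteration cell above notes that one should really factor the shifted matrix rather than invert it. A minimal sketch of that variant follows, assuming an LU factorization with partial pivoting is an acceptable choice. The function `shifted_inverse_iteration`, its `iterations` keyword, and the fixed matrix `V_fixed` are hypothetical names introduced for illustration; the notebook builds its matrix from a random `V`.

```julia
using LinearAlgebra

# Hypothetical helper (not from the notebook): shifted inverse iteration that
# factors A - μI once and reuses the factors in every step.
function shifted_inverse_iteration(A, μ, x0; iterations = 20)
    F = lu(A - μ * I)          # LU factorization with partial pivoting
    x = x0 / norm(x0)
    for _ in 1:iterations
        x = F \ x              # solve (A - μI) y = x instead of multiplying by an inverse
        x = x / norm(x)        # normalize to length one
    end
    return x, dot(x, A * x)    # x has unit length, so this is the Rayleigh quotient
end

# Stand-in test matrix with eigenvalues 4, 3, 2, 1 and a fixed, nonsingular V
# (an assumption made for reproducibility; the notebook uses V = rand(4, 4)).
V_fixed = [2.0 1.0 0.0 0.0;
           1.0 2.0 1.0 0.0;
           0.0 1.0 2.0 1.0;
           0.0 0.0 1.0 2.0]
A_fixed = V_fixed * Diagonal([4.0, 3.0, 2.0, 1.0]) * inv(V_fixed)

x_lu, ρ_lu = shifted_inverse_iteration(A_fixed, 0.8, ones(4))
println("Rayleigh quotient: ", ρ_lu)
```

Factoring once costs about as much as one inversion, but each subsequent step is then a pair of triangular solves, and the solves are generally better behaved numerically than multiplying by an explicitly formed inverse.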