# 12.5.2 Shifting the Inverse Power Method

With this notebook, we demonstrate how the Inverse Power Method can be accelerated by shifting the matrix.

<font color=red> Be sure to make a copy!!!! </font>

We start by creating a matrix with known eigenvalues and eigenvectors.

How do we do this?  
<ul>
  <li>
    We want a matrix that is not defective (that is, one with a full set of linearly independent eigenvectors), since otherwise the Shifted Inverse Power Method may not work.
  </li>
  <li>
    Hence, $ A = V \Lambda V^{-1} $ for some diagonal matrix $ \Lambda $ and nonsingular matrix $ V $.  The eigenvalues are then on the diagonal of $ \Lambda $ and the eigenvectors are the columns of $ V $.
    </li>
    <li>
    So, let's pick the eigenvalues for the diagonal of $ \Lambda $ and let's pick a random matrix $ V $ (in the hopes that it has linearly independent columns) and then let's see what happens.  
    </li>
    </ul>

<font color=red> Experiment by changing the eigenvalues!  What happens if you make the second entry on the diagonal equal to -4?  Or what if you set 2 to -1? </font>

In [1]:
import numpy as np
import laff
import flame

Lambda = np.matrix( ' 4., 0., 0., 0;\
                      0., 3., 0., 0;\
                      0., 0., 2., 0;\
                      0., 0., 0., 1' )

lambda0 = Lambda[ 0,0 ]

V = np.matrix( np.random.rand( 4,4 ) )

# normalize the columns of V to have length one

for j in range( 0, 4 ):
    V[ :, j ] = V[ :, j ] / np.sqrt( np.transpose( V[:,j] ) * V[:, j ] )

A = V * Lambda * np.linalg.inv( V )

print( 'Lambda = ' )
print( Lambda)

print( 'V = ' )
print( V )

print( 'A = ' )
print( A )


Lambda = 
[[4. 0. 0. 0.]
 [0. 3. 0. 0.]
 [0. 0. 2. 0.]
 [0. 0. 0. 1.]]
V = 
[[0.31341969 0.34670168 0.24501153 0.6549992 ]
 [0.62849999 0.75486041 0.74855915 0.24098837]
 [0.53506434 0.55412283 0.31086899 0.24964172]
 [0.46953383 0.05414432 0.53196713 0.67125231]]
A = 
[[-0.28770473 -0.83298778  3.15441881  0.3824365 ]
 [-2.42317374  0.03993437  5.48303975  0.67001039]
 [-2.31256966 -1.74510842  6.75151002  0.74407972]
 [-3.6980646  -1.82759833  4.75496949  3.49626034]]
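
As a sanity check on the construction above, the following minimal sketch rebuilds a matrix the same way and confirms that it has the intended eigenvalues and eigenvectors. It uses plain NumPy arrays and a fixed random seed, both of which are assumptions made here for repeatability and are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed (an assumption, for repeatability)

eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])
Lambda = np.diag(eigenvalues)

V = rng.random((4, 4))
V = V / np.linalg.norm(V, axis=0)        # scale each column to unit length

A = V @ Lambda @ np.linalg.inv(V)

# Each column v_j of V should satisfy A v_j = lambda_j v_j.
residual = np.linalg.norm(A @ V - V @ Lambda)
print('|| A V - V Lambda || =', residual)

# A large condition number would signal nearly dependent columns in V.
print('cond( V ) =', np.linalg.cond(V))

# The computed eigenvalues should match the chosen ones up to roundoff.
computed = np.sort(np.linalg.eigvals(A).real)[::-1]
print('computed eigenvalues:', computed)
```

If the random $ V $ happens to be badly conditioned, the residual grows and the hope that $ V $ has linearly independent columns is not well met; drawing a new $ V $ is the simplest remedy.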


The idea is as follows:

The eigenvalues of $ A $ are $ \lambda_0, \ldots, \lambda_3 $ with

$$
\vert \lambda_0 \vert > \vert \lambda_1 \vert > \vert \lambda_2 \vert > \vert \lambda_3 \vert > 0
$$

and how fast the (unshifted) Inverse Power Method converges depends on the ratio 

$$
\left\vert \frac{\lambda_3}{\lambda_2} \right\vert .
$$
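Now, if you pick a value $ \mu $ close to $ \lambda_3 $ and apply the Inverse Power Method to $ A - \mu I $ (which is known as shifting the matrix/spectrum by $ \mu $), so that each step multiplies by $ ( A - \mu I )^{-1} $ instead of $ A^{-1} $, you can greatly improve the ratio, which becomes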
$$
\left\vert \frac{\lambda_3-\mu}{\lambda_2-\mu} \right\vert .
$$

Try different values of $ \mu$.  What if you pick $ \mu \approx 2 $?  
What if you pick $ \mu = 0.8 $?
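
To get a feel for these ratios before running the iteration, the sketch below evaluates them for the eigenvalues $ 4, 3, 2, 1 $ and a few shifts. The helper `convergence_ratio` is a hypothetical name introduced here for illustration; it divides the smallest distance $ \vert \lambda_i - \mu \vert $ by the second smallest, which reduces to the ratio above when $ \lambda_3 $ is the eigenvalue closest to $ \mu $.

```python
import numpy as np

def convergence_ratio(eigenvalues, mu):
    """Smallest |lambda_i - mu| divided by the second smallest (hypothetical helper)."""
    distances = np.sort(np.abs(np.asarray(eigenvalues, dtype=float) - mu))
    return distances[0] / distances[1]

eigenvalues = [4.0, 3.0, 2.0, 1.0]

for mu in [0.0, 0.5, 0.8, 0.99, 1.9]:
    print('mu = %4.2f   ratio = %6.4f' % (mu, convergence_ratio(eigenvalues, mu)))
```

With $ \mu = 0 $ the ratio is $ 1/2 $, while $ \mu = 0.8 $ gives $ 0.2/1.2 = 1/6 $. For $ \mu = 1.9 $ the eigenvalue closest to the shift is $ 2 $, so the iteration is drawn toward $ v_2 $ rather than $ v_3 $.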

In [2]:
# Pick a random starting vector

x = np.matrix( np.random.rand( 4,1 ) )

# We should really compute a factorization of A, but let's be lazy, and compute the inverse
# explicitly

mu = 0.8

Ainv = np.linalg.inv( A - mu * np.eye( 4, 4 ) )

for i in range(0,10):
    x = Ainv * x 
    
    # normalize x to length one
    x = x / np.sqrt( np.transpose( x ) * x )
    
    # Notice we compute the Rayleigh quotient with matrix A, not Ainv.  This works because
    # A and Ainv = inv( A - mu I ) have the same eigenvectors, and we want the eigenvalue of A.
    
    print( 'Rayleigh quotient with vector x:', np.transpose( x ) * A * x / ( np.transpose( x ) * x ))
    print( 'inner product of x with v3     :', np.transpose( x ) * V[ :, 3 ] )
    print( ' ' )

Rayleigh quotient with vector x: [[1.22828794]]
inner product of x with v3     : [[0.9952281]]
 
Rayleigh quotient with vector x: [[1.00290328]]
inner product of x with v3     : [[0.99992569]]
 
Rayleigh quotient with vector x: [[0.99806554]]
inner product of x with v3     : [[0.99999686]]
 
Rayleigh quotient with vector x: [[0.99949124]]
inner product of x with v3     : [[0.99999988]]
 
Rayleigh quotient with vector x: [[0.9999002]]
inner product of x with v3     : [[1.]]
 
Rayleigh quotient with vector x: [[0.99998212]]
inner product of x with v3     : [[1.]]
 
Rayleigh quotient with vector x: [[0.99999691]]
inner product of x with v3     : [[1.]]
 
Rayleigh quotient with vector x: [[0.99999948]]
inner product of x with v3     : [[1.]]
 
Rayleigh quotient with vector x: [[0.99999991]]
inner product of x with v3     : [[1.]]
 
Rayleigh quotient with vector x: [[0.99999999]]
inner product of x with v3     : [[1.]]
 


In the above, 
 <ul>
 <li>
 The Rayleigh quotient should converge to 1.0 (quickly if $ \mu \approx 1 $).
 </li>
 <li>
 The inner product of $ x $ and the last column of $ V $, $ v_{n-1} $, should converge to 1 or -1 since eventually $ x $ should be in the direction of $ v_{n-1} $ (or in the opposite direction).
 </li>
 </ul>
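
The code above forms $ ( A - \mu I )^{-1} $ explicitly, which its own comment calls lazy. A minimal sketch of the same iteration that solves a linear system at each step instead is given below. It uses plain NumPy arrays, a fixed seed, and `np.linalg.solve`, all of which are choices made here and not the notebook's. Note that `np.linalg.solve` refactors the matrix on every call; factoring once and reusing the factors (for instance with SciPy's `lu_factor` and `lu_solve`) would avoid that repeated work.

```python
import numpy as np

rng = np.random.default_rng(1)           # fixed seed (an assumption, for repeatability)

Lambda = np.diag([4.0, 3.0, 2.0, 1.0])
V = rng.random((4, 4))
V = V / np.linalg.norm(V, axis=0)        # unit-length columns
A = V @ Lambda @ np.linalg.inv(V)

mu = 0.8
shifted = A - mu * np.eye(4)

x = rng.random(4)
for i in range(10):
    # Solve ( A - mu I ) y = x rather than multiplying by an explicit inverse.
    x = np.linalg.solve(shifted, x)
    x = x / np.linalg.norm(x)

    # x has unit length, so the Rayleigh quotient reduces to x^T A x.
    rayleigh = x @ A @ x
    print('iteration %2d   Rayleigh quotient = %.8f' % (i, rayleigh))

print('inner product of x with v3:', x @ V[:, 3])
```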
 
This time, if you change the "2" on the diagonal to "-1", you still converge to $ v_{n-1} $: the corresponding eigenvalues of $ A - \mu I $ are $ -1 - \mu $ and $ 1 - \mu $, and for $ \mu > 0 $ the second is smaller in magnitude (with $ \mu = 0.8 $, compare $ 1.8 $ and $ 0.2 $).

You can check this by looking at $ ( I - V_R ( V_R^T V_R )^{-1} V_R^T ) x $, where $ V_R $ is the matrix with $ v_2 $ and $ v_3 $ as its columns, to see whether the component of $ x $ orthogonal to $ {\cal C}( V_R ) $ converges to zero. This is seen in the following code block:


In [3]:
w = x - V[ :,2:4 ] * np.linalg.inv( np.transpose( V[ :,2:4 ] ) * V[ :,2:4 ] ) * np.transpose( V[ :,2:4 ] ) * x
    
print( 'Norm of component orthogonal: ', np.linalg.norm( w ) )

Norm of component orthogonal:  3.528212756981633e-11
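
The same orthogonal component can also be obtained without inverting $ V_R^T V_R $, by solving a least-squares problem. The sketch below compares the two on stand-in data: `V_R` and `x` are generated here purely for illustration (they are not the notebook's variables), with `x` chosen to lie almost in $ {\cal C}( V_R ) $.

```python
import numpy as np

rng = np.random.default_rng(2)           # fixed seed (an assumption, for repeatability)

V_R = rng.random((4, 2))                 # stand-in for the columns v_2 and v_3 of V
x = V_R @ np.array([0.3, -1.2]) + 1e-6 * rng.random(4)   # nearly in the column space

# Projection formula from the text.
w_formula = x - V_R @ np.linalg.inv(V_R.T @ V_R) @ V_R.T @ x

# Same component as the residual of the least-squares problem min || V_R c - x ||.
c, *_ = np.linalg.lstsq(V_R, x, rcond=None)
w_lstsq = x - V_R @ c

print('Norm via projection formula:', np.linalg.norm(w_formula))
print('Norm via least squares     :', np.linalg.norm(w_lstsq))
```

Both norms should agree up to roundoff; the least-squares route simply avoids forming and inverting the Gram matrix.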
