#     12.5.3 The Rayleigh Quotient Iteration

In this notebook, we demonstrate how the Inverse Power Method can be accelerated by shifting the matrix, this time by using the Rayleigh quotient to approximate the smallest eigenvalue.

<font color=red> Be sure to make a copy!!!! </font>

We start by creating a matrix with known eigenvalues and eigenvectors.

How do we do this?  
<ul>
  <li>
    We want a matrix that is not defective (that is, one with a full set of linearly independent eigenvectors), since otherwise the Rayleigh Quotient Iteration Method may not work. 
  </li>
  <li>
    Hence, $ A = V \Lambda V^{-1} $ for some diagonal matrix $ \Lambda $ and nonsingular matrix $ V $.  The eigenvalues are then on the diagonal of $ \Lambda $ and the eigenvectors are the columns of $ V $.
    </li>
    <li>
    So, let's pick the eigenvalues for the diagonal of $ \Lambda $ and let's pick a random matrix $ V $ (in the hopes that it has linearly independent columns) and then let's see what happens.  
    </li>
    </ul>

<font color=red> Experiment by changing the eigenvalues!  What happens if you make the second entry on the diagonal equal to -4?  Or what if you set 2 to -1? </font>

In [1]:
import numpy as np
import laff
import flame

Lambda = np.matrix( ' 4., 0., 0., 0;\
                      0., 3., 0., 0;\
                      0., 0., 2., 0;\
                      0., 0., 0., 1' )

lambda0 = Lambda[ 0,0 ]

V = np.matrix( np.random.rand( 4,4 ) )

# normalize the columns of V to have length one

for j in range( 0, 4 ):
    V[ :, j ] = V[ :, j ] / np.sqrt( np.transpose( V[:,j] ) * V[:, j ] )

A = V * Lambda * np.linalg.inv( V )

print( 'Lambda = ' )
print( Lambda)

print( 'V = ' )
print( V )

print( 'A = ' )
print( A )


Lambda = 
[[4. 0. 0. 0.]
 [0. 3. 0. 0.]
 [0. 0. 2. 0.]
 [0. 0. 0. 1.]]
V = 
[[0.15930909 0.79937646 0.41484264 0.51304883]
 [0.61370623 0.15424173 0.39785694 0.83359535]
 [0.60426959 0.40917501 0.74390608 0.07363756]
 [0.48253864 0.41204682 0.34090933 0.19099004]]
A = 
[[ 1.95416554 -1.27412645 -0.80997773  3.31021008]
 [-2.02319911  1.05174858 -0.91304797  5.56100832]
 [-1.34643154 -0.31364455  0.55514903  5.15731355]
 [-1.07769959 -0.47887645 -1.17713101  6.43893684]]
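
The construction above relies on the random $ V $ having linearly independent columns. A minimal sketch of how one might check this, and confirm that $ A $ really has the chosen eigenvalues, is given below. It uses plain NumPy arrays rather than `np.matrix`, and the fixed seed is an assumption made only so that the sketch is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed (assumption), only for repeatability

Lambda = np.diag([4.0, 3.0, 2.0, 1.0])  # same eigenvalues as in the cell above
V = rng.random((4, 4))
V = V / np.linalg.norm(V, axis=0)       # scale each column of V to length one

# A modest condition number suggests the columns of V are safely independent.
print('cond( V )            =', np.linalg.cond(V))

A = V @ Lambda @ np.linalg.inv(V)

# Check A V = V Lambda directly, then compare computed and chosen eigenvalues.
residual = np.linalg.norm(A @ V - V @ Lambda)
computed = np.sort(np.linalg.eigvals(A).real)[::-1]
print('|| A V - V Lambda || =', residual)
print('eigenvalues of A     =', computed)
```

If the condition number of $ V $ turns out to be very large, rerunning with a different random $ V $ is the simplest remedy.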


The idea is as follows:

The eigenvalues of $ A $ are $ \lambda_0, \ldots, \lambda_3 $ with

$$
\vert \lambda_0 \vert > \vert \lambda_1 \vert > \vert \lambda_2 \vert > \vert \lambda_3 \vert > 0
$$

and how fast the Inverse Power Method converges to an eigenvector associated with $ \lambda_3 $ depends on the ratio 

$$
\left\vert \frac{\lambda_3}{\lambda_2} \right\vert .
$$
Now, if you pick a value $ \mu $ close to $ \lambda_3 $ and iterate with $ A - \mu I $ (which is known as shifting the matrix/spectrum by $ \mu $), convergence is instead governed by the much smaller ratio
$$
\left\vert \frac{\lambda_3-\mu}{\lambda_2-\mu} \right\vert .
$$
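
To make the effect of the shift concrete, the short sketch below evaluates this ratio for the eigenvalues used in this notebook ($ \lambda_2 = 2 $, $ \lambda_3 = 1 $) and a few shifts. The helper `convergence_ratio` is a hypothetical name introduced only for this illustration, and the formula applies as long as $ \mu $ is closer to $ \lambda_3 $ than to any other eigenvalue.

```python
eigenvalues = [4.0, 3.0, 2.0, 1.0]   # lambda_0, ..., lambda_3 used in this notebook

def convergence_ratio(mu):
    """Return | lambda_3 - mu | / | lambda_2 - mu | (hypothetical helper)."""
    lambda2, lambda3 = eigenvalues[2], eigenvalues[3]
    return abs(lambda3 - mu) / abs(lambda2 - mu)

# The closer mu is to lambda_3 = 1, the smaller the ratio becomes.
for mu in [0.0, 0.5, 0.9, 0.99]:
    print(f'mu = {mu:4.2f}   ratio = {convergence_ratio(mu):.4f}')
```

With no shift the ratio is 0.5, while $ \mu = 0.9 $ already brings it down to roughly 0.09, so each iteration reduces the error by a much larger factor.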

Generally we don't know $ \lambda_3 $ and hence don't know how to choose $ \mu $.  But we are generating a vector $ x $ that gets progressively closer to an eigenvector.  Thus, we can use the Rayleigh quotient, $ x^T A x / ( x^T x ) $, to approximate an eigenvalue and use that approximation as the shift $ \mu $.

Here we purposely say "an eigenvalue" since it could be that the first random vector $ x $ is close to an eigenvector associated with another eigenvalue, and then we may converge to a different eigenvalue.
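
The claim that the Rayleigh quotient approximates an eigenvalue once $ x $ is close to an eigenvector can be tried out in isolation. The sketch below is only an illustration under stated assumptions: it rebuilds a matrix of the same form with a fixed seed, perturbs the eigenvector associated with the eigenvalue 1 by a random vector scaled by `eps`, and reports how far the Rayleigh quotient is from 1. The function name `rayleigh_quotient` and the chosen values of `eps` are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(1)           # fixed seed (assumption), only for repeatability

Lambda = np.diag([4.0, 3.0, 2.0, 1.0])
V = rng.random((4, 4))
V = V / np.linalg.norm(V, axis=0)        # unit-length columns
A = V @ Lambda @ np.linalg.inv(V)

def rayleigh_quotient(A, x):
    """Return x^T A x / ( x^T x ) for a one-dimensional array x."""
    return (x @ A @ x) / (x @ x)

v3 = V[:, 3]                             # eigenvector associated with eigenvalue 1
noise = rng.random(4)                    # direction of the perturbation

errors = []
for eps in [1e-1, 1e-2, 1e-3, 0.0]:
    x = v3 + eps * noise                 # approximate eigenvector
    err = abs(rayleigh_quotient(A, x) - 1.0)
    errors.append(err)
    print(f'eps = {eps:7.1e}   | Rayleigh quotient - 1 | = {err:.3e}')
```

The error should shrink as `eps` shrinks, and with `eps = 0` the Rayleigh quotient reproduces the eigenvalue up to rounding error.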

In [2]:
# Pick a random starting vector

x = np.matrix( np.random.rand( 4,1 ) )


mu = 0.    # Let's start by not shifting, so hopefully we home in on the smallest eigenvalue

for i in range(0,10):
    # We should really compute a factorization of A - mu I and solve with it, but let's
    # be lazy, and compute the inverse explicitly
    Ainv = np.linalg.inv( A - mu * np.eye( 4, 4 ) )
    
    x = Ainv * x 
    
    # normalize x to length one
    x = x / np.sqrt( np.transpose( x ) * x )
    
    # Notice we compute the Rayleigh quotient with matrix A, not Ainv.  This is because
    # an eigenvector of A is also an eigenvector of Ainv = inv( A - mu I )
    
    mu = np.transpose( x ) * A * x
    
    # The above returns a 1 x 1 matrix.  Let's set mu to the scalar
    
    mu = mu[ 0, 0 ]
    
    print( 'Rayleigh quotient with vector x:', np.transpose( x ) * A * x / ( np.transpose( x ) * x ))
    print( 'inner product of x with v3     :', np.transpose( x ) * V[ :, 3 ] )
    print( ' ' )

Rayleigh quotient with vector x: [[3.11667923]]
inner product of x with v3     : [[0.60082785]]
 
Rayleigh quotient with vector x: [[2.88769375]]
inner product of x with v3     : [[-0.61542382]]
 
Rayleigh quotient with vector x: [[3.00262523]]
inner product of x with v3     : [[-0.64393]]
 
Rayleigh quotient with vector x: [[3.00004297]]
inner product of x with v3     : [[0.64752633]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[-0.64752183]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[0.64752183]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[0.64752183]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[0.64752183]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[0.64752183]]
 
Rayleigh quotient with vector x: [[3.]]
inner product of x with v3     : [[0.64752183]]
 


In the above, 
 <ul>
 <li>
 The Rayleigh quotient may converge to 1.0, the smallest eigenvalue, but it may also converge to another eigenvalue.  In the run shown above, it converged to 3.0.
 </li>
 <li>
 The inner product of $ x $ and the last column of $ V $, $ v_{n-1} $, may converge to 1 or -1 since eventually $ x $ may be in the direction of $ v_{n-1} $ (or in the opposite direction).  This does not happen if the iteration converges to another eigenvalue: in the run shown above, the Rayleigh quotient converged to 3 and the inner product settled near 0.6475 instead.  If this happens, try rerunning all the code blocks above to get a different $V$ matrix.
 </li>
 </ul>
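
The loop above forms the inverse of the shifted matrix explicitly, and it only checks $ x $ against the last column of $ V $. A minimal sketch of an alternative is given below: it solves with $ A - \mu I $ via `np.linalg.solve` instead of inverting, and afterwards reports which eigenvalue and which column of $ V $ the iteration actually found. The function name `rayleigh_quotient_iteration`, the fixed seed, the iteration count, and the use of plain arrays instead of `np.matrix` are all assumptions made for this illustration.

```python
import numpy as np

def rayleigh_quotient_iteration(A, x, mu=0.0, iterations=20):
    """Hypothetical helper: Rayleigh Quotient Iteration with a linear solve."""
    n = A.shape[0]
    x = x / np.linalg.norm(x)
    for _ in range(iterations):
        try:
            # Solve ( A - mu I ) y = x rather than forming the inverse.
            y = np.linalg.solve(A - mu * np.eye(n), x)
        except np.linalg.LinAlgError:
            break                        # shifted matrix is exactly singular: mu is an eigenvalue
        x = y / np.linalg.norm(y)        # normalize x to length one
        mu = x @ A @ x                   # Rayleigh quotient (x has length one)
    return mu, x

rng = np.random.default_rng(2)           # fixed seed (assumption), only for repeatability
eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])
V = rng.random((4, 4))
V = V / np.linalg.norm(V, axis=0)
A = V @ np.diag(eigenvalues) @ np.linalg.inv(V)

mu, x = rayleigh_quotient_iteration(A, rng.random(4))

# Which eigenvalue did we find, and which column of V does x line up with?
nearest = int(np.argmin(np.abs(eigenvalues - mu)))
alignment = np.abs(V.T @ x)              # | x^T v_j | for every column v_j
print('Rayleigh quotient      :', mu)
print('nearest eigenvalue     : lambda_%d = %g' % (nearest, eigenvalues[nearest]))
print('| x^T v_j | for all j  :', alignment)
```

Taking absolute values of the inner products removes the sign ambiguity noted above: the entry of `alignment` that is close to 1 identifies the eigenvector the iteration converged to, whichever eigenvalue that happens to be.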
