#### Power iterations for `dominant` eigenvalue

Assume $A\in \mathbf{R}^{n \times n}$ is `diagonalizable`, so that its eigenvectors $v_1, \cdots, v_n$ form a `basis` for $\mathbf{R}^n$

For a vector $x\in \mathbf{R}^n$, we can express it using the eigenvectors of $A$ as

$$x=c_1v_1 + c_2v_2 + \cdots + c_nv_n$$

Assume the eigenvalues are ordered so that $|\lambda_1|>|\lambda_2|\geq|\lambda_3| \geq \cdots \geq |\lambda_n|$. Computing $Ax$, we can write

$$\begin{align*}
Ax&=A(c_1v_1 + c_2v_2 + \cdots + c_nv_n) \\
&=c_1\lambda_1v_1 + c_2\lambda_2v_2+\cdots + c_n\lambda_nv_n
\end{align*}$$

If we keep multiplying by $A$ on the left, after $k$ multiplications we get

$$\begin{align*}
A^kx&=A^k(c_1v_1 + c_2v_2 + \cdots + c_nv_n) \\
&=c_1\lambda_1^kv_1 + c_2\lambda_2^kv_2+\cdots + c_n\lambda_n^kv_n \\
&=\lambda_1^k\left(c_1v_1+c_2\left(\frac{\lambda_2}{\lambda_1}\right)^kv_2 + \cdots + c_n\left(\frac{\lambda_n}{\lambda_1}\right)^kv_n\right) \\
&\approx\lambda_1^kc_1v_1 \quad \text{for large } k, \text{ since } \left(\frac{\lambda_i}{\lambda_1}\right)^k\rightarrow 0 \text{ as } k \rightarrow \infty,\ i\neq 1
\end{align*}$$

Provided $c_1 \neq 0$, $A^kx$ aligns with $v_1$ as $k$ grows, which suggests a way to compute the `dominant eigenvalue`

We also need to normalize at every step; otherwise, after many iterations, the norm of the vector grows without bound (if $|\lambda_1|>1$) or shrinks to zero (if $|\lambda_1|<1$)

* starting from $x^k$
* compute $y^k=Ax^k$
* get new $x^{k+1}$ by normalizing $y^k$ ($l_2$ norm, infinity norm, etc)
$$x^{k+1}=\frac{y^k}{\|y^k\|}$$

We can see that $x^{k}$ is a scalar multiple of $A^{k}x^0$, which in turn approaches a scalar multiple of $v_1$, with estimation error dominated by $\left|\frac{\lambda_2}{\lambda_1}\right|^k$
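The convergence rate can be checked numerically. The sketch below is only an illustration: it reuses the $2\times 2$ matrix from the example further down (eigenvalues 10 and 5), assumes the exact dominant eigenvector $(3, 2)/\sqrt{13}$ is known for comparison, and the names `errors` and `ratios` are made up for this check.

```python
import numpy as np

A = np.array([[8.0, 3.0], [2.0, 7.0]])
# Assumed known for comparison: unit eigenvector for lambda_1 = 10
v1 = np.array([3.0, 2.0]) / np.sqrt(13.0)

x = np.array([1.0, 1.0]) / np.sqrt(2.0)
errors = []
for k in range(15):
    y = A @ x
    x = y / np.linalg.norm(y)              # l_2 normalization
    errors.append(np.linalg.norm(x - v1))  # distance to the true direction

# Ratio of successive errors; it should settle near |lambda_2 / lambda_1| = 0.5
ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
print(ratios[-1])
```

Each iteration roughly halves the error here because $|\lambda_2/\lambda_1| = 5/10$; a matrix with a smaller gap between the two largest eigenvalues would converge more slowly.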

Once $x^k\rightarrow cv_1$, we have $y^k = Ax^k \rightarrow \lambda_1cv_1$, so the eigenvalue $\lambda_1$ can be read off by comparing $y^k$ with $x^k$, or from the Rayleigh quotient

$$\frac{y^k\cdot x^k}{x^k\cdot x^k}\rightarrow\lambda_1$$

Example

$$A=\begin{bmatrix}8 & 3 \\2 &7\end{bmatrix}, x^0=\begin{bmatrix}1 \\ 1\end{bmatrix}$$

In [1]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

plt.style.use('dark_background')
# color: https://matplotlib.org/stable/gallery/color/named_colors.htm

In [2]:
A = np.array([[8., 3.], [2., 7.]])
x = np.array([1., 1.])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(f'True eigenvalues: {eigenvalues}')
print(f'True eigenvectors (columns): {eigenvectors}\n')

num_iter = 20
for i in range(num_iter):  # `i` avoids shadowing the built-in iter()
    y = A @ x
    lambda_1 = np.dot(y, x) / np.dot(x, x)  # Rayleigh quotient
    print(f'# {i+1}: lambda_1: {lambda_1:.4f}')
    x = y / np.linalg.norm(y)
v_1 = A @ x / np.linalg.norm(A @ x)
print(f'\nv_1: {v_1}')

True eigenvalues: [ 10.0000  5.0000]
True eigenvectors (columns): [[ 0.8321 -0.7071]
 [ 0.5547  0.7071]]

# 1: lambda_1: 10.0000
# 2: lambda_1: 10.0495
# 3: lambda_1: 10.0367
# 4: lambda_1: 10.0212
# 5: lambda_1: 10.0113
# 6: lambda_1: 10.0058
# 7: lambda_1: 10.0030
# 8: lambda_1: 10.0015
# 9: lambda_1: 10.0007
# 10: lambda_1: 10.0004
# 11: lambda_1: 10.0002
# 12: lambda_1: 10.0001
# 13: lambda_1: 10.0000
# 14: lambda_1: 10.0000
# 15: lambda_1: 10.0000
# 16: lambda_1: 10.0000
# 17: lambda_1: 10.0000
# 18: lambda_1: 10.0000
# 19: lambda_1: 10.0000
# 20: lambda_1: 10.0000

v_1: [ 0.8321  0.5547]


#### Iteration for other eigenvalues

After getting the dominant eigenvalue $\lambda_1$ and the corresponding unit eigenvector $q_1 = v_1/\|v_1\|_2$, we can continue by subtracting their contribution from $A$ (deflation)

$$A\leftarrow A - \lambda_1 q_1 q_1^T$$

Running power iterations on the updated $A$ then gives the next eigenvalue; repeat this for more eigenvalues

However, the subtraction step can cause loss of orthogonality, so the power iteration method is primarily used for finding the dominant eigenvalue
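The deflation update is easiest to check on a symmetric matrix, whose eigenvectors are orthogonal. The following is a minimal sketch under that assumption; the matrix `S` (eigenvalues 3 and 1) and the helper `dominant_pair` are made up for this illustration and are not part of the cells in this notebook.

```python
import numpy as np

def dominant_pair(M, num_iter=200, seed=0):
    """Plain power iteration; returns (Rayleigh quotient, unit vector)."""
    rng = np.random.default_rng(seed)
    x = rng.random(M.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iter):
        y = M @ x
        x = y / np.linalg.norm(y)
    return x @ M @ x, x

# Hypothetical symmetric test matrix with eigenvalues 3 and 1
S = np.array([[2.0, 1.0], [1.0, 2.0]])

lam1, q1 = dominant_pair(S)
S_deflated = S - lam1 * np.outer(q1, q1)   # A <- A - lambda_1 q_1 q_1^T
lam2, q2 = dominant_pair(S_deflated)

print(lam1, lam2)      # close to 3 and 1
print(abs(q1 @ q2))    # close to 0: the two eigenvectors are orthogonal
```

For a symmetric matrix the deflated matrix keeps the remaining eigenvectors unchanged, which is why the second run recovers both the eigenvalue and the eigenvector.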

In [3]:
# Assume A diagonalizable
def power_iteration(A, num_eigen, num_iter=1000, tol=1e-10):
    n = A.shape[0]
    eigenvalues = []
    eigenvectors = []
    A_current = A.copy()

    for k in range(num_eigen):
        x_k = np.random.rand(n)
        x_k = x_k / np.linalg.norm(x_k)

        for j in range(num_iter):
            y_k = A_current @ x_k
            y_k_unit = y_k / np.linalg.norm(y_k)

            if np.linalg.norm(y_k_unit - x_k) < tol:
                break

            x_k = y_k_unit

        print(f'{j+1} iterations for eigenvalue #{k+1}')

        # Alternative estimate: eigenvalue = (A @ x_k)[0] / x_k[0]
        # Rayleigh quotient of x_k, evaluated with the original A
        eigenvalue = np.dot(A @ x_k, x_k) / np.dot(x_k, x_k)
        eigenvalues.append(eigenvalue)
        eigenvectors.append(x_k)

        v_k = x_k / np.linalg.norm(x_k)
        A_current -= eigenvalue * np.outer(v_k, v_k) / np.dot(v_k, v_k)

    return np.array(eigenvalues), np.column_stack(eigenvectors)

In [4]:
np.random.seed(42)

k = 2

eigenvalues, eigenvectors = power_iteration(A, k)

print("\nComputed eigenvalues:")
for idx, eigenvalue in enumerate(eigenvalues, 1):
    print(f"Eigenvalue {idx}: {eigenvalue:.6f}")

print("\nComputed eigenvectors (columns):")
print(eigenvectors)

true_eigenvalues, true_eigenvectors = np.linalg.eig(A)
print("\nNumPy's eigenvalues:")
print(true_eigenvalues)

print("\nNumPy's eigenvectors (columns):")
print(true_eigenvectors)

33 iterations for eigenvalue #1
3 iterations for eigenvalue #2

Computed eigenvalues:
Eigenvalue 1: 10.000000
Eigenvalue 2: 5.384615

Computed eigenvectors (columns):
[[ 0.8321 -0.3807]
 [ 0.5547  0.9247]]

NumPy's eigenvalues:
[ 10.0000  5.0000]

NumPy's eigenvectors (columns):
[[ 0.8321 -0.7071]
 [ 0.5547  0.7071]]


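We see the second eigenvalue is not accurate: power iterations return 5.384615 instead of 5, and the second eigenvector also differs from NumPy's. This $A$ is not symmetric, so its eigenvectors are not orthogonal and the deflation step does not preserve them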
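One way to see where the value 5.384615 comes from is to look at the deflated matrix directly. The sketch below is an illustration rather than a fix: it assumes the exact dominant pair $\lambda_1 = 10$, $q_1 = (3, 2)/\sqrt{13}$, and the names `A_deflated`, `rq_deflated`, `rq_original` and `residual` are made up for this check.

```python
import numpy as np

A = np.array([[8.0, 3.0], [2.0, 7.0]])
lam1 = 10.0
q1 = np.array([3.0, 2.0]) / np.sqrt(13.0)   # assumed exact dominant eigenvector

A_deflated = A - lam1 * np.outer(q1, q1)

# Power iteration on the deflated matrix
x = np.array([1.0, 0.0])
for _ in range(50):
    y = A_deflated @ x
    x = y / np.linalg.norm(y)

rq_deflated = x @ A_deflated @ x   # Rayleigh quotient with the deflated matrix
rq_original = x @ A @ x            # Rayleigh quotient with the original matrix
residual = np.linalg.norm(A @ x - rq_deflated * x)

print(rq_deflated, rq_original, residual)
```

Under these assumptions the deflated matrix still has eigenvalue 5, and the Rayleigh quotient taken with the deflated matrix recovers it. The same vector plugged into the original $A$ gives $1820/338 \approx 5.384615$, and the large residual shows that this vector is an eigenvector of the deflated matrix but not of $A$.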