#### Schur decomposition and eigenvalue problem

According to `Schur theorem`, if $A\in \mathbf{R}^{n \times n}$ is a square real matrix with real eigenvalues, then there is an orthogonal matrix $Q$ and an upper triangular matrix $T$ such that

$$A=QTQ^T$$

We can show why this is true

##### Setup

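Let $Q_1=\begin{bmatrix}q_1 & q_2 & \cdots & q_n\end{bmatrix}$ be an orthogonal matrix, so that $Q_1^T=Q_1^{-1}$ and $Q_1^TQ_1=I$, whose first column $q_1$ is a normalized `eigenvector` of $A$ with eigenvalue $\lambda_1$. Writing $q_{2:n}=\begin{bmatrix}q_2 & \cdots & q_n\end{bmatrix}$ for the remaining columns, we can write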

$$\begin{align*}
Q_1^TAQ_1&=\begin{bmatrix}q_1^T \\ q_{2:n}^T\end{bmatrix}A\begin{bmatrix}q_1 & q_{2:n}\end{bmatrix} \\
& =\begin{bmatrix}q_1^TAq_1 & q_1^TAq_{2:n} \\ q_{2:n}^TAq_1 & q_{2:n}^TAq_{2:n}\end{bmatrix}
\end{align*}$$

By `construction`, the upper-left block is the corresponding `eigenvalue` $\lambda_1$, since

$$q_1^TAq_1 = q_1^T(\lambda_1q_1)=\lambda_1q_1^Tq_1=\lambda_1$$

For the upper-right block, we denote it $B$

$$B=q_1^TAq_{2:n}=\begin{bmatrix}q_1^TAq_2 & q_1^TAq_3 & \cdots & q_1^TAq_n \end{bmatrix}$$

The lower-left block is zero, since

$$q_{2:n}^TAq_1=\begin{bmatrix}q_2^T \\q_3^T \\ \vdots \\q_n^T \end{bmatrix}Aq_1=\lambda_1\begin{bmatrix}q_2^T \\q_3^T \\ \vdots \\q_n^T \end{bmatrix}q_1=\begin{bmatrix}0 \\0 \\ \vdots \\0 \end{bmatrix}$$

We denote the lower-right block $A_2$ and we have

$$\begin{align*}
Q_1^TAQ_1&=\begin{bmatrix}\lambda_1 & B \\ 0 & A_2\end{bmatrix}
\end{align*}$$

or

$$\begin{align*}
AQ_1&=Q_1\begin{bmatrix}\lambda_1 & B \\ 0 & A_2\end{bmatrix}
\end{align*}$$

In addition, $A_2$ contains the `remaining eigenvalues` $\lambda_2 , \cdots, \lambda_n$ of $A$

To see this, from the properties of similarity transformation, we know that $A$ and $Q_1^TAQ_1$ have the same eigenvalues

Therefore

$$\begin{align*}
\det (A-\lambda I)&=\det (Q_1^TAQ_1-\lambda I) \\
&=\det \begin{bmatrix}\lambda_1-\lambda & B \\ 0 & A_2-\lambda I\end{bmatrix} \\
& \text{determinant of a block upper triangular matrix}\\
&= (\lambda_1-\lambda) \det (A_2-\lambda I)
\end{align*}$$
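
The block structure above can be checked numerically. The following is a minimal sketch, not part of the derivation itself: it assumes NumPy, takes an eigenvector from `np.linalg.eig`, and completes it to an orthogonal matrix with `np.linalg.qr`, which is only one of many possible ways to build $Q_1$. The example matrix and the names `Q1`, `S`, `A2` are chosen here for illustration.

```python
import numpy as np

# Example matrix (assumed for illustration); its eigenvalues are all real
A = np.array([[2., 3, 4, 5, 6],
              [4, 4, 5, 6, 7],
              [0, 3, 6, 7, 8],
              [0, 0, 2, 8, 9],
              [0, 0, 0, 1, 10]])
n = A.shape[0]

# Pick one eigenpair (lambda_1, q_1); np.real drops a zero imaginary part
w, V = np.linalg.eig(A)
lam1 = np.real(w[0])
q1 = np.real(V[:, 0])
q1 = q1 / np.linalg.norm(q1)

# Complete q_1 to an orthogonal Q_1 (an assumption of this sketch):
# QR-factorize a matrix whose first column is q_1. The first column of the
# Q factor is then +q_1 or -q_1, which is still a normalized eigenvector.
M = np.eye(n)
M[:, 0] = q1
Q1, _ = np.linalg.qr(M)

S = Q1.T @ A @ Q1
lower_left = S[1:, 0]   # expected to vanish up to rounding
A2 = S[1:, 1:]          # lower-right block

print("upper-left block equals lambda_1:", np.isclose(S[0, 0], lam1))
print("lower-left block is zero:", np.allclose(lower_left, 0))
print("eigenvalues of A_2:      ", np.sort(np.real(np.linalg.eigvals(A2))))
print("remaining eigenvalues:   ", np.sort(np.real(np.delete(w, 0))))
```

The last two printed lines should agree up to rounding, which is the numerical counterpart of the determinant identity above.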

##### Proof by induction

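For a matrix of size one, the Schur decomposition obviously exists (take $Q=1$ and $T=A$)

Now assume, as the induction hypothesis, that a Schur decomposition exists for every real matrix of size $n-1$ with real eigenvalues. Since $A_2$ is of size $n-1$ and its eigenvalues $\lambda_2 , \cdots, \lambda_n$ are real, there are an orthogonal matrix $Q_2$ and an upper triangular matrix $T_2$ such that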

$$Q_2T_2Q_2^T=A_2$$

or

$$Q_2T_2=A_2Q_2$$

We can proceed with `induction`

If we write

$$Q=Q_1\begin{bmatrix}1 & 0 \\ 0 & Q_2\end{bmatrix}$$

(since both $Q_1$ and $Q_2$ are orthogonal, $Q$ is orthogonal)

we have for matrix $A$ of size $n$

$$\begin{align*}AQ &= A Q_1\begin{bmatrix}1 & 0 \\ 0 & Q_2\end{bmatrix} \\
& = Q_1\begin{bmatrix}\lambda_1 & B \\ 0 & A_2\end{bmatrix}\begin{bmatrix}1 & 0 \\ 0 & Q_2\end{bmatrix} \\
&=Q_1 \begin{bmatrix}\lambda_1 & BQ_2 \\ 0 & A_2Q_2\end{bmatrix} \\
&=Q_1 \begin{bmatrix}\lambda_1 & BQ_2 \\ 0 & Q_2T_2\end{bmatrix} \\
&=Q_1 \begin{bmatrix}1& 0 \\ 0 & Q_2\end{bmatrix}\begin{bmatrix}\lambda_1 & BQ_2 \\ 0 & T_2\end{bmatrix} \\
& = Q\begin{bmatrix}\lambda_1 & BQ_2 \\ 0 & T_2\end{bmatrix}
\end{align*}$$

Since $Q$ is orthogonal, $AQ=QT$ gives the Schur decomposition $A=QTQ^T$ for matrix $A$ of size $n$, where $T$ is upper triangular because $T_2$ is

$$T=\begin{bmatrix}\lambda_1 & BQ_2 \\ 0 & T_2\end{bmatrix}
$$
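
The induction argument translates directly into a recursive construction. The sketch below is only an illustration of the proof, under the assumption that all eigenvalues are real: it relies on `np.linalg.eig` to supply an eigenvector at every level, so it is not a practical way to compute the decomposition. The function name `schur_by_deflation` and the example matrix are hypothetical choices made here.

```python
import numpy as np

def schur_by_deflation(A):
    """Hypothetical helper mirroring the induction proof.

    Assumes A is a real square matrix whose eigenvalues are all real.
    Returns (Q, T) with Q orthogonal, T upper triangular, A ~= Q @ T @ Q.T.
    """
    n = A.shape[0]
    if n == 1:                        # base case: matrix of size one
        return np.eye(1), A.astype(float)

    # An eigenvector q_1 from NumPy (the proof only needs that one exists)
    _, V = np.linalg.eig(A)
    q1 = np.real(V[:, 0])

    # Complete q_1 to an orthogonal Q_1 via QR (first column is +/- q_1)
    M = np.eye(n)
    M[:, 0] = q1
    Q1, _ = np.linalg.qr(M)

    S = Q1.T @ A @ Q1                 # [[lambda_1, B], [~0, A_2]]
    lam1, B, A2 = S[0, 0], S[0, 1:], S[1:, 1:]

    Q2, T2 = schur_by_deflation(A2)   # induction hypothesis on size n-1

    E = np.eye(n)
    E[1:, 1:] = Q2                    # [[1, 0], [0, Q_2]]
    Q = Q1 @ E

    T = np.zeros((n, n))              # [[lambda_1, B Q_2], [0, T_2]]
    T[0, 0] = lam1
    T[0, 1:] = B @ Q2
    T[1:, 1:] = T2
    return Q, T

# Example matrix (assumed for illustration); its eigenvalues are all real
A = np.array([[2., 3, 4, 5, 6],
              [4, 4, 5, 6, 7],
              [0, 3, 6, 7, 8],
              [0, 0, 2, 8, 9],
              [0, 0, 0, 1, 10]])

Q, T = schur_by_deflation(A)
print("residual ||A - Q T Q^T||:", np.linalg.norm(A - Q @ T @ Q.T))
print("diagonal of T:", np.diag(T))
```

Here $T$ is assembled block by block exactly as in the proof, so it is upper triangular by construction and the residual only measures rounding error.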

If $A$ is symmetric, then $T=Q^TAQ$ is symmetric as well, since $T^T=Q^TA^TQ=Q^TAQ=T$. Being upper triangular and symmetric at the same time, $T$ must be diagonal, meaning that symmetric matrices are orthogonally diagonalizable, as we already know

Since $A$ and $T$ are similar and $T$ is triangular, the eigenvalues of $A$ are exactly the diagonal entries of $T$

So, how to compute Schur decomposition?

#### `Orthogonal` iterations

Recall our power iterations to find the dominant eigenvalue of $A$: starting from an arbitrary vector $x^0$, we alternate between computing $y^k=Ax^k$ and normalizing to get the next iterate $x^{k+1}=\frac{y^k}{\|y^k\|}$
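
As a reminder, the power iteration can be sketched as follows. This is a minimal version under a few assumptions made here: a random starting vector, a fixed number of iterations instead of a convergence test, and the Rayleigh quotient $x^TAx$ as the eigenvalue estimate. The function name `power_iteration` and the example matrix are illustrative.

```python
import numpy as np

def power_iteration(A, num_iter=200, seed=0):
    """Return an estimate of the dominant eigenvalue and its eigenvector."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])   # arbitrary starting vector
    x = x / np.linalg.norm(x)
    for _ in range(num_iter):
        y = A @ x                         # multiply by A
        x = y / np.linalg.norm(y)         # normalize
    return x @ A @ x, x                   # Rayleigh quotient, unit vector

# Example matrix (assumed for illustration); its eigenvalues are all real
A = np.array([[2., 3, 4, 5, 6],
              [4, 4, 5, 6, 7],
              [0, 3, 6, 7, 8],
              [0, 0, 2, 8, 9],
              [0, 0, 0, 1, 10]])

lam, x = power_iteration(A)
print("dominant eigenvalue estimate:", lam)
print("residual ||Ax - lam x||:", np.linalg.norm(A @ x - lam * x))
```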

To get to Schur decomposition, we would like to do the iterations not just on a single vector $x$, but on $n$ orthonormal vectors that form an `orthogonal` matrix $Q_0$

After each multiplication by $A$, we do a `QR decomposition` so that we still have a set of orthonormal vectors for the next iteration

$$AQ_i=Q_{i+1}R_{i+1}$$

We can see that, if $Q_i \rightarrow Q$ as $i\rightarrow \infty$, then $R_i$ must also converge to a certain upper triangular $R$, and

$$AQ=QR \Rightarrow A=QRQ^T$$

which is exactly the Schur decomposition we need where $T=R$

We can first try orthogonal iterations on a square real matrix to get a feel for how they behave

We start with $Q_0=I$ and use Householder reflectors for the QR factorization

In [9]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

plt.style.use('dark_background')
# color: https://matplotlib.org/stable/gallery/color/named_colors.htm

In [10]:
def householder(A):
    m, n = A.shape
    R = A.copy()
    Q = np.identity(m)

    for i in range(n):
        x = R[i:, i]
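        # copysign gives a sign of +1 or -1 even when x[0] == 0
        # (np.sign would return 0 there and the reflection would be wrong)
        v = np.copysign(np.linalg.norm(x), x[0]) * np.eye(x.shape[0])[:,0] + x
        norm_v = np.linalg.norm(v)
        if norm_v == 0:
            continue  # column is already zero, nothing to reflect
        v /= norm_v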

        # Since all entries in R[i:, :i] are zero from previous iteration
        # applying transformation to R[i:, i:] would suffice
        R[i:, i:] -= 2 * np.outer(v, v) @ R[i:, i:]

        # If Q is needed explicitly
        Q[i:, :] -= 2 * np.outer(v, v) @ Q[i:, :]

    return Q.T, R

In [11]:
np.random.seed(42)

A = np.array([[2., 3, 4, 5, 6],
              [4, 4, 5, 6, 7],
              [0, 3, 6, 7, 8],
              [0, 0, 2, 8, 9],
              [0, 0, 0, 1, 10]])

# Orthogonal iteration algorithm
Q = np.eye(5)

num_iter = 51
for i in range(num_iter):
    Q, R = householder(A @ Q)
    # Diagonal elements of R are evolution of eigenvalues
    if i % 10 == 0:
        print(R.diagonal())

# Compare to NumPy (sorted in descending order to match the iteration output)
print(f'\nEigenvalues from NumPy: \n{np.sort(np.linalg.eigvals(A))[::-1]}')

[-4.4721  3.1305 -2.0454 -1.7179 -7.1148]
[ 13.9041  9.6143  5.1989  1.5014 -0.3354]
[ 14.1490  9.5280  5.1553  1.5014 -0.3354]
[ 14.1539  9.5249  5.1552  1.5014 -0.3354]
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]

Eigenvalues from NumPy: 
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]
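
To check that the converged $Q$ really gives a Schur decomposition, and not only the right diagonal, one can look at the whole matrix $Q^TAQ$. The sketch below makes two assumptions that differ from the cell above: it uses `np.linalg.qr` in place of the hand-written `householder`, and it runs 300 iterations, a number picked arbitrarily to be well past convergence for this matrix. With `np.linalg.qr` the columns of $Q$ may flip sign between iterations, which does not affect the diagonal or the triangular structure of $Q^TAQ$.

```python
import numpy as np

A = np.array([[2., 3, 4, 5, 6],
              [4, 4, 5, 6, 7],
              [0, 3, 6, 7, 8],
              [0, 0, 2, 8, 9],
              [0, 0, 0, 1, 10]])

# Orthogonal iteration: A Q_i = Q_{i+1} R_{i+1}
Q = np.eye(5)
for _ in range(300):
    Q, R = np.linalg.qr(A @ Q)

T = Q.T @ A @ Q
print("largest entry below the diagonal of T:", np.abs(np.tril(T, -1)).max())
print("diagonal of T:", np.diag(T))
print("|diagonal of R|:", np.abs(np.diag(R)))
```

The strictly lower triangular part of `T` should be at rounding level, and the diagonals of `T` and `R` should agree up to sign.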


#### `QR algorithm`

So far, we iterate based on

$$Q_{i+1}R_{i+1}=AQ_i$$

We can rearrange to do it slightly differently

$$Q_i^TQ_{i+1}R_{i+1}=Q_i^TAQ_i$$

If we denote $\tilde{Q}_{i+1}=Q_i^TQ_{i+1}$ and $A_i=Q_i^TAQ_i$, then $\tilde{Q}_{i+1}$ is `orthogonal`, being a product of orthogonal matrices, and $A_i$ is `similar` to $A$ and therefore has the same eigenvalues

So we have

$$\tilde{Q}_{i+1}R_{i+1}=A_i$$

Now if we swap the factors

$$\begin{align*}
R_{i+1}\tilde{Q}_{i+1}&=\left(Q^T_{i+1}AQ_i\right)\left(Q_i^TQ_{i+1}\right) \\
&=Q^T_{i+1}AQ_{i+1} \\
&=A_{i+1}
\end{align*}$$
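
A single factor-and-swap step can be checked numerically. This is a small sketch assuming `np.linalg.qr` for the factorization; the names `A_i`, `Q_tilde` and `A_next` are chosen here to match the symbols above.

```python
import numpy as np

A_i = np.array([[2., 3, 4, 5, 6],
                [4, 4, 5, 6, 7],
                [0, 3, 6, 7, 8],
                [0, 0, 2, 8, 9],
                [0, 0, 0, 1, 10]])

Q_tilde, R = np.linalg.qr(A_i)   # A_i = Q_tilde R
A_next = R @ Q_tilde             # swap the factors

# R Q_tilde = Q_tilde^T (Q_tilde R) Q_tilde = Q_tilde^T A_i Q_tilde
print("similarity holds:", np.allclose(A_next, Q_tilde.T @ A_i @ Q_tilde))
print("eigenvalues of A_i:   ", np.sort(np.real(np.linalg.eigvals(A_i))))
print("eigenvalues of A_next:", np.sort(np.real(np.linalg.eigvals(A_next))))
```

Swapping the factors is therefore an orthogonal similarity transformation, so the eigenvalues are preserved from one iterate to the next.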

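We see that if this converges as $i\rightarrow \infty$, that is, $Q_i\rightarrow Q$, then

$$A_i=Q_i^TAQ_i\rightarrow Q^TAQ$$

Moreover, $\tilde{Q}_{i+1}=Q_i^TQ_{i+1}\rightarrow I$, so the limit of $A_i=\tilde{Q}_{i+1}R_{i+1}$ is upper triangular, and $T$ in the Schur decomposition is the limit of $A_i$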

`Essentially`, at each iteration, we approximate $T$ using

$$T\approx A_i=Q_i^TAQ_i=Q_i^TQ_{i+1}R_{i+1}=\tilde{Q}_{i+1}R_{i+1}$$

Iterating this factor-and-swap step ($A_i=\tilde{Q}_{i+1}R_{i+1}$, then $A_{i+1}=R_{i+1}\tilde{Q}_{i+1}$), we get the basic QR algorithm for finding eigenvalues of general matrices

(Of course, the QR algorithm used in practice implements many other tricks...)

In [12]:
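# QR algorithm
# Note: A is overwritten at every step by the similar matrix R @ Q
num_iter = 51
for i in range(num_iter):
    Q, R = householder(A)
    A = R @ Q

    # Diagonal elements of A are evolution of eigenvalues
    if i % 10 == 0:
        print(np.diag(A))

# Compare to NumPy (A now holds the last iterate, which is similar to the
# original matrix and therefore has the same eigenvalues)
print(f'\nEigenvalues from NumPy: \n{np.linalg.eigvals(A)}')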

[ 6.4000  5.4776  7.8298  4.5076  5.7851]
[ 13.9835  9.6517  5.1988  1.5014 -0.3354]
[ 14.1506  9.5281  5.1553  1.5014 -0.3354]
[ 14.1539  9.5249  5.1552  1.5014 -0.3354]
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]

Eigenvalues from NumPy: 
[ 14.1540  9.5248  5.1552  1.5014 -0.3354]
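
The loop above only tracks the iterates $A_i$. If the orthogonal factor of the Schur decomposition is needed as well, the $\tilde{Q}_{i}$ can be accumulated, since $Q_{i+1}=Q_i\tilde{Q}_{i+1}$. The following is a minimal sketch under assumptions made here: `np.linalg.qr` replaces the hand-written `householder`, the iteration works on a copy so the original matrix is kept, and 300 iterations is an arbitrary count chosen to be well past convergence for this matrix.

```python
import numpy as np

A0 = np.array([[2., 3, 4, 5, 6],
               [4, 4, 5, 6, 7],
               [0, 3, 6, 7, 8],
               [0, 0, 2, 8, 9],
               [0, 0, 0, 1, 10]])

A_k = A0.copy()          # iterate A_i, kept separate from the original
Q_acc = np.eye(5)        # accumulated Q_i = Q_tilde_1 Q_tilde_2 ... Q_tilde_i

for _ in range(300):
    Q_tilde, R = np.linalg.qr(A_k)   # A_i = Q_tilde R
    A_k = R @ Q_tilde                # A_{i+1} = R Q_tilde
    Q_acc = Q_acc @ Q_tilde          # Q_{i+1} = Q_i Q_tilde

T = A_k
print("largest entry below the diagonal of T:", np.abs(np.tril(T, -1)).max())
print("residual ||A - Q T Q^T||:", np.linalg.norm(A0 - Q_acc @ T @ Q_acc.T))
print("diagonal of T:", np.diag(T))
```

Both printed residuals should be at rounding level, confirming that the accumulated `Q_acc` and the final iterate `T` approximate a Schur decomposition of the original matrix.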
