#### Hessenberg decomposition

A Hessenberg matrix is a special kind of square matrix, one that is "almost" triangular

To be exact, an upper Hessenberg matrix has zero entries below the first subdiagonal, and a lower Hessenberg matrix has zero entries above the first superdiagonal
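The definition above can be checked numerically. The following is a minimal sketch, not part of the original notebook; the helper name `is_upper_hessenberg`, its tolerance `tol`, and the matrix `H_example` are illustrative assumptions

```python
import numpy as np

def is_upper_hessenberg(M, tol=1e-12):
    """Return True if every entry below the first subdiagonal is (numerically) zero."""
    M = np.asarray(M, dtype=float)
    # np.tril(M, k=-2) keeps only the entries strictly below the first subdiagonal
    return bool(np.all(np.abs(np.tril(M, k=-2)) <= tol))

# hypothetical 4 x 4 example: nonzeros are allowed on the first subdiagonal and above
H_example = np.array([[1.0, 2.0, 3.0, 4.0],
                      [5.0, 6.0, 7.0, 8.0],
                      [0.0, 9.0, 1.0, 2.0],
                      [0.0, 0.0, 3.0, 4.0]])

print(is_upper_hessenberg(H_example))    # the matrix itself is upper Hessenberg
print(is_upper_hessenberg(H_example.T))  # its transpose is lower Hessenberg instead
```

A lower Hessenberg check follows by applying the same helper to the transpose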

Similar to QR, the Hessenberg decomposition of $A\in \mathbf{R}^{m \times m}$ finds an orthogonal matrix $Q$ such that

$$A=QHQ^T$$

where $H$ is in Hessenberg form
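As a quick sanity check of this factorization, one option is `scipy.linalg.hessenberg`, which computes the full decomposition directly. This is a sketch that assumes SciPy is installed (the notebook itself only imports NumPy); the names `A_demo`, `H_demo` and `Q_demo` are illustrative

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A_demo = rng.random((5, 5))

# calc_q=True also returns the orthogonal factor, so that A_demo = Q_demo @ H_demo @ Q_demo.T
H_demo, Q_demo = hessenberg(A_demo, calc_q=True)

print(np.allclose(Q_demo @ H_demo @ Q_demo.T, A_demo))  # reconstruction check
print(np.allclose(Q_demo.T @ Q_demo, np.eye(5)))        # orthogonality check
print(np.allclose(np.tril(H_demo, k=-2), 0.0))          # upper Hessenberg structure
```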

#### Arnoldi iteration

The Arnoldi iteration works just like modified Gram-Schmidt, progressively finding both the first $(n+1)$ columns of $Q$ and the $(n+1) \times n$ upper-left section of $H$

To see this, we can write more explicitly a `reduced version` of $AQ=QH$

$$AQ_n = A\begin{bmatrix}q_1 & q_2 & \cdots & q_n\end{bmatrix}=Q_{n+1}H_{n+1,n}=\begin{bmatrix}q_1 & q_2 & \cdots & q_{n+1}\end{bmatrix}\begin{bmatrix}h_{11} & \cdots & h_{1n} \\ h_{21}& & \vdots \\
& \ddots & \vdots \\ & & h_{n+1,n}\end{bmatrix}$$

We have $n+1$ on the right because expressing each $Aq_n$ requires $q_{n+1}$ and $h_{n+1,n}$, due to the nonzeros on the first subdiagonal (we can see this from the full $AQ=QH$)

Unlike MGS, where each $a_n$ produces only one new $q_n$, in Arnoldi the residual of $Aq_n$ after removing its projections onto $q_1, \cdots, q_n$ is used to compute $q_{n+1}$ and $h_{n+1, n}$
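Written column by column, the reduced relation reads as follows (a short restatement that uses only the symbols already defined)

$$Aq_n = h_{1n}q_1 + \cdots + h_{nn}q_n + h_{n+1,n}q_{n+1}$$

Taking the inner product with $q_j$ for $j \le n$ and using orthonormality gives $h_{jn}=q_j^TAq_n$. What remains after subtracting these components is the residual $h_{n+1,n}q_{n+1}$, so $h_{n+1,n}$ is its norm and $q_{n+1}$ is its direction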

#### Example

In [7]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

plt.style.use('dark_background')
# color: https://matplotlib.org/stable/gallery/color/named_colors.htm

In [8]:
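def arnoldi_iteration(A, b, n, tol=1e-10):
    """Run up to n Arnoldi steps; return Q of shape (m, n+1) and H of shape (n+1, n)."""
    m = A.shape[0]
    Q = np.zeros((m, n+1))
    H = np.zeros((n+1, n))
    # q_1 is the normalized starting vector
    Q[:, 0] = b / np.linalg.norm(b)

    for i in range(n):
        # candidate for the next Krylov direction
        v = A @ Q[:, i]

        # modified Gram-Schmidt: remove the components along q_1, ..., q_{i+1}
        for j in range(i+1):
            H[j, i] = Q[:, j] @ v
            v -= H[j, i] * Q[:, j]

        # the norm of the residual is the subdiagonal entry
        H[i+1, i] = np.linalg.norm(v)
        if H[i+1, i] < tol:
            # breakdown: the residual vanishes, so the remaining columns stay zero
            break
        Q[:, i+1] = v / H[i+1, i]

    return Q, H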

In [9]:
np.random.seed(42)

m = 15
A = np.random.rand(m, m)
b = np.random.rand(m)

Q, H = arnoldi_iteration(A, b, 7)

In [10]:
print(f'Q {Q.shape}\n', Q)
print('\nCheck orthogonality in Q:\n', Q.T @ Q)
print(f'\nH {H.shape}\n', H)

print('\nAQ:\n', A @ Q[:,:-1])
print('\nQH:\n', Q @ H)
print('\nDifference:\n', np.linalg.norm(A @ Q[:,:-1] - Q @ H))

Q (15, 8)
 [[ 0.1102  0.3124  0.2661 -0.0351  0.3267  0.4930 -0.2339  0.1209]
 [ 0.4395 -0.3489 -0.0580  0.0320  0.1260 -0.3865 -0.2913  0.1954]
 [ 0.1775  0.2516 -0.5857  0.1913 -0.3183 -0.0470  0.3381 -0.0138]
 [ 0.4029 -0.1077 -0.1677  0.1278  0.2770  0.0338  0.1299 -0.1749]
 [ 0.2851  0.0102  0.1524 -0.4346  0.1633  0.0675  0.5022 -0.0603]
 [ 0.3590 -0.1109  0.0013  0.2131  0.0013 -0.0431 -0.3857 -0.4881]
 [ 0.2270  0.1178 -0.1251 -0.4649 -0.3217  0.0305 -0.2454  0.2847]
 [ 0.2606  0.0953  0.1109 -0.0459  0.4375 -0.2283  0.2206  0.2795]
 [ 0.2224  0.1107  0.0368 -0.0883 -0.0993  0.3414 -0.0207 -0.4470]
 [ 0.0882  0.3489 -0.2002  0.0839  0.0623  0.1520 -0.3877  0.2814]
 [ 0.3263 -0.0689  0.4790  0.3869 -0.4890  0.0911  0.1414  0.3641]
 [ 0.1268  0.2134  0.1753  0.3575 -0.0167  0.1151  0.1949 -0.1042]
 [ 0.0110  0.5508  0.3592 -0.1496 -0.1471 -0.5953 -0.0946 -0.2769]
 [ 0.2915  0.0097 -0.1020 -0.3624 -0.2444  0.1372  0.0018 -0.0313]
 [ 0.0800  0.4274 -0.2500  0.2148  0.2082 -0.1012  

#### Relation to Krylov subspace

For $n=1$, we know that $q_1=\frac{b}{\|b\|}\in K_1(A, b)=\text{span}(b)$

Now assume that for some $n$, $q_1, \cdots, q_n \in K_n(A,b)$. When we compute $v=Aq_n$, we know that by definition

$$v\in AK_n(A,b) \subseteq K_{n+1}(A,b) $$

To get $q_{n+1}$, we need to subtract projections of $v$ onto previous $q_j, j=1, \cdots, n$

$$v=Aq_n-\sum_{j=1}^nh_{jn}q_j, \,\,q_{n+1}=\frac{v}{h_{n+1,n}}$$

Since each $q_j \in K_n(A,b)\subseteq K_{n+1}(A,b)$, we know $q_{n+1}\in K_{n+1}(A,b)$, as it is a linear combination of vectors in $K_{n+1}(A,b)$

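By induction, $\text{span}(q_1, \cdots, q_n)\subseteq K_n(A,b)$ for every $n$. The $q_j$ are orthonormal, hence linearly independent, while $K_n(A,b)$ has dimension at most $n$, so the inclusion is an equality and the orthonormal vectors from the Arnoldi algorithm satisfy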

$$\text{span}(q_1, q_2, \cdots, q_n)=K_n(A, b)=\text{span}(b, Ab, \cdots, A^{n-1}b)$$

and thus form an `orthonormal basis` for the Krylov subspaces. Therefore, they must be the reduced factor $Q_n$ in a QR factorization $K_n=Q_nR_n$ of the Krylov matrix $K_n=\begin{bmatrix}b & Ab & \cdots & A^{n-1}b\end{bmatrix}$
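The QR connection can be checked numerically. The sketch below is self-contained and not part of the original notebook: `arnoldi_basis` is a compact, hypothetical variant of the Arnoldi loop without breakdown handling, and `A_demo`, `b_demo`, `m_demo`, `n_demo` are illustrative names. The Krylov matrix becomes ill-conditioned quickly as $n$ grows, so $n$ is kept small here

```python
import numpy as np

def arnoldi_basis(A, b, n):
    """Compact Arnoldi sketch: first n orthonormal vectors (no breakdown handling)."""
    Q = np.zeros((A.shape[0], n))
    Q[:, 0] = b / np.linalg.norm(b)
    for i in range(n - 1):
        v = A @ Q[:, i]
        # modified Gram-Schmidt against all previous vectors
        for j in range(i + 1):
            v -= (Q[:, j] @ v) * Q[:, j]
        Q[:, i + 1] = v / np.linalg.norm(v)
    return Q

rng = np.random.default_rng(0)
m_demo, n_demo = 10, 4
A_demo = rng.random((m_demo, m_demo))
b_demo = rng.random(m_demo)

# Krylov matrix K_n = [b, Ab, ..., A^{n-1} b], built column by column
K = np.empty((m_demo, n_demo))
K[:, 0] = b_demo
for k in range(1, n_demo):
    K[:, k] = A_demo @ K[:, k - 1]

Q_arnoldi = arnoldi_basis(A_demo, b_demo, n_demo)
Q_qr, R_qr = np.linalg.qr(K)  # reduced QR is the default mode

# The two bases agree up to the sign of each column,
# so |Q_qr^T Q_arnoldi| should be close to the identity
print(np.abs(Q_qr.T @ Q_arnoldi))
```

The absolute value is needed because `np.linalg.qr` does not promise a positive diagonal in $R$, so individual columns may differ by a factor of $-1$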

#### Hessenberg form and eigenvalues

Previously, we saw that Arnoldi iteration produces

$$AQ_n=Q_{n+1}H_{n+1, n}$$

Because the columns of $Q_{n+1}$ are orthonormal, forming $Q_n^TQ_{n+1}$ gives the $n \times n$ `identity` matrix with an extra column of zeros appended, of size $n \times (n+1)$

Therefore, the effect of $Q_n^TQ_{n+1}H_{n+1,n}$ is equivalent to chopping off the `final row` of $H_{n+1,n}$

$$H_n = Q_n^TQ_{n+1}H_{n+1,n}=\begin{bmatrix}h_{11} & \cdots & \cdots & h_{1n} \\ h_{21}& h_{22} & & \vdots \\
& \ddots & \ddots& \vdots \\ & & h_{n,n-1} & h_{nn}\end{bmatrix}$$

Substituting $AQ_n=Q_{n+1}H_{n+1,n}$ from the first expression, we have

$$H_n = Q_n^TAQ_n$$

In [11]:
print('H_n:\n', H[:-1,:])
print('\nQ_n.TQ_{n+1}H_{n+1,n}:\n', Q[:,:-1].T @ Q @ H)
print('\nDifference:\n', np.linalg.norm(H[:-1,:] - Q[:,:-1].T @ Q @ H))

H_n:
 [[ 5.6578  2.6524  0.0570  0.1914  0.1585  0.2249 -0.3289]
 [ 3.0653  1.7470 -0.3188 -0.0119  0.4163  0.0132  0.2842]
 [ 0.0000  0.7440  0.1827  0.0356 -0.0546 -0.3900 -0.0085]
 [ 0.0000  0.0000  0.9925 -0.4313  0.1352  0.5985 -0.3471]
 [ 0.0000  0.0000  0.0000  0.8850 -0.4087 -0.0556 -0.0881]
 [ 0.0000  0.0000  0.0000  0.0000  0.7869 -0.2393 -0.2453]
 [ 0.0000  0.0000  0.0000  0.0000  0.0000  0.9218  0.0942]]

Q_n.TQ_{n+1}H_{n+1,n}:
 [[ 5.6578  2.6524  0.0570  0.1914  0.1585  0.2249 -0.3289]
 [ 3.0653  1.7470 -0.3188 -0.0119  0.4163  0.0132  0.2842]
 [-0.0000  0.7440  0.1827  0.0356 -0.0546 -0.3900 -0.0085]
 [ 0.0000  0.0000  0.9925 -0.4313  0.1352  0.5985 -0.3471]
 [ 0.0000  0.0000  0.0000  0.8850 -0.4087 -0.0556 -0.0881]
 [-0.0000 -0.0000  0.0000 -0.0000  0.7869 -0.2393 -0.2453]
 [-0.0000 -0.0000  0.0000 -0.0000 -0.0000  0.9218  0.0942]]

Difference:
 2.033486979914258e-15
