#### `Gram–Schmidt` procedure

Suppose $A=\begin{bmatrix}a_1 & a_2 & \cdots & a_k\end{bmatrix}\in \mathbf{R}^{n \times k}$ has `full column rank`. Recall that the classical G-S (CGS) procedure finds matrices $Q$ and $R$ such that

$$\begin{bmatrix}a_1 & a_2 & \cdots & a_k\end{bmatrix}=\begin{bmatrix}q_1 & q_2 & \cdots & q_k\end{bmatrix}\begin{bmatrix}r_{11} & r_{12} & \cdots & r_{1k} \\ 0 & r_{22} & \cdots & r_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0& \cdots & r_{kk} \end{bmatrix}$$


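Column by column, this reads

$$a_i = (q_1^Ta_i)q_1 + (q_2^Ta_i)q_2 +\cdots+ (q_{i-1}^Ta_i)q_{i-1} + \|\tilde{q_i}\|q_i$$

so $r_{ji}=q_j^Ta_i$ for $j<i$ and $r_{ii}=\|\tilde{q_i}\|$. Solving for the last term, we write the expression as a `sum of coefficients times vectors`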

$$\begin{align*}
\tilde{q_i} &= a_i - (q_1^Ta_i)q_1 - (q_2^Ta_i)q_2 -\cdots- (q_{i-1}^Ta_i)q_{i-1} \\
q_i &= \frac{\tilde{q_i}}{\|\tilde{q_i}\|}
\end{align*}$$

and we sequentially compute each term $(q_j^Ta_i)q_j$, $j=1,\dots,i-1$, and `subtract` it from $a_i$; normalizing the result gives $q_i$
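As a small worked illustration of these formulas (the two vectors below are chosen only for convenience and do not appear elsewhere in this note), take $a_1=(1,1)^T$ and $a_2=(1,0)^T$:

$$\begin{align*}
r_{11} &= \|a_1\| = \sqrt{2}, & q_1 &= \tfrac{1}{\sqrt{2}}(1,1)^T \\
r_{12} &= q_1^Ta_2 = \tfrac{1}{\sqrt{2}}, & \tilde{q_2} &= a_2 - r_{12}q_1 = \left(\tfrac{1}{2},-\tfrac{1}{2}\right)^T \\
r_{22} &= \|\tilde{q_2}\| = \tfrac{1}{\sqrt{2}}, & q_2 &= \tfrac{1}{\sqrt{2}}(1,-1)^T
\end{align*}$$

One can check that $r_{11}q_1=a_1$ and $r_{12}q_1+r_{22}q_2=\left(\tfrac{1}{2},\tfrac{1}{2}\right)^T+\left(\tfrac{1}{2},-\tfrac{1}{2}\right)^T=a_2$, so $A=QR$ holds column by column.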

#### `Equivalent` iterative procedure

On the other hand, the same expression can be written in terms of `rank-one projection matrices`

$$\begin{align*}
\tilde{q_i} &= a_i - (q_1q_1^T)a_i - (q_2q_2^T)a_i -\cdots- (q_{i-1}q_{i-1}^T)a_i\\
&=(I-Q_{i-1}Q_{i-1}^T)a_i \\
&=(I-q_{i-1}q_{i-1}^T)(I-q_{i-2}q_{i-2}^T)\cdots (I-q_1q_1^T) a_i
\end{align*}$$

where $Q_{i-1}=\begin{bmatrix}q_1 & q_2 & \cdots & q_{i-1}\end{bmatrix}$
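The three forms of $\tilde{q_i}$ can be compared numerically. The following is a minimal sketch, not part of the original notebook: the sizes `n` and `i` and the random data are arbitrary, and the orthonormal columns are taken from `np.linalg.qr` purely for convenience.

```python
import numpy as np

rng = np.random.default_rng(0)
n, i = 6, 4  # hypothetical sizes: ambient dimension, index of the column being processed

# Orthonormal columns q_1, ..., q_{i-1} (obtained from NumPy's QR for convenience)
Q_prev, _ = np.linalg.qr(rng.standard_normal((n, i - 1)))
a = rng.standard_normal(n)

# Form 1: subtract each coefficient-times-vector term from a_i
v_sum = a - sum((Q_prev[:, j] @ a) * Q_prev[:, j] for j in range(i - 1))

# Form 2: block projector (I - Q Q^T) a_i, without forming the n x n matrix
v_block = a - Q_prev @ (Q_prev.T @ a)

# Form 3: product of rank-one projectors, applied one after another
v_prod = a.copy()
for j in range(i - 1):
    q = Q_prev[:, j]
    v_prod = v_prod - (q @ v_prod) * q

print(np.allclose(v_sum, v_block), np.allclose(v_block, v_prod))
```

For well-conditioned data such as this, the three results agree to rounding error; the differences only become visible when the columns are nearly dependent.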

The equality

$$(I-Q_{i-1}Q_{i-1}^T)a_i =(I-q_{i-1}q_{i-1}^T)(I-q_{i-2}q_{i-2}^T)\cdots (I-q_1q_1^T) a_i$$

follows from the fact that when two unit vectors $u, v$ are orthogonal, projecting onto the orthogonal complement of $R([u, v])$ is equivalent to projecting first onto the orthogonal complement of $R(u)$ and then onto the orthogonal complement of $R(v)$

$$(I-vv^T)(I-uu^T)=I-uu^T-vv^T+v(v^Tu)u^T=I-vv^T-uu^T=I-[u, v][u,v]^T$$

This also indicates that, in exact arithmetic, subtracting $q_{j}q_{j}^Ta_i$ at each iteration is equivalent to left-multiplying the current partial result by $(I-q_{j}q_{j}^T)$
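The role of orthogonality in this identity can be checked with a short sketch. The vectors `u`, `v`, `w` and the helper names `sequential` and `block` are illustrative choices, not taken from the original notebook.

```python
import numpy as np

I = np.eye(3)
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])                  # unit vector orthogonal to u
w = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # unit vector NOT orthogonal to u

def sequential(p, q):
    """(I - q q^T)(I - p p^T): remove the p component, then the q component."""
    return (I - np.outer(q, q)) @ (I - np.outer(p, p))

def block(p, q):
    """I - [p, q][p, q]^T."""
    B = np.column_stack([p, q])
    return I - B @ B.T

orthogonal_case = np.allclose(sequential(u, v), block(u, v))
non_orthogonal_case = np.allclose(sequential(u, w), block(u, w))
print(orthogonal_case, non_orthogonal_case)  # True False
```

In the second case the cross term $w(w^Tu)u^T$ does not vanish, so the two expressions differ. This is why the equivalence relies on the previously computed $q_j$ being orthogonal to each other.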

#### `Modified` G-S (MGS)

With this formulation, MGS uses the following iterative procedure, which applies each projector to the latest intermediate result $q_i^{(k)}$ rather than always going back to $a_i$

$$\begin{align*}
q_i^{(1)}&=a_i \\
q_i^{(2)}&=a_i-(q_1^Ta_i)q_1 \,\,\text{(same as CGS)} \\
q_i^{(3)}&=(I-q_2q_2^T)q_i^{(2)}=q_i^{(2)}-\left(q_2^Tq_i^{(2)}\right)q_2 \,\,\text{(different from CGS)} \\
q_i^{(4)}&=(I-q_3q_3^T)q_i^{(3)}=q_i^{(3)}-\left(q_3^Tq_i^{(3)}\right)q_3 \,\,\text{(different from CGS)} \\
&\vdots \\
q_i^{(i)}&=(I-q_{i-1}q_{i-1}^T)q_i^{({i-1})}=q_i^{({i-1})}-\left(q_{i-1}^Tq_i^{({i-1})}\right)q_{i-1} \,\,\text{(different from CGS)}
\end{align*}$$

In the algorithm, the only change is to replace `R[j, i] = np.dot(Q[:, j], A[:, i])` with `R[j, i] = np.dot(Q[:, j], q)`, where `q` holds the current intermediate vector

#### Intuition of better `numerical stability` of MGS

In MGS, the orthogonalization coefficient is computed as $q_j^Tq_i^{(k)}$. The vector $q_i^{(k)}$ has already had its components along $q_1,\dots,q_{k-1}$ removed, so it lies (up to rounding) in a subspace whose dimension shrinks as the iteration $k$ moves on. An error in $q_j$, which may point anywhere in the full space, can therefore only enter through its overlap with $q_i^{(k)}$, and error propagation is limited.

In CGS, by contrast, the coefficient $q_j^Ta_i$ always involves the original $a_i$, which lives in the full space, so an error in $q_j$ is not restricted to a lower-dimensional subspace
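To make this concrete, here is a hand computation, offered as a sketch, on the ill-conditioned matrix used in the example in this section, with columns $a_1=(1,\epsilon,0,0)^T$, $a_2=(1,0,\epsilon,0)^T$, $a_3=(1,0,0,\epsilon)^T$. It assumes double-precision arithmetic with $\epsilon=10^{-8}$, so that $1+\epsilon^2$ rounds to $1$; every other step is treated as exact.

The first two columns are the same for both variants:

$$\begin{align*}
q_1 &= a_1 = (1,\epsilon,0,0)^T \\
\tilde{q_2} &= a_2-(q_1^Ta_2)q_1 = (0,-\epsilon,\epsilon,0)^T, \quad q_2=\tfrac{1}{\sqrt{2}}(0,-1,1,0)^T
\end{align*}$$

For the third column, CGS uses $a_3$ in every coefficient:

$$\begin{align*}
q_1^Ta_3 &= 1, \quad q_2^Ta_3 = 0 \\
\tilde{q_3} &= a_3 - q_1 = (0,-\epsilon,0,\epsilon)^T, \quad q_3=\tfrac{1}{\sqrt{2}}(0,-1,0,1)^T, \quad q_2^Tq_3=\tfrac{1}{2}
\end{align*}$$

whereas MGS uses the intermediate vector:

$$\begin{align*}
q_3^{(2)} &= a_3 - q_1 = (0,-\epsilon,0,\epsilon)^T, \quad q_2^Tq_3^{(2)} = \tfrac{\epsilon}{\sqrt{2}} \\
q_3^{(3)} &= q_3^{(2)} - \tfrac{\epsilon}{\sqrt{2}}q_2 = \left(0,-\tfrac{\epsilon}{2},-\tfrac{\epsilon}{2},\epsilon\right)^T, \quad q_3=\tfrac{1}{\sqrt{6}}(0,-1,-1,2)^T, \quad q_2^Tq_3=0
\end{align*}$$

The computed $q_2$ has lost its small first component to rounding. CGS pairs this $q_2$ with $a_3$, whose first component is $1$, so the coefficient comes out as $0$ instead of roughly $\epsilon/\sqrt{2}$. MGS pairs it with $q_3^{(2)}$, whose first component has already been removed, so the lost component has no effect on the coefficient.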

#### Example

In [1]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

plt.style.use('dark_background')
# color: https://matplotlib.org/stable/gallery/color/named_colors.htm

In [2]:
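def gram_schmidt(A, modified=True):
    """QR factorization by classical (CGS) or modified (MGS) Gram-Schmidt.

    Assumes the columns of A are linearly independent.
    """
    A = np.asarray(A, dtype=float)  # integer input would otherwise truncate Q
    n = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((n, n))

    for i in range(n):
        q = A[:, i].copy()  # running intermediate vector, starts as a_i

        for j in range(i):
            if modified:
                # MGS: coefficient from the current intermediate vector
                R[j, i] = np.dot(Q[:, j], q)
            else:
                # CGS: coefficient always from the original column a_i
                R[j, i] = np.dot(Q[:, j], A[:, i])
            q -= R[j, i] * Q[:, j]

        # Normalize what is left to obtain q_i
        R[i, i] = np.sqrt(np.dot(q, q))
        q /= R[i, i]
        Q[:, i] = q

    return Q, R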

In [3]:
mat_list = ['square', 'non-square', 'ill-conditioned']
mat = mat_list[2]
epsilon = 1e-8

if mat == 'square':
    A = np.array([[1.0, 2.0, 3.0, 4.0],
                  [4.0, 1.0, 0.0, -1.0],
                  [3.0, 5.0, -2.0, 1.0],
                  [2.0, 0.0, 1.0, 2.0]])
elif mat == 'non-square':
    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 1.0, 0.0],
                  [3.0, 5.0, -2.0],
                  [2.0, 0.0, 1.0]])
elif mat == 'ill-conditioned':
    A = np.array([[1, 1, 1],
                  [epsilon, 0, 0],
                  [0, epsilon, 0],
                  [0, 0, epsilon]])

Q, R = gram_schmidt(A, modified=True)

print("Orthonormal basis Q:")
print(Q)

print("\nUpper triangular matrix R:")
print(R)

# Verify Q is orthonormal
print(f"\nQ^TQ:\n{np.dot(Q.T, Q)}")
print(f"Norms: \n{np.linalg.norm(Q, axis=0)}")

# Verify that A = QR
A_reconstructed = np.dot(Q, R)
print("\nOriginal matrix A:")
print(A)
print("\nReconstructed matrix A from Q and R:")
print(A_reconstructed)

Orthonormal basis Q:
[[ 1.0000  0.0000  0.0000]
 [ 0.0000 -0.7071 -0.4082]
 [ 0.0000  0.7071 -0.4082]
 [ 0.0000  0.0000  0.8165]]

Upper triangular matrix R:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]]

Q^TQ:
[[ 1.0000 -0.0000 -0.0000]
 [-0.0000  1.0000  0.0000]
 [-0.0000  0.0000  1.0000]]
Norms: 
[ 1.0000  1.0000  1.0000]

Original matrix A:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]]

Reconstructed matrix A from Q and R:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000 -0.0000]
 [ 0.0000  0.0000  0.0000]]


In [4]:
Q, R = gram_schmidt(A, modified=False)

print("Orthonormal basis Q:")
print(Q)

print("\nUpper triangular matrix R:")
print(R)

# Verify Q is orthonormal
print(f"\nQ^TQ:\n{np.dot(Q.T, Q)}")
print(f"Norms: \n{np.linalg.norm(Q, axis=0)}")

# Verify that A = QR
A_reconstructed = np.dot(Q, R)
print("\nOriginal matrix A:")
print(A)
print("\nReconstructed matrix A from Q and R:")
print(A_reconstructed)

Orthonormal basis Q:
[[ 1.0000  0.0000  0.0000]
 [ 0.0000 -0.7071 -0.7071]
 [ 0.0000  0.7071  0.0000]
 [ 0.0000  0.0000  0.7071]]

Upper triangular matrix R:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]]

Q^TQ:
[[ 1.0000 -0.0000 -0.0000]
 [-0.0000  1.0000  0.5000]
 [-0.0000  0.5000  1.0000]]
Norms: 
[ 1.0000  1.0000  1.0000]

Original matrix A:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]]

Reconstructed matrix A from Q and R:
[[ 1.0000  1.0000  1.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]
 [ 0.0000  0.0000  0.0000]]


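Here, with CGS, $Q^TQ$ is no longer the identity matrix: the off-diagonal entry $q_2^Tq_3$ is $0.5$, so $q_2$ and $q_3$ are $60^\circ$ apart instead of orthogonal, and CGS has failed to find an orthonormal basis. Entries of order `epsilon` print as `0.0000` under the four-decimal format, which is why $R$ looks singular in both outputs.

MGS, in contrast, still produces an orthonormal basis (to the printed precision) for the same matrix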
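Because the four-decimal print format hides small deviations, it can help to measure the loss of orthogonality directly. The sketch below restates the Gram-Schmidt routine from the example so that it runs on its own; the helper `orthogonality_error` is a hypothetical name introduced here, and it reports the largest absolute entry of $Q^TQ-I$.

```python
import numpy as np

def gram_schmidt(A, modified=True):
    """CGS / MGS, restated from the example above so this snippet is self-contained."""
    A = np.asarray(A, dtype=float)
    n = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((n, n))
    for i in range(n):
        q = A[:, i].copy()
        for j in range(i):
            # MGS projects the running vector q, CGS the original column a_i
            target = q if modified else A[:, i]
            R[j, i] = Q[:, j] @ target
            q -= R[j, i] * Q[:, j]
        R[i, i] = np.sqrt(q @ q)
        Q[:, i] = q / R[i, i]
    return Q, R

def orthogonality_error(Q):
    """Largest absolute entry of Q^T Q - I (hypothetical helper)."""
    k = Q.shape[1]
    return np.max(np.abs(Q.T @ Q - np.eye(k)))

epsilon = 1e-8
A = np.array([[1, 1, 1],
              [epsilon, 0, 0],
              [0, epsilon, 0],
              [0, 0, epsilon]])

cgs_err = orthogonality_error(gram_schmidt(A, modified=False)[0])
mgs_err = orthogonality_error(gram_schmidt(A, modified=True)[0])
print(f"CGS: {cgs_err:.2e}")
print(f"MGS: {mgs_err:.2e}")
```

On this matrix one would expect an error of about $0.5$ for CGS and an error on the order of `epsilon` for MGS. MGS is therefore far better here, but its basis is orthogonal only up to roughly `epsilon`, not up to machine precision.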