### `QR` decomposition

Using `general` Gram-Schmidt, we can decompose $A\in \mathbf{R}^{n \times k}$ into two matrices, $Q\in \mathbf{R}^{n \times r}$ and $R\in \mathbf{R}^{r \times k}$, where $r=\text{rank}(A)\leq k$

$$\begin{bmatrix}a_1 & a_2 & \cdots & a_k\end{bmatrix}=\begin{bmatrix}q_1 & q_2 & \cdots & q_r\end{bmatrix}\begin{bmatrix}r_{11} & r_{12} & \cdots & r_{1k} \\ 0 & r_{22} & \cdots & r_{2k} \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0& \cdots & r_{rk} \end{bmatrix}$$

where the columns of $Q$ form an `orthonormal basis` for $R(A)$ (which need not be all of $\mathbf{R}^n$), and $R$ is in `upper staircase` form

If the columns of $A$ form an `independent` set, then $r=k$ and $R$ becomes `upper triangular`

(Therefore, $R$ is either a square or a fat matrix)

We can use a `permutation` matrix $P$ to move the columns of $R$ at which a new $q$ is generated to the front of the matrix

$$A=Q\begin{bmatrix}\bar{R} & S\end{bmatrix}P$$

* $Q^TQ=I_r$
* $\bar{R}\in \mathbf{R}^{r\times r}$ is `upper triangular` and invertible
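
As a rough numerical check of the permuted form, the sketch below runs a compact general G-S on a rank-2 matrix and rebuilds $A=Q\begin{bmatrix}\bar{R} & S\end{bmatrix}P$. The helper `general_gs`, the tolerance `1e-10` and the test matrix are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def general_gs(A, tol=1e-10):
    """General Gram-Schmidt (hypothetical helper for this sketch).

    Returns Q (n x r), staircase R (r x k), and the column indices
    at which a new q was generated.
    """
    n, k = A.shape
    Q = np.zeros((n, 0))
    R = np.zeros((min(n, k), k))
    pivots = []
    for i in range(k):
        a = A[:, i].astype(float)
        r = Q.shape[1]
        R[:r, i] = Q.T @ a              # components along existing q's
        residual = a - Q @ R[:r, i]     # what is left after removing them
        norm = np.linalg.norm(residual)
        if norm > tol and r < R.shape[0]:   # a_i brings a new direction
            R[r, i] = norm
            Q = np.column_stack([Q, residual / norm])
            pivots.append(i)
    return Q, R[:Q.shape[1]], pivots

# Columns: a1, -a1, a3, a1 + a3, so rank(A) = 2
A = np.array([[1.0, -1.0, 2.0, 3.0],
              [4.0, -4.0, 1.0, 5.0],
              [3.0, -3.0, 5.0, 8.0],
              [2.0, -2.0, 0.0, 2.0]])

Q, R, pivots = general_gs(A)
k = A.shape[1]

# Pivot columns first, remaining columns after
order = pivots + [j for j in range(k) if j not in pivots]
RS = R[:, order]                 # [R_bar  S]
R_bar = RS[:, :len(pivots)]      # square, upper triangular block
P = np.eye(k)[order]             # row i of P is e_{order[i]}, so RS @ P == R

print("pivot columns:", pivots)
print("R_bar:\n", R_bar)
print("A == Q [R_bar S] P:", np.allclose(Q @ RS @ P, A))
```

Here $P$ is built by reordering the rows of the identity, which is one of several equivalent ways to encode the column permutation.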

#### `Applications`

* directly find `orthonormal basis` for $R(A)$
* find `decomposition` of $A=BC$, $B\in\mathbf{R}^{n\times r}$, $C\in\mathbf{R}^{r\times k}$, $r=\text{rank} (A)$
* check if $b\in \text{span}(a_1, \cdots, a_k)$ by running QR on $\begin{bmatrix}a_1& \cdots & a_k & b\end{bmatrix}$
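
The last two applications can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper `general_gs` with a tolerance of `1e-10`; the names `in_span`, `b_in` and `b_out` are made up for the example.

```python
import numpy as np

def general_gs(A, tol=1e-10):
    """General Gram-Schmidt (hypothetical helper): Q, staircase R, pivot columns."""
    n, k = A.shape
    Q = np.zeros((n, 0))
    R = np.zeros((min(n, k), k))
    pivots = []
    for i in range(k):
        a = A[:, i].astype(float)
        r = Q.shape[1]
        R[:r, i] = Q.T @ a
        residual = a - Q @ R[:r, i]
        norm = np.linalg.norm(residual)
        if norm > tol and r < R.shape[0]:
            R[r, i] = norm
            Q = np.column_stack([Q, residual / norm])
            pivots.append(i)
    return Q, R[:Q.shape[1]], pivots

def in_span(A, b, tol=1e-10):
    """b is in span(a_1, ..., a_k) iff appending b generates no new q."""
    _, _, pivots = general_gs(np.column_stack([A, b]), tol)
    return A.shape[1] not in pivots

# Columns: a1, -a1, a3, so rank(A) = 2
A = np.array([[1.0, -1.0, 2.0],
              [4.0, -4.0, 1.0],
              [3.0, -3.0, 5.0],
              [2.0, -2.0, 0.0]])

b_in = 2.0 * A[:, 0] - 3.0 * A[:, 2]     # built from the columns of A
b_out = np.array([1.0, 0.0, 0.0, 0.0])   # e_1 is not in R(A) for this A

print("b_in in span: ", in_span(A, b_in))
print("b_out in span:", in_span(A, b_out))

# Rank factorization A = BC with B = Q (n x r) and C = R (r x k)
B, C, _ = general_gs(A)
print("B shape:", B.shape, "C shape:", C.shape)
print("A == BC:", np.allclose(B @ C, A))
```

The span test depends on the tolerance: a `b` that is only nearly in the span may be classified either way.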

#### `Full` QR decomposition

Now, we know for $A\in \mathbf{R}^{n \times k}$, we can use general G-S to get

$A=Q_1R_1$

where $Q_1\in \mathbf{R}^{n \times r}$, and $R_1\in \mathbf{R}^{r \times k}, r\leq k$

But how do we find a basis for the subspace `complementary` to $R(A)$ in $\mathbf{R}^n$? After all, the columns $a_1, \cdots, a_k$ live in $\mathbf{R}^n$.

We can write

$$A=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\begin{bmatrix}R_1 \\ 0\end{bmatrix}$$

To get $Q_1$, $Q_2$ and $R_1$, we can run general G-S on $\begin{bmatrix}A & \bar{A}\end{bmatrix}$, where $\bar{A}$ is any matrix that makes $\begin{bmatrix}A & \bar{A}\end{bmatrix}$ `full rank` (rank $n$). Often we can use the `identity matrix` $\bar{A}=I$

* columns of $Q_1$ are the orthonormal vectors obtained from columns of $A$
* columns of $Q_2$ are the orthonormal vectors obtained from columns of $\bar{A}$ (there are $n-r$ of them, so $Q_2$ is empty when $\text{rank}(A)=n$)
* $R(Q_1)$ and $R(Q_2)$ are `orthogonal complements` of each other in $\mathbf{R}^n$

$$R(Q_1)=R(Q_2)^{\perp}$$
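
As a cross-check against a library routine, the sketch below uses `np.linalg.qr` with `mode="complete"`, which returns a square $Q$. NumPy's routine is not the general G-S described here and does not pivot, so the split into $Q_1$ and $Q_2$ below is only valid under the assumption that $A$ is tall with independent columns. The signs of the columns may also differ from a Gram-Schmidt result.

```python
import numpy as np

# Tall matrix with independent columns (assumption of this sketch)
A = np.array([[1.0, 2.0],
              [4.0, 1.0],
              [3.0, 5.0],
              [2.0, 0.0]])
n, k = A.shape

Q, R = np.linalg.qr(A, mode="complete")   # Q is n x n, R is n x k
Q1, Q2 = Q[:, :k], Q[:, k:]               # basis of R(A) and of its complement
R1 = R[:k, :]                             # the rows below R1 are zero

print("Q^T Q == I:   ", np.allclose(Q.T @ Q, np.eye(n)))
print("A == Q1 R1:   ", np.allclose(Q1 @ R1, A))
print("Q2^T A == 0:  ", np.allclose(Q2.T @ A, 0))
```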

#### Example

In [1]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

plt.style.use('dark_background')
# color: https://matplotlib.org/stable/gallery/color/named_colors.htm

In [2]:
def full_qr(A):
    n, k = A.shape  # n rows, k columns (the k vectors a_1, ..., a_k)
    I = np.eye(n)  # Identity matrix I
    A_augmented = np.hstack([A, I])  # Create augmented matrix [A I]

    Q1, Q2 = [], []  # Start with empty list, as we don't know how many q's are there
    R1 = np.zeros((0, k))  # Same for R1
    R2 = np.zeros((0, n))  # We add a dummy R2 just to show at which columns in I new q's are generated

    for i in range(k+n):
        # Loop over all a_i
        q = A_augmented[:, i].copy()

        if i < k:  # Process columns from A
            # Remove components of a_i in directions of previous q's
            # (the inner loop is skipped while Q1 is still empty, e.g. when i=0)
            for j in range(len(Q1)):
                R1[j, i] = np.dot(Q1[j], A_augmented[:, i])
                q -= R1[j, i] * Q1[j] # -(q_j^T a_i)q_j

            # Compute norm of new q
            norm_q = np.sqrt(np.dot(q, q))

            # Only add q to Q if it is not small
            if norm_q > 1e-10:  # Tolerance
                q /= norm_q
                Q1.append(q)

                # Expand R1 to include new row corresponding to new q
                new_row = np.zeros((1, k))
                new_row[0, i] = norm_q
                R1 = np.vstack([R1, new_row])

        else:  # Process columns from I
            for j in range(len(Q1) + len(Q2)):
                if j < len(Q1):
                    # Remove components of a in direction of q's in Q_1
                    q -= np.dot(Q1[j], A_augmented[:, i]) * Q1[j]
                else:
                    # Remove components of a in direction of q's in Q_2
                    R2[j - len(Q1), i - k] = np.dot(Q2[j - len(Q1)], A_augmented[:, i])
                    q -= R2[j - len(Q1), i - k] * Q2[j - len(Q1)]

            norm_q = np.sqrt(np.dot(q, q))
            if norm_q > 1e-10:  # Tolerance
                q /= norm_q
                Q2.append(q)

                # Expand R2 to include new row corresponding to new q
                new_row = np.zeros((1, n))
                new_row[0, i - k] = norm_q
                R2 = np.vstack([R2, new_row])

    # Convert lists Q1 and Q2 to arrays
    Q1 = np.column_stack(Q1) if Q1 else np.zeros((n, 0))
    Q2 = np.column_stack(Q2) if Q2 else np.zeros((n, 0))

    return Q1, R1, Q2, R2

In [3]:
A_full_rank = False
if A_full_rank:
    A = np.array([[1.0, 2.0, 3.0, 1.0 + 2.0, 3.0, 2.0],
                  [4.0, 1.0, 0.0, 4.0 + 1.0, 0.0, 3.0],
                  [3.0, 5.0, -2.0, 3.0 + 5.0, -2.0, 7.0],
                  [2.0, 0.0, 1.0, 2.0 + 0.0, 1.0, 3.0]])
else:
    A = np.array([[1.0, -1.0, 2.0, 1.0 + 2.0],
                  [4.0, -4.0, 1.0, 4.0 + 1.0],
                  [3.0, -3.0, 5.0, 3.0 + 5.0],
                  [2.0, -2.0, 0.0, 2.0 + 0.0]])

Q1, R1, Q2, R2 = full_qr(A)

print("Q1 (Orthonormal basis for R(A)):")
print(Q1)

print("\nR1 (Upper staircase matrix corresponding to A):")
print(R1)

print("\nQ2 (Orthonormal basis for R^n \\ R(A)):")
print(Q2)

print("\nR2 (Upper staircase matrix corresponding to I):")
print(R2)

# Verify Q1 and Q2 are orthonormal
print(f"\nQ1^T Q1:\n{np.dot(Q1.T, Q1)}")
print(f"Q2^T Q2:\n{np.dot(Q2.T, Q2)}")

# Verify Q1 is orthogonal to Q2
print(f"\nQ1^T Q2:\n{np.dot(Q1.T, Q2)}")

# Verify that A = Q1 R1
A_reconstructed = np.dot(Q1, R1)

print("\nOriginal matrix A:")
print(A)
print("\nReconstructed matrix A from Q1 and R1:")
print(A_reconstructed)

Q1 (Orthonormal basis for R(A)):
[[ 0.1826  0.3324]
 [ 0.7303 -0.4602]
 [ 0.5477  0.7414]
 [ 0.3651 -0.3579]]

R1 (Upper staircase matrix corresponding to A):
[[ 5.4772 -5.4772  3.8341  9.3113]
 [ 0.0000  0.0000  3.9115  3.9115]]

Q2 (Orthonormal basis for R^n \ R(A)):
[[ 0.9253  0.0000]
 [ 0.0212  0.5044]
 [-0.3744 -0.1009]
 [ 0.0565 -0.8575]]

R2 (Upper staircase matrix corresponding to I):
[[ 0.9253  0.0212 -0.3744  0.0565]
 [ 0.0000  0.5044 -0.1009 -0.8575]]

Q1^T Q1:
[[ 1.0000  0.0000]
 [ 0.0000  1.0000]]
Q2^T Q2:
[[ 1.0000  0.0000]
 [ 0.0000  1.0000]]

Q1^T Q2:
[[-0.0000  0.0000]
 [-0.0000 -0.0000]]

Original matrix A:
[[ 1.0000 -1.0000  2.0000  3.0000]
 [ 4.0000 -4.0000  1.0000  5.0000]
 [ 3.0000 -3.0000  5.0000  8.0000]
 [ 2.0000 -2.0000  0.0000  2.0000]]

Reconstructed matrix A from Q1 and R1:
[[ 1.0000 -1.0000  2.0000  3.0000]
 [ 4.0000 -4.0000  1.0000  5.0000]
 [ 3.0000 -3.0000  5.0000  8.0000]
 [ 2.0000 -2.0000 -0.0000  2.0000]]


#### Revisit `subspaces` of matrix

We know $R(Q_1)=R(A)$, but what is $\boxed{R(Q_2)}$?

Recall that this factorization comes from running general G-S on $\begin{bmatrix}A & I \end{bmatrix}$. We transpose

$$A=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\begin{bmatrix}R_1 \\ 0\end{bmatrix}$$

to get

$$A^T=\begin{bmatrix}R_1^T & 0\end{bmatrix}\begin{bmatrix}Q_1^T \\ Q_2^T\end{bmatrix}$$


We see that for $z\in N(A^T)$

$$A^Tz=0 \Longleftrightarrow R_1^TQ_1^Tz=0$$

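Since $R_1\in\mathbf{R}^{r\times k}$ has rank $r$, $R_1^T$ is a tall or square matrix with full column rank, so $R_1^Ty=0$ only when $y=0$. Therefore the only way that $R_1^TQ_1^Tz=0$ is $Q_1^Tz=0$

This means that $z$ is orthogonal to every `column` of $Q_1$, and therefore to every `column` of $A$ as well. Because $\begin{bmatrix}Q_1 & Q_2\end{bmatrix}$ is orthogonal, $z=Q_1Q_1^Tz+Q_2Q_2^Tz=Q_2Q_2^Tz$, so $z\in R(Q_2)$. Conversely, any $z\in R(Q_2)$ satisfies $Q_1^Tz=0$ and hence $A^Tz=0$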

Therefore

$$R(Q_2)=N(A^T)$$

and columns of $Q_2$ are orthonormal basis for $N(A^T)$

Also, $R(A)=R(Q_1)$ and $N(A^T)=R(Q_2)$ are orthogonal complements in $\mathbf{R}^{\color{red}{n}}$

$$\boxed{R(A)=N(A^T)^{\perp}}$$

$$\dim R(A) + \dim N(A^T) = n$$

Similarly, applying the same argument to $A^T$ shows that $R(A^T)$ and $N(A)$ are orthogonal complements in $\mathbf{R}^{\color{red}{k}}$

$$\boxed{R(A^T)=N(A)^{\perp}}$$

$$\dim R(A^T) + \dim N(A) = k$$
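
The two boxed identities and the dimension counts can be checked numerically. The sketch below swaps in the SVD (`np.linalg.svd`) instead of the QR route used in the text, simply because it yields bases for all four subspaces in one call. The test matrix and variable names are illustrative assumptions.

```python
import numpy as np

# Columns: a1, -a1, a3, so n = 4, k = 3, rank(A) = 2
A = np.array([[1.0, -1.0, 2.0],
              [4.0, -4.0, 1.0],
              [3.0, -3.0, 5.0],
              [2.0, -2.0, 0.0]])
n, k = A.shape

U, s, Vt = np.linalg.svd(A)          # U is n x n, Vt is k x k
r = np.linalg.matrix_rank(A)

range_A, null_AT = U[:, :r], U[:, r:]        # bases for R(A) and N(A^T) in R^n
range_AT, null_A = Vt[:r].T, Vt[r:].T        # bases for R(A^T) and N(A) in R^k

print("dim R(A) + dim N(A^T) =", range_A.shape[1] + null_AT.shape[1], "(n =", n, ")")
print("dim R(A^T) + dim N(A) =", range_AT.shape[1] + null_A.shape[1], "(k =", k, ")")
print("A^T N(A^T) == 0:", np.allclose(A.T @ null_AT, 0))
print("A N(A) == 0:    ", np.allclose(A @ null_A, 0))
print("R(A) orthogonal to N(A^T):", np.allclose(range_A.T @ null_AT, 0))
print("R(A^T) orthogonal to N(A):", np.allclose(range_AT.T @ null_A, 0))
```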

That's pretty much it for the `four fundamental subspaces` of a matrix: two null spaces, $N(A)$ and $N(A^T)$, and two range spaces, $R(A)$ and $R(A^T)$

#### `Bessel's` inequality

If the columns of $U\in \mathbf{R}^{n \times n}$ form an `orthonormal set`, then $U$ is orthogonal ($U^TU=UU^T=I$) and

$$\|U^Tx\|^2=x^TUU^Tx=x^Tx=\|x\|^2$$

However, what if $U\in \mathbf{R}^{n \times \color{red}{k}}$, where $k\leq n$?

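We run full QR on $U$, that is, general G-S on $\begin{bmatrix}U & I\end{bmatrix}$. The columns of $U$ are already orthonormal, so G-S leaves them unchanged and we get $Q=\begin{bmatrix}U & \bar{U}\end{bmatrix}$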

Since $Q$ is `orthogonal` (square and columns are orthonormal), we have

$$\begin{align*}\left\|\begin{bmatrix}U & \bar{U}\end{bmatrix}^Tx\right\|^2&= \left\|\begin{bmatrix}U^T \\ \bar{U}^T\end{bmatrix}x\right\|^2\\
&= \left\|\begin{bmatrix}U^Tx \\ \bar{U}^Tx\end{bmatrix}\right\|^2\\
&=\|U^Tx\|^2+\|\bar{U}^Tx\|^2\\
&=\|x\|^2
\end{align*}$$

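Since $\|\bar{U}^Tx\|^2\geq 0$, dropping that term gives

$$\boxed{\|U^Tx\|^2 \leq \|x\|^2}$$

with equality exactly when $\bar{U}^Tx=0$, that is, when $x\in R(U)$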
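
The inequality and the identity behind it can be checked numerically. In the sketch below, $U$ and $\bar{U}$ are taken from `np.linalg.qr(..., mode="complete")` applied to a random $5\times 2$ matrix. The sizes, the seed and the random data are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2

# Square orthogonal Q from a random n x k matrix; split it into [U  U_bar]
M = rng.standard_normal((n, k))
Q, _ = np.linalg.qr(M, mode="complete")
U, U_bar = Q[:, :k], Q[:, k:]

x = rng.standard_normal(n)
captured = np.linalg.norm(U.T @ x) ** 2       # ||U^T x||^2
missed = np.linalg.norm(U_bar.T @ x) ** 2     # ||U_bar^T x||^2
total = np.linalg.norm(x) ** 2                # ||x||^2

print("||U^T x||^2 <= ||x||^2:", captured <= total)
print("sum of both parts equals ||x||^2:", np.isclose(captured + missed, total))

# Equality case: x lies in R(U)
x_in = U @ np.array([1.0, -2.0])
print("equality for x in R(U):",
      np.isclose(np.linalg.norm(U.T @ x_in) ** 2, np.linalg.norm(x_in) ** 2))
```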