## **Singular Value Decomposition (SVD)**

### **Motivation**

Consider a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ represented by the matrix $\mathbf{A}_{m \times n}$. Let $r = \text{rank}(\mathbf{A}) \leq \min\{m, n\}$ denote the rank of $\mathbf{A}$.

A central task in linear algebra is to analyze the various subspaces associated with a matrix $\mathbf{A}$. In particular, we are interested in the following subspaces (adapted from [this source](https://ccjou.wordpress.com/2012/11/19/%E7%9F%A9%E9%99%A3%E7%9A%84%E5%9B%9B%E5%80%8B%E5%9F%BA%E6%9C%AC%E5%AD%90%E7%A9%BA%E9%96%93%E5%9F%BA%E5%BA%95%E7%AE%97%E6%B3%95/)):

- The **null space** $N(\mathbf{A})$, spanned by $\{ \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n \}$
- The **row space** $C(\mathbf{A}^T)$, spanned by $\{ \mathbf{v}_1, \ldots, \mathbf{v}_r \}$
- The **column space** $C(\mathbf{A})$, spanned by $\{ \mathbf{u}_1, \ldots, \mathbf{u}_r \}$
- The **left null space** $N(\mathbf{A}^T)$, spanned by $\{ \mathbf{u}_{r+1}, \ldots, \mathbf{u}_m \}$

Here, $\{ \mathbf{v}_1, \ldots, \mathbf{v}_n \}$ and $\{ \mathbf{u}_1, \ldots, \mathbf{u}_m \}$ are bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. At this point they are **arbitrary**, apart from being ordered so that the indicated subsets span each subspace.

A fundamental property of linear transformations is that the image $\mathbf{A} \mathbf{v}$ of any vector $\mathbf{v}$, and in particular of each basis vector $\mathbf{v}_i$, lies in the column space $C(\mathbf{A})$. Each image can therefore be written as a linear combination of $\mathbf{u}_1, \ldots, \mathbf{u}_r$:

$$
\mathbf{A} \mathbf{v}_i = a_{1,i} \mathbf{u}_1 + a_{2,i} \mathbf{u}_2 + \ldots + a_{r,i} \mathbf{u}_r, \quad i = 1, \ldots, r
$$

This raises the question: Can we select bases for $C(\mathbf{A}^T)$ and $C(\mathbf{A})$ such that this transformation becomes simpler and more elegant? The answer is yes! The core objective of Singular Value Decomposition (SVD) is to identify bases that simplify this transformation. By selecting the appropriate bases, we can express the relationship as:

$$
\mathbf{A} \mathbf{v}_i = \sigma_i \mathbf{u}_i, \quad i = 1, \ldots, r
$$

where $\sigma_i > 0$ are the singular values of $\mathbf{A}$. We will explore this in more detail in the following sections.

### **Derivation**

Since $\mathbf{A}$ is not necessarily square, let alone diagonalizable, we cannot in general perform an eigen-decomposition of $\mathbf{A}$ itself to find eigenbases. Instead, we examine the cross-product matrices $\mathbf{A}^T \mathbf{A} \in \mathbb{R}^{n \times n}$ and $\mathbf{A} \mathbf{A}^T \in \mathbb{R}^{m \times m}$, both of which are symmetric and positive semi-definite.

#### **Choosing Bases: Null Space $N(\mathbf{A})$ and Row Space $C(\mathbf{A}^T)$**

We begin by analyzing the eigen-decomposition of $\mathbf{A}^T \mathbf{A}$:

1. Since $\mathbf{A}^T \mathbf{A}$ is symmetric and positive semi-definite, it can be diagonalized as:

   $$
   \mathbf{A}^T \mathbf{A} = \mathbf{V} \mathbf{\Sigma}^2 \mathbf{V}^T
   $$

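   where the columns of $\mathbf{V}$ are orthonormal eigenvectors $\mathbf{v}_i$ (so $\mathbf{V}^T \mathbf{V} = \mathbf{I}$), and $\mathbf{\Sigma}^2$ is the diagonal matrix of eigenvalues $\sigma^2_i \geq 0$, ordered so that $\sigma^2_1 \geq \ldots \geq \sigma^2_n$.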

2. Using matrix rank relations:
   - The rank of $\mathbf{A}^T \mathbf{A}$ equals the rank of $\mathbf{A}$, which is $r$ (proof omitted).
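   - Since $\mathbf{V}$ is orthogonal and therefore invertible, the rank of $\mathbf{A}^T \mathbf{A} = \mathbf{V} \mathbf{\Sigma}^2 \mathbf{V}^T$ is the same as the rank of $\mathbf{\Sigma}^2$.
   - The rank of $\mathbf{\Sigma}^2$ is therefore $r$. With the eigenvalues in descending order, this means $\sigma^2_1, \ldots, \sigma^2_r > 0$ and $\sigma^2_{r+1} = \ldots = \sigma^2_{n} = 0$.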

3. For any eigenvector $\mathbf{v}_i$ of $\mathbf{A}^T \mathbf{A}$, we have:

   $$
   \Vert \mathbf{A} \mathbf{v}_i \Vert^2 = \mathbf{v}_i^T \mathbf{A}^T \mathbf{A} \mathbf{v}_i = \sigma^2_i \mathbf{v}_i^T \mathbf{v}_i = \sigma^2_i
   $$

   This implies that $\mathbf{A} \mathbf{v}_i = \mathbf{0}$ for $i = r+1, \ldots, n$, since $\sigma^2_i = 0$ for these indices.

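4. By the rank-nullity theorem, $\dim N(\mathbf{A}) = n - r$. The $n - r$ orthonormal vectors $\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n$ all lie in $N(\mathbf{A})$, and the row space is the orthogonal complement of the null space, so:
   - The null space $N(\mathbf{A})$ is spanned by $\{ \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n \}$,
   - The row space $C(\mathbf{A}^T)$ is spanned by $\{ \mathbf{v}_1, \ldots, \mathbf{v}_r \}$.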
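
The four steps above can be checked numerically. The following is a minimal sketch using NumPy; the matrix `A` is a hypothetical rank-2 example chosen for illustration (its third row is the sum of the first two), not something taken from the derivation itself.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
r = np.linalg.matrix_rank(A)

# Step 1: eigen-decomposition of the symmetric matrix A^T A.
# eigh returns eigenvalues in ascending order, so reorder to descending.
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Step 3: ||A v_i||^2 equals the i-th eigenvalue of A^T A.
norms_sq = np.sum((A @ V) ** 2, axis=0)

# Steps 2 and 4: the last n - r eigenvectors are mapped to (numerically) zero,
# so they form a basis of the null space N(A).
null_basis = V[:, r:]
null_residual = np.abs(A @ null_basis).max()
```

In floating point the "zero" eigenvalues come out as tiny numbers of either sign, so comparisons should use a tolerance rather than exact equality.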

#### **Choosing Bases: Left Null Space $N(\mathbf{A}^T)$ and Column Space $C(\mathbf{A})$**

Applying the same reasoning to the eigen-decomposition of $\mathbf{A} \mathbf{A}^T$, we conclude that:

1. The eigen-decomposition of $\mathbf{A} \mathbf{A}^T$ is:

   $$
   \mathbf{A} \mathbf{A}^T = \mathbf{U} \mathbf{\Sigma}^2 \mathbf{U}^T
   $$

   where the columns of $\mathbf{U}$ are orthonormal eigenvectors $\mathbf{u}_i$, and $\mathbf{\Sigma}^2$ here denotes the $m \times m$ diagonal matrix of eigenvalues $\sigma^2_i \geq 0$ in descending order.

   > **Note:** $\mathbf{A}^T \mathbf{A}$ and $\mathbf{A} \mathbf{A}^T$ share the same nonzero eigenvalues (see [this source](https://ccjou.wordpress.com/2009/03/14/ab-%e5%92%8c-ba-%e6%9c%89%e4%bd%95%e9%97%9c%e4%bf%82/)).

2. $\sigma^2_{r+1} = \ldots = \sigma^2_{m} = 0$.

3. $\mathbf{A}^T \mathbf{u}_i = \mathbf{0}$ for $i = r+1, \ldots, m$.

4. We conclude that:
   - The left null space $N(\mathbf{A}^T)$ is spanned by $\{ \mathbf{u}_{r+1}, \ldots, \mathbf{u}_m \}$,
   - The column space $C(\mathbf{A})$ is spanned by $\{ \mathbf{u}_1, \ldots, \mathbf{u}_r \}$.
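
As a rough numerical illustration of these conclusions, the sketch below compares the eigenvalues of $\mathbf{A}^T \mathbf{A}$ and $\mathbf{A} \mathbf{A}^T$ and checks the left null space. The matrix `A` and the helper `sorted_eigh` are hypothetical names introduced only for this example.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

def sorted_eigh(S):
    """Eigen-decomposition of a symmetric matrix, eigenvalues in descending order."""
    w, Q = np.linalg.eigh(S)
    order = np.argsort(w)[::-1]
    return w[order], Q[:, order]

w_right, V = sorted_eigh(A.T @ A)  # n eigenvalues
w_left, U = sorted_eigh(A @ A.T)   # m eigenvalues

# The r nonzero eigenvalues of the two matrices agree.
shared_nonzero = np.allclose(w_left[:r], w_right[:r])

# The last m - r eigenvectors of A A^T are annihilated by A^T,
# so they form a basis of the left null space N(A^T).
left_null_basis = U[:, r:]
left_null_residual = np.abs(A.T @ left_null_basis).max()
```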

### **Connection Between the Bases**

From the eigen-decompositions of $\mathbf{A}^T \mathbf{A}$ and $\mathbf{A} \mathbf{A}^T$, we establish the following relationships:

$$
\mathbf{A} \left( \mathbf{A}^T \mathbf{A} \mathbf{v}_i \right) = \mathbf{A} \left( \sigma^2_i  \mathbf{v}_i \right), \quad i = 1, \ldots, n
$$

$$
\mathbf{A} \mathbf{A}^T \mathbf{u}_i = \sigma^2_i \mathbf{u}_i, \quad i = 1, \ldots, m
$$

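The first relationship can be regrouped as $(\mathbf{A} \mathbf{A}^T)(\mathbf{A} \mathbf{v}_i) = \sigma^2_i (\mathbf{A} \mathbf{v}_i)$. For $i \leq r$ the vector $\mathbf{A} \mathbf{v}_i$ is nonzero, so it is an eigenvector of $\mathbf{A} \mathbf{A}^T$ with eigenvalue $\sigma^2_i$, exactly as the second relationship requires of $\mathbf{u}_i$. Since $\Vert \mathbf{A} \mathbf{v}_i \Vert = \sigma_i$, normalizing gives a valid choice of unit eigenvector: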

$$
\mathbf{u}_i = \frac{\mathbf{A} \mathbf{v}_i}{\sigma_i}, \quad i = 1, \ldots, r
$$

Rearranging this, we obtain the key equation:

$$
\mathbf{A} \mathbf{v}_i = \sigma_i \mathbf{u}_i, \quad i = 1, \ldots, r
$$

Where:
- $\sigma_i$ are the singular values of $\mathbf{A}$ (the **square roots of the nonzero eigenvalues** of $\mathbf{A}^T \mathbf{A}$ and $\mathbf{A} \mathbf{A}^T$),
- $\mathbf{v}_i$ are the right singular vectors of $\mathbf{A}$ (the eigenvectors of $\mathbf{A}^T \mathbf{A}$),
- $\mathbf{u}_i$ are the left singular vectors of $\mathbf{A}$ (the eigenvectors of $\mathbf{A} \mathbf{A}^T$).
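
The construction $\mathbf{u}_i = \mathbf{A} \mathbf{v}_i / \sigma_i$ can be sketched in NumPy as follows. The matrix `A` is again a hypothetical rank-2 example, and the variable names (`U_r`, `gram`, `eig_residual`) are illustrative choices rather than established notation.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
r = np.linalg.matrix_rank(A)

# Right singular vectors: eigenvectors of A^T A, eigenvalues sorted descending.
w, V = np.linalg.eigh(A.T @ A)
order = np.argsort(w)[::-1]
w, V = w[order], V[:, order]

# Singular values: square roots of the r nonzero eigenvalues.
sigma = np.sqrt(w[:r])

# Left singular vectors: column i is A v_i / sigma_i (broadcast over columns).
U_r = (A @ V[:, :r]) / sigma

# The constructed u_i are orthonormal ...
gram = U_r.T @ U_r
# ... and are eigenvectors of A A^T with eigenvalues sigma_i^2.
eig_residual = A @ A.T @ U_r - U_r * sigma**2
```

Building $\mathbf{u}_i$ from $\mathbf{v}_i$ this way, rather than from a separate eigen-decomposition of $\mathbf{A} \mathbf{A}^T$, fixes the sign of each $\mathbf{u}_i$ consistently with the sign of $\mathbf{v}_i$.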

### **Matrix Factorization**

#### **Full Matrix Representation**

The Singular Value Decomposition (SVD) of $\mathbf{A}$ can be expressed as:

$$
\mathbf{A} = \mathbf{\hat{U}} \mathbf{\hat{\Sigma}} \mathbf{\hat{V}}^T
$$

Where:
- $\mathbf{\hat{V}} \in \mathbb{R}^{n \times n}$ contains the right singular vectors,
- $\mathbf{\hat{U}} \in \mathbb{R}^{m \times m}$ contains the left singular vectors,
- $\mathbf{\hat{\Sigma}} \in \mathbb{R}^{m \times n}$ is a rectangular diagonal matrix with $\sigma_1, \ldots, \sigma_r$ in its first $r$ diagonal entries and zeros everywhere else.
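
A minimal sketch of the full form using `np.linalg.svd` with `full_matrices=True`. NumPy returns the singular values as a 1-D array and $\mathbf{\hat{V}}^T$ rather than $\mathbf{\hat{V}}$, so the $m \times n$ matrix $\mathbf{\hat{\Sigma}}$ has to be assembled by hand. The matrix `A` is a hypothetical example.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
m, n = A.shape

# U_hat is m x m, Vt_hat is n x n (already transposed), s has min(m, n) entries.
U_hat, s, Vt_hat = np.linalg.svd(A, full_matrices=True)

# Assemble the m x n rectangular diagonal matrix of singular values.
Sigma_hat = np.zeros((m, n))
Sigma_hat[:len(s), :len(s)] = np.diag(s)

A_rebuilt = U_hat @ Sigma_hat @ Vt_hat
```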

#### **Reduced Matrix Representation**

For practical purposes, we often retain only the nonzero singular values and use the reduced form of the SVD:

$$
\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T
$$

Where:
- $\mathbf{V} \in \mathbb{R}^{n \times r}$ contains the first $r$ right singular vectors,
- $\mathbf{U} \in \mathbb{R}^{m \times r}$ contains the first $r$ left singular vectors,
- $\mathbf{\Sigma} \in \mathbb{R}^{r \times r}$ is a diagonal matrix of nonzero singular values.
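
The reduced form can be sketched as below. Note that `np.linalg.svd(..., full_matrices=False)` keeps $\min\{m, n\}$ singular values, which may still include zeros, so this example truncates to the numerical rank $r$ explicitly. The matrix `A` is a hypothetical example.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
r = np.linalg.matrix_rank(A)

# "Thin" SVD: min(m, n) columns, singular values in descending order.
U_thin, s, Vt_thin = np.linalg.svd(A, full_matrices=False)

# Keep only the r components with nonzero singular values.
U = U_thin[:, :r]        # m x r
Sigma = np.diag(s[:r])   # r x r
Vt = Vt_thin[:r, :]      # r x n

A_rebuilt = U @ Sigma @ Vt
```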

#### **Component Representation**

We can also represent $\mathbf{A}$ as a sum of rank-1 matrices:

$$
\mathbf{A} = \sum_{i = 1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^T
$$
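
The rank-1 expansion can be verified with a short sketch like the following, again on a hypothetical rank-2 matrix `A`; each term is built with `np.outer`.

```python
import numpy as np

# Hypothetical 3x4 example of rank 2: row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
r = int(np.linalg.matrix_rank(A))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# One rank-1 matrix sigma_i * u_i v_i^T per nonzero singular value.
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r)]

# Summing the r terms recovers A.
A_sum = np.sum(terms, axis=0)
```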

### **Appendix**

#### **Eigen-decomposition vs. Singular Value Decomposition**

Both eigen-decomposition and singular value decomposition aim to diagonalize the matrix $\mathbf{A}$, but they differ in the choice of basis vectors.

- **Eigen-decomposition** expresses $\mathbf{A}$ in terms of its eigenvectors and eigenvalues:

  $$
  \mathbf{A} = \mathbf{P} \mathbf{D} \mathbf{P}^{-1}
  $$

  where $\mathbf{P}$ is the matrix whose columns are the eigenvectors of $\mathbf{A}$, and $\mathbf{D}$ is the diagonal matrix containing the eigenvalues. This form exists only when $\mathbf{A}$ is square and diagonalizable.

- **Singular Value Decomposition (SVD)** decomposes $\mathbf{A}$ into three matrices:

  $$
  \mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T
  $$

  where:
  - $\mathbf{U}$ contains the left singular vectors of $\mathbf{A}$,
  - $\mathbf{\Sigma}$ is a diagonal matrix of singular values,
  - $\mathbf{V}$ contains the right singular vectors of $\mathbf{A}$.

In summary:
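- Eigen-decomposition uses a single basis, the eigenvectors of $\mathbf{A}$, which need not be orthogonal.
- SVD uses two orthonormal bases, the right singular vectors for the input space and the left singular vectors for the output space, and exists for every matrix.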

> Although eigen-decomposition and SVD coincide in special cases (e.g., symmetric positive semi-definite matrices), they are, in general, distinct concepts.
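
To make the distinction concrete, the sketch below uses a hypothetical symmetric matrix `S` with one negative eigenvalue. Its singular values are the absolute values of its eigenvalues, so the two decompositions differ even though `S` is symmetric.

```python
import numpy as np

# Hypothetical symmetric matrix with eigenvalues 3 and -1.
S = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# Eigenvalues of a symmetric matrix (ascending order, may be negative).
eigvals = np.linalg.eigvalsh(S)

# Singular values (descending order, always non-negative).
singular_values = np.linalg.svd(S, compute_uv=False)

# Singular values match the eigenvalue magnitudes, not the eigenvalues themselves.
abs_eigvals_desc = np.sort(np.abs(eigvals))[::-1]
```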
