# Singular value decomposition

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

$\newcommand{\trans}{^\top}
\newcommand{\adj}{^{\rm adj}}
\newcommand{\cof}{^{\rm cof}}
\newcommand{\inp}[2]{\left\langle#1,#2\right\rangle}
\newcommand{\dunion}{\mathbin{\dot\cup}}
\newcommand{\bzero}{\mathbf{0}}
\newcommand{\bone}{\mathbf{1}}
\newcommand{\ba}{\mathbf{a}}
\newcommand{\bb}{\mathbf{b}}
\newcommand{\bc}{\mathbf{c}}
\newcommand{\bd}{\mathbf{d}}
\newcommand{\be}{\mathbf{e}}
\newcommand{\bh}{\mathbf{h}}
\newcommand{\bp}{\mathbf{p}}
\newcommand{\bq}{\mathbf{q}}
\newcommand{\br}{\mathbf{r}}
\newcommand{\bx}{\mathbf{x}}
\newcommand{\by}{\mathbf{y}}
\newcommand{\bz}{\mathbf{z}}
\newcommand{\bu}{\mathbf{u}}
\newcommand{\bv}{\mathbf{v}}
\newcommand{\bw}{\mathbf{w}}
\newcommand{\tr}{\operatorname{tr}}
\newcommand{\nul}{\operatorname{null}}
\newcommand{\rank}{\operatorname{rank}}
%\newcommand{\ker}{\operatorname{ker}}
\newcommand{\range}{\operatorname{range}}
\newcommand{\Col}{\operatorname{Col}}
\newcommand{\Row}{\operatorname{Row}}
\newcommand{\spec}{\operatorname{spec}}
\newcommand{\vspan}{\operatorname{span}}
\newcommand{\Vol}{\operatorname{Vol}}
\newcommand{\sgn}{\operatorname{sgn}}
\newcommand{\idmap}{\operatorname{id}}
\newcommand{\am}{\operatorname{am}}
\newcommand{\gm}{\operatorname{gm}}$

In [None]:
# random_int_list is used by the code cell in Exercise 1
from lingeo import random_int_list

## Main idea

This section continues the introduction to the singular value decomposition given in 314 and provides its theoretical foundation.

##### Singular value decomposition

Let $A$ be an $m\times n$ matrix.  
Then there are orthonormal bases $\alpha$ and $\beta$ of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, such that 
$$[f_A]_\alpha^\beta = \Sigma = \begin{bmatrix} 
 \operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\
 O_{m-r,r} & O_{m-r,n-r} 
\end{bmatrix},$$
where $\sigma_1\geq\cdots\geq\sigma_r > 0$, $r = \rank(A)$, and $\operatorname{diag}(\sigma_1,\ldots,\sigma_r)$ is the diagonal matrix with the given diagonal entries.  

That is, there are $m\times m$ and $n\times n$ orthogonal matrices $U$ and $V$ such that $U^\top AV = \Sigma$ and $A = U\Sigma V^\top$.  
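
The matrix form of the statement can be checked numerically. The following is a minimal sketch using NumPy rather than Sage; the $2\times 3$ matrix `A` is a hypothetical example chosen for illustration, and `numpy.linalg.svd` is used as a black box instead of the construction described in this section.

```python
import numpy as np

# A hypothetical 2 x 3 example matrix of rank 2.
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])
m, n = A.shape

# full_matrices=True returns an m x m matrix U and an n x n matrix V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

# Build the m x n matrix Sigma with the singular values on its diagonal.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print("singular values:", np.round(s, 6))
print("U^T A V == Sigma:", np.allclose(U.T @ A @ V, Sigma))
print("A == U Sigma V^T:", np.allclose(A, U @ Sigma @ Vt))
```

Here `U` and `V` are orthogonal up to floating-point error, so both checks should print `True`.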

Recall that $AB$ and $BA$ have the same set of nonzero eigenvalues.  
(See 506-6.)  
The values $\sigma_1 \geq \cdots \geq \sigma_r$ are called the **singular values** of $A$.  

- Their squares $\sigma_1^2, \ldots, \sigma_r^2$ are the nonzero eigenvalues of $A\trans A$.  
- Their squares are also the nonzero eigenvalues of $AA\trans$.  
- They are positive.  
- There are $r = \rank(A)$ of them.  

Indeed, the columns of $V$ form an orthonormal eigenbasis of $A\trans A$,  
while the columns of $U$ form an orthonormal eigenbasis of $AA\trans$.  
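
The relation between the singular values and the eigenvalues of $A\trans A$ and $AA\trans$ can be illustrated numerically: the squares of the singular values should match the nonzero eigenvalues of both products, and their number should equal $\rank(A)$. This is a sketch in NumPy with a hypothetical example matrix; the tolerance `tol` used to decide which eigenvalues count as zero is an assumption.

```python
import numpy as np

# A hypothetical 2 x 3 example matrix.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# Singular values only, in non-increasing order.
s = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of the symmetric matrices A^T A (3 x 3) and A A^T (2 x 2).
eig_AtA = np.linalg.eigvalsh(A.T @ A)
eig_AAt = np.linalg.eigvalsh(A @ A.T)

# Keep the eigenvalues that are nonzero up to a tolerance, largest first.
tol = 1e-10
nonzero_AtA = np.sort(eig_AtA[eig_AtA > tol])[::-1]
nonzero_AAt = np.sort(eig_AAt[eig_AAt > tol])[::-1]

r = np.linalg.matrix_rank(A)
print("squares of singular values:", np.round(s**2, 6))
print("nonzero eigenvalues of A^T A:", np.round(nonzero_AtA, 6))
print("nonzero eigenvalues of A A^T:", np.round(nonzero_AAt, 6))
print("rank:", r)
```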

The singular value decomposition of an $m\times n$ matrix $A$ can be found by the following steps:  

1. Compute an orthonormal eigenbasis $\alpha$ of $A\trans A$.  
2. Order the eigenvectors in $\alpha$ by their corresponding eigenvalues in non-increasing order.  
Let $\alpha_1$ and $\alpha_0$ be the sets of eigenvectors in $\alpha$ that correspond to positive and zero eigenvalues, respectively.  
3. Let $\beta_1 = \{\frac{1}{\sigma}A\bv: \bv \in \alpha_1\}$, where $\sigma^2$ is the eigenvalue of $A\trans A$ corresponding to $\bv$, so that each vector in $\beta_1$ has length $1$.  
Let $\beta_0$ be an orthonormal basis of $\ker(AA\trans) = \ker(A\trans)$.  
Let $\beta = \beta_1 \cup \beta_0$, with the vectors of $\beta_1$ listed first.  

Thus, the desired orthonormal bases $\alpha$ and $\beta$ are found.  

The matrices are then constructed as follows.  

- Construct $V$ by using the vectors in $\alpha$ as its columns.  
- Construct $U$ by using the vectors in $\beta$ as its columns.  
- The singular values are the square roots of the nonzero eigenvalues of $A\trans A$ (or $AA\trans$).
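
The construction can be sketched in NumPy as follows. The function name `svd_by_eigenbasis` and the tolerance `tol` are hypothetical choices made for this illustration. Each vector $A\bv$ is divided by its length $\sigma$ so that the columns of $U$ are unit vectors, and $\beta_0$ is obtained from a complete QR factorization, which is one of several ways to find an orthonormal basis of the orthogonal complement of $\vspan(\beta_1)$. The sketch assumes $A$ is not the zero matrix.

```python
import numpy as np

def svd_by_eigenbasis(A, tol=1e-10):
    """Return (U, Sigma, V) with A = U Sigma V^T, built from an eigenbasis of A^T A."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape

    # Step 1: orthonormal eigenbasis of the symmetric matrix A^T A.
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)

    # Step 2: sort eigenpairs so the eigenvalues are non-increasing.
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], eigvecs[:, order]
    r = int(np.sum(eigvals > tol))      # number of positive eigenvalues
    sigma = np.sqrt(eigvals[:r])        # singular values

    # Step 3: beta_1 consists of the vectors A v / sigma for v in alpha_1.
    U1 = (A @ V[:, :r]) / sigma

    # beta_0: the last m - r columns of a complete QR factorization of U1
    # form an orthonormal basis of the orthogonal complement of span(beta_1).
    Q, _ = np.linalg.qr(U1, mode="complete")
    U = np.hstack([U1, Q[:, r:]])

    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma)
    return U, Sigma, V

# Usage on a hypothetical 3 x 2 matrix of rank 1.
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
U, Sigma, V = svd_by_eigenbasis(A)
print(np.allclose(U @ Sigma @ V.T, A))
```

In floating-point arithmetic the eigenvalues of $A\trans A$ that should be zero are only approximately zero, which is why a tolerance is needed to separate $\alpha_1$ from $\alpha_0$.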

## Side stories

- $AB$ and $BA$ have the same nonzero eigenvalues
- image compression
- Moore–Penrose pseudo inverse

## Experiments

##### Exercise 1

Run the following code.  

In [None]:
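### code
# random_int_list comes from the lingeo helper module referenced at the top of this notebook.
# QR is not a Sage builtin; it is assumed to come from the same helper library.
set_random_seed(0)
print_ans = False  # set to True to print the answers

# Pick a random invertible integer matrix L and two distinct eigenvalues.
while True:
    L = matrix(3, random_int_list(9, 2))
    eigs = random_int_list(2)
    if L.is_invertible() and eigs[0] != eigs[1]:
        break
    
# Orthogonalize the columns of L, then normalize each column to length 1.
Q,R = QR(L)
for j in range(3):
    v = Q[:,j]
    length = sqrt((v.transpose() * v)[0,0])
    Q[:,j] = v / length

# Repeat the last eigenvalue so that it has multiplicity 2.
eigs.append(eigs[-1])
D = diagonal_matrix(eigs)
# A is symmetric, and the columns of Q are orthonormal eigenvectors of A.
A = Q * D * Q.transpose()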

pretty_print(LatexExpr("A ="), A)
pretty_print(LatexExpr("A = Q D Q^{-1} ="), Q, D, Q.transpose())

if print_ans:
    print("eigenvalues of A:", eigs)
    print("eigenvectors of A = columns of Q")
    pretty_print(LatexExpr("A ="), 
                 eigs[0], Q[:,0]*Q[:,0].transpose(), 
                 LatexExpr("+"), 
                 eigs[1], Q[:,1:]*Q[:,1:].transpose())

##### Exercise 1(a)

Find all eigenvalues of $A$ and their corresponding eigenvectors.

##### Exercise 1(b)

Find the spectral decomposition of $A$.

## Exercises

##### Exercise 2

Find the spectral decomposition of each of the following matrices.

##### Exercise 2(a)

$$
    A = \begin{bmatrix}
     0 & 1 \\
     1 & 0
    \end{bmatrix}.
$$

##### Exercise 2(b)

$$
    A = \begin{bmatrix}
     1 & 1 & 1 \\
     1 & 1 & 1 \\
     1 & 1 & 1 \\
    \end{bmatrix}.
$$

##### Exercise 2(c)

$$
    A = \begin{bmatrix}
     2 & -1 & -1 \\
     -1 & 2 & -1 \\
     -1 & -1 & 2
    \end{bmatrix}.
$$

##### Exercise 3

Let $\bu$ be a real vector of length $1$.  
Let $P = \bu\bu\trans$.

##### Exercise 3(a)

Explain why $P$ is the projection matrix of the orthogonal projection onto $\vspan\{\bu\}$.

##### Exercise 3(b)

Show that $\tr(P\trans P) = \rank(P) = 1$.

##### Exercise 4

Let $\{\bu_1, \ldots, \bu_d\}$ be a set of mutually orthogonal real vectors, each of length $1$.  
Let $P = \bu_1\bu_1\trans + \cdots + \bu_d\bu_d\trans$.

##### Exercise 4(a)

Explain why $P$ is the projection matrix of the orthogonal projection onto $\vspan\{\bu_1,\ldots, \bu_d\}$.

##### Exercise 4(b)

Show that $\tr(P\trans P) = \rank(P) = d$.
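
Before writing a proof, the claims in Exercises 3 and 4 can be explored numerically. The sketch below uses NumPy with hypothetical orthonormal vectors `u1` and `u2` in $\mathbb{R}^3$ (so $d = 2$) and a hypothetical test vector `x`; it is a sanity check on one example, not a proof.

```python
import numpy as np

# Two hypothetical orthonormal vectors in R^3, so d = 2.
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])
d = 2

# P = u1 u1^T + u2 u2^T, written as a sum of outer products.
P = np.outer(u1, u1) + np.outer(u2, u2)

# Project a test vector and look at the residual x - Px.
x = np.array([3.0, 1.0, 4.0])
proj = P @ x
residual = x - proj

print("P x =", proj)
print("residual is orthogonal to u1, u2:",
      np.isclose(residual @ u1, 0.0), np.isclose(residual @ u2, 0.0))
print("tr(P^T P) =", np.trace(P.T @ P))
print("rank(P)  =", np.linalg.matrix_rank(P))
```

The residual being orthogonal to every $\bu_i$ is what makes $P\bx$ the orthogonal projection of $\bx$ onto the span.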

##### Exercise 5

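An **orthogonal projection matrix** is a matrix that can be diagonalized by an orthogonal matrix and whose eigenvalues are all $1$ or $0$.  

Let $P$ be a real square matrix.  
Show that the following statements are equivalent:  

1. $P$ is an orthogonal projection matrix.
2. $P$ is symmetric and $P^2 = P$.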

##### Exercise 6

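Although the conditions in the spectral decomposition do not explicitly say that each $P_i$ is an orthogonal projection matrix,  
follow the steps below to show that the conditions

1. $A = \sum_{j=1}^q \mu_j P_j$,  
2. $P_i^2 = P_i$ for any $i$, 
3. $P_iP_j = O$ for any distinct $i$ and $j$, and 
4. $\sum_{j=1}^q P_j = I_n$

are enough to guarantee that every $P_i$ is an orthogonal projection matrix.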

##### Exercise 6(a)

Verify that  

$$
    \begin{aligned}
    I &= P_1 + \cdots + P_q, \\
    A &= \mu_1 P_1 + \cdots + \mu_q P_q, \\
    A^2 &= \mu_1^2 P_1 + \cdots + \mu_q^2 P_q, \\
    ~ & \vdots \\
    A^{q-1} &= \mu_1^{q-1} P_1 + \cdots + \mu_q^{q-1} P_q.
    \end{aligned}
$$

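Then use Lagrange polynomials to explain why, for each $i = 1,\ldots, q$,  
there are coefficients $c_0,\ldots,c_{q-1}$ such that $P_i = c_0 I + c_1 A + \cdots + c_{q-1} A^{q-1}$.  
Consequently, every $P_i$ is a symmetric matrix, being a polynomial in the symmetric matrix $A$.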
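
The idea behind the Lagrange polynomial step can be illustrated numerically. Assuming the $\mu_j$ are distinct, evaluating the $i$-th Lagrange basis polynomial $\prod_{j\neq i}\frac{x - \mu_j}{\mu_i - \mu_j}$ at $A$ should recover $P_i$. The function name `lagrange_projection` and the $2\times 2$ matrix below are hypothetical choices for this sketch.

```python
import numpy as np

def lagrange_projection(A, mus, i):
    """Evaluate the i-th Lagrange basis polynomial for the nodes mus at the matrix A."""
    n = A.shape[0]
    P = np.eye(n)
    for j, mu in enumerate(mus):
        if j != i:
            # Multiply by the factor (A - mu_j I) / (mu_i - mu_j).
            P = P @ (A - mu * np.eye(n)) / (mus[i] - mu)
    return P

# A hypothetical symmetric matrix with distinct eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
mus = [1.0, 3.0]
P1 = lagrange_projection(A, mus, 0)
P2 = lagrange_projection(A, mus, 1)

print(P1)
print(P2)
print("P1 + P2 == I:", np.allclose(P1 + P2, np.eye(2)))
print("mu1 P1 + mu2 P2 == A:", np.allclose(mus[0] * P1 + mus[1] * P2, A))
```

Each `P` returned this way is a polynomial in `A` of degree at most $q - 1$, which matches the form $c_0 I + c_1 A + \cdots + c_{q-1}A^{q-1}$ in the exercise.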

##### Exercise 6(b)

Explain why every $P_i$ is an orthogonal projection matrix.

##### Exercise 7

Follow the steps below to prove the following theorem.

##### Eigenvector-eigenvalue identity
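Let $A$ be an $n\times n$ real symmetric matrix.  
Suppose its eigenvalues are $\lambda_1,\ldots,\lambda_n$ and some $\lambda_i$ appears exactly once.  
Let $\bv_1,\ldots, \bv_n$ be the corresponding eigenvectors, chosen so that they form an orthonormal basis. Then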

$$
    (A - \lambda_i I)\adj = \left(\prod_{j\neq i}(\lambda_j - \lambda_i)\right)\bv_i\bv_i\trans.
$$
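
The identity can be checked numerically on a small example. The sketch below uses NumPy; the helper `adjugate` (hypothetical name) computes the adjugate directly from cofactors, since $A - \lambda_i I$ is singular and the shortcut $\det(M)M^{-1}$ is not available. The tridiagonal matrix is a hypothetical example whose eigenvalues $2-\sqrt{2}$, $2$, $2+\sqrt{2}$ are distinct.

```python
import numpy as np

def adjugate(M):
    """Adjugate of a square matrix, computed as the transpose of the cofactor matrix."""
    n = M.shape[0]
    C = np.zeros((n, n))
    for r in range(n):
        for c in range(n):
            # Delete row r and column c, then take the signed determinant.
            minor = np.delete(np.delete(M, r, axis=0), c, axis=1)
            C[r, c] = (-1) ** (r + c) * np.linalg.det(minor)
    return C.T

# A hypothetical real symmetric matrix with three distinct eigenvalues.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
n = A.shape[0]
lam, V = np.linalg.eigh(A)   # columns of V are orthonormal eigenvectors

for i in range(n):
    lhs = adjugate(A - lam[i] * np.eye(n))
    coef = np.prod([lam[j] - lam[i] for j in range(n) if j != i])
    rhs = coef * np.outer(V[:, i], V[:, i])
    print(i, np.allclose(lhs, rhs))
```

The sign ambiguity of each computed eigenvector does not matter here, because $\bv_i\bv_i\trans$ is unchanged when $\bv_i$ is replaced by $-\bv_i$.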

##### Exercise 7(a)

Show that when $x$ is not an eigenvalue of $A$,  

$$
    \begin{aligned}
    (A - xI)\adj &= \det(A - xI) \times \sum_{j = 1}^n (\lambda_j - x)^{-1}\bv_j\bv_j\trans \\
    &= \sum_{j = 1}^n p_j(x) \bv_j\bv_j\trans, 
    \end{aligned}
$$

where 

$$
    p_j(x) = \prod_{k \neq j}(\lambda_k - x).
$$

##### Exercise 7(b)

Let $x$ approach $\lambda_i$ and prove the eigenvector-eigenvalue identity.