In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from matplotlib.collections import PatchCollection
import seaborn as sns
plt.rcParams['text.usetex'] = True

# The Matrix Cookbook
[Original](https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf)
## Notation and Nomenclature
$\mathbf{A}\rightarrow\;$ Matrix

$\mathbf{A}_{ij}\rightarrow\;$ Matrix indexed for some purpose

$\mathbf{A}_{i}\rightarrow\;$ Matrix indexed for some purpose

$\mathbf{A}^{ij}\rightarrow\;$ Matrix indexed for some purpose

$\mathbf{A}^{n}\rightarrow\;$ Matrix indexed for some purpose or the $n^{th}$ power of a square matrix

$\mathbf{A}^{-1}\rightarrow\;$ The inverse matrix of matrix $\mathbf{A}$

$\mathbf{A}^{\dagger}\rightarrow\;$ The pseudo inverse matrix of the matrix $\mathbf{A}$

$\mathbf{A}^{1/2}\rightarrow\;$ The square root of a matrix (if unique), not elementwise

$(\mathbf{A})_{i,j}\rightarrow\;$ The $(i,j)^{th}$ entry of the matrix $\mathbf{A}$

$\mathbf{A}_{i,j}\rightarrow\;$ The $(i,j)^{th}$ entry of the matrix $\mathbf{A}$

$[\mathbf{A}]_{i,j}\rightarrow\;$ The $ij$-submatrix, i.e. $\mathbf{A}$ with $i^{th}$ row and $j^{th}$ column deleted

$\vec{a}\rightarrow\;$ Vector (column-vector)

$\vec{a}_{i}\rightarrow\;$ Vector indexed for some purpose

$\alpha_{i}\rightarrow\;$ The $i^{th}$ element of vector $\vec{a}$

$\alpha\rightarrow\;$ Scalar

---

$\Re z\rightarrow\;$ Real part of a scalar

$\Re \mathbf{z}\rightarrow\;$ Real part of a vector

$\Re \mathbf{Z}\rightarrow\;$ Real part of a matrix

$\Im z\rightarrow\;$ Imaginary part of a scalar

$\Im \mathbf{z}\rightarrow\;$ Imaginary part of a vector

$\Im \mathbf{Z}\rightarrow\;$ Imaginary part of a matrix

---

$det(\mathbf{A})\rightarrow\;$ Determinant of $\mathbf{A}$

$Tr(\mathbf{A})\rightarrow\;$ Trace of the matrix $\mathbf{A}$

$diag(\mathbf{A})\rightarrow\;$ Diagonal matrix of the matrix $\mathbf{A}$, i.e. $(diag(\mathbf{A}))_{ij} = \delta_{ij}\mathbf{A}_{ij}$

$eig(\mathbf{A})\rightarrow\;$ Eigenvalues of the matrix $\mathbf{A}$

$vec(\mathbf{A})\rightarrow\;$ The vector-version of the matrix $\mathbf{A}$

$sup\rightarrow\;$ Supremum of a set

$\Vert\mathbf{A}\Vert\rightarrow\;$ Matrix norm (subscript if any denotes what norm)

$\mathbf{A}^{T}\rightarrow\;$ Transposed matrix

$\mathbf{A}^{-T}\rightarrow\;$ The inverse of the transposed and vice versa, $\mathbf{A}^{-T} = (\mathbf{A}^{-1})^{T} = (\mathbf{A}^{T})^{-1}$

$\mathbf{A}^{\star}\rightarrow\;$ Complex conjugated matrix

$\mathbf{A}^{H}\rightarrow\;$ Transposed and complex conjugated matrix (Hermitian)

---

$\mathbf{A}\circ\mathbf{B}\rightarrow\;$ Hadamard (elementwise) product

$\mathbf{A}\otimes\mathbf{B}\rightarrow\;$ Kronecker product

---

$\mathbf{0}\rightarrow\;$ The null matrix. Zero in all entries.

$\mathbf{I}\rightarrow\;$ The identity matrix

$\mathbf{J}^{ij}\rightarrow\;$ The single entry matrix, 1 at (i,j) and zero elsewhere

$\mathbf{\Sigma}\rightarrow\;$ A positive definite matrix

$\mathbf{\Lambda}\rightarrow\;$ A diagonal matrix
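
As a rough orientation, several of the symbols above map onto NumPy calls as sketched below. The matrices `A_demo` and `B_demo` are arbitrary examples chosen here, column stacking is the convention assumed for $vec$, and indices are 0-based in the code while the notation is 1-based.

```python
import numpy as np

# Arbitrary example matrices (not taken from the original text)
A_demo = np.array([[4.0, 1.0], [2.0, 3.0]])
B_demo = np.array([[1.0, 2.0], [0.0, 1.0]])

A_inv = np.linalg.inv(A_demo)                  # inverse
A_pinv = np.linalg.pinv(A_demo)                # pseudo inverse (equals the inverse here)
A_cubed = np.linalg.matrix_power(A_demo, 3)    # n-th power with n = 3
diag_A = np.diag(np.diag(A_demo))              # diag(A): keep the diagonal, zero elsewhere
vec_A = A_demo.reshape(-1, order="F")          # vec(A): stack the columns (assumed convention)
hadamard = A_demo * B_demo                     # Hadamard (elementwise) product
kron = np.kron(A_demo, B_demo)                 # Kronecker product
A_H = A_demo.conj().T                          # transposed and complex conjugated

# ij-submatrix: delete one row and one column (row 0 and column 0 here)
sub_00 = np.delete(np.delete(A_demo, 0, axis=0), 0, axis=1)

# Single entry matrix: 1 at one position, zero elsewhere
J_01 = np.zeros((2, 2))
J_01[0, 1] = 1.0
```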

## 1 Basics
$$\begin{align} (\mathbf{A}\mathbf{B})^{-1} &= \mathbf{B}^{-1}\mathbf{A}^{-1} \\ (\mathbf{ABC\ldots})^{-1} &= \ldots\mathbf{C}^{-1}\mathbf{B}^{-1}\mathbf{A}^{-1} \\ (\mathbf{A}^{T})^{-1} &= (\mathbf{A}^{-1})^{T} \\ (\mathbf{A} + \mathbf{B})^{T} &= \mathbf{A}^{T} + \mathbf{B}^{T} \\ (\mathbf{AB})^{T} &= \mathbf{B}^{T}\mathbf{A}^{T} \\ (\mathbf{ABC\ldots})^{T} &= \ldots\mathbf{C}^{T}\mathbf{B}^{T}\mathbf{A}^{T} \\ (\mathbf{A}^{H})^{-1} &= (\mathbf{A}^{-1})^{H} \\ (\mathbf{A} + \mathbf{B})^{H} &= \mathbf{B}^{H} + \mathbf{A}^{H} \\ (\mathbf{AB})^{H} &= \mathbf{B}^{H}\mathbf{A}^{H} \\ (\mathbf{ABC\ldots})^{H} &= \ldots\mathbf{C}^{H}\mathbf{B}^{H}\mathbf{A}^{H} \\ \end{align}$$

### 1 Basics Proofs
- equation (1):$(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}$
Right-multiply both sides by $\mathbf{AB}$ $\Rightarrow (\mathbf{A}\mathbf{B})^{-1}\mathbf{AB} = \mathbf{B}^{-1}\mathbf{A}^{-1}\mathbf{AB}$
Now, $LHS = (\mathbf{A}\mathbf{B})^{-1}\mathbf{AB}=\mathbf{I} \text{ and } RHS = \mathbf{B}^{-1}\mathbf{A}^{-1}\mathbf{AB}=\mathbf{B}^{-1}\mathbf{I}\mathbf{B} =\mathbf{I} = LHS$. Because $\mathbf{AB}$ is invertible, the factor $\mathbf{AB}$ can be cancelled, so the two sides of equation (1) are equal.

---

- equation (2):$(\mathbf{ABC\ldots})^{-1} = \ldots\mathbf{C}^{-1}\mathbf{B}^{-1}\mathbf{A}^{-1}$
This is the generalized case of equation (1) and can be proved in the same way.
> "Generalized" here means extended from two factors to any number of factors.

---

- equation (3):$(\mathbf{A}^{T})^{-1} = (\mathbf{A}^{-1})^{T}$
$$\begin{aligned} RHS &= (\mathbf{A}^{-1})^{T} \\&= (\mathbf{A}^{-1})^{T}\mathbf{A}^T(\mathbf{A}^T)^{-1} \quad \because \mathbf{A}^T(\mathbf{A}^T)^{-1}=\mathbf{I} \\&=(\mathbf{AA^{-1}})^T(\mathbf{A^T})^{-1} \quad \because \mathbf{B^T}\mathbf{A^T}=(\mathbf{AB})^T \quad \text{See proof of equation (5) for this} \\&=\mathbf{I^T(A^T)^{-1}}\\&=\mathbf{(A^T)^{-1}}\\&=LHS\end{aligned}$$

---

- equation (4):$(\mathbf{A} + \mathbf{B})^{T} = \mathbf{A}^{T} + \mathbf{B}^{T}$
The $(i,j)^{th}$ element of $(\mathbf A^T+\mathbf B^T)$ is the sum of $(i,j)^{th}$ elements of $\mathbf A^T$ and $\mathbf B^T$, which are $(j,i)^{th}$ element of $\mathbf A$ and $\mathbf B$, respectively. Thus the $(i,j)^{th}$ element of $\mathbf A^T+\mathbf B^T$ is the $(j,i)^{th}$ element of the sum of $\mathbf A$ and $\mathbf B$, which is equal to the $(i,j)^{th}$ element of the transpose $(\mathbf {A+B})^T$.

---

- equation (5):$(\mathbf{AB})^{T} = \mathbf{B}^{T}\mathbf{A}^{T}$
$$((\mathbf{AB})^T)_{ki} = (\mathbf{AB})_{ik} = \sum_{j=1}^na_{ij}b_{jk}$$
$$(\mathbf{B}^T\mathbf{A}^T)_{ki} = \sum_{j=1}^n(\mathbf{B}^T)_{kj}(\mathbf{A}^T)_{ji} = \sum_{j=1}^nb_{jk}a_{ij} =\sum_{j=1}^na_{ij}b_{jk} = ((\mathbf{AB})^T)_{ki}$$
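
The index argument above only needs the inner dimensions to match, so it is not restricted to square matrices. A small check with rectangular shapes (the shapes $2 \times 3$ and $3 \times 4$, the seed, and the names `A_rect` and `B_rect` are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A_rect = rng.random((2, 3))   # 2 x 3
B_rect = rng.random((3, 4))   # 3 x 4

lhs = (A_rect @ B_rect).T     # transpose of the 2 x 4 product, so 4 x 2
rhs = B_rect.T @ A_rect.T     # (4 x 3) times (3 x 2), also 4 x 2

print(lhs.shape, np.allclose(lhs, rhs))
```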

---

- equation (6):$(\mathbf{ABC\ldots})^{T} = \ldots\mathbf{C}^{T}\mathbf{B}^{T}\mathbf{A}^{T}$
This is the generalized form of equation (5) above. To extend it to more than two matrices, use induction:
Suppose that for some $n$, we have
$$\begin{aligned}\mathbf{(A_1A_2\cdots A_n)^T} = \mathbf{A^T_n \cdots A^T_2A^T_1} \tag{proof 1-6-1}\end{aligned}$$
Note that we have already derived (5) for $n=2$.
Then, using the two matrix result and (proof 1-6-1), we have
$$\begin{aligned}\mathbf{(A_1A_2\cdots A_nA_{n+1})^T} &= \mathbf{((A_1A_2\cdots A_n)A_{n+1})^T}\\&=\mathbf{A_{n+1}^T(A_1A_2\cdots A_n)^T}\\&=\mathbf{A_{n+1}^TA_{n}^T\cdots A_{2}^TA_{1}^T}\end{aligned}$$
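
The induction can be mirrored numerically by folding a list of matrices with `functools.reduce`. This is only a sketch: the chain length of five, the $3 \times 3$ shape and the seed are assumptions made for the example.

```python
from functools import reduce

import numpy as np

rng = np.random.default_rng(1)
mats = [rng.random((3, 3)) for _ in range(5)]   # A_1, ..., A_5

# Left side: transpose of the full product A_1 A_2 ... A_5
lhs = reduce(np.matmul, mats).T

# Right side: product of the transposes in reverse order, A_5^T ... A_1^T
rhs = reduce(np.matmul, [M.T for M in reversed(mats)])

print(np.allclose(lhs, rhs))
```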

---

- equation (7):$(\mathbf{A}^{H})^{-1} = (\mathbf{A}^{-1})^{H}$
$$\begin{aligned}RHS &= \mathbf{(A^{-1})^H} \\&= \mathbf{(A^{-1})^H A^H (A^H)^{-1}} \quad \because \mathbf{A^H (A^H)^{-1} = I} \\&= \mathbf{(AA^{-1})^H (A^H)^{-1}} \quad \because \mathbf{B^H A^H = (AB)^H} \quad \text{See proof of equation (9) for this} \\&= \mathbf{I^H (A^H)^{-1}} \\&= \mathbf{(A^H)^{-1}} \\&= LHS\end{aligned}$$

---

- equation (8):$(\mathbf{A} + \mathbf{B})^{H} = \mathbf{B}^{H} + \mathbf{A}^{H}$

---

- equation (9):$(\mathbf{AB})^{H} = \mathbf{B}^{H}\mathbf{A}^{H}$
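One possible entrywise argument, in the same style as the proof of equation (5), writing $\overline{z}$ for the complex conjugate of a scalar $z$ (a notation assumed here, not used elsewhere in this notebook):
$$\begin{aligned}((\mathbf{AB})^H)_{ki} &= \overline{(\mathbf{AB})_{ik}} = \sum_{j=1}^n\overline{a_{ij}}\,\overline{b_{jk}} \\&= \sum_{j=1}^n(\mathbf{B}^H)_{kj}(\mathbf{A}^H)_{ji} \\&= (\mathbf{B}^H\mathbf{A}^H)_{ki}\end{aligned}$$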

---

- equation (10):$(\mathbf{ABC\ldots})^{H} = \ldots\mathbf{C}^{H}\mathbf{B}^{H}\mathbf{A}^{H}$


> LHS: left-hand side; RHS: right-hand side

### 1 Basics Validate

In [2]:
# Let's check the equations above numerically to see for ourselves.
# Caution: the inverse identities need square, invertible matrices;
# the transpose and Hermitian identities only need compatible shapes.
width = 3
height = 3
A = np.random.random([width, height])
B = np.random.random([width, height])
C = np.random.random([width, height])

# Complex matrices, so that the Hermitian identities (7)-(10) are tested non-trivially
Ac = A + 1j * np.random.random([width, height])
Bc = B + 1j * np.random.random([width, height])
Cc = C + 1j * np.random.random([width, height])

a = np.random.random([width, 1])

def validate(left, right, eqn):
    # compare within floating-point tolerance instead of testing exact equality
    print("Result of equation ({}): {}".format(eqn, np.allclose(left, right)))

def herm(M):
    # conjugate (Hermitian) transpose
    return np.conjugate(M).T

# equation 1:
lft_hand = np.linalg.inv(A.dot(B))
rgt_hand = np.dot(np.linalg.inv(B), np.linalg.inv(A))
validate(lft_hand, rgt_hand, 1)

# equation 2:
lft_hand = np.linalg.inv(A.dot(B).dot(C))
rgt_hand = np.dot(np.linalg.inv(C), np.linalg.inv(B)).dot(np.linalg.inv(A))
validate(lft_hand, rgt_hand, 2)

# equation 3:
lft_hand = np.linalg.inv(A.T)
rgt_hand = np.transpose(np.linalg.inv(A))
validate(lft_hand, rgt_hand, 3)

# equation 4:
lft_hand = (A+B).T
rgt_hand = A.T+B.T
validate(lft_hand, rgt_hand, 4)

# equation 5:
lft_hand = A.dot(B).T
rgt_hand = B.T.dot(A.T)
validate(lft_hand, rgt_hand, 5)

# equation 6:
lft_hand = A.dot(B).dot(C).T
rgt_hand = C.T.dot(B.T).dot(A.T)
validate(lft_hand, rgt_hand, 6)

# equation 7:
lft_hand = np.linalg.inv(herm(Ac))
rgt_hand = herm(np.linalg.inv(Ac))
validate(lft_hand, rgt_hand, 7)

# equation 8:
lft_hand = herm(Ac+Bc)
rgt_hand = herm(Bc) + herm(Ac)
validate(lft_hand, rgt_hand, 8)

# equation 9:
lft_hand = herm(Ac.dot(Bc))
rgt_hand = herm(Bc).dot(herm(Ac))
validate(lft_hand, rgt_hand, 9)

# equation 10:
lft_hand = herm(Ac.dot(Bc).dot(Cc))
rgt_hand = herm(Cc).dot(herm(Bc)).dot(herm(Ac))
validate(lft_hand, rgt_hand, 10)

Result of equation (1): True
Result of equation (2): True
Result of equation (3): True
Result of equation (4): True
Result of equation (5): True
Result of equation (6): True
Result of equation (7): True
Result of equation (8): True
Result of equation (9): True
Result of equation (10): True


### 1.1 Trace
$$\begin{align} Tr(\mathbf{A}) &= \sum_{i} \mathbf{A}_{ii} \\ Tr(\mathbf{A}) &= \sum_{i}\lambda_{i}, \;\;\lambda_{i} = eig(\mathbf{A}) \\ Tr(\mathbf{A}) &= Tr(\mathbf{A}^{T}) \\ Tr(\mathbf{AB}) &= Tr(\mathbf{BA}) \\ Tr(\mathbf{A}+\mathbf{B}) &= Tr(\mathbf{A}) + Tr(\mathbf{B}) \\ Tr(\mathbf{ABC}) &= Tr(\mathbf{BCA}) = Tr(\mathbf{CAB}) \\ \vec{a}^{T}\vec{a} &= Tr(\vec{a}\vec{a}^{T}) \end{align}$$

#### 1.1 Trace Proofs

- equation (11):$Tr(\mathbf{A}) = \sum_{i} \mathbf{A}_{ii}$
By definition of trace of the matrix, which is the sum of elements on the main diagonal of $\mathbf{A}$.

---

- equation (12):$Tr(\mathbf{A}) = \sum_{i}\lambda_{i}, \;\;\lambda_{i} = eig(\mathbf{A})$
By definition, the characteristic polynomial of an $n \times n$ matrix $\mathbf{A}$ is given by
$$p(t) = \operatorname{det}(\mathbf{A}-t\mathbf{I}) = (-1)^n \Big( t^n - (\operatorname{tr}\mathbf A)t^{n-1} + \cdots + (-1)^n \operatorname{det}\mathbf{A} \Big)$$
On the other hand, $p(t)=(-1)^n(t-\lambda_1)\cdots(t-\lambda_n)$, where the $\lambda_j$ are the eigenvalues of $\mathbf A$. So, comparing the coefficients of $t^{n-1}$, we have $\operatorname{tr}\mathbf A=\lambda_1+\cdots+\lambda_n$.
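
The coefficient comparison can be inspected numerically. The sketch below uses `np.poly`, which for a square array returns the coefficients of $\operatorname{det}(t\mathbf{I}-\mathbf{M})$, i.e. $p(t)$ up to the factor $(-1)^n$. The symmetric example matrix `M` is an arbitrary choice, not part of the original text.

```python
import numpy as np

# Arbitrary symmetric example, so all eigenvalues are real
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = M.shape[0]

# Coefficients of det(t*I - M), highest power first:
# [1, -tr(M), ..., (-1)^n * det(M)]
coeffs = np.poly(M)
eigvals = np.linalg.eigvals(M)

trace_from_poly = -coeffs[1]
det_from_poly = (-1) ** n * coeffs[-1]

print(np.isclose(trace_from_poly, np.trace(M)))
print(np.isclose(trace_from_poly, eigvals.sum()))
print(np.isclose(det_from_poly, np.linalg.det(M)))
```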

---

- equation (13):$Tr(\mathbf{A}) = Tr(\mathbf{A}^{T})$
Write the elements of $\mathbf{A}$ as $\mathbf{A}_{ij}$. After transposition, the element in position $(i,j)$ is $\mathbf{A}_{ji}$. For the diagonal elements $i=j$, so the diagonal is unchanged, and by the definition of the trace $\operatorname{Tr}(\mathbf A^T)=\sum_i \mathbf A_{ii}=\operatorname{Tr}(\mathbf A)$.

---

- equation (14):$Tr(\mathbf{AB}) = Tr(\mathbf{BA})$
Let $\mathbf{A}$ be an $n \times m$ matrix and $\mathbf{B}$ be an $m \times n$ matrix. Then
$$\begin{aligned}\operatorname{Tr}(\mathbf {AB}) &= \sum_{i=1}^n(\mathbf{AB})_{ii} \\&=\sum_{i=1}^n\sum_{j=1}^m \mathbf {A}_{ij} \mathbf {B}_{ji} \\&= \sum_{j=1}^m\sum_{i=1}^n \mathbf {B}_{ji} \mathbf {A}_{ij} \\&= \sum_{j=1}^m(\mathbf{BA})_{jj} \\&= \operatorname{Tr}(\mathbf{BA})\end{aligned}$$
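
Note that the proof allows $\mathbf{AB}$ and $\mathbf{BA}$ to have different sizes. A small illustration with $n=2$, $m=3$ (the entries and the names `A_nm`, `B_mn` are arbitrary choices made for this example):

```python
import numpy as np

A_nm = np.arange(6.0).reshape(2, 3)          # n x m = 2 x 3, entries 0..5
B_mn = np.arange(6.0, 12.0).reshape(3, 2)    # m x n = 3 x 2, entries 6..11

AB = A_nm @ B_mn   # 2 x 2
BA = B_mn @ A_nm   # 3 x 3

# Different shapes, same trace
print(AB.shape, BA.shape)
print(np.trace(AB), np.trace(BA))
```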

---

- equation (15):$Tr(\mathbf{A}+\mathbf{B}) = Tr(\mathbf{A}) + Tr(\mathbf{B})$
$$\begin{aligned}RHS &= \operatorname{Tr}(\mathbf A) + \operatorname{Tr}(\mathbf B) \\&=\sum_{k=1}^na_{kk} + \sum_{k=1}^nb_{kk} \\&=\sum_{k=1}^n(a_{kk} + b_{kk}) \\&= \operatorname{Tr}(\mathbf{A}+\mathbf{B}) \\&=LHS \end{aligned}$$

---

- equation (16):$Tr(\mathbf{ABC}) = Tr(\mathbf{BCA}) = Tr(\mathbf{CAB})$
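This is a more general form of equation (14). Applying (14) to the two factors $\mathbf{A}$ and $\mathbf{BC}$ gives $Tr(\mathbf{A}(\mathbf{BC})) = Tr((\mathbf{BC})\mathbf{A})$, and applying it to $\mathbf{AB}$ and $\mathbf{C}$ gives $Tr((\mathbf{AB})\mathbf{C}) = Tr(\mathbf{C}(\mathbf{AB}))$.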
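
The identity covers cyclic permutations only; swapping two neighbouring factors generally changes the trace. A minimal sketch with small hand-picked integer matrices (`A3`, `B3`, `C3` are example values assumed here):

```python
import numpy as np

A3 = np.array([[1, 2], [3, 4]])
B3 = np.array([[0, 1], [1, 0]])
C3 = np.array([[1, 0], [0, 2]])

tr_abc = np.trace(A3 @ B3 @ C3)   # original order
tr_bca = np.trace(B3 @ C3 @ A3)   # cyclic shift
tr_cab = np.trace(C3 @ A3 @ B3)   # cyclic shift
tr_acb = np.trace(A3 @ C3 @ B3)   # NOT a cyclic shift

print(tr_abc, tr_bca, tr_cab, tr_acb)
```

For these particular matrices the three cyclic orderings agree while the ordering $\mathbf{ACB}$ gives a different value.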

---

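- equation (17):$\vec{a}^{T}\vec{a} = Tr(\vec{a}\vec{a}^{T})$
Apply equation (14) to the $1 \times n$ matrix $\vec{a}^{T}$ and the $n \times 1$ matrix $\vec{a}$: $Tr(\vec{a}\vec{a}^{T}) = Tr(\vec{a}^{T}\vec{a}) = \vec{a}^{T}\vec{a}$, since $\vec{a}^{T}\vec{a}$ is a $1 \times 1$ matrix, i.e. a scalar.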

#### 1.1 Trace Validate

In [3]:
# equation 11:
lft_hand = np.trace(A)
rgt_hand = np.sum(np.diag(A))
validate(lft_hand, rgt_hand, 11)

# equation 12:
lft_hand = np.trace(A)
rgt_hand = np.sum(np.linalg.eigvals(A))
validate(lft_hand, rgt_hand, 12)

# equation 13:
lft_hand = np.trace(A)
rgt_hand = np.trace(A.T)
validate(lft_hand, rgt_hand, 13)

# equation 14:
lft_hand = np.trace(A.dot(B))
rgt_hand = np.trace(B.dot(A))
validate(lft_hand, rgt_hand, 14)

# equation 15:
lft_hand = np.trace(A+B)
rgt_hand = np.trace(A) + np.trace(B)
validate(lft_hand, rgt_hand, 15)

# equation 16:
lft_hand = np.trace(np.dot(A.dot(B), C))
rgt_hand = np.trace(np.dot(B.dot(C), A))
validate(lft_hand, rgt_hand, 16)

# equation 17:
lft_hand = np.dot(a.T, a)
rgt_hand = np.trace(np.dot(a, a.T))
validate(lft_hand, rgt_hand, 17)

Result of equation (11): True
Result of equation (12): True
Result of equation (13): True
Result of equation (14): True
Result of equation (15): True
Result of equation (16): True
Result of equation (17): True


### 1.2 Determinant
Let $\mathbf{A}$ be an $n \times n$ matrix.
$$\begin{align} \operatorname{det}(\mathbf{A}) &= \prod_i{\lambda_i} \quad \lambda_i=\operatorname{eig}(\mathbf{A})\\ \operatorname{det}(c\mathbf{A}) &=c^n \operatorname{det}(\mathbf{A}), \quad \text{if } \mathbf{A} \in \mathbb{R}^{n \times n} \\ \operatorname{det}(\mathbf{A}^T) &=\operatorname{det}(\mathbf{A}) \\ \operatorname{det}(\mathbf{AB}) &=\operatorname{det}(\mathbf{A})\operatorname{det}(\mathbf{B}) \\ \operatorname{det}(\mathbf{A}^{-1}) &= 1/\operatorname{det}(\mathbf{A}) \\ \operatorname{det}(\mathbf{A}^{n}) &= \operatorname{det}(\mathbf{A})^n \\ \operatorname{det}(\mathbf{I+uv^T}) &= 1+\mathbf{u^Tv} \\ \end{align}$$
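
Equations (19) to (24) can be spot-checked numerically in the same spirit as the earlier validation cells. This is only a sketch under assumptions: the matrices `P` and `Q`, the vectors `u` and `v`, the scalar `c = 2.5`, the seed, and the exponent 4 used for equation (23) are all arbitrary example choices, and the diagonal shift is there only to keep the matrices safely invertible.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
P = rng.random((n, n)) + n * np.eye(n)   # diagonally dominant, hence invertible
Q = rng.random((n, n)) + n * np.eye(n)
u = rng.random((n, 1))
v = rng.random((n, 1))
c = 2.5
det = np.linalg.det

# Keys are the equation numbers used in this notebook
checks = {
    19: np.isclose(det(c * P), c ** n * det(P)),
    20: np.isclose(det(P.T), det(P)),
    21: np.isclose(det(P @ Q), det(P) * det(Q)),
    22: np.isclose(det(np.linalg.inv(P)), 1.0 / det(P)),
    23: np.isclose(det(np.linalg.matrix_power(P, 4)), det(P) ** 4),
    24: np.isclose(det(np.eye(n) + u @ v.T), 1.0 + (u.T @ v).item()),
}

for eqn, ok in checks.items():
    print("equation ({}): {}".format(eqn, ok))
```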

For $n=2$:

$$\begin{align} \operatorname{det}(\mathbf{I+A}) = 1+\operatorname{det}(\mathbf{A})+\operatorname{Tr}(\mathbf{A}) \end{align}$$

For $n=3$:

$$\begin{align} \operatorname{det}(\mathbf{I+A}) = 1+\operatorname{det}(\mathbf{A})+\operatorname{Tr}(\mathbf{A})+\frac{1}{2}\operatorname{Tr}(\mathbf{A})^2-\frac{1}{2}\operatorname{Tr}(\mathbf{A}^2) \end{align}$$

For $n=4$:

$$\begin{equation}\begin{aligned} \operatorname{det}(\mathbf{I+A}) =\; &1+\operatorname{det}(\mathbf{A})+\operatorname{Tr}(\mathbf{A})\\ &+\frac{1}{2}\operatorname{Tr}(\mathbf{A})^2-\frac{1}{2}\operatorname{Tr}(\mathbf{A}^2)\\ &+\frac{1}{6}\operatorname{Tr}(\mathbf{A})^3-\frac{1}{2}\operatorname{Tr}(\mathbf{A})\operatorname{Tr}(\mathbf{A}^2)+\frac{1}{3}\operatorname{Tr}(\mathbf{A}^3) \\ \end{aligned}\end{equation}$$
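
The three expansions share their leading terms, so they can be evaluated by one helper. The function name `det_identity_plus` is hypothetical and introduced only for this sketch; the random test matrices and the seed are likewise assumptions.

```python
import numpy as np

def det_identity_plus(A):
    """Evaluate the right-hand side of the det(I + A) formulas for n = 2, 3, 4."""
    n = A.shape[0]
    if n not in (2, 3, 4):
        raise ValueError("formulas are only stated for n = 2, 3, 4")
    t1 = np.trace(A)
    t2 = np.trace(A @ A)
    t3 = np.trace(A @ A @ A)
    total = 1.0 + np.linalg.det(A) + t1
    if n >= 3:
        # second-order terms, present for n = 3 and n = 4
        total += 0.5 * t1 ** 2 - 0.5 * t2
    if n == 4:
        # third-order terms, present for n = 4 only
        total += t1 ** 3 / 6.0 - 0.5 * t1 * t2 + t3 / 3.0
    return total

rng = np.random.default_rng(3)
results = {}
for n in (2, 3, 4):
    A_n = rng.random((n, n))
    direct = np.linalg.det(np.eye(n) + A_n)
    results[n] = bool(np.isclose(direct, det_identity_plus(A_n)))

print(results)
```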

For small $\varepsilon$, the following approximation holds

$$\begin{align} \operatorname{det}(\mathbf{I}+\varepsilon\mathbf{A}) \cong 1+\varepsilon\operatorname{Tr}(\mathbf{A})+\frac{1}{2}\varepsilon^2\operatorname{Tr}(\mathbf{A})^2-\frac{1}{2}\varepsilon^2\operatorname{Tr}(\mathbf{A}^2) \end{align}$$


#### 1.2 Determinant Validate

In [6]:
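# equation 18:
lft_hand = np.linalg.det(A)
rgt_hand = np.prod(np.linalg.eigvals(A))
validate(lft_hand, rgt_hand, 18)

# equation 19 (c = 2 is an arbitrary scalar; n = width):
lft_hand = np.linalg.det(2 * A)
rgt_hand = 2 ** width * np.linalg.det(A)
validate(lft_hand, rgt_hand, 19)

Result of equation (18): True
Result of equation (19): True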


### 1.3 The Special Case 2x2
