## Principal Component Analysis

In PCA, a set of eigenfaces is computed from the eigenvectors of the sample covariance matrix $Cov$:
$$
Cov = \sum_{i=1}^{M} (x_i - m)(x_i - m)^T
$$
where $m$ is the mean face of the sample set of $M$ training faces $x_i$. The eigenspace $U = [u_1, u_2, \ldots, u_K]$ is spanned by the $K$ eigenfaces with the largest eigenvalues. For recognition, the prototype $P$ of each face class and the probe $Q$ are projected onto $U$ to obtain the weight vectors $w_p = U^T (P - m)$ and $w_q = U^T (Q - m)$. The probe is assigned to the face class that minimizes the distance $\mathcal{E} = \lVert w_q - w_p \rVert = \lVert U^T (Q - P) \rVert$.
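
The projection and nearest-prototype rule described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `fit_eigenfaces` and `classify` are hypothetical, the "faces" are synthetic random vectors, and each class is assumed to be represented by a single prototype.

```python
import numpy as np

def fit_eigenfaces(X, K):
    """X holds one flattened face per row, shape (M, D).
    Returns the mean face m (D,) and the eigenspace U (D, K)."""
    m = X.mean(axis=0)
    A = X - m                                # centred faces
    cov = A.T @ A                            # unnormalised sum_i (x_i - m)(x_i - m)^T
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:K]    # indices of the K largest eigenvalues
    return m, eigvecs[:, order]

def classify(Q, prototypes, m, U):
    """Return the index of the prototype closest to probe Q in the eigenspace."""
    w_q = U.T @ (Q - m)                      # weight vector of the probe
    w_p = (prototypes - m) @ U               # one weight vector per prototype, shape (C, K)
    dists = np.linalg.norm(w_p - w_q, axis=1)
    return int(np.argmin(dists)), dists

# Synthetic stand-in data (assumption): 20 "faces" with 16 pixels each.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 16))
m, U = fit_eigenfaces(X, K=5)

prototypes = X[:4]                                   # pretend the first 4 faces are class prototypes
probe = prototypes[2] + 0.01 * rng.normal(size=16)   # slightly perturbed copy of prototype 2
label, dists = classify(probe, prototypes, m, U)
print(label)
```

Because the mean face cancels in $w_q - w_p$, the same distances are obtained from `(probe - prototypes) @ U`, which matches the $U^T (Q - P)$ form in the text.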

Eigenfaces are computed from the ensemble covariance matrix $Cov$. The definition $Cov = \sum_{i=1}^{M} (x_i - m)(x_i - m)^T$ shows that $Cov$ is built from all the training face images after the mean face has been subtracted. We have shown that $Cov$ can also be formulated as:
$$
\begin{aligned}
Cov &= \sum_{i=1}^{M} (x_i - m)(x_i - m)^T \\
&= \sum_{i=1}^{M} \Big(x_i - \frac{1}{M} \sum_{j=1}^{M} x_j\Big)\Big(x_i - \frac{1}{M} \sum_{k=1}^{M} x_k\Big)^T \\
&= \frac{1}{M^2} \sum_{i=1}^{M} \Big[\sum_{j=1}^{M} \sum_{k=1}^{M}(x_i - x_j)(x_i - x_k)^T\Big]
\end{aligned}
$$
<br>Rewrite Cov using different subscripts (exchange i and j):
$$
Cov = \frac{1}{M^2} \sum_{j=1}^{M} [\sum_{i=1}^{M} \sum_{k=1}^{M}(x_j - x_i)(x_j - x_k)^T]
$$
<br>Change the order of summation:
$$
Cov = \frac{1}{M^2} \sum_{i=1}^{M} [\sum_{k=1}^{M} \sum_{j=1}^{M}(x_j - x_i)(x_j - x_k)^T]
$$
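<br>Average this expression with the triple-sum form of Cov obtained before the subscripts were exchanged. Because $(x_i - x_j)(x_i - x_k)^T + (x_j - x_i)(x_j - x_k)^T = (x_i - x_j)(x_i - x_j)^T$, the summand no longer depends on $k$, so the sum over $k$ contributes a factor of $M$: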
$$
Cov = \frac{1}{2M} \sum_{i=1}^{M} \sum_{j=1}^{M} (x_i - x_j) (x_i - x_j)^T
$$
<br>Therefore, the eigenvectors computed from $\{x_i\}$ can equally be computed from the face difference set $\{(x_i - x_j)\}$, which contains the differences between every pair of face images in the training set. The PCA subspace thus characterizes the distribution of the difference between any two face images, whether they belong to the same individual or to different individuals.
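
The identity derived above is easy to check numerically. The sketch below uses synthetic random vectors in place of face images (an assumption made only for illustration) and compares the mean-centred form of Cov with the pairwise-difference form.

```python
import numpy as np

rng = np.random.default_rng(0)
M, D = 8, 5                                  # 8 synthetic samples of dimension 5
X = rng.normal(size=(M, D))

# Mean-centred form: sum_i (x_i - m)(x_i - m)^T
m = X.mean(axis=0)
A = X - m
cov_mean = A.T @ A

# Pairwise form: 1/(2M) * sum_i sum_j (x_i - x_j)(x_i - x_j)^T
diffs = X[:, None, :] - X[None, :, :]        # all pairwise differences, shape (M, M, D)
cov_pairs = np.einsum('ijp,ijq->pq', diffs, diffs) / (2 * M)

identity_holds = np.allclose(cov_mean, cov_pairs)
print(identity_holds)
```

Since the two matrices coincide up to floating-point error, they share the same eigenvalues and eigenvectors, which is the point of the derivation.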


In [1]:
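# Compute the covariance matrix and its eigenvectors for two data series:
# DBP: 78, 80, 81, 82, 84, 86 and SBP: 126, 128, 127, 130, 130, 132
import numpy as np

# Define the two data series
DBP = np.array([78, 80, 81, 82, 84, 86])
SBP = np.array([126, 128, 127, 130, 130, 132])

# Compute the 2x2 sample covariance matrix (np.cov divides by N - 1 by default)
covariance_matrix = np.cov(DBP, SBP)
print(covariance_matrix)

# Compute the eigenvectors (the columns of the second array returned by eig)
eigenvectors = np.linalg.eig(covariance_matrix)[1]
print(eigenvectors)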


[[8.16666667 5.96666667]
 [5.96666667 4.96666667]]
[[ 0.79341219 -0.60868473]
 [ 0.60868473  0.79341219]]
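
To connect this output back to PCA, the following sketch sorts the eigenvectors by eigenvalue and projects the centred data onto the leading one. It is an illustrative extension rather than part of the original cell: it swaps in `np.linalg.eigh` (suited to symmetric matrices), and the variable names `explained` and `scores` are introduced here. The sign of an eigenvector is arbitrary, so the signs may differ from the output above.

```python
import numpy as np

DBP = np.array([78, 80, 81, 82, 84, 86])
SBP = np.array([126, 128, 127, 130, 130, 132])

# Stack the two series as columns and subtract the column means.
data = np.column_stack([DBP, SBP]).astype(float)   # shape (6, 2)
centred = data - data.mean(axis=0)

# Same matrix as np.cov(DBP, SBP); rowvar=False treats columns as variables.
cov = np.cov(centred, rowvar=False)

# eigh returns eigenvalues in ascending order, so reorder to descending.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()   # share of total variance per component
scores = centred @ eigvecs[:, 0]      # 1-D coordinates along the first principal axis
print(explained)
print(scores)
```

For this data the first component carries about 97% of the total variance, so the six two-dimensional points are well summarized by their `scores` along a single axis.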
