# Mathematical Approach to PCA

In [1]:
import numpy as np

## Step 1: Prepare the Dataset

Consider the following dataset with two features, `x1` and `x2`, and four samples:

In [2]:
x1 = [4, 8, 13, 7]
x2 = [11, 4, 5, 14]

Stack the two features into a 2D array with one sample per row:

In [3]:
X = np.array(list(zip(x1, x2)))
print(X)

[[ 4 11]
 [ 8  4]
 [13  5]
 [ 7 14]]


## Step 2: Standardize the Dataset

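Compute the mean of each column. Note that this walkthrough only centers the data (subtracts the mean); it does not rescale the features to unit variance.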

In [4]:
mean = np.mean(X, axis=0)  # axis=0 averages over the rows, one mean per column
print(mean)

[8.  8.5]


Center the data by subtracting the column means:

In [5]:
X_centered = X - mean
print(X_centered)

[[-4.   2.5]
 [ 0.  -4.5]
 [ 5.  -3.5]
 [-1.   5.5]]


## Step 3: Find the Eigenvalues and Eigenvectors

Covariance matrix: $C = \frac{X^T X}{N - 1}$, where $X$ is the centered dataset matrix (one sample per row) and $N$ is the number of samples.

Calculate the covariance matrix:

In [6]:
# np.cov treats each row as a variable, hence the transpose; same as (X^T . X) / (n-1)
cov = np.cov(X_centered.T)
print(cov)

[[ 14. -11.]
 [-11.  23.]]
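
As a cross-check, the covariance matrix can also be computed directly from its definition. The snippet below is a minimal sketch that rebuilds the same data; the names `cov_manual`, `cov_numpy` and `n_samples` are introduced here for illustration only.

```python
import numpy as np

# Same data as above: one sample per row, one feature per column.
X = np.array([[4, 11], [8, 4], [13, 5], [7, 14]], dtype=float)
X_centered = X - X.mean(axis=0)
n_samples = X.shape[0]

# Sample covariance from the definition (divide by n - 1).
cov_manual = X_centered.T @ X_centered / (n_samples - 1)

# np.cov expects one variable per row by default, hence the transpose.
cov_numpy = np.cov(X_centered.T)

print(cov_manual)
print(np.allclose(cov_manual, cov_numpy))
```

Both routes should agree up to floating-point rounding.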


#### Calculating with NumPy

In [7]:
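# calculate the eigenvalues and eigenvectors of the covariance matrix;
# column i of `eigenvectors` pairs with eigenvalues[i] (the order is not sorted)
eigenvalues, eigenvectors = np.linalg.eig(cov)
# eigenvectors are only defined up to sign; uncomment to flip them
# eigenvectors = eigenvectors * (-1)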
print(eigenvalues)
print(eigenvectors)

[ 6.61513568 30.38486432]
[[-0.83025082  0.55738997]
 [-0.55738997 -0.83025082]]
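
One way to sanity-check this output is to confirm that every returned pair satisfies $C v = \lambda v$ and that each eigenvector has unit length. This is a small sketch that hard-codes the covariance matrix from above; the variable `residual` is an illustrative name.

```python
import numpy as np

cov = np.array([[14.0, -11.0], [-11.0, 23.0]])
eigenvalues, eigenvectors = np.linalg.eig(cov)

# Column i of `eigenvectors` pairs with eigenvalues[i].
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    residual = cov @ v - eigenvalues[i] * v  # should be ~0 for an eigenpair
    is_eigenpair = np.allclose(residual, 0.0)
    is_unit = np.isclose(np.linalg.norm(v), 1.0)
    print(i, is_eigenpair, is_unit)
```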


#### Calculating manually

Calculating from the characteristic equation:

$$|C - \lambda I| = 0
\\[5pt]
\Leftrightarrow
\left|
\begin{bmatrix}
14 & -11 \\
-11 & 23 \\
\end{bmatrix}
- \lambda
\begin{bmatrix}
1 & 0 \\
0 & 1 \\
\end{bmatrix}
\right|
= 0
\\[5pt]
\Leftrightarrow
\begin{vmatrix}
14 - \lambda & -11 \\
-11 & 23 - \lambda\\
\end{vmatrix}
= 0$$

Expanding the determinant on the left side:

$$
(14 - \lambda)(23 - \lambda) - (-11)(-11) = 0
\\[5pt]
\Leftrightarrow
\lambda^2 - 37\lambda + 201 = 0
\\[5pt]
\Leftrightarrow
\left[\begin{array}{l}
\lambda_1 = \frac{37 + \sqrt{565}}{2} \approx 30.3849 \\
\lambda_2 = \frac{37 - \sqrt{565}}{2} \approx 6.6151
\end{array} \right.
$$
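
The roots of the characteristic polynomial can be checked numerically. The sketch below uses `np.roots` on the coefficients of $\lambda^2 - 37\lambda + 201$ and compares the result with the closed-form values; `disc`, `lambda_1` and `lambda_2` are names chosen for this example.

```python
import numpy as np

# Coefficients of lambda^2 - 37*lambda + 201, highest degree first.
roots = np.roots([1.0, -37.0, 201.0])
roots_desc = np.sort(roots.real)[::-1]  # largest root first

# Closed-form roots from the quadratic formula.
disc = 37.0**2 - 4.0 * 201.0  # discriminant, equal to 565
lambda_1 = (37.0 + np.sqrt(disc)) / 2.0
lambda_2 = (37.0 - np.sqrt(disc)) / 2.0

print(roots_desc)
print(lambda_1, lambda_2)
```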



Now that we have the eigenvalues, let's find the eigenvectors. Each eigenvector $U$ satisfies:

$$
C U = \lambda U
\text{ with } U =
\begin{bmatrix}
x \\
y \\
\end{bmatrix}
\\[5pt]
\Leftrightarrow
\begin{bmatrix}
14 & -11 \\
-11 & 23 \\
\end{bmatrix}
\begin{bmatrix}
x \\
y \\
\end{bmatrix} =
\lambda
\begin{bmatrix}
x \\
y \\
\end{bmatrix}
\\[5pt]
\Leftrightarrow
\left\{\begin{array}{l}
14x - 11y = \lambda x \\
-11x + 23y = \lambda y
\end{array} \right.
\\[5pt]
\Leftrightarrow
\left\{\begin{array}{l}
14x - 11y = \lambda x \\
x = \frac{23 - \lambda}{11}y
\end{array} \right.
$$

+ With $\lambda_1 = \frac{37 + \sqrt{565}}{2}$, we have:

An eigenvector is only determined up to a scale factor, so we can set $y = 1$, which gives $x = \frac{23 - \lambda_1}{11} = \frac{9 - \sqrt{565}}{22}$. The length of this unnormalized vector $U_1$ is:

$$
\lVert U_1 \rVert
=
\sqrt{x^2 + y^2}
\\[5pt]
=
\sqrt{\left(\frac{9 - \sqrt{565}}{22}\right)^2 + 1^2}
\\[5pt]
\approx 1.204455
$$

Dividing $U_1$ by its length gives the unit eigenvector:

$$
E_1 =
\begin{bmatrix}
\frac{\frac{9 - \sqrt{565}}{22}}{\lVert U_1 \rVert}
\\[5pt]
\frac{1}{\lVert U_1 \rVert}
\end{bmatrix}
\\[5pt]
\Leftrightarrow
E_1 \approx
\begin{bmatrix}
-0.5574
\\
0.8303
\end{bmatrix}
$$

+ With $\lambda_2 = \frac{37 - \sqrt{565}}{2}$, we have:

Setting $y = 1$ gives $x = \frac{23 - \lambda_2}{11} = \frac{9 + \sqrt{565}}{22}$. The length of this unnormalized vector $U_2$ is:

$$
\lVert U_2 \rVert
=
\sqrt{x^2 + y^2}
\\[5pt]
=
\sqrt{\left(\frac{9 + \sqrt{565}}{22}\right)^2 + 1^2}
\\[5pt]
\approx 1.79407
$$

Dividing $U_2$ by its length gives the unit eigenvector:

$$
E_2 =
\begin{bmatrix}
\frac{\frac{9 + \sqrt{565}}{22}}{\lVert U_2 \rVert}
\\[5pt]
\frac{1}{\lVert U_2 \rVert}
\end{bmatrix}
\\[5pt]
\Leftrightarrow
E_2 \approx
\begin{bmatrix}
0.8303
\\
0.5574
\end{bmatrix}
$$
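
The manual derivation can be replayed in a few lines of NumPy. This sketch follows the same recipe (set $y = 1$, solve for $x$, divide by the length) and checks that each result satisfies $C E = \lambda E$; the list `unit_vectors` is a helper name used only here.

```python
import numpy as np

cov = np.array([[14.0, -11.0], [-11.0, 23.0]])
sqrt_disc = np.sqrt(565.0)
lambdas = [(37.0 + sqrt_disc) / 2.0, (37.0 - sqrt_disc) / 2.0]

unit_vectors = []
for lam in lambdas:
    # Second equation of the system with y = 1: x = (23 - lambda) / 11.
    u = np.array([(23.0 - lam) / 11.0, 1.0])
    e = u / np.linalg.norm(u)  # scale to unit length
    unit_vectors.append(e)
    print(np.round(e, 4), np.allclose(cov @ e, lam * e))
```

The signs may differ from what `np.linalg.eig` returns, since a unit eigenvector multiplied by -1 is still a unit eigenvector.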

## Step 4: Arrange the Eigenvalues

Placing the unit eigenvectors $E_1$ and $E_2$ side by side as columns gives the feature vector (a matrix, despite the name):
$$
\begin{bmatrix}
-0.5574
&
0.8303 
\\[5pt]
0.8303
&
0.5574 
\end{bmatrix}
$$

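The eigenvector with the largest eigenvalue is the first principal component of the dataset. In this case $\lambda_1 \approx 30.38$ is the largest eigenvalue, so $E_1$ is the first principal component and $E_2$ (for $\lambda_2 \approx 6.62$) is the second. Note that NumPy returned the eigenvalues in the opposite order above, so there the second column of `eigenvectors` corresponds to $\lambda_1$; it also differs from $E_1$ by a sign, which does not change the direction it spans.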
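
Since `np.linalg.eig` does not guarantee any ordering, the eigenpairs can be arranged explicitly. The following is a minimal sketch using `np.argsort`; the names `order`, `eigenvalues_sorted`, `eigenvectors_sorted` and `explained_ratio` are assumptions made for this example, and the variance share is an extra diagnostic that the steps above do not compute.

```python
import numpy as np

cov = np.array([[14.0, -11.0], [-11.0, 23.0]])
eigenvalues, eigenvectors = np.linalg.eig(cov)

# Indices that order the eigenvalues from largest to smallest.
order = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[order]
eigenvectors_sorted = eigenvectors[:, order]  # reorder the columns to match

# Share of the total variance carried by each component.
explained_ratio = eigenvalues_sorted / eigenvalues_sorted.sum()

print(eigenvalues_sorted)
print(eigenvectors_sorted)
print(explained_ratio)
```

After sorting, the first column of `eigenvectors_sorted` is the first principal component.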

## Step 5: Dataset projection

Use this equation to transform the dataset:
$$
Z = X V
$$
where $X$ is the centered dataset (one sample per row) and $V$ is the feature vector (one eigenvector per column).


In [44]:
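# P holds the projected data with one column per sample;
# P.T is the same as X_centered.dot(eigenvectors), i.e. Z = X V
P = eigenvectors.T.dot(X_centered.T)
print(P.T)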

[[ 1.92752836 -4.30518692]
 [ 2.50825486  3.73612869]
 [-2.20038921  5.69282771]
 [-2.23539401 -5.12376947]]
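
A useful property to verify here is that the projected coordinates are uncorrelated: their covariance matrix should be diagonal, with the eigenvalues on the diagonal. The sketch below rebuilds the pipeline from scratch; `Z` and `cov_Z` are illustrative names.

```python
import numpy as np

X = np.array([[4, 11], [8, 4], [13, 5], [7, 14]], dtype=float)
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered.T)
eigenvalues, eigenvectors = np.linalg.eig(cov)

# Project the centered data onto the eigenvectors (Z = X V).
Z = X_centered @ eigenvectors

# Covariance of the projected data: off-diagonal entries should be ~0.
cov_Z = np.cov(Z.T)
print(np.round(cov_Z, 6))
print(np.allclose(cov_Z, np.diag(eigenvalues)))
```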


Use these equations to reconstruct the data:
- To reconstruct the centered (zero-mean) dataset:
$$
X = Z V^T
\\[5pt]
\text{X is the zero-mean dataset; } V^{-1} = V^T \text{ because the eigenvectors are orthonormal}
$$
- To reconstruct the original dataset:
$\text{Original dataset} = \text{Zero-mean dataset} + \text{Original mean}$

In [47]:
# reconstruct the zero-mean data: (V P)^T = Z V^T
X_mean_reconstructed = eigenvectors.dot(P).T
print(X_mean_reconstructed)

# reconstruct the original data by adding the column means back
X_reconstructed = X_mean_reconstructed + mean
print(X_reconstructed)

[[-4.0000000e+00  2.5000000e+00]
 [ 2.3471736e-16 -4.5000000e+00]
 [ 5.0000000e+00 -3.5000000e+00]
 [-1.0000000e+00  5.5000000e+00]]
[[ 4. 11.]
 [ 8.  4.]
 [13.  5.]
 [ 7. 14.]]
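
The reconstruction above is exact because both eigenvectors were kept. As a hedged illustration of what changes when only the first principal component is retained, the sketch below projects onto that single direction and maps back; `top`, `V1`, `Z1`, `X_approx` and `sse` are names introduced for this example and are not part of the walkthrough above.

```python
import numpy as np

X = np.array([[4, 11], [8, 4], [13, 5], [7, 14]], dtype=float)
mean = X.mean(axis=0)
X_centered = X - mean
eigenvalues, eigenvectors = np.linalg.eig(np.cov(X_centered.T))

# Keep only the eigenvector with the largest eigenvalue, as a 2x1 matrix.
top = np.argmax(eigenvalues)
V1 = eigenvectors[:, [top]]

Z1 = X_centered @ V1         # 4x1 scores on the first component
X_approx = Z1 @ V1.T + mean  # back to the original 2D space (lossy)

# The squared error equals (n - 1) times the discarded eigenvalue.
sse = np.sum((X - X_approx) ** 2)
print(np.round(X_approx, 4))
print(sse, (X.shape[0] - 1) * eigenvalues.min())
```

The two printed numbers on the last line should match, which shows that the information lost is exactly the variance along the dropped component.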
