## Eigenvectors and Eigenvalues
### Use numpy only
#### Find the eigenvalues and eigenvectors of the following matrices:

$$\begin{bmatrix} 1&0 \\ 0&2 \end{bmatrix}$$
$$\begin{bmatrix} 1&2 \\ 3&-4 \end{bmatrix}$$

In [1]:
import numpy as np

A = np.array([[1, 0],
              [0, 2]])

values, vectors = np.linalg.eig(A)
print(values)

[1. 2.]


In [None]:
print(vectors)

[[1. 0.]
 [0. 1.]]

In [4]:
B = np.array([[1, 2],
              [3, -4]])
values, vectors = np.linalg.eig(B)
print(values)

[ 2. -5.]


In [5]:
print(vectors)

[[ 0.89442719 -0.31622777]
 [ 0.4472136   0.9486833 ]]
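
One way to sanity-check this output is to confirm that every returned pair satisfies the defining relation $Bv = \lambda v$. The sketch below assumes the eigenvectors are stored as the columns of `vectors`, which is how `np.linalg.eig` documents its return value; the name `residual` is introduced only for this illustration.

```python
import numpy as np

B = np.array([[1, 2],
              [3, -4]])
values, vectors = np.linalg.eig(B)

# Column i of `vectors` pairs with values[i]. Broadcasting scales each column
# by its eigenvalue, so both sides of B v = lambda v are compared at once.
residual = B @ vectors - vectors * values
print(np.allclose(residual, 0))

# Each returned eigenvector is normalized to unit Euclidean length.
print(np.linalg.norm(vectors, axis=0))
```

Eigenvectors are only defined up to a nonzero scale factor, so a sign flip of an entire column is an equally valid answer.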


## PCA Using Eigendecomposition

### Create a matrix containing the following data

In [56]:
matrix = np.array([[1 , 2 , 3 , 4],
                   [5 , 5 , 6 , 7],
                   [1 , 4 , 2 , 3],
                   [5 , 3 , 2 , 1],
                   [8 , 1 , 2 , 2]
])

### Step 1: Standardize the dataset (Subtract mean and divide by standard deviation).

In [17]:
mean = np.mean(matrix,axis = 0)
std = np.std(matrix, axis = 0)
print(mean)
print(std)

[4.  3.  3.  3.4]
[2.68328157 1.41421356 1.54919334 2.05912603]


In [18]:
standardized_matrix = (matrix - mean)/std
print(standardized_matrix)


[[-1.11803399 -0.70710678  0.          0.29138576]
 [ 0.372678    1.41421356  1.93649167  1.74831455]
 [-1.11803399  0.70710678 -0.64549722 -0.19425717]
 [ 0.372678    0.         -0.64549722 -1.16554303]
 [ 1.49071198 -1.41421356 -0.64549722 -0.6799001 ]]
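
A quick check on the standardization is that every column ends up with mean 0 and standard deviation 1. The sketch below rebuilds the same data so it runs on its own; the name `standardized` is chosen for this illustration. It relies on `np.std` defaulting to `ddof=0`, the population standard deviation.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 5, 6, 7],
                   [1, 4, 2, 3],
                   [5, 3, 2, 1],
                   [8, 1, 2, 2]])

# axis=0 computes one mean and one standard deviation per column (feature).
standardized = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)

print(np.allclose(standardized.mean(axis=0), 0))  # columns are centered
print(np.allclose(standardized.std(axis=0), 1))   # columns have unit spread
```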


### Step 2: Calculate the covariance matrix for the features in the dataset.
#### Use the formula (X.T@X) / n, where X is the standardized matrix and n is the number of rows (samples), then confirm using np.cov()

In [19]:
cov_matrix = (standardized_matrix.T @ standardized_matrix) / standardized_matrix.shape[0]
print(cov_matrix)

[[ 1.         -0.31622777  0.04811252 -0.18098843]
 [-0.31622777  1.          0.63900965  0.61812254]
 [ 0.04811252  0.63900965  1.          0.94044349]
 [-0.18098843  0.61812254  0.94044349  1.        ]]


In [25]:
print(np.cov(standardized_matrix.T , ddof = 0))



[[ 1.         -0.31622777  0.04811252 -0.18098843]
 [-0.31622777  1.          0.63900965  0.61812254]
 [ 0.04811252  0.63900965  1.          0.94044349]
 [-0.18098843  0.61812254  0.94044349  1.        ]]
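
The ones on the diagonal are not a coincidence: the covariance matrix of standardized data is the correlation matrix of the original data. The sketch below illustrates this under the same convention used here (population standard deviation and division by n); the names `cov_of_standardized` and `corr` are introduced only for this example.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 5, 6, 7],
                   [1, 4, 2, 3],
                   [5, 3, 2, 1],
                   [8, 1, 2, 2]])

# Standardize with the population standard deviation (np.std defaults to ddof=0).
standardized = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)
n = standardized.shape[0]
cov_of_standardized = standardized.T @ standardized / n

# rowvar=False tells NumPy that the columns, not the rows, are the variables.
corr = np.corrcoef(matrix, rowvar=False)
print(np.allclose(cov_of_standardized, corr))
```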


### Step 3: Calculate the eigenvalues and eigenvectors for the covariance matrix.
### Step 4: Sort eigenvalues and their corresponding eigenvectors.

In [30]:
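values, vectors = np.linalg.eig(cov_matrix)
order = np.argsort(values)[::-1]  # eig does not guarantee any order; sort descending
values, vectors = values[order], vectors[:, order]
print(values)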

[2.51579324 1.0652885  0.39388704 0.02503121]


In [31]:
print(vectors)



[[ 0.16195986 -0.91705888 -0.30707099  0.19616173]
 [-0.52404813  0.20692161 -0.81731886  0.12061043]
 [-0.58589647 -0.3205394   0.1882497  -0.72009851]
 [-0.59654663 -0.11593512  0.44973251  0.65454704]]
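
Because a covariance matrix is symmetric, `np.linalg.eigh` is a possible alternative to `np.linalg.eig`. It returns real eigenvalues in ascending order, so reversing them gives the descending order PCA needs. This is a sketch of that alternative, not the method used in the cells above, and individual columns may differ from the printed ones by a sign.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 5, 6, 7],
                   [1, 4, 2, 3],
                   [5, 3, 2, 1],
                   [8, 1, 2, 2]])
standardized = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)
cov_matrix = standardized.T @ standardized / standardized.shape[0]

# eigh is specialized for symmetric matrices and sorts eigenvalues ascending.
values, vectors = np.linalg.eigh(cov_matrix)

# Reverse both arrays together so column i still belongs to values[i].
values = values[::-1]
vectors = vectors[:, ::-1]
print(values)
```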


### Step 5: Pick k eigenvalues and form a matrix of eigenvectors.

#### Select the first two eigenvectors

In [38]:
first_two_eigs = vectors[:,:2]
print(first_two_eigs)

[[ 0.16195986 -0.91705888]
 [-0.52404813  0.20692161]
 [-0.58589647 -0.3205394 ]
 [-0.59654663 -0.11593512]]


### Step 6: Transform the original matrix.

In [49]:
print(np.round(standardized_matrix@first_two_eigs,decimals = 3))

[[ 0.016  0.845]
 [-2.858 -0.873]
 [-0.058  1.401]
 [ 1.134  0.   ]
 [ 1.766 -1.374]]
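
A useful property of this projection is that the variance of each projected column equals the corresponding eigenvalue, and the columns are uncorrelated with each other. The following sketch checks both claims on the same data; `projected` and `projected_cov` are names introduced only for this example.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 5, 6, 7],
                   [1, 4, 2, 3],
                   [5, 3, 2, 1],
                   [8, 1, 2, 2]])
standardized = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)
n = standardized.shape[0]
cov_matrix = standardized.T @ standardized / n

values, vectors = np.linalg.eig(cov_matrix)
order = np.argsort(values)[::-1]          # largest eigenvalue first
values, vectors = values[order], vectors[:, order]

# Project the standardized data onto the two leading eigenvectors.
projected = standardized @ vectors[:, :2]

# Population variance (ddof=0) of each component matches its eigenvalue.
print(np.allclose(projected.var(axis=0), values[:2]))

# The covariance of the projected data is diagonal: components are uncorrelated.
projected_cov = projected.T @ projected / n
print(np.allclose(projected_cov, np.diag(values[:2])))
```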


## SVD
### Repeat using SVD and compare the results

In [82]:
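# svd returns U, the singular values, and V^T; the rows of V^T are the principal directions
_, values, vectors = np.linalg.svd(standardized_matrix)
vectors = vectors.T  # transpose so that each column is one principal direction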


In [83]:
print(vectors)
# eigen vectors

[[ 0.16195986 -0.91705888 -0.30707099  0.19616173]
 [-0.52404813  0.20692161 -0.81731886  0.12061043]
 [-0.58589647 -0.3205394   0.1882497  -0.72009851]
 [-0.59654663 -0.11593512  0.44973251  0.65454704]]


In [84]:
values = (values*values)/standardized_matrix.shape[0]
print(values)
# eigenvalues: squared singular values divided by the number of samples n

[2.51579324 1.0652885  0.39388704 0.02503121]


In [85]:
total = np.sum(values)  # named `total` to avoid shadowing the built-in sum()
explained_var = (values/total)*100
print(explained_var)

[62.89483102 26.63221259  9.8471761   0.6257803 ]


In [86]:
first_two = vectors[:,:2]
print(first_two)

[[ 0.16195986 -0.91705888]
 [-0.52404813  0.20692161]
 [-0.58589647 -0.3205394 ]
 [-0.59654663 -0.11593512]]


In [87]:
print(np.round(standardized_matrix@first_two,decimals = 3))

[[ 0.016  0.845]
 [-2.858 -0.873]
 [-0.058  1.401]
 [ 1.134  0.   ]
 [ 1.766 -1.374]]
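
The two routes agree because, for a standardized matrix $X$ with $n$ rows, $X^\top X / n = V \,(\Sigma^2 / n)\, V^\top$ when $X = U \Sigma V^\top$. The sketch below compares both routes programmatically, assuming $n$ is the number of rows (samples). Since eigenvectors are only defined up to sign, it compares absolute values, which is valid here because the eigenvalues are distinct. The variable names are chosen for this illustration, and `np.linalg.eigh` stands in for `np.linalg.eig` because it returns sorted eigenvalues.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 5, 6, 7],
                   [1, 4, 2, 3],
                   [5, 3, 2, 1],
                   [8, 1, 2, 2]])
standardized = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)
n = standardized.shape[0]

# Route 1: eigendecomposition of the covariance matrix, largest eigenvalue first.
cov_matrix = standardized.T @ standardized / n
eig_values, eig_vectors = np.linalg.eigh(cov_matrix)
eig_values, eig_vectors = eig_values[::-1], eig_vectors[:, ::-1]

# Route 2: SVD of the standardized data; singular values come sorted descending.
_, singular_values, vt = np.linalg.svd(standardized, full_matrices=False)
svd_values = singular_values**2 / n
svd_vectors = vt.T

print(np.allclose(svd_values, eig_values))
print(np.allclose(np.abs(svd_vectors), np.abs(eig_vectors)))
```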
