## Principal Component Analysis
- **Goal:** Extract "n" Principal Components
- **Libraries:** sklearn, numpy
- **Data:** Digits dataset (`sklearn.datasets.load_digits`)
- **Metric:** Correctness of the projected matrix

#### Disclaimer: This implementation is not optimized and should not be used in production.

In [110]:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

In [111]:
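class PCA:
    def fit(self, X, n_components):
        # Decompose X into U, singular values S, and Vt (rows are directions).
        # n_components is unused here; it is kept for a consistent interface.
        # Note: X is not mean-centered, as textbook PCA would require.
        return np.linalg.svd(X, full_matrices=False)

    def fit_transform(self, X, n_components=2):
        # Decompose matrix
        U, S, Vt = self.fit(X, n_components)
        # Indices of the singular values in descending order
        # (np.linalg.svd already returns them sorted this way)
        order = S.argsort()[::-1]
        # Keep the indices of the n_components largest singular values
        indices = order[:n_components]
        # Project X onto the selected directions
        return X.dot(Vt[indices].T)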
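
The `PCA` class above passes `X` to the SVD without subtracting the per-feature mean. As a minimal sketch of the centered variant, assuming a hypothetical helper named `pca_centered` (not part of the notebook's class), the projection and the share of variance captured by each component could be computed like this:

```python
import numpy as np
from sklearn import datasets


def pca_centered(X, n_components=2):
    """Project X onto its top principal components after mean-centering.

    Hypothetical helper for illustration; not part of the notebook's PCA class.
    """
    # Subtract the per-feature mean so the components describe variance
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Fraction of the total variance captured by each component
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    projection = X_centered @ Vt[:n_components].T
    return projection, explained_ratio[:n_components]


X = datasets.load_digits().data
Z, ratio = pca_centered(X, n_components=5)
print(Z.shape)  # (1797, 5)
print(ratio.round(3))
```

The squared singular values are proportional to the variance along each direction, so their normalized values give a rough sense of how many components are worth keeping.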

### Loading Data

In [112]:
def load_data():
    digits = datasets.load_digits()
    X = digits.data 
    y = digits.target
    return train_test_split(X, y, test_size=0.33, random_state=42)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = load_data()

### Creating Model

In [113]:
pca = PCA()
# Number of principal components to extract
n_components = 5
print("------", n_components, "Principal Components------\n\n",
      pca.fit_transform(X_test, n_components))

------ 5 Principal Components------

 [[ 46.24104278 -10.51167234  16.9609893  -11.87530579   4.75584215]
 [ 58.8651787   13.81096855  -0.85293928  10.75333539  -3.86584426]
 [ 46.27149533  21.71843272  -3.6398134   -8.50892683   5.19127461]
 ...
 [ 46.26020127 -25.22014994  15.36917037   9.2694812    4.01744055]
 [ 49.83463271  25.30868611   1.63228485   5.32125574   3.97554289]
 [ 49.63862614   7.31028098  15.25410044   5.58173265  -8.2982519 ]]
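
One way to check the "correct matrix" metric might be to compare the SVD projection against scikit-learn's own `PCA`. The sketch below is an assumption about how such a check could look, not the author's method: `svd_project` is a hypothetical function that mirrors the projection used in `fit_transform`, and `SklearnPCA` is just an import alias. Because scikit-learn centers the data internally and the sign of each component is arbitrary, the comparison centers the input first and compares absolute values.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA as SklearnPCA


def svd_project(X, n_components):
    # Same projection as the notebook's fit_transform: no centering is applied
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T


X = datasets.load_digits().data
k = 5

# Reference result from scikit-learn, which centers the data internally
Z_ref = SklearnPCA(n_components=k, svd_solver="full").fit_transform(X)

# Center the data before projecting so both methods see the same input
Z_svd = svd_project(X - X.mean(axis=0), k)

# Each component's sign is arbitrary, so compare absolute values
matches = np.allclose(np.abs(Z_svd), np.abs(Z_ref), atol=1e-6)
# Without centering, the projection is not expected to agree
matches_uncentered = np.allclose(np.abs(svd_project(X, k)), np.abs(Z_ref), atol=1e-6)

print("centered input matches scikit-learn up to sign:", matches)
print("uncentered input matches scikit-learn up to sign:", matches_uncentered)
```

The uncentered comparison shows why the values printed above are not expected to line up with scikit-learn's output: the input there was never centered.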
