In [None]:
%load_ext autoreload
%autoreload 2

# Task 3: Eigenfaces & Fisherfaces (3 pts)

### <div align="right"> &copy; Markiian Novosad & Rostyslav Hryniv, 2023 </div>

## Completed by:   
*   First team member
*   Second team member


In this task, your main goal is to discuss dimensionality reduction techniques such as **PCA** and **LDA**, and their performance on classification tasks.

A good source to read more on these methods is the book

*   Kevin P. Murphy. [Probabilistic Machine Learning: An Introduction](https://probml.github.io/pml-book/book1.html), MIT, 2022. Section 9.2.6 (FLDA), 20.1 (PCA and Eigenfaces)



## 1. Lib and Data prep (0.5 pts)
#### Let's start by importing all necessary libraries and preparing our data

In [None]:
from PIL import Image
from sklearn.datasets import fetch_lfw_people
import numpy as np
from sklearn.svm import SVC
import sklearn
from matplotlib import pyplot as plt
import scipy.linalg  # explicit submodule import: MYLDA.fit calls scipy.linalg.eigh

Seed the random number generator so that we get the same results every time:

In [None]:
# scikit-learn falls back to NumPy's global RNG when `random_state` is None,
# so seeding NumPy covers both libraries.
np.random.seed(1)
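
As a small illustration of what seeding buys us, the sketch below re-seeds NumPy's global generator and draws the same permutation twice. The variable names `first` and `second` are illustrative only and are not used elsewhere in the notebook.

```python
import numpy as np

np.random.seed(1)
first = np.random.permutation(10)

# Re-seeding restarts the same pseudo-random sequence,
# so the second draw reproduces the first one exactly.
np.random.seed(1)
second = np.random.permutation(10)

print(np.array_equal(first, second))
```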

Fetch data and split them into training and test sets:

In [None]:
lfw_people = fetch_lfw_people(min_faces_per_person=100, resize=0.3)
n,h,w = lfw_people.images.shape
labels = lfw_people.target

class1, class2 = ...  # choose your preferred classes; make sure they have roughly equal numbers of samples

# Keep only the samples that belong to the two chosen classes
mask = (labels == class1) | (labels == class2)
x = lfw_people.data[mask]
y = labels[mask]

x_train, x_test, y_train, y_test = ...  # split the data using the sklearn library
x_train.shape, x_test.shape
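
One possible way to do the split, shown here on synthetic data rather than on the LFW arrays, is `sklearn.model_selection.train_test_split`. The toy arrays, the 25% test fraction and the use of `stratify` are assumptions made for this sketch, not requirements of the task.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the face data: 40 samples, 8 features, two balanced classes
rng = np.random.default_rng(0)
toy_x = rng.normal(size=(40, 8))
toy_y = np.repeat([0, 1], 20)

# stratify=toy_y keeps the class proportions identical in both subsets
toy_x_train, toy_x_test, toy_y_train, toy_y_test = train_test_split(
    toy_x, toy_y, test_size=0.25, random_state=1, stratify=toy_y
)

print(toy_x_train.shape, toy_x_test.shape)
print(np.bincount(toy_y_train), np.bincount(toy_y_test))
```

Stratifying is a design choice that matters most when the two classes are not perfectly balanced, since it prevents the test set from being dominated by one class by chance.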

Visualise your data:

In [None]:
    # ========= YOUR CODE STARTS HERE ========= #


    # ========== YOUR CODE ENDS HERE ========== #

## 2. PCA (1.5 pts)


### **2.1 (0.7 pt) Implementation**
Let's start by implementing the **PCA** algorithm:

In [None]:
class MYPCA:
    def __init__(self) -> None:
        pass

    def _get_num_components(self, lambdas, threshold ) -> int:
        """
        Calculate minimal number of principal components so that variance explainability >= threshold.
        """
         # ========= YOUR CODE STARTS HERE ========= #
        ...

         # ========== YOUR CODE ENDS HERE ========== #

    def center_data(self, data):
        """
        Center our data by subtracting the mean
        """
         # ========= YOUR CODE STARTS HERE ========= #

        self._data_mean = ...
        return ...
         # ========== YOUR CODE ENDS HERE ========== #


    def compute_eigen(self, cov_mat):
        """
        Using built-in numpy functionality, compute eigenvectors and eigenvalues of covariance matrix;
        """
        # ========= YOUR CODE STARTS HERE ========= #
        eigenvalues, eigenvectors = ...

        # ========== YOUR CODE ENDS HERE ========== #

        return eigenvalues, eigenvectors


    def fit(self, data, exp_threshold=0.8):
        """
        1. Center the data;
        2. Compute covariance matrix
        3. Compute optimal k (number of components)
        4. Select the top-k eigenvectors of the covariance matrix
        5. Compute components by projecting data with eigenvectors
        """
        # ========= YOUR CODE STARTS HERE ========= #

        data_norm = ...
        cov_mat = ...
        self.eigenvalues, self.eigenvectors = self.compute_eigen(cov_mat=cov_mat)

        self._k_componentes = self._get_num_components(self.eigenvalues, exp_threshold)
        top_k_vectors = ...
        self.components = top_k_vectors.dot(data_norm)
        # ========== YOUR CODE ENDS HERE ========== #


    def plot_components(self, im_shape):
        """
        For better understanding, visualise the *self.components* variable.
        """
        # ========= YOUR CODE STARTS HERE ========= #

        # ========== YOUR CODE ENDS HERE ========== #

    def plot_explainability(self):
        """
        Plot dependence of variance explainability on number of components.
        """
        # ========= YOUR CODE STARTS HERE ========= #
        ...
        # ========== YOUR CODE ENDS HERE ========== #

    def transform(self, data):
        """
        1. Center the data;
        2. Projection onto the components
        """
        # ========= YOUR CODE STARTS HERE ========= #

        data_norm = ...
        return data_norm.dot(...)
        # ========== YOUR CODE ENDS HERE ========== #
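
The `fit` skeleton multiplies `top_k_vectors` by `data_norm`, which suggests (this is a reading of the template, not something it states) that the eigenvectors are meant to come from the small $n \times n$ matrix $X_c X_c^\top$ rather than from the large $d \times d$ matrix $X_c^\top X_c$. If $X_c X_c^\top v = \lambda v$, then $u = X_c^\top v$ satisfies $X_c^\top X_c u = \lambda u$, so multiplying by the centered data maps small eigenvectors to full-size ones. The sketch below checks this on random toy data; all names in it are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 50))           # 6 samples, 50 features (n << d)
Xc = X - X.mean(axis=0)                # center each feature

# Eigen-decomposition of the small (n x n) matrix
gram = Xc @ Xc.T
evals, evecs = np.linalg.eigh(gram)    # eigh returns eigenvalues in ascending order
order = np.argsort(evals)[::-1]        # re-order to descending
evals, evecs = evals[order], evecs[:, order]

k = 3
# Map each small eigenvector v to u = Xc^T v; rows of `components` are the u's
components = evecs[:, :k].T @ Xc       # shape (k, d)
components /= np.linalg.norm(components, axis=1, keepdims=True)

# Each row should be an eigenvector of the large (d x d) matrix with the same eigenvalue
big = Xc.T @ Xc
residual = big @ components.T - components.T * evals[:k]
print(np.abs(residual).max())
```

Working with the $n \times n$ matrix is attractive for face images because the number of pixels is usually far larger than the number of training samples.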





### **2.2 PCA Classifier (0.4 pts)**
Having implemented the **MYPCA** class, let's plot what our components look like, along with the explainability plot.

In [None]:
pca = MYPCA()
pca.fit(x_train, exp_threshold=0.8)
pca.plot_explainability()

In [None]:
pca.plot_components((h, w))  # plot_components requires the image shape

Now, let's try to classify the data, using sklearn **SVC**.

In [None]:

classifier = SVC().fit(pca.transform(...), ...)
print(classifier.score(pca.transform(...), ...))

### **2.3. Discussion of the results (0.4 pts):**
1. What is the concept of variance explainability, and why is it so important?
2. What is the optimal number of components for reaching good enough classification performance, and why?
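
As a minimal illustration of the quantity behind the first question, the sketch below computes the cumulative fraction of variance explained by the leading components and picks the smallest number of components that reaches a threshold. The eigenvalues are made up for the example, and `np.searchsorted` is just one way to find the cut-off.

```python
import numpy as np

# Hypothetical eigenvalues of a covariance matrix, sorted in descending order
lambdas = np.array([4.0, 3.0, 2.0, 1.0])

ratio = lambdas / lambdas.sum()        # share of total variance per component
cumulative = np.cumsum(ratio)          # share explained by the first 1, 2, ... components

threshold = 0.8
# Index of the first cumulative value >= threshold, converted to a count
k = int(np.searchsorted(cumulative, threshold)) + 1

print(cumulative, k)
```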

---
#### **Your answer here:**
---

## 3. Fisherface classification and LDA (1 pt)
Now, having implemented the **PCA** algorithm, let's implement the **Linear Discriminant Analysis** algorithm.

You can read more about this algorithm [here](https://towardsdatascience.com/fishers-linear-discriminant-intuitively-explained-52a1ba79e1bb).

In this task we will see how we can improve our classification using **LDA**.

### **3.1. (0.6 pt) Implementation**
Here you only need to compute the between-class scatter matrix $S_B$ and the within-class scatter matrix $S_W$, discussed in the source above, and select the appropriate eigenvectors.

In [None]:
class MYLDA:

    def fit(self, data, data_labels, k_components):
        n_feats = data.shape[-1]
        SW = np.zeros((n_feats, n_feats))
        SB = np.zeros((n_feats, n_feats))
        mu = data.mean(0)

        # ========= YOUR CODE STARTS HERE ========= #

        for channel in np.unique(data_labels):
            ...
        # ========== YOUR CODE ENDS HERE ========== #

        evals, evecs = scipy.linalg.eigh(SB, SW)
        self.proj = ...  # sort eigenvectors by eigenvalues and select the top k eigenvectors

    def transform(self, data):
        return data @ self.proj
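
For intuition, here is a sketch of the scatter matrices on a two-class toy problem in two dimensions. It assumes the common definitions $S_W = \sum_c \sum_{x \in c} (x - \mu_c)(x - \mu_c)^\top$ and $S_B = \sum_c n_c (\mu_c - \mu)(\mu_c - \mu)^\top$; the linked source may normalise them differently. The data and all variable names are illustrative.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))   # class 0
X1 = rng.normal(loc=[3.0, 1.0], scale=1.0, size=(50, 2))   # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

mu = X.mean(axis=0)
n_feats = X.shape[1]
SW = np.zeros((n_feats, n_feats))
SB = np.zeros((n_feats, n_feats))
for label in np.unique(y):
    X_c = X[y == label]
    mu_c = X_c.mean(axis=0)
    centered = X_c - mu_c
    SW += centered.T @ centered                  # spread of the class around its own mean
    diff = (mu_c - mu).reshape(-1, 1)
    SB += X_c.shape[0] * (diff @ diff.T)         # spread of the class mean around the global mean

# Generalized symmetric eigenproblem SB w = lambda SW w (eigenvalues come back ascending)
evals, evecs = scipy.linalg.eigh(SB, SW)
order = np.argsort(evals)[::-1]
proj = evecs[:, order[:1]]                       # top discriminant direction, shape (2, 1)
z = X @ proj                                     # 1-D projection of the data

# For two classes, the top direction is parallel to SW^{-1} (mu_1 - mu_0)
w_ref = np.linalg.solve(SW, X1.mean(axis=0) - X0.mean(axis=0))
cosine = abs(proj[:, 0] @ w_ref) / (np.linalg.norm(proj[:, 0]) * np.linalg.norm(w_ref))
print(cosine)
```

Note that $S_B$ has rank at most $C - 1$ for $C$ classes, so in this two-class example only one eigenvalue is meaningfully different from zero.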


Here we implement the class for the Fisherface algorithm, which applies **LDA** on top of **PCA**: PCA first reduces the dimensionality of the images, and LDA then projects the reduced data onto the directions that best separate the classes:

In [None]:
class FischerFaces:
    def __init__(self) -> None:
        """
        Initialize pca and lda
        """
        self.pca = ...
        self.lda = ...

    def fit(self, X,Y, threshold, lda_components):
        """
        Fit the Fisherface algorithm:
        1. Fit PCA
        2. Reduce data dimensions with PCA
        3. Fit LDA
        """
        self.pca.fit(...)
        pca_trans = ...
        self.lda.fit( pca_trans,  Y , lda_components )

    def transform(self, X):
        """
        Transform the data by using fitted PCA and LDA
        """
        return ...
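
One reason to run PCA before LDA is that $S_W$ has rank at most $n - C$ ($n$ samples, $C$ classes), so it is singular whenever the number of features exceeds that bound, as it does for raw pixel data. The sketch below checks this on random toy data. PCA is computed via an SVD here purely for brevity, and `within_scatter` is a hypothetical helper, not part of the classes in this notebook.

```python
import numpy as np

def within_scatter(X, y):
    """Sum of per-class scatter matrices (hypothetical helper for this sketch)."""
    n_feats = X.shape[1]
    SW = np.zeros((n_feats, n_feats))
    for label in np.unique(y):
        centered = X[y == label] - X[y == label].mean(axis=0)
        SW += centered.T @ centered
    return SW

rng = np.random.default_rng(0)
n_per_class, n_classes, d = 6, 2, 50
X = rng.normal(size=(n_per_class * n_classes, d))   # 12 samples, 50 features
y = np.repeat(np.arange(n_classes), n_per_class)

# In the original space SW is 50 x 50 but its rank cannot exceed n - C = 10
rank_full = np.linalg.matrix_rank(within_scatter(X, y))

# Reduce to k dimensions with PCA (rows of Vt are the principal directions)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 4
Z = Xc @ Vt[:k].T

# In the reduced space SW is k x k and has full rank, so it can be inverted
rank_reduced = np.linalg.matrix_rank(within_scatter(Z, y))
print(rank_full, d, rank_reduced, k)
```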

Now, let's try out our new classifier based on the Fisherface algorithm.

*Hint*: select the number of LDA components equal to the number of classes

In [None]:
f = FischerFaces()
f.fit(x_train, y_train, ..., ...)

classifier = SVC().fit(f.transform(x_train), y_train)
print(...)

0.7014925373134329


### **3.2 Discussion (0.4 pts)**
Evaluate the results of the Fisherface classifier (LDA). Discuss whether it performs better than the classic Eigenface algorithm (PCA) and, if so, why.

---
#### **Your explanations come here**
---