[![Fixel Algorithms](https://i.imgur.com/AqKHVZ0.png)](https://fixelalgorithms.gitlab.io/)

# AI Program

## Machine Learning - UnSupervised Learning - Dimensionality Reduction - Principal Component Analysis (PCA)

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        |Content / Changes                                                   |
|---------|------------|-------------|--------------------------------------------------------------------|
| 1.0.000 | 13/04/2024 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/AIProgram/2024_02/0063DimensionalityReductionPCA.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

# Miscellaneous
import math
import os
from platform import python_version
import random
import timeit

# Typing
from typing import Callable, Dict, List, Optional, Self, Set, Tuple, Union

# Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter
from IPython import get_ipython
from IPython.display import Image
from IPython.display import display
from ipywidgets import Dropdown, FloatSlider, interact, IntSlider, Layout, SelectionSlider

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

 ```python
 valToFill = ???
 ```

 - Multi Line to Fill (At least one)

 ```python
 # You need to start writing
 ????
 ```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

???
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# Matplotlib default color palette
lMatPltLibclr = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
# sns.set_theme() #<! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())


In [None]:
# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2


In [None]:
# Courses Packages

from DataVisualization import PlotMnistImages


In [None]:
# General Auxiliary Functions

hOrdinalNum = lambda n: '%d%s' % (n, 'tsnrhtdd'[(((math.floor(n / 10) %10) != 1) * ((n % 10) < 4) * (n % 10))::4])

def PlotPcaReconstruction( mX: np.ndarray, dataIdx: int, mU: np.ndarray, vMean: np.ndarray, numComp:int, vSize: np.ndarray, hA: Optional[plt.Axes] = None, figSize: Tuple[int, int] = FIG_SIZE_DEF ) -> plt.Axes:

    if hA is None:
        hF, hA = plt.subplots(nrows = 1, ncols = 3, figsize = figSize)
    else:
        hF = hA.get_figure()

    vX = mX[dataIdx, :]

    if numComp == 0:
        vZ    = [0]
        vHatX = vMean
    else:
        vZ    = mU[:numComp]   @ (vX - vMean) #<! Encode
        vHatX = (mU[:numComp].T @  vZ) + vMean  #<! Decode
        
    mI   = np.clip(np.reshape(vX, vSize), 0, 1)
    mRec = np.clip(np.reshape(vHatX, vSize), 0, 1)

    hA[0].imshow(mI, cmap = 'gray');
    hA[0].set_title('Original Image')

    hA[1].imshow(mRec, cmap = 'gray');
    hA[1].set_title(f'Reconstructed Image, # Components: {numComp}')

    hA[2].stem(vZ, markerfmt = 'b.', label = 'Coefficients')
    hA[2].set_xlabel('Principal Component')
    hA[2].set_ylabel('Coefficient Value')

    return hA



## Dimensionality Reduction by PCA

In this notebook we'll use the PCA approach for dimensionality reduction.

This notebook introduces:

1. The [Olivetti Faces Data Set from AT&T](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html).
2. Showing the PCA spectrum.
3. Showing the PCA reconstruction (Eigen Faces).

### Eigen Faces

One of the first successful approaches to face recognition is the concept of [_Eigenface_](https://en.wikipedia.org/wiki/Eigenface).   
Given enough data (images) of each subject, we build the PCA of the subject's face images.  
We use those as a means to recognize the person.

* <font color='brown'>(**#**)</font> PCA is the most basic dimensionality reduction operator.
* <font color='brown'>(**#**)</font> The PCA output is a linear combination of the input.
* <font color='brown'>(**#**)</font> Conceptually we may think of Dimensionality Reduction as a _soft_ feature selection / mixture.

In [None]:
# Parameters

# Data
tImgSize = (64, 64)
numRows = 3
numCols = 3

# Model


## Generate / Load Data

In this notebook we'll use the [Olivetti Faces Data Set from AT&T](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html).  
The data set is available on SciKit Learn using [`fetch_olivetti_faces()`](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html). 

The data set is organized like MNIST: each row is a flattened image.  
The size of the images is `(64, 64)` and there are 40 classes.

There are ten different images of each of 40 distinct subjects.  
For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses).  
All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement).


In [None]:
# Load Data

mX, vY   = fetch_olivetti_faces(return_X_y = True)

print(f'The features data shape: {mX.shape}')
print(f'The features data type: {mX.dtype}')


* <font color='red'>(**?**)</font> Do we need to scale the data?

### Plot Data

In [None]:
# Plot the Data

hF = PlotMnistImages(mX, vY, numRows, numCols, tuImgSize = tImgSize)

## Applying Dimensionality Reduction - PCA 

The PCA method can be interpreted as fitting a Gaussian distribution, whose level sets are ellipsoids, to the data.  
It decomposes the ellipsoid into its principal axes, each along a direction orthogonal to all the others.  
The directions are sorted by the variance of the data along each of them, in descending order.
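
The description above can be checked numerically on a small example. The following is a minimal sketch, assuming synthetic 2D Gaussian data (the covariance `mC` and all variable names are illustrative and not part of the notebook's data). It computes the principal directions by an eigen decomposition of the sample covariance matrix and verifies that the variance of the data along each direction matches the corresponding eigen value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated 2D Gaussian data (Assumption, for illustration only)
mC    = np.array([[3.0, 1.2], [1.2, 1.0]])
mData = rng.multivariate_normal(mean = [1.0, -2.0], cov = mC, size = 2000)

# Center the data and build the sample covariance matrix
vMu    = np.mean(mData, axis = 0)
mDataC = mData - vMu
mCov   = (mDataC.T @ mDataC) / (mData.shape[0] - 1)

# Eigen decomposition of a symmetric matrix (Eigen values in ascending order)
vEigVal, mEigVec = np.linalg.eigh(mCov)

# Sort the directions by variance in descending order
vOrder  = np.argsort(vEigVal)[::-1]
vEigVal = vEigVal[vOrder]
mEigVec = mEigVec[:, vOrder] #<! Each column is a principal direction

# Variance of the data projected onto each direction
mProj    = mDataC @ mEigVec
vVarProj = np.var(mProj, axis = 0, ddof = 1)

print(f'Eigen values       : {vEigVal}')
print(f'Projected variances: {vVarProj}')
```

The columns of `mEigVec` are orthonormal, so each direction is orthogonal to the others, and the square roots of the eigen values are proportional to the lengths of the ellipsoid axes.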


In [None]:
# Applying the PCA Model
numComp    = min(mX.shape)
numSamples = mX.shape[0]

oPCA = PCA(n_components = numComp) #<! Calculate all the components of the data (The default)
oPCA = oPCA.fit(mX)


### Plot the Mean Image  

The PCA works on centered data.  
Hence the mean image is kept aside for the reconstruction.

In [None]:
# Plot the Mean Image

hF = PlotMnistImages(np.atleast_2d(oPCA.mean_), np.array(['Mean Image']), 1, 1, tuImgSize = tImgSize)

In [None]:
# Plot the PCA Spectrum

vλ = oPCA.explained_variance_ #<! The eigen values of the covariance matrix

hF, hA = plt.subplots(figsize = (12, 6))
hA.stem(np.sqrt(vλ[:200]), markerfmt = 'b.', label = r'$\sqrt{\lambda_i}$')
hA.set_title('Square Root of the Eigen Values')
hA.set_xlabel('$i$')
hA.legend()

plt.show()

In [None]:
# Plot the Energy Ratio

vλ = oPCA.explained_variance_ratio_

hF, hA = plt.subplots(figsize = (12, 6))
hA.stem(vλ, markerfmt = 'b.', label = 'Ratio')
hA.set_title('Variance Ratio')
hA.set_xlabel('Component Index')
hA.legend()

plt.show()

* <font color='brown'>(**#**)</font> Look at the rate at which the explained energy accumulates.
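
One way to quantify this rate is the cumulative sum of the variance ratios. The following is a minimal sketch, assuming a hypothetical, quickly decaying spectrum `vEigVal` as a stand in for the eigen values of the fitted model; the threshold `energyThr` is also an illustrative choice.

```python
import numpy as np

# Hypothetical eigen value spectrum with a fast decay (Assumption, for illustration only)
vEigVal = 1.0 / (np.arange(1, 101) ** 2)

# Ratio of the variance explained by each component and its accumulation
vRatio    = vEigVal / np.sum(vEigVal)
vCumRatio = np.cumsum(vRatio)

# Smallest number of components which explains at least `energyThr` of the variance
energyThr  = 0.95
numCompThr = int(np.searchsorted(vCumRatio, energyThr)) + 1

print(f'Number of components to explain {energyThr:0.0%} of the variance: {numCompThr}')
```

In the notebook the same calculation could be applied to `oPCA.explained_variance_ratio_` instead of the synthetic `vRatio`.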

In [None]:
# Plot the Components

mU = oPCA.components_ #<! mU.shape = (n_components, n_features)

hF, hAs = plt.subplots(nrows = 2, ncols = 5, figsize = (12, 6))
vIdx    = list(range(5)) + list(range(numComp - 5, numComp))
for kk, hA in zip(range(10), hAs.flat):
    idx = vIdx[kk]
    mI  = np.reshape(mU[idx], tImgSize)
    hA.imshow(mI)
    hA.set_title(f'{hOrdinalNum(idx + 1)} Principal Component')
    
hF.tight_layout()
plt.show()

## PCA Reconstruction

* Encode:
$$\boldsymbol{z}_{i}=\boldsymbol{U}_{d}^{T}\left(\boldsymbol{x}_{i}-\boldsymbol{\mu}_{x}\right)$$  

* Decode:
$$\hat{\boldsymbol{x}}_{i}=\boldsymbol{U}_{d}\boldsymbol{z}_{i}+\boldsymbol{\mu}_{x}$$
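
The two formulas can be sketched with plain NumPy. The example below is a minimal sketch under stated assumptions: the data is synthetic, the principal directions are computed by an SVD of the centered data, and the helper functions `Encode()` and `Decode()` are hypothetical names. As in `oPCA.components_`, the directions are stored as rows, so the rows `mVt[:d]` play the role of $\boldsymbol{U}_{d}^{T}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated data (Assumption, for illustration only)
numSamplesDemo, numFeatures = 50, 8
mMix  = rng.standard_normal((numFeatures, numFeatures))
mData = rng.standard_normal((numSamplesDemo, numFeatures)) @ mMix

# Principal directions by SVD of the centered data (Rows of `mVt`)
vMu       = np.mean(mData, axis = 0)
_, _, mVt = np.linalg.svd(mData - vMu, full_matrices = False)

def Encode( vX: np.ndarray, mVt: np.ndarray, vMu: np.ndarray, d: int ) -> np.ndarray:
    # z = U_d^T (x - mu)
    return mVt[:d] @ (vX - vMu)

def Decode( vZ: np.ndarray, mVt: np.ndarray, vMu: np.ndarray ) -> np.ndarray:
    # x_hat = U_d z + mu
    return (mVt[:len(vZ)].T @ vZ) + vMu

# Reconstruction error of a single sample as a function of the number of components
vX   = mData[0]
lErr = []
for d in range(1, numFeatures + 1):
    vZ    = Encode(vX, mVt, vMu, d)
    vHatX = Decode(vZ, mVt, vMu)
    lErr.append(np.linalg.norm(vX - vHatX))

print(lErr)
```

The reconstruction error does not increase as components are added, and it vanishes (up to numerical precision) once all the directions are used.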

In [None]:
# Plotting Function Wrapper
hPlotPcaReconstruction = lambda dataIdx, numComponents: PlotPcaReconstruction(mX, dataIdx, mU, oPCA.mean_, numComponents, tImgSize, figSize = (14, 4))

In [None]:
# Interactive Visualization 
dataIdxSlider = IntSlider(min = 0, max = numSamples - 1, step = 1, value = 0, layout = Layout(width = '30%'))
numComponentsSlider = IntSlider(min = 0, max = numComp, step = 1, value = 0, layout = Layout(width = '30%'))

interact(hPlotPcaReconstruction, dataIdx = dataIdxSlider, numComponents = numComponentsSlider)

plt.show()

* <font color='red'>(**?**)</font> Describe how the actual recognition of a given face is done.
* <font color='green'>(**@**)</font> Remove one image from each class. Then build a recognition system based on all other images. Show the success rate.

<!-- Given the data base, each image in the data base has its own fingerprint: $\boldsymbol{z}_{i}$.  
Then, for a new image:

1. Calculate its encoding using the components: $\boldsymbol{z}_{new} = \boldsymbol{U}_{d}^{T} \left( \boldsymbol{x}_{new} - \boldsymbol{\mu}_{x} \right)$.
2. Calculate the distance to each existing fingerprint, ${d}_{i} = \left\| \boldsymbol{z}_{new} - \boldsymbol{z}_{i} \right\|$, and find the closest one: $j = \arg \min_{i} {d}_{i}$.
3. If ${d}_{j} \leq {\varepsilon}_{1}$ for a given threshold ${\varepsilon}_{1}$ then the face is recognized as the $j$ -th face in the data base.
4. If ${\varepsilon}_{1} < {d}_{j} \leq {\varepsilon}_{2}$ for a given threshold ${\varepsilon}_{2}$ then the image is a face which is not in the data base, yet it can be added.
5. If ${\varepsilon}_{2} < {d}_{j}$ then the image is not considered a face image. -->