# Machine Learning Model: Facial Recognition
This notebook demonstrates face recognition using Principal Component Analysis (PCA) and a Support Vector Machine (SVM). We will use the Labeled Faces in the Wild (LFW) dataset.

## Import all the necessary libraries
We import the necessary Python libraries for data handling, visualization, and machine learning.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Load the dataset
We use `fetch_lfw_people` from `sklearn.datasets` to load the Labeled Faces in the Wild (LFW) dataset, which consists of images of faces with labels.

In [None]:
from sklearn.datasets import fetch_lfw_people
# Keep only people with at least 50 images; download the data if it is not cached
faces = fetch_lfw_people(min_faces_per_person=50, download_if_missing=True)

## Explore the Dataset
Let's examine the dataset shape and visualize some sample images.

In [None]:
faces.data.shape

In [None]:
faces.images[9].shape

In [None]:
faces.target_names

In [None]:
faces.target_names.size

In [None]:
np.unique(faces.target)

In [None]:
faces.target_names[0]

In [None]:
plt.imshow(faces.images[0])

In [None]:
fig, ax = plt.subplots(2,4)
for idx, axidx in enumerate(ax.flat):
  axidx.imshow(faces.images[idx], cmap='bone')
  axidx.set(xticks=[], yticks=[], xlabel=faces.target_names[faces.target[idx]])
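
Since the classifier defined later uses `class_weight='balanced'`, it can help to check how many images each person has. The sketch below counts samples per class with `np.bincount`. The `target` and `target_names` arrays are small hypothetical stand-ins for `faces.target` and `faces.target_names`, used so the example runs without downloading the dataset.

```python
import numpy as np

# Hypothetical stand-ins for faces.target and faces.target_names
target = np.array([0, 2, 1, 2, 2, 0, 2, 1, 2])
target_names = np.array(["Person A", "Person B", "Person C"])

# Count how many samples carry each class index
counts = np.bincount(target, minlength=target_names.size)

for name, count in zip(target_names, counts):
    print(f"{name}: {count}")
```

With the real data, the same call would presumably be `np.bincount(faces.target)`.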

## Preprocess the Data
We use Principal Component Analysis (PCA) to reduce each flattened image to 150 whitened components, and chain it with an RBF-kernel Support Vector Classifier (SVC) in a single pipeline.

In [None]:
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

In [None]:
# Define the PCA step: keep 150 components and whiten them
pcaMod = PCA(n_components=150, whiten=True)
# Define the Support Vector Classifier (SVC); it is fitted later, not here
svmMod = SVC(kernel='rbf', class_weight='balanced')
# Chain both steps so PCA is fitted only on the data passed to fit()
mdl = make_pipeline(pcaMod, svmMod)
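
To make the PCA step more concrete, here is a minimal NumPy sketch of projection onto the top principal directions, followed by whitening. It is not scikit-learn's implementation. It runs on synthetic data, and `k = 5` is an arbitrary illustrative choice in place of the 150 components used above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))   # synthetic stand-in: 200 samples, 50 features
k = 5                            # hypothetical number of components to keep

# Center the data, then take the SVD of the centered matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt[:k]              # top-k principal directions, shape (k, 50)
projected = Xc @ components.T    # reduced representation, shape (200, k)

# Whitening divides each component by its standard deviation,
# so every retained component ends up with unit variance
component_std = S[:k] / np.sqrt(X.shape[0] - 1)
whitened = projected / component_std

print(projected.shape)
print(whitened.std(axis=0, ddof=1))
```

Whitening puts all components on the same scale, which tends to suit distance-based kernels such as RBF.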

## Train a Machine Learning Model
We hold out part of the data for testing, tune the SVC hyperparameters by cross-validation on the training split, and use the best pipeline to recognize faces.

### Split the dataset into training and testing sets

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(faces.data, faces.target, test_size=0.2)
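
The split above is random and unseeded, so results will vary between runs. As a hedged sketch of one alternative, `train_test_split` also accepts `random_state` for reproducibility and `stratify` to keep class proportions equal in both splits. The data below is synthetic, and whether stratification is wanted for this notebook is a judgment call.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)    # synthetic features
y = np.array([0] * 80 + [1] * 20)     # imbalanced synthetic labels (80/20)

# stratify=y preserves the 80/20 class ratio in both splits;
# random_state fixes the shuffle so the split is repeatable
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(np.bincount(y_tr), np.bincount(y_te))
```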

### Perform hyperparameter tuning using GridSearchCV

In [None]:
from sklearn.model_selection import GridSearchCV
param_grid = {'svc__C':[1,5,15,30], 'svc__gamma':[0.00001,0.00005,0.0001,0.005]}
grid = GridSearchCV(mdl, param_grid)
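
The `svc__` prefix in the grid keys comes from how `make_pipeline` names its steps: each step is named after its lowercased class name, and parameters are addressed as `<step>__<parameter>`. The small sketch below shows this on a toy pipeline. The name `pipe` and `n_components=2` are illustrative choices, not part of the notebook.

```python
from sklearn.decomposition import PCA
from sklearn.model_selection import ParameterGrid
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

pipe = make_pipeline(PCA(n_components=2), SVC(kernel="rbf"))

# Step names are derived from the lowercased class names
step_names = list(pipe.named_steps)
print(step_names)

# The same grid as in the notebook: 4 values of C times 4 values of gamma
param_grid = {"svc__C": [1, 5, 15, 30],
              "svc__gamma": [0.00001, 0.00005, 0.0001, 0.005]}
n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)

# The same <step>__<parameter> syntax works with set_params
pipe.set_params(svc__C=5, svc__gamma=0.0001)
print(pipe.named_steps["svc"].C)
```

Each candidate is evaluated by cross-validation, so the number of pipeline fits is the number of candidates times the number of folds, plus one final refit on the full training split.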

In [None]:
grid.fit(X_train,y_train)

In [None]:
print(grid.best_params_)  # Display the best parameters found

In [None]:
mdl = grid.best_estimator_

### Make predictions on the test set

In [None]:
y_pred = mdl.predict(X_test)

In [None]:
y_test

In [None]:
y_pred

### Visualize predictions

In [None]:
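fig, ax = plt.subplots(5, 7)
for idx, axidx in enumerate(ax.flat):
  # Each row of X_test is a flattened 62x47 image
  axidx.imshow(X_test[idx].reshape(62, 47), cmap='bone')
  axidx.set(xticks=[], yticks=[])
  # Label with the predicted surname: green if correct, red if wrong
  axidx.set_ylabel(faces.target_names[y_pred[idx]].split()[-1],
                   color='green' if y_pred[idx] == y_test[idx] else 'red')
# Set the figure title once, outside the loop
fig.suptitle('Incorrect predictions are labeled in red', size=16)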


## Evaluate model performance using classification report

In [None]:
from sklearn.metrics import classification_report

In [None]:
print(classification_report(y_test, y_pred, target_names=faces.target_names))
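
To clarify what the report's columns mean, the sketch below computes precision, recall, and F1 for a single class by hand. The label arrays are small made-up examples, and `per_class_scores` is a hypothetical helper, not a scikit-learn function. It assumes the class appears at least once in both the true and the predicted labels.

```python
import numpy as np

# Made-up labels for illustration only
y_true = np.array([0, 0, 1, 1, 1, 2])
y_hat = np.array([0, 1, 1, 1, 2, 2])

def per_class_scores(y_true, y_hat, cls):
    """Precision, recall and F1 for one class (hypothetical helper)."""
    tp = np.sum((y_true == cls) & (y_hat == cls))   # correctly predicted as cls
    precision = tp / np.sum(y_hat == cls)           # of all predicted cls, how many were right
    recall = tp / np.sum(y_true == cls)             # of all true cls, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

for cls in np.unique(y_true):
    p, r, f = per_class_scores(y_true, y_hat, cls)
    print(f"class {cls}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```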

## Plot confusion matrix

In [None]:
from sklearn.metrics import confusion_matrix
mat = confusion_matrix(y_test, y_pred)
# annot=True is required for fmt='d' to take effect and print the integer counts
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=faces.target_names, yticklabels=faces.target_names)
plt.xlabel("True label")
plt.ylabel("Predicted label")
plt.show()
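
As a rough illustration of how to read the matrix, the sketch below builds a confusion matrix by hand on made-up labels, using the same convention as `confusion_matrix` (rows are true labels, columns are predictions, before the transpose used in the plot). Dividing each row by its sum puts per-class recall on the diagonal.

```python
import numpy as np

# Made-up labels for illustration only
y_true = np.array([0, 0, 1, 1, 1, 2])
y_hat = np.array([0, 1, 1, 1, 2, 2])
n_classes = 3

# Entry [i, j] counts samples whose true label is i and predicted label is j
mat = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(mat, (y_true, y_hat), 1)

# Row-normalize so each row sums to 1; the diagonal is then per-class recall
row_normalized = mat / mat.sum(axis=1, keepdims=True)

print(mat)
print(np.diag(row_normalized))
```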