# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

## Learning Objectives


At the end of the experiment, you will be able to:

* extract meaningful features from the images using PCA

* apply SVM on the extracted data and recognize the face

**NOTE:** The intent of this experiment is to understand the SVM parameters and tune your classifier.

In [None]:
#@title Experiment Walkthrough Video
from IPython.display import HTML

HTML("""<video width="854" height="480" controls>
  <source src="https://cdn.talentsprint.com/talentsprint1/archives/sc/misc/svm_facerecognition.mp4" type="video/mp4">
</video>
""")

## Dataset

### Description

The dataset chosen for this experiment is a preprocessed excerpt of “Labeled Faces in the Wild” (LFW).

Labeled Faces in the Wild is a database of face photographs designed for studying the problem of unconstrained face recognition. The dataset contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured, and 1,680 of the people pictured have two or more distinct photos in the dataset. The only constraint on these faces is that they were detected by the Viola-Jones face detector.


To learn more about the dataset, refer to the link below:

http://vis-www.cs.umass.edu/lfw/

## Domain Information


Every face is different, and each face has many features. Some people have a broad forehead and others a narrow one; some have fuller lips and others thinner lips. Each facial feature also varies from person to person. An ideal face recognition system should account for all of these variations and challenges in order to recognize a face accurately.


### Below we have listed a few challenges


**Illumination:** Variations in lighting.

**Background:** The placement of the subject against the background also contributes significantly to the difficulty.

**Pose:** Movements of the head.

**Occlusion:** Beards, mustaches, and accessories (goggles, caps, masks, etc.) also interfere with a face recognition system. The presence of such components varies the subject's appearance, which makes it difficult for the system to operate in a non-simulated environment.

**Expressions:** A change in expression changes all aspects of the face.

All of these make the problem very complex.

## AI/ML Technique

### SVM

SVM stands for support vector machine. It is used for both classification and regression tasks. An SVM works by searching for the optimal linear separating hyperplane (decision boundary). The rationale is that a decision boundary with a large margin handles unseen data better than one with a small margin. When the data is not linearly separable, an SVM maps the original data into a higher-dimensional space using a nonlinear mapping and finds the separating hyperplane there.
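
As a small illustration of the nonlinear mapping mentioned above, the sketch below compares a linear kernel with an RBF kernel on two concentric rings. The synthetic data from `make_circles` and the variable names are assumptions made for this example only; they are not part of the face recognition experiment.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate the two classes
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same classifier, two kernels: linear vs. radial basis function (RBF)
linear_acc = SVC(kernel='linear').fit(X_tr, y_tr).score(X_te, y_te)
rbf_acc = SVC(kernel='rbf').fit(X_tr, y_tr).score(X_te, y_te)

print("linear kernel accuracy: %0.2f" % linear_acc)
print("rbf kernel accuracy:    %0.2f" % rbf_acc)
```

On data like this, the RBF kernel should score clearly higher than the linear kernel, because the rings only become separable after the nonlinear mapping.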


To learn more about SVMs, refer to the link below:

https://www.quantstart.com/articles/Support-Vector-Machines-A-Guide-for-Beginners



As an example of support vector machines in action, let's take a look at the facial recognition problem.

### Importing the required packages

In [None]:
from sklearn.datasets import fetch_lfw_people
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
import seaborn as sns
import matplotlib.pyplot as plt
from time import time

### Loading the dataset from sklearn datasets

In [None]:
faces = fetch_lfw_people(min_faces_per_person = 60)

In [None]:
# Checking for the target names (Label names)
print(faces.target_names)

In [None]:
# Checking for the shape of images
print(faces.images.shape)

To get a sense of the data, let us visualize the faces:

In [None]:
# Plotting the images in subplots
n_row, n_col = 3, 3
plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))

for i in range(n_row * n_col):
    plt.subplot(n_row, n_col, i + 1)
    plt.imshow(faces.images[i], cmap='gray')
    plt.title(faces.target_names[faces.target[i]], size=12)
    plt.xticks(())
    plt.yticks(())

Each image contains 62×47, or nearly 3,000, pixels. We could proceed by simply using each pixel value as a feature, but it is often more effective to use some sort of preprocessor to extract more meaningful features; here we will use principal component analysis (PCA) to extract 150 fundamental components to feed into our support vector machine classifier. We can do this most straightforwardly by packaging the preprocessor and the classifier into a single pipeline.

In PCA, setting the parameter `whiten=True` removes some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of downstream estimators by making their data respect some hard-wired assumptions. Whitening rescales each component of the transformed data to unit variance, which can help scale-sensitive estimators such as an RBF-kernel SVM.
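
The effect of whitening described above can be checked on a small synthetic dataset. This is a minimal sketch: the random data and its column scales are assumptions chosen only to make the difference visible.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Synthetic data whose five columns have very different scales
data = rng.normal(size=(500, 5)) * np.array([10.0, 5.0, 2.0, 1.0, 0.5])

# Project onto the first three components, without and with whitening
plain = PCA(n_components=3).fit_transform(data)
white = PCA(n_components=3, whiten=True).fit_transform(data)

# Sample variance of each projected component
plain_var = plain.var(axis=0, ddof=1)
white_var = white.var(axis=0, ddof=1)

print("without whitening:", plain_var)
print("with whitening:   ", white_var)
```

Without whitening, the component variances decrease from the first component to the last; with whitening, each component has a variance of approximately 1.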

In support vector machines, $C$ is a hyperparameter that determines the penalty for misclassifying a sample. One method for handling imbalanced classes in support vector machines is to weight $C$ by class, so that

$C_j = C \cdot w_j$

where $C$ is the penalty for misclassification, $w_j$ is a weight inversely proportional to class $j$’s frequency, and $C_j$ is the $C$ value for class $j$. The general idea is to increase the penalty for misclassifying minority classes to prevent them from being “overwhelmed” by the majority class.

In scikit-learn, we can set the values of $C_j$ for SVC automatically by setting **class_weight='balanced'**. The balanced argument automatically weights classes such that:

$w_j = \frac{n}{k n_j}$

where $w_j$ is the weight of class $j$, $n$ is the number of samples, $n_j$ is the number of samples in class $j$, and $k$ is the total number of classes.
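
The balanced-weight formula above can be worked through by hand and compared with scikit-learn's `compute_class_weight` helper. The label array below is a hypothetical imbalanced example, not the LFW targets.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 6 samples of class 0, 3 of class 1, 1 of class 2
labels = np.array([0] * 6 + [1] * 3 + [2] * 1)
classes = np.unique(labels)

n = len(labels)            # total number of samples
k = len(classes)           # number of classes
n_j = np.bincount(labels)  # number of samples in each class

# w_j = n / (k * n_j), computed by hand
manual_weights = n / (k * n_j)

# The same weights as computed by scikit-learn
sklearn_weights = compute_class_weight(class_weight='balanced',
                                       classes=classes, y=labels)

print(manual_weights)
print(sklearn_weights)
```

The rarest class receives the largest weight, so misclassifying it is penalized most heavily.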


Note: Refer to [make_pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html) in the scikit-learn documentation.

In [None]:
pca = PCA(n_components=150, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)
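
`make_pipeline` names each step after its lowercased class name, and a step's parameters are addressed as `<step>__<parameter>`. The sketch below rebuilds an equivalent pipeline under the hypothetical name `demo_pipeline` to show where names such as `svc__C` come from.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Same two steps as the pipeline above, rebuilt here so the example is self-contained
demo_pipeline = make_pipeline(PCA(n_components=150, whiten=True, random_state=42),
                              SVC(kernel='rbf', class_weight='balanced'))

# Step names are generated from the class names
step_names = list(demo_pipeline.named_steps)
print(step_names)

# Parameters of the SVC step are exposed with the 'svc__' prefix
svc_params = [name for name in demo_pipeline.get_params() if name.startswith('svc__')]
print('svc__C' in svc_params, 'svc__gamma' in svc_params)
```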

For testing our classifier output, we will split the data into a training and testing set:

In [None]:
Xtrain, Xtest, ytrain, ytest = train_test_split(faces.data, faces.target, random_state=42)

In [None]:
# Checking for the shape of Xtest
Xtest.shape

In [None]:
# Checking for the shape of Xtrain
Xtrain.shape

Finally, we can use a grid search to explore combinations of parameters. Here we will adjust C (which controls the margin hardness) and gamma (which controls the size of the radial basis function kernel), and determine the best model:

In [None]:
# Note: this cell takes some time to run
param_grid = {'svc__C': [1, 5, 10, 50],
              'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model, param_grid)

# Starting the timer
t0 = time()

grid.fit(Xtrain, ytrain)

print("done in %0.3fs" % (time() - t0))
print(grid.best_params_)

The optimal values fall toward the middle of our grid; if they fell at the edges, we would want to expand the grid to make sure we have found the true optimum. 
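
The edge check described above can be automated with a few lines of Python. The helper `params_on_grid_edge` and the example values are hypothetical and serve only as an illustration; they are not part of scikit-learn.

```python
def params_on_grid_edge(best_params, param_grid):
    """Return the names of parameters whose best value is the smallest
    or largest candidate in the grid."""
    on_edge = []
    for name, candidates in param_grid.items():
        ordered = sorted(candidates)
        if best_params[name] in (ordered[0], ordered[-1]):
            on_edge.append(name)
    return on_edge


example_grid = {'svc__C': [1, 5, 10, 50],
                'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
# Made-up result for illustration: C sits at the upper edge of its range
example_best = {'svc__C': 50, 'svc__gamma': 0.001}

print(params_on_grid_edge(example_best, example_grid))
```

In this notebook the helper could be called as `params_on_grid_edge(grid.best_params_, param_grid)`; an empty list means that every optimal value lies strictly inside the grid.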

Now with this model, we can predict the labels for the test data, which the model has not yet seen:

In [None]:
model = grid.best_estimator_
y_pred = model.predict(Xtest)

Let's take a look at a few of the test images along with their predicted values:

In [None]:
# Plotting the images in subplots
n_rows, n_cols = 4, 6
plt.figure(figsize=(8, 6))

for i in range(n_rows * n_cols):
    plt.subplot(n_rows, n_cols, i + 1)
    # Each row of Xtest is a flattened 62x47 image
    plt.imshow(Xtest[i].reshape(62, 47), cmap='gray')
    plt.xticks(())
    plt.yticks(())
    # Show the predicted surname; red marks an incorrect prediction
    plt.ylabel(faces.target_names[y_pred[i]].split()[-1],
               color='black' if y_pred[i] == ytest[i] else 'red')
plt.suptitle('Predicted Names; Incorrect Labels in Red', size=15)
plt.show()

Out of this small sample, our optimal estimator mislabeled only a single face (Bush’s face in the bottom row was mislabeled as Blair). We can get a better sense of our estimator's performance using the classification report, which lists recovery statistics label by label:

In [None]:
print(classification_report(ytest, y_pred,
                            target_names=faces.target_names))
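
To make the per-label statistics in the report concrete, precision and recall for one label can be computed by hand and compared with scikit-learn. The tiny label arrays below are made up for illustration and are unrelated to the LFW predictions.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Made-up true and predicted labels for three classes
y_true_demo = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred_demo = np.array([0, 0, 1, 1, 1, 2, 0, 2])

label = 0
tp = np.sum((y_pred_demo == label) & (y_true_demo == label))  # true positives
fp = np.sum((y_pred_demo == label) & (y_true_demo != label))  # false positives
fn = np.sum((y_pred_demo != label) & (y_true_demo == label))  # false negatives

precision = tp / (tp + fp)  # fraction of predictions of this label that are correct
recall = tp / (tp + fn)     # fraction of this label's samples that were found

# The same statistics from scikit-learn, one value per label
sk_precision = precision_score(y_true_demo, y_pred_demo, average=None)[label]
sk_recall = recall_score(y_true_demo, y_pred_demo, average=None)[label]

print(precision, recall)
```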

We can also display the confusion matrix between these classes:

In [None]:
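from sklearn.metrics import confusion_matrix

# Rows of mat are true labels and columns are predicted labels
mat = confusion_matrix(ytest, y_pred)

# Transpose so that the heatmap shows predicted labels on the y-axis
# and true labels on the x-axis
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=faces.target_names,
            yticklabels=faces.target_names)
plt.xlabel('true label')
plt.ylabel('predicted label');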
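
The orientation of the matrix is easy to mix up, so here is a minimal sketch on made-up labels. `confusion_matrix` places true labels on the rows and predicted labels on the columns, which is why the cell above plots the transpose.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels: one sample of class 0 is predicted as class 1
y_true_cm = [0, 0, 0, 1, 1, 2]
y_pred_cm = [0, 0, 1, 1, 1, 2]

cm = confusion_matrix(y_true_cm, y_pred_cm)
print(cm)

# cm[i, j] counts samples whose true label is i and predicted label is j
misread_0_as_1 = cm[0, 1]

# Per-class recall: correct predictions divided by the row total
per_class_recall = np.diag(cm) / cm.sum(axis=1)
print(per_class_recall)
```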

For a real-world facial recognition task, in which the photos do not come pre-cropped into nice grids, the only difference in the facial classification scheme is the feature selection: you would need to use a more sophisticated algorithm to find the faces, and extract features that are independent of the pixelation.

#### Acknowledgment:  Python Data Science Handbook by Jake VanderPlas