License: MIT

Project Description

The goal of this project is to implement different classifiers for face recognition. The input is a set of face images with their corresponding labels. The data set is divided into a training set, used to train the classifiers, and a testing set, used to evaluate them.

There are two tasks:

  1. Identifying the subject label from a test image.
  2. Classifying a test image as neutral vs. facial expression.

Pipeline

In this project, the following classifiers are implemented for face recognition.

1. Bayes' Classifier: Assuming the underlying class-conditional distributions are Gaussian, estimate their parameters by Maximum Likelihood and then apply Bayes' classification (a minimal sketch is given after this list).

2. k-NN Rule: Implement the k-Nearest Neighbors (k-NN) rule to classify the test data. Vary k to see its effect.

3. Kernel SVM: Implement the Kernel SVM classifier by solving the dual optimization problem. Use the Radial Basis Function (RBF) kernel and the polynomial kernel. Choose the optimal values of σ² and r using cross-validation.

4. Boosted SVM: Implement the AdaBoost algorithm for the class of linear SVMs. Investigate how the boosted classifier improves with the number of AdaBoost iterations, and compare its performance with that of the Kernel SVMs.

5. PCA: Implement Principal Component Analysis (PCA) and apply it to the data before feeding it into the classifiers above.

6. MDA: Similar to PCA, implement Multiple Discriminant Analysis (MDA) followed by the training and application of the classifiers above.
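
Below is a minimal sketch of item 1 (Maximum Likelihood estimation of per-class Gaussians followed by Bayes' classification), assuming NumPy/SciPy; variable names such as X_train, y_train, and X_test are illustrative and not part of the repository.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian_bayes(X_train, y_train):
    """Maximum Likelihood estimate of mean, covariance, and prior for each class."""
    params = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        mu = Xc.mean(axis=0)
        # A small ridge keeps the covariance invertible when samples per class are few.
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X_train.shape[1])
        params[c] = (mu, cov, len(Xc) / len(X_train))
    return params

def predict_gaussian_bayes(params, X_test):
    """Bayes' rule: pick the class with the highest posterior (log-density + log-prior)."""
    classes = list(params)
    log_post = np.stack(
        [multivariate_normal.logpdf(X_test, mean=mu, cov=cov) + np.log(prior)
         for mu, cov, prior in (params[c] for c in classes)],
        axis=1,
    )
    return np.array(classes)[np.argmax(log_post, axis=1)]
```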

Dataset

The dataset consists of 600 images of 200 subjects. Each subject has three images: one neutral face, one face with expression, and one neutral face under different illumination.

Results and Analysis

Face Recognition

1. PCA Analysis:

Figure: cumulative data variance vs. number of eigenvalues (PCA)

The figure above plots cumulative data variance against the number of eigenvalues. About 90% of the data variance is contained in the top 100 eigenvalues (out of 504). Beyond 100 eigenvalues, the variance grows very slowly with each additional eigenvalue; going from 90% to 95% requires roughly 100 more eigenvalues, which is computationally expensive. Hence, the top 100 eigenvectors are kept in the PCA compression, and the data is transformed from 504 dimensions to 100 dimensions.
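
One way such a curve can be computed, as a sketch assuming X is the (n_samples × 504) matrix of flattened face images (a name chosen here, not taken from the repository):

```python
import numpy as np

X_centered = X - X.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]  # sorted descending
cum_var = np.cumsum(eigvals) / eigvals.sum()

print(cum_var[99])                                 # variance captured by the top 100 eigenvalues (~0.90)
n_components = np.searchsorted(cum_var, 0.90) + 1  # smallest number of eigenvalues reaching 90%
```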

2. MDA Analysis:

Figure: cumulative data variance vs. number of eigenvalues (MDA)

The figure above plots cumulative data variance against the number of eigenvalues. About 95% of the data variance is contained in the top 100 eigenvalues (out of 504). Beyond 100 eigenvalues, the variance grows very slowly with each additional eigenvalue; going from 95% to 98% requires roughly 100 more eigenvalues, which is computationally expensive. Hence, the top 100 eigenvectors are kept in the MDA compression, and the data is transformed from 504 dimensions to 100 dimensions.
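
A hedged sketch of how MDA (multi-class Fisher discriminant analysis) can be implemented via the within- and between-class scatter matrices; X and y are assumed names for the data matrix and labels:

```python
import numpy as np

def mda_projection(X, y, n_dims):
    """Project X onto the leading eigenvectors of inv(Sw) @ Sb."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        Sw += (Xc - mu_c).T @ (Xc - mu_c)
        diff = (mu_c - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # A small ridge keeps Sw invertible when samples per class are few.
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw + 1e-6 * np.eye(d)) @ Sb)
    order = np.argsort(-eigvals.real)
    W = eigvecs[:, order[:n_dims]].real
    return X @ W, W
```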

3. Bayes' Classifier:

  • Optimal parameter: top 80 eigenvalues for PCA and MDA
  • Test Accuracy:
    1. PCA : 57.0%
    2. MDA : 53.0%


4. k-NN Rule:

  • Optimal parameter: k = 2
  • Test Accuracy:
    1. PCA : 54.0%
    2. MDA : 52.5%

Figures: k-NN test accuracy vs. k (PCA and MDA)

As the graphs of k-NN with PCA and MDA show, the highest accuracy is achieved for k = 2, which equals the number of training samples per class. Increasing k further has little impact on accuracy, and accuracy decreases once k goes beyond 6. Also note that in this experiment, if k is even and the neighbors are split equally between two classes, the classification falls back to the 1-NN rule.
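
A minimal sketch of the k-NN rule with the tie-breaking behaviour described above; function and variable names are illustrative and not taken from the repository:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    """Classify a single sample x; if an even k produces a tie, fall back to 1-NN."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[nearest]).most_common(2)
    if len(votes) > 1 and votes[0][1] == votes[1][1]:  # two classes with equal votes
        return y_train[nearest[0]]                      # 1-NN decision
    return votes[0][0]
```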

Neutral vs Expression Classification

1. PCA Analysis:

Figure: cumulative data variance vs. number of eigenvalues (PCA, neutral vs. expression)

The figure above plots cumulative data variance against the number of eigenvalues. About 92% of the data variance is contained in the top 100 eigenvalues (out of 504). Beyond 100 eigenvalues, the variance grows very slowly with each additional eigenvalue; going from 92% to 95% requires roughly 100 more eigenvalues, which is computationally expensive. Hence, the top 100 eigenvectors are kept in the PCA compression, and the data is transformed from 504 dimensions to 100 dimensions.

2. MDA analysis:

Figure: cumulative data variance vs. number of eigenvalues (MDA, neutral vs. expression)

The figure above plots cumulative data variance against the number of eigenvalues. Nearly 100% of the data variance is contained in the top eigenvalue (out of 504); after the first eigenvalue, the rate of change of variance is almost zero. At first this may seem surprising, but it follows from the fact that MDA for C classes yields at most C − 1 discriminant directions: with only two classes and 360 data points, MDA finds a single direction along which the images are almost 100% separable.

3. Bayes' Classifier:

  • Optimal Parameters
    1. Top 60 eigenvalues in PCA analysis
    2. Top 60 eigenvalues in MDA analysis
    3. 90% of the data in training and 10% in testing
  • Test Accuracy
    1. PCA – 92.5%
    2. MDA – 77.5%

Figures: Bayes' classifier test accuracy vs. number of eigenvalues (PCA, MDA) and vs. train/test split (PCA, MDA)

The first two graphs show how test accuracy changes with the number of eigenvalues in PCA and MDA. Since there is a large number of data points for both classes, the data can be compressed to a relatively low dimension, and maximum test accuracy is reached with the top 60 eigenvalues. The third and fourth graphs show how test accuracy changes with the training/testing split. Test accuracy is almost independent of this ratio; however, with more data in training the model can generalize better to future data. Hence, 90% of the data is used for training and 10% for testing.
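
One possible way to produce the split-ratio curves above is to sweep the test fraction and record accuracy; train_test_split and the names X_proj and y are stand-ins, and fit_gaussian_bayes/predict_gaussian_bayes refer to the sketch after the Pipeline list:

```python
from sklearn.model_selection import train_test_split

for test_frac in [0.5, 0.4, 0.3, 0.2, 0.1]:
    X_tr, X_te, y_tr, y_te = train_test_split(X_proj, y, test_size=test_frac, stratify=y)
    params = fit_gaussian_bayes(X_tr, y_tr)
    acc = (predict_gaussian_bayes(params, X_te) == y_te).mean()
    print(f"test fraction {test_frac}: accuracy {acc:.3f}")
```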

4. k-NN rule:

  • Optimal Parameters
    1. Top 60 eigenvalues in PCA analysis
    2. Top 60 eigenvalues in MDA analysis
    3. 90% of the data in training and 10% in testing
    4. k = 5
  • Test Accuracy
    1. PCA - 50.0%
    2. MDA - 47.5%


5. Support Vector Machine with RBF Kernel:

  • Optimal Parameters

    1. Top 60 eigenvalues in PCA analysis
    2. Top 60 eigenvalues in MDA analysis
    3. 80% of the data in training and 20% in testing
    4. σ² = 3
  • Test Accuracy:

    1. PCA - 83.75%
    2. MDA - 77.50%

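For reference, a hedged sketch of how σ² could be selected by cross-validation using scikit-learn's SVC; the project solves the dual problem directly, and names such as X_train_pca are illustrative. Note that SVC parameterizes the RBF kernel by gamma = 1 / (2σ²).

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

sigma2_grid = [0.5, 1, 2, 3, 5, 10]
param_grid = {"gamma": [1.0 / (2 * s2) for s2 in sigma2_grid], "C": [1, 10, 100]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_train_pca, y_train)
print(search.best_params_, search.score(X_test_pca, y_test))
```

The polynomial-kernel case in the next item is analogous, with kernel="poly" and the degree r in place of gamma.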

6. Support Vector Machine with Polynomial Kernel:

  • Optimal Parameters

    1. Top 60 eigenvalues in PCA analysis
    2. Top 60 eigenvalues in MDA analysis
    3. 80% of the data in training and 20% in testing
    4. r = 2
  • Test Accuracy:

    1. PCA - 70.0%
    2. MDA - 71.25%


7. Boosted SVM:

  • Optimal Parameters

    1. Top 60 eigenvalues in PCA analysis
    2. Top 60 eigenvalues in MDA analysis
    3. 90% of the data in training and 10% in testing
    4. Number of iterations: n = 8
  • Test Accuracy:

    1. PCA - 94.5%
    2. MDA - 93.0%

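As an illustration, boosting linear SVMs with AdaBoost can also be expressed with scikit-learn; the project implements AdaBoost from scratch, names like X_train_mda are assumptions, and the estimator argument name assumes scikit-learn >= 1.2.

```python
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

# SAMME only needs hard predictions and sample_weight support, which SVC provides.
booster = AdaBoostClassifier(estimator=SVC(kernel="linear"), n_estimators=8, algorithm="SAMME")
booster.fit(X_train_mda, y_train)
print(booster.score(X_test_mda, y_test))
```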

Requirements

Python 2.0 or above
