**Importing the required libraries**

In [1]:
from time import time
import logging
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import fetch_lfw_people
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.decomposition import PCA
from sklearn.svm import SVC

print(__doc__)

# Display progress logs on stdout
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')

Automatically created module for IPython interactive environment


**Dataset Import and Processing**

The dataset, lfw_people (Labeled Faces in the Wild), is loaded with fetch_lfw_people from sklearn.datasets. The label to predict is the identity of the person in each image (face recognition).

In [2]:
# Downloading and loading the dataset as numpy arrays
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)


# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape

# for machine learning we use the flattened image data directly (relative pixel position info is ignored by this model)
X = lfw_people.data
n_features = X.shape[1]

# the label to predict is the id of the person
y = lfw_people.target
target_names = lfw_people.target_names
n_classes = target_names.shape[0]

print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)

Total dataset size:
n_samples: 1288
n_features: 1850
n_classes: 7


**Splitting the data into a training set and a test set (test_size of 25%) and printing the shapes of the resulting input arrays X**

In [3]:
# split into a training and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

print(X_test.shape)
print(X_train.shape)

(322, 1850)
(966, 1850)
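
The split above is purely random, so the class proportions in the two subsets can drift apart when some people have few images. As a minimal sketch of one alternative, the block below passes `stratify` to `train_test_split`. The arrays `X_demo` and `y_demo` are hypothetical synthetic stand-ins (same sample count and number of classes as the notebook, made-up class frequencies), not the LFW data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the LFW arrays: 1288 samples, 7 imbalanced classes.
rng = np.random.default_rng(0)
y_demo = rng.choice(7, size=1288, p=[0.06, 0.18, 0.09, 0.41, 0.08, 0.06, 0.12])
X_demo = rng.normal(size=(1288, 20))

# stratify=y_demo keeps each class's share roughly equal in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

train_share = np.bincount(y_tr, minlength=7) / len(y_tr)
test_share = np.bincount(y_te, minlength=7) / len(y_te)
print(X_tr.shape, X_te.shape)
print(np.round(train_share, 3))
print(np.round(test_share, 3))
```

With 1288 samples and `test_size=0.25`, the subsets have 966 and 322 rows, matching the shapes printed by the notebook.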


**Feature extraction with PCA ahead of training the learner (Multilayer Perceptron). Note: when PCA was not applied, the MLP produced precision and recall results only for George W Bush.**

In [4]:
#Feature extraction / dimensionality reduction

n_components = 150

print("Extracting the top %d eigenfaces from %d faces"
      % (n_components, X_train.shape[0]))
t0 = time()

pca = PCA(n_components=n_components, svd_solver='randomized',
          whiten=True).fit(X_train)

print("done in %0.3fs" % (time() - t0))

eigenfaces = pca.components_.reshape((n_components, h, w))

print("Projecting the input data on the eigenfaces orthonormal basis")

t0 = time()

X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)

print("done in %0.3fs" % (time() - t0))

Extracting the top 150 eigenfaces from 966 faces
done in 0.251s
Projecting the input data on the eigenfaces orthonormal basis
done in 0.041s
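
Two properties of the PCA step are worth checking: how much variance the retained components keep, and what `whiten=True` does to the projected features. The sketch below illustrates both on hypothetical synthetic data (`X_demo`), using the default SVD solver rather than the notebook's randomized one, so the numbers do not describe the LFW run.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for X_train: 200 samples, 50 correlated features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
X_demo = latent @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(200, 50))

pca_demo = PCA(n_components=10, whiten=True).fit(X_demo)

# Cumulative share of the total variance kept by the first k components.
cumulative = np.cumsum(pca_demo.explained_variance_ratio_)
print(np.round(cumulative, 3))

# With whiten=True, each projected training feature has unit sample variance.
Z = pca_demo.transform(X_demo)
component_std = Z.std(axis=0, ddof=1)
print(np.round(component_std, 3))
```

Applied to the notebook's `pca` object, the same `np.cumsum(pca.explained_variance_ratio_)` call would show how much of the pixel variance the 150 eigenfaces retain.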


**Printing the shapes of the PCA-reduced input arrays X**

In [5]:
print(X_test_pca.shape)

print(X_train_pca.shape)

(322, 150)
(966, 150)


**Setting parameters for the learner (Multilayer Perceptron - MLP) and fitting the model**

In [6]:
# Import the neural network module
import sklearn.neural_network as nn

t = time()    # start a timer to measure training time

# The hidden layers are set to (15, 10), chosen so that 15 * 10 equals the number of PCA inputs (150).
mlp_model = nn.MLPClassifier(hidden_layer_sizes=(15, 10), max_iter=2000)

mlp_model.fit(X_train_pca, y_train)

print("done in %0.3fs" % (time() - t))

done in 2.257s
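
To give a sense of how small this network is, the following sketch counts the trainable parameters of a fully connected network with 150 inputs, hidden layers of 15 and 10 units, and 7 outputs. The helper `mlp_param_count` is a hypothetical name introduced here, and one output unit per class is an assumption about the multiclass output layer.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the units per layer, input first and output last.
    """
    weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases


# 150 PCA inputs -> 15 -> 10 hidden units -> 7 output classes
total = mlp_param_count([150, 15, 10, 7])
print(total)  # 2502
```

Most of the parameters (150 * 15 = 2250 weights) sit in the first layer, which is why reducing the input from 1850 pixels to 150 components shrinks the model so much.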


**Quantitative evaluation of the model quality on the test set**

In [7]:
print("Predicting people's names on the test set")

t0 = time()

y_pred = mlp_model.predict(X_test_pca)

print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))

Predicting people's names on the test set
done in 0.003s
                   precision    recall  f1-score   support

     Ariel Sharon       0.71      0.38      0.50        13
     Colin Powell       0.77      0.73      0.75        60
  Donald Rumsfeld       0.54      0.56      0.55        27
    George W Bush       0.83      0.84      0.83       146
Gerhard Schroeder       0.42      0.52      0.46        25
      Hugo Chavez       0.47      0.47      0.47        15
       Tony Blair       0.66      0.64      0.65        36

         accuracy                           0.71       322
        macro avg       0.63      0.59      0.60       322
     weighted avg       0.72      0.71      0.71       322

[[  5   0   3   2   1   0   2]
 [  0  44   5   5   3   1   2]
 [  1   2  15   5   1   0   3]
 [  0   7   3 123   7   2   4]
 [  0   2   0   5  13   5   0]
 [  0   0   0   1   6   7   1]
 [  1   2   2   8   0   0  23]]
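
The figures in the classification report can be recomputed from the confusion matrix above: accuracy is the sum of the diagonal divided by the total, recall divides each diagonal entry by its row sum, and precision divides it by its column sum. A minimal sketch with the matrix copied from the output:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted).
cm = np.array([
    [5, 0, 3, 2, 1, 0, 2],
    [0, 44, 5, 5, 3, 1, 2],
    [1, 2, 15, 5, 1, 0, 3],
    [0, 7, 3, 123, 7, 2, 4],
    [0, 2, 0, 5, 13, 5, 0],
    [0, 0, 0, 1, 6, 7, 1],
    [1, 2, 2, 8, 0, 0, 23],
])

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)       # per true class (row sums)
precision = np.diag(cm) / cm.sum(axis=0)    # per predicted class (column sums)

print(round(accuracy, 2))       # 0.71
print(np.round(recall, 2))
print(np.round(precision, 2))
```

Here 230 of the 322 test images lie on the diagonal, which gives the 0.71 accuracy shown in the report.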


> **Result: With the chosen parameters in the Multilayer Perceptron, we have an accuracy of 71%.**

**Finding the optimal hidden_layer_sizes with a hyperparameter grid search**

In [8]:
t0 = time()

param_grid = {'hidden_layer_sizes': [(15, 10), (20, 15), (30, 20)]}

mlp_model = GridSearchCV(
    nn.MLPClassifier(max_iter=2000), param_grid
)

mlp_model.fit(X_train_pca, y_train)

print("done in %0.3fs" % (time() - t0))

print(mlp_model.best_estimator_)

done in 32.071s
MLPClassifier(hidden_layer_sizes=(20, 15), max_iter=2000)
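
`GridSearchCV` picks the best candidate by its mean cross-validated score on the training data, not by test-set accuracy, and then refits that candidate on the full training set. The sketch below shows how the per-candidate scores could be inspected. It runs on hypothetical synthetic data (`X_demo`, `y_demo`) with a fixed `random_state` and `cv=3`, both assumptions added here for speed and repeatability, so its scores say nothing about the LFW run.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in for (X_train_pca, y_train).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)

candidates = [(15, 10), (20, 15), (30, 20)]
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    {"hidden_layer_sizes": candidates},
    cv=3,
)
search.fit(X_demo, y_demo)

# One mean cross-validated accuracy per candidate, in grid order.
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(params["hidden_layer_sizes"], round(score, 3))

print("best:", search.best_params_["hidden_layer_sizes"])
```

In the notebook, the same loop over `mlp_model.cv_results_` would show how close the three candidates were to each other.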


**Quantitative evaluation of the tuned model on the test set**

In [9]:
print("Predicting people's names on the test set")

t0 = time()
y_pred = mlp_model.predict(X_test_pca)

print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))

Predicting people's names on the test set
done in 0.003s
                   precision    recall  f1-score   support

     Ariel Sharon       0.45      0.38      0.42        13
     Colin Powell       0.70      0.82      0.75        60
  Donald Rumsfeld       0.75      0.78      0.76        27
    George W Bush       0.88      0.84      0.86       146
Gerhard Schroeder       0.62      0.72      0.67        25
      Hugo Chavez       0.58      0.47      0.52        15
       Tony Blair       0.71      0.67      0.69        36

         accuracy                           0.76       322
        macro avg       0.67      0.67      0.67       322
     weighted avg       0.77      0.76      0.76       322

[[  5   1   2   3   1   0   1]
 [  3  49   0   5   0   1   2]
 [  3   1  21   1   0   0   1]
 [  0  10   4 122   4   2   4]
 [  0   2   0   1  18   2   2]
 [  0   3   0   2   3   7   0]
 [  0   4   1   4   3   0  24]]


> **After the grid search, hidden_layer_sizes changed from (15, 10) to (20, 15), the value reported by best_estimator_, and the test accuracy rose from 71% to 76%. Among the three candidates tried, (20, 15) is therefore the best setting for hidden_layer_sizes.**


**Comparison of results: MLP and Random Forest**

**Comparing Results without PCA:**
Random forest produced an accuracy of 63% without the application of PCA on the dataset. For the MLP, when PCA was not applied, the model produced precision and recall results only for George W Bush.
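
One possible reading of the MLP result without PCA is that the model predicted George W Bush for every image. This is an assumption, since that run is not shown here. Under that assumption, the accuracy would equal the share of the majority class in the test set, which the sketch below computes from the support column of the classification reports.

```python
# Test-set support per class, taken from the classification reports above.
support = {
    "Ariel Sharon": 13, "Colin Powell": 60, "Donald Rumsfeld": 27,
    "George W Bush": 146, "Gerhard Schroeder": 25, "Hugo Chavez": 15,
    "Tony Blair": 36,
}

majority = max(support, key=support.get)
n_test = sum(support.values())

# Always predicting the majority class is correct exactly for that class's images.
baseline_accuracy = support[majority] / n_test
print(majority, round(baseline_accuracy, 3))  # George W Bush 0.453
```

This majority-class baseline of about 45% is a useful floor when judging the accuracies reported for the other models.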

**Comparing Results with PCA:**
With PCA applied to both models, Random Forest had an accuracy of 57%, while the MLP reached 71%. After the hyperparameter grid search, the MLP's accuracy increased to 76%.
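
The effect of the grid search on the MLP can be broken down per person using the diagonals of the two confusion matrices printed in the evaluation cells. The sketch below copies those diagonals by hand and lists the change in correctly classified test images for each class.

```python
target_names = [
    "Ariel Sharon", "Colin Powell", "Donald Rumsfeld", "George W Bush",
    "Gerhard Schroeder", "Hugo Chavez", "Tony Blair",
]

# Diagonals (correct predictions per class) of the two confusion matrices.
correct_before = [5, 44, 15, 123, 13, 7, 23]   # hidden_layer_sizes=(15, 10)
correct_after = [5, 49, 21, 122, 18, 7, 24]    # model returned by GridSearchCV
n_test = 322

for name, before, after in zip(target_names, correct_before, correct_after):
    print(f"{name:<18} {before:>4} -> {after:>4} ({after - before:+d})")

gain = sum(correct_after) - sum(correct_before)
print(gain, round(sum(correct_before) / n_test, 2), round(sum(correct_after) / n_test, 2))
# 16 0.71 0.76
```

The 16 additional correct predictions come almost entirely from Colin Powell, Donald Rumsfeld and Gerhard Schroeder, while Ariel Sharon and Hugo Chavez are unchanged.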

**Conclusion:**
Overall, this shows that the MLP performed better than Random Forest at face recognition on this dataset.