# Topics

## 1. ANN on Facial Recognition


In [1]:
%matplotlib inline
# All imports

import numpy as np
import matplotlib.pyplot as plt

import sklearn
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

# -------------> The following is a new import <------------------
from time import time

np.set_printoptions(formatter={'float': '{:.5f}'.format})


## The Confusion Matrix:  Evaluate the accuracy of a classification.

## Definition: a confusion matrix C is such that C\_{i, j} is equal to the number of observations known to be in group i but predicted to be in group j.

From: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html


## Each row should add up to the number of objects in the corresponding true class.

In [2]:
'''
3 dogs: all correctly classified

3 cats: 1 is incorrectly classified as dog, 2 classified correctly

1 racoon: incorrectly classified as cat

'''

y_true = ["cat", "dog", "cat", "cat", "dog", "racoon", "dog"]
y_pred = ["dog", "dog", "cat", "cat", "dog", "cat", "dog"]
confusion_matrix(y_true, y_pred, labels=["dog", "racoon", "cat"])

array([[3, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
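The row-sum property stated above can be checked directly. The following is a minimal sketch that reuses the toy labels from the previous cell; the variable names `row_sums`, `true_counts`, and `accuracy` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = ["cat", "dog", "cat", "cat", "dog", "racoon", "dog"]
y_pred = ["dog", "dog", "cat", "cat", "dog", "cat", "dog"]
labels = ["dog", "racoon", "cat"]

C = confusion_matrix(y_true, y_pred, labels=labels)

# Row i sums to the number of observations whose true class is labels[i].
row_sums = C.sum(axis=1)
true_counts = np.array([y_true.count(name) for name in labels])
print(row_sums, true_counts)

# The diagonal holds the correctly classified observations,
# so overall accuracy is trace / total.
accuracy = np.trace(C) / C.sum()
print("accuracy: {:.3f}".format(accuracy))  # 5 of 7 correct -> 0.714
```

Column sums, by contrast, give the number of times each class was predicted, which need not match the true counts.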

In [4]:
from sklearn.datasets import fetch_lfw_people

## Applying ANN to Facial Recognition

In [7]:
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

ValueError: min_faces_per_person=70 is too restrictive

## Mini-breakout
## Determine how to extract information from the object:

      lfw_people

In [11]:
print(type(lfw_people))

<class 'sklearn.utils.Bunch'>


### Next several cells: Solution to mini-breakout.

In [12]:
dir(lfw_people)

['DESCR', 'data', 'images', 'target', 'target_names']

In [13]:
lfw_people.keys()

dict_keys(['data', 'images', 'target', 'target_names', 'DESCR'])
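A `Bunch` behaves like a dictionary whose keys are also exposed as attributes, which is why `dir()` and `.keys()` list the same names. The sketch below illustrates this on a small hypothetical `Bunch` (the object `toy` and its contents are made up for illustration, so no download is needed); it is not the real `lfw_people` data.

```python
import numpy as np
from sklearn.utils import Bunch

# Hypothetical toy Bunch standing in for lfw_people (assumption: same field names).
toy = Bunch(
    data=np.zeros((4, 6)),
    images=np.zeros((4, 2, 3)),
    target=np.array([0, 1, 0, 1]),
    target_names=np.array(["person_a", "person_b"]),
)

# Attribute access and key access return the same underlying object.
same_object = toy.data is toy["data"]
print(same_object)

# Loop over the keys to summarize the shape of each field.
shapes = {key: value.shape for key, value in toy.items()}
print(shapes)
```

The same loop over `lfw_people.items()` should work for the array-valued fields, though `DESCR` is a string and has no `.shape`.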

In [14]:
print("Number of distinct individuals:", lfw_people.target_names.shape)
print("Number of images:", lfw_people.target.shape)


Number of distinct individuals: (7,)
Number of images: (1288,)


In [12]:
# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape

# for machine learning we use the flattened pixel data directly (relative
# pixel position info is ignored by this model)
X = lfw_people.data
n_features = X.shape[1]
print(X.shape)

# the label to predict is the id of the person
y = lfw_people.target
target_names = lfw_people.target_names
n_classes = target_names.shape[0]

print("Total dataset size:")
print("n_samples: {:d}".format(n_samples))
print("n_features: {:d}".format(n_features))
print("n_classes: {:d}".format(n_classes))


(1288, 1850)
Total dataset size:
n_samples: 1288
n_features: 1850
n_classes: 7


## Breakout:

## Apply ANN to Facial Recognition

## Part I: Apply it to one person -- say the first in the data set


## Part II: Apply it to all individuals with 70 pictures or more.

1. Split into a training and testing set and use the following random_state (I'll explain later)

        (test_size = .25, random_state=42)

2. Train a NN classification model using the training set.  

3. Report the score of the trained NN applied to the training data.

4. Apply the trained NN to the testing data, and report the score.

5. Print the sklearn classification report and confusion_matrix 

6. Time the training and testing stages. Google how to use the time module

        from time import time

7. Use one intermediate (hidden) layer.  You may vary the number of neurons, but use no more than 1000.

8. Show the first 12 images in the testing set in a 3 by 4 arrangement for human inspection.  The title should show the predicted name and the true name of the person.

9. Show ROC and calculate AUC for Colin Powell
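One possible shape for steps 1 through 7 and 9 is sketched below. It is not the course solution. To stay runnable without a download, it assumes scikit-learn's bundled digits dataset as a stand-in for the LFW faces; the hidden layer size of 100, `max_iter=500`, and the choice of digit `3` as the one-vs-rest target are arbitrary illustrative values.

```python
from time import time

from sklearn.datasets import load_digits
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score, roc_curve)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data (assumption): bundled digits, pixel values scaled to [0, 1].
digits = load_digits()
X, y = digits.data / 16.0, digits.target

# Step 1: split with the test_size and random_state given above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Steps 2, 6, 7: one hidden layer (100 neurons here), timed training.
clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=42)
t0 = time()
clf.fit(X_train, y_train)
train_seconds = time() - t0

# Steps 3, 4, 6: scores on training and testing data, timed prediction.
train_score = clf.score(X_train, y_train)
t0 = time()
y_pred = clf.predict(X_test)
test_seconds = time() - t0
test_score = clf.score(X_test, y_test)
print("train: {:.3f} ({:.2f}s)  test: {:.3f} ({:.4f}s)".format(
    train_score, train_seconds, test_score, test_seconds))

# Step 5: classification report and confusion matrix.
print(classification_report(y_test, y_pred))
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Step 9: one-vs-rest ROC and AUC for a single class.
target_class = 3  # illustrative stand-in for one person's label
class_index = list(clf.classes_).index(target_class)
scores = clf.predict_proba(X_test)[:, class_index]
fpr, tpr, _ = roc_curve(y_test == target_class, scores)
auc = roc_auc_score(y_test == target_class, scores)
print("AUC for class {}: {:.3f}".format(target_class, auc))
```

For the breakout itself, `X` and `y` would come from `lfw_people.data` and `lfw_people.target`, and the target class would be the index of "Colin Powell" in `target_names`. Plotting `fpr` against `tpr` gives the ROC curve; step 8 (the 3 by 4 image grid) is left out of this sketch.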

# End of Week13-1