# Face Image Classification



In [1]:
from os import listdir
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from random import randint
from PIL import Image
from sklearn import svm
from sklearn.linear_model import LogisticRegression

## Data Organisation

### Data Modeling

NOTE: Maybe we should separate this into two sections, "Data Modeling" and "How to organise the training set and test set", so it follows the task evaluation guidance a little more closely.

To model the data, we represent each image as an array holding one greyscale value per pixel. We open each image, read its pixel greyscale values, and flatten them into a single vector. Each vector and its category are then assigned to either the training set or the test set. The training set is the data we use to train our model, whereas the test set is the data we use to evaluate how well our model performs. The split is random but stratified by category, so both sets keep the same proportion of each category as the full dataset. This is one of the simplest ways to separate the two sets without introducing bias, and we have enough data to get a functioning model using this method.
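As a minimal sketch of the flattening step, the snippet below turns a small greyscale array into a feature vector. The 3x4 `image` array is a made-up stand-in, not one of the face images, and its size is an assumption chosen only to keep the output readable.

```python
import numpy as np

# Hypothetical 3x4 greyscale "image": one intensity (0-255) per pixel.
image = np.array(
    [[0, 50, 100, 150],
     [10, 60, 110, 160],
     [20, 70, 120, 170]],
    dtype=np.uint8,
)

# flatten() reads the pixels row by row into one 1-D feature vector.
vector = image.flatten()

print(image.shape)   # (3, 4)
print(vector.shape)  # (12,)
print(vector[:5])    # the first row, then the first pixel of the second row
```

Every image must have the same height and width for the resulting vectors to stack into one 2-D matrix of shape (number of images, number of pixels).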

In [2]:
from sklearn.model_selection import train_test_split
test_size = 0.33
random_state = 0

In [3]:
# Convert each image to a flat pixel vector (X) and record its class label (y)
X, y = [], []
for sample in listdir("cropped"):
    for pose in listdir("cropped/{}/face".format(sample)):
        X.append(np.array(Image.open("cropped/{}/face/{}".format(sample, pose))).flatten())
        y.append(sample)
X = np.array(X, dtype=int)
y = np.array(y, dtype=str)

# Build Training and Testing Sets
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify = y, test_size=test_size, random_state = random_state)

# Verify that the data has been stratified correctly
count_unique_labels_all = dict(zip(*np.unique(y, return_counts=True)))
count_unique_labels_test = dict(zip(*np.unique(y_test, return_counts=True)))
label_percentages = {k:[count_unique_labels_all[k]/len(y)*100, count_unique_labels_test[k]/len(y_test)*100] for k in count_unique_labels_all}
print("Label  | % in all data | % in test data")
print("-------|---------------|---------------")
for k in label_percentages:
    print("  {}   |     {:.2f}%     |     {:.2f}%".format(k, label_percentages[k][0], label_percentages[k][1]))


Label  | % in all data | % in test data
-------|---------------|---------------
  1a   |     6.61%     |     6.32%
  1b   |     6.09%     |     5.79%
  1c   |     4.52%     |     4.74%
  1d   |     4.17%     |     4.21%
  1e   |     4.52%     |     4.74%
  1f   |     4.00%     |     4.21%
  1g   |     3.30%     |     3.16%
  1h   |     3.83%     |     3.68%
  1i   |     3.48%     |     3.68%
  1j   |     5.57%     |     5.26%
  1k   |     5.91%     |     5.79%
  1l   |     5.91%     |     5.79%
  1m   |     4.52%     |     4.74%
  1n   |     5.22%     |     5.26%
  1o   |     3.30%     |     3.16%
  1p   |     4.52%     |     4.74%
  1q   |     4.52%     |     4.74%
  1r   |     5.74%     |     5.79%
  1s   |     8.35%     |     8.42%
  1t   |     5.91%     |     5.79%


In [89]:
print(X_test.shape, y_test.shape, X_train.shape, y_train.shape)

(190, 10304) (190,) (385, 10304) (385,)


In [90]:
print(y_train)

['1f' '1b' '1p' '1b' '1n' '1s' '1s' '1s' '1q' '1s' '1s' '1s' '1q' '1s'
 '1r' '1a' '1o' '1i' '1d' '1l' '1i' '1a' '1n' '1j' '1n' '1k' '1c' '1h'
 '1p' '1f' '1p' '1l' '1l' '1c' '1a' '1t' '1j' '1t' '1s' '1r' '1k' '1b'
 '1t' '1m' '1m' '1f' '1k' '1r' '1q' '1o' '1n' '1e' '1e' '1h' '1t' '1s'
 '1s' '1f' '1r' '1k' '1c' '1m' '1e' '1h' '1a' '1l' '1i' '1c' '1p' '1k'
 '1r' '1k' '1b' '1d' '1a' '1s' '1b' '1b' '1l' '1r' '1a' '1l' '1n' '1e'
 '1a' '1r' '1q' '1r' '1t' '1l' '1c' '1q' '1m' '1h' '1h' '1c' '1n' '1s'
 '1a' '1d' '1t' '1n' '1r' '1j' '1m' '1q' '1k' '1d' '1o' '1a' '1t' '1l'
 '1s' '1h' '1b' '1k' '1h' '1g' '1h' '1e' '1b' '1a' '1a' '1a' '1f' '1m'
 '1n' '1o' '1s' '1i' '1b' '1e' '1e' '1f' '1c' '1b' '1l' '1b' '1r' '1m'
 '1j' '1k' '1t' '1a' '1m' '1l' '1s' '1i' '1t' '1j' '1c' '1d' '1b' '1m'
 '1o' '1p' '1n' '1p' '1o' '1s' '1c' '1p' '1g' '1e' '1s' '1d' '1t' '1q'
 '1q' '1q' '1t' '1l' '1a' '1b' '1q' '1n' '1i' '1l' '1s' '1t' '1h' '1s'
 '1n' '1t' '1p' '1t' '1s' '1r' '1l' '1j' '1i' '1k' '1l' '1h' '1q' '1a'
 '1c' 

## Training, Testing and Evaluation

### Support Vector Machines

In [None]:
clf = svm.SVC(kernel='linear').fit(X_train, y_train)

In [None]:
print(clf.score(X_test, y_test))

0.9842105263157894


### Logistic Regression

Logistic Regression is a way of classifying data using the sigmoid function, which maps a linear score $z$ to a value between 0 and 1 that can be read as a class probability:
$$g(z) = \frac{1}{1+e^{-z}}$$
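The formula can be sketched in a few lines of NumPy. The function name `sigmoid` and the sample scores are illustrative choices, not part of the notebook. In a one-vs-rest setup such as `multi_class="ovr"`, one such function is applied per class.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative linear scores: strongly negative, neutral, strongly positive.
z = np.array([-4.0, 0.0, 4.0])
print(sigmoid(z))
```

A score of 0 maps to exactly 0.5, large positive scores approach 1, and large negative scores approach 0.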

In [96]:
log_reg = LogisticRegression(multi_class="ovr")
log_reg.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


LogisticRegression(multi_class='ovr')

In [97]:
print(log_reg.score(X_test, y_test))

0.9789473684210527
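
The solver warning above suggests either raising `max_iter` or scaling the data. One possible way to do both is sketched below, assuming a `StandardScaler` in front of the classifier. The data is synthetic (`make_classification`), rescaled to pixel-like magnitudes only for illustration, so the printed score says nothing about the face dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the face vectors (assumption: not the real data).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=50, n_informative=10,
    n_classes=4, random_state=0,
)
X_demo = X_demo * 40 + 128  # mimic unscaled, pixel-like magnitudes

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.33, random_state=0
)

# The scaler is fitted on the training data only, inside the pipeline.
scaled_log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_log_reg.fit(X_tr, y_tr)
demo_score = scaled_log_reg.score(X_te, y_te)
print(demo_score)
```

Putting the scaler inside the pipeline keeps test-set statistics out of the fit, which matters when the score is used for evaluation.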


## Evaluation

1) Performance evaluation tells you whether you are making progress and puts a number on it. All machine learning models, whether linear regression or anything else, need a metric to judge performance. We therefore compute several common evaluation measures that reflect how well the model performs.

In [24]:
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_curve
from sklearn.metrics import auc

In [128]:
y_pred = log_reg.predict(X_test)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = 2*precision*recall/(precision + recall)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)


fp = cm.sum(axis=0) - np.diag(cm)  
fn = cm.sum(axis=1) - np.diag(cm)
tp = np.diag(cm)
tn = cm.sum() - (fp + fn + tp)
specificity = tn/(tn+fp)

2)
TP: True positives are the cases where the actual class of the data point is 1 (True) and the predicted class is also 1 (True).

FP: False positives are the cases where the actual class of the data point is 0 (False) and the predicted class is 1 (True). "False" because the model predicted incorrectly, and "positive" because the predicted class was the positive one (1).

TN: True negatives are the cases where the actual class of the data point is 0 (False) and the predicted class is also 0 (False).

FN: False negatives are the cases where the actual class of the data point is 1 (True) and the predicted class is 0 (False). "False" because the model predicted incorrectly, and "negative" because the predicted class was the negative one (0).

With 20 classes, these counts are computed per class in a one-vs-rest fashion: the class in question is treated as positive (1) and all other classes as negative (0).
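
To make the four counts concrete for more than two classes, here is a small worked sketch with three made-up classes. The labels and predictions are invented for illustration. Each class is treated in turn as the positive class and everything else as negative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented labels and predictions for three classes.
y_true_toy = ["a", "a", "a", "b", "b", "c", "c", "c"]
y_pred_toy = ["a", "a", "b", "b", "b", "c", "a", "c"]

# Rows are actual classes, columns are predicted classes.
cm_toy = confusion_matrix(y_true_toy, y_pred_toy, labels=["a", "b", "c"])

tp_toy = np.diag(cm_toy)                  # correctly predicted as this class
fp_toy = cm_toy.sum(axis=0) - tp_toy      # predicted as this class, actually another
fn_toy = cm_toy.sum(axis=1) - tp_toy      # actually this class, predicted as another
tn_toy = cm_toy.sum() - (tp_toy + fp_toy + fn_toy)  # everything else

print(cm_toy)
print("TP:", tp_toy)  # [2 2 2]
print("FP:", fp_toy)  # [1 1 0]
print("FN:", fn_toy)  # [1 0 1]
print("TN:", tn_toy)  # [4 5 5]
```

For every class, the four counts add up to the total number of samples (8 here).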

In [129]:
print("TP: " + str(tp))
print("FP: " + str(fp))
print("TN: " + str(tn))
print("FN: " + str(fn))

TP: [11 11  9  8  9  8  6  6  7  9 11 10  9 10  6  9  9 11 16 11]
FP: [0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0]
TN: [178 179 181 182 180 181 184 183 183 179 179 179 181 180 184 180 181 179
 174 179]
FN: [1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0]


3)
Accuracy using cross-validation: accuracy is the proportion of samples the model classifies correctly. With cross-validation, it is measured on several different train/test splits and averaged, which gives a more reliable estimate than a single split.

Precision: the quality of the model's positive predictions. It is the number of true positives divided by the total number of positive predictions, TP / (TP + FP).

Sensitivity: how well the model detects positive instances, TP / (TP + FN). It is also known as the true positive rate (TPR) or recall.

Specificity: the proportion of actual negatives that the model correctly predicts as negative, TN / (TN + FP).

F1-Score: the harmonic mean of precision and recall, which sums up the predictive performance of a model by combining two otherwise competing metrics.

Confusion Matrix: an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the model.

AUC: the Area Under the Curve (AUC) measures the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve.

ROC: the ROC curve, also known as the Receiver Operating Characteristic curve, plots the true positive rate against the false positive rate as the decision threshold varies, showing the trade-off between sensitivity and false alarms.
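
The cells in this section score a single train/test split. A hedged sketch of what cross-validated accuracy could look like is given below. It uses scikit-learn's bundled digits dataset as a stand-in for the face vectors, and the choice of 5 folds is an assumption rather than something the task prescribes.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Stand-in data: 8x8 greyscale digit images, already flattened to 64 features.
X_digits, y_digits = load_digits(return_X_y=True)

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(
    SVC(kernel="linear"), X_digits, y_digits, cv=cv, scoring="accuracy"
)

print(cv_scores)         # one accuracy per fold
print(cv_scores.mean())  # cross-validated accuracy
```

Reporting the mean together with the spread across folds shows how much the accuracy depends on the particular split.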

4)

Accuracy:

In [130]:
print(accuracy)

0.9789473684210527


Precision:

In [131]:
print(precision)

0.9794444444444445


Sensitivity:

In [132]:
print(recall)

0.9791450216450217


Specificity (for the class at index 4, '1e'):

In [133]:
print(specificity[4])

0.994475138121547


F1-Score:

In [134]:
print(f1)

0.9792947101573404


Confusion Matrix:

In [135]:
print(cm)

[[11  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0]
 [ 0 11  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  9  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  9  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  6  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  1  0  0  6  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  7  0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  1  0  0  0  9  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0 11  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  1  0 10  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  9  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0 10  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  6  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  9  0  0

AUC, ROC:

In [1]:
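# roc_curve is binary, so score one class against the rest using predicted probabilities
pos_class = log_reg.classes_[4]
y_score = log_reg.predict_proba(X_test)[:, 4]
fpr, tpr, thresholds = roc_curve((y_test == pos_class).astype(int), y_score)
roc_auc = auc(fpr, tpr)  # renamed so the imported auc function is not overwritten
print(roc_auc)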
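
For a single number covering all classes, one option is `roc_auc_score` with `multi_class="ovr"`, which averages the one-vs-rest AUC over the classes. This is a swapped-in alternative to `roc_curve` plus `auc`, sketched on scikit-learn's digits dataset as a stand-in for the face data. The variable names and `max_iter=5000` are illustrative assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in data (assumption: not the face images).
X_d, y_d = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_d, y_d, stratify=y_d, test_size=0.33, random_state=0
)

demo_clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)

# AUC needs scores, not hard labels: one probability column per class.
proba = demo_clf.predict_proba(X_te)
macro_auc = roc_auc_score(y_te, proba, multi_class="ovr", average="macro")
print(macro_auc)
```

Passing hard predictions instead of probabilities would reduce each ROC curve to a single operating point, so the probability matrix is the important input here.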

5)
Accuracy: used because more accurate model outcomes result in better decisions.

Precision: used to check how trustworthy positive predictions are. Models inherently trade off between precision and recall: typically, the higher the precision, the lower the recall, and vice versa.

Sensitivity: used to evaluate model performance because it shows how many positive instances the model was able to identify correctly.

Specificity: used to determine the proportion of actual negative cases that were predicted correctly.

F1-Score: used to sum up the predictive performance of a model by combining two otherwise competing metrics.

Confusion Matrix: used to see the performance of the classification model on a set of test data for which the true values are known.

AUC: used as a summary of the ROC curve. The higher the AUC, the better the model is at distinguishing between the positive and negative classes.

ROC: used to select a threshold for a classifier that maximizes the true positives while minimizing the false positives.

## Improvement

### PCA

Import PCA class from sklearn

In [28]:
from sklearn.decomposition import PCA

Perform PCA dimensionality reduction on training and testing data.

In [58]:
pca = PCA(n_components=10)
# Fit on the training data only, then apply the same projection to both sets
pca.fit(X_train)
pca_X_train = pca.transform(X_train)
pca_X_test = pca.transform(X_test)
print(pca.n_components_)

10


Perform logistic regression with the new data and print accuracy.

In [61]:
logreg = LogisticRegression(max_iter=800)
logreg.fit(pca_X_train, y_train)
y_pred = logreg.predict(pca_X_test)
print(accuracy_score(y_test, y_pred))

0.8842105263157894
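
With 10 components the accuracy is lower than on the raw pixels, so the number of components is worth tuning. One common heuristic, sketched below on scikit-learn's digits dataset as a stand-in, is to keep enough components to explain a chosen share of the variance. The 95% threshold is an assumption, not a value taken from this notebook.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Stand-in data (assumption: not the face images).
X_d, _ = load_digits(return_X_y=True)

# Fit PCA with all components to see how variance accumulates.
pca_full = PCA().fit(X_d)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 95%.
n_for_95 = int(np.searchsorted(cumulative, 0.95) + 1)
print(n_for_95)
```

The chosen count can then be passed as `n_components`, and the downstream accuracy compared against the raw-pixel baseline.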


Perform logistic regression using the original data and print accuracy.

In [39]:
logreg = LogisticRegression(max_iter=200)
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.9789473684210527


### LDA

Import LDA class from sklearn

In [41]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

Perform LDA dimensionality reduction on training and testing data. LDA can produce at most (number of classes - 1) components, so with 20 classes the limit is 19.

In [57]:
lda = LinearDiscriminantAnalysis(n_components=10)
lda_X_train = lda.fit_transform(X_train, y_train)
lda_X_test = lda.transform(X_test)
print(lda.n_components)

10
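
The component limit of LDA can be checked on a small example. The sketch below uses scikit-learn's digits dataset (10 classes) as a stand-in, so the maximum there is 9 rather than the 19 available for the 20 face classes. The variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in data (assumption: not the face images).
X_d, y_d = load_digits(return_X_y=True)

n_classes = len(np.unique(y_d))
max_components = min(n_classes - 1, X_d.shape[1])

# Request the largest number of components LDA allows for this data.
lda_demo = LinearDiscriminantAnalysis(n_components=max_components)
X_lda = lda_demo.fit_transform(X_d, y_d)

print(n_classes, max_components, X_lda.shape)
```

Unlike PCA, LDA uses the class labels when fitting, which is why its component count is tied to the number of classes.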


Perform logistic regression using the new data and print accuracy.

In [53]:
logreg = LogisticRegression(max_iter=2000)
logreg.fit(lda_X_train, y_train)
y_pred = logreg.predict(lda_X_test)
print(accuracy_score(y_test, y_pred))

0.9736842105263158


Perform logistic regression using the original data and print accuracy.

In [None]:
logreg = LogisticRegression(max_iter=200)
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.9789473684210527
