#                     Fundamentals of Machine Learning - 2022
#                     Report 2 - Classifying with convnets - Testing
###                          Facundo Sheffield

Now that we have our three models trained, we can test them and see how they perform. We will use the test set that we created in the previous notebook.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pickle

In [2]:
# Let's load the faces_dict.p file:
with open('faces_dict.p', 'rb') as f:
    faces_dict = pickle.load(f)

In [3]:
# Let's add a channel axis at the end of every image: (N, H, W) -> (N, H, W, 1)
faces_dict['images'] = np.expand_dims(np.asarray(faces_dict['images']), axis=-1)

In [4]:
# Let's start by splitting the data into training, validation and test sets

from sklearn.model_selection import train_test_split
Full_X_train, X_test, Full_y_train, y_test = train_test_split(faces_dict['images'], faces_dict['target'], test_size=100, random_state=42, stratify=faces_dict['target'])
X_train, X_val, y_train, y_val = train_test_split(Full_X_train, Full_y_train, test_size=80, random_state=42, stratify=Full_y_train)

print(f"Training, validation and test data: {len(X_train)}, {len(X_val)}, {len(X_test)}")

Training, validation and test data: 220, 80, 100
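
The two-stage stratified split above can be checked on stand-in data. This is a minimal sketch, assuming 400 samples spread evenly over 40 classes. The total of 400 follows from the printed sizes, but the class count and the arrays `X` and `y` are hypothetical placeholders, not the real face data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 400 samples, 40 classes with 10 samples each.
X = np.arange(400).reshape(400, 1)   # each "image" is just its own index
y = np.repeat(np.arange(40), 10)

# Same two-stage split as in the cell above: first hold out the test set,
# then carve the validation set out of the remaining data.
X_full, X_te, y_full, y_te = train_test_split(
    X, y, test_size=100, random_state=42, stratify=y)
X_tr, X_va, y_tr, y_va = train_test_split(
    X_full, y_full, test_size=80, random_state=42, stratify=y_full)

split_sizes = (len(X_tr), len(X_va), len(X_te))
print("Training, validation and test sizes:", split_sizes)

# The three subsets must not share any sample.
ids_tr, ids_va, ids_te = set(X_tr.ravel()), set(X_va.ravel()), set(X_te.ravel())
no_overlap = not (ids_tr & ids_va or ids_tr & ids_te or ids_va & ids_te)
print("Disjoint subsets:", no_overlap)

# Stratification keeps every class represented in the test set.
test_counts = np.bincount(y_te, minlength=40)
print("Test samples per class (min, max):", test_counts.min(), test_counts.max())
```

Passing an integer as `test_size` requests an absolute number of samples rather than a fraction, which is why the sizes come out as exact counts.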


In [5]:
X_test_RF = X_test.reshape(X_test.shape[0], -1)  # test data for random forest
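
The reshape above flattens every image into one long feature vector, because a Random Forest expects a 2D array of shape (samples, features). As a small illustration, assuming a hypothetical batch of 64x64 single-channel images (the real image size is not stated here):

```python
import numpy as np

# Hypothetical batch: 100 grayscale images of 64x64 pixels with one channel.
X_demo = np.zeros((100, 64, 64, 1), dtype=np.float32)

# Keep the sample axis and merge all remaining axes into one feature axis.
X_demo_flat = X_demo.reshape(X_demo.shape[0], -1)

print(X_demo.shape)       # (100, 64, 64, 1)
print(X_demo_flat.shape)  # (100, 4096)
```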

Now we can test our baseline and our two neural networks. Let's import our models.

In [13]:
# Let's import our models:

import pickle
from keras import models

with open('RandomForest.pkl', 'rb') as f:
    RF_model = pickle.load(f)

CNN_model = models.load_model('CNN_model.h5')
VGG_model = models.load_model('TransferLearning_model.h5')


Now let's predict the labels for the test set and evaluate each model.

In [7]:
from sklearn.metrics import accuracy_score

y_pred_RF = RF_model.predict(X_test_RF)  # predict the labels of the test data
print("Random Forest accuracy: ", accuracy_score(y_test, y_pred_RF))  # prints the accuracy

CNN_model.evaluate(X_test, y_test)  # returns [loss, accuracy] on the test set

Random Forest accuracy:  0.95


[0.04789689555764198, 0.9700000286102295]

In [14]:
VGG_model.evaluate(X_test, y_test)  # returns [loss, accuracy] on the test set



[0.25506657361984253, 0.9599999785423279]
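
Each `evaluate` call returns a list with the loss first, followed by the compiled metrics, so the second number is presumably the test accuracy. The same figure can be recomputed from predicted class probabilities. The sketch below assumes the networks end in a softmax layer over the classes and that the labels are integer class indices. The helper `accuracy_from_probabilities` and the toy arrays are hypothetical and are not the report's data.

```python
import numpy as np

def accuracy_from_probabilities(probs, y_true):
    """Fraction of samples whose highest-probability class equals the true label."""
    y_pred = np.argmax(probs, axis=1)  # most likely class for each sample
    return float(np.mean(y_pred == np.asarray(y_true)))

# Toy, made-up probabilities for 4 samples and 3 classes.
toy_probs = np.array([
    [0.7, 0.2, 0.1],   # predicted class 0
    [0.1, 0.8, 0.1],   # predicted class 1
    [0.3, 0.3, 0.4],   # predicted class 2
    [0.6, 0.3, 0.1],   # predicted class 0 (true label is 1, so this one is wrong)
])
toy_labels = np.array([0, 1, 2, 1])

toy_accuracy = accuracy_from_probabilities(toy_probs, toy_labels)
print(toy_accuracy)  # 0.75
```

Under those assumptions, `accuracy_from_probabilities(CNN_model.predict(X_test), y_test)` should agree with the accuracy that `evaluate` reports.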

Interesting! Our VGG16 model does not perform quite as well as our CNN, but both outperform our baseline. With 100 test images, the three models classify 95, 96 and 97 images correctly, so each one differs from the next by a single image. I suspect that the difference in accuracy between the two networks is due to the difference in their number of free parameters. After all, even though our VGG16 model has three times fewer free parameters, it still got quite close to our CNN. Maybe the results could have been better if we had changed the optimizer or the learning rate of the network. I would also still like to know what would have happened with a VGG16 network fully trained on grayscale images. But for now, we will leave things as they are.
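
To get a feeling for how much weight these gaps can carry, one option is to attach a confidence interval to each accuracy. The sketch below uses the Wilson score interval, which is my own choice of method and not something used in the original analysis. The counts 95, 96 and 97 are taken from the reported accuracies on the 100 test images, and the helper `wilson_interval` is a hypothetical name.

```python
import math

def wilson_interval(n_correct, n_total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    p = n_correct / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    return center - half_width, center + half_width

# Correctly classified test images out of 100, from the accuracies reported above.
results = {"Random Forest": 95, "VGG16": 96, "CNN": 97}

intervals = {name: wilson_interval(k, 100) for name, k in results.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.3f}, {high:.3f}]")
```

The three intervals overlap substantially, so a test set of this size cannot clearly separate the models. Note that this treats every model independently and ignores that all three were scored on the same images, so it is only a rough guide.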

### Conclusions

We obtained three different ML models to tackle an image classification problem. Our baseline model, a Random Forest, performed quite well, but it was outperformed by both our CNN and our VGG16 model. We tried data augmentation wherever possible, although this technique gave worse results for our classical model. Regarding the VGG16 model, we successfully incorporated a pretrained network and adapted it to our problem. Having said that, the best model was our CNN, which was trained from scratch. This may be due to the number of free parameters, but further testing would be needed to be sure.