<h1 style=font-size:40px> Personality Type Classification </h1>

Note that this notebook is not a comprehensive list of the steps taken throughout this project but rather a summary of our final results.

<h1> Implementation </h1>

We mainly extracted images from eight different categories (anger, surprise, disgust, fear, neutral, happiness, sadness, and contempt) from a dataset found online: https://github.com/muxspace/facial_expressions

Since some categories (like neutral and happiness) had more images than others, we synthesized new images by varying factors such as contrast, color, and sharpness and added them to the categories with fewer images. This gave every category an equal number of images.
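
The balancing step can be sketched roughly as follows. This is a minimal illustration and not the code used in the project: the function name `plan_augmentation`, the factor range of 0.7 to 1.3, and the example counts are all assumptions. The sketch only plans how many synthetic images each category needs and draws random contrast, color, and sharpness factors for each one; applying those factors to real images (for instance with an image library's enhancement functions) is left out.

```python
import random

def plan_augmentation(counts, seed=0, low=0.7, high=1.3):
    """Return, per category, one dict of enhancement factors for each synthetic image needed.

    counts: mapping of category name -> number of original images.
    The factor range (low, high) is an assumed, illustrative choice.
    """
    rng = random.Random(seed)
    target = max(counts.values())  # bring every category up to the largest one
    plan = {}
    for category, n in counts.items():
        plan[category] = [
            {
                "contrast": rng.uniform(low, high),
                "color": rng.uniform(low, high),
                "sharpness": rng.uniform(low, high),
            }
            for _ in range(target - n)
        ]
    return plan

# Hypothetical counts, for illustration only
example_counts = {"neutral": 10, "happiness": 8, "contempt": 3}
plan = plan_augmentation(example_counts)
print({category: len(factors) for category, factors in plan.items()})
```

Categories that already have the most images receive an empty plan, so only the smaller categories are padded with synthetic images.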

We subsequently split the data into training and validation datasets using a 60-40 split; we didn't leave images for testing as the online dataset had a separate section for testing images. At the end, there were ~4,000 images in each category for the training set and ~2,730 images in each category for the validation set.
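
A 60-40 split of one category's files could look like the sketch below. The helper name `split_train_valid`, the fixed seed, and the file names are assumptions made for illustration; the project's actual splitting code is not shown in this notebook.

```python
import random

def split_train_valid(filenames, train_fraction=0.6, seed=0):
    """Shuffle filenames reproducibly and split them into training and validation lists."""
    shuffled = sorted(filenames)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names for one category
files = [f"img_{i:04d}.jpg" for i in range(10)]
train_files, valid_files = split_train_valid(files)
print(len(train_files), len(valid_files))
```

Splitting each category separately keeps the categories balanced in both the training and the validation set.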

After processing these images, we resized them from 350x350 to 50x50 to shorten model training time. We did this because of time and budget constraints, and the loss of detail may have lowered the accuracy of the overall model; in a future implementation, we plan to train on the images at their original size to maximize accuracy.
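
As a quick back-of-the-envelope check of how much input data the resize removes, the arithmetic below compares the number of pixels per channel before and after. It is only an illustration of the size reduction, not a measurement of training time.

```python
original_side, resized_side = 350, 50

original_pixels = original_side ** 2   # 122500 pixels per channel
resized_pixels = resized_side ** 2     # 2500 pixels per channel
reduction = original_pixels / resized_pixels

print(f"{original_pixels} -> {resized_pixels} pixels per channel ({reduction:.0f}x fewer)")
```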

To obtain the most accurate predictive model, we implemented three approaches: using the pre-existing Sagemaker image classification algorithm to produce a model, using AWS Rekognition to predict the personality of the testing images, and building a convolutional neural network from scratch using Tensorflow and Keras.

<h2> Using Sagemaker Training Jobs </h2>

To give Sagemaker access to these files, we moved the training and validation datasets to S3. We then ran training jobs that used the pre-built image classification algorithm to create a model from the full-size (350x350) images on an ml.p2.xlarge instance. We adjusted the hyperparameters and inputs to get the best model we could. Notable settings were the number of epochs (15), the optimizer (stochastic gradient descent), and the number of training samples (53336). Training this model took ~5 hours on average.
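
The settings named above could be collected roughly as in the sketch below. Only the values come from the text; the key names follow the documented hyperparameters of the built-in image classification algorithm, `num_classes` is inferred from the eight emotion categories, and every other setting the job used is left out because it is not stated here.

```python
# Values taken from the text; num_classes follows from the eight emotion categories.
hyperparameters = {
    "num_classes": 8,
    "num_training_samples": 53336,
    "epochs": 15,
    "optimizer": "sgd",
}

# The low-level CreateTrainingJob API takes hyperparameters as a string-to-string map.
hyperparameters_as_strings = {key: str(value) for key, value in hyperparameters.items()}
print(hyperparameters_as_strings)
```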

The finished model was uploaded to S3 and given testing images in order to validate its accuracy.

<h3> Testing the Training Jobs </h3>

The Sagemaker training job reached an overall training accuracy of ~88% and a validation accuracy of ~81%. To test the resulting model on the testing images provided with the dataset, we loaded it into a Jupyter notebook.

Using the MXNet library, we loaded the model produced by the Sagemaker training job and used it to make a prediction. The cell below shows the evaluation process for a single, arbitrarily chosen test file.

In [None]:
from PIL import Image
from numpy import asarray
import mxnet as mx
import matplotlib.pyplot as plt
import boto3
import numpy as np
import cv2
import os

training_job_name = 'image-classification-personality-training-job-try-7'
job_info = boto3.client('sagemaker').describe_training_job(TrainingJobName=training_job_name)
 
mx_model = mx.module.Module.load("../image-classification", 0, False, label_names=['out_label'])

def prepare(filepath, IMG_SIZE):
    img_array = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE) 
    new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE)) 
    return new_array.reshape(-1, IMG_SIZE, IMG_SIZE)

def make_prediction_dense(model: mx.module.Module, x_array: np.ndarray, batch_size: int = 1):
    data_iter = mx.io.NDArrayIter(data=x_array, batch_size=batch_size)
    model.bind(data_shapes=data_iter.provide_data)
    prediction = model.predict(data_iter).asnumpy().flatten()
    return model, prediction

unique_emotions = ['anger', 'surprise', 'disgust', 'fear', 'neutral', 'happiness', 'sadness', 'contempt']
test_file = os.listdir("../test_resize")[1]
data = prepare("../test_resize/" + test_file, 350).reshape(1, 1, 350, 350)
model, prediction = make_prediction_dense(mx_model, data)
# Pick the emotion with the highest predicted probability
unique_emotion = unique_emotions[int(np.argmax(prediction))]

print(unique_emotion)

plt.imshow(cv2.imread("../test_resize/" + test_file))

<h2> Using AWS Rekognition </h2>

We accessed Rekognition through the boto3 library. We focused on the 'Emotions' section of Rekognition's response, as it contained the information needed to predict the personalities of the images. Since Rekognition's model is pre-trained, no configuration or training was needed on our side; we therefore went straight to testing its accuracy. Rekognition often reported multiple emotions for a face, so we simply took the emotion with the highest confidence.

We did this by taking the images that had been placed in the S3 bucket (for the Sagemaker training jobs) and checking whether their labels matched the labels predicted by AWS Rekognition.
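
The selection and comparison logic can be sketched without calling AWS at all. In the sketch below, the helper names `top_emotion` and `accuracy` are hypothetical, and the `mock_faces` entries are hand-written stand-ins shaped like the `Emotions` part of a face detail, not real Rekognition output. The mapping between Rekognition's emotion types and our category names is the same pairing used in the cell below.

```python
# Mapping from Rekognition emotion types to this project's category names
# (same pairing as the two lists in the notebook cell that follows).
REKOGNITION_TO_LABEL = {
    "ANGRY": "anger", "SURPRISED": "surprise", "DISGUSTED": "disgust", "FEAR": "fear",
    "CALM": "neutral", "HAPPY": "happiness", "SAD": "sadness", "CONFUSED": "contempt",
}

def top_emotion(face_detail):
    """Return the project label for the highest-confidence emotion, or None if it has no mapping."""
    best = max(face_detail["Emotions"], key=lambda emotion: emotion["Confidence"])
    return REKOGNITION_TO_LABEL.get(best["Type"])

def accuracy(labelled_faces):
    """labelled_faces: list of (true_label, face_detail) pairs."""
    correct = sum(1 for label, face in labelled_faces if top_emotion(face) == label)
    return correct / len(labelled_faces)

# Hand-written stand-ins for Rekognition responses (not real output)
mock_faces = [
    ("sadness", {"Emotions": [{"Type": "CALM", "Confidence": 20.0}, {"Type": "SAD", "Confidence": 75.0}]}),
    ("happiness", {"Emotions": [{"Type": "HAPPY", "Confidence": 90.0}, {"Type": "CALM", "Confidence": 5.0}]}),
    ("anger", {"Emotions": [{"Type": "CALM", "Confidence": 60.0}, {"Type": "ANGRY", "Confidence": 30.0}]}),
]
print(accuracy(mock_faces))
```

Taking the maximum by confidence, rather than the first list entry, avoids relying on the order in which the emotions happen to be returned.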

In [None]:
import boto3
import matplotlib.pyplot as plt
import cv2

rekognition = boto3.client('rekognition')

# Accuracy of the AWS Model (using Rekognition): approximately 60 percent

BUCKET = "image-classification-rekognition"
KEY = 'sadness/Alain_Cervantes_0001.jpg'
FEATURES_BLACKLIST = ("Landmarks", "Emotions", "Pose", "Quality", "BoundingBox", "Confidence")

unique_emotions = ['anger', 'surprise', 'disgust', 'fear', 'neutral', 'happiness', 'sadness', 'contempt']
related_emotion = ['ANGRY', 'SURPRISED', 'DISGUSTED', 'FEAR', 'CALM', 'HAPPY', 'SAD', 'CONFUSED']

KEYS = []
s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')
result = paginator.paginate(Bucket='image-classification-rekognition',StartAfter='2018')
for page in result:
    if "Contents" in page:
        for key in page[ "Contents" ]:
            keyString = key[ "Key" ]
            KEYS.append(keyString)

def detect_faces(bucket, key, attributes=['ALL'], region="us-east-1"):
    rekognition = boto3.client("rekognition", region_name=region)
    response = rekognition.detect_faces(
        Image={
            "S3Object": {
            "Bucket": bucket,
            "Name": key,
            }
        },
        Attributes=attributes,
    )
    return response['FaceDetails']


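KEY = KEYS[0]
# The true label is the S3 "folder" name, e.g. 'sadness/...'
real_emotion = unique_emotions.index(KEY.split('/')[0])
for face in detect_faces(BUCKET, KEY):
    # Take the emotion that Rekognition is most confident about
    top_emotion = max(face['Emotions'], key=lambda e: e['Confidence'])
    predicted_emotion = related_emotion.index(top_emotion['Type'])
    break  # only the first detected face is used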

print("Predicted Value: " + unique_emotions[predicted_emotion] + "\nReal Value: " + unique_emotions[real_emotion])
PATH_NAME = "../valid_processed/" + unique_emotions[real_emotion] + "/0" + KEY.split("/")[1]
plt.imshow(cv2.imread(PATH_NAME))

<h2> Building a CNN using Keras and Tensorflow </h2>

Using the Tensorflow and Keras libraries, we built a CNN model to classify the testing images fed into it. The model combines convolutional layers (with 32-48 filters each), max pooling layers, and dense layers; our main activation function was ReLU because it is cheap to compute. We tuned the hyperparameters to get the best model we could. Notable settings were the batch size (16), the number of epochs (12), and the loss function (categorical cross-entropy).
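
To get a feel for the size of this network, the sketch below works out the feature-map size and the number of trainable parameters by hand. It uses the layer settings from the model definition in the next cell (3x3 kernels, 48, 48, and 32 filters, 'same' padding, 2x2 pooling with stride 2, and dense layers of 128, 64, and 8 units); the helper names are hypothetical.

```python
import math

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return inputs * units + units

side, channels = 50, 3
total = 0
for filters in (48, 48, 32):        # filter counts from the model definition
    total += conv_params(3, channels, filters)
    channels = filters
    side = math.ceil(side / 2)      # 2x2 max pooling, stride 2, 'same' padding

flattened = side * side * channels  # size of the flattened feature vector
inputs = flattened
for units in (128, 64, 8):          # dense layers, ending in 8 softmax outputs
    total += dense_params(inputs, units)
    inputs = units

print(side, flattened, total)
```

Under these settings the final feature map is 7x7x32 (1,568 values) and the model has 245,592 trainable parameters, most of them in the first dense layer.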


In [1]:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Flatten,Dense,BatchNormalization,Activation,Dropout

# Loading in training data
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
        '../train_processed',
        target_size=(50, 50),
        batch_size=16,
        class_mode='categorical')

# Loading in validation data (rescaled only, no augmentation)
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
        '../valid_processed',
        target_size=(50, 50),
        batch_size=16,
        class_mode='categorical')

# Initializing sequential model that makes use of convolutional, pooling, and dense layers
cnn = tf.keras.models.Sequential()
cnn.add(tf.keras.layers.Conv2D(filters=48, kernel_size=3, activation='relu', input_shape=(50, 50, 3), padding='same'))
cnn.add(tf.keras.layers.MaxPool2D((2,2), strides=(2,2), padding='same'))
cnn.add(tf.keras.layers.Conv2D(filters=48, kernel_size=3, activation='relu', padding = 'same'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding = 'same'))
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', padding = 'same'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding='same'))
cnn.add(tf.keras.layers.Flatten())
cnn.add(tf.keras.layers.Dense(128, activation='relu'))
cnn.add(tf.keras.layers.Dense(64, activation='relu'))
cnn.add(tf.keras.layers.Dense(8, activation='softmax'))

Using TensorFlow backend.


Found 31515 images belonging to 8 classes.
Found 21850 images belonging to 8 classes.


In [None]:
# NOTE: Do not run this - Just for showing the code

# Finally compile and train the cnn
cnn.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
cnn.fit(x=train_generator, validation_data=test_generator, epochs=12)

cnn.save('model/test_model')

Epoch 1/12
Epoch 2/12
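
To put the epoch counter in context, the arithmetic below estimates how many batches each epoch processes. It assumes the image counts reported by `flow_from_directory` earlier (31515 training and 21850 validation images) and the batch size of 16; the final, partial batch counts as one step.

```python
import math

train_images, valid_images = 31515, 21850   # counts reported by flow_from_directory
batch_size, epochs = 16, 12

train_steps = math.ceil(train_images / batch_size)   # batches per training epoch
valid_steps = math.ceil(valid_images / batch_size)   # batches per validation pass
print(train_steps, valid_steps, train_steps * epochs)
```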

<h1> Testing the Built Model </h1>

First, the trained model was loaded into the Sagemaker notebook instance to check its performance on a variety of testing images. Note that these testing images were resized from 350x350 to 50x50, since the model was trained on 50x50 images. In the future, the model could be trained and tested on images at their original size; to save time, we trained it on the smaller size.

Since these testing images do not have any labels assigned to them, the model could only be checked by visual inspection. To make this easier, we first counted how many testing images the model assigned to each emotion.

Next, we appended each testing image's file path to the list for the emotion that the model predicted. With this, we can view an image alongside its predicted emotion. For example, we can pull a particular image out of one of the lists (such as the anger list used in the cell below) and see whether the face in the image matches that emotion.
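
The counting and grouping described above can be sketched on their own, separately from the model. The function name `group_by_prediction` and the example predictions are made up for illustration.

```python
from collections import Counter, defaultdict

def group_by_prediction(predictions):
    """predictions: iterable of (file_path, predicted_emotion) pairs."""
    counts = Counter()
    paths_by_emotion = defaultdict(list)
    for path, emotion in predictions:
        counts[emotion] += 1
        paths_by_emotion[emotion].append(path)
    return counts, paths_by_emotion

# Hypothetical predictions, for illustration only
example = [("a.jpg", "anger"), ("b.jpg", "sadness"), ("c.jpg", "anger")]
counts, paths_by_emotion = group_by_prediction(example)
print(counts.most_common())
print(paths_by_emotion["anger"])
```

A heavily skewed count toward one emotion is a quick warning sign worth checking before inspecting individual images.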

In [None]:
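def prepare(filepath):
    IMG_SIZE = 50  # the model was trained on 50x50 images
    img_array = cv2.imread(filepath)  # read the image in color (OpenCV returns BGR)
    img_array = cv2.cvtColor(img_array, cv2.COLOR_BGR2RGB)  # Keras generators load images as RGB
    new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))  # resize image to match model's expected sizing
    new_array = new_array / 255.0  # match the rescale=1./255 used during training
    return new_array.reshape(-1, IMG_SIZE, IMG_SIZE, 3)  # add the batch dimension that TF expects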

In [None]:
import matplotlib.pyplot as plt
import cv2
import os
import tensorflow as tf

unique_emotions = ['anger', 'surprise', 'disgust', 'fear', 'neutral', 'happiness', 'sadness', 'contempt']
new_model = tf.keras.models.load_model('../model/my_model')

unique_pictures = {'anger' : [],
 'surprise' : [],
 'disgust' : [],
 'fear' : [],
 'neutral' : [],
 'happiness' : [],
 'sadness' : [],
 'contempt' : []}

for test_file in os.listdir("../test_resize"):
    prediction = new_model.predict(prepare("../test_resize/" + test_file))
    # Pick the emotion with the highest predicted probability
    unique_emotion = unique_emotions[int(prediction[0].argmax())]
    
    unique_pictures[unique_emotion].append("../test_resize/" + test_file)

print("\n\n")
print('Example of anger predicted from image using model:')
plt.imshow(cv2.imread(unique_pictures['anger'][1]))

<h2> Challenges </h2>

<h3> Training Using Tensorflow and AWS Sagemaker </h3>

During this project, we trained the convolutional neural network on a preexisting repository of images. Although this dataset allowed us to train the CNN, it caused problems with training time and classification accuracy. Because of the size of the images, and given our cost and time constraints, we shrank them from 350x350 to 50x50 so that our model would train faster (Sagemaker automatically logs out roughly every 12 hours, which capped our effective training time at that amount). Reducing the size of the images significantly lowered our training and validation accuracy; however, we hope to later use a larger instance to train the model on full-size images.

<h3> Accuracy of the Dataset </h3>

The classification accuracy of our model was 56%, while its training accuracy was 98%, a gap that points to overfitting. After numerous changes to the hyperparameters (including early stopping and the learning rate) and to the model itself, we were unable to raise the validation accuracy or to fix its erratic behavior during training (it spiked up and down from epoch to epoch). Running the data through one of the best-known image classification services (Amazon Rekognition), we saw that it reached an accuracy of only 60%, which suggests that this dataset has inconsistent labels or low-quality images. Unfortunately, because of our limited access to datasets, we were unable to procure a better one at this time. We believe our model would perform better with a more accurately labeled dataset, and we expect its accuracy to scale with that of Rekognition's image classification model.
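
For readers unfamiliar with early stopping, the sketch below shows the basic patience rule in plain Python. It is an illustration of the general technique, not the configuration we used: the function name, the patience of 3, and the accuracy values are all made up.

```python
def early_stopping_epoch(val_accuracies, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end.

    Training stops once validation accuracy has failed to improve for `patience` epochs in a row.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation accuracies that spike up and down, for illustration only
history = [0.40, 0.52, 0.47, 0.56, 0.50, 0.53, 0.49, 0.55]
print(early_stopping_epoch(history, patience=3))
```

With an erratic validation curve like this one, the stopping point depends heavily on the chosen patience, which is one reason early stopping alone may not help much.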