# Group 17
# Facial Emotion Prediction

## Problem Definition

The goal is to predict the emotion a person is expressing from a picture of their face.

**Task (T)**: Predict the emotion of the person from an image of their face. 

**Experience (E)**: Images of faces from the datasets. 

**Performance (P)**: Classification accuracy, i.e. how often the model correctly predicts the emotion shown in an image.
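As a minimal sketch of the performance measure, accuracy can be computed as the share of predictions that match the true labels. The helper name `accuracy` and the five example labels are hypothetical and only serve as illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels for five images
y_true = ["happy", "sad", "angry", "happy", "fear"]
y_pred = ["happy", "sad", "happy", "happy", "neutral"]

print(accuracy(y_true, y_pred))  # 3 of 5 correct -> 0.6
```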

## Datasets

### 1. Google facial expression comparison dataset

This dataset is a large-scale facial expression dataset consisting of face image triplets along with human annotations that specify which two faces in each triplet form the most similar pair in terms of facial expression. Each triplet in this dataset was annotated by six or more human raters. This dataset is quite different from existing expression datasets that focus mainly on discrete emotion classification or action unit detection. 

[Link](https://research.google/tools/datasets/google-facial-expression/)

### 2. FER-2013 Learn facial Expressions from an image - Kaggle

* The data consists of 48x48 pixel grayscale images of faces. The faces have been automatically registered so that the face is more or less centered and occupies about the same amount of space in each image.
* Facial expressions are divided into one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral). The training set consists of 28,709 examples and the public test set consists of 3,589 examples.

[Link](https://www.kaggle.com/msambare/fer2013)

### 3. The Olivetti faces dataset
* This dataset contains a set of face images taken between April 1992 and April 1994 at AT&T Laboratories Cambridge. The `sklearn.datasets.fetch_olivetti_faces` function is the data fetching / caching function that downloads the data archive from AT&T.
* There are ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement).
* This dataset is available to all users as part of the scikit-learn Python library.

[Link](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html)

### Importing the necessary Libraries

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import tensorflow as tf

### Exploring the dataset

In [2]:
print(os.listdir('/kaggle/input/fer2013'))

In [3]:
train_path = '/kaggle/input/fer2013/train'
val_path = '/kaggle/input/fer2013/test'

In [4]:
# Importing TensorFlow libraries
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras.utils import to_categorical, plot_model
from tensorflow.keras import models, layers, regularizers
from tensorflow.keras import Sequential
from tensorflow import keras
from sklearn.model_selection import KFold

## Data Preprocessing

We read the images from one directory per emotion; the directory names serve as the class labels. Pixel values are then rescaled from the range [0, 255] to [0, 1] by multiplying by 1/255 (<b>Normalization</b>). Keeping every input in the same small range stops images with large raw pixel values from dominating the loss and its gradients, so a single learning rate works for all images.

Images are loaded as single-channel grayscale (`color_mode="grayscale"`), which keeps the input at one channel instead of three and reduces computational cost (<b>Grayscale Conversion</b>).

`ImageDataGenerator` is also the Keras entry point for data augmentation (<b>Data Augmentation</b>). In the code below only rescaling is enabled, so no additional images are synthesized from the existing data.
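The effect of the `rescale=1./255` setting can be sketched on a tiny array. The 2x2 `patch` below is a hypothetical stand-in for an image; the arithmetic is an illustration of the scaling step, not the internals of `ImageDataGenerator`.

```python
import numpy as np

# Hypothetical 2x2 grayscale patch with 8-bit pixel values
patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Same arithmetic as rescale=1./255: multiply every pixel by 1/255
scaled = patch.astype(np.float32) * (1.0 / 255)

print(scaled)
print(scaled.min(), scaled.max())
```

After scaling, every pixel lies between 0 and 1 while the image shape is unchanged.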

In [5]:
emotion_labels = sorted(os.listdir(train_path))
print(emotion_labels)
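One detail worth checking: directory-based loading in Keras assigns class indices in alphanumeric order of the folder names, which does not match the original FER-2013 numbering for every class. The sketch below compares the two orderings, assuming the lowercase folder names used elsewhere in this notebook; the actual mapping can be confirmed with `train_generator.class_indices`.

```python
# Original FER-2013 numbering, as listed in the dataset description
fer_labels = {0: "angry", 1: "disgust", 2: "fear", 3: "happy",
              4: "sad", 5: "surprise", 6: "neutral"}

# Sorting the folder names mimics the directory-based index assignment
folder_names = sorted(fer_labels.values())
dir_labels = dict(enumerate(folder_names))

for idx in range(7):
    marker = "" if fer_labels[idx] == dir_labels[idx] else "  <- differs"
    print(idx, fer_labels[idx], dir_labels[idx], marker)
```

Indices 4, 5 and 6 differ between the two schemes, so predictions should be decoded with the directory-based mapping rather than the original numbering.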

In [6]:
batch_size = 64
target_size = (48,48)

train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen   = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_path,
        target_size=target_size,
        batch_size=batch_size,
        color_mode="grayscale",
        class_mode='categorical',
        shuffle=True)

val_generator = val_datagen.flow_from_directory(
        val_path,
        target_size=target_size,
        batch_size=batch_size,
        color_mode="grayscale",
        class_mode='categorical')

# Data Summarization

### Dimensions of the Dataset

In [7]:
input_shape = (48,48,1) # img_rows, img_columns, color_channels
num_classes = 7

### Statistical Summary of all attributes

In [8]:
print(os.listdir('/kaggle/input/fer2013/train'))

### Breakdown of all data class variable

In [9]:
def plot_images(img_dir, top=5):
    """Show the first `top` images found in img_dir (at most 25 fit the grid)."""
    all_img_files = os.listdir(img_dir)
    img_files = [os.path.join(img_dir, file) for file in all_img_files][:top]

    plt.figure(figsize=(12, 12))

    for idx, img_path in enumerate(img_files):
        plt.subplot(5, 5, idx+1)
        img = plt.imread(img_path)
        plt.imshow(img, cmap='gray')

    # Adjust spacing once, after all subplots have been drawn
    plt.tight_layout()

In [10]:
print('Angry: ')
print()
plot_images(train_path+'/angry')

In [11]:
print('Disgust: ')
print()
plot_images(train_path+'/disgust')

In [12]:
print('Fear: ')
print()
plot_images(train_path+'/fear')

In [13]:
print('Happy: ')
print()
plot_images(train_path+'/happy')

In [14]:
print('Neutral: ')
print()
plot_images(train_path+'/neutral')

In [15]:
print('Sad: ')
print()
plot_images(train_path+'/sad')

In [16]:
print('Surprise: ')
print()
plot_images(train_path+'/surprise')

## Data Visualization

In [17]:
emotions = os.listdir('/kaggle/input/fer2013/train')
for emotion in emotions:
    count = len(os.listdir(f'/kaggle/input/fer2013/train/{emotion}'))
    print(f'{emotion} faces={count}')

In [18]:
emotions = os.listdir('/kaggle/input/fer2013/test')
for emotion in emotions:
    count = len(os.listdir(f'/kaggle/input/fer2013/test/{emotion}'))
    print(f'{emotion} faces={count}')

In [19]:
emotions = os.listdir('/kaggle/input/fer2013/train')
values = [len(os.listdir(f'/kaggle/input/fer2013/train/{emotion}')) for emotion in emotions]
fig = plt.figure(figsize = (10, 5))

# creating the bar plot
plt.bar(emotions, values, color ='grey',
        width = 0.4)

plt.xlabel("Emotions")
plt.ylabel("No. of images")
plt.title("Train dataset overview")
plt.show()

In [20]:
emotions = os.listdir('/kaggle/input/fer2013/test')
values = [len(os.listdir(f'/kaggle/input/fer2013/test/{emotion}')) for emotion in emotions]
fig = plt.figure(figsize = (10, 5))

# creating the bar plot
plt.bar(emotions, values, color ='grey',
        width = 0.4)

plt.xlabel("Emotions")
plt.ylabel("No. of images")
plt.title("Test dataset overview")
plt.show()

## Python Packages

1. Numpy
2. Pandas
3. Matplotlib
4. OS
5. Tensorflow
6. Sklearn

# Supervised Learning

## 1. Convolutional Neural Network (CNN)

A convolutional neural network (CNN) is a specialized artificial neural network architecture. CNNs borrow ideas from the organization of the visual cortex and have achieved state-of-the-art results in computer vision tasks.

CNNs are built mainly from two simple elements: convolutional layers and pooling layers.
Each element is straightforward to understand on its own, but there are near-infinite ways to arrange them for a given computer vision problem.
The challenging part in practice is designing a model architecture that makes the best use of these simple elements.

### Building the Model

In [21]:
model = Sequential()
model.add(layers.Conv2D(16,(5,5),padding='valid',input_shape = input_shape)) # Convolutional layers
model.add(layers.Activation('relu')) # activation functions
model.add(layers.MaxPooling2D(pool_size=(2,2),strides=2,padding = 'valid')) # to reduce size of image
model.add(layers.Dropout(0.4))

model.add(layers.Conv2D(32,(5,5),padding='valid'))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D(pool_size=(2,2),strides=2,padding = 'valid'))
model.add(layers.Dropout(0.6))

model.add(layers.Conv2D(64,(5,5),padding='valid'))
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.8))
model.add(layers.Flatten())
model.add(layers.Dense(7)) # classification

model.add(layers.Activation('softmax'))

model.summary()
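The layer sizes that `model.summary()` reports for the CNN can be checked by hand. The sketch below walks a 48x48 input through the 'valid' convolutions and pooling layers defined above; the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output size along one axis for a 'valid' (unpadded) convolution or pooling."""
    return (size - kernel) // stride + 1

size = 48
size = conv_out(size, 5)             # Conv2D 5x5        -> 44
size = conv_out(size, 2, stride=2)   # MaxPooling2D 2x2  -> 22
size = conv_out(size, 5)             # Conv2D 5x5        -> 18
size = conv_out(size, 2, stride=2)   # MaxPooling2D 2x2  -> 9
size = conv_out(size, 5)             # Conv2D 5x5        -> 5

flat = size * size * 64              # Flatten: 5 * 5 * 64 features
dense_params = flat * 7 + 7          # Dense(7): weights plus biases

print(size, flat, dense_params)      # 5 1600 11207
```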

In [22]:
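# Compile Model
# categorical_crossentropy matches the 7-way one-hot labels and the softmax output.
# `decay` is omitted because recent Keras optimizers no longer accept it.
optimizer = keras.optimizers.RMSprop(learning_rate=0.0001)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy', keras.metrics.Precision(), keras.metrics.Recall()])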
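For a seven-class softmax output, categorical cross-entropy is the matching loss. Binary cross-entropy instead treats each of the seven outputs as an independent yes/no decision, and an accuracy computed element-wise in that way can look far better than the true classification accuracy. Depending on the Keras version, `metrics=['accuracy']` may be resolved to this element-wise variant when paired with a binary loss. The sketch below uses one hypothetical prediction to show the gap.

```python
import numpy as np

# Hypothetical one-hot target (class 3 of 7) and a wrong softmax prediction (class 0)
y_true = np.array([[0, 0, 0, 1, 0, 0, 0]], dtype=float)
y_pred = np.array([[0.40, 0.05, 0.05, 0.30, 0.10, 0.05, 0.05]])

# Categorical accuracy: does the arg-max class match the true class?
categorical_acc = np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1))

# Binary accuracy: threshold each of the 7 outputs at 0.5 and compare element-wise
binary_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))

print(categorical_acc)  # the prediction is wrong, so 0.0
print(binary_acc)       # 6 of 7 outputs "agree", so about 0.857
```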

### Training the Model

In [23]:
num_epochs = 300

STEP_SIZE_TRAIN = train_generator.n//train_generator.batch_size
STEP_SIZE_VAL   = val_generator.n//val_generator.batch_size
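Floor division in the step counts means the final, partial batch of each epoch is skipped. A quick check, using the FER-2013 training set size quoted in the dataset description (the actual count is available as `train_generator.n`):

```python
import math

n_train = 28709   # training set size quoted in the dataset description
batch_size = 64

full_batches = n_train // batch_size            # steps used with floor division
leftover = n_train % batch_size                 # images in the skipped partial batch
all_batches = math.ceil(n_train / batch_size)   # steps needed to see every image

print(full_batches, leftover, all_batches)      # 448 37 449
```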

In [None]:
# Train Model
# batch_size is already set on the generators, so it is not passed to fit() again
history = model.fit(train_generator, steps_per_epoch=STEP_SIZE_TRAIN, epochs=num_epochs, validation_data=val_generator, validation_steps=STEP_SIZE_VAL)
history

### k-fold Cross Validation

In [None]:
# kfold = KFold(n_splits=10, shuffle=True, random_state=42)
# cvscores = []

# for train_index, test_index in kfold.split(train_generator, val_generator):
#     model = Sequential()
#     model.add(layers.Conv2D(16,(5,5),padding='valid',input_shape = input_shape))
#     model.add(layers.Activation('relu'))
#     model.add(layers.MaxPooling2D(pool_size=(2,2),strides=2,padding = 'valid'))
#     model.add(layers.Dropout(0.4))

#     model.add(layers.Conv2D(32,(5,5),padding='valid'))
#     model.add(layers.Activation('relu'))
#     model.add(layers.MaxPooling2D(pool_size=(2,2),strides=2,padding = 'valid'))
#     model.add(layers.Dropout(0.6))

#     model.add(layers.Conv2D(64,(5,5),padding='valid'))
#     model.add(layers.Activation('relu'))
#     model.add(layers.Dropout(0.8))
#     model.add(layers.Flatten())
#     model.add(layers.Dense(7))

#     model.add(layers.Activation('softmax'))
    
#     optimizer = keras.optimizers.RMSprop(lr = 0.0001, decay = 1e-6)
#     model.compile(loss = 'binary_crossentropy',optimizer = optimizer, metrics = ['accuracy',keras.metrics.Precision(), keras.metrics.Recall()])
    
#     X_train, X_test = train_generator[train_index], train_generator[test_index]
#     y_train, y_test = val_generator[train_index], val_generator[test_index]
#     history = model.fit(X_train, steps_per_epoch=STEP_SIZE_TRAIN, epochs=num_epochs, batch_size=batch_size, validation_data=y_train, validation_steps=STEP_SIZE_VAL)
    
#     scores = model.evaluate(X[test], y[test], verbose=0)
#     print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
#     cvscores.append(scores[1] * 100)

# print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))
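K-fold splitting works on indexable data such as arrays of file paths and labels, not on batch generators. The sketch below shows the index bookkeeping only, using a NumPy-only stand-in for what `KFold(shuffle=True)` provides; the function name `kfold_indices` and the file names are hypothetical, and model building and training are left as a comment.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs over a shuffled index range."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, n_splits)
    for i in range(n_splits):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, val_idx

# Hypothetical stand-in for the list of image paths
file_paths = np.array([f"img_{i}.png" for i in range(10)])

fold_sizes = []
for fold, (train_idx, val_idx) in enumerate(kfold_indices(len(file_paths))):
    train_files, val_files = file_paths[train_idx], file_paths[val_idx]
    # A fresh model would be built and trained on train_files here,
    # then evaluated on val_files to obtain one score per fold.
    fold_sizes.append((len(train_files), len(val_files)))
    print(fold, len(train_files), len(val_files))
```

Each image appears in exactly one validation fold, so the per-fold scores can be averaged into a single estimate.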

### Save Model

In [None]:
models.save_model(model, 'CNN.h5')

### Evaluate Model

In [None]:
cnn_score = model.evaluate(val_generator, steps=STEP_SIZE_VAL)  # evaluate_generator is deprecated
print('Test loss: ', cnn_score[0])
print('Test accuracy: ', cnn_score[1])

### Show Training History

In [None]:
keys=history.history.keys()
print(keys)

def show_train_history(hisData,train,test): 
    plt.plot(hisData.history[train])
    plt.plot(hisData.history[test])
    plt.title('Training History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

show_train_history(history, 'loss', 'val_loss')
show_train_history(history, 'accuracy', 'val_accuracy')

## 2. Support Vector Machine (SVM)

A Support Vector Machine (SVM) is a supervised learning algorithm that is used mainly for classification but can also be applied to regression. It can handle multiple continuous and categorical variables.

An SVM constructs a hyperplane in a multidimensional feature space to separate the classes. The core idea is to find the maximum-margin hyperplane (MMH): the boundary that leaves the largest possible gap between itself and the nearest training points of each class. The optimizer reaches this hyperplane iteratively by minimizing a margin-based error such as the hinge loss.

In the code below, the SVM is approximated inside Keras: convolutional layers extract features, and an L2-regularized output layer is trained with the squared hinge loss.

### Building the Model

In [None]:
num_epochs = 200
number_of_classes = 7

In [None]:
model = Sequential()
model.add(layers.Conv2D(filters = 32, padding = "same",activation = "relu",kernel_size=3, strides = 2,input_shape=input_shape))
model.add(layers.MaxPool2D(pool_size=(2,2),strides = 2))

model.add(layers.Conv2D(filters = 32, padding = "same",activation = "relu",kernel_size=3))
model.add(layers.MaxPool2D(pool_size=(2,2),strides = 2))

model.add(layers.Flatten())
model.add(layers.Dense(128,activation="relu"))

# The output layer is added in the next cell; a single-unit Dense layer here
# would squeeze all features through one value before the 7-way output.

In [None]:
model.add(layers.Dense(number_of_classes,kernel_regularizer = regularizers.l2(0.01),activation= "softmax"))
model.compile(optimizer="adam",loss="squared_hinge", metrics = ['accuracy'])
model.summary()
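The squared hinge loss used in the compile step can be sketched in NumPy. This follows the usual definition, the mean of max(0, 1 - y * s)^2 with one-hot targets mapped from {0, 1} to {-1, +1}; the function below is an illustration with hypothetical scores, not the Keras implementation itself.

```python
import numpy as np

def squared_hinge(y_true_onehot, scores):
    """Mean of max(0, 1 - y*s)^2 per sample, with targets mapped to {-1, +1}."""
    y = 2.0 * y_true_onehot - 1.0
    return np.mean(np.maximum(0.0, 1.0 - y * scores) ** 2, axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])

confident = np.array([[-1.5, 2.0, -1.0]])  # every score is on the right side of the margin
uncertain = np.array([[0.2, 0.3, 0.1]])    # scores sit inside the margin

print(squared_hinge(y_true, confident))  # [0.]
print(squared_hinge(y_true, uncertain))
```

The loss is zero only when each score clears the margin of 1 on the correct side, which is why outputs confined to the range [0, 1] can never reach zero loss for the correct class.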

### Training the Model

In [None]:
history = model.fit(x=train_generator, steps_per_epoch=STEP_SIZE_TRAIN, epochs=num_epochs, validation_data=val_generator, validation_steps=STEP_SIZE_VAL)  # batch size comes from the generators

### Save Model

In [None]:
models.save_model(model, 'SVM.h5')

### Evaluate Model

In [None]:
svm_score = model.evaluate(val_generator, steps=STEP_SIZE_VAL)  # evaluate_generator is deprecated
print('Test loss: ', svm_score[0])
print('Test accuracy: ', svm_score[1])

### Show Training History

In [None]:
keys=history.history.keys()
print(keys)

def show_train_history(hisData,train,test): 
    plt.plot(hisData.history[train])
    plt.plot(hisData.history[test])
    plt.title('Training History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

show_train_history(history, 'loss', 'val_loss')
show_train_history(history, 'accuracy', 'val_accuracy')

## 3. Artificial Neural Networks (ANN)

ANNs are implemented as a system of interconnected processing elements, called nodes, which are functionally analogous to biological neurons. The connections between nodes carry numerical values, called weights, and by adjusting these values systematically during training, the network is eventually able to approximate the desired function.
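A forward pass through a small fully connected network can be sketched in NumPy to make the role of the weights concrete. The layer sizes, the random weights and the random input below are hypothetical; training would adjust `W1`, `b1`, `W2` and `b2` to reduce the loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row maximum for numerical stability
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# One hypothetical flattened 48x48 image with pixel values in [0, 1)
x = rng.random((1, 48 * 48))

# Randomly initialised weights and zero biases
W1, b1 = rng.normal(0.0, 0.01, (48 * 48, 16)), np.zeros(16)
W2, b2 = rng.normal(0.0, 0.01, (16, 7)), np.zeros(7)

hidden = relu(x @ W1 + b1)          # weighted sum followed by a non-linearity
probs = softmax(hidden @ W2 + b2)   # one probability per emotion class

print(probs.shape)
print(probs.sum())
```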

### Building the Model

In [None]:
model_ann = Sequential()
model_ann.add(layers.Flatten(input_shape=input_shape))  # 48x48x1 image -> 2304-element vector
model_ann.add(layers.Dense(16, activation='relu'))
model_ann.add(layers.Dropout(0.4))
model_ann.add(layers.Dense(32, activation='relu'))
model_ann.add(layers.Dropout(0.6))
model_ann.add(layers.Dense(7, activation='softmax'))

model_ann.summary()

In [None]:
model_ann.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])  # 7-way one-hot labels

In [None]:
history = model_ann.fit(x=train_generator, steps_per_epoch=STEP_SIZE_TRAIN, epochs=num_epochs, validation_data=val_generator, validation_steps=STEP_SIZE_VAL)  # batch size comes from the generators
history

### Save Model

In [None]:
models.save_model(model_ann, 'ANN.h5')

### Evaluate Model

In [None]:
ann_score = model_ann.evaluate(val_generator, steps=STEP_SIZE_VAL)  # evaluate the ANN, not the previous model
print('Test loss: ', ann_score[0])
print('Test accuracy: ', ann_score[1])

### Show Training History

In [None]:
keys=history.history.keys()
print(keys)

def show_train_history(hisData,train,test): 
    plt.plot(hisData.history[train])
    plt.plot(hisData.history[test])
    plt.title('Training History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

show_train_history(history, 'loss', 'val_loss')
show_train_history(history, 'accuracy', 'val_accuracy')

## Comparing accuracy to find best Model

In [None]:
# Accuracy of the different models
names = ['CNN', 'SVM', 'ANN']
compare_acc = [0.51, 0.7224, 0.4436]
# compare_acc = [0.94, 0.23, 0.45]
plt.figure(figsize=(20,6))


plt.barh(names,compare_acc, color=['#FFAEBC', '#FBE7C6', '#B4F8C8'])
for index, value in enumerate(compare_acc):
    plt.text(value, index, str(value))
plt.title('Comparing Accuracy of the Models')
plt.ylabel('Models')    # barh puts the model names on the y-axis
plt.xlabel('Accuracy')

plt.show()