
# Facial Emotion Detection
In this project, you will implement a CNN model from scratch using the deep learning library TensorFlow. The model will be trained to detect human facial emotions from a live camera feed.

If you don't have much experience with OpenCV or CNNs, don't worry: we have shared links to help you learn the basics of both.

## Prerequisites
- **Python**
- **Deep Learning Library (Tensorflow)**
- **OpenCV** : OpenCV is a huge open-source library for computer vision, machine learning, and image processing. Only the basics of OpenCV would be required for this project.
- **VideoGuide** : Go through the dashboard for the link.


## Pathway
- First we need a dataset. Download it from here: https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data?select=fer2013.tar.gz  
- Split the data into training and validation sets; this can be done in a ratio of about 10:1 or less.
- Now augment the image data
- Now, code a CNN model, based on what you learnt from the link in the prerequisites. A block would contain a Conv2D layer, an Activation layer, BatchNorm, MaxPool, and Dropout (optional). Create several such blocks, with each successive block doubling the number of filters in its Conv2D layer. Finally, include a Flatten layer, a Dense layer, and an Activation layer. This, of course, is just a blueprint to give you direction. We'd encourage you to try out different parameters and tinker with the model to gain some more practical knowledge.
- Now come the standard training, testing, and hyperparameter tuning. After this, you can save the model and use it to build the emotion detector.
- Now we'll detect faces using OpenCV. It's pretty simple actually. Read up on the Haar Cascade classifier here: https://docs.opencv.org/3.4/db/d28/tutorial_cascade_classifier.html. It's based on the famous Viola-Jones algorithm. You can use this directly in OpenCV to detect faces in a live stream.
- Select the RoI and use your model to classify the expression. You can make a bounding box and put text on it as well.
- Done!
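
The split step in the pathway can be sketched with NumPy alone. This is a minimal illustration on synthetic indices, assuming a 10:1 ratio; the helper name `split_indices` and the seed are hypothetical, and the solution below relies instead on the `Usage` column that ships with the dataset.

```python
import numpy as np

def split_indices(n_samples, val_fraction=1 / 11, seed=0):
    """Shuffle sample indices and split them into train and validation parts.

    val_fraction=1/11 corresponds to a train:validation ratio of 10:1.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    # The first n_val shuffled indices form the validation set, the rest the training set.
    return shuffled[n_val:], shuffled[:n_val]

train_idx, val_idx = split_indices(1100)
print(len(train_idx), len(val_idx))
```

Shuffling before splitting keeps the two sets from inheriting any ordering present in the file.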

## Learning Outcome
- Convolutional Neural Networks using Tensorflow Keras
- Data Augmentation
- Hyperparameter Tuning
- Applications of OpenCV for Building Face Detector

## Solution Code

Import the necessary libraries, including tensorflow, keras, sklearn and opencv.

In [1]:
# import necessary libraries
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
import keras
import tensorflow as tf
from keras.utils import np_utils
from keras.models import Sequential 
from keras.layers import Dense, Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten

Importing the dataset, sourced from Kaggle.

In [3]:
# mounting drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Reading the CSV file.

In [4]:
file_path='/content/drive/MyDrive/Alma Better Pro/Alma Better Pro Program/Module 4: Machine Learning/Data Sets/icml_face_data.csv'

df=pd.read_csv(file_path) #facial emotion data stored as a dataframe in df

Let's look into our dataset

In [9]:
df.columns

Index(['emotion', ' Usage', ' pixels'], dtype='object')

So the dataset is not all that complicated. Note that the ' Usage' and ' pixels' column names start with a space.

We've got an 'emotion' field holding the emotion label of each image.

I believe, since this dataset was used in a Kaggle competition, that they've gone ahead and done the training, validation and test segregation, which is what the ' Usage' field records.

And finally, in the ' pixels' field, we have the pixel values of the image, flattened into a 1-dimensional array.
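
The column names printed above carry a leading space (`' Usage'`, `' pixels'`), which is easy to miss when indexing. As an optional sketch on a toy frame (the frame `toy` is hypothetical), the names could be stripped as follows. The rest of this notebook keeps the original names, so this step is not applied to `df`.

```python
import pandas as pd

# Toy frame that mimics the column names printed above (note the leading spaces).
toy = pd.DataFrame({"emotion": [0], " Usage": ["Training"], " pixels": ["0 0 0"]})

# Strip surrounding whitespace from every column name; the data itself is untouched.
cleaned = toy.rename(columns=str.strip)
print(list(cleaned.columns))
```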

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35887 entries, 0 to 35886
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   emotion  35887 non-null  int64 
 1    Usage   35887 non-null  object
 2    pixels  35887 non-null  object
dtypes: int64(1), object(2)
memory usage: 841.2+ KB


The pixel field is stored as a string at the moment. We'll have to convert that into a 48x48x1 array with float values.

In [5]:
#convert each pixel string into a float array, then reshape it into an image
df[' pixels']=df[' pixels'].apply(lambda x: np.fromstring(x, sep=' ',dtype='float32')) #parse the space-separated string into float32 values
df[' pixels']=df[' pixels'].apply(lambda x: x.reshape(48,48,1)) #reshape to 48x48x1
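
To make the conversion concrete, here is a small sketch on a synthetic pixel string. The values are made up, and `str.split` is used here in place of `np.fromstring` purely for illustration.

```python
import numpy as np

# Synthetic stand-in for one row of the pixels column: 48*48 = 2304 grey values.
pixel_string = " ".join(str(v % 256) for v in range(48 * 48))

# Split on whitespace, convert to float32, then reshape to height x width x channels.
image = np.array(pixel_string.split(), dtype="float32").reshape(48, 48, 1)
print(image.shape, image.dtype)
```

The trailing dimension of 1 is the single greyscale channel that Conv2D layers expect.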

Let's look into emotions field.

In [6]:
df['emotion'].unique()

array([0, 2, 4, 6, 3, 5, 1])

The dataset contains 7 different emotions, labeled with the numbers 0 to 6.
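
For readable plots and on-screen text, the numeric labels can be mapped to names. The mapping below is an assumption taken from the Kaggle FER2013 data description, not something stated in this notebook, and `label_name` is a hypothetical helper.

```python
# Label names as given in the Kaggle FER2013 data description
# (assumption: the notebook itself only shows the integers 0 to 6).
EMOTION_LABELS = {
    0: "Angry",
    1: "Disgust",
    2: "Fear",
    3: "Happy",
    4: "Sad",
    5: "Surprise",
    6: "Neutral",
}

def label_name(index):
    """Return the emotion name for an integer class label."""
    return EMOTION_LABELS[int(index)]

print(label_name(3))
```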

Let's segregate our training, validation and test sets right away. 

Public Test would be put into validation and Private test into test set. 

In [7]:
#defining train, val and test data.
training_data=df[df[' Usage']=='Training']
validation_data=df[df[' Usage']=='PublicTest']
test_data=df[df[' Usage']=='PrivateTest']

In [8]:
training_data['emotion'].value_counts()

3    7215
6    4965
4    4830
2    4097
0    3995
5    3171
1     436
Name: emotion, dtype: int64

We can see a considerable imbalance between emotions in the dataset: emotion 3 has 7215 training samples, while emotion 1 has only 436. This can bias the model towards the frequent classes and result in misclassification.

We can treat this issue by data augmentation.

Using keras for data augmentation.

In [9]:
# Data Augmentation
from keras import layers
data_augmentation=keras.Sequential(
    [
     layers.RandomFlip('horizontal'), #introduce some random flips on horizontal axis
     layers.RandomRotation(0.015), #random rotation; the factor is a fraction of a full turn, so 0.015 is about +/-5.4 degrees
     layers.RandomZoom(0.15,0.15) #introduce some random zoom
    ]
)
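
Two of these transformations can be illustrated without Keras. Keras documents the `factor` argument of `RandomRotation` as a fraction of a full turn (2π), so 0.015 corresponds to rotations within roughly ±5.4°. The sketch below, with the hypothetical helper `rotation_factor_to_degrees`, checks that arithmetic and shows a horizontal flip on a tiny array in plain NumPy.

```python
import numpy as np

def rotation_factor_to_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) to degrees."""
    return factor * 360.0

# Tiny 2x3 single-channel "image" so the effect of a horizontal flip is easy to read.
image = np.arange(6, dtype="float32").reshape(2, 3, 1)
flipped = image[:, ::-1, :]  # reverse the width axis, leaving rows and channels alone

print(rotation_factor_to_degrees(0.015))
print(image[..., 0])
print(flipped[..., 0])
```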

Applying data augmentation to the training samples so that every emotion class ends up with the same number of samples.

In [10]:
aug_df=pd.DataFrame({}) #new dataframe that will hold the augmented images
emotion=[] #list for emotion labels
aug_pixels=[] #list for augmented image pixels
sample_length=7500 #desired number of samples per emotion (arbitrary choice)

for i in range(7): #one pass per emotion label (0 to 6)
  data=training_data[training_data['emotion']==i] #all training images with label i
  j=sample_length - len(data) #number of extra samples needed to reach 7500
  for k in range(j):
    ind= k%len(data) #cycle through the class's images, wrapping around when they run out
    augmented_image=data_augmentation(data[' pixels'][data.index[ind]]) #augment one image
    aug_pixels.append(augmented_image) #store the augmented image
    emotion.append(i) #store the matching emotion label

aug_df[' pixels'] = aug_pixels #column containing augmented pixels
aug_df['emotion']= emotion #corresponding emotion label
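
The `k%len(data)` indexing in the loop above decides which original image each augmented copy is derived from. A minimal pure-Python sketch (the helper `source_indices` is hypothetical) shows the pattern on a class of 3 images that needs 10:

```python
def source_indices(class_size, target_size):
    """Positions of the original images that get augmented to fill a class."""
    n_missing = target_size - class_size
    # The modulo wraps back to the first image once every image has been used.
    return [k % class_size for k in range(n_missing)]

print(source_indices(class_size=3, target_size=10))
```

With the real numbers for emotion 1 (436 originals, 7064 augmented copies), the same pattern reuses each original image 16 or 17 times.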

In [11]:
#count of augmented images that will be added for each emotion
aug_df['emotion'].value_counts()

1    7064
5    4329
0    3505
2    3403
4    2670
6    2535
3     285
Name: emotion, dtype: int64
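
These counts are simply 7500 minus the original training count of each emotion. A short check, with the training counts copied by hand from the earlier `value_counts()` output:

```python
import pandas as pd

# Training counts per emotion, copied from the value_counts() output earlier.
train_counts = pd.Series({3: 7215, 6: 4965, 4: 4830, 2: 4097, 0: 3995, 5: 3171, 1: 436})
target_per_class = 7500

# Number of augmented images each emotion needs to reach the target.
needed = (target_per_class - train_counts).sort_values(ascending=False)
print(needed)
```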

In [12]:
# concat train data and augmented df
training_data=pd.concat([training_data,aug_df],axis=0)
training_data['emotion'].value_counts()

0    7500
2    7500
4    7500
6    7500
3    7500
5    7500
1    7500
Name: emotion, dtype: int64

Since the indexes of the augmented data and the original training data don't match up, we need to set up a proper index for the new training dataframe.

In [13]:
training_data=training_data.reset_index(drop=True) #renumber the rows from 0 to len-1
training_data.tail()

| | emotion | Usage | pixels |
|---|---|---|---|
| 52495 | 6 | NaN | (((tf.Tensor(68.40772, shape=(), dtype=float32... |
| 52496 | 6 | NaN | (((tf.Tensor(86.45305, shape=(), dtype=float32... |
| 52497 | 6 | NaN | (((tf.Tensor(141.01285, shape=(), dtype=float3... |
| 52498 | 6 | NaN | (((tf.Tensor(198.45676, shape=(), dtype=float3... |
| 52499 | 6 | NaN | (((tf.Tensor(166.83704, shape=(), dtype=float3... |
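
The index clash that motivates this step is easy to reproduce on a toy example. The frames below are hypothetical; `ignore_index=True` is an alternative way to get a clean 0..n-1 index at concatenation time.

```python
import pandas as pd

original = pd.DataFrame({"emotion": [0, 1]}, index=[0, 5])  # index inherited from a larger frame
augmented = pd.DataFrame({"emotion": [1, 1]})               # fresh default index 0, 1

# Plain concatenation keeps both indexes, so label 0 now appears twice.
clashing = pd.concat([original, augmented], axis=0)
print(clashing.index.tolist())

# ignore_index=True discards both indexes and numbers the rows 0..n-1.
combined = pd.concat([original, augmented], axis=0, ignore_index=True)
print(combined.index.tolist())
```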


So we have 52500 datapoints now (indexed 0 to 52499). Ignore the NaN values in Usage; we know these rows are training data.

Preprocessing X_train using keras

In [40]:
x_train=[]
for i in training_data[' pixels']:
  x_train.append(i)
x_train=np.asarray(x_train)
x_train=x_train.reshape(len(x_train),48,48,1)
y_train=np.array(training_data['emotion'])
y_train=y_train.astype(int)
y_train=np_utils.to_categorical(y_train,7)
# shape of training data
x_train.shape, y_train.shape

((52500, 48, 48, 1), (52500, 7))
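
The second shape, `(52500, 7)`, comes from one-hot encoding the labels. As a rough NumPy equivalent of that step (not the Keras implementation; `one_hot` is a hypothetical helper):

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer class labels by picking rows of an identity matrix."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([0, 3, 6], num_classes=7)
print(encoded.shape)
print(encoded.argmax(axis=1))  # argmax recovers the original integer labels
```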

Preprocessing X_val using keras

In [41]:
x_val = []
for i in validation_data[' pixels']:
  x_val.append(i)
x_val=np.asarray(x_val)
x_val= x_val.reshape(len(x_val),48,48,1)
y_val=np.array(validation_data['emotion'])
y_val=y_val.astype(int)
y_val=np_utils.to_categorical(y_val,7)



x_val.shape,y_val.shape

((3589, 48, 48, 1), (3589, 7))

Preprocessing X_test using keras

In [None]:
x_test = []
for i in test_data[' pixels']:
  x_test.append(i)
x_test=np.asarray(x_test)
x_test=x_test.reshape(len(x_test),48,48,1)
y_test=np.array(test_data['emotion'])
y_test=y_test.astype(int)
y_test=np_utils.to_categorical(y_test,7)

# shape of test data
x_test.shape,y_test.shape

((3589, 48, 48, 1), (3589, 7))
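
The three preprocessing cells above repeat the same steps, so they could be folded into one helper. The sketch below is one possible refactoring, not the notebook's own code: `to_arrays` is a hypothetical name, and the one-hot step uses NumPy instead of `np_utils.to_categorical` to keep the example self-contained.

```python
import numpy as np

def to_arrays(images, labels, num_classes=7):
    """Stack per-sample 48x48x1 images and one-hot encode their integer labels."""
    x = np.stack([np.asarray(img, dtype="float32") for img in images])
    x = x.reshape(len(x), 48, 48, 1)
    y = np.eye(num_classes, dtype="float32")[np.asarray(labels, dtype=int)]
    return x, y

# Synthetic stand-in for one split: two blank images with labels 2 and 5.
toy_images = [np.zeros((48, 48, 1)), np.ones((48, 48, 1))]
toy_labels = [2, 5]
x_toy, y_toy = to_arrays(toy_images, toy_labels)
print(x_toy.shape, y_toy.shape)
```

With the notebook's data, a call might look like `to_arrays(test_data[' pixels'], test_data['emotion'])`.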