# Facial Expression Recognition

### Abstract

This project detects basic facial expressions, such as happiness, surprise, sadness, and anger, in an input image. First, the important facial landmarks are extracted. The locations of these landmarks are then used to construct features for training a Support Vector Machine (SVM) classifier on a labeled dataset. The resulting classifier can be used to predict the facial expression in an image.

### Dataset
To use a supervised ML algorithm, a dataset of face images with appropriate labels is needed. This project uses the well-known Cohn-Kanade dataset. Since it is a private dataset, a free version of it found on [GitHub](https://github.com/spenceryee/CS229/tree/master/CK%2B) is used. This version consists of 8 directories, each containing face images of a specific emotion: anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. The sets contain different numbers of images, with neutral being the largest. Having roughly the same number of images in each set might help achieve better results.
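
One simple way to even out the set sizes is to randomly downsample every emotion to the size of the smallest set. The sketch below only illustrates that idea and is not part of the project code; the `balance_classes` helper and the file names are hypothetical.

```python
import random

def balance_classes(files_by_emotion, seed=0):
    """Randomly downsample every class to the size of the smallest class."""
    rng = random.Random(seed)
    smallest = min(len(files) for files in files_by_emotion.values())
    return {emotion: rng.sample(files, smallest)
            for emotion, files in files_by_emotion.items()}

# Hypothetical file lists, for illustration only
files_by_emotion = {
    "neutral": [f"neutral/{i}.png" for i in range(10)],
    "anger": [f"anger/{i}.png" for i in range(4)],
    "surprise": [f"surprise/{i}.png" for i in range(6)],
}
balanced = balance_classes(files_by_emotion)
print({emotion: len(files) for emotion, files in balanced.items()})
# {'neutral': 4, 'anger': 4, 'surprise': 4}
```

Downsampling discards data from the larger sets, so whether it actually improves the results would have to be checked experimentally.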

### Facial Landmarks
We need some features from each face image to train a classifier. For this, the dlib library is first used to detect faces in an image. Then, 68 facial landmarks are determined for each detected face using dlib's shape predictor. Further information on how the landmarks are extracted can be found [here](https://ieeexplore.ieee.org/document/6248015). The model file used to predict the landmarks can be found [here](https://github.com/italojs/facial-landmarks-recognition/blob/master/shape_predictor_68_face_landmarks.dat).

### Feature Extraction
Since the facial expression of a person largely depends on the locations of the facial landmarks relative to each other, these locations are good candidates for the feature vector. To capture the relative positions, the landmarks are expressed with respect to a new reference point, which is the mean of all 68 facial landmark points.
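
As a minimal sketch of this re-centering step, the snippet below shifts a few made-up points so that their mean becomes the origin. The `center_on_mean` helper is hypothetical and uses three points instead of the 68 landmarks to keep the numbers easy to follow.

```python
def center_on_mean(points):
    """Shift (x, y) points so that their mean becomes the origin."""
    xmean = sum(x for x, _ in points) / len(points)
    ymean = sum(y for _, y in points) / len(points)
    return [(x - xmean, y - ymean) for x, y in points]

# Three made-up points standing in for the 68 landmarks
points = [(10.0, 20.0), (30.0, 20.0), (20.0, 50.0)]
centered = center_on_mean(points)
print(centered)  # [(-10.0, -10.0), (10.0, -10.0), (0.0, 20.0)]

# The same shape somewhere else in the image gives the same coordinates
shifted = [(x + 100.0, y + 40.0) for x, y in points]
print(center_on_mean(shifted) == centered)  # True
```

The second check shows why the mean point is a convenient reference: the centered coordinates do not depend on where the face sits in the image.
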
In some images, the face might not be upright and might be tilted with respect to the vertical axis. To solve this issue, all images in the dataset are rotated so that the faces are aligned with the vertical axis. To determine the rotation angle, 4 of the 68 facial landmarks are considered. These points are located on the nose and help determine the desired rotation angle. In the picture below, the 4 points (27, 28, 29, 30) are shown.


![](data/jupyter-data/image1.png)
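
To illustrate how a tilt angle can be read off the nose points, here is a small sketch that measures the angle between the line from the top nose point to the bottom nose point and the vertical axis. It uses `math.atan2`, which is a swapped-in alternative to the `math.atan` based calculation in `Landmark_Detector.py`, so the sign convention may differ from the project code. The `nose_tilt_degrees` helper and the coordinates are hypothetical.

```python
import math

def nose_tilt_degrees(top, bottom):
    """Angle between the nose bridge (top -> bottom) and the vertical axis.

    Points are (x, y) in image coordinates, where y grows downwards.
    Returns 0 for an upright face and a positive value when the bottom
    point lies to the right of the top point.
    """
    dx = bottom[0] - top[0]
    dy = bottom[1] - top[1]
    return math.degrees(math.atan2(dx, dy))

print(nose_tilt_degrees((100, 80), (100, 120)))            # 0.0
print(round(nose_tilt_degrees((100, 80), (140, 120)), 1))  # 45.0
```

Because `atan2` takes the two differences separately, no special case is needed for a perfectly vertical nose.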


After rotating each facial landmark by the calculated rotation angle, new coordinates are calculated for each landmark with respect to the new reference point (the mean point). Next, the polar coordinates of the points are calculated: first the distance between each point and the reference point, then the angle of each point with respect to the horizontal axis. The idea of using these features is inspired by [this paper](https://arxiv.org/pdf/1603.09129.pdf).
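
The polar conversion can be sketched as follows. This is an illustration under stated assumptions rather than the project's exact code: it uses `math.atan2` and `math.hypot`, the `to_polar` helper is hypothetical, and `rotation_degrees` is assumed to be the face rotation measured in the same direction as theta.

```python
import math

def to_polar(x, y, rotation_degrees=0.0):
    """Polar coordinates of a mean-centered point, with the face rotation removed."""
    r = math.hypot(x, y)                                      # distance to the mean point
    theta = math.degrees(math.atan2(y, x)) - rotation_degrees  # angle to the horizontal axis
    return r, theta

r, theta = to_polar(3.0, 4.0)
print(r, round(theta, 2))  # 5.0 53.13

# A face rotated by 10 degrees: the correction is subtracted from theta
print(to_polar(0.0, 2.0, rotation_degrees=10.0))  # (2.0, 80.0)
```

The distance r is unaffected by rotation, so only theta needs the correction.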

### Feature Vector
The feature vector for each face consists of 4 x 68 = 272 numbers. For each of the 68 landmark points, it holds the x coordinate and the y coordinate (Cartesian coordinates with respect to the mean point), the distance (r in polar coordinates with respect to the mean point), and the angle (theta in polar coordinates with respect to the mean point).
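
A minimal sketch of how such a vector can be assembled is shown below. It stores the four values of each point next to each other (x, y, r, theta), which is the order used in `Landmark_Detector.py`, but it leaves out the tilt correction for brevity and uses `math.atan2`. The `build_feature_vector` helper and the synthetic points are hypothetical.

```python
import math

def build_feature_vector(points):
    """Flatten landmarks into [x, y, r, theta] per point, relative to their mean."""
    xmean = sum(x for x, _ in points) / len(points)
    ymean = sum(y for _, y in points) / len(points)
    features = []
    for x, y in points:
        dx, dy = x - xmean, y - ymean
        features.extend([dx, dy, math.hypot(dx, dy),
                         math.degrees(math.atan2(dy, dx))])
    return features

# 68 synthetic points on a circle stand in for real landmarks
points = [(100 + 50 * math.cos(i), 100 + 50 * math.sin(i)) for i in range(68)]
print(len(build_feature_vector(points)))  # 272
```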

### Classification
The data is split into two parts: training and prediction (test). 90% of the images in each emotion set are used for training and the remaining 10% for the prediction/test phase. The images in each set are shuffled before they are assigned to the training or prediction set.
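
The shuffle-and-split step could look roughly like the sketch below. The project's own `Shuffle` module is not shown here, so this is only an assumption about its behavior; the `split_files` helper, its parameters, and the file names are hypothetical.

```python
import random

def split_files(files, train_fraction=0.9, seed=None):
    """Shuffle a list of files and split it into training and prediction parts."""
    shuffled = list(files)                 # copy so the input list is not modified
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

files = [f"happiness/{i}.png" for i in range(20)]  # made-up file names
training, prediction = split_files(files, seed=42)
print(len(training), len(prediction))  # 18 2
```

Splitting each emotion separately keeps roughly the same 90/10 ratio in every class, and passing a fixed `seed` makes the split reproducible.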


### Main Files
##### SVM_Data.py
This file prepares the training and prediction sets from the dataset.

In [None]:
import cv2
import numpy as np
import Landmark_Detector
import Shuffle

def extract_feature(item):
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    # read image
    image = cv2.imread(item)
    # convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # increase the contrast
    clahe_image = clahe.apply(gray)
    # extract features
    features_vector = Landmark_Detector.Landmark_Detector(clahe_image)
    return features_vector

def SVM_Data(dataset):
    emotions = ["neutral", "anger", "contempt", "disgust",
                "fear", "happiness", "sadness", "surprise"]
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emotions:
        print("emotion: ", emotion)
        # shuffle the set
        training, prediction = Shuffle.Shuffle(emotion, dataset)
        print("  training ...")
        for item in training:
            features_vector = extract_feature(item)
            # skip images in which no face was detected
            if features_vector != "error":
                # append the first face's feature vector to the training data
                training_data.append(features_vector[0])
                training_labels.append(emotions.index(emotion))
        print("  prediction ...")
        for item in prediction:
            features_vector = extract_feature(item)
            # skip images in which no face was detected
            if features_vector != "error":
                # append the first face's feature vector to the prediction data
                prediction_data.append(features_vector[0])
                prediction_labels.append(emotions.index(emotion))
        print()
    return training_data, training_labels, prediction_data, prediction_labels

##### Landmark_Detector.py
This file contains the details of detecting the facial landmarks and preparing them for inclusion in the feature vector.

In [None]:
import numpy as np
import dlib
import math

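# load the face detector and the landmark predictor once, not on every call
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(
    "data/shape_predictor_68_face_landmarks.dat")

def Landmark_Detector(image):
    features_vector = []
    # detect faces, upsampling the image once to find smaller faces
    detections = detector(image, 1)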
    # repeat for all detected faces
    for d in detections:
        # facial landmarks with the predictor class of the dlib library
        shape = predictor(image, d)
        xsequence = []
        ysequence = []
        # save x and y coordinates of all 68 landmarks (dlib indices 0-67)
        for i in range(68):
            xsequence.append(float(shape.part(i).x))
            ysequence.append(float(shape.part(i).y))
        # mean of all landmarks' x and y coordinates
        xmean = np.mean(xsequence)
        ymean = np.mean(ysequence)
        # coordinates of each point relative to the mean point
        xrelative = [(x-xmean) for x in xsequence]
        yrelative = [(y-ymean) for y in ysequence]
        # landmarks 27 and 30 are the top and bottom points on the nose
        if xsequence[27] == xsequence[30]:
            # nose is exactly vertical: no tilt (also prevents 'divide by 0')
            anglenose = 0
        else:
            # angle of the nose line with respect to the horizontal axis
            anglenose = int(math.atan(
                (ysequence[27]-ysequence[30])/(xsequence[27]-xsequence[30]))*180/math.pi)
            # convert it to the tilt with respect to the vertical axis
            if anglenose < 0:
                anglenose += 90
            else:
                anglenose -= 90
        # create feature vector
        feature_vector = []
        for x, y, w, z in zip(xrelative, yrelative, xsequence, ysequence):
            feature_vector.append(x)
            feature_vector.append(y)
            meannp = np.asarray((ymean, xmean))
            coornp = np.asarray((z, w))
            dist = np.linalg.norm(coornp-meannp)
            if (w - xmean == 0):
                if (z - ymean > 0):
                    anglerelative = 90 - anglenose
                else:
                    anglerelative = -90 - anglenose
            else:
                anglerelative = (math.atan((z-ymean)/(w-xmean))
                                 * 180/math.pi) - anglenose
            feature_vector.append(dist)
            feature_vector.append(anglerelative)
        # append to features_vector
        features_vector.append(feature_vector)
    # return error if no face is detected
    if len(detections) < 1:
        features_vector = "error"
    return features_vector

### Results
Below you can see the confusion matrix and the resulting accuracy:

Confusion Matrix:

![](data/jupyter-data/image2.png)

Accuracy: 0.93103448
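
For reference, the accuracy is the share of correctly classified images, that is, the sum of the diagonal of the confusion matrix divided by the sum of all its entries. The sketch below shows the calculation on a made-up 3-class matrix; the numbers are not the project's results and the `accuracy_from_confusion` helper is hypothetical.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified samples (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up 3-class matrix (rows: true label, columns: predicted label)
matrix = [
    [8, 1, 1],
    [0, 9, 1],
    [1, 0, 9],
]
print(round(accuracy_from_confusion(matrix), 3))  # 0.867
```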

### Test on new images

Image 1:

![](data/jupyter-data/test_pic2.jpg)

### Links to other materials
1. Link to YouTube Video: https://youtu.be/7ht-pKO7nGI

2. Link to Supplementary Materials: https://github.com/am-nu-fall-22/emotion-recognition