
Deep-Emotion: Facial Expression Recognition Using Attentional Convolutional Network (Tensorflow implementation)

*WIP*

As a personal exercise in reading and implementing SOTA papers, I implemented one of the leading state-of-the-art papers in Facial Expression Recognition (FER), Deep-Emotion. As far as I know, there is no Tensorflow implementation of the paper, so I decided to go with TensorFlow as my framework of choice.

There are, however, a couple of PyTorch versions. The most popular of them is omarSayed7's unofficial implementation, DeepEmotion2019, which I forked and used as a reference.

Architecture

In a nutshell, the paper proposes an attentional CNN that predicts facial expressions by focusing the classifier on the most relevant regions of the input image. This attention mechanism is achieved using a Spatial Transformer Network (STN). The STN works by learning a set of six transformation parameters that are then used to perform an affine transformation of the input image. In this implementation, I used kevinzakka's library to perform the spatial transformation.
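
For intuition, the six parameters are just the entries of a 2×3 affine matrix. Below is a minimal sketch of calling the library with a fixed identity transform; I'm assuming the `spatial_transformer_network` entry point exposed by kevinzakka's `stn` package, and the 48x48 grayscale face size is illustrative:

```python
import numpy as np
import tensorflow as tf
from stn import spatial_transformer_network as transformer  # kevinzakka's stn

# The six parameters flatten a 2x3 affine matrix; [1, 0, 0, 0, 1, 0]
# is the identity transform, i.e. no warp at all.
theta = np.array([[1.0, 0.0, 0.0, 0.0, 1.0, 0.0]], dtype=np.float32)

x = tf.random.uniform((1, 48, 48, 1))  # one dummy grayscale face
warped = transformer(x, theta)         # affine-resampled copy of the input
print(warped.shape)                    # (1, 48, 48, 1)
```

In the actual model, theta is not fixed: a small localization network regresses it from the image, so the network learns where to look.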

In the proposed model, a feature extractor works in parallel with the STN to generate a feature map that is fed to a classification layer for emotion inference.
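
To make that two-branch data flow concrete, here is a hedged sketch. The layer sizes are illustrative, not the exact ones from the paper or this repo, and I'm assuming the seven FER emotion classes:

```python
import tensorflow as tf
from stn import spatial_transformer_network as transformer  # kevinzakka's stn

class AttentionalCNN(tf.keras.Model):
    """Illustrative two-branch model: STN localization + feature extractor."""

    def __init__(self, num_classes=7):
        super().__init__()
        # Localization branch: regress the 6 affine parameters,
        # biased toward the identity transform at initialization.
        self.localization = tf.keras.Sequential([
            tf.keras.layers.Conv2D(8, 3, activation="relu"),
            tf.keras.layers.MaxPool2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(
                6,
                kernel_initializer="zeros",
                bias_initializer=tf.constant_initializer([1, 0, 0, 0, 1, 0]),
            ),
        ])
        # Feature-extraction branch, run in parallel on the same input.
        self.features = tf.keras.Sequential([
            tf.keras.layers.Conv2D(10, 3, padding="same", activation="relu"),
            tf.keras.layers.MaxPool2D(),
        ])
        self.classifier = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])

    def call(self, x):
        theta = self.localization(x)          # (batch, 6) affine parameters
        feats = self.features(x)              # (batch, 24, 24, 10) feature map
        attended = transformer(feats, theta)  # warp features toward the face
        return self.classifier(attended)

model = AttentionalCNN()
probs = model(tf.random.uniform((2, 48, 48, 1)))  # -> shape (2, 7)
```

Whether the learned transform should warp the raw input or the feature map is one of the ambiguities discussed under Contributions below; this sketch warps the feature map.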

Contributions

The model architecture in the PyTorch implementation differs slightly from the one described in the paper (in aspects like input image flow, kernel initialization, regularization, and hyperparameters). I tried to mirror the paper as closely as possible and made the necessary changes. In addition, I worked with a couple of assumptions, as I was unsure of certain specifics of the model architecture as described in the paper. For this reason, the implementation might not be exactly what the authors intended; however, I have added comments in the code at all such places explaining my reasoning.

Datasets

This implementation uses the following datasets:

Prerequisites

Make sure you have the following libraries installed:

  • tensorflow >= 2.13.0
  • stn == 1.0.1
  • pandas
  • pillow
  • tqdm

Repository Structure

This repository is organized as follows:

Usage

Clone the repository and follow these steps.

Environment

This repository was tested using python==3.9.12 and pip==24.0 on a Windows machine. To set up the environment, create and activate a virtual environment (virtualenv --python=python3.9.12 venv, then venv\Scripts\activate) and run:

pip install -r requirements.txt

Download the Data

  1. Download the dataset from Kaggle.
  2. Decompress train.csv and test.csv into the ./data folder within the repo.

Setup the Dataset

Open a terminal and run:

python main.py [-s [True]] [-d [data_path]]

--setup                 Setup the dataset for the first time
--data                  Data folder that contains data files

For example,

python main.py -s True -d data

This will produce images from the .csv files downloaded from Kaggle and split them into training and validation sets.
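
For reference, the conversion itself is conceptually simple. Here is a hedged sketch, assuming a FER2013-style csv where each row carries a `pixels` column of 2304 space-separated grayscale values (the repo's setup code is what actually runs above; the function name and paths here are hypothetical):

```python
import os
import numpy as np
import pandas as pd
from PIL import Image

def csv_to_images(csv_path, out_dir):
    """Turn FER-style rows of space-separated pixels into 48x48 PNG files."""
    os.makedirs(out_dir, exist_ok=True)
    df = pd.read_csv(csv_path)
    for i, row in df.iterrows():
        face = np.array(row["pixels"].split(), dtype=np.uint8).reshape(48, 48)
        Image.fromarray(face, mode="L").save(os.path.join(out_dir, f"{i}.png"))

# e.g. csv_to_images("data/train.csv", "data/train")  # illustrative paths
```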

Train the model

Set hyperparameters

python main.py [-t [True]] [--data [data_path]] [--hparams [True]]
              [--epochs] [--learning_rate] [--batch_size]

--data                  Data folder that contains training and validation files
--train                 True when training
--hparams               True when changing the hyperparameters
--epochs                Number of epochs
--learning_rate         Learning rate value
--batch_size            Training/validation batch size

For example, to specify your own hyperparameters, run:

python main.py -t True -d data --hparams True --epochs 5 --learning_rate 0.005 --batch_size 32

To use default hyperparameters (as specified in the paper), run:

python main.py -t True -d data
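
For readers curious what those flags roughly map to, here is a hedged Keras sketch reusing the AttentionalCNN class sketched in the Architecture section; the optimizer choice and the dummy data are assumptions, not this repo's actual training code:

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins for the real training split produced by the setup step.
x_train = np.random.rand(64, 48, 48, 1).astype("float32")
y_train = np.random.randint(0, 7, size=(64,))

model = AttentionalCNN()  # sketch from the Architecture section above
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),  # --learning_rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(x_train, y_train, epochs=5, batch_size=32)  # --epochs, --batch_size
```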

Samples
