This repository contains the source code for neural networks used in facial detection, emotion recognition, and the overarching framework of engagement detection. Currently, emotion detection has been implemented using the `fer2013` dataset, and can be used for image and live video classification through the `videoclassification.py` and `emotionclassification.py` scripts.
The `facevideo.py` file contains live facial detection from the computer webcam; the comments at the top of the file contain more information on how to use the different detectors. The `facedetect.py` script also performs facial detection, but it detects faces in input images rather than a live video feed.
The repository also contains multiple convolutional neural networks for facial emotion recognition. They are still in progress, but the general usage is as follows: train the model from the `trainmodel.py` file, and test the model using the `testmodel.py` file. For information on the neural network models being used, see the Neural Network Information section below.
NOTE: Before using anything in this repository, please visit the `data` directory and read the instructions there on downloading any necessary files and the location of saved files.
To use the repository, it can be cloned directly from the command line:

```sh
git clone --recurse-submodules https://github.com/amogh7joshi/engagement-detection.git
```
For setup, a Makefile is provided:

```sh
make install
```

Or, you can manually run:

```sh
# Install system requirements.
python3 -m pip install -r requirements.txt
```
In either case, you should delete the `.cloud` directory; either the Makefile will do it for you, or you can delete it manually. It contains operations that I use personally when working with the Google API, so unless you are working with the same Google APIs, you should delete it.
Then, use the scripts provided in the `scripts` directory to install the necessary data:

- To install the model and caffemodel files for the DNN, use the `getdata.sh` script.
- Download the `fer2013.csv` file from here, and follow the directions in the `data` subdirectory.
- Optionally, also download the `ck+` dataset from here, and follow the directions in the `data` subdirectory.
- Run the `preprocess.sh` script. It may take a couple of minutes.
Once the datasets are preprocessed, they can be loaded through the following functions:

```python
from data.load_data import get_fer2013_data
from data.load_data import get_ckplus_data

# Load the training, validation, and testing data (repeat with other datasets).
X_train, X_validation, X_test, y_train, y_validation, y_test = get_fer2013_data()
```
For more information, visit the `data` subdirectory.
The other tools in the Makefile, along with the `editconstant.sh` script, are for convenience when committing to this repository. Do not use them unless you are committing to your own repository.
The `info.json` file contains the relevant locations of the cascade classifiers and DNN model files. You can replace the current locations with those on your computer, and then load the detectors as follows:

```python
from util.info import load_info

# Set the `eyes` option to True if you want to load the eye cascade.
cascade_face, cascade_eyes, net = load_info(eyes = True)
```
Currently, all models have been configured to work with the `fer2013` and `ck+` datasets.
Model Training: Run the `trainmodel.py` script. You can edit the number of epochs in the argparse argument at the top of the file, or run the script from the command line using the flags described by the argparse arguments. Model weights will be saved to the `data/model` directory, and when training completes, the best model will be moved into the `data/savedmodels` directory, along with the JSON file containing the model architecture. You can manually control which models to keep in the `data/savedmodels` directory.
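For orientation, the core of the training workflow resembles the following Keras sketch. This is a rough illustration, not the exact contents of `trainmodel.py`: the placeholder model, the checkpoint filename pattern, and the epoch/batch-size values are all assumptions.

```python
import os

from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import ModelCheckpoint

from data.load_data import get_fer2013_data

# Load the preprocessed fer2013 splits (loader provided by this repository).
X_train, X_validation, X_test, y_train, y_validation, y_test = get_fer2013_data()

# Placeholder model standing in for the repository's actual architecture;
# 48x48 grayscale inputs and 7 emotion classes are assumptions based on fer2013.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (48, 48, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.GlobalAveragePooling2D(),
    layers.Dense(7, activation = 'softmax')
])
# One-hot labels are assumed here; use sparse_categorical_crossentropy for integer labels.
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

# Save improving weights to data/model, mirroring the behavior described above.
checkpoint = ModelCheckpoint(
    os.path.join('data', 'model', 'weights-{epoch:02d}-{val_accuracy:.2f}.hdf5'),
    monitor = 'val_accuracy', save_best_only = True)

model.fit(X_train, y_train,
          validation_data = (X_validation, y_validation),
          epochs = 50, batch_size = 32, callbacks = [checkpoint])
```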
Model Testing: Run the `testmodel.py` script. You can choose which model weights and architecture to use at the location indicated at the top of the file. From there, you can run `model.evaluate` on the pre-loaded training and testing data, run `model.predict` on any custom images you want to test, or run any other operations with the model. A confusion matrix is also generated, and it will be displayed if `plt.show()` is uncommented.
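A minimal evaluation sketch along those lines, assuming one-hot encoded labels and using the repository's model loader with a placeholder model name:

```python
import numpy as np
import matplotlib.pyplot as plt

from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

from data.load_data import get_fer2013_data
from models.model_factory import load_json_model

X_train, X_validation, X_test, y_train, y_validation, y_test = get_fer2013_data()

# '<insert-model-name>' is a placeholder for a model saved in data/savedmodels.
model = load_json_model('<insert-model-name>', compile = 'default')

# Overall accuracy on the held-out test split.
loss, accuracy = model.evaluate(X_test, y_test, verbose = 0)
print(f'Test accuracy: {accuracy:.3f}')

# Confusion matrix over predicted and true emotion classes.
predictions = np.argmax(model.predict(X_test), axis = 1)
matrix = confusion_matrix(np.argmax(y_test, axis = 1), predictions)
ConfusionMatrixDisplay(matrix).plot()
plt.show()
```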
Live Emotion Detection: Run the `videoclassification.py` script. If you already have a trained model, set it at the top of the script, and the script will detect emotions live. For facial detection only, run the `facevideo.py` script. You can choose which detector to use, as described at the top of the file. If you want to save images, set the `-s` flag to `True`, and they will be saved to a custom top-level directory named `imageruntest`. More information is included at the top of the file.
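For reference, the core of a live detection loop with OpenCV looks roughly like the following. This is a minimal sketch, not the exact contents of `facevideo.py`; the cascade parameters and window name are illustrative.

```python
import cv2

from util.info import load_info

# Load the Haar cascades and DNN configured in info.json.
cascade_face, cascade_eyes, net = load_info(eyes = True)

capture = cv2.VideoCapture(0)
while True:
    ret, frame = capture.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces in the current frame and draw bounding boxes.
    for (x, y, w, h) in cascade_face.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('Face Detection', frame)
    # Press 'q' to exit the loop.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
capture.release()
cv2.destroyAllWindows()
```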
Image Emotion Detection: Run the `emotionclassification.py` script. Choose the images you want to detect emotions on and place their paths in the `userimages` variable; if running from the command line, write out the paths to each of the images when running the script. Optionally, if you just want facial detection, run the `facedetect.py` script. If running from the command line, read the argument information at the top of the file; otherwise, insert the paths of the images you want to detect faces in into a list called `user_images` midway through the file. The modified images will be saved to a custom directory called `modded`, which you can change through the `savedir` variable. For each input image, the script will output the same image with a bounding box around each detected face.
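For a single image, the emotion classification step essentially crops each detected face, resizes it to the model's input size, and takes the argmax of the prediction. A minimal sketch, assuming a trained model loaded through the repository's loader, 48x48 grayscale inputs, and the standard fer2013 label order; the image path, model name, and normalization are assumptions.

```python
import os

import cv2
import numpy as np

from models.model_factory import load_json_model
from util.info import load_info

# Standard fer2013 label order (assumed).
EMOTIONS = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

cascade_face, cascade_eyes, net = load_info(eyes = True)
model = load_json_model('<insert-model-name>', compile = 'default')  # Placeholder name.

image = cv2.imread('example.jpg')  # Hypothetical image path.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in cascade_face.detectMultiScale(gray, 1.3, 5):
    # Crop and resize the face region, then normalize to [0, 1] (assumed preprocessing).
    face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
    prediction = model.predict(face.reshape(1, 48, 48, 1))
    label = EMOTIONS[int(np.argmax(prediction))]
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

# Save the annotated image to the output directory described above.
os.makedirs('modded', exist_ok = True)
cv2.imwrite(os.path.join('modded', 'example.jpg'), image)
```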
The model architecture I am currently using for the emotion recognition convolutional neural network (shown on the upper right) uses a very basic inception architecture, with each block containing a triple-convolution-and-max-pooling branch and an average-pooling branch.
Previously, I had used a model (shown on the upper left) roughly developed as a miniature version of the Xception model [1]. It contains three branches at each convolution layer: the first with two convolutions and max pooling, the second with only a convolution, and the third with one convolution and average pooling.
Initially, I had chosen an architecture similar to VGG16 and VGG19 [2], but decided against it due to issues which arose during training and its lack of residual connections. These models can be seen in the `models/pastmodels.py` file, specifically models 1-3.
Since the model has a convolutional architecture, fully-connected layers have been replaced with a global average pooling layer, which in general yields better results. The 2-D convolution layers can also be replaced with separable 2-D convolution layers, although regular convolution layers seem to yield better results with the image sizes of the `fer2013` dataset.
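As a rough illustration of the block structure described above, the following Keras sketch shows one possible reading of it; the filter counts, kernel sizes, and number of blocks are placeholders, not the repository's exact configuration.

```python
from tensorflow.keras import layers, models

def inception_block(inputs, filters):
    # Branch 1: three stacked convolutions followed by max pooling.
    conv = layers.Conv2D(filters, (3, 3), padding = 'same', activation = 'relu')(inputs)
    conv = layers.Conv2D(filters, (3, 3), padding = 'same', activation = 'relu')(conv)
    conv = layers.Conv2D(filters, (3, 3), padding = 'same', activation = 'relu')(conv)
    branch_1 = layers.MaxPooling2D((2, 2), padding = 'same')(conv)
    # Branch 2: average pooling of the block input.
    branch_2 = layers.AveragePooling2D((2, 2), padding = 'same')(inputs)
    return layers.concatenate([branch_1, branch_2])

inputs = layers.Input(shape = (48, 48, 1))  # fer2013 image size.
x = inception_block(inputs, 32)
x = inception_block(x, 64)
# A global average pooling layer replaces the fully-connected layers.
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(7, activation = 'softmax')(x)
model = models.Model(inputs, outputs)
```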
The deep neural network for face detection makes use of a pre-trained model using the ResNet architecture [3].
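The OpenCV pattern for running this kind of detector looks roughly as follows. This sketch assumes the network returned by `load_info` is an OpenCV `cv2.dnn` face detector with 300x300 inputs; the image path and confidence threshold are illustrative.

```python
import cv2
import numpy as np

from util.info import load_info

# The DNN returned by load_info is the pre-trained face detector.
cascade_face, cascade_eyes, net = load_info(eyes = True)

image = cv2.imread('example.jpg')  # Hypothetical image path.
height, width = image.shape[:2]

# Build a 300x300 blob and run a forward pass through the network.
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                             (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()

for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:  # Typical default threshold, not the repository's setting.
        box = detections[0, 0, i, 3:7] * np.array([width, height, width, height])
        x1, y1, x2, y2 = box.astype(int)
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
```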
The directories in this repository are integrated for a seamless workflow. All necessary data, including model architecture, weights, cascade classifiers, and other required files, will be saved to the appropriate subdirectories of the `data` directory. Please visit the `data` directory for usage instructions.
Following a training sequence, models are saved to the `data/savedmodels` directory. They can then be loaded either from the architecture + weights files, or just from the weights file, as follows:

```python
from models.model_factory import *

# Load from architecture + weights files.
model = load_json_model('<insert-model-name>', compile = 'default')

# Load just from the weights file.
model = load_keras_model('<insert-model-name>', compile = False)
```
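Once loaded, the model behaves like any other Keras model. As a quick sanity check (a sketch assuming 48x48 grayscale inputs, the fer2013 image size):

```python
import numpy as np

from models.model_factory import load_json_model

# Placeholder model name, as above.
model = load_json_model('<insert-model-name>', compile = 'default')

# Run a dummy 48x48 grayscale image through the model to confirm it loads correctly.
dummy = np.zeros((1, 48, 48, 1), dtype = 'float32')
print(model.predict(dummy).shape)  # Expected: (1, number_of_emotion_classes)
```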
The code in this repository is available under the MIT License. Although you are welcome to download the repository and work with it, contributions will not be accepted. However, if you notice an issue with the system, feel free to create an issue for me to take a look at.
[1]: Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. arXiv:1610.02357 [cs]. http://arxiv.org/abs/1610.02357
[2]: Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 [cs]. http://arxiv.org/abs/1409.1556
[3]: He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. arXiv:1512.03385 [cs]. http://arxiv.org/abs/1512.03385