Skip to content

amirhossein-hkh/facial-expression-recognition

Repository files navigation

Facial Expression Recognition

Demo

The model performs really well in the real world and it has an accuracy of 75% on the test data.

demo.gif
The model has 5 class. (Happy,Sad,Surprise,Angry,Neutral)

Datasets

The Datasets which I have used in this project are AffectNet and FER+ (which is the same as fer2013 Kaggel but with better labels)

The combination of these dataset will have about 300,000 images for the 5 class.

Class Number of Images
Happy 134,626
Neutral 86,518
Sad 29,886
Angry 28,168
Surprise 18,608

Dependencies

numpy (1.16.4)
tensorflow (1.14.0)
keras (2.2.4)
opencv (3.4)

Haar-Cascade

Haar cascade classifiers first were proposed by Paul Viola and Michael Jones and uses Haar kernels for extracing features and it uses multiple classifiers one after an other. The classifiers get more complex as data move forward. Each classifier will specify whether the image is maybe from the desired class or it is definitly not in the desired class and if it is maybe from the desired class will pass image forward to the next classifier.

For training these classifiers,first you need to collect images from desired class (positive data in our case face images) and everything else (negative data) especially from the environment which you want to test your model like chairs, tables, etc. (for better result its better to make the images grayscale). Also you need to make a description file for postive and negetive samples.

  • For positives you need to write in the following format: [filename] [number of object annotations] [coordinates of the objects bounding rectangles (x, y, width, height)] for example: img/img1.jpg 1 140 100 45 45
  • For negative images you only need to write the file name

After collecting data and creating the two description files, you have to install opencv and its dependencies. Then using following commands you can start training your model:

First you need to create a vector file from the positive image with the following command: opencv_createsamples -info [name of description file] -num [number of positive samples] -w [width of the output] -h [height of the output] -vec [name of the vector file] for example:

opencv_createsamples -info info/info.lst -num 9000 -w 20 -h 20 -vec positives.vec

Then after creating the vector file you can start the actuall training with the following command:

opencv_traincascade -data data -vec positives.vec -bg bg.txt -numPos 7000 -numNeg 3500 -numStages 10 -w 20 -h 20

I trained the haar cascade with 7000 positive images and 3500 negative images for 21 stage which it make a very few mistakes.

For more information you can reffer to opencv website.

CNN

The cnn was implemented in Keras with the following architecture:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_20 (Conv2D)           (None, 46, 46, 64)        640       
_________________________________________________________________
activation_35 (Activation)   (None, 46, 46, 64)        0         
_________________________________________________________________
batch_normalization_30 (Batc (None, 46, 46, 64)        256       
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 23, 23, 64)        0         
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 21, 21, 128)       73856     
_________________________________________________________________
dropout_10 (Dropout)         (None, 21, 21, 128)       0         
_________________________________________________________________
activation_36 (Activation)   (None, 21, 21, 128)       0         
_________________________________________________________________
batch_normalization_31 (Batc (None, 21, 21, 128)       512       
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 10, 10, 128)       0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 8, 8, 256)         295168    
_________________________________________________________________
activation_37 (Activation)   (None, 8, 8, 256)         0         
_________________________________________________________________
batch_normalization_32 (Batc (None, 8, 8, 256)         1024      
_________________________________________________________________
max_pooling2d_22 (MaxPooling (None, 4, 4, 256)         0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 2, 2, 256)         590080    
_________________________________________________________________
activation_38 (Activation)   (None, 2, 2, 256)         0         
_________________________________________________________________
batch_normalization_33 (Batc (None, 2, 2, 256)         1024      
_________________________________________________________________
max_pooling2d_23 (MaxPooling (None, 1, 1, 256)         0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 256)               0         
_________________________________________________________________
dense_15 (Dense)             (None, 128)               32896     
_________________________________________________________________
activation_39 (Activation)   (None, 128)               0         
_________________________________________________________________
batch_normalization_34 (Batc (None, 128)               512       
_________________________________________________________________
dropout_11 (Dropout)         (None, 128)               0         
_________________________________________________________________
dense_16 (Dense)             (None, 64)                8256      
_________________________________________________________________
activation_40 (Activation)   (None, 64)                0         
_________________________________________________________________
batch_normalization_35 (Batc (None, 64)                256       
_________________________________________________________________
dense_17 (Dense)             (None, 5)                 325       
_________________________________________________________________
activation_41 (Activation)   (None, 5)                 0         
=================================================================
Total params: 1,004,805
Trainable params: 1,003,013
Non-trainable params: 1,792

References

Rapid Object Detection usinga Boosted Cascade of Simple Features

AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild

Kaggel Challenge