Emovere-ML

Artificial Intelligence (AI) is the science of the 21st century. It is commonly defined as the ability of a machine to “think or act humanly or rationally”. Machines can now process vast amounts of data in real time and respond accordingly. But these machines, for all their high IQ (Intelligence Quotient), have always lacked Emotional Intelligence (EI), also called Emotional Quotient (EQ). As technology progresses and the world becomes more and more virtual, there is a fear that we will lose human connection and communication; but what if our devices could replace those interactions? The question of the era is whether we can build machines that recognise human emotions.

Developers and researchers have been advancing artificial intelligence not only to create systems that think and act like humans, but also to detect and react to human emotions. Humans are broadly consistent in how they recognise emotions, yet individuals vary a great deal in how well they do it. Enabling the devices around us to recognise our emotions can only enhance our interaction with machines, as well as with each other. The aim of this project is to develop personalized user experiences that can help improve lives.

With recent advancements in this field and open-source tools such as TensorFlow from Google, creating and training a model is no longer very difficult. One of the major challenges is collecting a dataset to train the model. We came across an emotion recognition challenge hosted on Kaggle, “Challenges in Representation Learning: Facial Expression Recognition Challenge”. 56 teams participated, using a range of approaches including Haar, HOG, SIFT and neural networks. The dataset provided for this challenge, known as FER2013, is publicly available for download.

The dataset was created using the Google image search API to search for images of faces that match a set of 184 emotion-related keywords such as “blissful” and “enraged”. These keywords were combined with words related to gender, age or ethnicity to obtain nearly 600 strings, which were used as facial image search queries. The first 1000 images returned for each query were kept for the next stage of processing. OpenCV face recognition was used to obtain bounding boxes around each face in the collected images. Human labellers then rejected incorrectly labelled images, corrected the cropping where necessary, and filtered out some duplicate images. Approved, cropped images were then resized to 48x48 pixels and converted to grayscale. The resulting dataset contains 35887 images: 4953 “Anger”, 547 “Disgust”, 5121 “Fear”, 8989 “Happiness”, 6077 “Sadness”, 4002 “Surprise” and 6198 “Neutral”.
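The class counts listed above are quite uneven, which matters when reading the per-class results. As a small illustration, the sketch below computes each class's share of the dataset from those counts. The names `CLASS_COUNTS` and `class_shares` are hypothetical and are not taken from the repository's code.

```python
# Class counts as quoted in the dataset description above.
CLASS_COUNTS = {
    "Anger": 4953,
    "Disgust": 547,
    "Fear": 5121,
    "Happiness": 8989,
    "Sadness": 6077,
    "Surprise": 4002,
    "Neutral": 6198,
}


def class_shares(counts):
    """Return each class's fraction of the total number of images."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


shares = class_shares(CLASS_COUNTS)

# Print the classes from most to least frequent.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {CLASS_COUNTS[name]:>5}  {share:6.2%}")

# Ratio between the largest and the smallest class.
imbalance = max(CLASS_COUNTS.values()) / min(CLASS_COUNTS.values())
print(f"largest/smallest class ratio: {imbalance:.2f}")
```

A ratio this large between the biggest and smallest class is one reason a single overall accuracy figure can hide weak performance on rare classes.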

Incorporating recent advancements in neural networks, a model was implemented using TensorFlow and Keras. It is a lightweight model that works in real time, so it can be used even on hardware-constrained systems.

The system uses external methods such as Haar classifiers, Dlib or depth-sensing cameras for face detection, and a neural network for emotion classification. The network classifies only when the face detector finds a face in the image, so the overall performance of the system depends on the face detector as well.

Once a face is detected, the trained model achieves an accuracy of around 66% on 7 core emotions, which is comparable to the state of the art. The model recognised Happy, Surprise, Neutral, Sad and Anger easily. Disgust and Contempt were not detected well; we assume the reason is the lack of training data for these two emotions.

The model performed well on positive emotions, as inferred from the relatively high precision scores for Happy and Surprise. Happy has the highest precision at 84%, which may be due to its larger number of samples (~7000) in the training set. Surprise has a precision of 75% even though it had far fewer samples (~3000) in the training set, which suggests that the surprise expression carries very strong detectable signals. Performance is weaker on average across negative emotions: Fear has a low precision of 51% and Sad 52%. The model frequently confused Angry, Fear, Sad and Neutral with one another, probably because these are the least expressive emotions (excluding crying faces).
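The detect-then-classify flow described above can be sketched as follows. This is a minimal illustration and not the repository's code: the function `classify_faces`, the label order in `EMOTIONS`, and the two stand-in callables are assumptions. In practice the detector could wrap a Haar cascade or Dlib, and the classifier callable would resize the crop to 48x48 grayscale before calling the trained network.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np

# Assumed label order for illustration; it must match the order used in training.
EMOTIONS = ["Anger", "Disgust", "Fear", "Happiness", "Sadness", "Surprise", "Neutral"]

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def classify_faces(
    frame: np.ndarray,
    detect_faces: Callable[[np.ndarray], Sequence[Box]],
    predict_probs: Callable[[np.ndarray], np.ndarray],
) -> List[Tuple[Box, str, float]]:
    """Run the emotion classifier only on regions where a face was detected."""
    results = []
    for (x, y, w, h) in detect_faces(frame):
        face = frame[y:y + h, x:x + w]
        if face.size == 0:
            continue  # skip boxes that fall outside the frame
        probs = np.asarray(predict_probs(face), dtype=float)
        best = int(np.argmax(probs))
        results.append(((x, y, w, h), EMOTIONS[best], float(probs[best])))
    # An empty list means no face was found, so nothing was classified.
    return results


# Hypothetical stand-ins so the sketch runs without a camera or a trained model.
def fake_detector(frame: np.ndarray) -> Sequence[Box]:
    # Pretend a face is present whenever the frame is not completely black.
    return [(10, 10, 48, 48)] if frame.mean() > 0 else []


def fake_classifier(face: np.ndarray) -> np.ndarray:
    # Always put all probability mass on "Happiness" (index 3).
    probs = np.zeros(len(EMOTIONS))
    probs[3] = 1.0
    return probs


blank = np.zeros((100, 100), dtype=np.uint8)
bright = np.ones((100, 100), dtype=np.uint8)
print(classify_faces(blank, fake_detector, fake_classifier))
print(classify_faces(bright, fake_detector, fake_classifier))
```

Passing the detector and classifier in as callables keeps the gating logic independent of which face detector is used, which mirrors the point that overall performance depends on both stages.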

Classifying a human facial expression as only a single emotion is not adequate. Our expressions are complex and often contain a mix of emotions that together describe a particular expression more accurately. Considering this, top-3 accuracy is a useful measure, and the model's top-3 accuracy of 92% (91.864%) is one of the best.
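For readers unfamiliar with the two metrics quoted above, the sketch below shows how per-class precision and top-k accuracy can be computed from true labels and predicted probabilities. It uses toy data with three classes, not the project's results, and the function names are hypothetical.

```python
import numpy as np


def per_class_precision(y_true, y_pred, num_classes):
    """Precision per class: correct predictions of c / all predictions of c."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precision = np.zeros(num_classes)
    for c in range(num_classes):
        predicted_c = y_pred == c
        n_predicted = predicted_c.sum()
        if n_predicted > 0:  # leave precision at 0 if the class is never predicted
            precision[c] = np.sum(predicted_c & (y_true == c)) / n_predicted
    return precision


def top_k_accuracy(y_true, probs, k=3):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    y_true = np.asarray(y_true)
    probs = np.asarray(probs)
    top_k = np.argsort(probs, axis=1)[:, -k:]  # indices of the k largest scores per row
    hits = np.any(top_k == y_true[:, None], axis=1)
    return float(hits.mean())


# Toy example: 4 samples, 3 classes.
toy_true = [0, 1, 2, 2]
toy_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
])
toy_pred = toy_probs.argmax(axis=1)

print("precision per class:", per_class_precision(toy_true, toy_pred, 3))
print("top-1 accuracy:", top_k_accuracy(toy_true, toy_probs, k=1))
print("top-2 accuracy:", top_k_accuracy(toy_true, toy_probs, k=2))
```

Top-k accuracy is always at least as high as top-1 accuracy, which is why it suits expressions that plausibly mix several emotions.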

Detailed Conclusions in Report: https://drive.google.com/file/d/1VS3I8geqXuFf_yJdrgNfz8DvVUoyYhVr/view?usp=sharing

Repository contents:

assets
datasets/fer2013
haar
training_output
.gitattributes
README.md
Test with Image.py
Test with Video file.py
TestImage.jpg
TestVideo.mp4
Train.py
VideoTest.py
VideoTestwithDlib.py
VideofromKinect.py
data_Vis.ipynb
