Desuche/Emovere-ML

Emovere-ML

Artificial Intelligence (AI) is the science of the 21st century. It is defined as the ability of a machine to “think or act humanly or rationally”. Machines can now process vast amounts of data in real time and respond accordingly. But these machines with a high IQ (Intelligence Quotient) have always lacked Emotional Intelligence (EI), also called Emotional Quotient (EQ). As technology progresses and the world becomes more and more virtual, there is a fear that we will lose human connection and communication; but what if our devices could replace those interactions? The question of the era is whether we can build machines that can recognise human emotions.

Developers and researchers have been advancing artificial intelligence not only to create systems that think and act like humans, but also to detect and react to human emotions. Humans are universally consistent in recognizing emotions, yet individuals vary a great deal in how well they do it. Enabling the devices around us to recognize our emotions can enhance our interaction with machines, as well as with each other. The aim of this project is to develop personalized user experiences that can help improve lives.

With recent advancements in this field and open source tools such as TensorFlow from Google, creating and training a model is no longer very difficult. One of the major challenges is collecting a dataset to train the model. We then came across an emotion recognition challenge hosted by Kaggle, “Challenges in Representation Learning: Facial Expression Recognition Challenge”. 56 teams participated in this challenge, using different approaches including Haar, HOG, SIFT, neural networks etc. The dataset provided for this challenge, known as FER2013, is publicly available for download.

The dataset was created using the Google image search API to search for images of faces that match a set of 184 emotion-related keywords like “blissful”, “enraged,” etc. These keywords were combined with words related to gender, age or ethnicity, to obtain nearly 600 strings, which were used as facial image search queries. The first 1000 images returned for each query were kept for the next stage of processing. OpenCV face recognition was used to obtain bounding boxes around each face in the collected images. Human labellers then rejected incorrectly labelled images, corrected the cropping if necessary, and filtered out some duplicate images. Approved, cropped images were then resized to 48x48 pixels and converted to grayscale. The resulting dataset contains 35887 images, with 4953 “Anger” images, 547 “Disgust” images, 5121 “Fear” images, 8989 “Happiness” images, 6077 “Sadness” images, 4002 “Surprise” images, and 6198 “Neutral” images.
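The class counts quoted above are far from balanced, which matters for the results discussed later. The following is a minimal sketch that turns those counts into class shares and, as an illustrative assumption only, inverse-frequency class weights. The function names are hypothetical, and the weighting scheme is not something this project is known to use.

```python
# Per-class image counts exactly as quoted in the dataset description.
FER2013_COUNTS = {
    "Anger": 4953,
    "Disgust": 547,
    "Fear": 5121,
    "Happiness": 8989,
    "Sadness": 6077,
    "Surprise": 4002,
    "Neutral": 6198,
}


def class_shares(counts):
    """Return each class's fraction of the total number of images."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count).

    Assumption for illustration: rarer classes get larger weights.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    shares = class_shares(FER2013_COUNTS)
    weights = balanced_class_weights(FER2013_COUNTS)
    for label in FER2013_COUNTS:
        print(f"{label:<10} {shares[label]:6.1%}  weight {weights[label]:.2f}")
```

With these counts, “Disgust” makes up about 1.5% of the images while “Happiness” makes up about 25%.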

Incorporating recent advancements in neural networks, a model was implemented using TensorFlow and Keras. It is a lightweight model that works in real time, so it can be used even on hardware-constrained systems.

The model uses external methods such as HAAR Classifiers/D-Lib/Depth sensing cameras for face detection and a neural network for classification of emotions. The network classifies only if the face detector recognises a face in the image, so the performance of the whole system also depends on the performance of the face detector.

Once a face is detected, the trained model achieves accuracy comparable to the state of the art (around 66%) on 7 different core emotions. The model was able to recognise Happy, Surprise, Neutral, Sad and Anger easily. Disgust and Contempt were not detected well; we assume the reason is the lack of training data for these two emotions.

The model performed really well on positive emotions, as inferred from the relatively high precision scores for Happy and Surprise. Happy has the highest precision, 84%, which may be due to the higher number of samples (~7000) in the training set. Surprise has a precision of 75%, even though it had far fewer samples (~3000) in the training set, which suggests that the surprise expression carries very strong detectable signals.

Model performance seems weaker across negative emotions on average. In particular, Fear has a low precision of 51% and Sad 52%. The model frequently confused Angry, Fear, Sad and Neutral with one another, probably because these are the least expressive emotions (excluding crying faces).
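The precision figures above are per-class values: of all images the model labelled as a given emotion, the fraction that really show that emotion. A minimal sketch of that calculation follows. The function name and the toy labels are hypothetical and are not taken from the project's evaluation code or results.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Precision per class: correct predictions of c / all predictions of c.

    Classes that are never predicted have no defined precision and are
    left out of the returned dictionary.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / n for label, n in predicted.items()}


if __name__ == "__main__":
    # Toy labels for illustration only, not real model output.
    y_true = ["happy", "happy", "sad", "fear", "sad", "neutral"]
    y_pred = ["happy", "happy", "neutral", "sad", "sad", "neutral"]
    for label, value in sorted(per_class_precision(y_true, y_pred).items()):
        print(f"{label:<8} {value:.2f}")
```

In the toy data, "fear" is never predicted, so it has no precision value at all. A class with a high precision can still be missed often, which is why precision is usually read alongside recall.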

It is not sufficient to classify a human facial expression as only a single emotion. Our expressions are much more complex and contain a mix of emotions that together describe a particular expression more accurately. Considering this, the model's top-3 accuracy of 92% (91.864%) is one of the best.
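Top-3 accuracy counts a prediction as correct when the true emotion is among the three classes the model scores highest. A minimal sketch of the metric, assuming the classifier outputs one score per class for each image; the function name and the toy scores are hypothetical, not project output.

```python
def top_k_accuracy(scores, true_labels, k=3):
    """Fraction of samples whose true class index is among the k top scores.

    scores: list of per-sample score lists, one score per class.
    true_labels: list of true class indices, one per sample.
    """
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for sample_scores, truth in zip(scores, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda i: sample_scores[i],
            reverse=True,
        )
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


if __name__ == "__main__":
    # Toy scores over 7 classes, for illustration only.
    scores = [
        [0.05, 0.05, 0.10, 0.50, 0.10, 0.10, 0.10],  # true class ranked 1st
        [0.30, 0.02, 0.25, 0.03, 0.20, 0.10, 0.10],  # true class ranked 3rd
        [0.40, 0.01, 0.30, 0.20, 0.04, 0.03, 0.02],  # true class outside top 3
    ]
    true_labels = [3, 4, 5]
    print("top-1:", top_k_accuracy(scores, true_labels, k=1))
    print("top-3:", top_k_accuracy(scores, true_labels, k=3))
```

The second toy sample shows why the two metrics differ: it is wrong under top-1 but correct under top-3, because the true class is the model's third choice.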

Detailed Conclusions in Report: https://drive.google.com/file/d/1VS3I8geqXuFf_yJdrgNfz8DvVUoyYhVr/view?usp=sharing

About

Emotion recognition using machine learning
