Emotion Recognition ER

Emotion recognition is the process of identifying human emotions with AI. This project recognises emotions from speech clips (audio + video). Emotion recognition generally works best when it combines multiple modalities, so we implemented a two-stream model that analyzes facial expressions from the video and tone of voice from the audio. These tasks are called Facial Emotion Recognition (FER) and Speech Emotion Recognition (SER) respectively.
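This README does not describe how the two streams are combined (see Project_Slides.pdf for the actual architecture). As a rough illustration only, the sketch below assumes late fusion: each stream outputs per-class probabilities and a weighted average picks the final label. The names `fuse_predictions`, `predict_label` and `audio_weight` are hypothetical and do not come from the notebooks.

```python
def fuse_predictions(audio_probs, video_probs, audio_weight=0.5):
    """Weighted average of two per-class probability lists (late fusion).

    Assumption: both streams score the same classes in the same order.
    """
    if len(audio_probs) != len(video_probs):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    video_weight = 1.0 - audio_weight
    return [audio_weight * a + video_weight * v
            for a, v in zip(audio_probs, video_probs)]


def predict_label(fused_probs, labels):
    """Return the label with the highest fused probability."""
    best = max(range(len(fused_probs)), key=fused_probs.__getitem__)
    return labels[best]


# Toy example with three classes and made-up probabilities.
labels = ["happy", "sad", "angry"]
audio = [0.5, 0.3, 0.2]   # hypothetical SER output
video = [0.1, 0.2, 0.7]   # hypothetical FER output
fused = fuse_predictions(audio, video)
print(predict_label(fused, labels))  # angry
```

With equal weights the video stream's confident "angry" outweighs the audio stream's weaker "happy"; setting `audio_weight=1.0` would reduce the result to the audio stream alone.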

We trained the models on the speech clips of the RAVDESS dataset, which contains 8 emotion classes: neutral, calm, happy, sad, angry, fearful, surprise and disgust (7 of them are used).
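RAVDESS encodes the label of each clip in its file name: seven two-digit fields separated by hyphens, with the third field holding the emotion code. The following is a minimal sketch of reading that label, not the code used in the notebooks. The helper name `emotion_from_filename` and the example path are illustrative, and the mapping lists all 8 classes because this README does not say which 7 are used.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprise",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {path}")
    return RAVDESS_EMOTIONS[fields[2]]


# Illustrative path: third field "06" maps to "fearful".
print(emotion_from_filename("Datasets/RAVDESS/Actor_12/01-01-06-01-02-01-12.mp4"))
```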

More details about the processing and architecture are in Project_Slides.pdf. A demonstration video and the deployed model are in the DEMO folder.

Repo structure

Emotion-Recognition_SER-FER_RAVDESS
├───Datasets
│   ├───RAVDESS
│   ├───RAVDESS_audio
│   ├───RAVDESS_frames
│   ├───RAVDESS_frames_black
│   └───RAVDESS_frames_face_BW
├───DEMO
│   ├───Examples
│   ├───ER DEMO.mp4
│   └───ER_FullClip_DEMO.ipynb
├───Models
│   ├───Audio Stream
│   └───Video Stream
├───Other
│   └───haarcascade_frontalface_default.xml
├───Plots
├───StreamAudio_1D.ipynb
├───StreamAudio_2D.ipynb
├───FullClip_Test.ipynb
├───StreamVideo_FaceOnly.ipynb
├───StreamVideo_FramesExtraction.ipynb
├───StreamVideo_FullFrame.ipynb
├───StreamVideo_Test.ipynb
├───Project_Slides.pdf
├───README.md
├───LICENSE.md
└───requirements.txt

Execution schema

To classify emotions (using our trained model):

Copy your clips into DEMO/Examples
Run ER_FullClip_DEMO.ipynb in the DEMO folder

To replicate this project (training and inference):

Download the speech clips of the RAVDESS dataset and save them in the Datasets/RAVDESS folder
Train the video and audio models:
1. Video Stream: extract frames with StreamVideo_FramesExtraction.ipynb (several types of frames are generated; the best results come from "224x224 only faces BW"), train the model with StreamVideo_FullFrame.ipynb or StreamVideo_FaceOnly.ipynb (depending on the frames generated), and test the results with StreamVideo_Test.ipynb
2. Audio Stream: use StreamAudio_1D.ipynb and StreamAudio_2D.ipynb to train the models (2D works better)
Use FullClip_Test.ipynb to assess global performance
Use ER_FullClip_DEMO.ipynb in the DEMO folder to classify videos with the trained models

DEMO

ER.DEMO.mp4

gianscuri/Emotion-Recognition_SER-FER_RAVDESS
