Speech Emotion based face generation using Conditional GANs

Generating human faces with conditional GANs, conditioned on emotions identified from human speech by a Speech Emotion Recognition (SER) model.

An image showing the overall pipeline

Below is a short demo of the web app showing generation of human faces based on emotion identified from human speech.

Results

Training samples

Generated samples

Getting Started

Prerequisites

pandas==1.0.4
Keras==2.3.1
librosa==0.7.2
streamlit==0.61.0
tensorflow==2.0.0
numpy==1.18.1
tqdm==4.42.0
scipy==1.4.1
tensorflow_hub==0.8.0
matplotlib==3.1.3
Flask==1.1.2
ipython==7.17.0
Pillow==7.2.0
pyaudio==0.2.11
scikit_learn==0.23.2

Directory Structure

Project

├── speech_emotion_recognition
│   ├── code
│   │   ├── ser_training.ipynb
│   │   ├── ser_prediction.ipynb
│   ├── data
│   │   ├── Audio_Speech_Actors_01-24
│   │   │   ├── Actor_01
│   │   │   │   ├── 03-01-01-01-01-01-01.wav
│   │   │   │   ├── 03-01-01-01-01-02-01.wav
│   │   │   │   ...
│   │   │   ├── Actor_02
│   │   │   ...
│   │   │   ├── Actor_24
│   ├── weights

├── conditional_gan
│   ├── code
│   │   ├── cgan_training.ipynb
│   │   ├── cgan_prediction.ipynb
│   ├── data
│   │   ├── fer2013.csv
│   ├── weights

├── streamlit_webapp

Data

For SER :

The dataset can be downloaded from:
https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio
and should be placed in

./speech_emotion_recognition/data/

It consists of speech recordings from 24 actors. Five sample audio files from the first actor are included at this location as an example.
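RAVDESS encodes each recording's metadata in its filename as seven two-digit fields (modality, vocal channel, emotion, intensity, statement, repetition, actor). The sketch below shows one way to recover the emotion label from a filename such as those in the directory tree above. The function name and the returned dictionary keys are illustrative and are not taken from the notebooks.

```python
from pathlib import Path

# Emotion codes as documented for RAVDESS (third field of the filename).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# Order of the seven dash-separated fields in a RAVDESS filename.
FIELDS = ("modality", "vocal_channel", "emotion", "intensity",
          "statement", "repetition", "actor")


def parse_ravdess_name(path):
    """Split a RAVDESS filename into its fields and add a readable emotion label."""
    parts = Path(path).stem.split("-")
    if len(parts) != len(FIELDS):
        raise ValueError(f"unexpected RAVDESS filename: {path}")
    info = dict(zip(FIELDS, parts))
    info["emotion_label"] = RAVDESS_EMOTIONS[info["emotion"]]
    return info


info = parse_ravdess_name("Actor_01/03-01-01-01-01-02-01.wav")
print(info["emotion_label"], info["actor"])  # neutral 01
```

Keeping the codes as strings avoids losing the leading zeros, which makes it easy to rebuild a filename from the parsed fields.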

For GANs :

The dataset can be downloaded from:
https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data
and should be placed in

./conditional_gan/data/

Only the "fer2013.csv" file from the data bundle is needed. A sample file containing data for just 5 faces is included as an example.
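Each row of fer2013.csv holds an integer emotion label, a string of 2304 space-separated grayscale values (one 48x48 face), and a usage split. The following is a minimal sketch of decoding such rows using only the standard library. The helper names are hypothetical, and the training notebook may load the file differently (for example with pandas and NumPy).

```python
import csv
import io

# Standard FER2013 label order (index -> emotion name).
FER_EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
IMAGE_SIZE = 48  # FER2013 faces are 48x48 grayscale images


def decode_pixels(pixel_string, size=IMAGE_SIZE):
    """Turn the space-separated 'pixels' field into a size x size list of ints."""
    values = [int(v) for v in pixel_string.split()]
    if len(values) != size * size:
        raise ValueError(f"expected {size * size} pixels, got {len(values)}")
    return [values[r * size:(r + 1) * size] for r in range(size)]


def read_fer_rows(handle):
    """Yield (emotion_name, image_rows) pairs from an open fer2013.csv handle."""
    for row in csv.DictReader(handle):
        yield FER_EMOTIONS[int(row["emotion"])], decode_pixels(row["pixels"])


# Tiny in-memory stand-in for fer2013.csv: one uniformly gray face labeled 3.
sample = "emotion,pixels,Usage\n3," + " ".join(["128"] * (48 * 48)) + ",Training\n"
label, image = next(read_fer_rows(io.StringIO(sample)))
print(label, len(image), len(image[0]))  # happy 48 48
```

To read the real file, pass `open("./conditional_gan/data/fer2013.csv", newline="")` to `read_fer_rows` in place of the in-memory sample.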

Model Training

[Note 1: Please host and run these notebooks on Google Colab.]
[Note 2: Please mount the Drive that holds the data files, following the directory structure above.]
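The two notes can be sketched as a single setup cell. `drive.mount` is the standard Colab call for mounting Google Drive; the Drive subfolder used here (`MyDrive/ser-based-conditional-gan`) is an assumption and should be changed to wherever the project was uploaded. Outside Colab the sketch falls back to the current working directory.

```python
from pathlib import Path


def resolve_project_root(drive_subdir="MyDrive/ser-based-conditional-gan"):
    """Return the project root: the mounted Drive folder on Colab, else the cwd."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return Path.cwd()
    drive.mount("/content/drive")
    # drive_subdir is a hypothetical location; adjust it to your own Drive layout.
    return Path("/content/drive") / drive_subdir


ROOT = resolve_project_root()

# Paths that follow the directory structure described in this README.
SER_DATA = ROOT / "speech_emotion_recognition" / "data"
SER_WEIGHTS = ROOT / "speech_emotion_recognition" / "weights"
FER_CSV = ROOT / "conditional_gan" / "data" / "fer2013.csv"
CGAN_WEIGHTS = ROOT / "conditional_gan" / "weights"

print(SER_DATA)
```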

For each of SER and cGAN, there are two separate Jupyter Notebook files, one for training and one for prediction.

For SER:

Training :

./speech_emotion_recognition/code/ser_training.ipynb

The trained weights are stored in ./speech_emotion_recognition/weights
Pretrained weights for the best model are already included at this location.

Prediction :

./speech_emotion_recognition/code/ser_prediction.ipynb

For cGAN:

Training :

./conditional_gan/code/cgan_training.ipynb

Prediction :

./conditional_gan/code/cgan_prediction.ipynb
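At prediction time, a conditional generator receives a random noise vector together with the emotion label it should render. The sketch below illustrates one common way to build that input, by appending a one-hot label to the noise. It is not taken from the notebooks: the label order is assumed to be the standard FER2013 one, `latent_dim=100` is a hypothetical value, and Keras implementations often use an embedding layer for the label instead of a one-hot vector.

```python
import random

# Assumed label set: the seven FER2013 classes in their standard index order.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def one_hot(label, classes=EMOTIONS):
    """Encode an emotion name as a one-hot list over the known classes."""
    vec = [0.0] * len(classes)
    vec[classes.index(label)] = 1.0
    return vec


def generator_input(label, latent_dim=100, rng=random):
    """Concatenate Gaussian noise with the one-hot condition vector."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return noise + one_hot(label)


z = generator_input("happy")
print(len(z), z[-7:])
```

In the full pipeline, the label passed to `generator_input` would come from the SER model's prediction on the input speech.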

References

Mirza, M. and Osindero, S., 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784.
Livingstone, S.R. and Russo, F.A., 2018. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PloS one, 13(5), p.e0196391.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2014. Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
Chollet, F., 2017. Deep Learning with Python (1st ed.). Manning Publications Co., USA.
https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge
https://medium.com/@ma.bagheri/a-tutorial-on-conditional-generative-adversarial-nets-keras-implementation-694dcafa6282
https://machinelearningmastery.com/how-to-develop-a-conditional-generative-adversarial-network-from-scratch/
