This study examines two primary methods of modeling sound data for speech emotion recognition: a flattened feature transform consistent with telecommunications standards, and a convolutional neural network that uses stacked spectrogram arrays.
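As a rough illustration of how the two input representations differ, the sketch below builds both from the same spectrograms. It is a minimal example under stated assumptions, not the pipeline used in this study: the helper `stft_magnitude`, the synthetic sine-wave clips, the frame sizes, and the choice of per-bin mean and standard deviation as the "flat" summary are all hypothetical stand-ins, and only NumPy is used.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    Hypothetical helper for illustration; returns shape (freq_bins, frames).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft of each frame gives (frames, n_fft // 2 + 1); transpose to (freq, time)
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic one-second mono clips standing in for real recordings (assumption).
sr = 16000
t = np.arange(sr) / sr
clips = [np.sin(2 * np.pi * f * t) for f in (220.0, 440.0, 880.0)]
specs = [stft_magnitude(c) for c in clips]

# Flat representation: summarize each frequency bin over time,
# giving one 1-D feature vector per clip (suitable for classical models).
flat_features = np.stack(
    [np.concatenate([s.mean(axis=1), s.std(axis=1)]) for s in specs]
)

# Stacked representation: keep each 2-D spectrogram and stack the clips
# into a (clips, freq, time, channel) array, the usual layout for a CNN.
stacked = np.stack(specs)[..., np.newaxis]

print(flat_features.shape, stacked.shape)
```

The flat form discards the time axis, while the stacked form keeps it so that a convolutional network can learn patterns across both time and frequency.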
Phase 1 - Problem Definition
  1.1 Broad Goals
  1.2 Data Source
  1.3 Problem Statement
Phase 2 - Data Gathering
  2.1 Load Files
  2.2 Convert Stereo Files to Mono
Phase 3 - Exploratory Data Analysis
  3.1 Waveforms
  3.2 Spectrograms
  3.3 Speech vs Song
Phase 4 - Modeling
  4.1 Train/Test/Split
  4.2 Flat Features
  4.4 Convolutional Neural Net
  4.5 Comparative Modeling
Phase 5 - Model Analysis
  5.0 Baseline Score
  5.1 Compare Accuracy Scores
  5.2 Production Model
Phase 6 - Conclusions
  6.1 Revisit 1.3 Problem Statement
  6.2 Conclusions
  6.3 Recommendations for Further Research
  6.4 Credits/References
Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.
EDA segment and understanding of the Short-Time Fourier Transform inspired by https://jackschaedler.github.io/circles-sines-signals
https://www.kdnuggets.com/2017/12/audio-classifier-deep-neural-networks.html
https://github.com/lukas/ml-class/blob/master/videos/cnn-audio/audio.ipynb
Speech intelligibility information courtesy of: https://www.dpamicrophones.com/mic-university/facts-about-speech-intelligibility
Lech M, Stolar M, Best C, Bolia R (2020) Real-Time Speech Emotion Recognition Using a Pre-trained Image Classification Network: Effects of Bandwidth Reduction and Companding. Frontiers in Computer Science 2: 14. ISSN 2624-9898. doi:10.3389/fcomp.2020.00014. https://www.frontiersin.org/article/10.3389/fcomp.2020.00014
McFee, Brian, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. “librosa: Audio and Music Signal Analysis in Python.” In Proceedings of the 14th Python in Science Conference, pp. 18-25. 2015.