Preprocessed dataset: https://drive.google.com/drive/folders/13PfYy6lgAen7t1Q4u9gT7Y0WkNKw1uDi
- dataset.npy contains 1440 numpy arrays, each a 290x13 MFCC feature matrix.
- labels.csv contains 1440 rows with the speaker's sex and emotion; row numbers correspond to the arrays in dataset.npy.
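The prepared files can be loaded directly with numpy and the standard csv module. A minimal sketch, assuming dataset.npy stores all samples stacked as a single (1440, 290, 13) array; dummy stand-in files are created first so the snippet is self-contained:

```python
import csv
import numpy as np

# Stand-ins for the real downloads: 1440 samples of 290x13 MFCC features
# and one [sex, emotion] label row per sample (values here are dummies).
np.save("dataset.npy", np.zeros((1440, 290, 13), dtype=np.float32))
with open("labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for i in range(1440):
        writer.writerow(["female" if i % 2 else "male", "happy"])

# Loading works the same way on the real files.
data = np.load("dataset.npy")        # shape: (1440, 290, 13)
with open("labels.csv") as f:
    labels = list(csv.reader(f))     # one [sex, emotion] row per sample
```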
1. Download the raw audio dataset from https://zenodo.org/record/1188976#.XPufrogzaUk. We used only the audio .wav files.
2. Run `python wav_cut.py` to cut the audio files to the same length.
3. Run `python wav_to_mfcc.py` to extract MFCC features from the audio files.
4. Run `python mfcc_to_numpy.py` to merge the per-file MFCC .npy files into a single DATASET.npy file and labels.csv.
5. Run `python ffnn.py` to create the feed-forward neural network, fit the model, and evaluate it.
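Cutting every clip to the same length (step 2) amounts to truncating or zero-padding each waveform to a fixed number of samples before feature extraction. A minimal numpy sketch; the sample rate and target duration below are illustrative assumptions, not values taken from wav_cut.py:

```python
import numpy as np

def cut_to_length(wav: np.ndarray, target_len: int) -> np.ndarray:
    """Truncate or zero-pad a 1-D waveform to exactly target_len samples."""
    if len(wav) >= target_len:
        return wav[:target_len]
    return np.pad(wav, (0, target_len - len(wav)))

sr = 22050                 # assumed sample rate
target = 3 * sr            # assume every clip is cut to 3 seconds

short_clip = np.ones(sr)       # 1-second clip -> zero-padded at the end
long_clip = np.ones(5 * sr)    # 5-second clip -> truncated
fixed_short = cut_to_length(short_clip, target)
fixed_long = cut_to_length(long_clip, target)
```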
Steps 1-4 are optional; you can skip straight to step 5 if you use the prepared dataset from the link above.
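As a rough illustration of what a feed-forward classifier over the flattened 290x13 MFCC features looks like, here is a numpy forward pass. The layer sizes and the 8-way emotion output are assumptions for illustration, not the actual architecture in ffnn.py:

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 290 * 13            # each sample's MFCC matrix, flattened
n_hidden = 64                    # assumed hidden-layer width
n_classes = 8                    # RAVDESS defines 8 emotion categories

# Randomly initialized weights stand in for a trained model.
W1 = rng.standard_normal((n_features, n_hidden)) * 0.01
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_classes)) * 0.01
b2 = np.zeros(n_classes)

def forward(x: np.ndarray) -> np.ndarray:
    """One hidden ReLU layer, then a softmax over emotion classes."""
    h = np.maximum(x @ W1 + b1, 0.0)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

batch = rng.standard_normal((4, n_features))   # 4 dummy flattened samples
probs = forward(batch)                         # shape: (4, 8)
```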