A Pytorch implementation of 'AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION'

gogyzzz/localatt_emorecog

- paper
- IEMOCAP DB paper
- MSP-IMPROV DB paper

Requirements

- PyTorch
- openSMILE (used below for LLD feature extraction)
Preparation

wav_cat.list, utt.list

The IEMOCAP DB subset has 5531 utterances, labeled with 4 emotion classes:

A: Anger, H: Excited + Happiness, N: Neutral, S: Sadness
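Since Excited is folded into Happiness, the four target classes can be written as a label map. This is only an illustrative sketch; the actual mapping lives in the repo's preparation scripts:

```python
# Map IEMOCAP category names to the 4 target classes (Excited folded into H).
# Illustrative only -- not the repo's actual code.
LABEL_MAP = {
    "Anger": "A",
    "Excited": "H",   # merged with Happiness
    "Happiness": "H",
    "Neutral": "N",
    "Sadness": "S",
}

print(sorted(set(LABEL_MAP.values())))  # ['A', 'H', 'N', 'S']
```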

# head -2 iemocap/wav_cat.list
/your/path/Ses01F_impro01_F000.wav N
/your/path/Ses01F_impro01_F001.wav N

# head -2 iemocap/utt.list
Ses01F_impro01_F000
Ses01F_impro01_F001
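The two lists stay aligned line by line: each utterance ID is the wav basename without its extension. Assuming that convention, `utt.list` can be derived from `wav_cat.list` with a few lines (a hypothetical helper, not part of this repo):

```python
import os

def derive_utt_list(wav_cat_lines):
    """Derive utterance IDs from 'path label' lines (wav basename sans extension)."""
    utts = []
    for line in wav_cat_lines:
        path = line.split()[0]  # drop the emotion label column
        utts.append(os.path.splitext(os.path.basename(path))[0])
    return utts

lines = ["/your/path/Ses01F_impro01_F000.wav N",
         "/your/path/Ses01F_impro01_F001.wav N"]
print(derive_utt_list(lines))  # ['Ses01F_impro01_F000', 'Ses01F_impro01_F001']
```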

The MSP-IMPROV DB subset has 7798 utterances, labeled with the same 4 emotion classes.

# head -2 msp_improv/wav_cat.list
/your/path/MSP-IMPROV-S01A-F01-P-FM01.wav N
/your/path/MSP-IMPROV-S01A-F01-P-FM02.wav H

# head -2 msp_improv/utt.list
MSP-IMPROV-S01A-F01-P-FM01
MSP-IMPROV-S01A-F01-P-FM02

How to Run

./add_opensmile_conf.sh your_opensmile_dir

# done.
./prepare_list.sh iemocap/wav_cat.list \
	iemocap/lld.htk.list iemocap/utt.list iemocap/lld/

./extract_lld.sh your_opensmile_dir/ iemocap/wav_cat.list \
	iemocap/lld.htk.list

./make_utt_lld_pair.py iemocap/utt.list iemocap/lld.htk.list \
	iemocap/utt_lld.pk
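`make_utt_lld_pair.py` pairs each utterance ID with its extracted LLD features in `utt_lld.pk`. A minimal sketch of inspecting such a file, assuming it is a pickled dict mapping utterance ID to a sequence of per-frame feature vectors (the file name, frame count, and feature dimension below are made up for illustration):

```python
import pickle

# Stand-in pickle with the assumed structure: {utt_id: list of per-frame LLD vectors}.
fake = {"Ses01F_impro01_F000": [[0.0] * 32 for _ in range(300)]}
with open("utt_lld_demo.pk", "wb") as f:
    pickle.dump(fake, f)

# Inspect: one entry per utterance ID from utt.list.
with open("utt_lld_demo.pk", "rb") as f:
    utt_lld = pickle.load(f)

for utt, frames in utt_lld.items():
    print(utt, len(frames), len(frames[0]))  # ID, n_frames, n_lld_features
```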

./iemocap/make_csv.sh iemocap/utt.list iemocap/wav_cat.list iemocap/ \
	iemocap/full_dataset.csv

# Modify make_dataset.py parameters as you want!
#
### Default setting ###
#
# devfrac=0.2
# session=1
# prelabel="gender"
#
# e.g. (each command writes a fresh copy of the script; to apply
#  both edits at once, chain them: sed -e '...' -e '...')
# sed 's/"gender"/"speaker"/' iemocap/make_dataset.py > new_script.py
# sed 's/devfrac=0.2/devfrac=0.1/' iemocap/make_dataset.py > new_script.py

./iemocap/make_dataset.py iemocap/full_dataset.csv iemocap/utt_lld.pk iemocap/your_dataset_path

# Modify make_expcase.py params as you want!
#
### Default setting ###
#
# lr=0.00005
# bsz=64
# ephs=200

./iemocap/make_expcase.py iemocap/your_dataset_path iemocap/your_dataset_path/your_expcase

# ls iemocap/your_dataset_path/your_expcase

# log
# param.json
# premodel.pth
# model.pth
./run.py --propjs iemocap/your_dataset_path/your_expcase/param.json

# Note: the hyperparameters were not tuned.
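`run.py` is driven by the generated `param.json`. A minimal sketch of reading such a config, assuming its keys simply mirror the `make_expcase.py` defaults above (the key names `lr`, `bsz`, and `ephs` are assumptions, not a confirmed schema):

```python
import json

# Hypothetical param.json contents mirroring the make_expcase.py defaults above.
raw = '{"lr": 0.00005, "bsz": 64, "ephs": 200}'
params = json.loads(raw)

print(params["lr"], params["bsz"], params["ephs"])  # 5e-05 64 200
```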

# grep test iemocap/sess?/exp/log
# iemocap/sess1/exp/log:[test] score: 0.459, loss: 1.278
# iemocap/sess2/exp/log:[test] score: 0.542, loss: 1.190
# iemocap/sess3/exp/log:[test] score: 0.542, loss: 1.195
# iemocap/sess4/exp/log:[test] score: 0.521, loss: 1.214
# iemocap/sess5/exp/log:[test] score: 0.513, loss: 1.226

# grep test msp_improv/sess?/exp/log
# msp_improv/sess1/exp/log:[test] score: 0.493, loss: 1.238
# msp_improv/sess2/exp/log:[test] score: 0.485, loss: 1.249
# msp_improv/sess3/exp/log:[test] score: 0.526, loss: 1.208
# msp_improv/sess4/exp/log:[test] score: 0.502, loss: 1.225
# msp_improv/sess5/exp/log:[test] score: 0.474, loss: 1.261
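Averaging the five per-session test scores above gives a rough overall cross-validation figure (unweighted means of the logged values):

```python
# Per-session test scores copied from the logs above.
iemocap = [0.459, 0.542, 0.542, 0.521, 0.513]
msp_improv = [0.493, 0.485, 0.526, 0.502, 0.474]

for name, scores in [("iemocap", iemocap), ("msp_improv", msp_improv)]:
    print(f"{name}: {sum(scores) / len(scores):.3f}")
# iemocap: 0.515
# msp_improv: 0.496
```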
