LEARNING ROBUST SELF-ATTENTION FEATURES FOR SPEECH EMOTION RECOGNITION WITH LABEL-ADAPTIVE MIXUP

Lei Kang, Lichao Zhang, Dazhi Jiang.

Accepted to ICASSP 2023.

Hardware and Software:

i9-10900
64GB RAM
RTX3090 (24GB)
Ubuntu 22.04
Python 3.8
PyTorch 1.12

Dataset

To make our results comparable to state-of-the-art works [2, 3, 18], we merge the "excited" category into "happy" and use speech data from four categories: "angry", "happy", "sad", and "neutral". This yields 5531 acoustic utterances in total from 5 sessions and 10 speakers. We report our final results using the widely used Leave-One-Session-Out (LOSO) 5-fold cross-validation: at each fold, 8 speakers in 4 sessions are used for training, while the remaining 2 speakers in 1 session are used for testing.
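The label merging and LOSO split described above can be sketched as follows. This is a minimal illustration, not the code in dataset_wavMix.py: the (session_id, speaker_id, label) tuple layout and the names LABEL_MAP, filter_and_merge, and loso_folds are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical label map: "excited" is folded into "happy";
# any label not listed here is dropped.
LABEL_MAP = {
    "angry": "angry",
    "happy": "happy",
    "excited": "happy",
    "sad": "sad",
    "neutral": "neutral",
}


def filter_and_merge(utterances):
    """Keep only the four target categories, merging "excited" into "happy".

    Each utterance is assumed to be a (session_id, speaker_id, label) tuple.
    """
    return [
        (session, speaker, LABEL_MAP[label])
        for session, speaker, label in utterances
        if label in LABEL_MAP
    ]


def loso_folds(utterances):
    """Yield (held_out_session, train, test), one fold per session."""
    by_session = defaultdict(list)
    for utt in utterances:
        by_session[utt[0]].append(utt)
    for held_out in sorted(by_session):
        test = by_session[held_out]
        train = [
            utt
            for session, items in by_session.items()
            if session != held_out
            for utt in items
        ]
        yield held_out, train, test
```

With 5 sessions of 2 speakers each, this produces 5 folds, each training on 8 speakers and testing on the 2 speakers of the held-out session, so no speaker appears in both sets.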

Train the model

Modify the dataset URL in dataset_wavMix.py to match your environment.
Start training by running python train.py. Training information is printed once per epoch.

Architecture of the proposed method

Comparison with the state of the art

Citation

If you are using the code or benchmarks in your research, please cite our paper:

Lei Kang, Lichao Zhang, Dazhi Jiang. "Learning Robust Self-attention Features for Speech Emotion Recognition with Label-adaptive Mixup", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023), Rhodes Island, Greece, Jun 2023.
