S3PRL-SER

S3PRL for Speech Emotion Recognition. See s3prl > downstream for supported speech emotion datasets.

Environment compatibilities

We support the following environments. The test cases are run with tox locally and on GitHub Actions:

Env	versions
os	`ubuntu-18.04`, `ubuntu-20.04`
python	`3.7`, `3.8`, `3.9`, `3.10`
pytorch	`1.13.1`

Supported SER datasets (Status, WA, UA)

CMU-MOSEI (done, 0.65, 0.24)
IEMOCAP (in-progress, 0.73, 0.71)
MSP-IMPROV (in-progress, 0.67, 0.64)
MSP-Podcast (in-progress, 0.71, 0.54)
JTES (in-progress, 0.78, 0.78)
EmoFilm (in-progress, 0.XX, 0.XX)
AESDD (planned)
CaFE (planned)
SAVEE (planned)
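
The WA and UA columns are not defined in this README. The sketch below assumes the common SER convention: WA (weighted accuracy) is the overall accuracy over all samples, and UA (unweighted accuracy) is the mean of the per-class recalls. The function names and the toy labels are hypothetical and are not taken from s3prl-ser or from any dataset listed above.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)  # samples per true class
    hits = defaultdict(int)   # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced toy labels (assumption, for illustration only).
y_true = ["neu", "neu", "neu", "neu", "hap", "sad"]
y_pred = ["neu", "neu", "neu", "neu", "neu", "sad"]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"WA={wa:.2f} UA={ua:.2f}")
```

Under this convention, a model that favors the majority class scores a high WA but a lower UA, so a large gap between the two numbers typically points to class imbalance in the evaluation set.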

Introduction and Usages

This is an open-source toolkit called s3prl-ser, which stands for Self-Supervised Speech Pre-training and Representation Learning for Speech Emotion Recognition. Self-supervised speech pre-trained models are called upstream models in this toolkit and are used in various downstream tasks.

Unlike the original S3PRL, S3PRL-SER supports a single usage, Downstream:

Downstream

Utilize upstream models in lots of downstream tasks
Benchmark upstream models with SUPERB Benchmark
Document: downstream/README.md

Please refer to the original S3PRL repository if you want to experiment with Pre-train and Upstream usages.

Below is an intuitive illustration on how this toolkit may help you:

Feel free to use or modify our toolkit in your research. Here is a list of papers using our toolkit. Questions, bug reports, and improvement suggestions are welcome; please open a new issue.

If you find this toolkit helpful to your research, please do consider citing our papers, thanks!

Installation

Python >= 3.8
Install sox on your OS
Install s3prl: read the documentation or run `pip install -e ".[all]"`
(Optional) Some upstream models require special dependencies. If you encounter an error with a specific upstream model, look into the README.md under the corresponding upstream folder, e.g., upstream/pase/README.md
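
The first two requirements above can be checked before installing. The following is a minimal sketch and is not part of s3prl-ser: the function name `check_environment` is hypothetical, and it assumes that "Install sox" means a `sox` executable is available on the PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # taken from the "Python >= 3.8" requirement above


def check_environment(min_python=MIN_PYTHON, tool="sox"):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        needed = ".".join(map(str, min_python))
        problems.append(f"Python {needed}+ required, found {found}")
    # Assumption: the requirement is satisfied by an executable on the PATH.
    if shutil.which(tool) is None:
        problems.append(f"'{tool}' executable not found on PATH")
    return problems


problems = check_environment()
print("\n".join(problems) if problems else "environment looks OK")
```

If the list is empty, the editable install described above should be the next step.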

Development pattern for contributors

Create a personal fork of the main S3PRL repository on GitHub.
Make your changes in a named branch other than master, e.g. `new-awesome-feature`.
Contact us if you have any questions during development.
Generate a pull request through the Web interface of GitHub.
Please verify that your code is free of basic mistakes. We appreciate any contribution!

Reference Repositories

PyTorch, PyTorch.
Audio, PyTorch.
Kaldi, Kaldi-ASR.
Transformers, Hugging Face.
PyTorch-Kaldi, Mirco Ravanelli.
fairseq, Facebook AI Research.
CPC, Facebook AI Research.
APC, Yu-An Chung.
VQ-APC, Yu-An Chung.
NPC, Alexander-H-Liu.
End-to-end-ASR-Pytorch, Alexander-H-Liu.
Mockingjay, Andy T. Liu.
ESPnet, Shinji Watanabe.
speech-representations, AWS Labs.
PASE, Santiago Pascual and Mirco Ravanelli.
LibriMix, Joris Cosentino and Manuel Pariente.

License

The majority of the S3PRL Toolkit is licensed under the Apache License version 2.0; however, all files authored by Facebook, Inc. (which carry an explicit copyright statement at the top) are licensed under CC-BY-NC.

Citation

If you find this toolkit useful, please consider citing the following papers.

@article{Atmaja2022h,
  author = {Atmaja, Bagus Tris and Sasou, Akira},
  doi = {10.1109/ACCESS.2022.3225198},
  issn = {2169-3536},
  journal = {IEEE Access},
  pages = {124396--124407},
  title = {{Evaluating Self-Supervised Speech Representations for Speech Emotion Recognition}},
  url = {https://ieeexplore.ieee.org/document/9964237/},
  volume = {10},
  year = {2022}
}

@inproceedings{yang21c_interspeech,
  author={Shu-wen Yang and Po-Han Chi and Yung-Sung Chuang and Cheng-I Jeff Lai and Kushal Lakhotia and Yist Y. Lin and Andy T. Liu and Jiatong Shi and Xuankai Chang and Guan-Ting Lin and Tzu-Hsien Huang and Wei-Cheng Tseng and Ko-tik Lee and Da-Rong Liu and Zili Huang and Shuyan Dong and Shang-Wen Li and Shinji Watanabe and Abdelrahman Mohamed and Hung-yi Lee},
  title={{SUPERB: Speech Processing Universal PERformance Benchmark}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1194--1198},
  doi={10.21437/Interspeech.2021-1775}
}
