This repository contains scripts for creating WHAMR_ext, an extension of the WHAMR corpus with larger RT60 values (between 1 s and 3 s).
Link to our paper: https://arxiv.org/pdf/2204.06439.pdf
See the WHAMR website for the original requirements: http://wham.whisper.ai/
Generating the data requires the wsj0 dataset (https://catalog.ldc.upenn.edu/LDC93S6A/) and the WHAM noise corpus (http://wham.whisper.ai/).
Additional requirements:
$ pip install pandas
Copy and merge the contents of this repository with the original WHAMR simulation scripts: https://storage.googleapis.com/whisper-public/whamr_scripts.tar.gz
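The merge step can be sketched in Python as below. This is a minimal illustration, not part of the repository: the function name merge_into and the directory names in the usage comment are hypothetical, and the layout of the extracted archive should be checked before copying.

```python
import shutil
from pathlib import Path


def merge_into(src: Path, dst: Path) -> None:
    """Copy every file under src into dst.

    Files in dst with the same relative path are overwritten;
    all other files already in dst are left untouched.
    """
    # dirs_exist_ok requires Python 3.8 or newer.
    shutil.copytree(src, dst, dirs_exist_ok=True)


# Hypothetical usage, assuming this repository was cloned to ./WHAMR_ext
# and the original scripts were extracted to ./whamr_scripts:
#   merge_into(Path("WHAMR_ext"), Path("whamr_scripts"))
```

Copying this repository into the extracted scripts directory (rather than the other way round) means files from this repository take precedence wherever both provide the same path.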
To re-run the reverb parameter generation with a new seed, edit data/extended.py and run:
$ cd data
$ python extended.py
$ cd ../
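To illustrate why changing the seed yields a different but reproducible set of reverb parameters, here is a minimal sketch of seeded RT60 sampling in the 1 s to 3 s range. It is an assumption-laden illustration only: the function sample_rt60s is hypothetical, and the uniform distribution and the use of the standard random module are not taken from data/extended.py, which may sample differently and generates more parameters than RT60.

```python
import random


def sample_rt60s(n, seed, low=1.0, high=3.0):
    """Draw n RT60 values (in seconds) from [low, high] using a fixed seed.

    Hypothetical helper: uniform sampling is an assumption made for
    illustration, not the method used by data/extended.py.
    """
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.uniform(low, high) for _ in range(n)]


# The same seed always reproduces the same values; a new seed gives a new set.
first = sample_rt60s(5, seed=0)
again = sample_rt60s(5, seed=0)
other = sample_rt60s(5, seed=1)
```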
To simulate WHAMR_ext:
$ python create_whamr_ext_from_scratch.py \
    --wsj0-root /path/to/the/wsj/dataset/ \
    --wham-noise-root /path/to/wham_noise/ \
    --output-dir /path/to/output/directory/
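Because the simulation can run for a long time, it may help to confirm that the two input directories exist before launching it. The helper below is a hypothetical sketch and not part of the repository; it checks only the input roots, on the assumption that the output directory may be created by the script.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the entries of paths that are not existing directories."""
    return [p for p in paths if not Path(p).is_dir()]


# Hypothetical usage with the placeholder paths from the command above:
#   problems = missing_dirs(["/path/to/the/wsj/dataset/", "/path/to/wham_noise/"])
#   if problems:
#       raise SystemExit(f"Missing input directories: {problems}")
```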
Please cite the original WHAMR paper:
@inproceedings{Maciejewski2020WHAMR,
title = {WHAMR!: Noisy and Reverberant Single-Channel Speech Separation},
author = {Maciejewski, Matthew and Wichern, Gordon and Le Roux, Jonathan},
booktitle = {Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year = {2020},
month = may
}
Please also cite our paper:
@misc{https://doi.org/10.48550/arxiv.2204.06439,
title = {Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation},
author = {Ravenscroft, William and Goetze, Stefan and Hain, Thomas},
year = {2022},
month = apr,
doi = {10.48550/ARXIV.2204.06439},
url = {https://arxiv.org/abs/2204.06439},
publisher = {arXiv},
copyright = {Creative Commons Attribution 4.0 International}
}