PyTorch implementation of the 2021 INTERSPEECH paper "Scene-Agnostic Multi-Microphone Speech Dereverberation".
The code for generating the BIUREV/BIUREV-N datasets described in the paper can be found here.
- Acquire the BIUREV/BIUREV-N datasets.
- Python 3.
- barbar package (pip install barbar).
- wandb (pip install wandb).
- data: a directory that handles data preparation and the PyTorch Dataset object.
- networks: a directory where the DSS and Ernst et al. models are defined.
- train.py: code for training a network.
- test.py: code for evaluating a network.
- losses.py: implements the loss functions.
- taskfiles: text files comprising lists of WAV files used for evaluation (in essence, all test files in the BIUREV/BIUREV-N datasets).
- Extract spectrograms from the BIUREV/BIUREV-N datasets to be used for training and validation. In data/prepare_dataset.py, set the variables clean_wavs_dir and reverb_wavs_dir to the directories where the clean and reverberant speech signals are stored. Make sure to change the variables in both the training (lines 154-155) and evaluation (lines 168-169) parts.
- Run:
python data/prepare_dataset.py --mics_num <desired number of microphones> --dataset <BIUREV or BIUREV-N>
This can take from a few minutes to a couple of hours, depending on the number of microphones.
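As a rough illustration of what extracting spectrograms from a multi-microphone recording involves, the sketch below computes log-magnitude STFT spectrograms with torch.stft. The FFT size, hop length, Hann window, 16 kHz sampling rate and log compression are assumptions chosen for illustration; the actual settings are whatever data/prepare_dataset.py uses. The function name multichannel_log_spectrogram is hypothetical.

```python
import torch


def multichannel_log_spectrogram(wave, n_fft=512, hop_length=128):
    """Return log-magnitude spectrograms of shape (mics, freq_bins, frames).

    wave: float tensor of shape (mics, samples).
    n_fft and hop_length are illustrative values, not the repository's settings.
    """
    window = torch.hann_window(n_fft)
    # torch.stft treats the leading dimension as a batch, so each microphone
    # channel is transformed independently.
    spec = torch.stft(wave, n_fft=n_fft, hop_length=hop_length,
                      window=window, return_complex=True)
    # A small constant avoids log(0) on silent frames.
    return torch.log(spec.abs() + 1e-8)


# One second of synthetic 8-microphone audio (assumed 16 kHz) stands in for a real recording.
wave = torch.randn(8, 16000)
log_spec = multichannel_log_spectrogram(wave)
print(log_spec.shape)
```

Stacking the microphones along the first dimension keeps the per-channel spectrograms together, which suits a model that processes a variable number of microphones.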
- Run:
python train.py --mics_num <choose from 1-8> --dataset <BIUREV or BIUREV-N> --unet_arch <dss or vanilla> --gpu_ids <for example 0 or 0,1,2,3>
This will create a new directory named trained_networks.
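The training flags above could be parsed along the following lines. This is a minimal sketch using argparse from the standard library. It mirrors the documented flag names and value ranges, but the defaults and the helper names build_parser and parse_gpu_ids are assumptions, not the contents of train.py.

```python
import argparse


def parse_gpu_ids(text):
    """Turn a string such as "0" or "0,1,2,3" into a list of ints."""
    return [int(part) for part in text.split(",") if part.strip()]


def build_parser():
    # Flag names follow the README; defaults are assumptions for illustration.
    parser = argparse.ArgumentParser(description="Train a dereverberation network (sketch).")
    parser.add_argument("--mics_num", type=int, choices=range(1, 9), default=8,
                        help="number of microphones, 1-8")
    parser.add_argument("--dataset", choices=["BIUREV", "BIUREV-N"], default="BIUREV")
    parser.add_argument("--unet_arch", choices=["dss", "vanilla"], default="dss")
    parser.add_argument("--gpu_ids", type=parse_gpu_ids, default=[0],
                        help='comma-separated GPU indices, e.g. "0" or "0,1,2,3"')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--mics_num", "4", "--dataset", "BIUREV-N", "--unet_arch", "vanilla", "--gpu_ids", "0,1"]
)
print(args.mics_num, args.dataset, args.unet_arch, args.gpu_ids)
```

Restricting each flag with choices means that a mistyped dataset or architecture name fails immediately rather than partway through training.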
- Change line 208 so that the variable wavs_dir refers to the parent directory of BIUREV/BIUREV-N.
- Run:
python test.py --version_name <name of version> --dataset <BIUREV or BIUREV-N> --unet_arch <dss or vanilla>
For example:
python test.py --version_name mics8_24.03.2021_05:33:29 --dataset BIUREV --unet_arch dss
The number of microphones is inferred from the version's name.
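One way the microphone count might be recovered from a version name such as mics8_24.03.2021_05:33:29 is a regular expression on the leading mics<N>_ prefix. The sketch below assumes that naming pattern, taken from the example above. The function name mics_from_version is hypothetical, and test.py may do this differently.

```python
import re


def mics_from_version(version_name):
    """Extract the microphone count from a name like 'mics8_24.03.2021_05:33:29'."""
    # Assumed pattern: the literal "mics", one or more digits, then an underscore.
    match = re.match(r"mics(\d+)_", version_name)
    if match is None:
        raise ValueError(f"cannot infer microphone count from {version_name!r}")
    return int(match.group(1))


print(mics_from_version("mics8_24.03.2021_05:33:29"))  # prints 8
```

Raising an error on an unexpected name avoids silently evaluating with the wrong number of microphones.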