This is a PyTorch implementation of "MetricGAN+" (Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao, Interspeech 2021). The repository uses only PyTorch and is based on the SpeechBrain implementation.
🔔 We are pleased to announce that our related work, MetricGAN-OKD, has been accepted at ICML 2023. 🔔
In a Python 3.8 environment, install the requirements with:
pip install -r requirements.txt
pip install torch==1.7.1+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
We have verified that the code runs with PyTorch versions 1.7.1 through 1.10.1.
VoiceBank-DEMAND: Please download clean_trainset_28spk_wav.zip, noisy_trainset_28spk_wav.zip, clean_testset_wav.zip, and noisy_testset_wav.zip from here and extract each archive to the matching folder under data/VCTK_DEMAND_48k/train(or test)/clean(or noisy).
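The extraction step can be scripted. The following is a minimal sketch, not part of this repository: the function name `extract_all` and the `download_dir` argument are hypothetical, and it assumes each archive stores its `.wav` files inside a single top-level folder, which the sketch flattens away.

```python
import shutil
import zipfile
from pathlib import Path

# Archive names are taken from the README; the (split, kind) pairs map each
# archive to data/VCTK_DEMAND_48k/<split>/<kind>.
ARCHIVES = {
    "clean_trainset_28spk_wav.zip": ("train", "clean"),
    "noisy_trainset_28spk_wav.zip": ("train", "noisy"),
    "clean_testset_wav.zip": ("test", "clean"),
    "noisy_testset_wav.zip": ("test", "noisy"),
}


def extract_all(download_dir, data_root="data/VCTK_DEMAND_48k"):
    """Extract the .wav files of every archive found in download_dir.

    Returns a dict mapping archive name to the number of files written.
    Archives that are not present are skipped silently.
    """
    extracted = {}
    for name, (split, kind) in ARCHIVES.items():
        archive = Path(download_dir) / name
        if not archive.exists():
            continue
        dest = Path(data_root) / split / kind
        dest.mkdir(parents=True, exist_ok=True)
        count = 0
        with zipfile.ZipFile(archive) as zf:
            for member in zf.infolist():
                if member.is_dir() or not member.filename.lower().endswith(".wav"):
                    continue
                # Assumption: drop any folder prefix stored inside the archive.
                target = dest / Path(member.filename).name
                with zf.open(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                count += 1
        extracted[name] = count
    return extracted
```

Calling `extract_all("downloads")` would then populate the four folders from whichever archives are present in `downloads`.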
The original dataset is sampled at 48 kHz. Downsample the audio files from 48 kHz to 16 kHz as follows:
python downsample.py
The final folder structure should look like this:
MetricGAN+
├── ...
├── data
│ ├── VCTK_DEMAND
│ │ ├── train
│ │ │ ├── clean
│ │ │ ├── noisy
│ │ ├── test
│ │ │ ├── clean
│ │ │ ├── noisy
├── ...
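Before training, it can help to confirm that the folders above exist and contain audio. This is a small sketch under stated assumptions: `check_layout` is a hypothetical helper, not a script shipped with this repository, and it only counts `*.wav` files directly inside each folder.

```python
from pathlib import Path

# The four leaf folders shown in the tree above.
EXPECTED_SUBDIRS = ("train/clean", "train/noisy", "test/clean", "test/noisy")


def check_layout(data_root="data/VCTK_DEMAND"):
    """Return {subfolder: number of .wav files}, or None if the folder is missing."""
    root = Path(data_root)
    report = {}
    for sub in EXPECTED_SUBDIRS:
        folder = root / sub
        if folder.is_dir():
            report[sub] = len(list(folder.glob("*.wav")))
        else:
            report[sub] = None
    return report
```

A `None` entry points to a missing folder, and differing counts between the `clean` and `noisy` folders of the same split suggest an incomplete extraction or downsampling run.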
python main.py --exp_name=exp1 --target_metric pesq
You can change the hyperparameters (target_metric, epochs, batch_size, hist_portion, lr, ...).
python main.py --exp_name=exp2_csig_hist0.1 --target_metric csig --hist_portion=0.1
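To run several configurations in a row, the command lines can be generated programmatically. The sketch below is illustrative only: `build_commands` is a hypothetical helper, the experiment naming scheme is an assumption, and it uses only the `--exp_name`, `--target_metric`, and `--hist_portion` flags that appear in the commands above.

```python
import itertools


def build_commands(metrics, hist_portions, script="main.py"):
    """Build one argument list per (target metric, hist_portion) combination."""
    commands = []
    for metric, hist in itertools.product(metrics, hist_portions):
        # Assumed naming scheme, modelled on "exp2_csig_hist0.1" above.
        exp_name = f"exp_{metric}_hist{hist}"
        commands.append([
            "python", script,
            f"--exp_name={exp_name}",
            "--target_metric", metric,
            f"--hist_portion={hist}",
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to launch the runs one after another.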
python inference.py --weight_path results/exp1/model/ --weight_file best_model.pth
We provide results and checkpoints of MetricGAN+ on the VoiceBank-DEMAND dataset.
Target Metric | PESQ | CSIG | CBAK | COVL |
---|---|---|---|---|
PESQ - trial 1 | 3.13 | 4.13 | 3.03 | 3.61 |
PESQ - trial 2 | 3.08 | 4.07 | 3.15 | 3.56 |
PESQ - trial 3 | 3.15 | 4.09 | 3.16 | 3.61 |
CSIG - trial 1 | 3.12 | 4.26 | 3.07 | 3.68 |
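Since the PESQ-targeted model was trained three times, a per-metric average gives a quick summary of run-to-run variation. The snippet below is a minimal sketch that uses only the numbers from the table; the names `PESQ_TRIALS` and `summarize` are hypothetical.

```python
from statistics import mean

# Scores of the three PESQ-targeted trials, copied from the table above.
PESQ_TRIALS = {
    "PESQ": [3.13, 3.08, 3.15],
    "CSIG": [4.13, 4.07, 4.09],
    "CBAK": [3.03, 3.15, 3.16],
    "COVL": [3.61, 3.56, 3.61],
}


def summarize(trials):
    """Return the mean of each metric, rounded to two decimals."""
    return {metric: round(mean(scores), 2) for metric, scores in trials.items()}
```

For the three PESQ-targeted trials, `summarize(PESQ_TRIALS)` gives a mean PESQ of about 3.12.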
Please download the weight files from our release, place them in a directory of your choice, and run inference:
python inference.py --weight_path your/path/ --weight_file PESQ-GAN_trial1.pth
If you find this project useful for your research, please consider exploring our related work, MetricGAN-OKD, and citing it.
@inproceedings{shin2023metricgan,
title={MetricGAN-OKD: multi-metric optimization of MetricGAN via online knowledge distillation for speech enhancement},
author={Shin, Wooseok and Lee, Byung Hoon and Kim, Jin Sob and Park, Hyun Joon and Han, Sung Won},
booktitle={International Conference on Machine Learning},
pages={31521--31538},
year={2023},
organization={PMLR}
}