TAPLoss

Official PyTorch implementation of TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement, accepted to ICASSP 2023.

Prerequisites

TAPLoss requires NumPy and PyTorch; install versions that are compatible with your project. Refer to the Demucs and FullSubNet repositories for the packages required by those enhancement models.

Dataset

The TAP estimator is trained on the clean portion of the DNS 2020 dataset. Download the DNS Interspeech 2020 dataset from here and follow its instructions to prepare the data.

Temporal Acoustic Parameters

Ground-truth temporal acoustic parameters are calculated using openSMILE. The 25 temporal acoustic parameters are listed below; a short extraction sketch follows the list.

  • Loudness_sma3
  • alphaRatio_sma3
  • hammarbergIndex_sma3
  • slope0-500_sma3
  • slope500-1500_sma3
  • spectralFlux_sma3
  • mfcc1_sma3
  • mfcc2_sma3
  • mfcc3_sma3
  • mfcc4_sma3
  • F0semitoneFrom27.5Hz_sma3nz
  • jitterLocal_sma3nz
  • shimmerLocaldB_sma3nz
  • HNRdBACF_sma3nz
  • logRelF0-H1-H2_sma3nz
  • logRelF0-H1-A3_sma3nz
  • F1frequency_sma3nz
  • F1bandwidth_sma3nz
  • F1amplitudeLogRelF0_sma3nz
  • F2frequency_sma3nz
  • F2bandwidth_sma3nz
  • F2amplitudeLogRelF0_sma3nz
  • F3frequency_sma3nz
  • F3bandwidth_sma3nz
  • F3amplitudeLogRelF0_sma3nz
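
As a minimal sketch of how such ground-truth parameters could be extracted (assuming the opensmile Python package and its eGeMAPSv02 low-level descriptors, which this repository does not ship; the file name is illustrative):

import opensmile

# eGeMAPSv02 low-level descriptors: one row per frame, one column per parameter.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
)
lld = smile.process_file("clean_utterance.wav")  # hypothetical clean utterance
print(lld.columns.tolist())  # frame-level acoustic parameter names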

Use TAPLoss in your own project

  1. You'll need TAPLoss.py and TAP_estimator.py, located under ./TAPLoss/. The same directory also contains a .pt file, which is the pretrained TAP estimator checkpoint. Import the loss:
from TAPLoss import AcousticLoss
  2. Initialize TAPLoss
TAPloss = AcousticLoss(loss_type, acoustic_model_path)
# loss_type: choose one from ["l2", "l1", "frame_energy_weighted_l2", "frame_energy_weighted_l1"]
# acoustic_model_path: path to the pretrained TAP estimator .pt checkpoint.
  3. Call TAPloss as a function
loss = TAPloss(clean_waveform, enhan_waveform, mode)
# clean_waveform: waveform of the clean audio, sampled at 16 kHz.
# enhan_waveform: waveform of the enhanced audio, sampled at 16 kHz.
# mode: "train" if your enhancement model is in train mode,
#        "eval" if your enhancement model is in eval mode,
#        gradients won't be calculated in eval mode.
  4. Optimization
  • You should not update the TAP estimator parameters during training; they are supposed to stay frozen.
  • You can either fine-tune a pre-trained enhancement model with TAPLoss as an auxiliary loss or train from scratch. If you train from scratch with TAPLoss, make sure its weight is not too large in the early stages, since acoustic estimation from noisy speech is inaccurate (the TAP estimator is trained on clean speech). You can control the weight with a scheduler, as sketched below.
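
The sketch below ties the steps together in a training loop. It is only an illustration: the base L1 loss, learning rate, checkpoint path, maximum TAPLoss weight, and warm-up schedule are placeholder assumptions, not values prescribed by this repository.

import torch
from TAPLoss import AcousticLoss

def train_with_taploss(enhancement_model, train_loader, num_epochs,
                       estimator_ckpt="TAPLoss/tap_estimator.pt"):  # illustrative path
    # Only the enhancement model's parameters go to the optimizer,
    # so the TAP estimator inside AcousticLoss stays frozen.
    optimizer = torch.optim.Adam(enhancement_model.parameters(), lr=1e-4)
    tap_loss = AcousticLoss("frame_energy_weighted_l2", estimator_ckpt)

    def tap_weight(epoch, max_weight=0.1, warmup_epochs=10):
        # Linear warm-up keeps the TAPLoss contribution small early on,
        # when acoustic estimates from imperfectly enhanced speech are unreliable.
        return max_weight * min(1.0, epoch / warmup_epochs)

    for epoch in range(num_epochs):
        for noisy, clean in train_loader:
            enhanced = enhancement_model(noisy)
            base_loss = torch.nn.functional.l1_loss(enhanced, clean)  # your usual SE loss
            loss = base_loss + tap_weight(epoch) * tap_loss(clean, enhanced, "train")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()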

Related Resources

More details about the official implementation of PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement can be found at https://github.com/muqiaoy/PAAP.

Acknowledgement

Our experiments were based on enhancement models adapted from Demucs and FullSubNet. We thank the authors of Real Time Speech Enhancement in the Waveform Domain and FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement for open-sourcing their projects.

Citation

Please cite our paper if you use our code for your research.

@article{zeng2023taploss,
  title={TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement},
  author={Zeng, Yunyang and Konan, Joseph and Han, Shuo and Bick, David and Yang, Muqiao and Kumar, Anurag and Watanabe, Shinji and Raj, Bhiksha},
  journal={arXiv preprint arXiv:2302.08088},
  year={2023}
}