CDiffuSE

CDiffuSE is a speech enhancement algorithm that builds on recent advances in diffusion probabilistic models and incorporates characteristics of the observed noisy speech signal into the diffusion and reverse processes. More specifically, we propose a generalized formulation of the diffusion probabilistic model, named the conditional diffusion probabilistic model, whose reverse process can adapt to non-Gaussian real noise in the estimated speech signal. For details, see the paper: Conditional Diffusion Probabilistic Model for Speech Enhancement.

Audio samples

16 kHz audio samples

Pretrained models

Pretrained CDiffuSE models are released. The large model is the same as the one described in the paper. The base model was trained without the mel-spectrum pre-training phase and performs slightly better than the results reported in the paper.

Training

Before you start training, you'll need to prepare a training dataset. The default dataset is VOICEBANK-DEMAND. You can download it from VOICEBANK-DEMAND and resample it to 16 kHz. By default, this implementation assumes a sample rate of 16 kHz. If you need to change this value, edit params.py.
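The resampling step can be done with any audio tool. The following is a minimal sketch that assumes the SoX command-line tool (sox) is installed; the function name resample_dir and its directory arguments are hypothetical and are not part of the CDiffuSE scripts.

```shell
#!/usr/bin/env bash
# Sketch: resample every .wav file in SRC to 16 kHz and write it to DST.
# Assumption: the SoX CLI (`sox`) is available on PATH.
# resample_dir is a hypothetical helper, not part of this repository.
resample_dir() {
  local src="$1" dst="$2" rate="${3:-16000}"
  local f
  mkdir -p "$dst"
  for f in "$src"/*.wav; do
    [ -e "$f" ] || continue   # skip when the glob matches no files
    # `-r` placed before the output file sets the output sample rate
    sox "$f" -r "$rate" "$dst/$(basename "$f")"
  done
}

# Usage (placeholder directories):
# resample_dir [path_to_original_wavs] [path_to_16k_wavs]
```

The resampled files are written to a separate directory so the original recordings stay untouched.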

Set the output path and data path in path.sh:

output_path=[path_to_output_directory]
voicebank=[path_to_voicebank_directory]
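A misconfigured path is easier to catch before training starts than after. The sketch below checks the two variables from path.sh; check_paths is a hypothetical helper, and creating the output directory in advance is an assumption about what is convenient, not something the training scripts are known to require.

```shell
#!/usr/bin/env bash
# Sketch: fail early if the values set in path.sh are empty or unusable.
# check_paths is a hypothetical helper, not part of this repository.
check_paths() {
  local output_path="$1" voicebank="$2"
  if [ -z "$output_path" ] || [ -z "$voicebank" ]; then
    echo "output_path and voicebank must both be set" >&2
    return 1
  fi
  if [ ! -d "$voicebank" ]; then
    echo "voicebank directory not found: $voicebank" >&2
    return 1
  fi
  # Assumption: creating the output directory up front is harmless.
  mkdir -p "$output_path"
}

# Usage, after sourcing path.sh:
# . ./path.sh && check_paths "$output_path" "$voicebank"
```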

Usage: train a speech enhancement (SE) model

./train.sh [stage] [model_directory]

Train an SE model starting from a pre-trained model:

./train.sh [stage] [model_directory] [pretrained_model_directory]/weights-[ckpt].pt

Note that the pre-training step with clean Mel-Spectrum conditioners is no longer needed in CDiffuSE. A randomly initialized CDiffuSE performs as well as one initialized from pre-trained parameters.

Multi-GPU training

By default, this implementation uses as many GPUs in parallel as returned by torch.cuda.device_count(). You can restrict which GPUs are used by setting the CUDA_VISIBLE_DEVICES environment variable before starting training.
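CUDA limits the devices a process can see through the standard CUDA_VISIBLE_DEVICES environment variable, which in turn changes the value torch.cuda.device_count() returns. A minimal sketch of scoping that variable to a single command follows; run_on_gpus is a hypothetical wrapper, and it only has an effect if train.sh does not set the variable itself.

```shell
#!/usr/bin/env bash
# Sketch: run one command with only the listed GPU indices visible.
# run_on_gpus is a hypothetical wrapper, not part of this repository.
run_on_gpus() {
  local gpus="$1"
  shift
  # The assignment applies only to the command being launched.
  CUDA_VISIBLE_DEVICES="$gpus" "$@"
}

# Usage (placeholders as in the training usage line):
# run_on_gpus 0,1 ./train.sh [stage] [model_directory]
```

Scoping the variable to one command avoids leaving it exported in the shell, where it would silently affect later runs.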

Validation and Inference API

Usage:

./valid.sh [stage] [model name] [checkpoint id] 
./inference.sh [stage] [model name] [checkpoint id]

References

The CDiffuSE code is based on the DiffWave code.
