Regeneration Enhancer

PyTorch implementation of the paper "High Fidelity Speech Regeneration With Application to Speech Enhancement"
This repository provides a PyTorch implementation of speech enhancement via regeneration. The algorithm follows the paper, but several changes were made to feature extraction and, consequently, to the model parameters.

TODO list:

  • add inference scripts
  • implement streaming model and its inference
  • provide multilingual enhancement models (and adapt feature extraction too)
  • make pypi package
  • release pretrained models

Requirements

This repository is tested on Ubuntu 16.04 with a GTX 1080 Ti GPU.

  • libsndfile (on Ubuntu you can install it via sudo apt install libsndfile1-dev)
  • pip requirements (defined in requirements.txt, install via pip install -r requirements.txt):
    • hydra-core 1.0.6+
    • pytorch 1.7+
    • torchaudio 0.7.2+
    • librosa 0.8.0+
    • pytest 6.2.0+
    • transformers 4.3.0+, pyworld 0.2.12+, pyannote.audio 2.0+ (for feature extraction)
  • (optional) ffmpeg (for .mp3 support; on Ubuntu you can install it via sudo apt install ffmpeg)

Installation

git clone https://github.com/SolomidHero/speech-regeneration-enhancer
pip install -e ./speech-regeneration-enhancer

Training

For training you should use the DAPS dataset, or any dataset with similar file naming (folder structure doesn't matter):

data_folder/
  wav_1_clean.wav
  dirty/
    wav_1_recoder_bathroom.wav
    wav_2_microphone_street.wav
  some_sub_tree/
    wav_2_clean.wav
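The naming convention above pairs files by a shared index: each `*_clean.wav` target has noisy variants carrying the same index. A minimal sketch of how such pairs could be collected — the helper name and the regex are illustrative, not part of this repository:

```python
import re
from pathlib import Path

def collect_pairs(data_folder):
    """Map each recording index to its clean file and noisy variants."""
    pairs = {}
    for path in Path(data_folder).rglob("*.wav"):
        # File names are assumed to look like wav_<idx>_<condition>.wav
        match = re.match(r"wav_(\d+)_(.+)\.wav", path.name)
        if match is None:
            continue
        idx, condition = match.groups()
        entry = pairs.setdefault(idx, {"clean": None, "noisy": []})
        if condition == "clean":
            entry["clean"] = path
        else:
            entry["noisy"].append(path)
    return pairs
```

Because the lookup walks the tree recursively, clean and noisy files may live in any subfolders, matching the layout shown above.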

This repository uses Hydra for configuration, so for training and inference you only need to edit the config.yaml file. Parameters can also be overridden directly from the command line.
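Hydra resolves `section.key=value` command-line arguments into the nested config. Conceptually, an override works like the toy sketch below — this is a simplified illustration of the idea, not Hydra's actual implementation (real Hydra also casts values to their declared types, which this version skips):

```python
def apply_override(config: dict, override: str) -> dict:
    """Apply one 'a.b.c=value' style override to a nested config dict."""
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})  # descend, creating sections as needed
    node[leaf] = value  # real Hydra would also convert the value's type
    return config
```

So `train.epochs=50` on the command line sets `config["train"]["epochs"]`, leaving every other key in config.yaml untouched.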

When changes to the config are made, you can check whether your parameters are acceptable with either of these commands:

pytest                       # check that everything is working
pytest tests/test_scripts.py # check that the training process can run

  1. After downloading the data and editing the config, run the preprocessing script (feature extraction happens here):

python preprocess.py dataset.wav_dir=/path/to/wavs # parameters can also be set directly in the config

  2. Finally, train the model:

python train.py train.epochs=50 train.ckpt_dir=/path/to/ckpts # parameters can also be set directly in the config

From then on, checkpoints for the generator and other components (discriminator, optimizers) will appear in /path/to/ckpts.
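To resume training or run inference from a saved checkpoint, the generator weights can be restored with standard PyTorch loading. A hedged sketch — the `"generator"` key and the helper name are assumptions about the checkpoint layout, not the repository's documented schema:

```python
import torch

def load_generator(ckpt_path: str, generator: torch.nn.Module) -> torch.nn.Module:
    # Load onto CPU first so the checkpoint opens regardless of training device.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # "generator" as the state-dict key is an assumption about the layout.
    generator.load_state_dict(ckpt["generator"])
    return generator
```

Loading with `map_location="cpu"` avoids failures when a checkpoint trained on GPU is opened on a CPU-only machine; the model can be moved to the desired device afterwards.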

Reference

"High Fidelity Speech Regeneration With Application to Speech Enhancement"