Conformer-VC is inspired by Non-autoregressive sequence-to-sequence voice conversion, a parallel voice conversion method powered by Conformer.
The differences from the original paper are:
- No reduction factor is used.
- Mel-spectrograms are not normalized by speaker statistics.
- Durations are extracted by DTW, not by a pretrained autoregressive model.
- HiFi-GAN is used instead of ParallelWaveGAN.
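To illustrate the DTW-based duration extraction mentioned above, here is a minimal NumPy sketch (a hypothetical helper, not the repo's actual implementation): it aligns a source and target mel-spectrogram with plain DTW and counts how many target frames map to each source frame.

```python
import numpy as np

def dtw_durations(src, tgt):
    """Toy DTW-based duration extraction (illustrative only).

    src: (T_src, D) source mel-spectrogram.
    tgt: (T_tgt, D) target mel-spectrogram.
    Returns an int array of length T_src whose entries sum to T_tgt.
    """
    T, U = len(src), len(tgt)
    # frame-wise Euclidean distance matrix
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)
    # accumulated cost with match / source-skip / target-skip moves
    acc = np.full((T + 1, U + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, T + 1):
        for j in range(1, U + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # backtrack the optimal warping path
    path, i, j = [], T, U
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        k = int(np.argmin((acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])))
        i, j = (i - 1, j - 1) if k == 0 else (i - 1, j) if k == 1 else (i, j - 1)
    # assign each target frame to the first source frame it aligns with,
    # then count assignments per source frame
    first_src = {}
    for si, tj in reversed(path):
        first_src.setdefault(tj, si)
    return np.bincount([first_src[t] for t in range(U)], minlength=T)
```

The real preprocessing uses a Cython DTW (built in the Preprocess step below), but the idea is the same: the warping path turns frame-level alignment into per-frame duration targets.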
- Requirements
- pytorch
- numpy
- pyworld
- accelerate
- soundfile
- librosa
- cython
- omegaconf
- tqdm
- resemblyzer
- matplotlib
- scipy
If you get an error about a missing package, please install it.
- Preprocess
If you want to train on your own dataset, edit configs/preprocess.yaml and preprocess.py accordingly.
Note that the numbers of source and target files must be equal, and their file IDs must match.
$ cd dtw && python setup.py build_ext --inplace && cd ..
$ python preprocess.py
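The source and target corpora must contain exactly the same file IDs. A small sanity check you could run before preprocessing (a hypothetical helper, not part of the repo) looks like this:

```python
from pathlib import Path

def check_paired_ids(src_dir, tgt_dir, ext=".wav"):
    """Raise if source and target directories do not contain
    files with exactly the same ids (file-name stems)."""
    src_ids = {p.stem for p in Path(src_dir).glob(f"*{ext}")}
    tgt_ids = {p.stem for p in Path(tgt_dir).glob(f"*{ext}")}
    unpaired = src_ids ^ tgt_ids  # ids present on only one side
    if unpaired:
        raise ValueError(f"unpaired file ids: {sorted(unpaired)}")
    return sorted(src_ids)
```

Running this before `python preprocess.py` catches mismatched pairs early instead of failing mid-extraction.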
- Training
Single-GPU training:
$ ln -s ./dataset/feats DATA
$ python train.py
Multi-GPU training:
$ ln -s ./dataset/feats DATA
$ accelerate config
Answer the questions about your machine.
$ accelerate launch train.py
- Validation
$ python validate.py --model_dir {MODEL_DIR} --hifi_gan {HIFI_GAN_DIR} --data_dir DATA
If this script runs correctly, an outputs directory is generated containing the synthesized wav files.
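To inspect the results, you can enumerate the generated wav files (the directory name `outputs` is assumed from the script's described behavior):

```python
from pathlib import Path

def list_synthesized(out_dir="outputs"):
    """Return all synthesized wav files under the outputs directory,
    sorted by path, searching subdirectories recursively."""
    return sorted(Path(out_dir).rglob("*.wav"))
```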