ishine/conformer-vc-1

Conformer-VC

Conformer-VC is inspired by "Non-autoregressive sequence-to-sequence voice conversion", a parallel voice conversion method powered by Conformer.

The differences from the original paper are:

  • No reduction factor is used.
  • Mel-spectrograms are not normalized by speaker statistics.
  • Durations are extracted by DTW, not by a pretrained autoregressive model.
  • HiFi-GAN is used as the vocoder instead of ParallelWaveGAN.
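
To illustrate the idea of extracting durations by DTW, here is a minimal NumPy sketch. It is not the Cython implementation built from the dtw directory in this repository. The function name `dtw_durations`, the Euclidean frame distance, and the rule that assigns each target frame to the earliest source frame on the alignment path are all assumptions made for illustration.

```python
import numpy as np


def dtw_durations(src, tgt):
    """Align two feature sequences with DTW and return per-source-frame durations.

    src: array of shape (T_src, D), tgt: array of shape (T_tgt, D).
    Returns an int array of length T_src whose sum equals T_tgt.
    (Hypothetical helper, not part of this repository.)
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    n, m = len(src), len(tgt)

    # Pairwise Euclidean distance between every source and target frame.
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)

    # Accumulated cost with a padded first row and column.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = dist[i - 1, j - 1] + best_prev

    # Backtrack from the end. Each target frame is assigned to one source
    # frame; overwriting while walking backwards keeps the earliest one.
    owner = np.zeros(m, dtype=int)
    i, j = n, m
    while i > 0 and j > 0:
        owner[j - 1] = i - 1
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1

    # Duration of a source frame = number of target frames assigned to it.
    return np.bincount(owner, minlength=n)
```

With this assignment rule some source frames may receive a duration of 0, and the durations always sum to the number of target frames, which is the property a length regulator relies on.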

Requirements

  • pytorch
  • numpy
  • pyworld
  • accelerate
  • soundfile
  • librosa
  • cython
  • omegaconf
  • tqdm
  • resemblyzer
  • matplotlib
  • scipy

If you get an error about a missing package, please install it.
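
As a quick way to see which of the requirements are missing before running anything, a small check like the following may help. It is a sketch, not part of the repository. The import names are assumptions based on how these packages are usually imported (for example, pytorch is imported as `torch` and cython as `Cython`), and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Import names for the packages listed above (assumed, see lead-in).
REQUIRED = [
    "torch", "numpy", "pyworld", "accelerate", "soundfile", "librosa",
    "Cython", "omegaconf", "tqdm", "resemblyzer", "matplotlib", "scipy",
]


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```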

Usage

  1. Preprocess

To train on your own dataset, edit configs/preprocess.yaml and preprocess.py accordingly.
Note that the number of source files and the number of target files must be the same, and the file IDs must match.
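
The pairing requirement can be checked before preprocessing with a short script such as the one below. This is a sketch under assumptions: it treats the file name without its extension as the file ID and expects the source and target audio in two separate directories of `.wav` files. The function name `check_paired_ids` and the example directory names are hypothetical.

```python
from pathlib import Path


def check_paired_ids(src_dir, tgt_dir, pattern="*.wav"):
    """Compare file IDs (file name stems) in two directories.

    Returns two sorted lists: IDs found only in src_dir and IDs found
    only in tgt_dir. Both lists are empty when the sets match.
    """
    src_ids = {p.stem for p in Path(src_dir).glob(pattern)}
    tgt_ids = {p.stem for p in Path(tgt_dir).glob(pattern)}
    return sorted(src_ids - tgt_ids), sorted(tgt_ids - src_ids)


# Example usage (directory names are placeholders):
# only_src, only_tgt = check_paired_ids("wav/source", "wav/target")
# if only_src or only_tgt:
#     raise SystemExit(f"Unpaired IDs: {only_src} / {only_tgt}")
```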

$ cd dtw && python setup.py build_ext --inplace && cd ..
$ python preprocess.py
  2. Training

Single-GPU training:

$ ln -s ./dataset/feats DATA
$ python train.py

Or multi-GPU training:

$ ln -s ./dataset/feats DATA
$ accelerate config

Answer the questions about your machine, then run:

$ accelerate launch train.py
  3. Validation
$ python validate.py --model_dir {MODEL_DIR} --hifi_gan {HIFI_GAN_DIR} --data_dir DATA

If the script runs correctly, an outputs directory is generated and the synthesized wav files are saved in it.
