SkipVQVC

Implementation of SkipVQVC with several variant settings. Skip connections are a powerful technique in deep learning. However, in auto-encoder based voice conversion (VC), skip connections are rarely used. They let the model learn too quickly and overfit on reconstruction, and such a model can no longer perform VC. In this paper, we discuss how quantization can form a bottleneck strong enough that a VC model with skip connections still works.
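To make the idea of a quantization bottleneck concrete, here is a minimal sketch of generic nearest-neighbour vector quantization in plain Python. It is not taken from this repository: the function names `nearest_code` and `quantize` and the toy values are hypothetical, and the actual models in model/vqvc* may quantize differently. The codebook size plays the role of the -n training option.

```python
def nearest_code(frame, codebook):
    """Return the index of the codebook vector closest to `frame` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def quantize(frames, codebook):
    """Replace every frame by its nearest codebook vector.

    Only the chosen indices carry information through the bottleneck,
    so a small codebook limits how much detail can pass.
    """
    indices = [nearest_code(f, codebook) for f in frames]
    return [codebook[i] for i in indices], indices


# Toy example: 3 code vectors of dimension 2 (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]
quantized, indices = quantize(frames, codebook)
print(indices)  # [0, 1, 2]
```

A smaller codebook gives a tighter bottleneck, which is why the number of codebook vectors is exposed as a training option.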

Preprocessing

python preprocessing.py [input_dir (VCTK/wav48)] [output_dir npy dir]

File architecture

# File 
- SkipVQVC
  |- logger (some utils used for tensorboard)
  |  |.
  |
  |- trainer (different trainers have different properties)
  |  |- train_normal.py
  |  |- train_rhythm.py (splits speech into a rhythm factor; should be used with the vqvc+_rhythm model)
  |  |- train_mean_std.py (trains with input normalized by mean and std)
  |
  |- model (different models such as normal, speaker vae, rhythm)
  | |- .
  | |- .
  |
  |- utils

Training config

-train_dir is your training dir
-test_dir is your testing dir (unseen speakers)
-m selects the model from model/* (for example: vqvc+)
-n is the number of vectors in the codebook
-ch is the number of channels in the encoder and decoder
-t selects the trainer from trainer/* (for example: train_normal)
--load_checkpoint loads a checkpoint if one exists in the checkpoint dir (for example: True)
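The options above could be declared with `argparse` roughly as follows. This is only a sketch of how such a command line might be parsed, not the actual contents of train.py: the `dest` names, the defaults, and making -train_dir required are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in this README."""
    parser = argparse.ArgumentParser(description="Sketch of the training options")
    parser.add_argument("-train_dir", required=True,
                        help="training dir (required here by assumption)")
    parser.add_argument("-test_dir", default=None,
                        help="testing dir with unseen speakers")
    parser.add_argument("-m", dest="model", default="vqvc+",
                        help="model name under model/ (placeholder default)")
    parser.add_argument("-n", dest="n_embed", type=int, default=128,
                        help="number of vectors in the codebook (placeholder default)")
    parser.add_argument("-ch", dest="channel", type=int, default=128,
                        help="channels in encoder and decoder (placeholder default)")
    parser.add_argument("-t", dest="trainer", default="train_normal",
                        help="trainer name under trainer/ (placeholder default)")
    parser.add_argument("--load_checkpoint", default=None,
                        help="for example: True")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["-train_dir", "your-path-to-npy-dir", "-m", "vqvc+",
     "-n", "64", "-ch", "64", "-t", "train_normal"]
)
print(args.model, args.n_embed, args.channel, args.trainer)
```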

The checkpoint and output dirs are generated automatically from your model, trainer, n_embed and channel settings. When loading a checkpoint, the files matching the current settings are loaded automatically.

Example

python train.py -train_dir /homes/aa/mel/mel.melgan -m vqvc+ -n 128 -ch 128 -t train_normal
--> "Saving model and optimizer state at iteration 0 to checkpoint/vqvc+_n128_ch128_train_normal/gen"
--> "Saving model and optimizer state at iteration 100 to checkpoint/vqvc+_n128_ch128_train_normal/gen"
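The directory name in the log lines above, and in the `tensorboard --logdir` path used in this README, looks like it is composed from the model, codebook size, channel count and trainer. The sketch below reproduces that pattern; the helper `run_name` is hypothetical and the exact format is inferred from the example paths, not read from the code.

```python
def run_name(model, n_embed, channel, trainer):
    """Build a run name following the pattern seen in the example paths (inferred)."""
    return f"{model}_n{n_embed}_ch{channel}_{trainer}"


name = run_name("vqvc+", 128, 128, "train_normal")
checkpoint_dir = f"checkpoint/{name}"  # where the model and optimizer state are saved
output_dir = f"output/{name}"          # what tensorboard --logdir points at
print(checkpoint_dir)  # checkpoint/vqvc+_n128_ch128_train_normal
print(output_dir)      # output/vqvc+_n128_ch128_train_normal
```

Because the name encodes every setting, changing any of -m, -n, -ch or -t starts a separate run with its own checkpoint and output dirs.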

Tensorboard

tensorboard --logdir output/vqvc+_n128_ch128_train_normal

The whole model is still under investigation to find the best parameters.

# If you want to reproduce the results in the paper.
python train.py -train_dir your-path-to-npy-dir -m vqvc+ -n 64 -ch 64 -t train_normal

# If you want to train with rhythm information (adjust rhythm)
python train.py -train_dir your-path-to-npy-dir -m vqvc+_rhythm -n 128 -ch 128 -t train_rhythm

# If you find that normal training is not very good for one-shot conversion, you can train with resampling.
# It resamples the quantized code, which removes more speaker information from the content.

python train.py -train_dir your-path-to-npy-dir -m vqvc+_resample -n 512 -ch 512 -t train_normal

# We find that normalization on the embedding space improves the result; you can try this
python train.py -train_dir your-path-to-npy-dir -m vqvc+ -n 64 -ch 512 -t train_simple_normalize


# Still under investigation: speaker quantize <--> cav on speaker embedding

Some details

All models are wrapped by vq_model(); details can be found in model/vqvc*
All trainers are wrapped by train_(); details can be found in trainer/train*
