Contrastive Voice Conversion (CVC)

Video (3m) | Website | Paper

This implementation is based on CUT; thanks to Taesung and Junyan for sharing their code.

We provide a PyTorch implementation of non-parallel voice conversion based on patch-wise contrastive learning and adversarial learning. Compared with the CycleGAN-VC baseline, CVC requires only one-way GAN training for non-parallel one-to-one voice conversion, while improving speech quality and reducing training time.
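To make "patch-wise contrastive learning" concrete, here is a minimal sketch of an InfoNCE-style patch loss in the spirit of CUT. It is not taken from this repository: the function name patch_nce_loss, the plain-list feature format, and the temperature of 0.07 are assumptions chosen for illustration. Each query patch feature is pulled toward the key at the same location and pushed away from keys at other locations.

```python
import math


def patch_nce_loss(queries, keys, temperature=0.07):
    """Average InfoNCE loss over patches (illustrative sketch only).

    queries[i] and keys[i] are feature vectors for the same patch location
    (the positive pair); keys[j] with j != i serve as negatives for queries[i].
    """
    def normalize(vec):
        # L2-normalize so the dot product below is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]

    total = 0.0
    for i, qi in enumerate(q):
        # Similarity of this query to every key, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over all keys.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching location as the correct class.
        total += log_sum_exp - logits[i]
    return total / len(q)


aligned = patch_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = patch_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is small when each query matches the key at its own location and large when the correspondence is broken, which is the property a one-way generator can be trained against without a cycle-consistency path.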

Prerequisites

Linux or macOS
Python 3
CPU or NVIDIA GPU + CUDA CuDNN

Kick Start

Clone this repo:

git clone https://github.com/Tinglok/CVC
cd CVC

Install PyTorch 1.6 and other dependencies.

For pip users, run pip install -r requirements.txt.

For Conda users, create a new Conda environment with conda env create -f environment.yaml.

Download the pre-trained Parallel WaveGAN vocoder to ./checkpoints/vocoder.
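After installing, a quick sanity check can confirm that the key packages are importable and whether CUDA is visible. The helper names check_environment and describe_torch below are hypothetical and not part of this repository; the sketch only uses importlib from the standard library plus torch.__version__ and torch.cuda.is_available().

```python
import importlib.util


def check_environment(packages=("torch",)):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def describe_torch():
    """Return a one-line summary of the PyTorch install, if any."""
    if not check_environment(("torch",))["torch"]:
        return "PyTorch is not installed"
    import torch  # imported lazily so the check works without PyTorch too
    return f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"


print(check_environment())
print(describe_torch())
```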

CVC Training and Test

Download the VCTK dataset

cd datasets
wget http://datashare.is.ed.ac.uk/download/DS_10283_2651.zip
unzip DS_10283_2651.zip
unzip VCTK-Corpus.zip
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainA
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainB

Here p* is a placeholder: replace it with the folder of the speaker you want in each set (e.g. p256 and p270).
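The copy step can also be scripted. The sketch below is an illustration rather than a script from this repository: the function name prepare_speaker_pair is hypothetical, and it assumes the data loader reads .wav files placed directly inside trainA and trainB. Adjust it if your setup expects nested speaker folders.

```python
import shutil
from pathlib import Path


def prepare_speaker_pair(corpus_wav_dir, out_root, speaker_a, speaker_b):
    """Copy one speaker's .wav files into trainA and another's into trainB.

    Returns the number of files copied per split.
    """
    corpus_wav_dir = Path(corpus_wav_dir)
    out_root = Path(out_root)
    counts = {}
    for split, speaker in (("trainA", speaker_a), ("trainB", speaker_b)):
        src = corpus_wav_dir / speaker
        if not src.is_dir():
            raise FileNotFoundError(f"speaker folder not found: {src}")
        dst = out_root / split
        dst.mkdir(parents=True, exist_ok=True)  # create the split folder if needed
        copied = 0
        for wav in sorted(src.glob("*.wav")):
            shutil.copy2(wav, dst / wav.name)
            copied += 1
        counts[split] = copied
    return counts
```

For example, prepare_speaker_pair("./VCTK-Corpus/wav48", "./voice", "p270", "p256") run from the datasets folder would fill ./voice/trainA with p270 and ./voice/trainB with p256.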

Train the CVC model (run from the repository root):

python train.py --dataroot ./datasets/voice --name CVC

The checkpoints will be stored at ./checkpoints/CVC/.

Test the CVC model:

python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CVC/converted_sound

The converted utterance will be saved at ./checkpoints/CVC/converted_sound.
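To confirm that conversion produced usable audio, the output folder can be summarized with the standard library. This is a sketch under assumptions: summarize_wavs is a hypothetical helper, and it assumes the converted files are PCM-encoded .wav files, which is the only encoding Python's wave module can read.

```python
import wave
from pathlib import Path


def summarize_wavs(folder):
    """Return (file name, sample rate in Hz, duration in seconds) per .wav file."""
    rows = []
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            rate = wf.getframerate()
            seconds = wf.getnframes() / rate
        rows.append((path.name, rate, round(seconds, 2)))
    return rows
```

Calling summarize_wavs("./checkpoints/CVC/converted_sound") after testing lists each converted utterance with its sample rate and length, which makes empty or truncated outputs easy to spot.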

Baseline CycleGAN-VC Training and Test

Train the CycleGAN-VC model:

python train.py --dataroot ./datasets/voice --name CycleGAN --model cycle_gan

Test the CycleGAN-VC model:

python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CycleGAN/converted_sound --model cycle_gan

The converted utterance will be saved at ./checkpoints/CycleGAN/converted_sound.

Pre-trained CVC Model

Pre-trained models for p270-to-p256 and many-to-p249 are available at this URL.

TensorBoard Visualization

To view loss plots, run tensorboard --logdir=./checkpoints and open http://localhost:6006/ in a browser.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2021cvc,
  author={Tingle Li and Yichen Liu and Chenxu Hu and Hang Zhao},
  title={{CVC: Contrastive Learning for Non-Parallel Voice Conversion}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1324--1328}
}
