# vctts

PyTorch implementation of "Emotional Voice Conversion using Multitask Learning with Text-to-Speech", accepted to ICASSP 2020.

This is the code for the paper Emotional Voice Conversion using Multitask Learning with Text-to-Speech, ICASSP 2020 [link].

## Prerequisites

Install the required packages:

```bash
pip3 install -r requirements.txt
```

## Inference

A few samples and a pretrained VC model are provided, so you can try inference with the command below.

The samples cover 20 sentences and 7 emotions, i.e. 140 (20 × 7) utterances in total.

[model download]

[samples download]

Note that the model/sample download links have expired.

```bash
python3 generate.py --init_from <model_path> --gpu <gpu_id> --out_dir <out_dir>
```

Below is an example of a generated wav file name:

`pretrained_model_fea_00020_ang_00002_ang_00020_input_mel.wav`

It means the model took the content of (fear, 20th sentence) and the style of (anger, 2nd sentence) to produce (anger, 20th sentence).
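For reference, below is a minimal sketch that decodes a file name of this form. The field layout is inferred from the example above rather than from generate.py, so treat it as an assumption:

```python
# Decode a generated file name of the form
# <model>_<content_emo>_<content_idx>_<style_emo>_<style_idx>_<target_emo>_<target_idx>_input_mel.wav
# NOTE: this layout is inferred from the example above, not from generate.py.
from pathlib import Path

def parse_generated_name(path):
    fields = Path(path).stem.split("_")[:-2]  # drop the fixed "input_mel" suffix
    return {
        "content": (fields[-6], int(fields[-5])),
        "style": (fields[-4], int(fields[-3])),
        "target": (fields[-2], int(fields[-1])),
    }

print(parse_generated_name("pretrained_model_fea_00020_ang_00002_ang_00020_input_mel.wav"))
# -> {'content': ('fea', 20), 'style': ('ang', 2), 'target': ('ang', 20)}
```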

## Training

You can train on your own dataset by modifying the contents of dataset.py (a hypothetical sketch of such a dataset is shown below).
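For illustration, here is a sketch of what a replacement dataset could return; the actual interface expected by the training code may differ, so check dataset.py first:

```python
# Hypothetical sketch of a custom dataset; the real interface in dataset.py may differ.
import torch
from torch.utils.data import Dataset

class MyEmotionalSpeechDataset(Dataset):
    """Yields precomputed features, e.g. produced by preprocess.py."""

    def __init__(self, items):
        # items: list of dicts such as
        # {"mel": FloatTensor [T, n_mels], "lin": FloatTensor [T, n_fft // 2 + 1],
        #  "text": LongTensor [L] (character/phoneme ids), "emotion": int}
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        item = self.items[idx]
        return item["mel"], item["lin"], item["text"], item["emotion"]
```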

The full training pipeline:

```bash
# Remove silence from the wav files
python3 trimmer.py --in_dir <in_dir> --out_dir <out_dir>

# Extract mel/linear spectrograms and the character/phoneme dictionary
python3 preprocess.py --txt_dir <txt_dir> --wav_dir <wav_dir> --bin_dir <bin_dir>

# Train the model; --use_txt controls the balance between the VC and TTS paths
python3 main.py -m <message> -g <gpu_id> --use_txt <0~1, a higher value means the y_t (TTS) batch is sampled more often>
```
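To illustrate the intended effect of --use_txt, here is a hypothetical sketch of how a 0~1 value could steer multitask batch sampling; the actual logic lives in main.py and may differ:

```python
# Hypothetical sketch: with probability use_txt draw a TTS (y_t, text) batch,
# otherwise draw a VC (speech) batch. The real implementation may differ.
import random

def sample_task(use_txt: float) -> str:
    return "tts" if random.random() < use_txt else "vc"

counts = {"tts": 0, "vc": 0}
for _ in range(10_000):
    counts[sample_task(0.7)] += 1
print(counts)  # roughly {'tts': 7000, 'vc': 3000}
```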
