
Voice-conversion-and-morphing-RelGAN

An implementation of voice conversion and morphing using RelGAN (an image-to-image translation model) with TensorFlow.

This enables many-to-many voice conversion and voice morphing.

This is still experimental.

Details page (in Japanese)

Original papers and pages

Related papers and pages

Original implementations

Usage

  1. Put the folders containing the wav files for training in a directory named datasets.

At least three speaker folders are needed.

Also put folders containing a few wav files for validation in datasets_val.

The layout should look like this:

...
│
datasets
|   │
|   ├── speaker_1
|   │     ├── wav1_1.wav
|   │     ├── wav1_2.wav
|   │     ├── ...
|   │     └── wav1_i.wav
|   ├── speaker_2
|   │     ├── wav2_1.wav
|   │     ├── wav2_2.wav
|   │     ├── ...
|   │     └── wav2_j.wav 
|   ...
|   └── speaker_N
|         ├── wavN_1.wav
|         ├── wavN_2.wav
|         ├── ...
|         └── wavN_k.wav    
datasets_val
|   │
|   ├── speaker_1
|   │     ├── wav1_i+1.wav
|   │     ├── wav1_i+2.wav
|   │     ├── ...
|   │     └── wav1_i+5.wav
|   ├── speaker_2
|   │     ├── wav2_j+1.wav
|   │     ├── wav2_j+2.wav
|   │     ├── ...
|   │     └── wav2_j+3.wav 
|   ...
|   └── speaker_N
|         ├── wavN_k+1.wav
|         ├── wavN_k+2.wav
|         ├── ...
|         └── wavN_k+4.wav 
...
├── preprocess1.py     
├── preprocess2.py
...
  2. Run preprocess1.py to remove silence and split the files (sketched below).
python preprocess1.py
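
The command above runs the repository's own script. As a rough illustration of what "remove silence and split" can mean, here is a minimal sketch assuming librosa and soundfile; the thresholds, chunk length, and file names are illustrative and not taken from preprocess1.py.

```python
# Illustrative sketch only: trim silence and split a wav into fixed-length
# chunks. preprocess1.py may use different parameters and logic.
import os
import numpy as np
import librosa
import soundfile as sf

def split_wav(in_path, out_dir, sr=22050, top_db=30, chunk_sec=4.0):
    y, _ = librosa.load(in_path, sr=sr)
    intervals = librosa.effects.split(y, top_db=top_db)  # non-silent regions
    voiced = np.concatenate([y[s:e] for s, e in intervals])
    chunk = int(chunk_sec * sr)
    os.makedirs(out_dir, exist_ok=True)
    for i in range(0, len(voiced) - chunk + 1, chunk):
        out = os.path.join(out_dir, "chunk_%04d.wav" % (i // chunk))
        sf.write(out, voiced[i:i + chunk], sr)
```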
  3. Run preprocess2.py to extract features and save them as pickles (sketched below).
python preprocess2.py
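
The exact features extracted by preprocess2.py are not described here; voice-conversion pipelines derived from CycleGAN-VC commonly use WORLD features (F0, coded spectral envelope, aperiodicity). The following is a hedged sketch of that kind of extraction with pyworld, pickled per file; it is not the repository's actual code.

```python
# Illustrative sketch: WORLD feature extraction and pickling with pyworld.
# preprocess2.py may extract different features or use a different layout.
import pickle
import numpy as np
import pyworld as pw
import soundfile as sf

def extract_features(wav_path, num_mcep=36):
    x, fs = sf.read(wav_path)
    x = x.astype(np.float64)
    f0, t = pw.harvest(x, fs)                  # fundamental frequency contour
    sp = pw.cheaptrick(x, f0, t, fs)           # smoothed spectral envelope
    ap = pw.d4c(x, f0, t, fs)                  # aperiodicity
    mcep = pw.code_spectral_envelope(sp, fs, num_mcep)  # compact envelope
    return {"f0": f0, "mcep": mcep, "ap": ap, "fs": fs}

feats = extract_features("datasets/speaker_1/wav1_1.wav")
with open("speaker_1_wav1_1.pickle", "wb") as f:
    pickle.dump(feats, f)
```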
  4. Train RelGAN-VM.
python train_relgan_vm.py
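
RelGAN's key idea, carried over to voice morphing, is to condition the generator on a relative attribute vector (target one-hot minus source one-hot) rather than on the target attribute itself. Below is a rough sketch of how a training iteration might sample such a vector for a random speaker pair; the variable names are illustrative, not those of train_relgan_vm.py.

```python
# Illustrative sketch: sampling a RelGAN-style relative attribute vector
# for a random source/target speaker pair during training.
import numpy as np

def relative_attribute(source_label, target_label, num_speakers):
    # v = onehot(target) - onehot(source)
    return np.eye(num_speakers)[target_label] - np.eye(num_speakers)[source_label]

num_speakers = 4                                   # number of speaker folders
src, tgt = np.random.choice(num_speakers, 2, replace=False)
v = relative_attribute(src, tgt, num_speakers)     # conditioning vector
```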
  5. After training, inference can be performed.

    The source attribute and target attribute must be specified.

    In the example below, the wav files of the 2nd attribute, datasets_val/speaker_2, will be 60% converted toward the 4th attribute (speaker_4).

    Note that the labels are 0-indexed.

python eval_relgan_vm.py --source_label 1 --target_label 3 --interpolation 0.6
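
In RelGAN, morphing corresponds to scaling the relative attribute vector by the interpolation coefficient, so --interpolation 0.6 moves the source 60% of the way toward the target. A small sketch of that arithmetic for the command above:

```python
# Illustrative sketch: the interpolation coefficient scales the relative
# attribute vector; alpha = 1.0 would be full conversion.
import numpy as np

num_speakers = 4
source_label, target_label, alpha = 1, 3, 0.6      # as in the command above
v = alpha * (np.eye(num_speakers)[target_label]
             - np.eye(num_speakers)[source_label])
print(v)                                           # [ 0.  -0.6  0.   0.6]
```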

Result examples

Examples trained on the JVS (Japanese versatile speech) corpus are located in result_examples.

The following four voices were used for training.

  • jvs010 (female, high-pitched F0, domain 0)
  • jvs016 (female, low-pitched F0, domain 1)
  • jvs042 (male, low-pitched F0, domain 2)
  • jvs054 (male, high-pitched F0, domain 3)

Examples are available on YouTube.

Acknowledgements

This implementation is based on njellinas's CycleGAN-VC2.

It was created with advice from Lgeu.
