RealtimePPGVC

A voice conversion model for real-time synthesis that uses phonetic posteriorgrams (PPGs) as an intermediate feature, written in PyTorch.
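For readers new to the term, a PPG is a frames-by-phone-classes matrix in which each row holds the posterior probabilities of the phone classes for one audio frame. The sketch below only illustrates that shape, and it is not taken from this repository: it uses plain Python rather than PyTorch to stay dependency-free, the names `softmax`, `logits_to_ppg`, and `example_logits` are hypothetical, and the numbers are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's phone logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_ppg(frame_logits):
    """Turn per-frame logits (frames x phone classes) into a PPG."""
    return [softmax(frame) for frame in frame_logits]


# Three frames and four hypothetical phone classes (made-up values).
example_logits = [
    [2.0, 0.1, 0.1, 0.1],
    [0.2, 1.5, 0.3, 0.0],
    [0.0, 0.0, 3.0, 0.5],
]

ppg = logits_to_ppg(example_logits)

for row in ppg:
    # Each row is a probability distribution over phone classes.
    print([round(p, 3) for p in row])
```

In a PyTorch pipeline the same step would typically be a softmax over the class dimension of the ASR model's output, assuming that model emits per-frame logits.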

Implementation details

Transcript: "二階から (n i k a i k a r a) ..." ("from the second floor ...")

The correspondence between index and phone is described here.
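As a rough illustration of how an index-to-phone table is used, the sketch below takes the most likely class index in each PPG frame, looks up its phone, and merges consecutive repeats. The `PHONES` table here is hypothetical and is not the repository's actual mapping; `decode_frames`, `peaked`, and the toy frame sequence are likewise assumptions made for this example.

```python
from itertools import groupby

# Hypothetical index-to-phone table, for illustration only.
PHONES = ["sil", "a", "i", "k", "n", "r"]


def decode_frames(ppg):
    """Pick the most likely phone per frame, then merge consecutive repeats."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in ppg]
    return [PHONES[index] for index, _ in groupby(best)]


def peaked(index):
    """Build a toy PPG frame that puts most of its mass on one class."""
    return [0.9 if i == index else 0.02 for i in range(len(PHONES))]


# Toy frame-level class indices spelling out "n i k a i k a r a".
frame_ids = [4, 4, 2, 2, 3, 1, 1, 2, 3, 3, 1, 5, 1, 1]
toy_ppg = [peaked(i) for i in frame_ids]

phones = decode_frames(toy_ppg)
print(" ".join(phones))  # n i k a i k a r a
```

Merging repeats is a simplification: it would also collapse a phone that is genuinely doubled, so it is useful for inspecting a PPG rather than for exact transcription.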

Baseline samples (No GAN, No DAT)

Repository files

data
runs
utils
01_preprocess_asr.py
02_extract_feats_asr.py
03_calc_scaler_asr.py
04_train_asr.py
05_extract_feats_vc.py
06_calc_scaler_vc.py
07_train_vc.py
07_train_vc_gan.py
07_train_vc_multi.py
README.md
config.py
data.py
f0_jvs.json
model.py
requirements.txt
test_synthesis.py
test_synthesis_for_js.py
test_synthesis_multi.py