gteu/realtime-ppg-vc

RealtimePPGVC

Voice conversion model for real-time synthesis that uses PPGs (phonetic posteriorgrams) as an intermediate feature, written in PyTorch.

Implementation details

Reference

Dataset

ASR training result

ASR training sample

Transcript: "二階から (n i k a i k a r a) ..."

The correspondence between index and phone is described here.
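A PPG gives each frame a probability for every phone class, so the most probable index in a frame can be looked up in the index-to-phone table. The following is a minimal sketch of that lookup, not the repository's code: the `INDEX_TO_PHONE` table, the `ppg_to_phones` helper, and the toy PPG values are all made up for illustration, and the real table is the one linked above.

```python
# Hypothetical index-to-phone table (illustration only; use the repository's real table).
INDEX_TO_PHONE = {0: "sil", 1: "n", 2: "i", 3: "k", 4: "a", 5: "r"}


def ppg_to_phones(ppg):
    """Take the most probable phone per frame and collapse consecutive repeats."""
    phones = []
    prev = None
    for frame in ppg:
        # Index of the largest posterior in this frame.
        idx = max(range(len(frame)), key=frame.__getitem__)
        if idx != prev:
            phones.append(INDEX_TO_PHONE[idx])
        prev = idx
    return phones


# Toy PPG: one row per frame, one column per phone class, each row sums to 1.
toy_ppg = [
    [0.05, 0.80, 0.05, 0.05, 0.03, 0.02],  # n
    [0.05, 0.70, 0.10, 0.05, 0.05, 0.05],  # n
    [0.02, 0.08, 0.80, 0.05, 0.03, 0.02],  # i
    [0.02, 0.03, 0.05, 0.85, 0.03, 0.02],  # k
    [0.02, 0.03, 0.05, 0.05, 0.80, 0.05],  # a
    [0.02, 0.03, 0.05, 0.05, 0.80, 0.05],  # a
    [0.02, 0.03, 0.85, 0.05, 0.03, 0.02],  # i
]

print(ppg_to_phones(toy_ppg))  # ['n', 'i', 'k', 'a', 'i']
```

Collapsing repeated frames is only for readability when comparing against a transcript. It also merges phones that are genuinely repeated back to back.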

VC speech sample

Baseline samples (No GAN, No DAT)

https://drive.google.com/drive/folders/1Djq4dwZgJdGy4rFVArZY_kLySoxu9iSj?usp=sharing

  • gen_[ID].wav: generated speech
  • ref_[ID].wav: source speech
  • jsut_target.wav: speech from target speaker
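To compare each generated file with its source, the downloaded files can be paired by ID. This is a small sketch, assuming the folder has been downloaded locally and that the names follow the `gen_[ID].wav` / `ref_[ID].wav` pattern exactly. `pair_samples` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path


def pair_samples(folder):
    """Return {ID: (ref_path, gen_path)} for every ID that has both files."""
    pattern = re.compile(r"^(gen|ref)_(.+)\.wav$")
    found = {}
    for path in Path(folder).iterdir():
        match = pattern.match(path.name)
        if match:
            kind, sample_id = match.groups()
            found.setdefault(sample_id, {})[kind] = path
    # Keep only IDs where both the source and the generated file exist.
    return {
        sample_id: (files["ref"], files["gen"])
        for sample_id, files in sorted(found.items())
        if "ref" in files and "gen" in files
    }
```

Files that do not match the pattern, such as `jsut_target.wav`, are ignored.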
