Voice conversion software - Voice conversion (VC) is a technique to convert a speaker identity of a source speaker into that of a target speaker. This software enables the users to develop a traditional VC system based on a Gaussian mixture model (GMM) and a vocoder-free VC system based on a differential GMM (DIFFGMM) using a parallel dataset of the source and target speakers.
Paper and slide
K. Kobayashi, T. Toda, "sprocket: Open-Source Voice Conversion Software," Proc. Odyssey, pp. 203-210, June 2018. [paper]
T. Toda, "Hands on Voice Conversion," Speech Processing Courses in Crete (SPCC), July 2018. [slide]
Reproduce the typical VC systems
This software was developed to make it possible for the users to easily build the VC systems by only preparing a parallel dataset of the desired source and target speakers and executing example scripts. The following VC methods were implemented as the typical VC methods.
Traditional VC method based on GMM
- T. Toda, A.W. Black, K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 8, pp. 2222-2235, Nov. 2007.
Vocoder-free VC method based on DIFFGMM
- K. Kobayashi, T. Toda, S. Nakamura, "F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential," Proc. IEEE SLT, pp. 693-700, Dec. 2016.
Supply Python3 VC library
To make it possible to easily develop VC-based applications using Python (Python3), the VC library is also supplied, including several interfaces, such as acoustic feature analysis/synthesis, acoustic feature modeling, acoustic feature conversion, and waveform modification. For the details of the VC library, please see sprocket documents in (coming soon).
Installation & Run
Please use Python3.
Current stable version
pip install numpy==1.15.4 cython # for dependency pip install sprocket-vc
See VC example
For any questions or issues please visit:
Copyright (c) 2020 Kazuhiro KOBAYASHI
Released under the MIT license