How to build a new voicedb? #22
First of all, you need to prepare (1) singing voice data (any audio format that Python can handle), (2) sheet music for the singing voice data (MusicXML format [1]), and (3) a label file that records when each phoneme is pronounced (HTS monophone label format). Currently NNSVS uses pysinsy [2] (a Python wrapper of Sinsy [3], an HMM-based singing voice synthesis system) to convert MusicXML files to HTS full-context label files, and it supports Japanese only. The phonemes used in your HTS monophone label files must match those generated from the MusicXML files by pysinsy. If you want to use another language, you may need to find or write a converter like pysinsy. Once you have prepared your own data as above, the recipes for the kiritan database or the PJS corpus may be helpful.
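The phoneme match described above can be checked mechanically before running a recipe. Below is a minimal sketch (not part of NNSVS; file contents and label strings are hypothetical) that extracts the phoneme sequence from HTS monophone label lines and from full-context labels, where the current phoneme conventionally sits between `-` and `+` in the context string, and reports the first position where they disagree.

```python
def monophone_phonemes(lines):
    """Extract phonemes from HTS monophone label lines: 'start end phoneme'."""
    return [line.split()[2] for line in lines if line.strip()]

def fullcontext_phonemes(lines):
    """Extract the current phoneme from HTS full-context label lines.

    In the HTS full-context format, the current phoneme appears
    between '-' and '+' in the context string.
    """
    phonemes = []
    for line in lines:
        if not line.strip():
            continue
        context = line.split()[-1]
        phonemes.append(context.split("-")[1].split("+")[0])
    return phonemes

def first_mismatch(mono, full):
    """Return the index of the first differing phoneme, or None if they match."""
    for i, (m, f) in enumerate(zip(mono, full)):
        if m != f:
            return i
    if len(mono) != len(full):
        return min(len(mono), len(full))
    return None

# Hypothetical sample data with a mismatch at index 2 ('a' vs 'o').
mono = ["0 2000000 sil", "2000000 4500000 k", "4500000 7000000 a"]
full = [
    "0 2000000 xx^xx-sil+k=a",
    "2000000 4500000 xx^sil-k+a=xx",
    "4500000 7000000 sil^k-o+xx=xx",
]
print(first_mismatch(monophone_phonemes(mono), fullcontext_phonemes(full)))  # → 2
```

Running a check like this on every utterance pair makes it easy to find which label file needs editing before the recipe stops at stage 0.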
Thanks for replying! I've prepared the data above, but how can I train the model on it?
If there is a mismatch between the monophone labels you prepared and the HTS full-context labels generated from your MusicXML file by pysinsy, your recipe will stop at stage 0. You need to resolve this conflict by editing the monophone label file or the MusicXML file. If your recipe runs to the end, the wav files you chose for validation and evaluation are generated in egs/<name_of_your_voice_db>/00-svs-world/exp/<spk_of_your_voice_db>/synthesis/.
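Besides phoneme mismatches, hand-edited monophone labels can also fail at stage 0 because of timing inconsistencies (gaps, overlaps, or zero-length segments). Here is a minimal sketch, independent of NNSVS, that validates the timing of label lines in the 'start end phoneme' format (times in HTS's 100 ns units); the sample labels are hypothetical.

```python
def check_label_timing(lines):
    """Find timing problems in HTS-style label lines ('start end phoneme').

    Returns a list of (line_index, description) pairs; segments are
    expected to be strictly positive in duration and contiguous.
    """
    problems = []
    prev_end = 0
    for i, line in enumerate(lines):
        start, end, _ = line.split()
        start, end = int(start), int(end)
        if start >= end:
            problems.append((i, "zero or negative duration"))
        if start != prev_end:
            problems.append((i, "gap or overlap with previous segment"))
        prev_end = end
    return problems

# Hypothetical labels: the third segment starts before the second one ends.
labels = [
    "0 2000000 sil",
    "2000000 4500000 k",
    "4400000 7000000 a",
]
print(check_label_timing(labels))  # → [(2, 'gap or overlap with previous segment')]
```

A clean label file yields an empty list; any reported index points at the exact line to fix in the monophone label file.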
How can I build a new voicedb on my own? Is there any documentation? (I haven't found any so far.)