- Support multidict training on main notebook.
- OpenUtau export for multidict compatibility
- Inference cell inside training/Inference notebook.
- Improve SOFA notebook.
The custom local training GUI has moved to DiffTrainer.
Want to render from the CLI for a quick test? Use the Inference notebook. More info in inference.md.
Please see data_example.md for the data setup format.
- lab + wav (NNSVS format)
- csv + wav (DiffSinger format)
- ds (DiffSinger .ds files)
- your_speaker_folder's folder name will be used as spk_name, so please be careful with your naming
- the colab notebook primarily uses python, so spaces and special characters in file names or folder paths may be invalid
- for an in-depth guide to SVS training and/or labeling, please see SVS Singing Voice Database - Tutorial
- it is advised to edit your data with SlurCutter to get more refined data for your pitch model
- please visit the DiffSinger Discord for help and questions regarding model production
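
As a rough illustration of the naming note above, the following sketch flags path components that contain spaces or special characters before you upload your data. The helper name `find_unsafe_parts` is hypothetical, and the definition of "safe" (ASCII letters, digits, underscore, hyphen, dot) is an assumption, not a rule taken from the notebook itself.

```python
import re
from pathlib import PurePosixPath

# Assumption: a "safe" name uses only ASCII letters, digits, '_', '-', and '.'.
# The notebook may accept more (or fewer) characters than this.
_SAFE_PART = re.compile(r"^[A-Za-z0-9._-]+$")


def find_unsafe_parts(path: str) -> list[str]:
    """Return the components of `path` that contain spaces or special characters."""
    return [
        part
        for part in PurePosixPath(path).parts
        if part != "/" and not _SAFE_PART.match(part)
    ]


print(find_unsafe_parts("my_speaker/take_01.wav"))  # []
print(find_unsafe_parts("my speaker/take#01.wav"))  # ['my speaker', 'take#01.wav']
```

An empty list means every folder and file name in the path passes this (assumed) check; anything returned is worth renaming, especially the speaker folder, since its name becomes spk_name.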
Zip file format examples:
- wav
- it is suggested to use manually segmented audio for cleaner segments (though there's minimal difference when using the auto segmentation)
- the zip file can contain any type of file, even subfolders. data extraction will only add the .wav files within the zip to the training set
- lab + wav (NNSVS format)
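
For the wav-only zip case described above, this minimal sketch previews which archive members would count as .wav files, including ones inside subfolders. It is not the notebook's actual extraction code: the helper name `list_wav_members` is hypothetical, and matching the extension case-insensitively is an assumption.

```python
import io
import zipfile


def list_wav_members(zip_file) -> list[str]:
    """List archive members ending in .wav, skipping directories and other files."""
    with zipfile.ZipFile(zip_file) as archive:
        return [
            name
            for name in archive.namelist()
            # Assumption: the extension is matched case-insensitively.
            if not name.endswith("/") and name.lower().endswith(".wav")
        ]


# Build a small in-memory zip with mixed content to try the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("speaker/a.wav", b"")
    archive.writestr("speaker/notes.txt", b"")
    archive.writestr("speaker/sub/b.WAV", b"")
buffer.seek(0)

print(list_wav_members(buffer))  # ['speaker/a.wav', 'speaker/sub/b.WAV']
```

The same function also accepts a path to a zip file on disk, which is a quick way to confirm your audio is visible before starting data extraction.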
- this notebook is still a rough draft; please either avoid it or use it with caution
- [notebook] improve SOFA notebook, add inference
- [notebook] update dictionary conversion code for phoneme types in build OU
- [notebook] clean up multi-dict notebook and support dictionary generation logic for out-of-specified-lang labels (/)
- [resource] add example file(s) for multi-dictionary training
Credits:
- openvpi for DiffSinger fork and more
- UtaUtaUtau for nnsvs-db-converter
- Kei for the original notebook
- MLo7 for the repo's content
- PixPrucer for an in-depth SVS guide
- haru0l for the base pretrain with embeds
- AgentAsteriski for the local GUI