Mixer-TTS sup_data_path and sup_data_types #3793
Are there any examples or documentation on how to generate supplementary data for Mixer-TTS training? Is the 'pitch' data optional, or is it extracted automatically during training? What about 'align_prior_matrix'? I guess that must come from an ASR model. Thanks!
You can use this script to do it for the datasets in the .../dataset_processing/tts/ directory (or your own, if you build your own YAML configs): https://github.com/NVIDIA/NeMo/blob/main/scripts/dataset_processing/tts/extract_sup_data.py

But yes, essentially the TTSDataset class will extract these automatically during training, and the script above uses that same class to do the extraction. Specifically, the extraction happens in these functions in TTSDataset: NeMo/nemo/collections/tts/torch/data.py, lines 330 to 370 at commit 71c51ec.
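
To make the "extract once up front, or lazily during training" idea concrete, here is a minimal sketch of a compute-or-load cache keyed by supplementary data type and utterance ID. This is not NeMo's code: `load_or_compute`, `precompute`, `fake_pitch`, and the pickle file layout are all hypothetical names and choices made for illustration, and the assumption that features are cached as files under `sup_data_path` should be checked against the TTSDataset source linked above.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Feature = List[float]


def load_or_compute(sup_data_dir: Path, kind: str, utt_id: str,
                    compute: Callable[[], Feature]) -> Feature:
    """Return the cached feature if it exists, else compute it and cache it."""
    folder = sup_data_dir / kind            # one folder per sup data type (assumed layout)
    folder.mkdir(parents=True, exist_ok=True)
    cache_file = folder / f"{utt_id}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    value = compute()
    with cache_file.open("wb") as f:
        pickle.dump(value, f)
    return value


def precompute(sup_data_dir: Path, utterances: Dict[str, Feature],
               extractors: Dict[str, Callable[[Feature], Feature]]) -> None:
    """Walk the whole dataset once so that later lookups only read from disk."""
    for utt_id, audio in utterances.items():
        for kind, fn in extractors.items():
            load_or_compute(sup_data_dir, kind, utt_id,
                            lambda fn=fn, audio=audio: fn(audio))


calls: List[str] = []


def fake_pitch(audio: Feature) -> Feature:
    """Stand-in extractor that records each call; NOT a real pitch estimator."""
    calls.append("pitch")
    return [abs(x) for x in audio]


with tempfile.TemporaryDirectory() as tmp:
    sup_dir = Path(tmp)
    utts = {"utt_0001": [0.1, -0.2, 0.3]}
    precompute(sup_dir, utts, {"pitch": fake_pitch})  # computes and writes the file
    precompute(sup_dir, utts, {"pitch": fake_pitch})  # finds the file, skips compute
    print(len(calls))
```

Under that assumption, running an extraction script ahead of time and letting the dataset class extract on first access would end up with the same files on disk; the script just pays the cost before training starts.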