Mixer-TTS sup_data_path and sup_data_types #3793

oytunturk · 2022-03-04T05:34:42Z

oytunturk
Mar 4, 2022

Are there any examples/documentation on how to generate supplementary data for Mixer-TTS training? Is 'pitch' data optional or is it extracted automatically during training? How about 'align_prior_matrix'? I guess that must be coming from an ASR model.

Thanks!

redoctopus · 2022-03-07T17:34:37Z

redoctopus
Mar 7, 2022
Collaborator

You can use this script to generate the supplementary data for the datasets in the .../dataset_processing/tts/ directory (or for your own dataset, if you write your own YAML configs): https://github.com/NVIDIA/NeMo/blob/main/scripts/dataset_processing/tts/extract_sup_data.py

But yes, the TTSDataset class essentially extracts these automatically during training, and the script above relies on that same class to do the extraction. Specifically, this is set up in these functions in TTSDataset:

NeMo/nemo/collections/tts/torch/data.py

Lines 330 to 370 in 71c51ec

    
    def add_align_prior_matrix(self, **kwargs):
        self.align_prior_matrix_folder = kwargs.pop('align_prior_matrix_folder', None)
        if self.align_prior_matrix_folder is None:
            self.align_prior_matrix_folder = Path(self.sup_data_path) / AlignPriorMatrix.name
        self.align_prior_matrix_folder.mkdir(exist_ok=True, parents=True)
        self.use_beta_binomial_interpolator = kwargs.pop('use_beta_binomial_interpolator', False)
        if not self.cache_text:
            if 'use_beta_binomial_interpolator' in kwargs and not self.use_beta_binomial_interpolator:
                logging.warning(
                    "phoneme_probability is not None, but use_beta_binomial_interpolator=False, we"
                    " set use_beta_binomial_interpolator=True manually to use phoneme_probability."
                )
            self.use_beta_binomial_interpolator = True
        if self.use_beta_binomial_interpolator:
            self.beta_binomial_interpolator = BetaBinomialInterpolator()

    def add_pitch(self, **kwargs):
        self.pitch_folder = kwargs.pop('pitch_folder', None)
        if self.pitch_folder is None:
            self.pitch_folder = Path(self.sup_data_path) / Pitch.name
        self.pitch_folder.mkdir(exist_ok=True, parents=True)
        self.pitch_fmin = kwargs.pop("pitch_fmin", librosa.note_to_hz('C2'))
        self.pitch_fmax = kwargs.pop("pitch_fmax", librosa.note_to_hz('C7'))
        self.pitch_mean = kwargs.pop("pitch_mean", None)
        self.pitch_std = kwargs.pop("pitch_std", None)
        self.pitch_norm = kwargs.pop("pitch_norm", False)

    def add_energy(self, **kwargs):
        self.energy_folder = kwargs.pop('energy_folder', None)
        if self.energy_folder is None:
            self.energy_folder = Path(self.sup_data_path) / Energy.name
        self.energy_folder.mkdir(exist_ok=True, parents=True)
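
To make the pattern in the snippet above more concrete, here is a rough, self-contained sketch of the same idea: each supplementary data type gets its own folder under sup_data_path (unless a `<type>_folder` override is passed), and a value is computed and written to disk the first time an item is read. This is not NeMo code. `CachingSupDataset`, `precompute`, the JSON cache files, and the placeholder features are all hypothetical names and choices made up for illustration; the real TTSDataset may differ in how and where it stores things.

```python
import json
import tempfile
from pathlib import Path


class CachingSupDataset:
    """Toy stand-in (hypothetical, not NeMo's TTSDataset) for lazy sup data extraction."""

    def __init__(self, samples, sup_data_path, sup_data_types, **kwargs):
        self.samples = samples  # dict: sample id -> list of float "audio" values
        self.ids = sorted(samples)
        self.sup_data_path = Path(sup_data_path)
        self.folders = {}
        for name in sup_data_types:
            # Same convention as the add_* methods above: explicit override,
            # otherwise <sup_data_path>/<type name>.
            folder = kwargs.pop(f"{name}_folder", None)
            if folder is None:
                folder = self.sup_data_path / name
            folder = Path(folder)
            folder.mkdir(exist_ok=True, parents=True)
            self.folders[name] = folder
        self.computed = 0  # number of cache misses, kept only for illustration

    def _compute(self, name, audio):
        # Placeholder features; a real pipeline would run pitch tracking etc.
        if name == "energy":
            return sum(x * x for x in audio)
        return len(audio)

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        sample_id = self.ids[index]
        item = {}
        for name, folder in self.folders.items():
            cache_file = folder / f"{sample_id}.json"
            if cache_file.exists():
                # Cache hit: reuse what an earlier pass already wrote.
                item[name] = json.loads(cache_file.read_text())
            else:
                # Cache miss: compute once, then store for later epochs.
                item[name] = self._compute(name, self.samples[sample_id])
                cache_file.write_text(json.dumps(item[name]))
                self.computed += 1
        return item


def precompute(dataset):
    """Touch every item once so that all cache files get written."""
    for i in range(len(dataset)):
        dataset[i]


# Usage: one pass up front fills the folders, so later reads are cache hits.
with tempfile.TemporaryDirectory() as tmp:
    ds = CachingSupDataset({"a": [1.0, 2.0], "b": [3.0]}, tmp, ["pitch", "energy"])
    precompute(ds)
    print(sorted(p.name for p in Path(tmp).iterdir()))
```

In this sketch, running a precompute pass ahead of time and letting the first training epoch fill the cache end up writing the same files, which is the sense in which a separate extraction script is a convenience rather than a requirement.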

1 reply

oytunturk Mar 9, 2022
Author

Thank you!
