Maybe a bug? Input parameters for model.predictor_encoder and model.style_encoder in train_finetune.py #243

starmoon-1134 · 2024-05-29T08:58:39Z

In train_finetune.py, the inputs passed to model.predictor_encoder and model.style_encoder look like a potential bug. The current code always passes gt, regardless of the multispeaker setting:

s = model.style_encoder(gt.unsqueeze(1))
s_dur = model.predictor_encoder(gt.unsqueeze(1))

However, train_second.py handles the multispeaker scenario differently: it passes st when multispeaker is set and falls back to gt otherwise:

s_dur = model.predictor_encoder(st.unsqueeze(1) if multispeaker else gt.unsqueeze(1))
s = model.style_encoder(st.unsqueeze(1) if multispeaker else gt.unsqueeze(1))
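To make the difference concrete, here is a minimal sketch of how the train_second.py selection could be factored out and reused in train_finetune.py. The helper name pick_encoder_input is hypothetical, and the sketch assumes that st is actually computed in train_finetune.py and that gt and st are tensors accepted by both encoders; neither is confirmed here.

```python
def pick_encoder_input(gt, st, multispeaker):
    """Choose the input fed to both encoders.

    Hypothetical helper, not part of StyleTTS2. It mirrors the
    conditional in train_second.py: use st when multispeaker is
    set, otherwise fall back to gt.
    """
    if multispeaker:
        if st is None:
            # Assumption: st may not exist in every training script.
            raise ValueError("multispeaker is set but st was not provided")
        return st
    return gt


# Intended use inside the training step (names as quoted above):
#   enc_in = pick_encoder_input(gt, st, multispeaker).unsqueeze(1)
#   s_dur = model.predictor_encoder(enc_in)
#   s = model.style_encoder(enc_in)
```

Routing both encoder calls through one selection keeps them from drifting apart, which is the inconsistency between the two scripts described above.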


martinambrus pushed a commit to martinambrus/StyleTTS2 that referenced this issue Aug 31, 2024

fix: potential fix for multispeaker scenarios

bc3e439

yl4579#243
