
Reason for std and input scaling in cwt? #8

Closed · dunky11 opened this issue Feb 28, 2022 · 2 comments

dunky11 commented Feb 28, 2022

Hey, I have some questions about your pitch predictor in the CWT domain:

decoder_inp = decoder_inp.detach() + self.predictor_grad * (decoder_inp - decoder_inp.detach())
pitch_padding = mel2ph == 0

if self.pitch_type == "cwt":
    pitch_padding = None
    cwt = cwt_out = self.cwt_predictor(decoder_inp) * control
    stats_out = self.cwt_stats_layers(encoder_out[:, 0, :])  # [B, 2]
    mean = f0_mean = stats_out[:, 0]
    std = f0_std = stats_out[:, 1]
    cwt_spec = cwt_out[:, :, :10]
    if f0 is None:
        std = std * self.cwt_std_scale
        f0 = cwt2f0_norm(

I have three questions:

  1. What is the reason for the first line? Isn't the right-hand term always zero, so no gradients flow back?
  2. Why do you scale the inputs by 0.1?
  3. Why did you scale the ground-truth std by 0.8 (self.cwt_std_scale)?

Thanks for any help in advance!

@keonlee9420 (Owner)

Hi @dunky11, thanks for your attention. I think those are good points! Here are my answers:

1 & 2. This line scales the gradient that flows back to the encoder during the backward pass. decoder_inp.detach() keeps the value but drops the computation graph built so far, whereas self.predictor_grad * (decoder_inp - decoder_inp.detach()) is all zeros in the forward pass, yet its gradient with respect to decoder_inp is preserved and scaled by self.predictor_grad. In this way, we keep the value unchanged while scaling only the gradient.

3. The purpose of the scaling is to ease training for the model, and the value depends on the dataset. Please refer to this article for more background.
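
To make answers 1 and 2 concrete, here is a minimal PyTorch sketch of the trick, standalone rather than taken from the repo (the value 0.1 for predictor_grad is an assumption here, matching question 2). The forward value of y is identical to x, but the gradient reaching x is scaled:

import torch

predictor_grad = 0.1  # assumed value, matching question 2 above

x = torch.ones(3, requires_grad=True)

# Forward: x.detach() carries the value, and the second term is all zeros,
# so y has exactly the same value as x.
y = x.detach() + predictor_grad * (x - x.detach())
print(torch.equal(y, x))  # True

# Backward: only the second term is connected to the graph,
# so the gradient arriving at x is scaled by predictor_grad.
y.sum().backward()
print(x.grad)  # tensor([0.1000, 0.1000, 0.1000])

Running this prints True and then a gradient of 0.1s, which is exactly the "keep the value, scale only the gradient" behavior described above.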

@keonlee9420 (Owner)

Closing due to inactivity.
