
Reason for std and input scaling in cwt? #8

Closed · dunky11 opened this issue Feb 28, 2022 · 2 comments

dunky11 commented Feb 28, 2022

Hey, I have some questions about your pitch predictor in the CWT domain:

decoder_inp = decoder_inp.detach() + self.predictor_grad * (decoder_inp - decoder_inp.detach())
pitch_padding = mel2ph == 0

if self.pitch_type == "cwt":
    pitch_padding = None
    cwt = cwt_out = self.cwt_predictor(decoder_inp) * control
    stats_out = self.cwt_stats_layers(encoder_out[:, 0, :])  # [B, 2]
    mean = f0_mean = stats_out[:, 0]
    std = f0_std = stats_out[:, 1]
    cwt_spec = cwt_out[:, :, :10]
    if f0 is None:
        std = std * self.cwt_std_scale
        f0 = cwt2f0_norm(

I have three questions:

  1. What is the reason for the first line? Isn't the right-hand term always zero, so no gradients flow back?
  2. Why do you scale the inputs by 0.1?
  3. Why did you scale the ground-truth std by 0.8 (self.cwt_std_scale)?

Thanks for any help in advance!

@keonlee9420 (Owner)

Hi @dunky11, thanks for your attention. I think those are good points! Here are my answers:

1 & 2. This line scales the gradient that flows back to the encoder during the backward pass. decoder_inp.detach() keeps the value but drops the computation graph built so far, whereas self.predictor_grad * (decoder_inp - decoder_inp.detach()) is all zeros in the forward pass, yet its gradient with respect to decoder_inp is preserved and scaled by self.predictor_grad. In this way, we keep the value unchanged while scaling only the gradient.

3. The purpose of the scaling is to ease training for the model, and the value depends on the dataset. Please refer to this article for more background.
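
To make answers 1 and 2 concrete, here is a minimal PyTorch sketch of the trick, standalone rather than taken from the repo (the value 0.1 for predictor_grad is an assumption here, matching question 2). The forward value of y is identical to x, but the gradient reaching x is scaled:

import torch

predictor_grad = 0.1  # assumed value, matching question 2 above

x = torch.ones(3, requires_grad=True)

# Forward: x.detach() carries the value, and the second term is all zeros,
# so y has exactly the same value as x.
y = x.detach() + predictor_grad * (x - x.detach())
print(torch.equal(y, x))  # True

# Backward: only the second term is connected to the graph,
# so the gradient arriving at x is scaled by predictor_grad.
y.sum().backward()
print(x.grad)  # tensor([0.1000, 0.1000, 0.1000])

Running this prints True and then a gradient of 0.1s, which is exactly the "keep the value, scale only the gradient" behavior described above.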

@keonlee9420 (Owner)

Closing due to inactivity.
