Hey, I have some questions about your pitch predictor in the CWT domain.

I have three questions:

Thanks in advance for any help!

Hi @dunky11, thanks for your attention. I think those are good points! Here are my answers:

1., 2. This line scales the gradient that flows back to the encoder during the backward pass. `decoder_inp.detach()` keeps the value but drops the graph built so far, so no gradient flows through it, whereas `self.predictor_grad * (decoder_inp - decoder_inp.detach())` is all zeros in the forward pass but carries the gradient of `decoder_inp`, scaled by `self.predictor_grad`. Summing the two terms therefore keeps the value while scaling only the gradient.

3. The purpose of the scaling is to ease training of the model, and the value depends on the dataset. Please refer to this article for more background.
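For concreteness, here is a minimal sketch of the trick in isolation (assuming PyTorch; the names `decoder_inp` and `predictor_grad` follow the snippet discussed above, and the value 0.1 is just a placeholder, not the repo's actual setting):

```python
import torch

# Hypothetical scale factor; in the repo this is self.predictor_grad.
predictor_grad = 0.1

decoder_inp = torch.randn(4, 8, requires_grad=True)

# Forward value == decoder_inp, but the gradient reaching decoder_inp
# through this expression is multiplied by predictor_grad:
#   - detach() keeps the value but cuts the graph (contributes zero gradient),
#   - (x - x.detach()) is all zeros in the forward pass but has d/dx = 1.
scaled = decoder_inp.detach() + predictor_grad * (decoder_inp - decoder_inp.detach())

scaled.sum().backward()
print(torch.allclose(scaled, decoder_inp))  # True: forward value unchanged
print(decoder_inp.grad)                     # every entry == predictor_grad
```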