An issue about the default initialize methods. #4555
Comments
@LiChenda, thanks for the report.
Some parameters are not really biases even though p.dim() == 1, so maybe we can check the parameter name instead:

for name, p in model.named_parameters():
    if 'bias' in name:
        p.data.zero_()
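To illustrate why a dimension-based check is ambiguous, here is a small standalone demo (my own illustration, not code from the thread): LayerNorm's elementwise-affine weight is 1-D just like its bias, so p.dim() alone cannot tell them apart, while a name check can.

import torch.nn as nn

# Both the weight and the bias of LayerNorm are 1-D parameters,
# so a p.dim() == 1 test would zero the weight too.
layer = nn.LayerNorm(8)
for name, p in layer.named_parameters():
    print(name, tuple(p.shape), p.dim())
# Output:
#   weight (8,) 1
#   bias (8,) 1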
BTW, this function is called by almost all the tasks in ESPnet2, so I'm not sure whether such an update keeps previously trained models reproducible.
This sounds good to me.
Sure!
for name, p in model.named_parameters():
    if 'bias' in name:
        p.data.zero_()

Maybe make it case-insensitive too, just to be sure. Raising a warning could also be useful so the user knows that the bias is set to zero. But the warning should probably be raised only once (e.g., per layer class), otherwise it may be too verbose.
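A minimal sketch of that suggestion (the helper name zero_biases and the per-class bookkeeping are my own, not from ESPnet):

import warnings
import torch.nn as nn

def zero_biases(model: nn.Module) -> None:
    # Zero every parameter whose name contains "bias", case-insensitively,
    # warning once per module class rather than once per parameter.
    warned = set()
    for module in model.modules():
        for name, p in module.named_parameters(recurse=False):
            if "bias" in name.lower():
                p.data.zero_()
                cls = type(module).__name__
                if cls not in warned:
                    warned.add(cls)
                    warnings.warn(f"Set bias parameters of {cls} to zero.")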
It should be noted that
Thanks for pointing it out. I just made a PR for this issue, see #4574.
The PR is merged, so I'm closing this issue.
I found that this issue affects most of the TTS models, since TTS modules are initialized with this method by default.

espnet/espnet2/tts/fastspeech2/fastspeech2.py, lines 470 to 475 in 14fcb2d
espnet/espnet2/tts/fastspeech2/fastspeech2.py, lines 828 to 829 in 14fcb2d

The model including
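For context, the wiring referenced above is roughly of this shape (a hedged sketch based on the line references, not the verbatim FastSpeech2 source; the helper name reset_parameters is mine):

from espnet2.torch_utils.initialize import initialize  # the function under discussion

# Sketch: unless init_type is "pytorch", the shared initializer runs,
# and with it the dim()-based bias zeroing discussed in this issue.
def reset_parameters(model, init_type: str) -> None:
    if init_type != "pytorch":
        initialize(model, init_type)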
Hello, I'm training some custom models with ESPnet2, and I found that the default initialize method detects the bias parameters of the network with p.dim(). This check is not precise (some non-bias parameters may also satisfy p.dim() == 1), and setting them to zero may lead to abnormal training. (One of my modules always output zeros, and the training could die with ReLU. After removing the following lines, the training became OK.)

espnet/espnet2/torch_utils/initialize.py, lines 77 to 80 in 96bd746
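The referenced lines implement, roughly, the following check (a paraphrase based on this thread, not the verbatim source; see the link above for the exact code):

# Every 1-D parameter is treated as a bias and zeroed. This also catches
# non-bias 1-D parameters (e.g. LayerNorm weights), which is the problem
# reported here.
for p in model.parameters():
    if p.dim() == 1:
        p.data.zero_()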