An issue about the default initialize methods. #4555

Closed · LiChenda opened this issue Aug 3, 2022 · 10 comments
@LiChenda (Contributor) commented Aug 3, 2022

Hello, I'm training some custom models with ESPnet2, and I found that the default initialization method identifies the bias parameters of the network by checking p.dim(). This check is not precise (some non-bias parameters may also make p.dim() == 1 true), and setting them to zero can lead to abnormal training. (One of my modules always output zeros, and the training could go dead with ReLU. After removing the following lines, the training becomes OK.)

# bias init
for p in model.parameters():
    if p.dim() == 1:
        p.data.zero_()
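
For example (a minimal sketch, not ESPnet code), the check also matches the 1-D weight of BatchNorm, and zeroing it makes the layer output all zeros:

import torch

# p.dim() == 1 matches BatchNorm's weight (gamma) as well as its bias (beta),
# so the loop above zeroes both.
bn = torch.nn.BatchNorm1d(4)
for p in bn.parameters():
    if p.dim() == 1:
        p.data.zero_()

bn.eval()
x = torch.randn(2, 4)
print(bn(x))  # all zeros: with gamma == 0 and beta == 0, the output vanishes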

@sw005320 (Contributor) commented Aug 8, 2022

@LiChenda, thanks for the report.
Can you give me more examples of parameters that also make p.dim() == 1 true?
How about making your treatment an option?

@LiChenda (Contributor, Author) commented Aug 10, 2022

Some parameters, like the weight $\gamma$ in BatchNorm, the weight $\alpha$ in PReLU, and custom parameters defined with torch.nn.Parameter(), may also make p.dim() == 1 true.
One possible solution might be to update these lines https://github.com/espnet/espnet/blob/96bd74641ceb463096067223d0734f70bddd8def/espnet2/torch_utils/initialize.py#L77-L80
with:

for name, p in model.named_parameters(): 
    if 'bias' in name:
        p.data.zero_() 
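
As a quick sanity check (illustrative only), the name-based loop leaves BatchNorm's weight untouched:

import torch

# Only parameters whose name contains "bias" are zeroed, so BatchNorm's
# weight (gamma) keeps its default value of one.
bn = torch.nn.BatchNorm1d(4)
for name, p in bn.named_parameters():
    if 'bias' in name:
        p.data.zero_()

print(bn.weight)  # still all ones (gamma untouched)
print(bn.bias)    # all zeros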

@LiChenda (Contributor, Author)

BTW, this function is called by almost all the tasks in ESPnet2; I'm not sure whether this update keeps previously trained models reproducible.

@sw005320 (Contributor)

for name, p in model.named_parameters(): 
    if 'bias' in name:
        p.data.zero_() 

This sounds good to me.
Yes, we can make this an option, with the default set to false for now.
Meanwhile, we can run some tests and later make the default true.
Can you make a PR?
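
For reference, a purely hypothetical sketch of what the option could look like (the flag name zero_bias_by_name and the simplified signature are assumptions, not the actual change):

import torch

def initialize(model: torch.nn.Module, zero_bias_by_name: bool = False):
    # Hypothetical flag, not the actual ESPnet API: switch between the old
    # dim-based check and the proposed name-based check.
    if zero_bias_by_name:
        for name, p in model.named_parameters():
            if 'bias' in name:
                p.data.zero_()
    else:
        for p in model.parameters():
            if p.dim() == 1:
                p.data.zero_()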

@LiChenda (Contributor, Author)

Sure!

@popcornell (Contributor) commented Aug 10, 2022

for name, p in model.named_parameters(): 
    if 'bias' in name:
        p.data.zero_() 

Maybe make it case-insensitive too, just to be sure. This is difficult to implement in a scalable way, so the current solution makes sense.

Raising a warning could also be useful, so the user knows that the bias is set to zero. But the warning should probably be raised only once (e.g., per layer class), otherwise it may be too verbose.
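
A minimal sketch of both ideas (the helper name zero_bias_params is hypothetical, not the merged code):

import warnings

import torch

def zero_bias_params(model: torch.nn.Module):
    # Case-insensitive "bias" match; warn once per module class rather than
    # once per parameter to keep the output readable.
    warned = set()
    for module in model.modules():
        for name, p in module.named_parameters(recurse=False):
            if 'bias' in name.lower():
                p.data.zero_()
                cls = type(module).__name__
                if cls not in warned:
                    warned.add(cls)
                    warnings.warn(f"Setting bias parameters of {cls} to zero")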

@b-flo (Member) commented Aug 10, 2022

> BTW, this function is called by almost all the tasks in ESPnet2; I'm not sure whether this update keeps previously trained models reproducible.

It should be noted that args.init is set to None by default for all tasks, and most configs don't use initialization.
We currently have 1026 configs (!!). 172 configs use either chainer or xavier_uniform initialization, of which 110 are for the ASR task and 59 for the ENH task.
From what I see, it doesn't seem to impact the configs we usually use, at least for ASR.

@LiChenda (Contributor, Author)

> BTW, this function is called by almost all the tasks in ESPnet2; I'm not sure whether this update keeps previously trained models reproducible.

> It should be noted that args.init is set to None by default for all tasks, and most configs don't use initialization. We currently have 1026 configs (!!). 172 configs use either chainer or xavier_uniform initialization, of which 110 are for the ASR task and 59 for the ENH task. From what I see, it doesn't seem to impact the configs we usually use, at least for ASR.

Thanks for pointing it out. I just made a PR for this issue; see #4574.

@LiChenda (Contributor, Author)

The PR is merged, so I'm closing this issue.

@kan-bayashi (Member)

I found that this issue has an impact on most of the TTS models, since TTS modules are initialized using this method by default.

# initialize parameters
self._reset_parameters(
    init_type=init_type,
    init_enc_alpha=init_enc_alpha,
    init_dec_alpha=init_dec_alpha,
)

if init_type != "pytorch":
    initialize(self, init_type)

Models including BatchNorm2d or BatchNorm1d modules should be improved by this change.
(I'm not sure what happened before this issue was fixed...)
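
One way to check whether an existing checkpoint was affected (a hedged sketch; the checkpoint path and a flat state-dict layout are assumptions):

import torch

# Flag 1-D ".weight" tensors (e.g. BatchNorm gamma) that are all zeros,
# which the old initializer would produce.
state = torch.load("exp/tts_model/checkpoint.pth", map_location="cpu")  # hypothetical path
for name, tensor in state.items():
    if name.endswith(".weight") and tensor.dim() == 1 and torch.all(tensor == 0):
        print(f"{name}: all zeros, likely affected")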
