Model params change with 0 learning rate #2293
Comments
Probably because BatchNorm updates its running averages during training.
Try with
I observe the same thing with BatchNorm, even with model.eval().
The running mean and variance of the BatchNorm layers will change from epoch to epoch, even when the LR is 0. To get the same loss, you can set the momentum parameter of the BatchNorm layers to 0 (in conjunction with the LR being 0). This should fix the problem.
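A minimal sketch of the suggestion above (the helper name `zero_bn_momentum` is my own, not from the thread): in PyTorch, `momentum` weights the new batch statistics in the update `running_mean = (1 - momentum) * running_mean + momentum * batch_mean`, so setting it to 0 makes the update a no-op.

```python
import torch
import torch.nn as nn

def zero_bn_momentum(model):
    # With momentum = 0 the running-stats update becomes
    # running_mean <- 1.0 * running_mean + 0.0 * batch_mean,
    # so the statistics are frozen while weights still train normally.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.momentum = 0.0

model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4))
zero_bn_momentum(model)

before = model[1].running_mean.clone()
model.train()
model(torch.randn(2, 3, 8, 8))   # a train-mode forward pass
# running stats did not move despite the forward pass in train mode
assert torch.allclose(model[1].running_mean, before)
```

Note that `momentum=None` would instead switch BatchNorm to a cumulative moving average, which is not what you want here.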
Do you have
In the training script I do the testing after training. I missed the fact that the running mean/std would change on every iteration. On setting the BatchNorm momentum to 0 (as suggested above), the loss stays constant.
Thanks! I use this to fix BN during training:

```python
def fix_bn(m):
    classname = m.__class__.__name__
    if classname.find('BatchNorm') != -1:
        m.eval()
```
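A usage sketch for the snippet above: `model.apply(fix_bn)` visits every submodule, so calling it after `model.train()` puts only the BatchNorm layers in eval mode (freezing their running statistics) while the rest of the network keeps training.

```python
import torch
import torch.nn as nn

def fix_bn(m):
    # Put any module whose class name contains 'BatchNorm' into eval mode
    classname = m.__class__.__name__
    if classname.find('BatchNorm') != -1:
        m.eval()

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
model.train()        # everything in train mode first
model.apply(fix_bn)  # then freeze only the BatchNorm layers

before = model[1].running_mean.clone()
model(torch.randn(4, 3, 16, 16))
# running stats are untouched because the BN layer is in eval mode
assert torch.all(model[1].running_mean == before)
```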
When I use parameter-server training and set the learning rate to 0, the loss changes a lot between epochs:
While training a convnet with SGD, the train/test loss and error change when using a learning rate of 0.
Log:
This happens with DenseNet and ResNet.
Training script: https://gist.github.com/vabh/50c12ca28619836e32a869aa0e52ea38
The architecture can be chosen in lines 52-65
Links to implementations:
DenseNet: https://github.com/bamos/densenet.pytorch
DenseNet: https://github.com/andreasveit/densenet-pytorch
ResNeXt: https://github.com/prlz77/ResNeXt.pytorch
PyTorch version: 0.1.12_2
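A minimal sketch (not the gist's exact script) that reproduces the effect: with SGD at lr=0 no parameter can move, yet the eval-mode loss on the same batch still changes after a "training" step, because BatchNorm's running statistics, which eval mode uses for normalization, keep updating.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4),
                      nn.Flatten(), nn.Linear(4 * 6 * 6, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.0)  # zero learning rate
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(8, 3, 8, 8), torch.randint(0, 10, (8,))

def eval_loss():
    model.eval()
    with torch.no_grad():
        return loss_fn(model(x), y).item()

params_before = [p.clone() for p in model.parameters()]
l0 = eval_loss()

# one "training" step at lr=0: weights cannot change, BN stats can
model.train()
opt.zero_grad()
loss_fn(model(x), y).backward()
opt.step()

l1 = eval_loss()
# every parameter is bit-identical, yet the eval loss has drifted
assert all(torch.equal(p, q) for p, q in zip(model.parameters(), params_before))
print(l0, l1)
```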