
Fix behaviour of unfrozen BatchNormalization layer (resolves #46) #47

Merged
1 commit merged into broadinstitute:master on Nov 28, 2018

Conversation

@Callidior commented Nov 6, 2018

Previously, if `BatchNormalization` was initialized with `BatchNormalization(freeze=False)`, its behaviour was not equivalent to the standard `BatchNormalization` layer, as one would expect. Instead, it was always forced to be in training mode, producing wrong validation results.

This PR does not change the behaviour for `freeze=True`, but makes the layer equivalent to the standard `BatchNormalization` layer from Keras for `freeze=False`.
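
For context, a minimal sketch of the resulting behaviour, assuming a layer roughly like the one in this repository (simplified, not the literal implementation; it omits details such as marking frozen weights non-trainable):

```python
import keras


class BatchNormalization(keras.layers.BatchNormalization):
    """Batch normalization layer with an optional `freeze` flag."""

    def __init__(self, freeze, *args, **kwargs):
        self.freeze = freeze
        super(BatchNormalization, self).__init__(*args, **kwargs)

    def call(self, inputs, training=None):
        # Old behaviour (before this PR): training = not self.freeze, i.e. the layer
        # used batch statistics even during validation when freeze=False.
        if self.freeze:
            # Frozen: always use the moving statistics, regardless of the phase.
            training = False
        # Unfrozen: leave `training` as it is (None by default), so the layer behaves
        # exactly like the standard keras.layers.BatchNormalization.
        return super(BatchNormalization, self).call(inputs, training=training)
```
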
@Callidior changed the title from "Fix behaviour of unfrozen BatchNormalization layer" to "Fix behaviour of unfrozen BatchNormalization layer (resolves #46)" on Nov 6, 2018
@hgaiser commented Nov 6, 2018

Doesn't this only change the behaviour if `freeze=True`?

Also, what accuracy are you getting now?

@Callidior

No, the behaviour for `freeze=True` is not changed. Previously, we called the method of the superclass with `training=(not self.freeze)`, which would evaluate to `training=False`. Now, if `self.freeze` is `True`, we set `training=False`, as before.

If `self.freeze` is `False`, however, we now have `training=None` (the default) instead of `training=True`.
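
As a quick illustration (hypothetical standalone snippet, not code from this repository) of why `training=None` matters: with the default, Keras puts the layer in inference mode during `evaluate`/`predict`, so the moving statistics are used rather than the batch statistics:

```python
import numpy as np
import keras

# Data that is clearly not zero-mean / unit-variance.
x = 5.0 * np.random.randn(256, 8).astype("float32") + 3.0

inputs = keras.layers.Input(shape=(8,))
outputs = keras.layers.BatchNormalization()(inputs)  # training=None (the default)
model = keras.models.Model(inputs, outputs)

# predict() runs in inference mode: the freshly initialized moving statistics
# (mean 0, variance 1) are used, so the output keeps roughly the input statistics.
# With training=True the layer would instead normalize with the batch statistics,
# which is the behaviour that corrupted the validation results before this fix.
y = model.predict(x)
print(y.mean(), y.std())  # close to 3 and 5, not 0 and 1
```
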

It's still training, but I already have 12% validation accuracy after the first epoch and 40% after 4 epochs, which is higher than anything I got without the modifications made in this PR.

@Callidior commented Nov 7, 2018

By the way, I would question the example in the README. The model is initialized there with `freeze_bn=True` (the default), which fixes the `BatchNormalization` layers in inference mode with their initialization parameters. This should be equivalent to using no batch normalization at all.

I also tried this first for my ImageNet training, since the README does so, but it didn't work.
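
To make that concrete, a plain-arithmetic sketch (assuming Keras' default initializers and epsilon) of why a `BatchNormalization` layer held in inference mode with its initialization parameters is essentially the identity function:

```python
import numpy as np

x = np.random.randn(4, 8).astype("float32")

# Default initialization: gamma=1, beta=0, moving mean 0, moving variance 1.
gamma, beta = 1.0, 0.0
moving_mean, moving_var = 0.0, 1.0
eps = 1e-3  # Keras' default epsilon

# Inference-mode batch normalization with those (never updated) parameters:
y = gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

print(np.max(np.abs(y - x)))  # tiny: the layer barely changes its input
```
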

@Callidior

I now finally obtained 68% validation accuracy, which is much closer to what I got with the bundled ResNet-50 than before.

@0x00b1 commented Nov 28, 2018

Awesome. Thanks, @Callidior.

@0x00b1 merged commit 7e2e67b into broadinstitute:master on Nov 28, 2018