Fix batch normalization to be able to deal with train() and eval() mo…#806

Merged
dlibenzi merged 1 commit into master from bn_fixes on Jul 10, 2019

@dlibenzi
Collaborator

@dlibenzi dlibenzi commented Jul 9, 2019

…des.

@dlibenzi dlibenzi requested review from asuhan, jysohn23 and taylanbil July 9, 2019 08:46
Contributor

@asuhan asuhan left a comment

I believe this fixes #220, right?

How widespread is the use of running mean and variance with training, iow does this change behavior for our ResNet50 benchmark?

Comment thread torch_xla/csrc/ops/moving_average.h Outdated
@dlibenzi
Collaborator Author

> I believe this fixes #220, right?

Also, yes. I had not noticed that bug.
This was born from an investigation Daniel and Taylan did on BN.

> How widespread is the use of running mean and variance with training, iow does this change behavior for our ResNet50 benchmark?

I have not measured ResNet50 yet, but this is something we cannot leave as is.
Otherwise BN uses only the mean/variance of the current mini-batch, and worse, if user code uses .train() and .eval() (which it does), accuracy drops massively (MNIST, 1st epoch: from 96% to 33%).
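
To make the failure mode concrete, here is a minimal pure-Python sketch of why eval() breaks when running statistics are not maintained during train(). It is not the torch_xla code: `TinyBatchNorm` and its `update_running_stats` flag are hypothetical, and the flag only simulates the assumption that the pre-fix behavior amounts to the running mean/variance never being updated (the thread implies this but does not spell it out). The momentum, eps, and initial values mirror PyTorch's BatchNorm defaults.

```python
import math


class TinyBatchNorm:
    """Hypothetical single-feature batch norm, for illustration only."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0  # PyTorch's initial running mean
        self.running_var = 1.0   # PyTorch's initial running variance
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, batch, update_running_stats=True):
        if self.training:
            n = len(batch)
            mean = sum(batch) / n
            # Biased variance is used to normalize the current batch.
            var = sum((x - mean) ** 2 for x in batch) / n
            if update_running_stats:
                # Running variance tracks the unbiased estimate.
                unbiased = var * n / (n - 1)
                self.running_mean += self.momentum * (mean - self.running_mean)
                self.running_var += self.momentum * (unbiased - self.running_var)
        else:
            # eval(): normalize with the accumulated running statistics.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


def eval_output_mean(update_running_stats):
    """Train on one repeated batch, then return the mean eval-mode output."""
    bn = TinyBatchNorm().train()
    batch = [9.0, 10.0, 11.0, 10.0]
    for _ in range(200):
        bn(batch, update_running_stats=update_running_stats)
    out = bn.eval()(batch)
    return sum(out) / len(out)


fixed = eval_output_mean(True)    # running stats maintained during train()
broken = eval_output_mean(False)  # running stats left at their initial values

print("eval-mode output mean, stats updated:    ", round(fixed, 6))
print("eval-mode output mean, stats not updated:", round(broken, 6))
```

With the running statistics maintained, eval-mode outputs stay centered near zero, as they were during training. Without them, eval() normalizes with the initial mean 0 and variance 1, so downstream layers see inputs on a very different scale than they were trained on, which is consistent with the kind of accuracy drop described above.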

@dlibenzi dlibenzi merged commit 257ddd1 into master Jul 10, 2019