Wrong predictions when using BatchNormalization with training flag set #562
Comments
The same problem happens in TensorFlow as well.
@zaidalyafeai you mean tf.keras?
No, this definition https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization
@caisq This is a quote from the TensorFlow page
@caisq, I am trying to understand the source code. Could you please explain to me what broadcasting is?
@caisq did we ever resolve this issue? I assume tf.layers is doing the right thing in TensorFlow.
I resolved this issue by modifying the source code and changing the definition of batch norm at inference time. My pix2pix demo is based on that!
Training with BatchNormalization should be working. See the ACGAN example currently under review. I'd like to see the code you're using and the change you made to get it working, @zaidalyafeai, if possible.
@caisq, I may have accidentally deleted the source code :/ but the idea is simple: I forced the batch norm layer to use the statistics of the input sample, as if it were training. So I didn't add any code, I just re-routed the existing one.
Closing this due to lack of activity, feel free to reopen. Thank you
To get help from the community, check out our Google group.
TensorFlow.js version
latest
Browser version
Version 66.0.3359.139
Describe the problem or feature request
BatchNormalization gives wrong predictions when the layer is called with training = 1
Code to reproduce the bug / link to feature request
I created this simple Keras model
After training, the batch norm layer weights are
After running the prediction
model.predict(np.zeros((1, 2, 2, 3)))
The output
In the browser the weights are the same but the activations are
Explanation
In Keras, when training = 1 is set, the layer normalizes with the statistics of the prediction sample itself.
TensorFlow.js instead uses the stored moving mean and variance of the training data.
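To make the difference concrete, here is a minimal NumPy sketch of the two behaviours. It is not the Keras or TensorFlow.js implementation; the `batch_norm` helper and all weight values are hypothetical and chosen only for illustration (the actual weights from the report are not reproduced here). The only borrowed detail is Keras' default `epsilon` of 1e-3.

```python
import numpy as np

EPSILON = 1e-3  # Keras' default epsilon for BatchNormalization


def batch_norm(x, gamma, beta, moving_mean, moving_var, training):
    """Per-channel batch normalization over a channels-last tensor.

    Hypothetical helper for illustration, not a library function.
    """
    if training:
        # Training mode: normalize with the statistics of the input itself,
        # reduced over every axis except the last (channel) axis.
        axes = tuple(range(x.ndim - 1))
        mean = x.mean(axis=axes)
        var = x.var(axis=axes)
    else:
        # Inference mode: normalize with the stored moving statistics.
        mean, var = moving_mean, moving_var
    # The per-channel vectors of shape (3,) broadcast against x's last axis.
    return gamma * (x - mean) / np.sqrt(var + EPSILON) + beta


# Made-up layer weights (assumption, not the values from the report).
gamma = np.array([1.0, 1.0, 1.0])
beta = np.array([0.5, -0.5, 0.0])
moving_mean = np.array([2.0, 1.0, -1.0])
moving_var = np.array([4.0, 1.0, 0.25])

# Same input as in the report: a single all-zero sample.
x = np.zeros((1, 2, 2, 3))
out_training = batch_norm(x, gamma, beta, moving_mean, moving_var, training=True)
out_inference = batch_norm(x, gamma, beta, moving_mean, moving_var, training=False)

print(out_training[0, 0, 0])   # depends only on beta for an all-zero input
print(out_inference[0, 0, 0])  # depends on the stored moving statistics
```

With an all-zero input, the training-mode path sees a per-channel mean and variance of zero, so the output collapses to `beta`, while the inference-mode path shifts and scales by the stored statistics. Two runtimes that disagree on which path to take will therefore produce different activations from identical weights.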