Batch Normalization Problem #1446
I'm building an end-to-end image segmentation network whose output has the same size as the input. When I add a BatchNormalization layer to the model, the prediction for any image in the train/test set is strangely always the same; when I remove the BN layer, everything goes back to normal. I've looked into the code of BatchNormalization many times, but I still can't figure out what's wrong. Any ideas?
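One common cause of exactly this symptom, worth ruling out (the thread doesn't confirm it): at inference time BN normalizes with running averages of the batch statistics collected during training, not with the statistics of the current batch, and broken running averages can squash every input to nearly the same output. A numpy sketch of the two code paths (`bn_apply` is an illustrative name, not Keras API):

```python
import numpy as np

def bn_apply(x, gamma, beta, running_mean, running_var, training, eps=1e-5):
    if training:
        # Training path: normalize with statistics of the current mini-batch.
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        # Inference path: normalize with running averages from training.
        # If these are broken (e.g. the stored variance is far too large),
        # (x - mean) / sqrt(var + eps) is driven toward zero and the output
        # is ~beta for every input -- i.e. the same prediction everywhere.
        mean, var = running_mean, running_var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```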
I think I'm in a very similar situation. I'm building a convnet for a regression task with audio (spectrograms), and within 10-20 minutes of training the loss converges to a certain value regardless of the optimiser or other settings. I'll check what happens if I remove BN.
Check your layer activations; maybe you're just saturating your neurons, and once you get there, learning stops.
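One way to run that check in Keras (a minimal sketch using the functional Model API; `print_activation_stats` is an illustrative helper, not part of Keras):

```python
import numpy as np
from keras.models import Model

def print_activation_stats(model, x_batch):
    # Probe each layer's output on one batch and report basic statistics.
    # A layer whose outputs are almost all zeros (ReLU) or pinned near the
    # asymptotes (sigmoid/tanh) is saturated, and gradients through it vanish.
    for layer in model.layers:
        probe = Model(inputs=model.input, outputs=layer.output)
        acts = probe.predict(x_batch)
        print("%-24s mean=%+.4f std=%.4f zero_frac=%.2f"
              % (layer.name, acts.mean(), acts.std(), np.mean(acts == 0)))
```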
@lemuriandezapada Thank you for the reply; could you elaborate? The basic structure goes like:
The loss keeps decreasing. I've visualized the layers, and it seems that after certain deconvolutional layers the activations become very noisy and look like a mess...
It already sounds rather weird to me to have the BatchNorm before the activation function. You're basically forcing half your neurons to be active at all times. Maybe that's not the best representation? I haven't worked much with it, but I wouldn't put it before anything that doesn't have weights.
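For reference, the two orderings under discussion look like this in Keras (a sketch; the layer sizes are made up):

```python
from keras.models import Sequential
from keras.layers import Dense, Activation, BatchNormalization

# Ordering from the BN paper: normalize the pre-activation,
# then apply the nonlinearity.
bn_before = Sequential([
    Dense(256, input_dim=128),
    BatchNormalization(),
    Activation('relu'),
])

# Ordering suggested above: apply the nonlinearity first,
# then normalize its output.
bn_after = Sequential([
    Dense(256, input_dim=128),
    Activation('relu'),
    BatchNormalization(),
])
```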
@lemuriandezapada I was doing that because I saw it in a paper on a similar image segmentation task. Before that, I was putting the BatchNormalization layer after the ReLU, but the problem still existed.
I was also confused about where to put BN; I think we should put BN after activations for …
After reading the paper again, I think putting the BN layer before the non-linearity makes sense, and that's what they did in the paper (see Sec. 3.2):
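(The transformation described there, for reference: Sec. 3.2 of the BN paper replaces z = g(Wu + b) with z = g(BN(Wu)), where g is the nonlinearity such as ReLU; the bias b is dropped because BN's learned shift β subsumes it.)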
Anyway, the BN in Keras doesn't work in my case; I'll try implementing it myself.
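For anyone going down that road, the training-time forward pass is only a few lines (a numpy sketch; `batch_norm_forward` is illustrative, and it omits the running-statistics bookkeeping and backward pass that a framework implementation needs):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: (batch, features). Normalize each feature over the batch,
    # then apply the learned scale (gamma) and shift (beta).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```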
Is there a deconvolutional layer in Keras that I'm missing?
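For later readers: recent Keras versions do ship a transposed convolution as Conv2DTranspose; around the time of this thread, the usual workaround was UpSampling2D followed by a regular convolution. A sketch, assuming a Keras version that has the layer:

```python
from keras.models import Sequential
from keras.layers import Conv2DTranspose

# Doubles spatial resolution: (None, 16, 16, 64) -> (None, 32, 32, 32)
model = Sequential([
    Conv2DTranspose(32, (3, 3), strides=(2, 2), padding='same',
                    input_shape=(16, 16, 64)),
])
```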
@keunwoochoi I think @fchollet fixed the problem a few days ago; now BN works without a problem :) 👍
Hey, can you let us know how you got the BN to work?