
Batch Normalization Problem #1446

Closed
KlaymenGC opened this issue Jan 11, 2016 · 10 comments

@KlaymenGC

I'm building an end-to-end image segmentation network whose output has the same size as the input.

When I add a BatchNormalization layer to the model, the prediction for every image in the train/test set is strangely always the same; when I remove the BN layer, everything works normally.

I've looked into the code of BatchNormalization many times but I still can't figure out what's wrong. Any ideas?

@keunwoochoi
Contributor

I think I'm in a very similar situation. I'm building a convnet for a regression task on audio (spectrograms), and within 10-20 minutes of training the loss converges to a certain value regardless of the optimiser or other settings. I'll check what happens if I remove BN.

@lemuriandezapada

Check your layer activations; maybe you're just saturating your neurons, and once you get there, learning stops.
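
For example, one rough way to spot that (a sketch only, assuming an already-built `model` and a batch `X_batch`; the layer index is hypothetical and the exact accessor names vary across Keras versions):

```python
from keras import backend as K

# Build a backend function that returns the output of an intermediate layer
# (layer 3 here is just an example index, e.g. right after a ReLU).
get_activations = K.function([model.layers[0].input],
                             [model.layers[3].output])

acts = get_activations([X_batch])[0]
# If almost everything is pinned at 0 (dead ReLUs) or at the same value,
# the layer is saturated and learning will stall.
print(acts.min(), acts.mean(), acts.max())
```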

@KlaymenGC
Author

@lemuriandezapada thank you for the reply, could you elaborate? The basic structure looks like this:

Convolutional Layer
BatchNormalization
ReLU
...
Deconvolutional Layer
BatchNormalization
ELU
...
Softmax

and the loss keeps decreasing. I've visualized the layers; it seems that after certain deconvolutional layers the activations become very noisy and look like a mess...
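
Roughly, that block ordering corresponds to something like this (a sketch only, in Keras-1.x-era style; the filter counts, input shape, and arguments are illustrative, and the deconvolution step is approximated with UpSampling2D + Convolution2D since Keras doesn't ship a dedicated deconvolution layer at the moment):

```python
from keras.models import Sequential
from keras.layers import Convolution2D, UpSampling2D, Activation
from keras.layers.advanced_activations import ELU
from keras.layers.normalization import BatchNormalization

model = Sequential()

# Convolutional layer -> BatchNormalization -> ReLU
model.add(Convolution2D(64, 3, 3, border_mode='same', input_shape=(3, 256, 256)))
model.add(BatchNormalization(axis=1))   # normalize over the channel axis
model.add(Activation('relu'))
# ... more conv blocks ...

# "Deconvolutional" layer -> BatchNormalization -> ELU
model.add(UpSampling2D(size=(2, 2)))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(BatchNormalization(axis=1))
model.add(ELU())
# ... up to a final per-pixel softmax ...
```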

@lemuriandezapada

It already sounds rather weird to me to have the BatchNorm before the activation function. You're basically forcing half your neurons to be active at all times. Maybe that's not the best representation? I haven't worked much with it, but I wouldn't put it before anything that doesn't have weights.
It would make much more sense to put it after the activations.
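
That alternative ordering would look something like this (same illustrative sketch style as above):

```python
from keras.models import Sequential
from keras.layers import Convolution2D, Activation
from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Convolution2D(64, 3, 3, border_mode='same', input_shape=(3, 256, 256)))
model.add(Activation('relu'))            # nonlinearity first...
model.add(BatchNormalization(axis=1))    # ...then normalize the activations
```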

@KlaymenGC
Author

@lemuriandezapada I was doing that because I saw it in a paper on a similar image segmentation task. Before that, I was putting the BatchNormalization layer after the ReLU, but the problem was still there.

@keunwoochoi
Contributor

I was also confused about where to put BN. I think we should put BN after the activation for Dense layers and before the activation for Convolution2D layers. At least that seems to be what they're saying.

@KlaymenGC
Author

After reading the paper again, I think putting the BN layer before the nonlinearity makes sense, and that's what they did in the paper (see Sec. 3.2):

This formulation covers both fully-connected and convolutional layers. We add the BN transform immediately before the nonlinearity, by normalizing x = Wu + b

Anyway, the BN in Keras doesn't work in my case; I'll try implementing it myself.
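
In code, the quoted formulation amounts to normalizing x = Wu + b and applying the nonlinearity to the result, e.g. for a fully-connected layer (a sketch; the paper also notes that the layer bias b becomes redundant, since BN learns its own shift beta):

```python
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Dense(256, input_dim=128))  # x = Wu + b
model.add(BatchNormalization())       # BN(x), immediately before the nonlinearity
model.add(Activation('relu'))         # g(BN(x))
```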

@mdering
Contributor

mdering commented Jan 19, 2016

Is there a deconvolutional layer in Keras that I'm missing?

@KlaymenGC
Author

@keunwoochoi I think @fchollet fixed the problem a few days ago; now BN works without a problem :) 👍

@rvenugop

Hey, can you let us know how you got BN to work?
