add Batch Normalization immediately before non-linearity or after in Keras? #5465

Closed
xiaoming-qxm opened this issue Feb 21, 2017 · 3 comments

xiaoming-qxm commented Feb 21, 2017

from keras import backend as K
from keras.layers import Convolution2D, BatchNormalization, Activation

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN (here BN comes after the ReLU).
    '''
    if name is not None:
        conv_name = name + '_conv'
        bn_name = name + '_bn'
    else:
        conv_name = None
        bn_name = None
    # channel axis: 1 for Theano ('th') dim ordering, 3 for TensorFlow ('tf')
    bn_axis = 1 if K.image_dim_ordering() == 'th' else 3
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      activation='relu',
                      border_mode=border_mode,
                      name=conv_name)(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    return x

When I use the official inception_v3 model in Keras, I find that it applies BatchNormalization after the 'relu' nonlinearity, as in the code above.

But in the Batch Normalization paper, the authors said:

we add the BN transform immediately before the nonlinearity, by normalizing x = Wu + b.
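
For reference, the BN transform in the paper normalizes each activation over the mini-batch and then applies a learned scale and shift:

x_hat = (x - mean_B) / sqrt(var_B + epsilon)
y = gamma * x_hat + beta

so placing it immediately before the ReLU means the nonlinearity sees the normalized, re-scaled pre-activation x = Wu + b.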

Then I looked at the Inception implementation in TensorFlow, which adds BN immediately before the nonlinearity, as the paper describes. For more details, see inception ops.py.

I'm confused. Why do people use the first style in Keras rather than the following?

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN + ReLU (BN before the nonlinearity).
    '''
    # conv_name, bn_name and bn_axis derived exactly as in the snippet above
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      border_mode=border_mode,
                      name=conv_name)(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    x = Activation('relu')(x)
    return x

In the Dense case:

x = Dense(1024, name='fc')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
x = Activation('relu')(x)
achalshah20 (Contributor) commented
Ideally, BN should be applied before the nonlinearity, but I have seen some papers where BN performed better when applied after the ReLU.

Check this: https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md
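
A minimal way to compare the two placements empirically is to make the ordering a switch (a hedged sketch; the conv2d_bn_flexible name and the bn_before_activation flag are just for illustration, not from the Keras source):

from keras import backend as K
from keras.layers import Convolution2D, BatchNormalization, Activation

def conv2d_bn_flexible(x, nb_filter, nb_row, nb_col,
                       border_mode='same', subsample=(1, 1),
                       bn_before_activation=True):
    '''Conv block that can place BN either before or after the ReLU.'''
    # channel axis: 1 for Theano ('th') dim ordering, 3 for TensorFlow ('tf')
    bn_axis = 1 if K.image_dim_ordering() == 'th' else 3
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      border_mode=border_mode)(x)
    if bn_before_activation:
        x = BatchNormalization(axis=bn_axis)(x)
        x = Activation('relu')(x)
    else:
        x = Activation('relu')(x)
        x = BatchNormalization(axis=bn_axis)(x)
    return x

Training the same model twice with the flag flipped is an easy way to reproduce the kind of comparison the benchmark above does.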

stale bot commented May 23, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.

stale bot closed this as completed Jun 22, 2017
Mickky666 commented

I'm reading the code of DCGAN, and it seems that they use BN between the convolutional layer and the LeakyReLU. However, Jeremy Howard (http://forums.fast.ai/t/questions-about-batch-normalization/230/2) claims that we should apply it after the nonlinearity.
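
For reference, a minimal sketch of the two orderings being discussed, written with the same Keras 1.x API as the snippets above (the helper names are made up for illustration; this is not the DCGAN authors' code):

from keras.layers import Convolution2D, BatchNormalization
from keras.layers.advanced_activations import LeakyReLU

def conv_bn_lrelu(x, nb_filter, nb_row, nb_col, bn_axis=1):
    '''DCGAN-style block: BN between the convolution and the LeakyReLU.'''
    x = Convolution2D(nb_filter, nb_row, nb_col, border_mode='same')(x)
    x = BatchNormalization(axis=bn_axis)(x)
    x = LeakyReLU(0.2)(x)
    return x

def conv_lrelu_bn(x, nb_filter, nb_row, nb_col, bn_axis=1):
    '''Alternative ordering: BN applied after the nonlinearity.'''
    x = Convolution2D(nb_filter, nb_row, nb_col, border_mode='same')(x)
    x = LeakyReLU(0.2)(x)
    x = BatchNormalization(axis=bn_axis)(x)
    return x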
