<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Batch-Normalization" data-toc-modified-id="Batch-Normalization-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Batch Normalization</a></span></li><li><span><a href="#Batch-Normalization-layer" data-toc-modified-id="Batch-Normalization-layer-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Batch Normalization layer</a></span></li><li><span><a href="#Batch-Normalization-layer" data-toc-modified-id="Batch-Normalization-layer-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Batch Normalization layer</a></span></li><li><span><a href="#LeNet-5-architecture" data-toc-modified-id="LeNet-5-architecture-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>LeNet-5 architecture</a></span></li></ul></div>

## Batch Normalization
## Batch Normalization layer

The batch normalization layer is <b>placed before the activation layer</b>, as in the authors' original paper, rather than after it.

<img src='../images/batch.jpg'>
<p>Source: <a href='https://arxiv.org/pdf/1502.03167.pdf'>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</a></p>
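
As a rough illustration of the transform described in the paper cited above, the following NumPy sketch normalizes each feature over the mini-batch and then applies a learnable scale and shift. The function name `batch_norm_forward`, the toy data, and the `eps` value are assumptions chosen for illustration; this is not how Keras implements the layer internally.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalize each feature over the batch axis, then scale and shift.

    Hypothetical helper for illustration only.
    """
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta            # learnable scale (gamma) and shift (beta)

# Toy batch (assumed data): 1000 samples, 4 features, far from zero mean / unit variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))

y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0))
print(y.std(axis=0))
```

With `gamma` set to ones and `beta` set to zeros, each output feature should have a mean close to zero and a standard deviation close to one, whatever the scale of the input.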

For more on batch normalization, see:
<p><a href='https://arxiv.org/pdf/1502.03167.pdf'>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</a></p>
<p><a href='https://arxiv.org/pdf/1806.02375.pdf'>Understanding Batch Normalization</a></p>


In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32')/255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32')/255
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
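
To make the preprocessing step concrete, here is a small NumPy sketch of the same two operations on toy data: scaling pixel values to the range [0, 1] and one-hot encoding integer labels. The helper `one_hot` and the toy arrays are assumptions for illustration; `to_categorical` infers the number of classes from the labels, whereas this sketch takes it as an explicit argument.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a float32 one-hot matrix of shape (len(labels), num_classes).

    Hypothetical stand-in for to_categorical, for illustration only.
    """
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Toy stand-ins for MNIST (assumed data): two 28x28 uint8 images and their labels.
toy_images = np.stack([
    np.zeros((28, 28), dtype="uint8"),
    np.full((28, 28), 255, dtype="uint8"),
])
toy_labels = np.array([3, 7])

# Add a channel axis and scale pixel values from [0, 255] to [0, 1].
scaled = toy_images.reshape((-1, 28, 28, 1)).astype("float32") / 255
encoded = one_hot(toy_labels, num_classes=10)

print(scaled.shape, scaled.min(), scaled.max())
print(encoded)
```

Using `-1` for the first dimension lets `reshape` infer the number of samples, so the same line works for both the training and the test set.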




## LeNet-5 architecture
We will integrate batch normalization into the LeNet-5 architecture shown below.
<img src='../images/lenet5.jpg'>
(Source: Benjamin Planche and Eliot Andres, <i>Hands-On Computer Vision with TensorFlow 2: Leverage deep learning to create powerful image processing apps with TensorFlow 2.0 and Keras</i>, p. 94)
  

In [4]:
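# LeNet-5 with a BatchNormalization layer between each linear layer and its activation.
lenet = keras.models.Sequential([
    # Convolutional block 1: 5x5 convolution with 6 filters
    layers.Conv2D(filters=6, kernel_size=5, padding='same', input_shape=(28, 28, 1)),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),
    # Convolutional block 2: 5x5 convolution with 16 filters
    layers.Conv2D(filters=16, kernel_size=5),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),
    # Fully connected layers
    layers.Flatten(),
    layers.Dense(120),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dense(84),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    # Output layer: 10 classes, normalized before the softmax
    layers.Dense(10),
    layers.BatchNormalization(),
    layers.Activation('softmax'),
])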
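
To check how the layers in this model fit together, the following sketch traces the tensor shape through the convolution and pooling layers using the standard output-size formulas for `'same'` and `'valid'` padding. The helper `conv_output_size` is a hypothetical name introduced here; batch normalization and activation layers are omitted from the trace because they leave the shape unchanged.

```python
def conv_output_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution or pooling window (hypothetical helper)."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid' padding

size, channels = 28, 1
trace = [("input", (size, size, channels))]

size, channels = conv_output_size(size, 5, "same"), 6
trace.append(("conv 5x5, 6 filters, same", (size, size, channels)))

size = conv_output_size(size, 2, "valid", stride=2)
trace.append(("max pool 2x2, stride 2", (size, size, channels)))

size, channels = conv_output_size(size, 5, "valid"), 16
trace.append(("conv 5x5, 16 filters, valid", (size, size, channels)))

size = conv_output_size(size, 2, "valid", stride=2)
trace.append(("max pool 2x2, stride 2", (size, size, channels)))

flat = size * size * channels
trace.append(("flatten", (flat,)))

for name, shape in trace:
    print(f"{name:30s} {shape}")
```

The flattened vector feeds the dense layers of 120, 84, and 10 units. Calling `lenet.summary()` on the model itself reports the same shapes along with the parameter counts.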

In [5]:
lenet.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

In [6]:
lenet.fit(train_images, train_labels, batch_size=1000, epochs=6)

Train on 60000 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6



In [7]:
test_loss, test_acc = lenet.evaluate(test_images, test_labels)
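
During evaluation, a batch normalization layer no longer uses the statistics of the current batch. It normalizes with moving averages accumulated during training. The NumPy sketch below imitates that bookkeeping on simulated data. The `momentum` and `eps` values match the documented Keras defaults for `BatchNormalization`, while the data, loop, and variable names are assumptions for illustration, not Keras internals.

```python
import numpy as np

momentum, eps = 0.99, 1e-3      # documented Keras defaults for BatchNormalization
moving_mean = np.zeros(4)       # moving mean starts at zero
moving_var = np.ones(4)         # moving variance starts at one

rng = np.random.default_rng(0)

# Training phase (simulated): update the moving statistics after every batch.
for _ in range(2000):
    batch = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))
    moving_mean = momentum * moving_mean + (1 - momentum) * batch.mean(axis=0)
    moving_var = momentum * moving_var + (1 - momentum) * batch.var(axis=0)

# Inference phase: normalize a single sample with the stored statistics,
# so the result does not depend on which other samples share its batch.
sample = rng.normal(loc=5.0, scale=3.0, size=(1, 4))
normalized = (sample - moving_mean) / np.sqrt(moving_var + eps)

print(moving_mean.round(2))
print(moving_var.round(2))
```

After enough steps, the moving statistics settle near the mean and variance of the training data (about 5 and 9 in this toy setup), which is what makes single-sample inference possible.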



In [8]:
print(test_acc)

0.9752
