### Normalization Layers

The basic idea behind these layers is to normalize the output of an activation layer to improve convergence during training.

In [1]:
!pip install -q -U tensorflow-addons

[?25l[K     |▎                               | 10kB 28.4MB/s eta 0:00:01[K     |▋                               | 20kB 6.2MB/s eta 0:00:01[K     |█                               | 30kB 7.5MB/s eta 0:00:01[K     |█▏                              | 40kB 7.8MB/s eta 0:00:01[K     |█▌                              | 51kB 7.2MB/s eta 0:00:01[K     |█▉                              | 61kB 8.1MB/s eta 0:00:01[K     |██                              | 71kB 8.5MB/s eta 0:00:01[K     |██▍                             | 81kB 8.3MB/s eta 0:00:01[K     |██▊                             | 92kB 8.0MB/s eta 0:00:01[K     |███                             | 102kB 8.6MB/s eta 0:00:01[K     |███▎                            | 112kB 8.6MB/s eta 0:00:01[K     |███▋                            | 122kB 8.6MB/s eta 0:00:01[K     |███▉                            | 133kB 8.6MB/s eta 0:00:01[K     |████▏                           | 143kB 8.6MB/s eta 0:00:01[K     |████▌                     

In [2]:
import tensorflow as tf
import tensorflow_addons as tfa

In [3]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


### Group Normalization 

Group Normalization (GN) divides the channels of the input into smaller subgroups and normalizes the values in each group using that group's mean and variance. Since GN computes these statistics on a single example, it is independent of the batch size.

In experiments, GN scored close to batch normalization on image classification tasks. GN can be a better choice than batch normalization when the overall batch_size is small, a setting in which batch normalization performs poorly.
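
To make the grouping concrete, here is a minimal NumPy sketch of the normalization step described above. It assumes channels-last inputs of shape `(N, H, W, C)` and contiguous channel groups, and it leaves out the learned scale and offset that a real layer adds. The helper name `group_norm` and the `eps` value are illustrative choices, not part of the TensorFlow Addons API.

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """Normalize a channels-last batch (N, H, W, C) per example and per channel group."""
    n, h, w, c = x.shape
    if c % groups != 0:
        raise ValueError("the number of channels must be divisible by groups")
    # Split the channel axis into (groups, channels_per_group).
    grouped = x.reshape(n, h, w, groups, c // groups)
    # Statistics cover height, width and the channels inside one group,
    # never the batch axis.
    mean = grouped.mean(axis=(1, 2, 4), keepdims=True)
    var = grouped.var(axis=(1, 2, 4), keepdims=True)
    normalized = (grouped - mean) / np.sqrt(var + eps)
    return normalized.reshape(n, h, w, c)

rng = np.random.default_rng(0)
# Hypothetical feature maps: 4 examples, 26x26 spatial size, 10 channels.
batch = rng.normal(loc=3.0, scale=2.0, size=(4, 26, 26, 10))
out = group_norm(batch, groups=5)

# Normalizing the first example on its own matches its result inside the batch.
single = group_norm(batch[:1], groups=5)
print(np.allclose(out[:1], single))
```

Because no statistic is shared across examples, the result for one example does not change with the batch it arrives in. This is the batch-size independence mentioned above.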

In [4]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Reshape((28,28,1), input_shape = (28,28)),
  tf.keras.layers.Conv2D(filters = 10, kernel_size = (3,3), data_format = "channels_last"),
  # Groupnorm Layer
  tfa.layers.GroupNormalization(groups = 5, axis = 3),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation = 'relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation = 'softmax')
])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])

# Train on the training split; x_test / y_test stay held out for evaluation
model.fit(x_train, y_train)



<tensorflow.python.keras.callbacks.History at 0x7f4fd0433c88>

### Instance Normalization 

Instance Normalization is a special case of group normalization where the number of groups equals the number of channels (the size of the normalized axis), so every group contains exactly one channel.
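
The special-case relationship can be checked numerically. The sketch below assumes channels-last inputs of shape `(N, H, W, C)` and omits the learned scale and offset. The helpers `instance_norm` and `group_norm` are illustrative NumPy functions, not TensorFlow Addons calls.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize each channel of each example over its spatial positions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def group_norm(x, groups, eps=1e-5):
    """Normalize each example over height, width and the channels of one group."""
    n, h, w, c = x.shape
    grouped = x.reshape(n, h, w, groups, c // groups)
    mean = grouped.mean(axis=(1, 2, 4), keepdims=True)
    var = grouped.var(axis=(1, 2, 4), keepdims=True)
    return ((grouped - mean) / np.sqrt(var + eps)).reshape(n, h, w, c)

rng = np.random.default_rng(0)
# Hypothetical feature maps: 2 examples, 26x26 spatial size, 10 channels.
feature_maps = rng.normal(size=(2, 26, 26, 10))

# With one channel per group, both functions compute the same values.
same = np.allclose(instance_norm(feature_maps),
                   group_norm(feature_maps, groups=10))
print(same)
```

With `groups` set to the channel count, each group's statistics reduce to the per-channel, per-example mean and variance that instance normalization uses.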

Experimental results show that instance normalization performs well on style transfer when replacing batch normalization. Recently, instance normalization has also been used as a replacement for batch normalization in GANs.

The following model applies InstanceNormalization after a Conv2D layer, with uniformly initialized scale and offset factors.

In [None]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Reshape((28,28,1), input_shape = (28,28)),
  tf.keras.layers.Conv2D(filters = 10, kernel_size = (3,3), data_format = "channels_last"),
  # InstanceNorm Layer
  tfa.layers.InstanceNormalization(axis = 3, 
                                   center = True, 
                                   scale = True,
                                   beta_initializer = "random_uniform",
                                   gamma_initializer = "random_uniform"),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation = 'relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation = 'softmax')
])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])

# Train on the training split; x_test / y_test stay held out for evaluation
model.fit(x_train, y_train)