# Standard Neural Networks Tutorial 2

This tutorial shows how to create a custom Keras layer, using the MNIST dataset as an example.

We will first create a layer called __Linear__, which does the same job as the Keras __Dense__ layer without an activation, i.e. it calculates __*y = wx + b*__  


In [0]:
import numpy as np
import tensorflow as tf

import datetime

from tensorflow.keras.datasets.mnist import load_data
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Layer, Activation, Input, Flatten

import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow

from PIL import Image, ImageOps

In [0]:
print("TensorFlow version: " + str(tf.__version__))
print("Keras version: " + str(tf.keras.__version__))

# Loading and Preprocessing

Let's start by loading the data from the Keras datasets module.

The dataset is composed of:

* 60000 images for training
* 10000 images for testing

Each image is a 28x28 grayscale image.  
Instead of reshaping each image into a vector ourselves, we keep the data as it is and add a __Flatten__ layer at the beginning of the model to do that job.  
The data dimensions will be:
* X: (?, 28, 28)
* Y: (?, )
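
As a rough illustration of what the __Flatten__ layer does to the shapes listed above, the NumPy sketch below reshapes a batch of 28x28 images into vectors of length 784. The random `batch` array and its size of 5 are stand-ins for real MNIST data and are not part of the tutorial's code.

```python
import numpy as np

# Hypothetical stand-in for a small batch of MNIST images: shape (5, 28, 28)
batch = np.random.rand(5, 28, 28)

# Flattening keeps the batch dimension and merges the remaining ones
flat = batch.reshape(batch.shape[0], -1)

print(batch.shape)  # (5, 28, 28)
print(flat.shape)   # (5, 784)
```

The values themselves are unchanged. Only the layout differs, which is why the model can take the images in their original shape.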

In [0]:
# Reading datasets
(x_train, y_train), (x_test, y_test) = load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train / 255
x_test  = x_test / 255

print("X Train dimensions: " + str(x_train.shape))
print("Y Train dimensions: " + str(y_train.shape))

print("X Test dimensions: " + str(x_test.shape))
print("Y Test dimensions: " + str(y_test.shape))

In [0]:
# Printing a sample image
# np.random.randint excludes the upper bound, so any valid index can be drawn
i = np.random.randint(0, x_train.shape[0])
plt.title('Title: Plotting image:%d .... Value is %d' % (i,y_train[i]))
plt.imshow(x_train[i], cmap='gray')
plt.show()

# Custom Layer

Let's create a custom layer that we will call __Linear__ which implements the same functionality as Dense i.e. __*y = wx + b*__  
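
To make the shapes behind __*y = wx + b*__ concrete, here is a minimal NumPy sketch of the computation for a batch of inputs. The array sizes and random values are illustrative assumptions. With samples stored as rows, the product is written `x @ w`, which matches the order used by `tf.matmul(inputs, self.w)` in the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random((4, 784))            # 4 flattened images (illustrative values)
w = rng.random((784, 100)) * 0.01   # weight matrix: (input features, units)
b = np.zeros(100)                   # one bias per unit

# The bias vector is broadcast across the batch dimension
y = x @ w + b

print(y.shape)  # (4, 100)
```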

Each custom layer must inherit from the Layer class defined in tensorflow.keras.layers and typically defines the following methods:
* **\__init__** which initialises the layer
* **build** in which we define the parameters
* **call** which performs the layer's forward propagation
* **compute_output_shape** which computes the shape of the output tensor given the input tensor shape

There are other methods, such as **get_config**, that are used when we need to save and load the model
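
As a hedged sketch of what such a save/load method can look like, the class below adds `get_config` to a layer shaped like the one in this tutorial. The class name `ConfigurableLinear` and the layer name `'hidden'` are made up for illustration. The sketch relies only on the standard `Layer.get_config` and `Layer.from_config` methods.

```python
import tensorflow as tf

class ConfigurableLinear(tf.keras.layers.Layer):
    """Hypothetical variant of Linear that can describe its constructor arguments."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(name='kernel',
                                 shape=(int(input_shape[-1]), self.units),
                                 initializer='glorot_uniform',
                                 trainable=True)
        self.b = self.add_weight(name='bias',
                                 shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add our own argument
        config = super().get_config()
        config.update({'units': self.units})
        return config

layer = ConfigurableLinear(100, name='hidden')
config = layer.get_config()

# from_config rebuilds an equivalent, untrained layer from the config dictionary
clone = ConfigurableLinear.from_config(config)
print(clone.units, clone.name)
```

When loading a saved model that contains a custom layer, Keras also needs to know the class itself, for example through the `custom_objects` argument of `tf.keras.models.load_model`.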

In [0]:
class Linear(Layer):
  def __init__(self, units, **kwargs):
    # Initialise the base Layer first, then store the number of output units
    super(Linear, self).__init__(**kwargs)
    self.units = units

  def build(self, input_shape):
    shape = tf.TensorShape((input_shape[-1], self.units)).as_list()
    self.w = self.add_weight(name='kernel',
                             shape=shape,
                             initializer='glorot_uniform',
                             trainable=True)
    self.b = self.add_weight(name='bias',
                             shape=(self.units,),
                             initializer='zeros',
                             trainable=True)
    super(Linear, self).build(input_shape)
  
  def call(self, inputs):
    output = tf.matmul(inputs, self.w) + self.b
    return output

  def compute_output_shape(self, input_shape):
    shape = tf.TensorShape(input_shape).as_list()
    shape[-1] = self.units
    return tf.TensorShape(shape)

# Model


The first model is the same as the one we saw earlier, but instead of using the Keras Dense layer, we will use the Linear layer that we defined above.  
Here we will use the Functional Model, which always starts with an __Input__ layer.  
__Note:__ This model will have the same number of parameters as if we used the Keras __Dense__ layer
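
The parameter count can be checked by hand. The sketch below assumes the two layer sizes used in this tutorial's model (100 and 10 units) and a flattened input of 28 x 28 pixels. The helper `linear_params` is a name introduced here for illustration only.

```python
def linear_params(n_inputs, units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * units + units

n_pixels = 28 * 28                      # size of the flattened image
hidden = linear_params(n_pixels, 100)   # first Linear layer
output = linear_params(100, 10)         # second Linear layer
total = hidden + output

print(hidden, output, total)  # 78500 1010 79510
```

These figures should match the per-layer and total parameter counts reported by __model.summary()__, since Flatten and Activation layers have no parameters.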

In [0]:
X = Input(shape=x_train.shape[1:])

Y = Flatten()(X)
Y = Linear(100)(Y)
Y = Activation('relu')(Y)
Y = Linear(10)(Y)
Y = Activation('softmax')(Y)

model = Model(inputs = X, outputs = Y)

# Compilation

Next we need to compile the model by calling __model.compile()__ and specifying the following:
* __Optimizer__ We can choose stochastic gradient descent __SGD__ or a more powerful optimisation method such as __Adam__
* __Loss function__ We use sparse_categorical_crossentropy, which suits labels given as integer class indices (0 to 9) rather than one-hot vectors
* __Metrics__ to report while training. Here we use the accuracy, i.e. how often the predicted class matches the real y  

__model.summary()__ summarises the model, showing the layers along with their parameters
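
To show what the loss and the metric measure, here is a small NumPy sketch of sparse categorical cross-entropy and accuracy. It is an illustration under simplifying assumptions, not the Keras implementation, and the `probs` and `labels` arrays are invented values.

```python
import numpy as np

# Illustrative softmax outputs for 3 samples and 3 classes (each row sums to 1)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
labels = np.array([0, 1, 0])  # integer class indices, in the same form as y_train

# Pick the probability assigned to the true class of each sample
true_class_probs = probs[np.arange(len(labels)), labels]

# Sparse categorical cross-entropy: mean negative log-probability of the true class
loss = -np.mean(np.log(true_class_probs))

# Accuracy: fraction of samples whose most probable class is the true one
accuracy = np.mean(np.argmax(probs, axis=1) == labels)

print("loss = %.4f, accuracy = %.4f" % (loss, accuracy))
```

The third sample is misclassified, so it lowers the accuracy and contributes the largest term to the loss.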

In [0]:
model.compile(optimizer="Adam", loss='sparse_categorical_crossentropy', metrics=["accuracy"])

model.summary()

# Model Training

We train the model by calling __model.fit()__ and passing it the training data x_train and y_train.  
We can also validate the model by passing x_test and y_test as validation data.

The model is trained and its parameters are updated based on x_train and y_train **only**. x_test and y_test are used only to evaluate how well the model is doing on the validation set.
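
For a sense of how much work the training call does, the sketch below counts the batches per epoch. It assumes the 60000 training images together with the batch size of 1024 and the 30 epochs that this tutorial passes to __model.fit()__.

```python
import math

n_train = 60000     # number of MNIST training images
batch_size = 1024   # batch size used in this tutorial's call to model.fit
epochs = 30         # number of epochs used in this tutorial

# The last batch of each epoch is smaller when batch_size does not divide n_train
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch_size, steps_per_epoch * epochs)  # 59 608 1770
```

Each of these steps updates the parameters from training data only. The validation data is evaluated at the end of each epoch without changing the weights.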

In [0]:
validation = True
# Use the test set as validation data only when validation is enabled
eval_data = (x_test, y_test) if validation else None

history = model.fit(x=x_train, y=y_train, epochs=30, batch_size=1024, validation_data=eval_data)

# Training Results

In [0]:
# The accuracy key is 'acc' in older TensorFlow versions and 'accuracy' in newer ones
acc_key = 'acc' if 'acc' in history.history else 'accuracy'

print("Training Accuracy = %.4f" % (history.history[acc_key][-1]))
print("Training Loss = %.4f" % (history.history['loss'][-1]))

if validation:
  print("Validation Accuracy = %.4f" % (history.history['val_' + acc_key][-1]))
  print("Validation Loss = %.4f" % (history.history['val_loss'][-1]))

# Plotting Results

Here we plot the history of the training and validation metrics in two plots:
* One for loss
* One for accuracy

In [0]:
# The accuracy key is 'acc' in older TensorFlow versions and 'accuracy' in newer ones
acc_key = 'acc' if 'acc' in history.history else 'accuracy'

plt.figure(figsize=(12,5))
plt.subplot(1, 2, 1)
plt.plot(history.history[acc_key], 'b', label='Training acc')
if validation:
  plt.plot(history.history['val_' + acc_key], 'r', label='Validation acc')
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], 'b', label="Training loss")
if validation:
  plt.plot(history.history['val_loss'], 'r', label="Validation loss")
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()

plt.show()

# Early Stopping

The number of epochs used to train a model is crucial.  
Sometimes we select too few epochs, which leads to high loss and poor accuracy. At other times we choose a large number of epochs and reach the accuracy that we want early, but still have to wait for training to finish.  
Wouldn't it be nice if there were a way to stop training on a condition? Well, there is, using Keras Callbacks.  
Let's create a subclass of Callback which stops training when acc reaches 99%
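
The stopping rule described above can be sketched in plain Python before wiring it into Keras. The accuracy values in `simulated_acc` are invented for illustration. In Keras the same check runs in a callback's `on_epoch_end` method, and setting `stop_training` skips the remaining epochs.

```python
# Hypothetical per-epoch training accuracies, used only to illustrate the rule
simulated_acc = [0.91, 0.95, 0.975, 0.985, 0.992, 0.995, 0.997]
threshold = 0.99

stopped_at = None
for epoch, acc in enumerate(simulated_acc):
    # Same kind of check a callback performs at the end of every epoch
    if acc > threshold:
        stopped_at = epoch
        break

print("Stopped after epoch index:", stopped_at)  # Stopped after epoch index: 4
```

The last two simulated epochs never run, which is the time saved by stopping on a condition.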

In [0]:
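class MyCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # The accuracy key is 'acc' in older TensorFlow versions and 'accuracy' in newer ones
    logs = logs or {}
    acc = logs.get('acc', logs.get('accuracy'))
    if acc is not None and acc > 0.99:
      print("\nReached 99% accuracy ... Stopping the training!")
      self.model.stop_training = True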

In [0]:
callbacks = MyCallback()

m = Sequential()
m.add(Flatten(input_shape=x_train.shape[1:]))
m.add(Linear(100))
m.add(Activation('relu'))
m.add(Linear(10))
m.add(Activation('softmax'))

m.compile(optimizer="Adam", loss='sparse_categorical_crossentropy', metrics=["accuracy"])

m.fit(x =x_train, y = y_train, epochs=100, batch_size=1024, validation_data=(x_test, y_test), callbacks=[callbacks])

# Summary

In this tutorial, we learnt how to:
* Create a custom layer
* Use Keras Functional Model
* Create a Callback to stop a model's training when accuracy reaches a desired value