Artificial Neural Network (ANN)

A neural network consists of an input layer, one or more hidden layers, and an output layer. Information enters through the input layer and is passed to the hidden layers, where it is processed. The output layer then produces the final result. This is how a simple artificial neural network works.
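
As a minimal sketch of this flow, assuming a single hidden layer with a ReLU activation and randomly initialized weights (the layer sizes and the names `forward`, `w_hidden`, and `w_out` are illustrative and do not come from the text):

```python
import numpy as np

def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(z, 0.0)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Pass one input vector through a hidden layer and an output layer."""
    hidden = relu(x @ w_hidden + b_hidden)  # hidden layer processes the input
    return hidden @ w_out + b_out           # output layer produces the result

rng = np.random.default_rng(0)
x = rng.normal(size=4)              # 4 input features (illustrative)
w_hidden = rng.normal(size=(4, 3))  # 4 inputs -> 3 hidden units
b_hidden = np.zeros(3)
w_out = rng.normal(size=(3, 2))     # 3 hidden units -> 2 outputs
b_out = np.zeros(2)

output = forward(x, w_hidden, b_hidden, w_out, b_out)
print(output.shape)  # (2,)
```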

Convolutional neural networks (CNNs) are a type of artificial neural network (ANN). They are made up of neurons connected to one another by weighted branches; as in other ANNs, the trainable parameters are the weights and biases. In CNNs, the connection pattern between neurons is inspired by the structure of the visual cortex in animals. Individual neurons in the visual cortex respond to stimuli only in a narrow region of the visual field, called the receptive field. The receptive fields of different neurons partially overlap so that together they cover the entire field of vision. The response of a single neuron to stimuli taking place in its receptive field can be approximated mathematically by a convolution operation.

A CNN contains an input layer, convolution layers, pooling layers, fully connected layers, and an output layer.


**Convolution layers**

This is the main type of layer; at least one convolution layer is essential in a CNN. The neurons of a convolution layer are organized in three dimensions: width, height, and depth.

During forward propagation, each filter, which is spatially small (along the width and height dimensions) but extends over the entire depth of the input volume to which it is applied, is slid (convolved) across the width and height of the input volume, producing a 2-dimensional activation map (or feature map) for that filter. As the filter moves across the input area, a scalar product is computed between the values of the filter and those of the input portion to which it is applied.
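
The sliding scalar product can be sketched in NumPy as follows. This is a simplified illustration that assumes a single-channel input, no padding, and a stride of 1; the function name `conv2d_valid` and the example values are hypothetical. Like most deep learning libraries, it applies the filter without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    img_h, img_w = image.shape
    k_h, k_w = kernel.shape
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + k_h, j:j + k_w]         # input portion under the filter
            feature_map[i, j] = np.sum(patch * kernel)  # scalar product
    return feature_map

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 input
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                  # toy 2x2 filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

A 2x2 filter applied to a 4x4 input yields a 3x3 feature map, because the filter fits in three positions along each axis.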

The goal of the network is to learn filters that activate when a specific type of feature is present in a given spatial region of the input. Stacking all these feature maps (one per filter) along the depth dimension forms the output volume of a convolution layer. Each element of this volume can be interpreted as the output of a neuron that observes only a small region of the input and shares its parameters with the other neurons in the same feature map. This is because all of these values come from applying the same filter.
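
To make the effect of parameter sharing concrete, the sketch below computes the output volume and the number of trainable parameters of one convolution layer, assuming no padding and square filters. The helper names and the example sizes (a 32x32x1 input with six 5x5 filters) are chosen for illustration.

```python
def conv_output_shape(height, width, kernel_size, num_filters, stride=1):
    """Output volume (height, width, depth) of a convolution without padding."""
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    return out_h, out_w, num_filters  # one feature map per filter

def conv_param_count(kernel_size, in_depth, num_filters):
    """Weights plus one bias per filter, shared across all spatial positions."""
    return (kernel_size * kernel_size * in_depth + 1) * num_filters

out_shape = conv_output_shape(32, 32, kernel_size=5, num_filters=6)
shared_params = conv_param_count(kernel_size=5, in_depth=1, num_filters=6)

# Hypothetical comparison: every output neuron owning its own weights and bias.
out_h, out_w, depth = out_shape
unshared_params = out_h * out_w * depth * (5 * 5 * 1 + 1)

print(out_shape)        # (28, 28, 6)
print(shared_params)    # 156
print(unshared_params)  # 122304
```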

**Pooling layers**


These layers are inserted into a network to reduce the spatial size (width and height) of the current representation, and therefore the size of the volumes at a given stage of the network. This reduces the number of parameters in subsequent layers and the computation time of the network. It also helps control overfitting. A pooling layer operates on each depth slice of the input volume independently to resize it spatially.

For each feature map obtained in the convolution step, a small window is slid over the map and the maximum value within each window is kept, which shrinks the input. The steps are:


1.   Pick a window size (usually 2 or 3).
2.   Pick a stride, the number of pixels the window moves at each step (usually 2).
3.   Slide the window across the filtered images.
4.   For each window, take the maximum value.
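
The steps above can be sketched in NumPy for a single feature map. The function name `max_pool2d` and the example values are illustrative, and the sketch assumes no padding.

```python
import numpy as np

def max_pool2d(feature_map, window=2, stride=2):
    """Max pooling over one 2-D feature map, without padding."""
    h, w = feature_map.shape
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    pooled = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride  # top-left corner of the window
            pooled[i, j] = feature_map[r:r + window, c:c + window].max()
    return pooled

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 1],
                        [0, 2, 9, 5],
                        [1, 1, 3, 7]])
pooled = max_pool2d(feature_map)
print(pooled)
# [[6 2]
#  [2 9]]
```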


A pooling layer divides its input into regions and selects a single representative value for each one (for example, the maximum in max pooling or the mean in average pooling).

The max pooling layer keeps the strongest responses of the features detected by the convolution layers that precede it. Its output indicates whether a given feature is present in a region of the previous layer, but not exactly where. The idea is to let successive layers work on larger sections of the data. Max pooling allows for faster convergence rates and selects more invariant features, which improves generalization performance.


There are two main advantages of using a pooling layer. The first is that it reduces the computation required by subsequent layers; the second is that it increases the robustness of the features with respect to spatial position.
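
Both advantages can be illustrated with a small NumPy sketch. It assumes 2x2 windows with a stride of 2 and a feature map with even height and width; the helper name `pool2x2` is hypothetical. Note that the output only stays the same when the shifted feature remains inside the same pooling window.

```python
import numpy as np

def pool2x2(feature_map, reduce_fn):
    """Apply `reduce_fn` (e.g. np.max or np.mean) over non-overlapping 2x2 blocks."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))

a = np.zeros((4, 4))
a[0, 0] = 1.0  # a feature detected at position (0, 0)
b = np.zeros((4, 4))
b[1, 1] = 1.0  # the same feature shifted by one pixel diagonally

max_a, max_b = pool2x2(a, np.max), pool2x2(b, np.max)
avg_a = pool2x2(a, np.mean)

print(a.size, max_a.size)            # 16 4 (four times fewer values downstream)
print(np.array_equal(max_a, max_b))  # True (the small shift is absorbed)
```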



We will use the MNIST digits dataset, which we can access through the Keras library.

In [None]:
# Implement a basic CNN
# First, import all the necessary libraries
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

In [None]:
# Second, load the data and reshape the images into a 4-dimensional array
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Reshape to (num_samples, height, width, channels)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)
# Pad the images by 2 pixels on each side (28x28 -> 32x32)
x_train = np.pad(x_train, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
x_test = np.pad(x_test, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


The MNIST dataset includes a training set and a test set. Each is composed of grayscale images (integer arrays with shape (num_samples, 28, 28)) and labels (integers in the range 0-9). The images are padded by 2 pixels on each side, from 28x28 to 32x32, to match the 32x32 input size that the LeNet-5-style network built below is designed for.
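
The effect of the padding call can be checked on a small stand-in array. The batch of three all-ones images below is a hypothetical placeholder for the real data.

```python
import numpy as np

# Placeholder batch: 3 images of 28x28 pixels with 1 channel, all ones.
batch = np.ones((3, 28, 28, 1), dtype=np.uint8)

# Pad only the height and width axes by 2 pixels on each side with zeros.
padded = np.pad(batch, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

print(padded.shape)        # (3, 32, 32, 1)
print(padded[0, 0, 0, 0])  # 0 (new border pixel)
print(padded[0, 2, 2, 0])  # 1 (first original pixel)
```

The batch and channel axes get a padding of `(0, 0)`, so the number of images and channels is unchanged.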


The next step is to set the model parameters. The depth of the images (number of channels) is 1 because the images are grayscale.

In [None]:
image_width = x_train[0].shape[0]   # images are square (32x32 after padding)
image_height = x_train[0].shape[1]
num_channels = 1  # grayscale = 1 channel
seed = 98
np.random.seed(seed)
tf.random.set_seed(seed)

Next, we declare the training and evaluation settings. The batch sizes for training and evaluation differ, and these values can be changed depending on the physical memory available for training and evaluating.

In [None]:
batch_size = 100
evaluation_size = 500
epochs = 300
eval_every = 5
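
To get a feel for how the batch size relates to memory, the sketch below draws one random mini-batch and reports its size in bytes. It is only an illustration: the training loop is not shown at this point, and the placeholder arrays, their `float32` dtype, and the sampling approach are assumptions.

```python
import numpy as np

rng = np.random.default_rng(98)
batch_size = 100

# Placeholder arrays standing in for the padded training images and labels.
images = np.zeros((1000, 32, 32, 1), dtype=np.float32)
labels = rng.integers(0, 10, size=1000)

# Draw a random mini-batch without repeating any sample.
batch_idx = rng.choice(len(images), size=batch_size, replace=False)
batch_x, batch_y = images[batch_idx], labels[batch_idx]

print(batch_x.shape)   # (100, 32, 32, 1)
print(batch_x.nbytes)  # 409600 bytes for one float32 batch
```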

Our images need to be normalized so that all pixel values are on a common scale, here the range [0, 1].



In [None]:
# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train / 255
x_test = x_test / 255

We will now declare our model. The feature extractor module is composed of two convolution/ReLU/max pooling blocks and is followed by a classifier made of fully connected layers. To make the classifier work, we flatten the output of the feature extractor so it can be used as input to the classifier. We use a softmax activation function in the last layer of the classifier; softmax turns the numeric outputs (logits) into probabilities that sum to one.
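
As a rough illustration of what softmax does, here is a NumPy sketch. The function below is a stand-alone version written for this example, not the Keras implementation, and the logits are made-up values.

```python
import numpy as np

def softmax(logits):
    """Turn logits into probabilities along the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])  # made-up scores for 3 classes
probs = softmax(logits)

print(probs)           # the largest logit gets the largest probability
print(np.sum(probs))   # sums to one, up to floating-point rounding
```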

In [6]:
input_data = tf.keras.Input(dtype=tf.float32, shape=(image_width,image_height, num_channels), name="INPUT")
# First Conv-ReLU-MaxPool Layer
conv1 = tf.keras.layers.Conv2D(filters=6,
                               kernel_size=5,
                               padding='VALID',
                               activation="relu",
                               name="C1")(input_data)
max_pool1 = tf.keras.layers.MaxPool2D(pool_size=2,
                                      strides=2, 
                                      padding='SAME',
                                      name="S1")(conv1)
# Second Conv-ReLU-MaxPool Layer
conv2 = tf.keras.layers.Conv2D(filters=16,
                               kernel_size=5,
                               padding='VALID',
                               strides=1,
                               activation="relu",
                               name="C3")(max_pool1)
max_pool2 = tf.keras.layers.MaxPool2D(pool_size=2,
                                      strides=2, 
                                      padding='SAME',
                                      name="S4")(conv2)
# Flatten Layer
flatten = tf.keras.layers.Flatten(name="FLATTEN")(max_pool2)
# First Fully Connected Layer
fully_connected1 = tf.keras.layers.Dense(units=120,
                                         activation="relu",
                                         name="F5")(flatten)
# Second Fully Connected Layer
fully_connected2 = tf.keras.layers.Dense(units=84,
                                         activation="relu",
                                         name="F6")(fully_connected1)
# Final Fully Connected Layer
final_model_output = tf.keras.layers.Dense(units=10,
                                           activation="softmax",
                                           name="OUTPUT"
                                           )(fully_connected2)
    
model = tf.keras.Model(inputs=input_data, outputs=final_model_output)
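
As a sanity check on the architecture, the spatial size after each layer can be traced by hand. The sketch below uses the usual size formulas for `'VALID'` convolutions and `'SAME'` pooling; the helper names `conv_valid` and `pool_same` are hypothetical.

```python
def conv_valid(size, kernel, stride=1):
    """Output size of a convolution with 'VALID' padding."""
    return (size - kernel) // stride + 1

def pool_same(size, stride=2):
    """Output size of a pooling layer with 'SAME' padding (ceiling division)."""
    return -(-size // stride)

size = 32                    # padded input is 32x32
c1 = conv_valid(size, 5)     # C1: 28x28, 6 feature maps
s1 = pool_same(c1)           # S1: 14x14
c3 = conv_valid(s1, 5)       # C3: 10x10, 16 feature maps
s4 = pool_same(c3)           # S4: 5x5
flat_units = s4 * s4 * 16    # FLATTEN: inputs to the first dense layer

print(c1, s1, c3, s4, flat_units)  # 28 14 10 5 400
```

Calling `model.summary()` on the model above should report the same output shapes.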
