## Convolutional Neural Networks Using TensorFlow
In this notebook, I solidify the theoretical aspects by building simple convolutional neural networks with TensorFlow.
### Links: 
* [Flatten layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten)
* [Fully-connected layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense)
* [ReLU](https://www.tensorflow.org/api_docs/python/tf/keras/layers/ReLU)
* [Convolution Layer (2D)](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D)
* [Padding layer (2D)](https://www.tensorflow.org/api_docs/python/tf/keras/layers/ZeroPadding2D) 
* [Max Pooling layer (2D)](https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D)
* [Model evaluation](https://www.tutorialspoint.com/keras/keras_model_evaluation_and_prediction.htm)
* [Batch Normalization layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/BatchNormalization)
* [Add layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Add)
* [Random initializer](https://www.tensorflow.org/api_docs/python/tf/keras/initializers/RandomUniform)

In [None]:

import tensorflow as tf
import tensorflow.keras.layers as tfl # tensor flow layers
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import h5py

In [None]:
# let's build a simple convolutional neural network

model = tf.keras.Sequential([
    tfl.ZeroPadding2D(3, input_shape=(64, 64, 3)), # pads all 4 sides of the image with 3 rows/columns of zeros
    tfl.Conv2D(filters=32, kernel_size=(7, 7), strides=(1, 1)), # a convolutional layer with 32 filters of size (7, 7) and stride 1
    tfl.BatchNormalization(axis=3), # normalizes along the channel axis
    tfl.ReLU(), # applies the activation function to the output of the previous layer
    tfl.MaxPool2D(), # performs max pooling; the default window size is (2, 2)
    tfl.Flatten(), # flattens the output of the previous layer
    tfl.Dense(1, activation='sigmoid') # a fully-connected layer with one unit and a sigmoid activation function
])
 

In [None]:
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy(from_logits=False), metrics=['accuracy'])
# display the layers, output shapes and parameter counts of the model
model.summary() # summary() prints the table itself and returns None, so no print() is needed


### Remarks
Let's explain the output of the previous cell:
* Zero padding: the layer receives an input of shape $(64, 64, 3)$. Since the padding is 2D, only the first two dimensions are affected. The padding is $3$ on each side, so the new shape is $(64 + 3\cdot 2, 64 + 3 \cdot 2, 3) = (70, 70, 3)$.
* Convolution: the layer has a stride $s=1$ and $32$ filters of size $f = 7$, so the output shape is $(70 - 7 + 1, 70 - 7 + 1, 32) = (64, 64, 32)$.
* Batch normalization: the layer does not change the shape of its input; it only normalizes the values using statistics computed over the batch.
* ReLU: an activation layer that modifies the values without affecting the shape.
* Max pooling: with the default $(2, 2)$ window, the stride defaults to the window size, so each spatial dimension is halved and the output shape is $(32, 32, 32)$.
* Flatten: the output is a vector of $32 \cdot 32 \cdot 32 = 32768$ values per example, which the final dense layer maps to a single sigmoid output.
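
The shape arithmetic for this model can be reproduced with a small helper. This is a minimal sketch: `conv_output_size` is a hypothetical helper (not a TensorFlow function), and it assumes the usual formula $\lfloor (n + 2p - f)/s \rfloor + 1$ for an input of size $n$, a window of size $f$, a padding $p$ and a stride $s$.

```python
def conv_output_size(n, f, p=0, s=1):
    """Size of one spatial dimension after a convolution or pooling window.

    n: input size, f: window size, p: zero padding per side, s: stride.
    """
    return (n + 2 * p - f) // s + 1


# Follow the height through the Sequential model (the width behaves identically).
padded = 64 + 2 * 3                         # ZeroPadding2D(3)
conv = conv_output_size(padded, f=7, s=1)   # Conv2D with 7x7 filters and stride 1
pooled = conv_output_size(conv, f=2, s=2)   # MaxPool2D with its default 2x2 window
flattened = pooled * pooled * 32            # Flatten over the 32 channels

print(padded, conv, pooled, flattened)
```

The printed values, `70 64 32 32768`, can be compared with the output shapes reported by `model.summary()`.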


In [None]:
def load_happy_dataset(data_train_path, data_test_path):
    """Load the train/test images and labels of the Happy dataset from two HDF5 files."""
    # the context managers close the files once the arrays are copied into memory
    with h5py.File(data_train_path, "r") as train_dataset:
        train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # train set features
        train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # train set labels

    with h5py.File(data_test_path, "r") as test_dataset:
        test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # test set features
        test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # test set labels
        classes = np.array(test_dataset["list_classes"][:]) # the list of classes

    # reshape the labels into row vectors of shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes


In [None]:
data_train_path = 'utility_files/train_happy.h5'
data_test_path = 'utility_files/test_happy.h5'

# let's load the data to train the model
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_happy_dataset(data_train_path, data_test_path)

# Normalize image vectors
X_train = X_train_orig/255.

X_test = X_test_orig/255.

# Transpose the labels from shape (1, m) to (m, 1)
Y_train = Y_train_orig.T
Y_test = Y_test_orig.T

print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))

In [None]:
model.fit(X_train, Y_train, epochs=12, batch_size=(X_train.shape[0] // 20))


In [None]:
model.evaluate(X_test, Y_test) 
# the first value represents the value of the loss 
# the second represents the accuracy
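
To make the two numbers returned by `evaluate` concrete, here is a minimal NumPy sketch of binary cross-entropy and accuracy on made-up predictions. The function names and sample values are hypothetical, and the clipping constant `eps` is an assumption used to avoid `log(0)`; Keras' own implementation may differ in such details.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y * log(p) + (1 - y) * log(1 - p)) over all examples."""
    y_prob = np.clip(y_prob, eps, 1 - eps)  # keep the logarithms finite
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    y_pred = (y_prob > threshold).astype(int)
    return float(np.mean(y_pred == y_true))


y_true = np.array([1, 0, 1, 1])          # made-up labels
y_prob = np.array([0.9, 0.2, 0.4, 0.8])  # made-up sigmoid outputs

loss = binary_cross_entropy(y_true, y_prob)
accuracy = binary_accuracy(y_true, y_prob)
print(loss, accuracy)
```

Three of the four thresholded predictions match their labels here, so the accuracy is `0.75`, while the loss also penalizes the confidence of each prediction.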

In [None]:
# now we will build a convolutional model using the functional API

def convolutional_model(input_shape):
    """
    This function implements the forward pass according to the following schema:
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> DENSE

    Arguments:
    input_shape -- shape of a single input image, e.g. (64, 64, 3)

    Returns:
    model -- TF Keras model (object containing the information for the entire training process) 
    """

    # the main idea of the Functional API is to build more flexible neural networks that do not necessarily follow the linear/sequential architecture
    # creating each of the layers separately

    input_img = tf.keras.Input(shape=input_shape)
    conv1 = tfl.Conv2D(filters=8, kernel_size=(4,4), padding='SAME', strides=(1, 1)) # with 'SAME' padding and stride 1, the convolution keeps the spatial dimensions
    activation1 = tfl.ReLU() # default parameters
    max_pool1 = tfl.MaxPool2D((8, 8), strides=(8, 8), padding='SAME') # divides the height and width by 8

    conv2 = tfl.Conv2D(filters=16, kernel_size=(2,2), strides=(1,1), padding='SAME')
    activation2 = tfl.ReLU()
    max_pool2 = tfl.MaxPool2D((4, 4), strides=(4,4), padding='SAME')

    flatten = tfl.Flatten()
    fully_connected_layer = tfl.Dense(6, activation='softmax') # return the probabilities that the image belongs to one of the 6 classes

    # creating the graph: how each layer is connected to the other
    x = conv1(input_img)
    x = activation1(x)
    x = max_pool1(x)
    x = conv2(x)
    x = activation2(x)
    x = max_pool2(x)
    x = flatten(x)
    outputs = fully_connected_layer(x)
    # declare the model
    hand_sign_model = tf.keras.Model(inputs=input_img, outputs= outputs, name='hand_sign_number_detector')

    return hand_sign_model

In [None]:
model = convolutional_model((64, 64, 3))
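
With `'SAME'` padding, the output size of a layer depends only on its stride: a dimension of size $n$ becomes $\lceil n / s \rceil$. The sketch below traces the spatial size of a $(64, 64, 3)$ input through the layers of this model; `same_output_size` is a hypothetical helper, not part of TensorFlow.

```python
import math


def same_output_size(n, s):
    """Output size of one spatial dimension with 'SAME' padding and stride s."""
    return math.ceil(n / s)


after_conv1 = same_output_size(64, 1)           # Conv2D, stride 1
after_pool1 = same_output_size(after_conv1, 8)  # MaxPool2D, stride 8
after_conv2 = same_output_size(after_pool1, 1)  # Conv2D, stride 1
after_pool2 = same_output_size(after_conv2, 4)  # MaxPool2D, stride 4
flattened = after_pool2 * after_pool2 * 16      # Flatten over the 16 channels

print(after_conv1, after_pool1, after_conv2, after_pool2, flattened)
```

The dense layer therefore receives 64 features per image before producing the 6 class probabilities.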

## Residual Network
A more detailed explanation of residual networks can be found in this [link](https://github.com/ayhem18/Towards_Data_science/blob/master/Machine_Learning/CNN/CNN_2.ipynb).
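
As a quick reminder of the core idea, a residual block adds its input back to the output of its main path before the final activation. Here is a minimal NumPy sketch, where `main_path` is a hypothetical stand-in for the convolution and batch-normalization layers:

```python
import numpy as np


def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0)


def residual_block(x, main_path):
    """Return relu(main_path(x) + x); main_path must preserve the shape of x."""
    return relu(main_path(x) + x)


x = np.array([1.0, -2.0, 3.0])
# If the main path outputs zeros, the block reduces to relu(x).
out = residual_block(x, lambda v: np.zeros_like(v))
print(out)
```

The addition only works when the main path returns a tensor with the same shape as its input.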

In [None]:
from tensorflow.keras.initializers import RandomUniform as random_uniform
from tensorflow.keras.initializers import GlorotUniform as glorot_uniform


In [None]:
## the following code builds a residual block slightly more powerful than the one explained in the theoretical part.
## this function builds a three-component residual block

## personal attempt
def identity_block(X, f, filters, training=True, initializer=random_uniform):
    assert len(filters) == 3
    # retrieve the number of filters associated with each component
    f1, f2, f3 = filters
    # keep the input so that it can be added back to the main path
    shortcut = X

    # a 1x1 convolution with 'valid' padding keeps the spatial dimensions
    conv1 = tfl.Conv2D(filters=f1, kernel_size=1, kernel_initializer=initializer(seed=0), padding='valid')
    batnorm1 = tfl.BatchNormalization(axis=3)
    activation = tfl.ReLU()

    # the middle f x f convolution needs 'same' padding to keep the spatial dimensions
    conv2 = tfl.Conv2D(filters=f2, kernel_size=f, kernel_initializer=initializer(seed=0), padding='same')
    batnorm2 = tfl.BatchNormalization(axis=3)

    conv3 = tfl.Conv2D(filters=f3, kernel_size=1, kernel_initializer=initializer(seed=0), padding='valid')
    batnorm3 = tfl.BatchNormalization(axis=3)

    # first component
    x = conv1(X)
    x = batnorm1(x, training=training)
    x = activation(x)

    # second component
    x = conv2(x)
    x = batnorm2(x, training=training)
    x = activation(x)

    # third component: convolution and normalization only
    x = conv3(x)
    x = batnorm3(x, training=training)

    # add the input back to the main path; f3 must equal the number of channels of X
    x = tfl.Add()([x, shortcut])
    return activation(x)



In [None]:
def identity_block(X, f, filters, training=True, initializer=random_uniform):
    """
    Implementation of the identity residual block: Residual block where the input and output are of the same dimensions
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    training -- True: Behave in training mode
                False: Behave in inference mode
    initializer -- to set up the initial weights of a layer. Defaults to the random uniform initializer
    
    Returns:
    X -- output of the identity block, tensor of shape (m, n_H, n_W, n_C)
    """
        
    # Retrieve Filters
    F1, F2, F3 = filters
    
    # Save the input value. You'll need this later to add back to the main path. 
    X_shortcut = X
    
    relu = tfl.ReLU()

    # First component of main path
    X = tfl.Conv2D(filters = F1, kernel_size = 1, strides = (1,1), padding = 'valid', kernel_initializer = initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis = 3)(X, training = training) # Default axis
    X = relu(X)
    
    ## Second component of main path
    ## The middle f x f convolution uses padding = 'same' to keep the spatial dimensions
    X = tfl.Conv2D(filters=F2, kernel_size=f, strides=(1,1), padding='same', kernel_initializer=initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis=3)(X, training=training)
    X = relu(X) 

    ## Third component of main path
    ## A 1 x 1 convolution with padding = 'valid' also keeps the spatial dimensions
    X = tfl.Conv2D(filters=F3, kernel_size=1, strides=(1,1), padding='valid', kernel_initializer=initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis=3)(X, training=training) 
    
    ## Final step: Add shortcut value to main path, and pass it through a RELU activation
    X = tfl.Add()([X, X_shortcut])
    X = relu(X) 
    
    return X

The previous block did not take into account the case of different input and output dimensions. The following code addresses this case.
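
To see why the shortcut needs its own convolution in this case, the sketch below compares the shape produced by the main path with the shape of the untouched input. It assumes the usual layout, in which the first $1 \times 1$ convolution of the main path uses stride $s$ and the last convolution sets the number of output channels; `block_output_shape` is a hypothetical helper.

```python
def block_output_shape(input_shape, filters, s):
    """Shape (height, width, channels) after a strided 1x1 'valid' convolution
    followed by shape-preserving layers whose last one has filters[-1] filters."""
    h, w, _ = input_shape
    # a 1x1 'valid' convolution with stride s maps n to floor((n - 1) / s) + 1
    return ((h - 1) // s + 1, (w - 1) // s + 1, filters[-1])


input_shape = (15, 15, 256)
main_shape = block_output_shape(input_shape, [128, 128, 512], s=2)

# An identity shortcut would keep (15, 15, 256), which cannot be added to main_shape.
# A 1x1 convolution with 512 filters and stride 2 on the shortcut gives a matching shape.
shortcut_shape = block_output_shape(input_shape, [512], s=2)

print(main_shape, shortcut_shape)
```

Both paths end up with the shape `(8, 8, 512)`, so the final addition is well defined.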

In [None]:
## personal attempt
def convolutional_block(X, f, filters, s = 2, training=True, initializer=glorot_uniform):
    """
    Implementation of the convolutional block: residual block with possibly different input/output dimensions
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    s -- Integer, specifying the stride to be used
    training -- True: Behave in training mode
                False: Behave in inference mode
    initializer -- to set up the initial weights of a layer. Defaults to the Glorot uniform initializer, 
                   also called Xavier uniform initializer.
    
    Returns:
    X -- output of the convolutional block, tensor of shape (m, n_H, n_W, n_C)
    """
    
    # Retrieve Filters
    F1, F2, F3 = filters
    
    # Save the input value
    X_shortcut = X


    ##### MAIN PATH #####
    relu = tfl.ReLU()    

    ## First component of main path: the stride s is applied here so that the main path
    ## is downsampled in the same way as the shortcut path
    X = tfl.Conv2D(filters=F1, kernel_size=1, strides=(s, s), padding='valid', kernel_initializer=initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis = 3)(X, training = training) # Default axis
    X = relu(X)

    ## Second component of main path
    X = tfl.Conv2D(filters=F2, kernel_size=f, strides=(1,1), padding='same', kernel_initializer=initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis=3)(X, training=training) 
    X = relu(X) 

    ## Third component of main path
    X = tfl.Conv2D(filters=F3, kernel_size=1, strides=(1,1), padding='valid', kernel_initializer=initializer(seed=0))(X)
    X = tfl.BatchNormalization(axis=3)(X, training=training)
    
    ##### SHORTCUT PATH #####
    X_shortcut = tfl.Conv2D(filters=F3, kernel_size=1, strides=(s, s), padding='same', kernel_initializer=initializer(seed=0))(X_shortcut)
    X_shortcut = tfl.BatchNormalization(axis=3)(X_shortcut, training=training)

    # Final step: Add shortcut value to main path (Use this order [X, X_shortcut]), and pass it through a RELU activation
    X = tfl.Add()([X, X_shortcut])
    X = relu(X)

    return X
    

The following code builds my first deep neural network using the blocks introduced above.

In [None]:
def ResNet50(input_shape = (64, 64, 3), classes = 6):
    """
    Stage-wise implementation of the architecture of the popular ResNet50:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> FLATTEN -> DENSE 

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes


    Returns:
    model -- a Model() instance in Keras
    """
    # define the input tensor to work with
    X_input = tfl.Input(input_shape)

    ## define a ReLU activation layer as it is used extensively
    relu = tfl.ReLU()

    ## pad the input with zeros
    X = tfl.ZeroPadding2D(padding=(3, 3))(X_input)
    
    ## stage 1
    X = tfl.Conv2D(filters=64, kernel_size=7, strides=(2,2), kernel_initializer=glorot_uniform(seed=0))(X)
    X = tfl.BatchNormalization(axis=3)(X)
    X = relu(X)
    X = tfl.MaxPooling2D(pool_size=(3, 3), strides=(2,2))(X)

    ## stage 2
    X = convolutional_block(X, f=3, s=1, filters=[64, 64, 256])
    X = identity_block(X, f=3, filters=[64, 64, 256])
    X = identity_block(X, 3, [64, 64, 256])

    # stage 3
    X = convolutional_block(X, f=3, s=2, filters=[128, 128, 512])
    X = identity_block(X, f=3, filters=[128, 128, 512])
    X = identity_block(X, f=3, filters=[128, 128, 512])
    X = identity_block(X, f=3, filters=[128, 128, 512])

    # stage 4
    X = convolutional_block(X, f=3, s=2, filters=[256, 256, 1024])
    X = identity_block(X, f=3, filters=[256, 256, 1024])
    X = identity_block(X, f=3, filters=[256, 256, 1024])
    X = identity_block(X, f=3, filters=[256, 256, 1024])
    X = identity_block(X, f=3, filters=[256, 256, 1024])
    X = identity_block(X, f=3, filters=[256, 256, 1024])

    ## stage 5
    X = convolutional_block(X, f=3, s=2, filters=[512, 512, 2048])
    X = identity_block(X, f=3, filters=[512, 512, 2048])
    X = identity_block(X, f=3, filters=[512, 512, 2048])

    ## final stage
    X = tfl.AveragePooling2D(pool_size=(2,2))(X)
    X = tfl.Flatten()(X)
    X = tfl.Dense(classes, activation='softmax', kernel_initializer=glorot_uniform(seed=0))(X)
    model = tf.keras.Model(inputs = X_input, outputs=X)
    
    return model
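
A quick check of the spatial sizes helps when debugging a network this deep. The sketch below traces the height of a $64 \times 64$ input through the stages of `ResNet50`, assuming that a convolutional block with stride $s$ maps a size $n$ to $\lfloor (n - 1)/s \rfloor + 1$ and that identity blocks leave it unchanged; `window_output_size` is a hypothetical helper.

```python
def window_output_size(n, f, s, p=0):
    """Size of one spatial dimension after a window of size f, stride s, padding p."""
    return (n + 2 * p - f) // s + 1


sizes = {}
n = 64 + 2 * 3                        # ZeroPadding2D((3, 3))
sizes["padding"] = n
n = window_output_size(n, f=7, s=2)   # stage 1 convolution
sizes["stage 1 conv"] = n
n = window_output_size(n, f=3, s=2)   # stage 1 max pooling
sizes["stage 1 max pool"] = n

# Only the convolutional block at the start of each stage changes the size.
for stage, s in [("stage 2", 1), ("stage 3", 2), ("stage 4", 2), ("stage 5", 2)]:
    n = window_output_size(n, f=1, s=s)
    sizes[stage] = n

n = window_output_size(n, f=2, s=2)   # average pooling with a 2x2 window
sizes["average pool"] = n
flattened = n * n * 2048              # the last stage has 2048 channels

print(sizes, flattened)
```

Under these assumptions, the size goes 70, 32, 15, 15, 8, 4, 2, 1, so the dense layer receives 2048 features per image.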

In [None]:
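# the Happy dataset has two classes with integer labels (0 or 1), so the network uses
# two softmax outputs together with the sparse categorical cross-entropy loss
res50 = ResNet50(input_shape=(64, 64, 3), classes=2)
res50.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False), metrics=['accuracy'])
res50.fit(X_train, Y_train, epochs=12, batch_size=(X_train.shape[0] // 20))
res50.evaluate(X_test, Y_test)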

In [None]:
def data_augmenter():
    '''
    Create a Sequential model composed of 2 layers
    Returns:
        tf.keras.Sequential
    '''
    data_augmentation = tf.keras.Sequential([
        tfl.RandomFlip("horizontal"),
        tfl.RandomRotation(0.2),
    ])
    return data_augmentation

In [None]:
aug = data_augmenter()

In [None]:
augmenter = data_augmenter()

assert(augmenter.layers[0].name.startswith('random_flip')), "First layer must be RandomFlip"
assert augmenter.layers[0].mode == 'horizontal', "RandomFlip parameter must be horizontal"
assert(augmenter.layers[1].name.startswith('random_rotation')), "Second layer must be RandomRotation"
assert augmenter.layers[1].factor == 0.2, "Rotation factor must be 0.2"
assert len(augmenter.layers) == 2, "The model must have only 2 layers"
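
To illustrate what the two layers do: a horizontal flip reverses the width axis of an image, and, per the Keras documentation, a rotation factor is a fraction of $2\pi$, so `0.2` allows rotations of up to $0.2 \cdot 360 = 72$ degrees in either direction. Here is a minimal NumPy sketch on a tiny made-up image; it only mimics the flip and is not the layers' actual implementation.

```python
import numpy as np

# A made-up 2x3 image with a single channel: (height, width, channels).
image = np.arange(6).reshape(2, 3, 1)

# A horizontal flip reverses the order of the columns (the width axis).
flipped = np.flip(image, axis=1)

# Largest rotation angle, in degrees, allowed by a factor of 0.2.
max_rotation_degrees = 0.2 * 360

print(image[:, :, 0])
print(flipped[:, :, 0])
```

The flipped image keeps the shape of the original, which is why such layers can be placed in front of a model without changing its input shape.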


In [None]:
# RandomFlip and RandomRotation can be imported directly from tensorflow.keras.layers (TensorFlow 2.6 and later)
from tensorflow.keras.layers import RandomFlip, RandomRotation

def data_augmenter_2():
    da = tf.keras.Sequential([])
    da.add(RandomFlip("horizontal"))
    da.add(RandomRotation(0.2))
    return da

da = data_augmenter_2()