# **Convolutional Neural Network** - From Theory to Practice

Convolutional neural networks (CNNs, or ConvNets) are a class of deep neural networks most commonly used to analyze visual imagery. Unlike the fully connected networks that usually come to mind when we think of a neural network, a ConvNet does not rely solely on general matrix multiplication; at least some of its layers use a specialized operation called convolution. In mathematics, convolution is an operation on two functions that produces a third function describing how the shape of one is modified by the other.
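To make the definition concrete, here is a minimal 1-D sketch in NumPy. The helper `conv1d_full` and the arrays `f` and `g` are illustrative names, not part of the notebook. It also shows the related cross-correlation, which slides the kernel without flipping it; the sliding-window sum used in CNN layers is usually of this second kind, and the two coincide when the kernel is symmetric.

```python
import numpy as np

def conv1d_full(f, g):
    """Discrete convolution: (f * g)[n] = sum over m of f[m] * g[n - m]."""
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):
        for m in range(len(f)):
            if 0 <= n - m < len(g):
                out[n] += f[m] * g[n - m]
    return out

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

conv = conv1d_full(f, g)                 # hand-written definition
corr = np.correlate(f, g, mode="full")   # same sliding sum, kernel not flipped

print(conv)
print(np.allclose(conv, np.convolve(f, g)))        # matches NumPy's convolution
print(np.allclose(corr, np.convolve(f, g[::-1])))  # correlation = convolution with flipped kernel
```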

<img src="images/cnnArchitecture.jpeg"/>

## **Building CNN from Scratch**

In [33]:
import numpy as np

The following image serves as the running example for understanding and verifying how the CNN works:

<img src="images/conv.gif" width="350"/>

In [34]:
class CNN:
    """Minimal NumPy building blocks of a CNN: convolution, max pooling, flatten, dropout."""

    def __init__(self):
        pass

    def convLayer(self, input_shape, channels, strides, padding, filter_size):
        height, width = input_shape
        input_shape_with_channels = (height, width, channels)
        print("Input Shape (with channels):", input_shape_with_channels)
        
        # for random input and filter matrices
        # input_matrix = np.random.randint(0, 10, size=input_shape_with_channels)
        # filter_matrix = np.random.randint(0, 5, size=filter_size)
        
        input_matrix = np.array([
                    [1, 1, 1, 0, 0],
                    [0, 1, 1, 1, 0],
                    [0, 0, 1, 1, 1],
                    [0, 0, 1, 1, 0],
                    [0, 1, 1, 0, 0]
        ])
        filter_matrix = np.array([
                    [1, 0, 1],
                    [0, 1, 0],
                    [1, 0, 1]
        ])
        print("\nInput Matrix:")
        print(input_matrix)
        print("\nFilter Matrix:")
        print(filter_matrix)
        

        # str.lower() returns a new string, so the result must be reassigned
        padding = padding.lower()

        if padding == 'same':
            # Calculate padding needed for each dimension
            pad_height = ((height - 1) * strides[0] + filter_size[0] - height) // 2
            pad_width = ((width - 1) * strides[1] + filter_size[1] - width) // 2
            
            # Apply zero padding to the 2-D input matrix: one (before, after) pair per axis
            input_matrix = np.pad(input_matrix, ((pad_height, pad_height), (pad_width, pad_width)),
                                  mode='constant')
            
            # Adjust height and width to consider the padding
            height += 2 * pad_height
            width += 2 * pad_width

        elif padding == 'valid':
            pass

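        else:
            # Fail loudly instead of returning a string in place of a matrix
            raise ValueError("Invalid padding: expected 'same' or 'valid'")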


        # output dimension
        conv_height = (height - filter_size[0]) // strides[0] + 1
        conv_width = (width - filter_size[1]) // strides[1] + 1
        
        output_matrix = np.zeros((conv_height, conv_width))
        
        # Convolution Operation
        for i in range(0, height - filter_size[0] + 1, strides[0]):
            for j in range(0, width - filter_size[1] + 1, strides[1]):
                receptive_field = input_matrix[i:i + filter_size[0], j:j + filter_size[1]]
                output_matrix[i // strides[0], j // strides[1]] = np.sum(receptive_field * filter_matrix)
        
        return output_matrix


    def maxPooling(self, input_matrix, pool_size, strides_pooling):
        pool_height, pool_width = pool_size
        stride_height, stride_width = strides_pooling
        pooled_height = (input_matrix.shape[0] - pool_height) // stride_height + 1
        pooled_width = (input_matrix.shape[1] - pool_width) // stride_width + 1
        pooled_matrix = np.zeros((pooled_height, pooled_width))

        for i in range(pooled_height):
            for j in range(pooled_width):
                patch = input_matrix[i * stride_height: i * stride_height + pool_height,
                                     j * stride_width: j * stride_width + pool_width]
                pooled_matrix[i, j] = np.max(patch)
        return pooled_matrix
    

    def flatten(self, input_matrix):
        return input_matrix.flatten()
    
    
    def dropout(self, input_matrix, dropout_rate = 0):
        dropout_mask = np.random.binomial(1, 1 - dropout_rate, size=input_matrix.shape)
        return input_matrix * dropout_mask

In [35]:
input_shape = (5, 5)
channels = 1
strides = (1, 1)
padding = 'valid'
filter_size = (3, 3)

In [36]:
cnn_model = CNN()

convL1 = cnn_model.convLayer(input_shape, channels, strides, padding, filter_size)

Input Shape (with channels): (5, 5, 1)

Input Matrix:
[[1 1 1 0 0]
 [0 1 1 1 0]
 [0 0 1 1 1]
 [0 0 1 1 0]
 [0 1 1 0 0]]

Filter Matrix:
[[1 0 1]
 [0 1 0]
 [1 0 1]]


In [37]:
convL1

array([[4., 3., 4.],
       [2., 4., 3.],
       [2., 3., 4.]])
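As an independent check of this result, the same sliding-window sum can be sketched without explicit loops. This is only a verification aid, not the notebook's implementation; it assumes NumPy 1.20 or later for `sliding_window_view`, and the names `x`, `k`, `windows`, and `check` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

# Same input and filter as in the convLayer example
x = np.array([
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
])
k = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
])

# Every 3x3 receptive field at stride 1: shape (3, 3, 3, 3)
windows = sliding_window_view(x, k.shape)

# Multiply each window by the filter and sum over the window axes
check = np.einsum("ijkl,kl->ij", windows, k)
print(check)
```

The values agree with `convL1` above; the only difference is the integer dtype, since `convLayer` writes into a float array created by `np.zeros`.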

In [38]:
pool_size = (2, 2)
strides_pooling = (1, 1)

maxPool = cnn_model.maxPooling(convL1, pool_size, strides_pooling)
maxPool

array([[4., 4.],
       [4., 4.]])
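Both `convLayer` (with `'valid'` padding) and `maxPooling` size their outputs with the same rule, `(n - k) // stride + 1` per axis. A small sketch of that arithmetic, using a hypothetical helper `output_size` that is not part of the class:

```python
def output_size(n, k, stride):
    """Number of window positions along one axis when no padding is applied."""
    return (n - k) // stride + 1

conv_side = output_size(5, 3, 1)          # 5x5 input, 3x3 filter, stride 1
pool_side = output_size(conv_side, 2, 1)  # 3x3 feature map, 2x2 pool, stride 1
pool_side_s2 = output_size(conv_side, 2, 2)  # same pool with stride 2

print(conv_side, pool_side, pool_side_s2)
```

With stride 2 the 2x2 pool fits only once per axis on a 3x3 map, so the last row and column would be ignored.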

In [39]:
flattened_output = cnn_model.flatten(maxPool)
flattened_output

array([4., 4., 4., 4.])

In [40]:
dropout_output = cnn_model.dropout(flattened_output, 0.3)
dropout_output

array([4., 4., 0., 4.])
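The mask is random, so the zeroed positions change from run to run. The `dropout` method above also leaves the surviving values unscaled. A common variant, often called inverted dropout, divides the kept values by the keep probability so the expected activation stays the same during training. The sketch below is an assumption about how that could look here, not the notebook's method; `inverted_dropout` and its `rng` parameter are hypothetical names.

```python
import numpy as np

def inverted_dropout(x, dropout_rate=0.0, rng=None):
    """Zero each element with probability dropout_rate and rescale the rest."""
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = 1.0 - dropout_rate
    mask = rng.random(x.shape) < keep_prob  # True where the unit is kept
    return x * mask / keep_prob             # kept values are scaled by 1 / keep_prob

x = np.array([4.0, 4.0, 4.0, 4.0])
out = inverted_dropout(x, 0.3, rng=np.random.default_rng(0))
print(out)  # each entry is either 0 or 4 / 0.7
```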