# **Convolutional Neural Networks (CNNs)**

### **Introduction to CNNs**

Convolutional Neural Networks (CNNs) are a class of deep neural networks that are particularly well-suited for tasks such as image recognition and classification. They are inspired by the visual processing of the human brain and are designed to automatically and adaptively learn spatial hierarchies of features from input data.

### **Architecture of CNNs:**

A CNN is built from a sequence of layers, each performing a specific operation. A common architecture looks like this:

![CNN Architecture](https://miro.medium.com/v2/resize:fit:720/format:webp/0*uo_4mk0hTHK77o7V.png)

### **Layers in CNNs:**

>**Convolutional Layers:** These layers apply convolutional operations to the input, using filters to extract important features.

>**Pooling Layers:** Pooling layers reduce the spatial dimensions of the feature maps.

>**Fully Connected Layers:** These layers connect every neuron in one layer to every neuron in the next layer, allowing the network to make predictions.
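
To make the roles of these three layer types concrete, here is a minimal NumPy sketch of a single forward pass. It is an illustration under simplifying assumptions, not how a framework implements it: one grayscale input, one filter, and random weights standing in for learned ones. The helper names `conv2d_valid` and `max_pool` are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling (stride equal to the window size)."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))    # one grayscale 8x8 input
kernel = rng.standard_normal((3, 3))   # one 3x3 filter (learned in practice)
dense_w = rng.standard_normal((9, 4))  # 9 pooled features -> 4 class scores

features = np.maximum(conv2d_valid(image, kernel), 0)  # convolution + ReLU
pooled = max_pool(features, size=2)                     # pooling
scores = pooled.reshape(-1) @ dense_w                   # fully connected

print(features.shape, pooled.shape, scores.shape)  # (6, 6) (3, 3) (4,)
```

The shapes shrink from 8x8 to 6x6 after the convolution and to 3x3 after pooling, and the fully connected step turns the 9 flattened values into 4 scores.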




### **Filters in CNNs:**

Filters (or kernels) are small matrices of learnable weights that are used to extract features from the input data. Each filter slides across the input and produces one feature map.
Convolutional layers use multiple filters to detect different patterns in the input.
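
As a small illustration of how a filter picks out a pattern, the sketch below applies a hand-picked vertical-edge filter to a tiny image. In a real CNN the filter values are learned during training. The image, the filter values and the helper name `apply_filter` are assumptions made up for this example.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products.

    Like CNN libraries, this is technically cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 5x5 image: dark (0) in the left two columns, bright (1) in the right three.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-picked 3x3 filter that responds to dark-to-bright changes from left to right.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

feature_map = apply_filter(image, vertical_edge)
print(feature_map)
# Every row is [3, 3, 0]: strong response where the window spans the edge,
# zero where the window sits entirely inside the bright region.
```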

### **Activation Functions:**

>**ReLU (Rectified Linear Unit):** The most commonly used activation function. ReLU replaces all negative values in the feature map with zero and leaves positive values unchanged.

>**Leaky ReLU:** A modified version of ReLU that applies a small positive slope to negative inputs instead of zeroing them, so the gradient there is non-zero. This helps prevent "dead" neurons that stop updating.
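
Both functions are short enough to write out directly. The NumPy sketch below is only an illustration: the slope `alpha` is an assumed value chosen for readability, and framework defaults may differ.

```python
import numpy as np

def relu(x):
    """Replace negative values with zero; keep positive values as they are."""
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    """Scale negative values by a small slope `alpha` instead of zeroing them."""
    return np.where(x > 0, x, alpha * x)

feature_map = np.array([[-2.0, 0.5],
                        [3.0, -0.1]])

print(relu(feature_map))                   # negatives become 0
print(leaky_relu(feature_map, alpha=0.1))  # negatives shrink to about -0.2 and -0.01
```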


---



# **Convolutional Operations and Pooling Layers in CNNs**
## **1. Convolutional Operations**
### Padding:

Padding adds extra pixels around the border of the input feature map before convolution, typically so that the output keeps the spatial dimensions of the input.

Two padding modes are common:

>**Valid Padding:** With valid padding, no padding is added to the input feature map, so the output feature map is smaller than the input. This is useful when we want to reduce the spatial dimensions of the feature maps.

>**Same Padding:** With same padding, enough padding is added that the output feature map has the same size as the input feature map (for a stride of 1). This is useful when we want to preserve the spatial dimensions of the feature maps.

The number of pixels to add is calculated from the kernel size and the desired output size. The most common choice is zero-padding, which adds zeros around the borders of the input feature map.

Padding reduces the loss of information at the borders of the input feature map, since border pixels would otherwise be covered by fewer filter positions, and this can improve the performance of the model. However, it also increases the computational cost of the convolution operation.
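
The amount of padding needed for the "same" mode can be computed from the input size, kernel size and stride. The sketch below assumes the convention used by TensorFlow's `'same'` mode, where the output size is `ceil(n / s)` and any odd leftover pixel goes on the bottom or right. The helper name `same_padding` is hypothetical.

```python
import math

def same_padding(n, k, s=1):
    """Return (pad_before, pad_after) along one dimension.

    n: input size, k: kernel size, s: stride.
    Assumes the output size should be ceil(n / s).
    """
    out = math.ceil(n / s)
    total = max((out - 1) * s + k - n, 0)
    before = total // 2
    return before, total - before

print(same_padding(3, 3))       # (1, 1)
print(same_padding(5, 3, s=2))  # (1, 1)
print(same_padding(4, 2))       # (0, 1)
```

For a 3x3 input and a 3x3 kernel this gives one pixel on each side, which is the `[[1, 1], [1, 1]]` padding applied with `tf.pad` in the cell that follows.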

In [1]:
import tensorflow as tf

# Create a sample 2D tensor (e.g., an image)
input_tensor = tf.constant([[1, 2, 3],
                            [4, 5, 6],
                            [7, 8, 9]])

# Specify the padding sizes for each dimension (height, width)
padding_sizes = [[1, 1], [1, 1]]

# Apply zero-padding to the tensor
padded_tensor = tf.pad(input_tensor, padding_sizes, "CONSTANT")

# Print the original and padded tensors
print("Original Tensor:")
print(input_tensor)
print("\nPadded Tensor:")
print(padded_tensor)

Original Tensor:
tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int32)

Padded Tensor:
tf.Tensor(
[[0 0 0 0 0]
 [0 1 2 3 0]
 [0 4 5 6 0]
 [0 7 8 9 0]
 [0 0 0 0 0]], shape=(5, 5), dtype=int32)


### Strides:
Stride is a parameter that dictates how the kernel, or filter, moves across the input data, such as an image. During a convolution, the stride determines how many units the filter shifts at each step. It can be set separately for the horizontal and vertical directions.

>For example, a stride of 1 moves the filter one pixel at a time, while a stride of 2 moves it two pixels. A larger stride will produce a smaller output dimension, effectively downsampling the image.
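
The effect of the stride on the output size follows the standard formula `floor((n + 2p - k) / s) + 1`, where `n` is the input size, `k` the kernel size, `p` the padding per side and `s` the stride. A minimal sketch, with `conv_output_size` as a hypothetical helper name:

```python
def conv_output_size(n, k, s=1, p=0):
    """Output size along one dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

print(conv_output_size(5, 3, s=1))  # 3
print(conv_output_size(5, 3, s=2))  # 2
print(conv_output_size(7, 3, s=2))  # 3
```

The second call uses the same settings as the `Conv2D` cell that follows (5x5 input, 3x3 kernel, stride 2, valid padding), which is why that layer produces a 2x2 spatial output.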

In [9]:
import tensorflow as tf
import numpy as np

# Create a sample 4D tensor (batch_size, height, width, channels)
input_tensor = tf.constant(np.random.randn(1, 5, 5, 3), dtype=tf.float32)

# Define the convolutional layer with a specific stride
conv_layer = tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), strides=(2, 2), padding='valid', activation='relu')

# Apply the convolutional layer to the input tensor
output_tensor = conv_layer(input_tensor)

# Print the shapes of the input and output tensors
print("Input Tensor Shape:", input_tensor.shape)
print("Output Tensor Shape:", output_tensor.shape)

Input Tensor Shape: (1, 5, 5, 3)
Output Tensor Shape: (1, 2, 2, 16)


### Dilation:
Dilated convolution, also known as atrous convolution, expands the kernel by inserting holes (gaps) between its consecutive elements. In simpler terms, it is the same as convolution but skips pixels, so the kernel covers a larger area of the input.

This gives the network a larger receptive field without increasing the number of parameters.
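
One way to picture the "holes" is to build the dilated kernel explicitly. The NumPy sketch below assumes a square kernel and the same dilation rate in both directions. `dilate_kernel` is a hypothetical helper for illustration, not a library function.

```python
import numpy as np

def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between neighbouring elements of a square kernel."""
    k = kernel.shape[0]
    k_eff = k + (k - 1) * (rate - 1)  # effective kernel size after dilation
    dilated = np.zeros((k_eff, k_eff), dtype=kernel.dtype)
    dilated[::rate, ::rate] = kernel
    return dilated

kernel = np.array([[1, 2],
                   [3, 4]])

print(dilate_kernel(kernel, rate=2))
# [[1 0 2]
#  [0 0 0]
#  [3 0 4]]
```

The kernel still has only four weights, but it now spans a 3x3 area. This is why a 2x2 kernel with a dilation rate of 2 applied to a 3x3 input with valid padding yields a single output value.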

In [10]:
import tensorflow as tf

# Create a sample 2D tensor (height, width)
input_tensor = tf.constant([[1, 2, 3],
                            [4, 5, 6],
                            [7, 8, 9]], dtype=tf.float32)

# Reshape the tensor to have batch and channel dimensions
input_tensor = tf.reshape(input_tensor, (1, 3, 3, 1))

# Define the convolutional layer with dilation
conv_layer = tf.keras.layers.Conv2D(filters=1, kernel_size=(2, 2), dilation_rate=(2, 2), padding='valid', activation='relu')

# Apply the convolutional layer to the input tensor
output_tensor = conv_layer(input_tensor)

# Print the shapes of the input and output tensors
print("Input Tensor Shape:", input_tensor.shape)
print("Output Tensor Shape:", output_tensor.shape)


Input Tensor Shape: (1, 3, 3, 1)
Output Tensor Shape: (1, 1, 1, 1)


## **2. Pooling Layers**
The pooling operation slides a two-dimensional filter over each channel of the feature map and summarises the features lying within the region covered by the filter.

Types of Pooling Layers:
### Max Pooling:
Max pooling selects the maximum element from the region of the feature map covered by the filter. Thus, the output of a max-pooling layer is a feature map containing the most prominent features of the previous feature map.
![picture](https://media.geeksforgeeks.org/wp-content/uploads/20190721025744/Screenshot-2019-07-21-at-2.57.13-AM.png)

### Average Pooling:

Average pooling computes the average of the elements in the region of the feature map covered by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the feature map, average pooling gives the average of the features present in that patch.
![picture](https://media.geeksforgeeks.org/wp-content/uploads/20190721030705/Screenshot-2019-07-21-at-3.05.56-AM.png)

Max pooling on a 3×3 input with a 2×2 window and a stride of 1:

In [11]:
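import tensorflow as tf

# Sample 3x3 feature map, reshaped to (batch_size, height, width, channels)
x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.],
                 [7., 8., 9.]])
x = tf.reshape(x, [1, 3, 3, 1])

# 2x2 window moved one pixel at a time; each output is the window's maximum
max_pool_2d = tf.keras.layers.MaxPooling2D(pool_size=(2, 2),
   strides=(1, 1), padding='valid')
max_pool_2d(x)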

<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[5.],
         [6.]],

        [[8.],
         [9.]]]], dtype=float32)>

Average pooling on a 3×3 input with a 2×2 window and a stride of 1:

In [12]:
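import tensorflow as tf

# Sample 3x3 feature map, reshaped to (batch_size, height, width, channels)
x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.],
                 [7., 8., 9.]])
x = tf.reshape(x, [1, 3, 3, 1])

# 2x2 window moved one pixel at a time; each output is the window's mean
avg_pool_2d = tf.keras.layers.AveragePooling2D(pool_size=(2, 2),
   strides=(1, 1), padding='valid')
avg_pool_2d(x)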

<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[3.],
         [4.]],

        [[6.],
         [7.]]]], dtype=float32)>
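
The two results above can be reproduced by hand, which shows what the sliding window does. This is a minimal NumPy sketch for a single-channel feature map with valid padding. `pool2d` is a hypothetical helper, not part of TensorFlow.

```python
import numpy as np

def pool2d(feature_map, size, stride, reduce_fn):
    """Slide a size x size window over a 2D feature map and reduce each window."""
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = reduce_fn(feature_map[r:r + size, c:c + size])
    return out

x = np.arange(1, 10, dtype=float).reshape(3, 3)  # same values as the cells above

print(pool2d(x, size=2, stride=1, reduce_fn=np.max))
# [[5. 6.]
#  [8. 9.]]
print(pool2d(x, size=2, stride=1, reduce_fn=np.mean))
# [[3. 4.]
#  [6. 7.]]
```

Passing the reduction as a function keeps the window logic in one place, so the only difference between max and average pooling is the choice of `np.max` or `np.mean`.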