<a href="https://colab.research.google.com/github/geonextgis/End-to-End-Deep-Learning/blob/main/02_CNN/03_Padding_and_Strides_in_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Padding and Strides in CNN**

## **What is Padding?**

Padding is a technique used to preserve spatial information during the convolutional and pooling operations. It involves adding extra pixels (usually with a value of zero) around the borders of an input feature map or image.

The main purposes of padding are:

1. **Preserving Spatial Information:**
   - Without padding, the spatial dimensions of the feature map decrease with each convolutional layer, potentially leading to a significant reduction in information at the edges.
   - Padding helps maintain the spatial size, ensuring that information near the borders is given proper consideration.

2. **Mitigating the Loss of Information:**
   - In the absence of padding, the pixels at the edges of the feature map are involved in fewer convolution operations, leading to a loss of information.
   - Padding ensures that each pixel in the input has the opportunity to be the center of the receptive field for convolutional filters.

3. **Handling Stride and Filter Size:**
   - Padding becomes especially useful when using larger filter sizes or strides greater than 1. Without padding, the spatial size reduction becomes more pronounced.

<center><img src="https://miro.medium.com/v2/resize:fit:1358/1*D6iRfzDkz-sEzyjYoVZ73w.gif" width="70%"></center>
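The point about edge pixels taking part in fewer convolution operations can be checked by counting. The sketch below is illustrative only: `coverage_counts` is a hypothetical helper (not part of Keras) that assumes a square input, a square filter, and a stride of 1, and counts how many filter windows include each input pixel.

```python
def coverage_counts(n, k, p):
    """For each pixel of an n x n input, count how many k x k filter windows
    (stride 1) include it when p zero-pixels are added on every side."""
    out = n + 2 * p - k + 1  # number of filter positions along one axis

    def per_axis(i):
        pos = i + p                   # index of the pixel inside the padded input
        first = max(0, pos - k + 1)   # first window start that still covers pos
        last = min(out - 1, pos)      # last window start that covers pos
        return last - first + 1

    # Rows and columns are independent, so the 2D count is a product.
    return [[per_axis(r) * per_axis(c) for c in range(n)] for r in range(n)]


no_pad = coverage_counts(n=5, k=3, p=0)
padded = coverage_counts(n=5, k=3, p=1)

print("corner, no padding:", no_pad[0][0])  # 1
print("center, no padding:", no_pad[2][2])  # 9
print("corner, padding=1: ", padded[0][0])  # 4
print("center, padding=1: ", padded[2][2])  # 9
```

Without padding, a corner pixel of the 5 x 5 input is seen by a single window while the center pixel is seen by nine; one pixel of padding raises the corner count to four.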

## **Types of Padding in Keras**
In Keras, a popular deep learning library, you can specify different types of padding for convolutional layers. The main types of padding available in Keras are:

1. **Valid Padding (No Padding):**
   - This is the default setting in Keras.
   - No padding is added to the input feature map.
   - The convolution operation is applied only to the valid part of the input.

<center><img src="https://upload.wikimedia.org/wikipedia/commons/7/78/Valid-padding-convolution.gif" width="30%"></center>

2. **Same Padding (Zero Padding):**
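   - Zeros are added around the input feature map so that, with a stride of 1, the output feature map has the same spatial dimensions as the input. With a stride $s > 1$, the output size is $\lceil \text{Input Size} / s \rceil$.
   - The padding is split as evenly as possible between opposite sides. When the total padding is odd, TensorFlow places the extra row or column at the bottom/right.
   - Useful for preserving spatial information and handling larger filter sizes.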

<center><img src="https://miro.medium.com/v2/resize:fit:679/1*SsKCClCa9xVxIoaocVY6Ww.gif" width="60%"></center>
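As a rough sketch of how much padding `"same"` adds along one axis, the function below follows the rule described in the TensorFlow documentation (output size is the input size divided by the stride, rounded up). The name `same_padding` is a hypothetical helper introduced here for illustration, not a Keras function.

```python
import math


def same_padding(input_size, filter_size, stride=1):
    """Return (pad_before, pad_after) along one axis for "same" padding,
    assuming the TensorFlow convention."""
    out = math.ceil(input_size / stride)  # target output size
    # Total zeros needed so that `out` filter positions fit on the padded axis.
    total = max((out - 1) * stride + filter_size - input_size, 0)
    before = total // 2       # top / left
    after = total - before    # bottom / right gets the extra pixel if total is odd
    return before, after


print(same_padding(28, 3, stride=1))  # (1, 1)
print(same_padding(28, 5, stride=1))  # (2, 2)
print(same_padding(28, 3, stride=2))  # (0, 1)
```

For a 3 x 3 filter with stride 1 this gives one pixel on each side; with stride 2 on a 28-pixel axis only one pixel is needed in total, so the two sides differ.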

## **Calculation of Feature Map Size**
When the stride $(\text{Stride})$ is set to 1 (the filter moves one pixel at a time, skipping none), the feature map size after a convolution with padding is:

$$\text{Output Size} = \text{Input Size} + 2 \times \text{Padding} - \text{Filter Size} + 1$$

The terms in the formula are:

- $\text{Output Size}$: The size of the feature map after the convolution operation with padding and stride set to 1.
- $\text{Input Size}$: The size of the input (or previous layer's feature map).
- $\text{Filter Size}$: The size of the convolutional filter (kernel).
- $\text{Padding}$: The number of zero-padding pixels added to the input on each side.
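As a worked example (the numbers are chosen here for illustration), take a 28-pixel-wide input and a 3-pixel-wide filter. With no padding:

$$\text{Output Size} = 28 + 2 \times 0 - 3 + 1 = 26$$

To keep the output the same size as the input, the padding must satisfy $2 \times \text{Padding} - \text{Filter Size} + 1 = 0$, which for an odd filter size gives $\text{Padding} = (\text{Filter Size} - 1) / 2$. For the 3-pixel filter this is 1 pixel per side:

$$\text{Output Size} = 28 + 2 \times 1 - 3 + 1 = 28$$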

## **Implementation of Padding in Keras**

### **Import Required Libraries**

```python
import tensorflow
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten
from tensorflow.keras.datasets import mnist

print(tensorflow.__version__)
```

```
2.15.0
```


### **Read the Data**

```python
# Load the MNIST handwritten digits (28 x 28 grayscale images) as train/test splits
(X_train, y_train), (X_test, y_test) = mnist.load_data()
```

### **Build the Model Architecture with `valid` Padding**

```python
# Build the model architecture with 'valid' padding in the convolution layers
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), padding="valid", activation="relu", input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), padding="valid", activation="relu"))
model.add(Conv2D(32, kernel_size=(3, 3), padding="valid", activation="relu"))

model.add(Flatten())

model.add(Dense(128, activation="relu"))
model.add(Dense(10, activation="softmax"))
```

```python
# Print the model's summary
model.summary()
```

```
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        320       

 conv2d_1 (Conv2D)           (None, 24, 24, 32)        9248      

 conv2d_2 (Conv2D)           (None, 22, 22, 32)        9248      

 flatten (Flatten)           (None, 15488)             0         

 dense (Dense)               (None, 128)               1982592   

 dense_1 (Dense)             (None, 10)                1290      

Total params: 2002698 (7.64 MB)
Trainable params: 200269
```
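The `Param #` column can be reproduced by hand. The sketch below uses two hypothetical helpers, `conv2d_params` and `dense_params`, which assume every layer has one bias per filter or unit (the Keras default).

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of one filter (kernel_h * kernel_w * in_channels) plus its bias,
    repeated for every filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_features, units):
    """One weight per input feature plus one bias, for every unit."""
    return (in_features + 1) * units


layer_params = [
    conv2d_params(3, 3, 1, 32),       # first conv: 1 input channel
    conv2d_params(3, 3, 32, 32),      # second conv: 32 input channels
    conv2d_params(3, 3, 32, 32),      # third conv
    dense_params(22 * 22 * 32, 128),  # Flatten output feeds the first Dense layer
    dense_params(128, 10),            # output layer
]

print(layer_params)       # [320, 9248, 9248, 1982592, 1290]
print(sum(layer_params))  # 2002698
```

Almost all of the parameters sit in the first `Dense` layer, whose size is set by the 22 x 22 x 32 feature map that reaches `Flatten`.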

### **Build the Model Architecture with `same/zero` Padding**

```python
# Build the model architecture with 'same' (zero) padding in the convolution layers
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), padding="same", activation="relu", input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(32, kernel_size=(3, 3), padding="same", activation="relu"))

model.add(Flatten())

model.add(Dense(128, activation="relu"))
model.add(Dense(10, activation="softmax"))
```

```python
# Print the model's summary
model.summary()
```

```
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_3 (Conv2D)           (None, 28, 28, 32)        320       

 conv2d_4 (Conv2D)           (None, 28, 28, 32)        9248      

 conv2d_5 (Conv2D)           (None, 28, 28, 32)        9248      

 flatten_1 (Flatten)         (None, 25088)             0         

 dense_2 (Dense)             (None, 128)               3211392   

 dense_3 (Dense)             (None, 10)                1290      

Total params: 3231498 (12.33 MB)
Trainable params: 323
```

## **What are Strides?**
In a CNN, the stride is the step size: the number of pixels the convolutional filter (kernel) moves between consecutive applications to the input. The stride therefore determines how densely the input is sampled and directly affects the spatial dimensions of the output feature map. A **strided convolution** uses a stride greater than 1, so the filter moves more than one pixel at a time while scanning the input.

Key points about strides:

1. **Stride Value:**
   - Strides are usually set as positive integers.
   - Common values are 1, indicating that the filter moves one pixel at a time, and 2, indicating that the filter moves two pixels at a time.
   - Larger stride values result in a more aggressive reduction of the spatial dimensions.

2. **Effect on Output Size:**
   - Increasing the stride reduces the spatial dimensions of the output feature map.
   - Smaller strides produce larger feature maps, which increases the computation and memory needed by the layers that follow.

3. **Strides and Subsampling:**
   - Strides can be used to achieve subsampling or down-sampling by skipping pixels during the convolution.
   - Subsampling reduces the computational load, at the cost of discarding some fine spatial detail.

**Example of a Convolution Operation when the Stride is set to 2:**
<br>
<center><img src="https://miro.medium.com/v2/resize:fit:679/0*0LMdR2rvJAlRHC3m.gif" width="40%"></center>
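To make the stepping concrete, here is a minimal pure-Python sketch of a convolution without padding. `conv2d_valid` is a hypothetical helper written for this example (it is far slower than a Keras layer and handles a single channel only); the 5 x 5 input and the all-ones filter are made-up values.

```python
def conv2d_valid(image, kernel, stride=1):
    """Naive 2D cross-correlation (the operation CNN layers compute),
    with no padding and the same stride along both axes."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    # The window's top-left corner advances `stride` pixels at a time.
    for top in range(0, n_rows - k_rows + 1, stride):
        row = []
        for left in range(0, n_cols - k_cols + 1, stride):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


image = [[r * 5 + c for c in range(5)] for r in range(5)]  # values 0..24
kernel = [[1, 1, 1] for _ in range(3)]                     # 3 x 3 sum filter

dense = conv2d_valid(image, kernel, stride=1)    # 3 x 3 output
strided = conv2d_valid(image, kernel, stride=2)  # 2 x 2 output

print(len(dense), len(dense[0]))  # 3 3
print(strided)                    # [[54, 72], [144, 162]]
```

The stride-2 result contains every second value of the stride-1 result along each axis, which is why a strided convolution acts as a down-sampling step.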

## **Calculation of Feature Map Size**
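The effect of strides on the output size can be described by the following formula. The division is rounded down, because a filter position that would extend past the (padded) input is dropped:

$$\text{Output Size} = \left\lfloor \frac{\text{Input Size} + 2 \times \text{Padding} - \text{Filter Size}}{\text{Stride}} \right\rfloor + 1$$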

Here are the terms in the formula:

- $\text{Output Size}$: The size of the feature map after the convolution operation.
- $\text{Input Size}$: The size of the input (or previous layer's feature map).
- $\text{Filter Size}$: The size of the convolutional filter (kernel).
- $\text{Padding}$: The number of zero-padding pixels added to the input on each side.
- $\text{Stride}$: The step size or the number of pixels the filter moves at each step during convolution.
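The formula translates directly into a small function. This is a sketch that assumes the same padding on both sides of the axis and rounds the division down; `conv_output_size` is a hypothetical name used only here. Keras `"same"` padding can be asymmetric, so it does not always correspond to a whole number of padding pixels per side.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output length along one axis, with the division rounded down."""
    return (input_size + 2 * padding - filter_size) // stride + 1


print(conv_output_size(28, 3, padding=0, stride=1))  # 26
print(conv_output_size(28, 3, padding=1, stride=1))  # 28
print(conv_output_size(28, 3, padding=0, stride=2))  # 13
print(conv_output_size(7, 3, padding=1, stride=2))   # 4
```

In the third call, $(28 - 3) / 2 = 12.5$ is rounded down to 12, so the last column of the input is never covered by the filter.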

## **Why are Strides Required?**
Strides greater than 1 are useful in convolutional neural networks (CNNs) for several reasons:

1. **Downsampling and Efficiency:**
   - Strides enable downsampling of the input, reducing the spatial dimensions of the feature maps.
   - Downsampling is crucial for efficiency, reducing computational complexity and memory requirements.

2. **Feature Extraction:**
   - Larger strides skip filter positions, so each value in the output summarizes a region of the input that overlaps less with its neighbors.
   - Because the feature maps shrink, later layers cover a larger portion of the original image with the same filter size, which helps them capture higher-level features.

3. **Control Over Model Complexity:**
   - Strides provide a way to control the complexity of the model by influencing the spatial dimensions of the feature maps.
   - They allow practitioners to balance between capturing fine-grained details and computational efficiency.

In summary, strides are essential for controlling the trade-off between computational efficiency and feature representation in CNNs. They allow practitioners to tailor the network architecture to the specific requirements of the task at hand, ensuring effective feature extraction and model efficiency.
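The efficiency argument can be made concrete by counting multiply-accumulate operations (MACs) in a stack of convolution layers. The sketch below is a back-of-the-envelope estimate, not a profiler: `same_conv_stack_macs` is a hypothetical helper that assumes `"same"` padding, square inputs, 3 x 3 filters, and three layers with 32 filters each on a 28 x 28 single-channel input.

```python
import math


def same_conv_stack_macs(input_size, stride, kernel=3, channels=(1, 32, 32, 32)):
    """Estimate multiply-accumulates for consecutive "same"-padded conv layers."""
    size, total = input_size, 0
    for c_in, c_out in zip(channels, channels[1:]):
        size = math.ceil(size / stride)  # output side length of this layer
        # One kernel*kernel*c_in dot product per output pixel and per filter.
        total += size * size * kernel * kernel * c_in * c_out
    return total


macs_stride_1 = same_conv_stack_macs(28, stride=1)
macs_stride_2 = same_conv_stack_macs(28, stride=2)

print(macs_stride_1)  # 14676480
print(macs_stride_2)  # 655488
```

Under these assumptions, the stride-2 stack needs only a small fraction of the multiply-accumulates of the stride-1 stack, because every layer after a strided convolution works on a smaller feature map.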

## **Implementation of Strides in Keras**

```python
# Build the model architecture with (2, 2) strides in the convolution layers
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3), strides=(2, 2), padding="same", activation="relu", input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), strides=(2, 2), padding="same", activation="relu"))
model.add(Conv2D(32, kernel_size=(3, 3), strides=(2, 2), padding="same", activation="relu"))

model.add(Flatten())

model.add(Dense(128, activation="relu"))
model.add(Dense(10, activation="softmax"))
```

```python
# Print the model's summary
model.summary()
```

```
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_9 (Conv2D)           (None, 14, 14, 32)        320       

 conv2d_10 (Conv2D)          (None, 7, 7, 32)          9248      

 conv2d_11 (Conv2D)          (None, 4, 4, 32)          9248      

 flatten_3 (Flatten)         (None, 512)               0         

 dense_6 (Dense)             (None, 128)               65664     

 dense_7 (Dense)             (None, 10)                1290      

Total params: 85770 (335.04 KB)
Trainable params: 8577
```
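To see why the strided model is so much smaller, the sketch below recomputes the `Flatten` width and the size of the first `Dense` layer for the three configurations built in this notebook. `flatten_and_dense` is a hypothetical helper; it assumes three 3 x 3 convolution layers with 32 filters, a 28 x 28 input, and a 128-unit `Dense` layer with biases.

```python
import math


def flatten_and_dense(padding, stride, input_size=28, n_layers=3,
                      filter_size=3, filters=32, units=128):
    """Return (flatten_width, first_dense_params) for a small conv stack."""
    size = input_size
    for _ in range(n_layers):
        if padding == "same":
            size = math.ceil(size / stride)
        else:  # "valid": no padding, division rounded down
            size = (size - filter_size) // stride + 1
    flat = size * size * filters
    return flat, (flat + 1) * units  # +1 for the bias of each unit


for name, padding, stride in [("valid, stride 1", "valid", 1),
                              ("same, stride 1", "same", 1),
                              ("same, stride 2", "same", 2)]:
    print(name, flatten_and_dense(padding, stride))
# valid, stride 1 (15488, 1982592)
# same, stride 1 (25088, 3211392)
# same, stride 2 (512, 65664)
```

The convolution layers have the same number of parameters in all three models; the difference in total size comes from the feature map that reaches the first `Dense` layer.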