Deep-Learning---Padding-Striding-and-Pooling

CNN Core Concepts: Padding, Stride, Pooling, and Activation 🧠


While previous labs have shown that convolutions can detect features like edges and corners, it can be challenging to build an intuition for how the configuration of a convolutional layer affects the shape of its output. In a Convolutional Neural Network (CNN), having a concrete understanding of the output size of each layer is essential for building effective architectures.

This lab dives into four important factors to consider when working with CNNs: Padding, Stride, Pooling, and Activation Functions.


🎯 Objectives

After completing this lab, you will be able to:

  • Understand the use of padding and stride in CNNs.
  • Calculate the size of the output of a convolutional or pooling layer.
  • Understand the necessity of activation functions.
  • Describe the difference between Max Pooling and Average Pooling.

🛠️ Key Concepts Explained

1. Padding

Padding is the process of adding extra pixels (usually zeros) around the border of an input image before applying a convolution.

Why use it?

  • Preserve Spatial Dimensions: Without padding, each convolutional layer would shrink the size of the feature map. Padding allows us to maintain the size of the output, enabling deeper networks.
  • Improve Border Coverage: Without padding, pixels at the edges of the image fall under fewer filter positions than pixels in the center. Padding lets the filter cover edge pixels about as often as interior ones.

Types:

  • Valid Padding: No padding is added. The output size will be smaller than the input.
  • Same Padding: Padding is added so that the output feature map has the same spatial dimensions as the input (assuming a stride of 1).
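
For instance, a minimal Keras sketch of both modes (the 28x28 input and filter count are illustrative choices, not values from the lab):

```python
import tensorflow as tf

x = tf.random.normal((1, 28, 28, 1))  # one 28x28 single-channel image

# 'valid': no padding, so a 3x3 kernel shrinks the map: 28 - 3 + 1 = 26.
valid = tf.keras.layers.Conv2D(filters=8, kernel_size=3, padding="valid")(x)
print(valid.shape)  # (1, 26, 26, 8)

# 'same': zeros are added around the border, so the 28x28 size is preserved.
same = tf.keras.layers.Conv2D(filters=8, kernel_size=3, padding="same")(x)
print(same.shape)  # (1, 28, 28, 8)
```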

2. Stride

Stride is the number of pixels the convolutional kernel shifts over the input matrix at each step. A stride of (1, 1) means the filter moves one pixel at a time, horizontally and vertically. A stride of (2, 2) means it moves two pixels at a time, effectively downsampling the image.
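
A sketch of the same layer with two different strides (again with illustrative sizes):

```python
import tensorflow as tf

x = tf.random.normal((1, 28, 28, 1))

# Stride (1, 1): with 'same' padding the spatial size stays 28x28.
s1 = tf.keras.layers.Conv2D(8, 3, strides=(1, 1), padding="same")(x)
print(s1.shape)  # (1, 28, 28, 8)

# Stride (2, 2): the filter skips every other position, halving each
# spatial dimension to 14x14.
s2 = tf.keras.layers.Conv2D(8, 3, strides=(2, 2), padding="same")(x)
print(s2.shape)  # (1, 14, 14, 8)
```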

3. Calculating Output Size

You can precisely calculate the output dimensions of a convolutional layer using the following formula:

Output Height = floor( (Input Height - Kernel Height + 2 * Padding) / Stride ) + 1
Output Width  = floor( (Input Width - Kernel Width + 2 * Padding) / Stride ) + 1

Here, Padding is the number of pixels added to each side of the input, and the floor drops any leftover positions where the kernel would run off the edge.
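
The formula is easy to check in plain Python; conv_output_size below is a small helper written here for illustration:

```python
import math

def conv_output_size(input_size, kernel_size, padding, stride):
    """One spatial dimension of the output-size formula above."""
    return math.floor((input_size - kernel_size + 2 * padding) / stride) + 1

# 28x28 input, 3x3 kernel, no padding, stride 1:
# floor((28 - 3 + 0) / 1) + 1 = 26
print(conv_output_size(28, 3, padding=0, stride=1))  # 26

# Padding of 1 per side ('same' for a 3x3 kernel) preserves the size:
print(conv_output_size(28, 3, padding=1, stride=1))  # 28
```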

4. Pooling Layers

Pooling layers are used to reduce the spatial dimensions (downsample) of the feature maps. This reduces the number of parameters and computational complexity, and also helps to make the network more robust to variations in the position of features.

  • Max Pooling: Selects the maximum value from each patch of the feature map. It's effective at capturing the most prominent or intense features.
  • Average Pooling: Calculates the average of the values in each patch. It provides a more generalized, smoothed-out representation of the features.
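
A sketch contrasting the two operations on a tiny hand-made 4x4 feature map, where the values 1-16 make the difference easy to see:

```python
import tensorflow as tf

x = tf.reshape(tf.range(1.0, 17.0), (1, 4, 4, 1))  # 4x4 map holding 1..16

# Max pooling keeps the largest value in each 2x2 patch.
max_out = tf.keras.layers.MaxPooling2D(pool_size=2)(x)
print(tf.squeeze(max_out).numpy())   # [[ 6.  8.] [14. 16.]]

# Average pooling replaces each 2x2 patch with its mean.
avg_out = tf.keras.layers.AveragePooling2D(pool_size=2)(x)
print(tf.squeeze(avg_out).numpy())   # [[ 3.5  5.5] [11.5 13.5]]
```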

5. Activation Functions

An activation function introduces non-linearity into the model. Without non-linearity, a deep stack of convolutional layers would behave like a single, simple linear function, severely limiting its ability to learn the complex patterns found in data like images. A common activation function used in CNNs is the Rectified Linear Unit (ReLU), which outputs the input directly if it is positive, and zero otherwise.
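
In NumPy, ReLU is a one-liner:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
relu = np.maximum(0.0, x)  # ReLU: element-wise max(0, x)
print(relu)  # [0.  0.  0.  1.5 3. ]
```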


πŸ“ Lab Workflow

  1. Visualize Padding & Stride: Manipulate these parameters on a sample input matrix to see their direct effect on the output shape.
  2. Manual Calculation: Use the output size formula to predict the dimensions of a convolutional layer's output.
  3. Keras Verification: Build a simple Keras model with a Conv2D layer to verify your manual calculations.
  4. Compare Pooling Methods: Apply MaxPooling2D and AveragePooling2D layers to a feature map and compare their outputs.
  5. Role of Activation: Discuss and demonstrate how adding an activation function like ReLU changes the output of a layer.
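
As a preview of steps 3-5, a minimal verification sketch might look like this (layer sizes are illustrative, not the lab's exact configuration):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    # Step 3: a 'valid' 3x3 convolution -> 26x26x8, with ReLU (step 5) built in.
    tf.keras.layers.Conv2D(8, 3, padding="valid", activation="relu"),
    # Step 4: 2x2 max pooling halves the spatial size -> 13x13x8.
    tf.keras.layers.MaxPooling2D(pool_size=2),
])
model.summary()  # compare the printed shapes with the formula in section 3
```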
