
# Introduction to Pooling in Convolutional Neural Networks (CNNs)

Pooling is a fundamental operation in CNNs that reduces the spatial dimensions of feature maps while retaining their most relevant features. By summarizing local regions of the feature map, pooling helps manage computational complexity, mitigate overfitting, and introduce a degree of translational invariance.

Purpose:  
Convolutional layers extract features from the input through learnable filters. Without dimensionality reduction, the spatial dimensions remain large, leading to excessive computational cost and a high risk of overfitting. Pooling provides a solution by downsampling the feature maps, focusing the network on salient features and ignoring unnecessary spatial details.

Types of Pooling:  
1. Max Pooling: Selects the maximum value within a defined local window (e.g., $2 \times 2$).  
2. Average Pooling: Computes the average value of the elements in the local window.  
3. Other Variants: L2-norm pooling or global pooling methods like Global Average Pooling (GAP), which aggregates information over the entire spatial dimension of the feature map.
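Max and average pooling are implemented later in this notebook; L2-norm pooling is not, so here is a minimal sketch of it. It assumes the common definition (square root of the sum of squares in each window), non-overlapping windows, and dimensions divisible by the window size. The name `l2_pooling` is illustrative and not part of any library.

```python
import numpy as np

def l2_pooling(matrix, pool_size=2):
    """L2-norm pooling over non-overlapping pool_size x pool_size windows.

    Assumes stride == pool_size and that both dimensions are divisible
    by pool_size (an assumption of this sketch, not a general requirement).
    """
    rows, cols = matrix.shape
    # Reshape to (rows/k, k, cols/k, k) so that axes 1 and 3 index the
    # positions inside each window.
    blocks = matrix.astype(float).reshape(rows // pool_size, pool_size,
                                          cols // pool_size, pool_size)
    # Square, sum inside each window, then take the square root.
    return np.sqrt((blocks ** 2).sum(axis=(1, 3)))

feature_map = np.array([
    [1, 3, 2, 1],
    [5, 6, 7, 8],
    [9, 2, 4, 6],
    [1, 0, 2, 3],
])

l2_pooled = l2_pooling(feature_map)
print(l2_pooled)
```

For the top-left window the result is $\sqrt{1^2 + 3^2 + 5^2 + 6^2} = \sqrt{71}$. Large activations dominate the result, which places L2-norm pooling between average and max pooling in behavior.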

Mathematical Formulations:  
Let $X$ be an input feature map of dimensions $H \times W \times D$.  
Define a pooling window of size $k \times k$, and a stride $s$.

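The output dimensions (with no padding) are given by:  
$$H' = \left\lfloor \frac{H - k}{s} \right\rfloor + 1$$  
$$W' = \left\lfloor \frac{W - k}{s} \right\rfloor + 1$$  
The depth $D$ is unchanged, because pooling is applied to each channel independently.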
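The output-size formula can be checked with a few numbers. The helper below is a small sketch that assumes no padding and uses integer (floor) division, so a trailing row or column that does not fill a complete window is dropped; `pooled_size` is a hypothetical name chosen for this example.

```python
def pooled_size(n, k, s):
    """Output length along one axis for input length n, window k, stride s (no padding)."""
    if k > n:
        raise ValueError("window is larger than the input")
    return (n - k) // s + 1

print(pooled_size(4, 2, 2))    # 2
print(pooled_size(5, 2, 2))    # 2   (the last row/column is dropped)
print(pooled_size(7, 3, 2))    # 3   (overlapping windows)
print(pooled_size(224, 2, 2))  # 112
```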

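For Max Pooling:  
$$\text{PoolingOutput}(x,y) = \max_{(i,j) \in \mathcal{R}_{x,y}} X(i,j)$$

For Average Pooling:  
$$\text{PoolingOutput}(x,y) = \frac{1}{|\mathcal{R}_{x,y}|}\sum_{(i,j)\in \mathcal{R}_{x,y}} X(i,j)$$  
Here, $\mathcal{R}_{x,y}$ is the $k \times k$ window of input positions that maps to output position $(x,y)$: rows $xs$ to $xs + k - 1$ and columns $ys$ to $ys + k - 1$ (zero-indexed), so $|\mathcal{R}_{x,y}| = k^2$.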
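The formulas above are written for a single channel, while the input is $H \times W \times D$. A possible vectorized sketch that pools every channel independently is shown below. It relies on `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later); the function name `pool2d` and the two-channel test input are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pool2d(x, k=2, s=2, mode="max"):
    """Pool an H x W x D array channel by channel (no padding)."""
    # windows has shape (H-k+1, W-k+1, D, k, k): one k x k patch per
    # spatial position and channel.
    windows = sliding_window_view(x, (k, k), axis=(0, 1))
    # Keep only the window positions that lie on the stride grid.
    windows = windows[::s, ::s]
    if mode == "max":
        return windows.max(axis=(-2, -1))
    if mode == "avg":
        return windows.mean(axis=(-2, -1))
    raise ValueError("mode should be 'max' or 'avg'")

channel = np.array([
    [1, 3, 2, 1],
    [5, 6, 7, 8],
    [9, 2, 4, 6],
    [1, 0, 2, 3],
], dtype=float)
# Illustrative input: the second channel is a scaled copy of the first.
x = np.stack([channel, 10 * channel], axis=-1)   # shape (4, 4, 2)

max_out = pool2d(x, mode="max")
avg_out = pool2d(x, mode="avg")
print(max_out.shape)     # (2, 2, 2)
print(max_out[..., 0])
print(avg_out[..., 0])
```

Because the stride is applied by slicing the window grid, the same function also covers overlapping windows (for example `k=3, s=2`).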

Intuition:  
Pooling layers discard fine spatial details and retain only key information. Pooling itself has no learnable parameters, but the smaller feature maps reduce the computation in subsequent layers and the number of weights in any fully connected layer that follows, which can improve generalization. The reduced spatial resolution also makes the model less sensitive to small shifts and distortions in the input, thereby aiding robust feature extraction.
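The insensitivity to small shifts can be seen on a toy feature map. The sketch below assumes $2 \times 2$ max pooling with stride 2 and a single active pixel; it also shows that the invariance is only partial, because a shift that crosses a window boundary does change the output.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.zeros((4, 4))
x[0, 0] = 1.0                             # a single activation at (0, 0)

shift_inside = np.roll(x, 1, axis=1)      # activation moves to (0, 1): same window
shift_across = np.roll(x, 2, axis=1)      # activation moves to (0, 2): next window

same_window = np.array_equal(max_pool_2x2(x), max_pool_2x2(shift_inside))
crossed_window = np.array_equal(max_pool_2x2(x), max_pool_2x2(shift_across))
print(same_window)      # True
print(crossed_window)   # False
```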

Role in CNN Architecture:  
A common sequence in CNNs is: Convolution → Nonlinearity (e.g., ReLU) → Pooling  
This pattern builds a hierarchy of features, with pooling layers progressively decreasing the spatial dimension and focusing on more abstract representations.
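The Convolution → ReLU → Pooling sequence can be traced by hand on a small input. The following is a rough single-channel sketch in plain NumPy, not how a deep learning framework implements it; the step-edge image and the fixed edge-detecting kernel are illustrative assumptions (in a real CNN the kernel would be learned).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(x, kernel):
    """Valid cross-correlation (the operation CNN libraries call convolution)."""
    windows = sliding_window_view(x, kernel.shape)   # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def relu(x):
    return np.maximum(x, 0)

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# 6x6 image with a vertical edge: left half dark (0), right half bright (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Fixed 3x3 kernel that responds to dark-to-bright transitions from left to right.
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = conv2d_valid(image, edge_kernel)   # shape (4, 4)
activated = relu(features)                    # negative responses set to 0
pooled = max_pool_2x2(activated)              # shape (2, 2)

print(features)
print(pooled)
```

Each row of `features` is `[0, 3, 3, 0]`: the kernel responds only where its window straddles the edge. After pooling, every $2 \times 2$ window contains one of those responses, so the $2 \times 2$ output keeps the "edge present" signal while discarding its exact column.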

Global Pooling:  
Global Average Pooling (GAP) reduces each feature map to a single value, removing spatial dimensions entirely. This is often employed near the end of a CNN to connect the convolutional features directly to classification layers without large fully connected layers.
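For an $H \times W \times D$ input, GAP amounts to averaging over the two spatial axes, leaving one value per channel. The sketch below illustrates this and compares classifier weight counts; the $7 \times 7 \times 512$ feature map and 1000 classes are hypothetical sizes chosen only to make the saving concrete.

```python
import numpy as np

def global_average_pooling(x):
    """Reduce an H x W x D array to a length-D vector (one mean per channel)."""
    return x.mean(axis=(0, 1))

x = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
gap = global_average_pooling(x)
print(gap.shape)   # (3,)
print(gap)         # [22.5 23.5 24.5]

# Hypothetical sizes: weights needed by a linear classifier on top of the features.
H, W, D, num_classes = 7, 7, 512, 1000
flatten_weights = H * W * D * num_classes   # flatten, then fully connected
gap_weights = D * num_classes               # GAP, then fully connected
print(flatten_weights, gap_weights)         # 25088000 512000
```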

Modern Considerations:  
Some architectures replace pooling with strided convolutions to learn the downsampling operation. Nevertheless, pooling remains widely used due to its simplicity and efficacy.
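One way to see the relationship between the two approaches: $2 \times 2$ average pooling with stride 2 is the same computation as a stride-2 convolution whose kernel is fixed at $1/4$ everywhere. A learned strided convolution replaces those fixed weights with trainable ones. The sketch below checks the equivalence on a single channel; `strided_conv2d` is an illustrative helper, not a library function.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def strided_conv2d(x, kernel, stride):
    """Valid cross-correlation evaluated only at every `stride`-th position."""
    windows = sliding_window_view(x, kernel.shape)[::stride, ::stride]
    return np.einsum("ijkl,kl->ij", windows, kernel)

x = np.array([
    [1, 3, 2, 1],
    [5, 6, 7, 8],
    [9, 2, 4, 6],
    [1, 0, 2, 3],
], dtype=float)

# Fixed uniform kernel: reproduces 2x2 average pooling.
uniform_kernel = np.full((2, 2), 0.25)
downsampled = strided_conv2d(x, uniform_kernel, stride=2)
print(downsampled)
```

The printed result equals the output of $2 \times 2$ average pooling with stride 2 on the same matrix. With trainable kernel entries, the network can instead learn which positions in each window matter most.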

Summary:  
Pooling is a key operation that addresses the challenges of large spatial dimensions in CNNs. It helps manage model complexity, reduce overfitting, and promote invariance to small input shifts. Understanding pooling and its variants is crucial for designing effective, efficient CNN architectures.


In [1]:
import numpy as np
# The four non-overlapping 2x2 windows of the matrix below (stride 2):
#   [[1, 3],   [[2, 1],   [[9, 2],   [[4, 6],
#    [5, 6]]    [7, 8]]    [1, 0]]    [2, 3]]

# Original matrix (tensor)
original_matrix = np.array([
    [1, 3, 2, 1],
    [5, 6, 7, 8],
    [9, 2, 4, 6],
    [1, 0, 2, 3]
])

# Function for max pooling 2x2 with stride 2
def max_pooling(matrix, pool_size=2, stride=2):
    rows, cols = matrix.shape
    output_rows = (rows - pool_size) // stride + 1
    output_cols = (cols - pool_size) // stride + 1
    pooled_matrix = np.zeros((output_rows, output_cols))

    for i in range(0, rows - pool_size + 1, stride):
        for j in range(0, cols - pool_size + 1, stride):
            pooled_matrix[i // stride, j // stride] = np.max(matrix[i:i+pool_size, j:j+pool_size])

    return pooled_matrix

# Apply max pooling
pooled_matrix = max_pooling(original_matrix)

# Print original and pooled matrices
print("Original Matrix:")
print(original_matrix)
print("\nPooled Matrix (Max Pooling):")
print(pooled_matrix)

Original Matrix:
[[1 3 2 1]
 [5 6 7 8]
 [9 2 4 6]
 [1 0 2 3]]

Pooled Matrix (Max Pooling):
[[6. 8.]
 [9. 6.]]


In [2]:
import numpy as np

# Original matrix (tensor)
original_matrix = np.array([
    [1, 3, 2, 1],
    [5, 6, 7, 8],
    [9, 2, 4, 6],
    [1, 0, 2, 3]
])

# Function for max pooling 2x2 with stride 2
def max_pooling(matrix, pool_size=2, stride=2):
    rows, cols = matrix.shape
    output_rows = (rows - pool_size) // stride + 1
    output_cols = (cols - pool_size) // stride + 1
    pooled_matrix = np.zeros((output_rows, output_cols))

    for i in range(0, rows - pool_size + 1, stride):
        for j in range(0, cols - pool_size + 1, stride):
            pooled_matrix[i // stride, j // stride] = np.max(matrix[i:i+pool_size, j:j+pool_size])

    return pooled_matrix

# Function for average pooling 2x2 with stride 2
def avg_pooling(matrix, pool_size=2, stride=2):
    rows, cols = matrix.shape
    output_rows = (rows - pool_size) // stride + 1
    output_cols = (cols - pool_size) // stride + 1
    pooled_matrix = np.zeros((output_rows, output_cols))

    for i in range(0, rows - pool_size + 1, stride):
        for j in range(0, cols - pool_size + 1, stride):
            pooled_matrix[i // stride, j // stride] = np.mean(matrix[i:i+pool_size, j:j+pool_size])

    return pooled_matrix

# Function for global pooling (max or average)
def global_pooling(matrix, mode='max'):
    if mode == 'max':
        return np.max(matrix)
    elif mode == 'avg':
        return np.mean(matrix)
    else:
        raise ValueError("Mode should be 'max' or 'avg'.")

# Apply pooling
max_pooled = max_pooling(original_matrix)
avg_pooled = avg_pooling(original_matrix)
global_max = global_pooling(original_matrix, mode='max')
global_avg = global_pooling(original_matrix, mode='avg')

# Print results
print("Original Matrix:")
print(original_matrix)

print("\nMax Pooled Matrix (2x2, stride 2):")
print(max_pooled)

print("\nAverage Pooled Matrix (2x2, stride 2):")
print(avg_pooled)

print("\nGlobal Max Pooling:")
print(global_max)

print("\nGlobal Average Pooling:")
print(global_avg)

Original Matrix:
[[1 3 2 1]
 [5 6 7 8]
 [9 2 4 6]
 [1 0 2 3]]

Max Pooled Matrix (2x2, stride 2):
[[6. 8.]
 [9. 6.]]

Average Pooled Matrix (2x2, stride 2):
[[3.75 4.5 ]
 [3.   3.75]]

Global Max Pooling:
9

Global Average Pooling:
3.75
