## Pooling in Convolutional Neural Networks (CNNs)

### Theory
**Pooling** is a downsampling operation used in CNNs to reduce the spatial dimensions (width and height) of feature maps while keeping a summary, such as the maximum or the mean, of each local region.  
It lowers the computational cost of later layers, can help limit overfitting, and makes the network more robust to small translations of the input.

**Mathematical Representation**

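For an input feature map \( F \), a pooling window \( P \) of size \( P_h \times P_w \) (e.g., \(2 \times 2\)), and stride \( s \):

$$
O(i, j) = \text{Pooling}\big( F(m : m + P_h, n : n + P_w) \big), \quad m = i \cdot s, \; n = j \cdot s
$$

where:
- \( O(i, j) \) is the pooled output value at position \( (i, j) \),
- \( P_h, P_w \) are the pooling window height and width,
- \( s \) is the stride, often set equal to the window size so that windows do not overlap,
- \( (m, n) \) is the top-left corner of the window in \( F \).

Without padding, an \( H \times W \) input produces an output of size \( \left(\lfloor (H - P_h)/s \rfloor + 1\right) \times \left(\lfloor (W - P_w)/s \rfloor + 1\right) \).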
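
The windowed formula can be sketched in plain Python. This is a minimal illustration, not how a framework implements pooling. The helper name `pool2d`, its `stride` and `reduce` parameters, and the assumption that each window starts at `(i * stride, j * stride)` with no padding are choices made here for the example.

```python
from statistics import mean


def pool2d(feature_map, window=(2, 2), stride=None, reduce=max):
    """Pool a 2D list of numbers with a sliding window (no padding)."""
    p_h, p_w = window
    # Assumption for this sketch: stride defaults to the window height
    s = stride if stride is not None else p_h
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - p_h) // s + 1
    out_w = (width - p_w) // s + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            m, n = i * s, j * s  # top-left corner of the current window
            values = [feature_map[r][c]
                      for r in range(m, m + p_h)
                      for c in range(n, n + p_w)]
            row.append(reduce(values))
        output.append(row)
    return output


F = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

max_pooled = pool2d(F, reduce=max)   # [[6, 8], [14, 16]]
avg_pooled = pool2d(F, reduce=mean)  # [[3.5, 5.5], [11.5, 13.5]]
print(max_pooled)
print(avg_pooled)
```

Passing the reduction as a function keeps the window logic identical for max and average pooling; only the summary applied to each window changes.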

---

### Types of Pooling
| Pooling Type | Description | Formula |
|--------------|-------------|---------|
| **Max Pooling** | Takes the maximum value in the window. | \( O = \max_{x \in P} x \) |
| **Average Pooling** | Takes the average value of the window. | \( O = \frac{1}{P_h P_w} \sum_{x \in P} x \) |
| **Global Average Pooling** | Reduces each feature map to a single value by averaging all values. | \( O = \frac{1}{HW} \sum_{i,j} F(i,j) \) |
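
As a rough illustration of the global average pooling row, the sketch below computes it two ways in PyTorch: by averaging over the height and width dimensions directly, and with `nn.AdaptiveAvgPool2d(1)`. The batch size, channel count, and random input are arbitrary values chosen for the example.

```python
import torch
import torch.nn as nn

# Batch of 2 images, 3 channels, 4x4 feature maps (random values for illustration)
x = torch.randn(2, 3, 4, 4)

# Option 1: average over the height and width dimensions directly
gap_manual = x.mean(dim=(2, 3))                  # shape: (2, 3)

# Option 2: adaptive average pooling to a 1x1 output, then drop the spatial dims
gap_layer = nn.AdaptiveAvgPool2d(1)
gap_module = gap_layer(x).flatten(start_dim=1)   # shape: (2, 3)

print(gap_manual.shape, gap_module.shape)
print(torch.allclose(gap_manual, gap_module, atol=1e-6))
```

Either way, each \( H \times W \) feature map collapses to one number per channel, regardless of the input resolution.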

---

### Advantages
| Advantage | Description |
|-----------|-------------|
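| Reduces Dimensionality | A \(2 \times 2\) window with stride 2 halves the width and height, leaving a quarter of the activations and lowering computation in later layers. |
| Controls Overfitting | Smaller feature maps mean fewer parameters in downstream fully connected layers, which can reduce the risk of overfitting. |
| Translation Invariance | Small shifts in the input that stay within a pooling window leave the pooled output unchanged or nearly so. |
| Retains Essential Features | Max pooling keeps the strongest activation in each window; average pooling smooths out small fluctuations. |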
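
The translation invariance claim can be checked with a small experiment. The sketch below is a toy case under stated assumptions: a single active pixel on a \(4 \times 4\) input and a \(2 \times 2\) max pool. The helper `single_activation` is a hypothetical name introduced only for this example.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2)


def single_activation(row, col):
    """Return a 1x1x4x4 tensor that is zero except for a 1 at (row, col)."""
    x = torch.zeros(1, 1, 4, 4)
    x[0, 0, row, col] = 1.0
    return x


original = pool(single_activation(0, 0))
small_shift = pool(single_activation(1, 1))  # still inside the top-left 2x2 window
large_shift = pool(single_activation(2, 2))  # moved into the bottom-right window

print(torch.equal(original, small_shift))  # True: the shift stayed within one window
print(torch.equal(original, large_shift))  # False: the shift crossed a window boundary
```

The invariance is therefore local: it holds only for shifts that keep the feature inside the same pooling window.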

---

### Disadvantages
| Disadvantage | Description |
|--------------|-------------|
| Loss of Spatial Information | Pooling discards exact locations of features. |
| May Drop Useful Features | Important fine details can be lost when large windows or many pooling layers are used. |
| Not Learnable | Traditional pooling has no parameters to learn. |
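
To make the "Not Learnable" row concrete, the sketch below counts trainable parameters. For contrast it uses a strided convolution, one commonly used learnable way to downsample; this comparison is an illustration added here, not something the table prescribes. The helper `count_parameters` is a hypothetical name.

```python
import torch
import torch.nn as nn


def count_parameters(module):
    """Total number of parameters registered on a module."""
    return sum(p.numel() for p in module.parameters())


max_pool = nn.MaxPool2d(kernel_size=2)
strided_conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)

x = torch.randn(1, 1, 4, 4)
print(max_pool(x).shape, strided_conv(x).shape)  # both downsample 4x4 to 2x2

print(count_parameters(max_pool))      # 0: nothing to train
print(count_parameters(strided_conv))  # 5: four kernel weights plus one bias
```

Both layers halve the spatial size, but only the convolution can adapt how it summarizes each window during training.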

---

### Usage / Applications
| Application | Description |
|-------------|-------------|
| Image Classification | Used after convolution layers to extract dominant features. |
| Object Detection | Helps reduce feature map size while preserving object patterns. |
| Segmentation | Pooling downsamples feature maps in the encoder of fully convolutional networks; some architectures also use global average pooling to add image-level context. |

---

### Example (Max and Average Pooling in PyTorch)
```python
import torch
import torch.nn as nn

# Example input: 1 image, 1 channel, 4x4 pixels
x = torch.tensor([[[[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9,10,11,12],
                    [13,14,15,16]]]], dtype=torch.float32)

# Max pooling with a 2x2 window (stride defaults to the window size)
max_pool = nn.MaxPool2d(2)
output_max = max_pool(x)

# Average pooling with a 2x2 window (stride defaults to the window size)
avg_pool = nn.AvgPool2d(2)
output_avg = avg_pool(x)

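# Both outputs have shape (1, 1, 2, 2)
print("Max Pooling Output:\n", output_max)      # values: [[6, 8], [14, 16]]
print("Average Pooling Output:\n", output_avg)  # values: [[3.5, 5.5], [11.5, 13.5]]
```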