# Convolution (conv_forward / conv2D)

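Convolution is a mathematical operation where we **slide a filter** (kernel) across the input matrix and, at each position, **multiply** the overlapping values **element-wise** and **sum** the products into a single output value. Strictly speaking, CNNs compute a cross-correlation (the kernel is not flipped), but the deep learning literature calls it convolution.


* The 2D output of the convolution is called the **feature map**.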


![image.png](attachment:image.png)


## 📌 Input Example: Grayscale Image (6×6×1) and Filter/Kernel (3×3)  
With no padding and a stride of 1, the convolution of a **6×6 matrix** with a **3×3 kernel** results in a **4×4 output matrix** (6 − 3 + 1 = 4).

```python
import numpy as np

# 6x6 grayscale input image
X = np.array([
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9]
])

Filter = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
])

# Resulting 4x4 feature map (no padding, stride 1)
X_conv = np.array([
    [-5, -4,  0,  8],
    [-10, -2,  2,  3],
    [0, -2, -4, -7],
    [-3, -2, -3, -16]
])
```
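
The feature map above can be reproduced with a few lines of NumPy. The following is a minimal sketch, not an optimized implementation: the helper name `conv2d_valid` is made up for illustration, and it assumes no padding and a stride of 1.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Naive 2D convolution as used in CNNs (no kernel flip), no padding, stride 1."""
    n_h, n_w = x.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1), dtype=x.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise multiply the current window with the kernel, then sum
            window = x[i:i + f_h, j:j + f_w]
            out[i, j] = np.sum(window * kernel)
    return out

X = np.array([
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9]
])

# Vertical edge detector
Filter = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
])

X_conv = conv2d_valid(X, Filter)
print(X_conv.shape)  # (4, 4)
print(X_conv)
```

The top-left output value is the sum of the first column of the window minus the sum of its third column: (3 + 1 + 2) − (1 + 8 + 2) = −5.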

# 📌 Padding & Stride

## **Padding: Controlling the Output Size**
Padding adds a border of extra values (typically zeros) around the input matrix before the filter is applied. It serves two purposes:

1. **Avoid Shrinkage** – Without padding, every convolutional layer shrinks its input (from n to n − f + 1), so the output would get progressively smaller layer after layer.
2. **Edge Handling** – Without padding, edge and corner pixels fall under far fewer filter positions than central pixels, so information near the border is underused.

- **"Valid Convolution"** → **No Padding** (p = 0), so Output size < Input size (n − f + 1 at stride 1)
- **"Same Convolution"** → Output size = Input size at stride 1 (**Padding = (Filter Size - 1) / 2**, which requires an odd filter size)
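
As a quick illustration of the two cases, the sketch below zero-pads a 6×6 input with `np.pad` and compares the resulting output sizes for a 3×3 filter. The variable names are chosen for this example only, and a stride of 1 is assumed.

```python
import numpy as np

f = 3                      # filter size (odd)
p = (f - 1) // 2           # "same" padding for stride 1 -> 1

x = np.arange(36).reshape(6, 6)

# Add p rows/columns of zeros on every side of the input
x_padded = np.pad(x, pad_width=p, mode="constant", constant_values=0)

valid_size = x.shape[0] - f + 1         # no padding: output shrinks
same_size = x_padded.shape[0] - f + 1   # with padding: output matches input

print(x_padded.shape)  # (8, 8)
print(valid_size)      # 4
print(same_size)       # 6
```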

---

## **Stride: Controlling the Step Size**
**Stride** determines how far the **filter moves** across the input matrix at each step.

- **Larger Stride (s > 1)** → Smaller output, faster computation.
- **Smaller Stride (s = 1)** → Retains more details, larger output.

### **Formula for Output Size**
$$
\text{Output Size} = \left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1
$$

Where:
- $n$ = Input Size
- $p$ = Padding
- $f$ = Filter Size
- $s$ = Stride
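
The formula is easy to check numerically. Below is a small sketch; the function name `conv_output_size` is hypothetical and simply transcribes the expression above using integer (floor) division.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1."""
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))        # valid, stride 1 -> 4
print(conv_output_size(6, 3, p=1))   # same,  stride 1 -> 6
print(conv_output_size(7, 3, s=2))   # stride 2        -> 3
print(conv_output_size(6, 3, s=2))   # floor(3 / 2) + 1 -> 2
```

The last case shows why the floor is needed: when the filter cannot take a full final step, that partial position is simply dropped.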

---


# 🧠 CNN: One Layer Explained

## Why Convolutions? Shift Invariance and Reduced Overfitting

 1. **Parameter sharing:** A feature detector, such as a vertical edge filter, that is useful in one part of the image is probably useful in another part too. This means we don't need two separate filters for the two parts of the data.
 2. **Sparsity of connections:** Each output value depends on only a small subset of the input data (the window currently under the filter).

## **Forward Pass Through a Single Convolutional Layer**

### **Input Matrix**

`X` = 6×6×3 (Input: 6×6 image with 3 color channels, RGB)


### **Applying Convolution**
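1️⃣ **First Filter** (3×3×3)  
- Convolve the filter with `X`, giving a **4 × 4** map  
- Add the bias `b1` (one scalar per filter)  
- Apply the activation function: **ReLU**  
- Output: **4 × 4**  

Filter1 = 3×3×3 → relu(4 × 4 + b1) → 4 × 4

2️⃣ **Second Filter** (3×3×3)  
- Another convolution operation, with its own weights and bias `b2`  
- Activation function: **ReLU**  
- Output: **4 × 4**  

Filter2 = 3×3×3 → relu(4 × 4 + b2) → 4 × 4

Stacking the two 4 × 4 maps along the channel axis gives a **4 × 4 × 2** output.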
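
The forward pass described here can be sketched in NumPy as follows. This is an illustrative, loop-based version rather than an efficient one: the function name `conv_layer_forward`, the weight layout `(f, f, n_c, n_f)`, and the random inputs are all assumptions made for the example, with no padding and a stride of 1.

```python
import numpy as np

def conv_layer_forward(x, W, b):
    """x: (n_h, n_w, n_c), W: (f, f, n_c, n_f), b: (n_f,). No padding, stride 1."""
    n_h, n_w, _ = x.shape
    f, _, _, n_f = W.shape
    out_h, out_w = n_h - f + 1, n_w - f + 1
    z = np.zeros((out_h, out_w, n_f))
    for k in range(n_f):                  # one feature map per filter
        for i in range(out_h):
            for j in range(out_w):
                window = x[i:i + f, j:j + f, :]   # f x f x n_c volume
                z[i, j, k] = np.sum(window * W[:, :, :, k]) + b[k]
    return np.maximum(z, 0)               # ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 3))        # 6x6 RGB input
W = rng.standard_normal((3, 3, 3, 2))     # two 3x3x3 filters
b = rng.standard_normal(2)                # one bias per filter

a = conv_layer_forward(x, W, b)
print(a.shape)  # (4, 4, 2)
```

Each filter spans all input channels, so it produces a single 2D map; the number of output channels equals the number of filters.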


---

## **Summary of Output Dimensions**
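For an **n × n × nc** input convolved with `n_f` filters, each of size **f × f × nc**:

$$
\text{Output Size} = \left(\left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1\right) \times \left(\left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1\right) \times n_f
$$

- `n` = Input size
- `nc` = Number of input channels (each filter has the same depth)
- `p` = Padding
- `f` = Filter size
- `s` = Stride
- `n_f` = Number of filters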

## **Number of parameters**
Parameters = (f × f × nc) × n_f weights + n_f bias terms (1 per filter) = (f × f × nc + 1) × n_f
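
To make the count concrete, here is a short sketch that evaluates the formula for the two-filter layer above. The function name `conv_param_count` is hypothetical, and the fully connected comparison is only an illustration of what parameter sharing saves, assuming a dense layer that maps the flattened 6×6×3 input to the flattened 4×4×2 output.

```python
def conv_param_count(f, n_c, n_f):
    """Weights (f * f * n_c per filter) plus one bias per filter."""
    weights = f * f * n_c * n_f
    biases = n_f
    return weights + biases

# Two 3x3x3 filters, as in the single-layer example
print(conv_param_count(3, 3, 2))    # 56

# Ten 3x3x3 filters
print(conv_param_count(3, 3, 10))   # 280

# Hypothetical dense layer between the same input and output sizes
n_in = 6 * 6 * 3     # 108 inputs
n_out = 4 * 4 * 2    # 32 outputs
dense_params = n_in * n_out + n_out
print(dense_params)                 # 3488
```

Note that the convolutional count does not depend on the input size `n`, only on the filter size, the number of input channels, and the number of filters.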


# 🏊‍♂️ Pooling Layer in CNNs

Pooling helps to:

✔ **Reduce the spatial size** of the feature maps, reducing computation.

✔ **Increase robustness**: feature detectors become more invariant to a feature's exact position in the input.

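✔ **Help prevent overfitting** by shrinking the feature maps, which reduces the number of parameters in the layers that follow (pooling itself has no learnable parameters).  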

---

- A pooling layer is defined by its **filter size (f)** and **stride (s)**, both fixed hyperparameters.
- Pooling is applied **independently to each channel** in the input.

### **Example**

![image.png](attachment:image.png)

## 🔹 **Types of Pooling**
### **🔹 Max Pooling**
📌 **Takes the maximum value** in the selected region.  
✔ Captures the **most important** features (e.g., strong edges).  
✔ Helps in **feature selection** by focusing on dominant activations.  

### **🔹 Average Pooling**
📌 **Takes the average value** in the selected region.  
✔ Smooths feature maps, reducing noise.  
✔ Retains the **overall** level of activation in each region rather than focusing on extreme values.  
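
Both pooling types can be sketched with one small NumPy helper. The function name `pool2d` and the 4×4 input are made up for illustration; the sketch handles a single channel, so a multi-channel input would apply it to each channel independently.

```python
import numpy as np

def pool2d(x, f=2, s=2, mode="max"):
    """Max or average pooling over a single 2D channel."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    reduce_fn = np.max if mode == "max" else np.mean
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Window starts move by the stride s
            window = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce_fn(window)
    return out

x = np.array([
    [1, 3, 2, 1],
    [2, 9, 1, 1],
    [1, 3, 2, 3],
    [5, 6, 1, 2],
])

max_pooled = pool2d(x, f=2, s=2, mode="max")
avg_pooled = pool2d(x, f=2, s=2, mode="average")
print(max_pooled)  # values: 9, 2 / 6, 3
print(avg_pooled)  # values: 3.75, 1.25 / 3.75, 2.0
```

With `f = 2` and `s = 2` the windows do not overlap, so the 4×4 input is reduced to 2×2 in both cases.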
