```{contents}
```

## ANN vs CNN

### In ANN (Artificial Neural Network)

1. **Input:** Numeric feature vector
2. **Operation:**
   $z = W^T x + b$
3. **Activation:** Apply ReLU or another activation on each neuron
4. **Forward pass → loss calculation → backpropagation**
5. **Weight updates using optimizer**

Everything is fully connected: each input connects to each neuron via weights.
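
The dense forward pass above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the layer sizes (4 inputs, 3 neurons), the random values, and the name `dense_forward` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_forward(x, W, b):
    """Compute z = W^T x + b, then apply ReLU to each neuron."""
    z = W.T @ x + b
    return np.maximum(0.0, z)

x = rng.normal(size=4)        # 4 input features (assumed size)
W = rng.normal(size=(4, 3))   # one column of weights per neuron
b = np.zeros(3)               # one bias per neuron

a = dense_forward(x, W, b)    # activations, one per neuron
```

Note that `W` holds 4 × 3 = 12 weights: one for every input-to-neuron connection, which is what "fully connected" means here.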

---

### In CNN (Convolutional Neural Network)

1. **Input:** Image (grayscale or RGB)
2. **Filters (kernels):**

   * 3×3, 5×5, or other sizes
   * Multiple filters (F1, F2, F3 … Fn)
   * Initialized **randomly** and learned during training, not hand-designed like classical edge/shape detectors
3. **Convolution operation:**

   * Slide each filter over the image
   * At each position, multiply the filter elementwise with the underlying image patch and sum the result
   * Each filter produces one **feature map**
4. **Activation:**
   Apply **ReLU** on each feature map value:
   $$
   \text{ReLU}(x) = \max(0, x)
   $$
   ✔ Mitigates vanishing gradients (the gradient is exactly 1 for positive inputs, so it does not shrink there)
   ✔ Simple derivative (0 or 1) → cheap backprop
5. **Backpropagation:**

   * Gradients update filter weights
   * Same mechanism as ANN
6. **Next step (after ReLU):** Max Pooling (downsampling)
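
The convolution → ReLU → max pooling sequence above can be sketched as follows. This is a deliberately naive version under stated assumptions: a single-channel 6×6 image, one random 3×3 filter, stride 1, no padding, and 2×2 pooling. The function names are hypothetical, and like most deep learning frameworks the "convolution" does not flip the kernel (strictly speaking it is a cross-correlation).

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # elementwise multiply + sum
    return out

def relu(x):
    return np.maximum(0.0, x)

def max_pool2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h2, w2 = fmap.shape[0] // 2, fmap.shape[1] // 2
    trimmed = fmap[:h2 * 2, :w2 * 2]            # drop an odd last row/column
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((6, 6))            # stand-in for a grayscale image
kernel = rng.normal(size=(3, 3))      # randomly initialized filter

feature_map = conv2d(image, kernel)   # shape (4, 4)
activated = relu(feature_map)         # negatives clipped to 0
pooled = max_pool2x2(activated)       # shape (2, 2)
```

In a real CNN the same steps run for many filters and input channels at once, and the kernel values are updated by backpropagation rather than left at their random starting point.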

---

### Why is ReLU applied after convolution?

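* Introduces non-linearity: without an activation, stacked convolutions collapse into a single linear operation
* Derivative is cheap to compute (0 for negative inputs, 1 for positive inputs)
* Mitigates vanishing gradients: the gradient does not shrink for positive activations, unlike sigmoid or tanh
* During backprop, gradients pass through ReLU unchanged wherever the pre-activation is positive, so the filter weights behind those units get updated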
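
To make the derivative argument concrete, here is a small sketch comparing the ReLU gradient with the sigmoid gradient. The sample inputs are arbitrary, the function names are hypothetical, and the gradient at exactly 0 is taken to be 0 by convention.

```python
import numpy as np

def relu_grad(z):
    """Derivative of ReLU: 1 where z > 0, else 0."""
    return (z > 0).astype(float)

def sigmoid_grad(z):
    """Derivative of the sigmoid, s(z) * (1 - s(z)); never exceeds 0.25."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

z = np.array([-2.0, -0.5, 0.5, 2.0])  # example pre-activations

g_relu = relu_grad(z)        # only 0s and 1s
g_sigmoid = sigmoid_grad(z)  # all values at most 0.25
```

Because backprop multiplies these local gradients layer by layer, factors of at most 0.25 shrink quickly in deep networks, while a factor of 1 does not. The trade-off is that units with negative pre-activations pass no gradient at all.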

---

**Key Conceptual Difference**

| Step            | ANN                          | CNN                                 |
| --------------- | ---------------------------- | ----------------------------------- |
| Input           | Feature vector               | Image (2D/3D)                       |
| Learnable units | Weights for every connection | Filters (kernels)                   |
| Operation       | Dot product per neuron       | Convolution over spatial regions    |
| Activation      | After weighted sum           | After convolution (per feature map) |
| Parameters      | Large (dense)                | Fewer (shared weights)              |
| Processing      | Whole input at once          | Local receptive fields              |
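
The "Parameters" row can be checked with simple arithmetic. The sketch below assumes a 32×32 RGB input, 64 dense neurons, and 64 filters of size 3×3; these sizes and the helper names are illustrative choices, not taken from any particular model.

```python
def dense_params(n_inputs, n_neurons):
    """One weight per input-neuron connection, plus one bias per neuron."""
    return n_inputs * n_neurons + n_neurons

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """One kernel per (input channel, filter) pair, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

dense = dense_params(32 * 32 * 3, 64)  # flattened image into 64 neurons
conv = conv_params(3, 3, 3, 64)        # 64 filters of size 3x3 over 3 channels

print(dense, conv)  # 196672 1792
```

The convolutional layer's count does not depend on the image size, because the same filter weights are reused at every spatial position.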

---

In short:

**ANN learns a weight for each input → neuron connection.
CNN learns filters that scan images to extract patterns (edges, textures, object parts). ReLU adds non-linearity and keeps gradients from shrinking on positive activations, which helps the filters train.**
