```{contents}
```
## Residual Connection (Skip Connection)

A **Residual Connection** is a structural design in neural networks that allows the input of a layer to be **added directly to its output**.
It enables very deep networks to train effectively.

---

### **Core Intuition**

Deep networks should not have to **relearn the identity function**.

A residual connection lets the model choose:

> **Learn something new OR keep what already works.**

This improves training stability and makes much deeper networks trainable.

---

### **Mathematical Form**

Without residual:

$$
y = F(x)
$$

With residual:

$$
y = F(x) + x
$$

Where $F(x)$ is the transformation learned by the layer.
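As a minimal sketch of the formula above, the wrapper below turns any function `F` into its residual version by adding the input back element-wise. The names `residual` and `scale_by_tenth` are illustrative assumptions, and plain Python lists stand in for tensors.

```python
def residual(layer):
    """Wrap `layer` so that it computes layer(x) + x element-wise."""
    def wrapped(x):
        fx = layer(x)
        # The addition only makes sense when F(x) and x have the same shape.
        if len(fx) != len(x):
            raise ValueError("F(x) must have the same length as x")
        return [f + xi for f, xi in zip(fx, x)]
    return wrapped


def scale_by_tenth(x):
    # Hypothetical stand-in for a learned transformation F.
    return [0.1 * xi for xi in x]


block = residual(scale_by_tenth)
y = block([1.0, 2.0, 3.0])  # each output is x_i + 0.1 * x_i
print(y)
```

The length check reflects the one practical constraint of the formula: $F(x)$ and $x$ must have matching shapes to be summed.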

---

### **Why It Works**

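#### Gradient Flow

Gradients can flow directly through the identity path. Because $\partial y / \partial x = I + \partial F / \partial x$, the backward signal always contains a term that bypasses the layer. This affects two failure modes differently:

* Vanishing gradients: the identity term keeps the gradient from shrinking toward zero as depth grows.
* Exploding gradients: these are not prevented by the shortcut alone; in practice they are controlled together with normalization and careful initialization.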
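The effect on gradient magnitude can be illustrated with a toy calculation. The sketch below assumes a stack of scalar layers whose local derivative $F'(x)$ is the same small constant everywhere (an assumption made purely for illustration), and multiplies the per-layer derivatives as the chain rule would.

```python
def chain_gradient(local_derivs, use_residual):
    """Derivative of the stack's output w.r.t. its input, for scalar layers."""
    grad = 1.0
    for d in local_derivs:
        # Plain layer:    dy/dx = F'(x)
        # Residual layer: dy/dx = 1 + F'(x)
        grad *= (1.0 + d) if use_residual else d
    return grad


depth = 50
local_derivs = [0.01] * depth  # assumed small local derivative per layer

plain_grad = chain_gradient(local_derivs, use_residual=False)
residual_grad = chain_gradient(local_derivs, use_residual=True)

print(f"plain stack:    {plain_grad:.3e}")
print(f"residual stack: {residual_grad:.3f}")
```

With these assumed values the plain product collapses toward zero while the residual product stays of order one. The same product can still grow when the local derivatives are large, so the shortcut mainly addresses the vanishing case.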

#### Optimization Simplicity

Learning a small correction is easier than learning an entire transformation from scratch.
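One way to see this: a residual layer whose weights are all zero already computes the identity, whereas a plain layer with the same weights discards its input. The sketch below uses a hypothetical bias-free dense layer written with plain Python lists.

```python
def linear(weights, x):
    """Matrix-vector product: a dense layer without bias or activation."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def plain_layer(weights, x):
    return linear(weights, x)


def residual_layer(weights, x):
    # F(x) + x, with F being the dense layer above.
    return [f + xi for f, xi in zip(linear(weights, x), x)]


x = [0.5, -1.0, 2.0]
zero_weights = [[0.0] * 3 for _ in range(3)]

plain_out = plain_layer(zero_weights, x)        # input is lost
residual_out = residual_layer(zero_weights, x)  # input passes through unchanged

print(plain_out)
print(residual_out)
```

To pass its input through, the plain layer would have to learn the identity matrix, while the residual layer only has to keep $F$ near zero.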

---

### **Architecture Example**

```
x ────────────────┐
                  │
      F(x)        ▼
x → [ Layer ] → (+) → y
```

---

### **Applications**

#### Transformers

In the standard Transformer, each attention and feedforward sublayer is wrapped in a residual connection.

#### CNNs (ResNet)

Residual connections enabled ResNet to train networks with 100+ layers (for example, ResNet-152).

#### VAEs & GANs

Residual connections can improve training stability and convergence.

#### Reinforcement Learning

Deep policy and value networks often use residual blocks (for example, the residual tower in AlphaGo Zero).

---

### **Benefits**

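| Benefit               | Explanation                                                  |
| --------------------- | ------------------------------------------------------------ |
| Enables deep networks | Networks with hundreds of layers become trainable            |
| Stable training       | The identity path keeps gradients from vanishing             |
| Faster convergence    | Each layer only has to learn a correction to its input       |
| Better generalization | Often observed in practice; linked to smoother loss surfaces |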

---

### **Residual + LayerNorm (Transformer Block)**

```
x → LayerNorm → Attention → + → LayerNorm → FFN → +
```

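Each `+` adds the sublayer's input back to its output. This arrangement, with normalization applied before each sublayer (pre-norm), is the backbone of modern Transformer models.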
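The data flow in the diagram above can be sketched as follows, operating on a single feature vector. The `layer_norm` here omits the learned scale and shift, and `toy_attention` and `toy_ffn` are hypothetical placeholders: real attention mixes information across a sequence of tokens, which this sketch does not attempt.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def add(a, b):
    return [ai + bi for ai, bi in zip(a, b)]


def pre_norm_block(x, attention, ffn):
    """x -> LayerNorm -> attention -> (+) -> LayerNorm -> ffn -> (+)."""
    h = add(x, attention(layer_norm(x)))  # first residual connection
    return add(h, ffn(layer_norm(h)))     # second residual connection


def toy_attention(v):
    # Placeholder sublayer, not real attention.
    return [0.5 * vi for vi in v]


def toy_ffn(v):
    # Placeholder sublayer: a bare ReLU.
    return [max(0.0, vi) for vi in v]


x = [1.0, 2.0, 3.0, 4.0]
y = pre_norm_block(x, toy_attention, toy_ffn)
print(y)
```

Because both sublayers sit on the residual branch, the unnormalized `x` travels along the shortcut path untouched; if both sublayers output zeros, the block returns its input.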

---

### **Intuition Summary**

Residual connections give the network a **shortcut path** that preserves useful information and makes training very deep networks practical.