# 🏗️ ResNet Architecture

## Residual Learning Revolution

**Problem**: Very deep plain networks degrade: accuracy gets worse as layers are added, and overfitting is not the cause!

**Solution**: Skip connections that add a block's input to its output

$$y = F(x) + x$$

---


In [None]:
import numpy as np
print('✅ ResNet ready!')


## Why ResNet Works

### The Degradation Problem
- A 56-layer plain network performs WORSE than a 20-layer one
- This is not overfitting: the deeper network's training error is higher too!

### Residual Block Solution

$$H(x) = F(x) + x$$

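where $F(x)$ is the learned residual and $H(x)$ is the mapping the block should represent.

**Key insight**: It is easier to push the residual to $F(x) = 0$ than to fit the identity $H(x) = x$ with a stack of nonlinear layers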
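
The key insight can be sketched with a toy example. This is a minimal illustration, not the notebook's block: it assumes a small fully connected layer in place of the conv stack, drops the final ReLU, and the names `plain_layer` and `residual_layer` are hypothetical. With all weights at zero, the residual form returns its input unchanged, while the plain form loses it.

```python
import numpy as np

def plain_layer(x, w):
    """Plain mapping H(x) = ReLU(w @ x)."""
    return np.maximum(w @ x, 0.0)

def residual_layer(x, w):
    """Residual mapping H(x) = F(x) + x with F(x) = ReLU(w @ x)."""
    return np.maximum(w @ x, 0.0) + x

x = np.array([1.0, -2.0, 3.0])
w_zero = np.zeros((3, 3))  # weights driven to zero, so F(x) = 0

plain_out = plain_layer(x, w_zero)        # input is lost
residual_out = residual_layer(x, w_zero)  # input passes through unchanged

print('plain   :', plain_out)
print('residual:', residual_out)
```

So "do nothing" is the default behavior of a residual layer, whereas a plain layer would have to learn weights that reproduce its input.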

### Gradient Flow

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial H} \left( \frac{\partial F}{\partial x} + 1 \right)$$

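The "+ 1" term gives the gradient a direct path through the skip connection, so it keeps flowing even when $\partial F / \partial x$ is small!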
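
The effect of the "+ 1" over many layers can be checked with a small numeric sketch. It assumes scalar layers whose local slope $\partial F / \partial x$ is a constant 0.1 (an arbitrary value chosen for illustration) and compares the end-to-end gradient of a 20-layer plain stack with that of a 20-layer residual stack.

```python
import numpy as np

depth = 20
slopes = np.full(depth, 0.1)  # assumed local slope dF/dx of each layer

# Plain stack: x_{k+1} = F_k(x_k), so the gradient is a product of dF/dx terms.
plain_grad = np.prod(slopes)

# Residual stack: x_{k+1} = F_k(x_k) + x_k, so each factor is (dF/dx + 1).
residual_grad = np.prod(slopes + 1.0)

print(f'plain    d(out)/d(in): {plain_grad:.3e}')
print(f'residual d(out)/d(in): {residual_grad:.3e}')
```

Under these assumptions the plain gradient shrinks toward zero with depth, while the residual gradient stays of order one because every factor is at least 1.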


In [None]:
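import numpy as np

class ResidualBlock:
    """Basic ResNet building block (NumPy sketch, batch norm omitted)."""

    def __init__(self, channels, seed=0):
        self.channels = channels
        # In practice: Conv → BN → ReLU → Conv → BN
        # Then: out = ReLU(residual + skip)
        rng = np.random.default_rng(seed)
        shape = (channels, channels, 3, 3)  # (out, in, kernel_h, kernel_w)
        self.w1 = rng.normal(0.0, 0.1, shape)
        self.w2 = rng.normal(0.0, 0.1, shape)

    def conv(self, x, w):
        """3×3 convolution, stride 1, zero padding 1; x has shape (C, H, W)."""
        _, h, wd = x.shape
        xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
        out = np.zeros((w.shape[0], h, wd))
        for i in range(3):
            for j in range(3):
                # Accumulate the contribution of kernel tap (i, j)
                out += np.einsum('oc,chw->ohw', w[:, :, i, j], xp[:, i:i + h, j:j + wd])
        return out

    def relu(self, x):
        return np.maximum(x, 0.0)

    def forward(self, x):
        # Save input for skip connection
        identity = x

        # Residual path (2 conv layers)
        out = self.conv(x, self.w1)
        out = self.relu(out)
        out = self.conv(out, self.w2)

        # Add skip connection (shapes match: same channels, same spatial size)
        out += identity
        out = self.relu(out)

        return out

block = ResidualBlock(channels=4)
print('✅ ResNet block structure! Output shape:', block.forward(np.ones((4, 8, 8))).shape)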


## ResNet Variants

- **ResNet-18**: 18 layers (basic blocks)
- **ResNet-34**: 34 layers (basic blocks)
- **ResNet-50**: 50 layers (uses bottleneck)
- **ResNet-101**: 101 layers (uses bottleneck)
- **ResNet-152**: 152 layers (uses bottleneck)
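
The layer counts in the names can be reproduced from the block layout. The sketch below assumes the per-stage block counts of the ImageNet configurations in the original ResNet paper (they are not stated in this notebook) and counts only the weighted layers on the main path: the first conv, the convs inside each block, and the final fully connected layer. `count_layers` is a hypothetical helper.

```python
# Residual blocks per stage (four stages) and conv layers per block.
# Basic blocks have 2 conv layers; bottleneck blocks have 3.
configs = {
    'ResNet-18':  ([2, 2, 2, 2], 2),
    'ResNet-34':  ([3, 4, 6, 3], 2),
    'ResNet-50':  ([3, 4, 6, 3], 3),
    'ResNet-101': ([3, 4, 23, 3], 3),
    'ResNet-152': ([3, 8, 36, 3], 3),
}

def count_layers(blocks_per_stage, convs_per_block):
    # +2 for the initial conv layer and the final fully connected layer
    return sum(blocks_per_stage) * convs_per_block + 2

layer_counts = {name: count_layers(*cfg) for name, cfg in configs.items()}
for name, n in layer_counts.items():
    print(f'{name}: {n} layers')
```

Projection shortcuts are left out of the count, which is why the totals match the model names.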

**Bottleneck design**: 1×1 (reduce channels) → 3×3 → 1×1 (restore channels), which needs far fewer parameters than two full-width 3×3 convs
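
A quick parameter count shows the size of the saving. This sketch assumes a block that works on 256 channels and squeezes to 64 inside the bottleneck (widths chosen for illustration), and it ignores biases and batch norm parameters. `conv_params` is a hypothetical helper.

```python
def conv_params(c_in, c_out, k):
    """Number of weights in a k×k convolution (biases omitted)."""
    return c_in * c_out * k * k

width, reduced = 256, 64  # assumed channel widths for illustration

# Two full-width 3×3 convs (basic-block style at 256 channels)
basic_params = 2 * conv_params(width, width, 3)

# Bottleneck: 1×1 reduce → 3×3 → 1×1 restore
bottleneck_params = (conv_params(width, reduced, 1)
                     + conv_params(reduced, reduced, 3)
                     + conv_params(reduced, width, 1))

print('two 3x3 convs:', basic_params)
print('bottleneck   :', bottleneck_params)
```

Under these assumptions the bottleneck uses roughly 17× fewer weights, because the expensive 3×3 convolution only sees the reduced channel count.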
