# 🧠 Model Evolution & Evaluation

> **PM Accelerator Mission**: "By making industry-leading tools and education available to individuals from all backgrounds, we level the playing field for future PM leaders."

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/moazmo/weather-trend-forecasting/blob/main/presentation/03_Model_Evolution.ipynb)
[![nbviewer](https://img.shields.io/badge/render-nbviewer-orange.svg)](https://nbviewer.org/github/moazmo/weather-trend-forecasting/blob/main/presentation/03_Model_Evolution.ipynb)

This notebook covers:
1. Model Evolution Journey
2. Architecture Deep Dives
3. Training Configuration
4. Evaluation Results
5. Key Learnings

## 1. Model Evolution Journey

### 📈 Performance Timeline

| Version | Model | MAE | Parameters | Training Time |
|---------|-------|-----|------------|---------------|
| V1.0 | MLP (3 layers) | ~4.5°C | ~50K | 5 min |
| V2.2 | LSTM (2 layers) | 2.05°C | ~100K | 30 min |
| V2.3 | Transformer (4 layers) | 2.05°C | ~200K | 15 min |
| V3.0 | Multivariate Transformer | 2.07°C | ~250K | 20 min |
| **V4.0** | **Advanced Transformer + GRN** | **2.00°C** | **1.3M** | **45 min** |

![Model Evolution](images/model_evolution.png)
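
To make the size of each step in the table concrete, the following minimal sketch computes the relative MAE change between consecutive versions. The MAE values are copied from the table; the helper name `relative_change` is introduced here for illustration only.

```python
# MAE values (°C) copied from the performance table above.
mae_by_version = {
    "V1.0": 4.5,
    "V2.2": 2.05,
    "V2.3": 2.05,
    "V3.0": 2.07,
    "V4.0": 2.00,
}


def relative_change(old: float, new: float) -> float:
    """Percent change from old to new; negative means MAE went down."""
    return (new - old) / old * 100.0


names = list(mae_by_version)
for prev, curr in zip(names, names[1:]):
    change = relative_change(mae_by_version[prev], mae_by_version[curr])
    print(f"{prev} -> {curr}: {change:+.1f}%")
```

Read this way, the move to sequence models accounts for most of the gain, and the later versions differ by only a few hundredths of a degree.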

In [3]:
import plotly.io as pio
# Configure Plotly for Jupyter Book (HTML renderer with CDN)
pio.renderers.default = "notebook_connected"

import plotly.graph_objects as go

# Model evolution visualization (interactive version)
versions = ['V1 MLP', 'V2.2 LSTM', 'V2.3 Transformer', 'V3.0 Multivariate', 'V4.0 Advanced']
mae_scores = [4.5, 2.05, 2.05, 2.07, 2.00]
colors = ['#ff6b6b', '#ffd93d', '#6bcb77', '#4d96ff', '#00f296']

fig = go.Figure(data=[
    go.Bar(x=versions, y=mae_scores, marker_color=colors, text=[f'{m:.2f}°C' for m in mae_scores], textposition='outside')
])
fig.update_layout(
    title='📊 Model Performance Evolution (Lower is Better)',
    yaxis_title='MAE (°C)',
    template='plotly_dark',
    yaxis_range=[0, 5]
)
fig.add_hline(y=2.0, line_dash='dash', line_color='green', annotation_text='Target: 2.0°C')
fig.show()

## 2. Architecture Deep Dives

### V1: Multi-Layer Perceptron (MLP)

```
Input (20 features) → Dense(256, ReLU) → Dense(128, ReLU) → Dense(64, ReLU) → Output(7)
```

**Pros:** Simple, fast to train, easy to interpret

**Cons:** No temporal awareness, cannot capture sequences
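
As a sanity check on the "~50K" parameter figure in the performance table, the layer widths in the diagram can be turned into a parameter count. This is a sketch that assumes every dense layer has a bias term, which the diagram does not state.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer (bias assumed)."""
    return n_in * n_out + n_out


# Layer widths read off the V1 diagram: 20 -> 256 -> 128 -> 64 -> 7.
layer_sizes = [20, 256, 128, 64, 7]

total_params = sum(
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)
print(f"Total parameters: {total_params:,}")
```

Under that assumption the count is 46,983, which is consistent with the rounded "~50K" in the table.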

---

### V2.2: LSTM (Long Short-Term Memory)

```
Input Sequence (30 days) → LSTM(128) → LSTM(128) → Dense(64) → Output(7)
```

**Innovation:** First model to use **30-day historical sequence**
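
One way to build 30-day inputs paired with 7-day targets is a sliding window over the daily series. The sketch below is illustrative: the function name `make_windows`, the stride of one day, and the single-variable series are assumptions, not details taken from the project.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], history: int = 30, horizon: int = 7
) -> List[Tuple[List[float], List[float]]]:
    """Pair each `history`-day window with the `horizon` days that follow it."""
    pairs = []
    last_start = len(series) - history - horizon
    for start in range(last_start + 1):
        past = list(series[start:start + history])
        future = list(series[start + history:start + history + horizon])
        pairs.append((past, future))
    return pairs


# Toy series: 40 "days" whose value is simply the day index.
daily_temps = [float(day) for day in range(40)]
windows = make_windows(daily_temps)
print(f"{len(windows)} training pairs")
print("first target:", windows[0][1])
```

A series shorter than `history + horizon` days yields no pairs, so locations with sparse history would need separate handling.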

---

### V4.0: Advanced Transformer with GRN (Final Model)

```
Input (30 days × 25 features)
        ↓
┌───────────────────────┐
│ Gated Residual Network│  ← Learns to skip irrelevant inputs
└───────────────────────┘
        ↓
┌───────────────────────┐
│  Positional Encoding  │  ← Injects sequence order
└───────────────────────┘
        ↓
┌───────────────────────┐
│  Transformer Encoder  │  ← 6 layers, 8 attention heads
└───────────────────────┘
        ↓
┌───────────────────────┐
│ Gated Residual Network│  ← Filters noise before output
└───────────────────────┘
        ↓
    7-Day Forecast
```
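
The "learns to skip" behaviour of a Gated Residual Network comes from a sigmoid gate on the transformed signal, followed by a residual connection. The dependency-free sketch below follows the GRN formulation from the Temporal Fusion Transformer (without the optional context input). It is an assumption that V4.0 uses this variant, and the class name, random initialisation, and layer sizes are illustrative only.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (weight rows, biases)


def linear(x: Vector, weights: List[Vector], biases: Vector) -> Vector:
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def elu(x: Vector) -> Vector:
    return [v if v > 0 else math.exp(v) - 1.0 for v in x]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def random_layer(dim: int, rng: random.Random) -> Layer:
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    return weights, [0.0] * dim


class GatedResidualNetwork:
    """Illustrative GRN on one feature vector (untrained, random weights)."""

    def __init__(self, dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden = random_layer(dim, rng)
        self.project = random_layer(dim, rng)
        self.gate = random_layer(dim, rng)
        self.value = random_layer(dim, rng)

    def _transform(self, x: Vector) -> Vector:
        return linear(elu(linear(x, *self.hidden)), *self.project)

    def gate_values(self, x: Vector) -> Vector:
        """Gate in (0, 1); values near 0 mean 'skip the transformed signal'."""
        return [sigmoid(g) for g in linear(self._transform(x), *self.gate)]

    def __call__(self, x: Vector) -> Vector:
        h = self._transform(x)
        gate = [sigmoid(g) for g in linear(h, *self.gate)]
        value = linear(h, *self.value)
        gated = [g * v for g, v in zip(gate, value)]
        # Residual connection: with a closed gate the output is just norm(x).
        return layer_norm([xi + gi for xi, gi in zip(x, gated)])


grn = GatedResidualNetwork(dim=4)
features = [0.5, -1.0, 2.0, 0.0]
print("gates: ", [round(g, 3) for g in grn.gate_values(features)])
print("output:", [round(v, 3) for v in grn(features)])
```

When the gate saturates near zero, the block reduces to a normalised copy of its input, which is what lets the model bypass a transformation that does not help.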

## 3. Training Configuration

| Parameter | Value |
|-----------|-------|
| Architecture | AdvancedWeatherTransformer |
| d_model | 128 |
| Attention Heads | 8 |
| Transformer Layers | 6 |
| Dropout | 0.15 |
| Parameters | 1,324,167 |
| Loss Function | HuberLoss (delta=1.0) |
| Optimizer | AdamW (lr=0.001) |
| Scheduler | CosineAnnealingWarmRestarts |
| Batch Size | 256 |
| Epochs | 44 (early stopped) |
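
Two entries in the table are easier to read with their formulas written out. The sketch below implements the Huber loss with `delta=1.0` and a cosine schedule with warm restarts, starting from the table's `lr=0.001`. The cycle length of 10 epochs, the minimum learning rate of 0, and the fixed (non-growing) cycle are assumptions, since the table does not list the scheduler's arguments.

```python
import math


def huber_loss(prediction: float, target: float, delta: float = 1.0) -> float:
    """Quadratic for small errors, linear beyond `delta` (robust to outliers)."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)


def cosine_warm_restart_lr(
    epoch: int,
    base_lr: float = 0.001,   # from the table
    cycle_length: int = 10,   # hypothetical: not given in the table
    min_lr: float = 0.0,      # hypothetical: not given in the table
) -> float:
    """Cosine decay from base_lr to min_lr, restarting every cycle_length epochs."""
    position = epoch % cycle_length
    cosine = math.cos(math.pi * position / cycle_length)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


# A 0.5°C miss is penalised quadratically, a 3°C miss only linearly.
print(huber_loss(20.5, 20.0), huber_loss(23.0, 20.0))

# The learning rate decays within a cycle, then jumps back at the restart.
for epoch in (0, 5, 9, 10):
    print(epoch, f"{cosine_warm_restart_lr(epoch):.6f}")
```

The linear tail of the Huber loss keeps a few badly mispredicted days from dominating the gradient, which is one plausible reason for choosing it over plain MSE here.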

## 4. Evaluation Results

### MAE by Forecast Day

![MAE Per Day](images/mae_per_day.png)

In [4]:
# Per-day MAE visualization (interactive version)
days = ['Day 1', 'Day 2', 'Day 3', 'Day 4', 'Day 5', 'Day 6', 'Day 7']
mae_per_day = [1.97, 1.95, 1.94, 1.94, 1.99, 2.05, 2.14]

fig = go.Figure(data=[
    go.Scatter(x=days, y=mae_per_day, mode='lines+markers+text', 
               text=[f'{m}°C' for m in mae_per_day], textposition='top center',
               line=dict(color='#4facfe', width=3),
               marker=dict(size=12))
])
fig.update_layout(
    title='📈 MAE by Forecast Day (V4 Model)',
    xaxis_title='Forecast Day',
    yaxis_title='MAE (°C)',
    template='plotly_dark',
    yaxis_range=[1.5, 2.5]
)
fig.show()

print("\n📊 Key Observation:")
print("   - Days 1-4: Very accurate (~1.95°C)")
print("   - Days 5-7: Slight degradation (~2.06°C)")
print("   - This is expected: uncertainty grows with forecast horizon")


📊 Key Observation:
   - Days 1-4: Very accurate (~1.95°C)
   - Days 5-7: Slight degradation (~2.06°C)
   - This is expected: uncertainty grows with forecast horizon
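
The group averages quoted in the observation can be reproduced directly from the per-day values. This short check uses the same numbers as the plot above. Treating the plain mean of the seven days as the overall MAE assumes each forecast day contributes the same number of test samples.

```python
from statistics import mean

# Per-day MAE values (°C), same as in the plot above.
mae_per_day = [1.97, 1.95, 1.94, 1.94, 1.99, 2.05, 2.14]

overall_mae = mean(mae_per_day)
early_mae = mean(mae_per_day[:4])   # days 1-4
late_mae = mean(mae_per_day[4:])    # days 5-7

print(f"Overall:  {overall_mae:.2f}°C")
print(f"Days 1-4: {early_mae:.2f}°C")
print(f"Days 5-7: {late_mae:.2f}°C")
print(f"Day 1 to day 7 increase: {mae_per_day[-1] - mae_per_day[0]:.2f}°C")
```

The overall mean rounds to the 2.00°C headline figure, and the spread between day 1 and day 7 is under 0.2°C.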


## 5. Key Learnings

### ✅ What Worked Well
1. **Sequence Modeling**: Moving from MLP to LSTM cut MAE by more than half (~4.5°C to 2.05°C)
2. **Gated Residual Networks**: Let the model weight input features selectively
3. **Real Weather Data**: Open-Meteo integration improved generalization
4. **Cyclical Encoding**: Sin/cos encoding of time features
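
Cyclical encoding maps a periodic value onto the unit circle so that the end of a cycle sits next to its start. The sketch below shows the idea for months. Which time features the project actually encodes this way (month, day of year, hour) is an assumption, and the helper names are illustrative.

```python
import math
from typing import Tuple


def cyclical_encode(value: float, period: float) -> Tuple[float, float]:
    """Return (sin, cos) coordinates of `value` on a cycle of length `period`."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encoded_distance(a: float, b: float, period: float) -> float:
    """Euclidean distance between two encoded values."""
    return math.dist(cyclical_encode(a, period), cyclical_encode(b, period))


# As raw integers, December (12) and January (1) are 11 apart.
# On the circle they are neighbours, unlike January and June.
print(f"Dec-Jan: {encoded_distance(12, 1, 12):.3f}")
print(f"Jan-Jun: {encoded_distance(1, 6, 12):.3f}")
```

Both the sine and cosine components are needed: either one alone maps two different points in the cycle to the same value.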

### ❌ What Didn't Work
1. **Adding more features naively**: V3 multivariate (2.07°C) was slightly worse than V2.3 (2.05°C)
2. **Larger batch sizes**: Beyond 256 hurt convergence
3. **Very deep models (>8 layers)**: Overfitting increased

---

## 🏁 Conclusion

We successfully built a **state-of-the-art weather forecasting system** that:
- Achieves **2.00°C MAE** (better than the 2.5°C target)
- Works for **any location on Earth**
- Provides **real-time predictions** via web app
- Is **production-ready** with Docker deployment