# 03 — Model Training & Evaluation
**Urban Energy Consumption Forecasting with LSTM**

This notebook trains the stacked LSTM, tracks its learning curves, and evaluates it
on the held-out test set, including a comparison against a naive persistence baseline.

Sections
1. Environment & data preparation
2. Model architecture overview
3. Training
4. Learning curves
5. Test-set metrics
6. Prediction visualisations
7. Horizon error profile
8. Comparison against persistence baseline
9. Export model
10. Summary

In [None]:
import sys, warnings, os
sys.path.insert(0, '..')
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf

from config import (
    RAW_DATA_FILE, MODEL_SAVE_PATH, LOOKBACK, HORIZON,
    EPOCHS, BATCH_SIZE, LEARNING_RATE, FEATURE_COLUMNS
)
from src.preprocess import DataPreprocessor
from src.model import build_lstm_model, train_model
from src.evaluate import ModelEvaluator, compute_all_metrics

sns.set_theme(style='whitegrid')
plt.rcParams.update({'figure.dpi': 120})

print(f'TensorFlow {tf.__version__}')
print(f'GPUs: {tf.config.list_physical_devices("GPU")}')

## 1 · Data Preparation
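
`DataPreprocessor` lives in `src/preprocess.py` and is not shown in this notebook. As a rough illustration of what a lookback/horizon split typically involves, the sketch below slices a `(time, features)` array into overlapping windows. The helper name `make_windows`, the `target_col` argument, and the toy data are hypothetical; the project's own implementation may scale, split, or order things differently.

```python
import numpy as np

def make_windows(series, lookback, horizon, target_col=0):
    """Hypothetical helper: slice a (time, features) array into model windows.

    X has shape (n_samples, lookback, n_features).
    y has shape (n_samples, horizon) and holds the target column only.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lookback - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for this lookback/horizon")
    # Each input window covers `lookback` consecutive time steps.
    X = np.stack([series[i : i + lookback] for i in range(n_samples)])
    # Each target covers the `horizon` steps that follow its input window.
    y = np.stack([
        series[i + lookback : i + lookback + horizon, target_col]
        for i in range(n_samples)
    ])
    return X, y

# Toy data: 10 time steps, 2 features (column 0 stands in for power).
toy = np.arange(20, dtype=float).reshape(10, 2)
X_toy, y_toy = make_windows(toy, lookback=4, horizon=2)
print(X_toy.shape, y_toy.shape)
```

The third axis of `X` is the feature count, which is why the architecture cell reads `n_features` from `X_train.shape[2]`.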

In [None]:
prep = DataPreprocessor(lookback=LOOKBACK, horizon=HORIZON)
splits = prep.run(raw_filepath=RAW_DATA_FILE, save=True)
X_train, X_val, X_test, y_train, y_val, y_test = splits

print(f'X_train : {X_train.shape}')
print(f'X_val   : {X_val.shape}')
print(f'X_test  : {X_test.shape}')

## 2 · Model Architecture
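
When reading the `model.summary()` output, it can help to check the parameter counts by hand. A standard LSTM layer with bias has four gates, each with an input kernel, a recurrent kernel, and a bias vector. The sketch below applies that formula to a two-layer stack with a dense output head. The layer sizes are hypothetical placeholders, not the values used by `build_lstm_model`.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters in one standard LSTM layer with bias."""
    # Four gates, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias (units).
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    """Trainable parameters in one fully connected layer with bias."""
    return input_dim * units + units

# Hypothetical sizes, chosen only for illustration.
n_features_demo, units_1, units_2, horizon_demo = 8, 64, 32, 24

layer_1 = lstm_param_count(n_features_demo, units_1)
layer_2 = lstm_param_count(units_1, units_2)   # fed by the first layer's outputs
head = dense_param_count(units_2, horizon_demo)
print(layer_1, layer_2, head, layer_1 + layer_2 + head)
```

Note that the lookback length does not appear in the formula: LSTM weights are shared across time steps, so a longer window increases compute but not parameter count.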

In [None]:
n_features = X_train.shape[2]
model = build_lstm_model(
    n_features=n_features,
    lookback=LOOKBACK,
    horizon=HORIZON,
    learning_rate=LEARNING_RATE,
)
model.summary()

## 3 · Training
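
`train_model` is defined in `src/model.py` and is not shown here. Section 9 states that the best checkpoint is saved during training; whether training also stops early is an assumption. Under that assumption, the pure-Python sketch below shows how a patience rule picks the stopping epoch and the checkpointed epoch from a sequence of validation losses. The function name and the loss values are hypothetical.

```python
def best_epoch_with_patience(val_losses, patience):
    """Hypothetical helper: return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to improve on
    the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: this is the epoch a checkpoint callback would keep.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses), best_epoch

# Made-up validation losses for illustration only.
stop, best = best_epoch_with_patience(
    [0.9, 0.6, 0.5, 0.55, 0.52, 0.51, 0.4], patience=3
)
print(f"stopped at epoch {stop}, best checkpoint from epoch {best}")
```

In this toy run the loss would have improved again at epoch 7, which illustrates the trade-off: a small patience saves time but can stop before a later improvement.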

In [None]:
history = train_model(
    model=model,
    X_train=X_train, y_train=y_train,
    X_val=X_val,     y_val=y_val,
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    save_path=MODEL_SAVE_PATH,
)
print('Training complete.')

## 4 · Learning Curves

In [None]:
hist = history.history
epochs_ran = range(1, len(hist['loss']) + 1)

fig, axes = plt.subplots(1, 2, figsize=(13, 4))

axes[0].plot(epochs_ran, hist['loss'],     label='Train Loss',  color='steelblue')
axes[0].plot(epochs_ran, hist['val_loss'], label='Val Loss',    color='darkorange', linestyle='--')
axes[0].set_title('MSE Loss')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].legend()

axes[1].plot(epochs_ran, hist['mae'],     label='Train MAE',  color='steelblue')
axes[1].plot(epochs_ran, hist['val_mae'], label='Val MAE',    color='darkorange', linestyle='--')
axes[1].set_title('Mean Absolute Error')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('MAE (scaled)')
axes[1].legend()

plt.tight_layout()
plt.show()

best_epoch = np.argmin(hist['val_loss']) + 1
print(f'Best epoch: {best_epoch}  |  val_loss = {min(hist["val_loss"]):.6f}')

## 5 · Test-Set Metrics
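
The summary table at the end of this notebook reports MAE, RMSE, and R². `compute_all_metrics` in `src/evaluate.py` is not shown, so the sketch below is only one plausible way to compute those three quantities with NumPy. The function name `sketch_metrics` and the sample values are hypothetical, and the project's function may return additional metrics.

```python
import numpy as np

def sketch_metrics(y_true, y_pred):
    """Hypothetical stand-in that returns MAE, RMSE, and R² as a dict."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_true - y_pred

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))

    # R² compares the model's squared error with that of predicting the mean.
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": float(mae), "RMSE": float(rmse), "R2": float(r2)}

# Made-up values for illustration only.
demo = sketch_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(demo)
```

MAE and RMSE are in the units of the target, so they are only interpretable as kW after the predictions have been inverse-scaled.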

In [None]:
evaluator = ModelEvaluator(model=model, preprocessor=prep)
metrics = evaluator.evaluate(X_test, y_test, verbose=True)

## 6 · Prediction Visualisations

In [None]:
_ = evaluator.plot_predictions(X_test, y_test, n_samples=336)  # 336 hourly steps = 2 weeks
plt.show()

In [None]:
_ = evaluator.plot_residuals(X_test, y_test)
plt.show()

## 7 · Horizon Error Profile
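
A horizon error profile shows how forecast error grows with lead time. `plot_horizon_errors` is not shown in this notebook; the sketch below assumes the profile is the MAE averaged over samples for each forecast step, which is one common definition. The helper name `horizon_mae` and the arrays are hypothetical.

```python
import numpy as np

def horizon_mae(y_true, y_pred):
    """Hypothetical helper: MAE per forecast step.

    Both inputs are shaped (n_samples, horizon); the result has length horizon.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.ndim != 2 or y_true.shape != y_pred.shape:
        raise ValueError("expected two arrays of shape (n_samples, horizon)")
    # Average over samples (axis 0) and keep one value per horizon step.
    return np.abs(y_true - y_pred).mean(axis=0)

# Made-up forecasts whose error grows with lead time.
truth = np.array([[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]])
preds = np.array([[10.0, 12.0, 15.0], [20.0, 20.0, 19.0]])
profile = horizon_mae(truth, preds)
print(profile)
```

Unlike the flattened metrics in Section 5, this view keeps the horizon axis, so it can reveal a model that is accurate one step ahead but degrades toward the end of the forecast window.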

In [None]:
_ = evaluator.plot_horizon_errors(X_test, y_test)
plt.show()

## 8 · Comparison Against Persistence Baseline

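A naive persistence model repeats the last observed power value across every step of the forecast horizon. It is a deliberately simple reference that any useful forecaster should beat.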

In [None]:
y_test_kw = prep.inverse_scale_y(y_test)
y_pred_kw = evaluator.predict(X_test)

# Persistence: y_pred = last observed value repeated HORIZON times
# Use the last step of the input window as the naïve forecast
X_test_orig_power = prep.scaler_X.inverse_transform(
    X_test[:, -1, :]
)[:, 0]   # power column

y_persist = np.tile(X_test_orig_power[:, np.newaxis], (1, HORIZON))

lstm_metrics  = compute_all_metrics(y_test_kw.ravel(), y_pred_kw.ravel())
naive_metrics = compute_all_metrics(y_test_kw.ravel(), y_persist.ravel())

comparison = pd.DataFrame({
    'LSTM (ours)': lstm_metrics,
    'Persistence baseline': naive_metrics
}).T

print('\n--- Model Comparison ---')
display(comparison.round(4))

improvement = (1 - lstm_metrics['MAE'] / naive_metrics['MAE']) * 100
print(f'\nLSTM reduces MAE by {improvement:.1f}% over the persistence baseline.')

## 9 · Export

The best checkpoint was saved automatically during training. This cell confirms the path and reports the file size.

In [None]:
from pathlib import Path

model_path = Path(MODEL_SAVE_PATH)  # works whether config stores a str or a Path
if model_path.exists():
    size_mb = model_path.stat().st_size / 1e6
    print(f'Model saved at : {MODEL_SAVE_PATH}')
    print(f'File size      : {size_mb:.2f} MB')
else:
    print('Model file not found — training may not have completed successfully.')

## 10 · Summary

| Metric | LSTM | Persistence |
|--------|------|-------------|
| MAE    | see Section 8 output | see Section 8 output |
| RMSE   | see Section 8 output | see Section 8 output |
| R²     | see Section 8 output | see Section 8 output |

The trained model is ready for deployment via `serve.py` (FastAPI).

```bash
python serve.py  # starts the REST endpoint on port 8000
```

Or with Docker:

```bash
docker compose --profile serve up api
```