# 4.0 - Mode Modeling (Exploratory)

**Important**: The labels used here are heuristic/pseudo-labels derived from simple rules. This notebook is for exploratory, reproducible baselining and should not be treated as production-grade modeling without proper ground truth and validation.

This notebook:
- Prepares a trip-level dataset with engineered features
- Splits the data chronologically when timestamps exist, and with a stratified random split otherwise
- Trains a simple baseline classifier (StandardScaler + LogisticRegression)
- Evaluates on a held-out test set
- Saves model and reports/figures to the outputs folder

In [None]:
%load_ext autoreload
%autoreload 2
import os
import json
import pandas as pd
from pathlib import Path

from src import modeling

OUTPUTS_DIR = "outputs"
FIG_PATH = os.path.join(OUTPUTS_DIR, "figures", "confusion_matrix_mode_model.png")
VAL_METRICS_JSON = os.path.join(OUTPUTS_DIR, "reports", "model_val_metrics.json")
TEST_REPORT_CSV = os.path.join(OUTPUTS_DIR, "reports", "model_test_classification_report.csv")
MODEL_PATH = os.path.join(OUTPUTS_DIR, "models", "mode_baseline.joblib")

os.makedirs(os.path.join(OUTPUTS_DIR, "figures"), exist_ok=True)
os.makedirs(os.path.join(OUTPUTS_DIR, "reports"), exist_ok=True)
os.makedirs(os.path.join(OUTPUTS_DIR, "models"), exist_ok=True)

print("Outputs will be saved under:", OUTPUTS_DIR)

## 1) Prepare dataset
This step loads the processed trips, then joins the heuristic labels if they are available; otherwise it computes them using the same thresholds as `src/mode_inference.py`.
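
The join-or-compute logic described above could look roughly like the sketch below. It is not the code in `src/modeling.py`: the column names (`trip_id`, `mean_speed_kmh`, `mode`), the helper names, and both speed thresholds are assumptions made for illustration, and the real thresholds are defined in `src/mode_inference.py`.

```python
import pandas as pd

# Hypothetical thresholds in km/h; the real values are in src/mode_inference.py
WALK_MAX_KMH = 7.0
BIKE_MAX_KMH = 25.0


def label_from_speed(mean_speed_kmh):
    """Assign a pseudo-label from mean trip speed using the placeholder thresholds."""
    if mean_speed_kmh < WALK_MAX_KMH:
        return "walk"
    if mean_speed_kmh < BIKE_MAX_KMH:
        return "bike"
    return "vehicle"


def attach_labels(trips, heuristic=None):
    """Join precomputed labels when given, otherwise derive them from speed."""
    if heuristic is not None:
        # Left join keeps every trip; validate raises if a trip_id is duplicated
        return trips.merge(
            heuristic[["trip_id", "mode"]],
            on="trip_id",
            how="left",
            validate="one_to_one",
        )
    out = trips.copy()
    out["mode"] = out["mean_speed_kmh"].apply(label_from_speed)
    return out


# Toy trips, not real data
trips = pd.DataFrame({"trip_id": [1, 2, 3], "mean_speed_kmh": [4.2, 16.0, 48.5]})
labeled = attach_labels(trips, heuristic=None)
print(labeled)
```

With a left join, trips that have no precomputed label end up with a missing `mode`, so it is worth checking for nulls before training.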

In [None]:
trips_df = modeling.prepare_trip_dataset(
    trips_path="data/processed/02_trips.parquet",
    heuristic_path="data/processed/06_trip_modes_heuristic.parquet",
)
trips_df.head()

## 2) Make splits
Uses a temporal split when timestamps exist, to avoid temporal leakage; otherwise falls back to a stratified random split.
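
As a rough illustration of the two strategies (not the actual `modeling.make_splits` implementation), the sketch below orders trips by time and cuts the most recent rows off for validation and test, or samples each class separately when no timestamp column is present. The function name `split_trips`, the column names `start_time` and `mode`, and the reading of `val_size` as a fraction of the full dataset are all assumptions.

```python
import pandas as pd


def split_trips(df, test_size=0.2, val_size=0.2, time_col="start_time",
                label_col="mode", random_state=42):
    """Return (train, val, test): chronological if time_col exists, else stratified."""
    if time_col in df.columns:
        # Oldest trips train the model; the newest trips are held out
        ordered = df.sort_values(time_col).reset_index(drop=True)
        n = len(ordered)
        n_test = int(round(n * test_size))
        n_val = int(round(n * val_size))
        train_end = n - n_val - n_test
        return (
            ordered.iloc[:train_end],
            ordered.iloc[train_end:train_end + n_val],
            ordered.iloc[train_end + n_val:],
        )
    # Fallback (assumes a unique index): sample per class to keep proportions similar
    test = df.groupby(label_col).sample(frac=test_size, random_state=random_state)
    rest = df.drop(test.index)
    # Rescale so val_size remains a fraction of the full dataset
    val = rest.groupby(label_col).sample(
        frac=val_size / (1 - test_size), random_state=random_state
    )
    train = rest.drop(val.index)
    return train, val, test


# Toy frame with ten daily trips
demo = pd.DataFrame({
    "start_time": pd.date_range("2024-01-01", periods=10, freq="D"),
    "mode": ["walk", "bike"] * 5,
})
train, val, test = split_trips(demo)
print(len(train), len(val), len(test))
```

In the chronological branch every training trip starts before every validation trip, and every validation trip before every test trip, which is the property that prevents the model from seeing the future.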

In [None]:
train_df, val_df, test_df = modeling.make_splits(trips_df, test_size=0.2, val_size=0.2, random_state=42)
print("Train:", train_df.shape, ", Val:", val_df.shape, ", Test:", test_df.shape)

## 3) Train baseline classifier
Trains StandardScaler + LogisticRegression and also computes a rule-based, speed-only baseline for comparison on the validation set.
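
For orientation, here is a minimal scikit-learn sketch of a scaler plus logistic regression compared against a speed-only rule. It is not what `modeling.train_baseline_classifier` does internally: the synthetic two-class data, the feature choice (mean speed, distance), and the `SPEED_CUTOFF_KMH` value are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)


def make_toy(n_per_class):
    """Synthetic trips with two features: mean speed (km/h) and distance (km)."""
    walk = np.column_stack([rng.normal(5, 1, n_per_class), rng.normal(1, 0.3, n_per_class)])
    vehicle = np.column_stack([rng.normal(40, 5, n_per_class), rng.normal(10, 3, n_per_class)])
    X = np.vstack([walk, vehicle])
    y = np.array(["walk"] * n_per_class + ["vehicle"] * n_per_class)
    return X, y


X_train, y_train = make_toy(100)
X_val, y_val = make_toy(30)

# Scaling is fitted on the training data only, inside the pipeline
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
model_pred = clf.predict(X_val)

# Rule-based comparison using a single hypothetical speed cut-off
SPEED_CUTOFF_KMH = 15.0
rule_pred = np.where(X_val[:, 0] < SPEED_CUTOFF_KMH, "walk", "vehicle")

for name, pred in [("logreg", model_pred), ("speed rule", rule_pred)]:
    acc = accuracy_score(y_val, pred)
    macro_f1 = f1_score(y_val, pred, average="macro")
    print(f"{name}: accuracy={acc:.3f}, macro_f1={macro_f1:.3f}")
```

Because the labels in this notebook are themselves derived from rules, a rule-based baseline can score very well; a learned model that only matches it has mostly recovered the labeling rule.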

In [None]:
# Baseline features only
model_baseline, val_metrics_baseline = modeling.train_baseline_classifier(train_df, val_df, feature_subset="baseline")
print("Validation metrics (baseline):", {k: val_metrics_baseline[k] for k in ["val_accuracy", "val_macro_f1"]})

# Extended features (if available in dataset)
model_extended, val_metrics_extended = modeling.train_baseline_classifier(train_df, val_df, feature_subset="extended")
print("Validation metrics (extended):", {k: val_metrics_extended[k] for k in ["val_accuracy", "val_macro_f1"]})

# Select the model to carry forward to test evaluation; the extended model is used by default
model = model_extended
val_metrics = val_metrics_extended

# Save the selected model and its validation metrics
modeling.save_artifacts(model, val_metrics, outputs_dir=OUTPUTS_DIR)
print("Saved model to:", MODEL_PATH)
print("Saved validation metrics to:", VAL_METRICS_JSON)

## 4) Evaluate on test set and save reports/figures
Generates a classification report CSV, a confusion matrix figure, and a calibration curve.
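
The tabular part of such a report can be built with scikit-learn and pandas roughly as follows. This is a sketch on made-up labels, not the body of `modeling.evaluate`; the class names and the variable names are assumptions.

```python
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels for illustration
y_true = ["walk", "walk", "bike", "vehicle", "vehicle", "bike"]
y_pred = ["walk", "bike", "bike", "vehicle", "vehicle", "bike"]
labels = ["walk", "bike", "vehicle"]

# output_dict=True returns nested dicts that convert cleanly to a DataFrame
report = classification_report(
    y_true, y_pred, labels=labels, output_dict=True, zero_division=0
)
report_df = pd.DataFrame(report).transpose()
csv_text = report_df.to_csv()  # pass a file path instead to write the CSV to disk

# Rows are true classes, columns are predicted classes, both in `labels` order
cm = confusion_matrix(y_true, y_pred, labels=labels)
cm_df = pd.DataFrame(cm, index=labels, columns=labels)

print(report_df.round(2))
print(cm_df)
```

Passing `labels` explicitly fixes the row and column order, so reports from different runs stay comparable even when a class is missing from one split.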

In [None]:
test_metrics = modeling.evaluate(model, test_df, outputs_dir=OUTPUTS_DIR, fig_name="confusion_matrix_mode_model.png")
print("Test metrics (summary):", {k: test_metrics[k] for k in ["test_accuracy", "test_macro_f1"]})
print("Saved classification report CSV to:", test_metrics["classification_report_csv"])
print("Saved confusion matrix figure to:", test_metrics["confusion_matrix_fig"])
print("Extended metrics JSON:", test_metrics.get("metrics_extended_json"))

## 4b) Calibration plot display (placeholder)

In [None]:
from IPython.display import Image, display
cal_fig_path = os.path.join(OUTPUTS_DIR, "figures", "mode_model_calibration.png")
if os.path.exists(cal_fig_path):
    display(Image(filename=cal_fig_path))
else:
    print("Calibration figure not found yet:", cal_fig_path)

## 5) Notes
- Labels are heuristic/pseudo-labels; treat results as exploratory.
- Avoid temporal leakage by using temporal splits and computing features strictly within trip windows.
- Possible improvements: more realistic features, validated labels, hyperparameter tuning, and robust cross-validation.

### Placeholder: PR curves / Ablation
- TODO: When ground-truth labels are available, compute per-class PR curves.
- TODO: Render an ablation summary comparing baseline vs extended features.
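
One possible shape for the ablation summary is a small table with one row per feature set and a delta against the first row. The sketch below assumes metrics dictionaries keyed by `val_accuracy` and `val_macro_f1`, as printed in section 3; the helper name `ablation_summary` is hypothetical and the numbers are placeholders, not results from this notebook.

```python
import pandas as pd


def ablation_summary(metrics_by_variant, keys=("val_accuracy", "val_macro_f1")):
    """Tabulate the chosen metrics per variant plus deltas against the first variant."""
    names = list(metrics_by_variant)
    table = pd.DataFrame(
        [{k: metrics_by_variant[n][k] for k in keys} for n in names],
        index=names,
    )
    # Subtract the first row (the reference variant) from every row
    deltas = table.sub(table.iloc[0], axis="columns").add_prefix("delta_")
    return pd.concat([table, deltas], axis=1)


# Placeholder numbers for illustration only
placeholder = {
    "baseline": {"val_accuracy": 0.50, "val_macro_f1": 0.25},
    "extended": {"val_accuracy": 0.75, "val_macro_f1": 0.50},
}
summary = ablation_summary(placeholder)
print(summary)
```

In this notebook the input would be something like `{"baseline": val_metrics_baseline, "extended": val_metrics_extended}`.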