In [None]:
# E2E: Accident Risk Prediction (CNN-BiLSTM-Attn + DeepSHAP)

Steps:
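1. Preprocess the UK and US datasets: merge, clean, encode, extract temporal parts, and select 32 features.
2. Train the CNN-BiLSTM-Attention model with SMOTE oversampling plus undersampling and 10-fold cross-validation.
3. Evaluate MAE on the raw predictions and classification metrics on predictions rounded to classes 1, 2, or 3.
4. Explain the trained model with DeepSHAP, then retrain using only the top-15 features from the SHAP ranking.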
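
The rounding in step 3 can be sketched in a few lines. This is a minimal illustration, not the code in `src/train.py`: the helper names `round_to_class` and `evaluate` are hypothetical, the sample values are made up, and round-half-up with clipping to the range 1 to 3 is an assumption about how ties and out-of-range outputs are handled.

```python
import math

def round_to_class(pred, lo=1, hi=3):
    """Round half up, then clip into [lo, hi] (assumed tie and range handling)."""
    return max(lo, min(hi, math.floor(pred + 0.5)))

def evaluate(y_true, y_pred):
    """Return MAE on raw predictions, accuracy on rounded classes, and the classes."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    classes = [round_to_class(p) for p in y_pred]
    accuracy = sum(c == t for c, t in zip(classes, y_true)) / n
    return mae, accuracy, classes

# Made-up labels and regression outputs, for illustration only.
y_true = [1, 2, 3, 2]
y_pred = [1.2, 2.5, 3.7, 1.4]
mae, accuracy, classes = evaluate(y_true, y_pred)
print(f"MAE={mae:.3f} accuracy={accuracy:.2f} classes={classes}")
```

Python's built-in `round` rounds halves to the nearest even integer (`round(2.5) == 2`), so the tie rule matters whenever predictions land exactly on a class boundary.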

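The top-15 selection in step 4 amounts to sorting features by their global SHAP score and keeping the first k. The sketch below assumes a ranking file with hypothetical columns `feature` and `mean_abs_shap`; the actual layout of `outputs/shap_global_ranking.csv` may differ, and the feature names and scores shown are invented for the demo.

```python
import csv
import io

def top_k_features(ranking_file, k, name_col="feature", score_col="mean_abs_shap"):
    """Return the k feature names with the largest score (column names are assumed)."""
    rows = list(csv.DictReader(ranking_file))
    rows.sort(key=lambda r: float(r[score_col]), reverse=True)
    return [r[name_col] for r in rows[:k]]

# Invented ranking data standing in for the real CSV.
demo_csv = io.StringIO(
    "feature,mean_abs_shap\n"
    "speed_limit,0.12\n"
    "hour,0.31\n"
    "weather,0.05\n"
    "road_type,0.20\n"
)
selected = top_k_features(demo_csv, k=2)
print(selected)
```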
Run the CLI scripts from this notebook with `!python ...`, or adapt the cells as needed.
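
One way to adapt the cells is to call the scripts through `subprocess` instead of `!python`. IPython's `!` syntax does not raise when a command exits with an error, so later cells can run against stale outputs. The helper below, `run_step`, is a hypothetical name and only a sketch. It uses `sys.executable` so the scripts run under the same interpreter as the notebook kernel.

```python
import subprocess
import sys

def run_step(script_args):
    """Run `python <script_args...>` with the kernel's interpreter; raise on failure."""
    cmd = [sys.executable, *script_args]
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)

# Stand-in command; in the notebook this would be something like
# run_step(["src/preprocess.py", "--config", "config/uk_config.yaml"]).
result = run_step(["-c", "print('ok')"])
print(result.stdout.strip())
```

With `capture_output=True` the script's output is held in `result.stdout` and `result.stderr` rather than streamed to the cell, so print it explicitly if you want to see the logs.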



In [None]:
# Preprocess UK
!python src/preprocess.py --config config/uk_config.yaml

# Train baseline
!python src/train.py --config config/cnn_bilstm_attn.yaml

# Explain and get SHAP ranking
!python src/explain_deepshap.py --checkpoint outputs/best.pt --background-samples 500 --batch 64

# Retrain with top-15 features
!python src/train.py --config config/cnn_bilstm_attn.yaml --topk 15 --shap_ranking outputs/shap_global_ranking.csv



In [None]:
# Analysis plots (annual, weekly, hourly)
!python src/analysis.py --processed_dir data/processed/uk --out outputs/analysis



In [None]:
# Tagged baseline and top-15 runs for side-by-side comparison
!python src/train.py --config config/cnn_bilstm_attn.yaml --tag baseline
!python src/explain_deepshap.py --checkpoint outputs/baseline/best.pt --background-samples 500 --batch 64 --tag baseline
!python src/train.py --config config/cnn_bilstm_attn.yaml --topk 15 --shap_ranking outputs/baseline/shap_global_ranking.csv --tag top15

