<a href="https://colab.research.google.com/github/Jessietbl/aviation-scsirisk-showcase/blob/main/02_timegan_vae_pca_scsi_clean.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# TimeGAN–VAE SCSI (Showcase Demo)

This demo reproduces the **Supply Chain Stress Index (SCSI)** pipeline with
a *lightweight* approach that is fast to run and easy to read.

**Storyline**
1) Load curated features (market, regulated, trade/cargo, calendar)  
2) (Stand-in) Generate *synthetic* market & regulated signals (TimeGAN/VAE slots)  
3) Combine + standardize → **PCA(1)** → **SCSI**  
4) Plot and summarize

> The heavy versions (true TimeGAN/VAE training) live in the private thesis repo.


In [None]:
# -- Imports & config (portable) --
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from src.scsi_demo import (
    prepare_base_frame,
    add_temporal_features,
    standin_timegan_market,
    standin_vae_regulated,
    assemble_feature_matrix,
    compute_scsi_pca,
    plot_scsi
)

DATA_DIR = Path("data")
OUT_DIR = Path("outputs")
OUT_DIR.mkdir(parents=True, exist_ok=True)

START_DATE = "2019-01-01"
END_DATE   = "2024-12-01"

np.random.seed(42)


In [None]:
# -- 1) Load curated data (sample) --
# Expected columns in sample_trade.csv: Period, exports, imports, total_trade, trade_balance, text
trade = pd.read_csv(DATA_DIR / "sample_trade.csv")
trade["Date"] = pd.to_datetime(trade["Period"])  # Period is YYYY-MM; parses to the first day of each month
trade = trade.set_index("Date").sort_index()

# Base monthly frame in the requested window
base = prepare_base_frame(start=START_DATE, end=END_DATE)

# Join demo trade features (column assignment aligns on the date index; unmatched months become NaN)
for col in ["exports","imports","total_trade","trade_balance"]:
    if col in trade.columns:
        base[f"{col}_trade"] = trade[col].astype(float)

# Optional: join any other external sources for the showcase here in the same way.
base = base.sort_index()
base.head()
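
The join in the cell above relies on pandas aligning on the index when a Series is assigned to a DataFrame column. The sketch below illustrates that behavior on toy data. It assumes `prepare_base_frame` returns a frame indexed by month-start dates (its implementation is not shown in this notebook), and `demo_base` and `demo_trade` are hypothetical stand-ins rather than the real inputs.

```python
import pandas as pd

# Hypothetical stand-in for the base frame: one row per month start.
demo_base = pd.DataFrame(index=pd.date_range("2019-01-01", "2019-04-01", freq="MS"))

# Hypothetical trade sample with a YYYY-MM Period column; March is missing on purpose.
demo_trade = pd.DataFrame(
    {"Period": ["2019-01", "2019-02", "2019-04"], "exports": [10, 12, 15]}
)
demo_trade["Date"] = pd.to_datetime(demo_trade["Period"], format="%Y-%m")
demo_trade = demo_trade.set_index("Date").sort_index()

# Column assignment aligns on the index; months absent from the sample become NaN.
demo_base["exports_trade"] = demo_trade["exports"].astype(float)
n_missing = int(demo_base["exports_trade"].isna().sum())
print(demo_base)
```

If the real sample has gaps like this, the resulting NaN values need handling (for example interpolation or dropping rows) before standardization and PCA; whether `assemble_feature_matrix` already does so is not visible from this notebook.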


In [None]:
# -- 2) Add calendar features (sin/cos month, policy dummy) --
base = add_temporal_features(base)
base.head()
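
The sin/cos month features mentioned in the cell above are a cyclical encoding: each month is placed on the unit circle so that December and January end up adjacent. The sketch below shows the standard form of that encoding. The helper name `month_sin_cos` is hypothetical, and the actual `add_temporal_features` may differ in detail (for instance in its phase offset). The policy dummy is left out because its date is not stated here.

```python
import math

def month_sin_cos(month: int) -> tuple[float, float]:
    """Map a calendar month (1-12) onto the unit circle."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)

jan = month_sin_cos(1)
jul = month_sin_cos(7)
dec = month_sin_cos(12)

# December sits next to January on the circle; July is diametrically opposite.
dist_dec_jan = math.dist(dec, jan)
dist_jul_jan = math.dist(jul, jan)
print(f"Dec-Jan distance: {dist_dec_jan:.3f}, Jul-Jan distance: {dist_jul_jan:.3f}")
```

Using both sine and cosine matters: a single sine column would give two different months the same value.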


In [None]:
# -- 3) Stand-in TimeGAN (market) & VAE (regulated) signals --
# These are lightweight synthetic generators that stand in for the TimeGAN and VAE models,
# so the notebook runs in seconds. Swap in real TimeGAN/VAE inference in the private repo.

market_cols = ["AsiaPacific", "AsiaPacific_rescaled", "AirFreightRate_Weekly", "AirFreightRate_Annual"]
reg_cols    = ["ron95","ron97","diesel","diesel_eastmsia"]

market_synth = standin_timegan_market(base.index, market_cols)
reg_synth    = standin_vae_regulated(base.index, reg_cols)

market_synth.head(), reg_synth.head()
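
For readers without access to `src.scsi_demo`, the sketch below gives one plausible shape for such a stand-in generator: a shared seasonal cycle plus a per-series random walk and a little noise. This is an assumption for illustration only. The function `demo_standin_signal` and its parameters are hypothetical and are not the implementation behind `standin_timegan_market` or `standin_vae_regulated`.

```python
import numpy as np

def demo_standin_signal(n_periods, n_series, seed=42):
    """Hypothetical stand-in: seasonal cycle + random-walk drift + noise, one column per series."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_periods)
    # Shared 12-month cycle, broadcast across all series.
    seasonal = np.sin(2.0 * np.pi * t / 12.0)[:, None]
    # Independent random walk per series.
    drift = np.cumsum(rng.normal(0.0, 0.1, size=(n_periods, n_series)), axis=0)
    # Small observation noise.
    noise = rng.normal(0.0, 0.05, size=(n_periods, n_series))
    return seasonal + drift + noise

# 72 monthly steps covers 2019-01 through 2024-12; 4 columns matches len(market_cols).
demo_market = demo_standin_signal(n_periods=72, n_series=4)
print(demo_market.shape)
```

A local `default_rng(seed)` keeps the sketch reproducible without touching NumPy's global random state.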


In [None]:
# -- 4) Assemble full feature matrix for SCSI (includes trade and calendar) --
X = assemble_feature_matrix(base, market_synth, reg_synth)
X.head()


In [None]:
# -- 5) PCA → SCSI (1D) with orientation safeguard --
scsi_df, pca, scaler, flipped = compute_scsi_pca(X)

print(f"Explained variance (PC1): {pca.explained_variance_ratio_[0]:.3f}")
print(f"Orientation flipped: {flipped}")
scsi_df.head()
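
The orientation safeguard exists because the sign of a principal component is arbitrary: PCA can return PC1 or its negative, so "higher index" might otherwise mean lower stress. The sketch below shows the idea using NumPy's SVD rather than whatever `compute_scsi_pca` uses internally. The flip rule (orient PC1 to correlate positively with the average standardized feature) is an assumption, and `demo_pc1_index` is a hypothetical name.

```python
import numpy as np

def demo_pc1_index(X):
    """Standardize columns, project onto PC1, and orient the result."""
    X = np.asarray(X, dtype=float)
    # z-score each column (assumes no constant columns).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[0]
    explained = s[0] ** 2 / np.sum(s ** 2)
    # Assumed orientation rule: flip if PC1 moves against the mean standardized feature.
    flipped = bool(np.corrcoef(scores, Z.mean(axis=1))[0, 1] < 0)
    if flipped:
        scores = -scores
    return scores, explained, flipped

# Toy data: five noisy copies of one shared "stress" factor.
rng = np.random.default_rng(0)
common = rng.normal(size=60)
X_demo = np.column_stack([common + 0.3 * rng.normal(size=60) for _ in range(5)])

scores, explained, flipped = demo_pc1_index(X_demo)
print(f"Explained variance (PC1): {explained:.3f}, flipped: {flipped}")
```

In the real pipeline the reference for orientation could instead be a single known stress proxy; the choice of reference should be stated wherever the index is interpreted.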


In [None]:
# -- 6) Plot & save --
figpath = OUT_DIR / "scsi_timeseries_demo.png"
plot_scsi(scsi_df, save_path=figpath)
figpath


## Notes
- This is the **showcase** version: fast to run, with minimal dependencies.
- To plug the real **TimeGAN** and **VAE** models back in:
  - Replace `standin_timegan_market()` with inference from the trained TimeGAN.
  - Replace `standin_vae_regulated()` with inference from the trained VAE.
  - Keep `assemble_feature_matrix()` and `compute_scsi_pca()` unchanged.