<a href="https://colab.research.google.com/github/sankeawthong/Project-1-Lita-Chatbot/blob/main/%5B20250921%5D%20LR-BiLSTM%20IDS%20Domain-Shift%20Pipeline%20(v3-fast%20PATCH%20PLUS).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

IDS Domain-Shift Pipeline (v3) — Leakage-safe + Expanded MQTT filter
--------------------------------------------------------------------
- Drops label-like columns from features to prevent leakage
- Aligns feature schemas between train and test (categorical one-hot + scaler from train)
- Expanded MQTT filtering for NF-ToN-IoT (checks DST and SRC ports; common fallbacks)
- Defaults to RESAMPLING="smote_tomek" and LOSS_MODE="focal"
- Keeps ablations: LR-only, BiLSTM-only, LR→BiLSTM; FGSM/PGD adversarial mixing for sequence models
- Metrics: Macro-F1, AUROC/PR-AUC, ECE/MCE, FPR@1e-3/1e-4; Per-class recall/F1; Reliability bins

Outputs under ./outputs_v3/

Author: Sine & Mentor

IDS Domain-Shift Pipeline (v3 • FIXED)
=====================================
Key fixes based on crash/stall & 0.0 val-accuracy symptoms:
- **Unified label mapping** (no LabelEncoder sorting surprises). We use a fixed
  CLASS2IDX for the 6 families across *all* domains & stages.
- **Hard guards for label range** and per-split class summaries (fail fast if a
  label falls outside 0..C-1).
- **Stable validation**: explicit internal train/val split instead of
  `validation_split`, avoiding misaligned batches when concatenating adversarial
  examples.
- **Empty MQTT split**: skip gracefully.
- **CPU-stable training**: default to CPU-only to avoid cuInit(303) & duplicate
  CUDA plugin registrations seen in notebooks/containers. Can be toggled.
- **Keras stability**: `workers=1`, `use_multiprocessing=False`,
  `categorical_accuracy` metric, and tf.data pipelines.
- **Benign index** based on CLASS2IDX (not list position); evaluation is
  consistent with model output ordering.

Outputs are written to ./outputs_v3/ as before.

IDS Domain-Shift Pipeline (v3 • FIXED, Keras3-compatible)
=========================================================
This revision fixes the crash:
  TypeError: TensorFlowTrainer.fit() got an unexpected keyword argument 'workers'

Key changes
-----------
- **Removed** unsupported `workers`/`use_multiprocessing` args (Keras 3).
- **Guarded resampling**: `SMOTETomek` is skipped for huge splits; use class
  weights or light undersampling instead to avoid stalls/OOM.
- **Adversarial cap**: upper-bound adversarial sample count to keep memory stable.
- **CPU-stable default** retained (toggle via CONFIG["FORCE_CPU"]).
- **Unified label mapping** and per-split summaries kept.

Outputs go to `./outputs_v3/`.

IDS Domain-Shift Pipeline (v3-fast PATCH PLUS)
==============================================
Adds on top of v3_fast_patch:
  • Per-family thresholds (aggregated attack decision)
  • Focal loss gamma tweak (γ=3.0 by default)
  • Per-family temperature scaling (coordinate-descent power scaling)
  • CORAL moment alignment in LR→BiLSTM feature space (unsupervised DA)

Outputs OVERWRITE the same files as v3_fast_patch for direct A/B:
  ./outputs_v3fast_patch/domain_shift_results_v3fast_patch.csv
  ./outputs_v3fast_patch/domain_shift_perclass_v3fast_patch.csv
  ./outputs_v3fast_patch/domain_shift_reliability_bins_v3fast_patch.csv

In [1]:
from __future__ import annotations
import os, time, warnings
from dataclasses import dataclass, asdict
from typing import List, Tuple, Dict

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.metrics import f1_score, recall_score, roc_auc_score, average_precision_score, roc_curve, log_loss
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

from imblearn.combine import SMOTETomek

import tensorflow as tf
from tensorflow.keras import layers, models, regularizers, callbacks
from tensorflow.keras.utils import to_categorical

from scipy.sparse import csr_matrix
from numpy.linalg import eig

warnings.filterwarnings("ignore")

In [2]:
# --------------------
# Config
# --------------------
CONFIG = {
    "NF_TON_IOT_CSV": "Dataset_NF-ToN-IoT.csv",
    "CIC_IOMT_TRAIN_CSV": "CIC_IoMT_2024_WiFi_MQTT_train.csv",
    "CIC_IOMT_TEST_CSV":  "CIC_IoMT_2024_WiFi_MQTT_test.csv",

    "NF_NUMERIC_CLASS_COL": "Class",
    "NF_TEXT_LABEL_COL": "Attack",
    "CIC_CLASS_COL": "Class",
    "CIC_TEXT_LABEL_COL": "label",

    "NF_DST_PORT_CANDIDATES": ["L4_DST_PORT", "Dst Port", "dst_port", "Destination Port", "dport", "dstport","DstPort"],
    "NF_SRC_PORT_CANDIDATES": ["L4_SRC_PORT", "Src Port", "src_port", "Source Port", "sport", "srcport","SrcPort"],
    "MQTT_PORTS": [1883, 8883],

    "SEED": 42,
    "EPOCHS": 30,
    "BATCH_SIZE": 128,
    "LEARNING_RATE": 1e-3,
    "PATIENCE": 5,

    "RESAMPLING": "smote_tomek",            # "none" | "smote_tomek"
    "LOSS_MODE": "focal",                   # "ce" | "class_balanced_ce" | "focal"
    "FOCAL_GAMMA": 3.0,                    # tweaked gamma
    "RESAMPLING_MAX_N": 200_000,
    "UNDERSAMPLE_MAX_PER_CLASS": 100_000,

    "USE_ADV_TRAINING": True,
    "ADV_METHOD": "fgsm",                   # "fgsm" | "pgd" (kept minimal for epoch budget)

    "FORCE_CPU": True,
    "OUTDIR": "./outputs_v3fast_patch",     # same as patch

    # Per-family calibration/thresholds
    "USE_PER_CLASS_TEMP": True,
    "TEMP_GRID": list(np.linspace(0.5, 2.0, 16)),   # coarse grid
    "TEMP_COORD_ITERS": 2,
    "USE_PER_FAMILY_THRESH": True,

    # CORAL
    "USE_CORAL_LR2BLSTM": True,
    "CORAL_EPS": 1e-3,

    # Categorical guard
    "CATEGORICAL_MAX_CARD": 50,
    "MAX_DUMMIES": 800
}

np.random.seed(CONFIG["SEED"])
tf.random.set_seed(CONFIG["SEED"])

if CONFIG["FORCE_CPU"]:
    os.environ.setdefault("CUDA_VISIBLE_DEVICES", "-1")
    try:
        tf.config.set_visible_devices([], 'GPU')
    except Exception:
        pass
os.makedirs(CONFIG["OUTDIR"], exist_ok=True)
os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "2")

'1'

In [3]:
# --------------------
# Families
# --------------------
FAMILIES = ["Benign", "DoS_DDoS", "Recon_Scan", "MQTT", "Spoof", "Other"]
CLASS2IDX = {n:i for i,n in enumerate(FAMILIES)}
IDX2CLASS = {i:n for n,i in CLASS2IDX.items()}
NUM_CLASSES = len(FAMILIES)
BENIGN_IDX = CLASS2IDX["Benign"]

In [4]:
# --------------------
# Helpers
# --------------------
def log(msg): print(f"[{time.strftime('%H:%M:%S')}] {msg}", flush=True)

def map_label_string_to_family(s: str) -> str:
    if not isinstance(s, str): return "Other"
    st = s.lower()
    if "benign" in st or st.strip()=="normal": return "Benign"
    if "mqtt" in st: return "MQTT"
    if "ddos" in st or "dos" in st: return "DoS_DDoS"
    if "scan" in st or "recon" in st or "portscan" in st: return "Recon_Scan"
    if ("spoof" in st) or ("mitm" in st) or ("impersonat" in st) or ("man-in-the-middle" in st) or ("arp" in st and "spoof" in st) or ("dns" in st and "spoof" in st):
        return "Spoof"
    return "Other"

def _first_col(df: pd.DataFrame, candidates: List[str]):
    for c in candidates:
        if c in df.columns: return c
    return None

def _coerce_port_col(s: pd.Series) -> pd.Series:
    if pd.api.types.is_numeric_dtype(s):
        return s.round().astype("Int64")
    try:
        return pd.to_numeric(s.astype(str).str.extract(r'(\d+)', expand=False), errors="coerce").astype("Int64")
    except Exception:
        return pd.Series([pd.NA]*len(s), dtype="Int64")

def mqtt_filter_nf(df: pd.DataFrame) -> pd.DataFrame:
    dcol = _first_col(df, CONFIG["NF_DST_PORT_CANDIDATES"])
    scol = _first_col(df, CONFIG["NF_SRC_PORT_CANDIDATES"])
    if dcol is None and scol is None:
        log("[WARN] No dst/src port columns found; using full NF-ToN-IoT.")
        return df.copy()
    mask = np.zeros(len(df), dtype=bool)
    ports = set(CONFIG["MQTT_PORTS"])
    if dcol is not None:
        d = _coerce_port_col(df[dcol]); mask |= d.isin(ports).fillna(False).to_numpy()
    if scol is not None:
        s = _coerce_port_col(df[scol]); mask |= s.isin(ports).fillna(False).to_numpy()
    f = df[mask]
    if len(f)==0:
        log("[WARN] MQTT filter produced empty set; using full NF-ToN-IoT.")
        return df.copy()
    return f.copy()

def _drop_label_like_columns(Xdf: pd.DataFrame) -> pd.DataFrame:
    KEYS = ['label','Label','labels','Labels','Class','class','Labal','labal','Attack','attack','Target','target','Family','family','Benign','benign']
    return Xdf.drop(columns=[c for c in Xdf.columns if any(k in c for k in KEYS)], errors='ignore')

def build_features_train_lowcard(df: pd.DataFrame, y_col: str):
    df = df.copy()
    y = df[y_col].values
    Xdf = _drop_label_like_columns(df.drop(columns=[y_col]))
    Xnum = Xdf.select_dtypes(include=[np.number]).copy().fillna(0)
    Xcat_raw = Xdf.select_dtypes(exclude=[np.number]).astype(str)
    keep = []
    if Xcat_raw.shape[1]>0:
        for c in Xcat_raw.columns:
            u = Xcat_raw[c].nunique(dropna=False)
            if 1 < u <= CONFIG["CATEGORICAL_MAX_CARD"]: keep.append(c)
    Xcat = Xcat_raw[keep] if keep else pd.DataFrame(index=Xdf.index)
    if Xcat.shape[1]>0: Xcat = pd.get_dummies(Xcat, dummy_na=False, drop_first=False)
    else: Xcat = pd.DataFrame(index=Xdf.index)
    if Xcat.shape[1] > CONFIG["MAX_DUMMIES"]:
        log(f"[INFO] Dropping categoricals (dummies={Xcat.shape[1]} > {CONFIG['MAX_DUMMIES']}).")
        Xcat = pd.DataFrame(index=Xdf.index)
    Xall = pd.concat([Xnum, Xcat], axis=1)
    scaler = StandardScaler().fit(Xall.values)
    X = scaler.transform(Xall.values)
    cols = Xall.columns.tolist()
    return X, y, scaler, cols

def build_features_apply_lowcard(df: pd.DataFrame, y_col: str, scaler: StandardScaler, cols_schema: list):
    df = df.copy(); y = df[y_col].values
    Xdf = _drop_label_like_columns(df.drop(columns=[y_col]))
    Xnum = Xdf.select_dtypes(include=[np.number]).copy().fillna(0)
    Xcat_raw = Xdf.select_dtypes(exclude=[np.number]).astype(str)
    keep = []
    if Xcat_raw.shape[1]>0:
        for c in Xcat_raw.columns:
            u = Xcat_raw[c].nunique(dropna=False)
            if 1 < u <= CONFIG["CATEGORICAL_MAX_CARD"]: keep.append(c)
    Xcat = Xcat_raw[keep] if keep else pd.DataFrame(index=Xdf.index)
    if Xcat.shape[1]>0: Xcat = pd.get_dummies(Xcat, dummy_na=False, drop_first=False)
    else: Xcat = pd.DataFrame(index=Xdf.index)
    Xall = pd.concat([Xnum, Xcat], axis=1).reindex(columns=cols_schema, fill_value=0)
    X = scaler.transform(Xall.values)
    return X, y

def encode_family_series(series: pd.Series) -> np.ndarray:
    vals = [CLASS2IDX[map_label_string_to_family(v)] for v in series.astype(str).values]
    return np.asarray(vals, dtype=np.int32)

In [5]:
# -------------
# Loss / Models
# -------------
def focal_loss(gamma=3.0, alpha=None):
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1-1e-7)
        ce = -y_true * tf.math.log(y_pred)
        if alpha is not None: ce = alpha * ce
        weight = tf.pow(1.0 - y_pred, gamma)
        return tf.reduce_sum(weight * ce, axis=1)
    return loss

def build_bilstm(input_shape, num_classes):
    return models.Sequential([
        layers.Bidirectional(layers.LSTM(64, return_sequences=True, kernel_regularizer=regularizers.l2(1e-4)), input_shape=input_shape),
        layers.Dropout(0.2),
        layers.Bidirectional(layers.LSTM(32, kernel_regularizer=regularizers.l2(1e-4))),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation="softmax", kernel_regularizer=regularizers.l2(1e-4))
    ])

In [6]:
# -------------
# Calibration & thresholds
# -------------
def temp_scale_probs(proba: np.ndarray, T: float) -> np.ndarray:
    p = np.clip(proba, 1e-12, 1.0)
    scaled = np.power(p, 1.0/float(T))
    scaled /= scaled.sum(axis=1, keepdims=True)
    return scaled

def temp_scale_probs_vec(proba: np.ndarray, T_vec: np.ndarray) -> np.ndarray:
    p = np.clip(proba, 1e-12, 1.0)
    scaled = np.power(p, 1.0/np.asarray(T_vec)[np.newaxis, :])
    scaled /= scaled.sum(axis=1, keepdims=True)
    return scaled

def fit_temperature_prob_grid(y_val: np.ndarray, proba_val: np.ndarray, grid=None) -> float:
    if grid is None: grid = np.linspace(0.5, 2.0, 31)
    best_T, best_nll = 1.0, np.inf
    for T in grid:
        pv = temp_scale_probs(proba_val, T)
        nll = log_loss(y_val, pv, labels=list(range(NUM_CLASSES)))
        if nll < best_nll: best_nll, best_T = nll, float(T)
    return best_T

def fit_per_class_temps(y_val: np.ndarray, proba_val: np.ndarray, grid=None, iters: int = 2) -> np.ndarray:
    """Coordinate-descent fit of per-class temperatures (power scaling)."""
    if grid is None: grid = CONFIG["TEMP_GRID"]
    T_vec = np.ones(NUM_CLASSES, dtype=float)
    for _ in range(iters):
        improved = False
        for c in range(NUM_CLASSES):
            best_tc, best_nll = T_vec[c], np.inf
            for tc in grid:
                Tv = T_vec.copy(); Tv[c] = tc
                pv = temp_scale_probs_vec(proba_val, Tv)
                nll = log_loss(y_val, pv, labels=list(range(NUM_CLASSES)))
                if nll < best_nll: best_nll, best_tc = nll, tc
            if not np.isclose(best_tc, T_vec[c]): improved = True
            T_vec[c] = float(best_tc)
        if not improved: break
    return T_vec

def tune_threshold(y_val_bin: np.ndarray, scores: np.ndarray, target_fpr: float) -> float:
    fpr, tpr, thr = roc_curve(y_val_bin, scores)[:3]
    idx = np.where(fpr <= target_fpr)[0]
    if len(idx)==0:
        j = int(np.argmin(np.abs(fpr - target_fpr)))
        return float(thr[j])
    return float(thr[idx[-1]])

def tune_per_family_thresholds(y_val: np.ndarray, proba_val: np.ndarray, target_fpr: float) -> Dict[int, float]:
    """Returns thresholds for each non-Benign class over its own score p(class)."""
    th = {}
    for c in range(NUM_CLASSES):
        if c == BENIGN_IDX: continue
        y_bin = (y_val == c).astype(int)
        sc = proba_val[:, c]
        if np.all(y_bin==0) or np.all(y_bin==1):
            th[c] = 1.1  # never fire if no negatives or no positives
        else:
            th[c] = tune_threshold((y_val==c).astype(int), sc, target_fpr)
    return th

def decision_from_per_family_thresholds(proba: np.ndarray, th: Dict[int, float]) -> np.ndarray:
    """Attack decision: positive if ANY non-benign class exceeds its threshold."""
    flags = []
    for c, t in th.items():
        if c == BENIGN_IDX: continue
        flags.append(proba[:, c] >= t)
    if len(flags)==0:
        return np.zeros(proba.shape[0], dtype=bool)
    return np.logical_or.reduce(flags)

In [7]:
# -------------
# Metrics (present-class AUROC/PR-AUC)
# -------------
def evaluate_multiclass_present(y_true: np.ndarray, proba: np.ndarray) -> Tuple[float, float, float, float]:
    macro_f1 = f1_score(y_true, proba.argmax(axis=1), average='macro')
    macro_rec = recall_score(y_true, proba.argmax(axis=1), average='macro')
    present = np.unique(y_true)
    if len(present) < 2:
        return macro_f1, macro_rec, float('nan'), float('nan')
    Yb = label_binarize(y_true, classes=present)
    proba_sub = proba[:, present]
    try: roc = roc_auc_score(Yb, proba_sub, average='macro', multi_class='ovr')
    except Exception: roc = float('nan')
    try: pr = average_precision_score(Yb, proba_sub, average='macro')
    except Exception: pr = float('nan')
    return macro_f1, macro_rec, roc, pr

In [8]:
# -------------
# Prob. expansion for LR-only
# -------------
def expand_proba_to_full(proba, classes, num_classes=NUM_CLASSES):
    proba = np.asarray(proba)
    full = np.zeros((proba.shape[0], num_classes), dtype=proba.dtype)
    classes = np.asarray(classes, dtype=int)
    full[:, classes] = proba
    row_sums = full.sum(axis=1, keepdims=True)
    zero_rows = (row_sums == 0)
    if np.any(zero_rows):
        full[zero_rows, BENIGN_IDX] = 1.0
    return full

In [9]:
# -------------
# CORAL alignment (whiten source, recolor to target)
# -------------
def _cov(X):
    Xc = X - X.mean(axis=0, keepdims=True)
    return (Xc.T @ Xc) / max(1, Xc.shape[0]-1)

def _sqrtm_psd(C, eps=1e-6):
    w, V = eig(C)
    w = np.real(w); V = np.real(V)
    w[w<0] = 0.0
    return (V * np.sqrt(w + eps)) @ V.T

def _invsqrtm_psd(C, eps=1e-6):
    w, V = eig(C)
    w = np.real(w); V = np.real(V)
    w[w<0] = 0.0
    return (V * (1.0/np.sqrt(w + eps))) @ V.T

def coral_fit_source_to_target(Xs: np.ndarray, Xt: np.ndarray, eps=1e-3):
    mu_s = Xs.mean(axis=0, keepdims=True)
    mu_t = Xt.mean(axis=0, keepdims=True)
    Cs = _cov(Xs) + eps*np.eye(Xs.shape[1])
    Ct = _cov(Xt) + eps*np.eye(Xt.shape[1])
    As = _invsqrtm_psd(Cs, eps)
    Bt = _sqrtm_psd(Ct, eps)
    # map: (x - mu_s) @ As @ Bt + mu_t
    return mu_s, mu_t, As, Bt

def coral_transform(X: np.ndarray, mu_s, mu_t, As, Bt):
    return (X - mu_s) @ As @ Bt + mu_t

In [10]:
# -------------
# Results schema
# -------------
@dataclass
class RunResult:
    setting: str
    model_name: str
    use_adv: bool
    resampling: str
    loss_mode: str
    macro_f1: float
    macro_recall: float
    roc_auc_ovr: float
    pr_auc_ovr: float
    ece: float
    mce: float
    fpr1e3: float
    tpr_at_fpr1e3: float
    fpr1e4: float
    tpr_at_fpr1e4: float
    # per-family tuned decision metrics
    fpr1e3_pf: float
    tpr_at_fpr1e3_pf: float
    fpr1e4_pf: float
    tpr_at_fpr1e4_pf: float

In [11]:
# --------------------
# Main
# --------------------
def main():
    tf.keras.backend.clear_session()

    log("Loading CSVs...")
    nf = pd.read_csv(CONFIG["NF_TON_IOT_CSV"])
    cic_tr = pd.read_csv(CONFIG["CIC_IOMT_TRAIN_CSV"])
    cic_te = pd.read_csv(CONFIG["CIC_IOMT_TEST_CSV"])

    log("Mapping families...")
    nf_family_all = nf.get(CONFIG["NF_TEXT_LABEL_COL"], pd.Series(["Other"]*len(nf))).astype(str).apply(map_label_string_to_family)
    cic_tr_family = cic_tr.get(CONFIG["CIC_TEXT_LABEL_COL"], pd.Series(["Other"]*len(cic_tr))).astype(str).apply(map_label_string_to_family)
    cic_te_family = cic_te.get(CONFIG["CIC_TEXT_LABEL_COL"], pd.Series(["Other"]*len(cic_te))).astype(str).apply(map_label_string_to_family)

    cic_tr2 = cic_tr.copy(); cic_tr2["Family"] = cic_tr_family.values
    cic_te2 = cic_te.copy(); cic_te2["Family"] = cic_te_family.values

    log("Applying MQTT filter to NF-ToN-IoT...")
    nf_mqtt = mqtt_filter_nf(nf)
    nf2 = nf_mqtt.copy()
    nf2["Family"] = nf_family_all.loc[nf_mqtt.index].values

    # splits
    y_nf_enc = encode_family_series(nf2["Family"])
    log("Splitting NF train/test...")
    nf_train, nf_test = train_test_split(nf2, test_size=0.2, random_state=CONFIG["SEED"], stratify=y_nf_enc)

    results = []
    perclass_rows = []
    relbin_rows = []

    def reliability_bins(probs: np.ndarray, y_true: np.ndarray, n_bins: int = 15):
        conf = probs.max(axis=1); preds = probs.argmax(axis=1)
        correct = (preds==y_true).astype(int)
        bins = np.linspace(0.0, 1.0, n_bins+1)
        ece = mce = 0.0; rows = []
        for i in range(n_bins):
            lo, hi = bins[i], bins[i+1]
            idx = np.where((conf>=lo) & (conf<hi))[0]
            if len(idx)==0:
                rows.append((0.5*(lo+hi), np.nan, 0)); continue
            acc = correct[idx].mean(); conf_mean = conf[idx].mean()
            gap = abs(acc-conf_mean)
            ece += (len(idx)/len(conf))*gap; mce = max(mce, gap)
            rows.append((conf_mean, acc, len(idx)))
        return ece, mce, rows

    def fpr_tpr_from_decisions(y_true_bin: np.ndarray, y_pred_bin: np.ndarray):
        fp = np.sum((y_pred_bin==1) & (y_true_bin==0))
        tp = np.sum((y_pred_bin==1) & (y_true_bin==1))
        tn = np.sum((y_pred_bin==0) & (y_true_bin==0))
        fn = np.sum((y_pred_bin==0) & (y_true_bin==1))
        fpr = fp / (fp + tn + 1e-12); tpr = tp / (tp + fn + 1e-12)
        return float(fpr), float(tpr)

    def run_setting(train_domain: str, test_domain: str, model_kind: str, use_adv: bool) -> RunResult:
        log(f"=== Setting: {train_domain} → {test_domain} | {model_kind} | adv={use_adv} ===")

        df_tr = cic_tr2 if train_domain=="IoMT" else nf_train
        df_te = cic_te2 if test_domain=="IoMT" else (nf_test if (train_domain=="IoT" and test_domain=="IoT") else nf2)

        df_tr = df_tr.copy(); df_te = df_te.copy()
        df_tr["y_enc"] = encode_family_series(df_tr["Family"])
        df_te["y_enc"] = encode_family_series(df_te["Family"])

        # Features
        log("Building features (train)...")
        Xtr, ytr, scaler, cols_schema = build_features_train_lowcard(df_tr.drop(columns=["Family"]).rename(columns={"y_enc":"Target"}), "Target")
        log(f"Train matrix: {Xtr.shape}")

        log("Building features (test)...")
        Xte, yte = build_features_apply_lowcard(df_te.drop(columns=["Family"]).rename(columns={"y_enc":"Target"}), "Target", scaler, cols_schema)
        log(f"Test  matrix: {Xte.shape}")

        # Validation split for calibration/threshold tuning
        Xtr_main, Xval, ytr_main, yval = train_test_split(Xtr, ytr, test_size=0.2, random_state=CONFIG["SEED"], stratify=ytr)

        # Resampling (guarded) on training MAIN only
        if CONFIG["RESAMPLING"]=="smote_tomek" and len(ytr_main) <= CONFIG["RESAMPLING_MAX_N"]:
            log("Applying SMOTE-Tomek on training-main...")
            Xtr_main, ytr_main = SMOTETomek(random_state=CONFIG["SEED"]).fit_resample(Xtr_main, ytr_main)
            log(f"Resampled train-main: {Xtr_main.shape}")
        elif CONFIG["RESAMPLING"]=="smote_tomek":
            log("[INFO] Skipping SMOTE-Tomek (too many samples).")

        # ---- Train per branch
        # Temperature/threshold defaults
        T_global = 1.0; T_vec = np.ones(NUM_CLASSES, dtype=float)
        thr_1e3 = thr_1e4 = None
        thr_pf_1e3 = {}; thr_pf_1e4 = {}

        if model_kind=="LR-only":
            log("Fitting LR (saga) on train-main...")
            lr = LogisticRegression(max_iter=1000, multi_class='multinomial', solver='saga', n_jobs=-1)
            lr.fit(csr_matrix(Xtr_main), ytr_main)

            proba_val_raw = lr.predict_proba(csr_matrix(Xval))
            proba_val = expand_proba_to_full(proba_val_raw, lr.classes_, NUM_CLASSES)

            # Temps
            T_global = fit_temperature_prob_grid(yval, proba_val)
            proba_val_T = temp_scale_probs(proba_val, T_global)
            if CONFIG["USE_PER_CLASS_TEMP"]:
                T_vec = fit_per_class_temps(yval, proba_val_T, grid=CONFIG["TEMP_GRID"], iters=CONFIG["TEMP_COORD_ITERS"])
                proba_val_T = temp_scale_probs_vec(proba_val, T_vec)
            else:
                T_vec = np.ones(NUM_CLASSES) * T_global

            # Global attack threshold (Benign score)
            atk_val = 1.0 - proba_val_T[:, BENIGN_IDX]
            yval_bin = (yval != BENIGN_IDX).astype(int)
            thr_1e3 = tune_threshold(yval_bin, atk_val, 1e-3)
            thr_1e4 = tune_threshold(yval_bin, atk_val, 1e-4)

            # Per-family thresholds
            if CONFIG["USE_PER_FAMILY_THRESH"]:
                thr_pf_1e3 = tune_per_family_thresholds(yval, proba_val_T, 1e-3)
                thr_pf_1e4 = tune_per_family_thresholds(yval, proba_val_T, 1e-4)

            # Test
            proba_te_raw = lr.predict_proba(csr_matrix(Xte))
            proba_te = expand_proba_to_full(proba_te_raw, lr.classes_, NUM_CLASSES)
            proba_te = temp_scale_probs_vec(proba_te, T_vec)

        else:
            # Sequence prep
            ytr_oh = to_categorical(ytr_main, num_classes=NUM_CLASSES)
            yval_oh = to_categorical(yval, num_classes=NUM_CLASSES)

            if CONFIG["LOSS_MODE"]=="class_balanced_ce":
                classes = np.arange(NUM_CLASSES)
                weights = compute_class_weight(class_weight='balanced', classes=classes, y=ytr_main)
                cw = {int(c): float(w) for c, w in zip(classes, weights)}
                loss_fn = tf.keras.losses.CategoricalCrossentropy()
            elif CONFIG["LOSS_MODE"]=="focal":
                cw=None; loss_fn=focal_loss(gamma=CONFIG["FOCAL_GAMMA"])
            else:
                cw=None; loss_fn=tf.keras.losses.CategoricalCrossentropy()

            if model_kind=="BiLSTM-only":
                # Use raw tabular features as 1-timestep sequence
                Xtr_seq = Xtr_main.reshape((Xtr_main.shape[0],1,Xtr_main.shape[1]))
                Xval_seq = Xval.reshape((Xval.shape[0],1,Xval.shape[1]))
                Xte_seq  = Xte.reshape((Xte.shape[0],1,Xte.shape[1]))
                model = build_bilstm(Xtr_seq.shape[1:], NUM_CLASSES)
                Xtrain_in, ytrain_in = Xtr_seq, ytr_oh

            elif model_kind=="LR->BiLSTM":
                log("Pre-fitting LR for LR->BiLSTM features...")
                lr = LogisticRegression(max_iter=1000, multi_class='multinomial', solver='saga', n_jobs=-1)
                lr.fit(csr_matrix(Xtr_main), ytr_main)
                Xlr_tr = lr.predict_proba(Xtr_main)
                Xlr_val = lr.predict_proba(Xval)
                Xlr_te  = lr.predict_proba(Xte)

                # CORAL alignment (source -> target) in LR-proba space
                if CONFIG["USE_CORAL_LR2BLSTM"]:
                    log("Applying CORAL alignment in LR→BiLSTM path...")
                    mu_s, mu_t, As, Bt = coral_fit_source_to_target(Xlr_tr, Xlr_te, eps=CONFIG["CORAL_EPS"])
                    Xlr_tr = coral_transform(Xlr_tr, mu_s, mu_t, As, Bt)
                    Xlr_val = coral_transform(Xlr_val, mu_s, mu_t, As, Bt)
                    # Xlr_te left as target (identity)

                Xtr_seq = Xlr_tr[:, np.newaxis, :]
                Xval_seq = Xlr_val[:, np.newaxis, :]
                Xte_seq  = Xlr_te[:,  np.newaxis, :]
                model = build_bilstm(Xtr_seq.shape[1:], NUM_CLASSES)
                Xtrain_in, ytrain_in = Xtr_seq, ytr_oh
            else:
                raise ValueError("Unknown model_kind")

            model.compile(optimizer=tf.keras.optimizers.Adam(CONFIG["LEARNING_RATE"]), loss=loss_fn, metrics=['categorical_accuracy'])
            es = callbacks.EarlyStopping(patience=CONFIG["PATIENCE"], restore_best_weights=True, monitor='val_loss')

            # Optional brief warmup to stabilize before main training
            if CONFIG["USE_ADV_TRAINING"] and use_adv and model_kind!="LR-only":
                log("Warmup (3 epochs) before main training...")
                model.fit(Xtrain_in, ytrain_in, validation_data=(Xval_seq, yval_oh), epochs=3, batch_size=CONFIG["BATCH_SIZE"], callbacks=[es], verbose=0)

            log("Training sequence model...")
            model.fit(Xtrain_in, ytrain_in, validation_data=(Xval_seq, yval_oh), epochs=CONFIG["EPOCHS"], batch_size=CONFIG["BATCH_SIZE"], callbacks=[es], verbose=2)

            log("Validating for temperature/thresholds...")
            proba_val = model.predict(Xval_seq, batch_size=CONFIG["BATCH_SIZE"], verbose=0)
            # Global then per-class temps
            T_global = fit_temperature_prob_grid(yval, proba_val)
            proba_val_T = temp_scale_probs(proba_val, T_global)
            if CONFIG["USE_PER_CLASS_TEMP"]:
                T_vec = fit_per_class_temps(yval, proba_val_T, grid=CONFIG["TEMP_GRID"], iters=CONFIG["TEMP_COORD_ITERS"])
                proba_val_T = temp_scale_probs_vec(proba_val, T_vec)
            else:
                T_vec = np.ones(NUM_CLASSES) * T_global

            # Global thresholds (Benign-vs-Attack)
            atk_val = 1.0 - proba_val_T[:, BENIGN_IDX]
            yval_bin = (yval != BENIGN_IDX).astype(int)
            thr_1e3 = tune_threshold(yval_bin, atk_val, 1e-3)
            thr_1e4 = tune_threshold(yval_bin, atk_val, 1e-4)

            # Per-family thresholds
            if CONFIG["USE_PER_FAMILY_THRESH"]:
                thr_pf_1e3 = tune_per_family_thresholds(yval, proba_val_T, 1e-3)
                thr_pf_1e4 = tune_per_family_thresholds(yval, proba_val_T, 1e-4)

            log("Predicting on test...")
            proba_te = model.predict(Xte_seq, batch_size=CONFIG["BATCH_SIZE"], verbose=0)
            proba_te = temp_scale_probs_vec(proba_te, T_vec)

        # -------- Metrics (present-class AUROC/PR, calibrated) --------
        log("Scoring...")
        yte_enc = df_te["y_enc"].values.astype(int)
        macro_f1, macro_rec, roc_ovr, pr_ovr = evaluate_multiclass_present(yte_enc, proba_te)

        # Reliability
        ece, mce, rel = reliability_bins(proba_te, yte_enc, n_bins=15)

        # Traditional low-FPR using tuned global thresholds
        atk_scores_test = 1.0 - proba_te[:, BENIGN_IDX]
        y_bin_test = (yte_enc != BENIGN_IDX).astype(int)

        def fpr_tpr_at_threshold(y_true_bin: np.ndarray, atk_scores: np.ndarray, thr: float):
            y_pred = (atk_scores >= thr).astype(int)
            fp = np.sum((y_pred==1) & (y_true_bin==0))
            tp = np.sum((y_pred==1) & (y_true_bin==1))
            tn = np.sum((y_pred==0) & (y_true_bin==0))
            fn = np.sum((y_pred==0) & (y_true_bin==1))
            fpr = fp / (fp + tn + 1e-12)
            tpr = tp / (tp + fn + 1e-12)
            return float(fpr), float(tpr)

        fpr1e3, tpr1e3 = fpr_tpr_at_threshold(y_bin_test, atk_scores_test, thr_1e3)
        fpr1e4, tpr1e4 = fpr_tpr_at_threshold(y_bin_test, atk_scores_test, thr_1e4)

        # Per-family tuned composite decision
        if CONFIG["USE_PER_FAMILY_THRESH"]:
            y_pred_pf_1e3 = decision_from_per_family_thresholds(proba_te, thr_pf_1e3).astype(int)
            y_pred_pf_1e4 = decision_from_per_family_thresholds(proba_te, thr_pf_1e4).astype(int)
            fpr1e3_pf, tpr1e3_pf = fpr_tpr_from_decisions(y_bin_test, y_pred_pf_1e3)
            fpr1e4_pf, tpr1e4_pf = fpr_tpr_from_decisions(y_bin_test, y_pred_pf_1e4)
        else:
            fpr1e3_pf = tpr1e3_pf = fpr1e4_pf = tpr1e4_pf = np.nan

        # Per-class table
        y_pred = proba_te.argmax(axis=1)
        for c in range(NUM_CLASSES):
            idxs = np.where(yte_enc==c)[0]
            if len(idxs)==0:
                rec=f1= np.nan; sup=0
            else:
                tp=int(np.sum(y_pred[idxs]==c)); fn=int(len(idxs)-tp); fp=int(np.sum((y_pred==c)&(yte_enc!=c)))
                rec = tp/(tp+fn) if (tp+fn)>0 else 0.0
                prec = tp/(tp+fp) if (tp+fp)>0 else 0.0
                f1 = 2*prec*rec/(prec+rec) if (prec+rec)>0 else 0.0
                sup=len(idxs)
            perclass_rows.append({"setting": f"{train_domain}->{test_domain}", "model_name": model_kind, "use_adv": use_adv, "class": IDX2CLASS[c], "recall": float(rec), "f1": float(f1), "support": int(sup)})

        for (cm, ac, ct) in rel:
            relbin_rows.append({"setting": f"{train_domain}->{test_domain}", "model_name": model_kind, "use_adv": use_adv, "conf_mean": float(0.0 if cm!=cm else cm), "acc": float(0.0 if ac!=ac else ac), "count": int(ct)})

        res = RunResult(setting=f"{train_domain}->{test_domain}", model_name=model_kind, use_adv=use_adv,
                        resampling=CONFIG["RESAMPLING"], loss_mode=CONFIG["LOSS_MODE"],
                        macro_f1=float(macro_f1), macro_recall=float(macro_rec),
                        roc_auc_ovr=float(roc_ovr) if not np.isnan(roc_ovr) else np.nan,
                        pr_auc_ovr=float(pr_ovr) if not np.isnan(pr_ovr) else np.nan,
                        ece=float(ece), mce=float(mce),
                        fpr1e3=float(fpr1e3), tpr_at_fpr1e3=float(tpr1e3),
                        fpr1e4=float(fpr1e4), tpr_at_fpr1e4=float(tpr1e4),
                        fpr1e3_pf=float(fpr1e3_pf), tpr_at_fpr1e3_pf=float(tpr1e3_pf),
                        fpr1e4_pf=float(fpr1e4_pf), tpr_at_fpr1e4_pf=float(tpr1e4_pf))
        log(f"Done setting: {res}")
        return res

    all_results = []
    for model_kind in ["LR-only", "BiLSTM-only", "LR->BiLSTM"]:
        all_results.append(run_setting("IoMT","IoMT",model_kind, use_adv=False))
        all_results.append(run_setting("IoT","IoT",model_kind, use_adv=False))
        all_results.append(run_setting("IoMT","IoT",model_kind, use_adv=(model_kind!="LR-only")))
        all_results.append(run_setting("IoT","IoMT",model_kind, use_adv=(model_kind!="LR-only")))

    df = pd.DataFrame([asdict(r) for r in all_results])
    out_csv = os.path.join(CONFIG["OUTDIR"], "domain_shift_results_v3fast_patch.csv")
    df.to_csv(out_csv, index=False)

    out_pc = os.path.join(CONFIG["OUTDIR"], "domain_shift_perclass_v3fast_patch.csv")
    pd.DataFrame(perclass_rows).to_csv(out_pc, index=False)

    out_rel = os.path.join(CONFIG["OUTDIR"], "domain_shift_reliability_bins_v3fast_patch.csv")
    pd.DataFrame(relbin_rows).to_csv(out_rel, index=False)

    log(f"Saved: {out_csv}")
    log(f"Saved: {out_pc}")
    log(f"Saved: {out_rel}")
    print(df.to_string(index=False))

if __name__ == "__main__":
    main()

[12:14:35] Loading CSVs...
[12:14:39] Mapping families...
[12:14:40] Applying MQTT filter to NF-ToN-IoT...
[12:14:40] [WARN] MQTT filter produced empty set; using full NF-ToN-IoT.
[12:14:41] Splitting NF train/test...
[12:14:41] === Setting: IoMT → IoMT | LR-only | adv=False ===
[12:14:42] Building features (train)...
[12:14:43] Train matrix: (1048575, 45)
[12:14:43] Building features (test)...
[12:14:44] Test  matrix: (1048575, 45)
[12:14:44] [INFO] Skipping SMOTE-Tomek (too many samples).
[12:14:44] Fitting LR (saga) on train-main...
[12:33:00] Scoring...
[12:33:01] Done setting: RunResult(setting='IoMT->IoMT', model_name='LR-only', use_adv=False, resampling='smote_tomek', loss_mode='focal', macro_f1=0.3536948793563853, macro_recall=0.3987351788913426, roc_auc_ovr=0.47347853255758415, pr_auc_ovr=0.3860086366087746, ece=0.04863784621679577, mce=0.7321589558620717, fpr1e3=0.0, tpr_at_fpr1e3=0.0, fpr1e4=0.0, tpr_at_fpr1e4=0.0, fpr1e3_pf=1.0, tpr_at_fpr1e3_pf=1.0, fpr1e4_pf=0.98867232164

In [12]:
import os
import zipfile
from google.colab import files

def zip_and_download_results():
    output_dir = CONFIG["OUTDIR"]
    csv_files = [
        os.path.join(output_dir, "domain_shift_results_v3fast_patch.csv"),
        os.path.join(output_dir, "domain_shift_perclass_v3fast_patch.csv"),
        os.path.join(output_dir, "domain_shift_reliability_bins_v3fast_patch.csv")
    ]
    zip_filename = os.path.join(output_dir, "domain_shift_results_v3fast_patch(Cali).zip")

    print("\nZipping and downloading results...")

    try:
        with zipfile.ZipFile(zip_filename, 'w') as zipf:
            for file in csv_files:
                if os.path.exists(file):
                    zipf.write(file, os.path.basename(file))
                else:
                    print(f"Warning: File not found and will not be included in the zip: {file}")

        if os.path.exists(zip_filename):
            files.download(zip_filename)
            print("Download complete.")
        else:
            print("Zip file was not created.")

    except Exception as e:
        print(f"An error occurred during zipping or downloading: {e}")

# Call the download function
zip_and_download_results()


Zipping and downloading results...


<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

Download complete.
