
IDS Domain-Shift Pipeline (v3) — Leakage-safe + Expanded MQTT filter
--------------------------------------------------------------------
- Drops label-like columns from features to prevent leakage
- Aligns feature schemas between train and test (categorical one-hot + scaler from train)
- Expanded MQTT filtering for NF-ToN-IoT (checks DST and SRC ports; common fallbacks)
- Defaults to RESAMPLING="smote_tomek" and LOSS_MODE="focal"
- Keeps ablations: LR-only, BiLSTM-only, LR→BiLSTM; FGSM/PGD adversarial mixing for sequence models
- Metrics: Macro-F1, AUROC/PR-AUC, ECE/MCE, FPR@1e-3/1e-4; Per-class recall/F1; Reliability bins

Outputs under ./outputs_v3/

Author: Sine & Mentor
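
The schema alignment listed above (one-hot columns and scaling statistics fixed on the training split, then reused on the test split) can be sketched on toy data. This is a minimal illustration rather than the pipeline's own code: the two DataFrames and the `bytes`/`proto` column names are made up, and plain mean/std scaling stands in for `StandardScaler`.

```python
import numpy as np
import pandas as pd

# Toy frames; column names and values are hypothetical.
train = pd.DataFrame({"bytes": [10.0, 20.0, 30.0], "proto": ["tcp", "udp", "tcp"]})
test = pd.DataFrame({"bytes": [15.0, 25.0], "proto": ["icmp", "tcp"]})

# Fit the column schema and the scaling statistics on the training split only.
train_x = pd.get_dummies(train, columns=["proto"], dtype=float)
schema = train_x.columns.tolist()
mean = train_x.values.mean(axis=0)
std = train_x.values.std(axis=0)
std[std == 0] = 1.0  # guard against constant columns

# Apply to test: categories unseen in training (icmp) are dropped,
# training categories missing from test (udp) become all-zero columns.
test_x = pd.get_dummies(test, columns=["proto"], dtype=float).reindex(columns=schema, fill_value=0.0)
test_scaled = (test_x.values - mean) / std

print(schema)
print(test_scaled.shape)
```

Reindexing against the training schema keeps the feature count identical across domains, which is what lets a model trained on one dataset score another.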

IDS Domain-Shift Pipeline (v3 • FIXED)
=====================================
Key fixes based on crash/stall & 0.0 val-accuracy symptoms:
- **Unified label mapping** (no LabelEncoder sorting surprises). We use a fixed
  CLASS2IDX for the 6 families across *all* domains & stages.
- **Hard guards for label range** and per-split class summaries (fail fast if a
  label falls outside 0..C-1).
- **Stable validation**: explicit internal train/val split instead of
  `validation_split`, avoiding misaligned batches when concatenating adversarial
  examples.
- **Empty MQTT split**: skip gracefully.
- **CPU-stable training**: default to CPU-only to avoid cuInit(303) & duplicate
  CUDA plugin registrations seen in notebooks/containers. Can be toggled.
- **Keras stability**: `workers=1`, `use_multiprocessing=False`,
  `categorical_accuracy` metric, and tf.data pipelines.
- **Benign index** based on CLASS2IDX (not list position); evaluation is
  consistent with model output ordering.

Outputs are written to ./outputs_v3/ as before.
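
A fail-fast label guard with a per-split class summary, as described in the fixes above, might look like the sketch below. `FAMILIES` and `CLASS2IDX` mirror the definitions used later in this notebook; `summarize_labels` is a hypothetical helper name, not a function from the pipeline.

```python
import numpy as np

FAMILIES = ["Benign", "DoS_DDoS", "Recon_Scan", "MQTT", "Spoof", "Other"]
CLASS2IDX = {name: i for i, name in enumerate(FAMILIES)}

def summarize_labels(y, split_name):
    """Raise on labels outside 0..C-1, otherwise return per-family counts."""
    y = np.asarray(y, dtype=np.int64)
    n_classes = len(FAMILIES)
    if y.size and (y.min() < 0 or y.max() >= n_classes):
        raise ValueError(
            f"{split_name}: labels must lie in 0..{n_classes - 1}, "
            f"got range [{y.min()}, {y.max()}]"
        )
    counts = np.bincount(y, minlength=n_classes)
    return {name: int(counts[idx]) for name, idx in CLASS2IDX.items()}

print(summarize_labels([0, 0, 3, 5], "train"))
```

Calling it once per split before training surfaces a bad mapping immediately instead of as a 0.0 validation accuracy several epochs later.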

IDS Domain-Shift Pipeline (v3 • FIXED, Keras3-compatible)
=========================================================
This revision fixes the crash:
  TypeError: TensorFlowTrainer.fit() got an unexpected keyword argument 'workers'

Key changes
-----------
- **Removed** unsupported `workers`/`use_multiprocessing` args (Keras 3).
- **Guarded resampling**: `SMOTETomek` is skipped for huge splits; use class
  weights or light undersampling instead to avoid stalls/OOM.
- **Adversarial cap**: upper-bound adversarial sample count to keep memory stable.
- **CPU-stable default** retained (toggle via CONFIG["FORCE_CPU"]).
- **Unified label mapping** and per-split summaries kept.

Outputs go to `./outputs_v3/`.

IDS Domain-Shift Pipeline (v3p, patched)
========================================
Fixes
-----
- Align scikit-learn predict_proba outputs (which only include present classes) to the
  GLOBAL family schema before any downstream step (temperature scaling, LSTM features).
- Temperature scaling now uses the global num_classes for one-hot (no index errors).
- Keeps AUROC/PR-AUC "present-class" computation, threshold tuning, and MQTT relabel.

Defaults: RESAMPLING="smote_tomek", LOSS_MODE="focal", EPOCHS=30
Outputs: ./outputs_v3p/domain_shift_results_v3p.csv, *_perclass_v3p.csv, *_reliability_bins_v3p.csv
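
The `predict_proba` alignment described above can be sketched as follows. scikit-learn returns one probability column per class seen during `fit` (listed in `classes_`), so the columns have to be scattered into the global family layout. `align_proba` is a hypothetical helper name and the toy probabilities are made up.

```python
import numpy as np

def align_proba(proba, seen_classes, num_classes):
    """Scatter per-seen-class probabilities into a (n, num_classes) array."""
    proba = np.asarray(proba, dtype=float)
    out = np.zeros((proba.shape[0], num_classes), dtype=float)
    out[:, np.asarray(seen_classes, dtype=int)] = proba
    return out

# Toy case: the classifier only saw classes 0, 1 and 3 out of 6 global families.
raw = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.1, 0.8]])
aligned = align_proba(raw, seen_classes=[0, 1, 3], num_classes=6)
print(aligned)
```

With a fitted estimator, `seen_classes` would be `clf.classes_`; classes absent from training simply receive probability 0.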

IDS Domain-Shift Pipeline (v3-fast)
==================================
Performance/stability-focused revision of v3/v3p series.

Key improvements
----------------
1) **Feature build is bounded**: only low-cardinality categoricals are one-hot
   encoded (<= 50 uniques) to prevent 100k+ dummy columns from IP/MAC/etc.
2) **Sparse LR path**: uses CSR matrices and 'saga' solver (n_jobs=-1) to
   accelerate large, high-dim fits.
3) **Adversarial generation**: only on a *random subset* (capped) and in
   batches—no more generating adversarial examples for the full train split.
4) **Resampling guard**: SMOTE-Tomek skipped above a cap; optional per-class
   undersampling keeps memory bounded.
5) **Progress logging**: timestamped logs for each stage so you can see where
   time is spent.
6) **Same metrics/outputs** as v3: Macro-F1, Macro-Recall, AUROC/PR-AUC (OvR),
   calibration (ECE/MCE), FPR@1e-3/1e-4, per-class tables, reliability bins.

Outputs: ./outputs_v3fast/
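
To make the FPR@1e-3/1e-4 metric concrete, here is a small NumPy sketch that picks the loosest threshold whose false-positive rate on benign traffic stays at or below a target, then reports the detection rate at that threshold. It is an illustration under stated assumptions: the notebook itself derives the operating point from `sklearn.metrics.roc_curve`, `tpr_at_target_fpr` is a hypothetical name, and both classes are assumed to be present.

```python
import numpy as np

def tpr_at_target_fpr(y_bin, scores, target_fpr):
    """y_bin: 1 = attack, 0 = benign; scores: higher means more attack-like."""
    y_bin = np.asarray(y_bin).astype(bool)
    scores = np.asarray(scores, dtype=float)
    neg = np.sort(scores[~y_bin])[::-1]          # benign scores, descending
    k = int(np.floor(target_fpr * len(neg)))     # number of false positives allowed
    thr = neg[k] if k < len(neg) else -np.inf    # flag strictly above this score
    flagged = scores > thr
    fpr = float(flagged[~y_bin].mean())
    tpr = float(flagged[y_bin].mean())
    return float(thr), fpr, tpr

# Toy data: 4 benign flows followed by 3 attack flows.
y = [0, 0, 0, 0, 1, 1, 1]
s = [0.1, 0.2, 0.3, 0.9, 0.5, 0.95, 0.25]
thr, fpr, tpr = tpr_at_target_fpr(y, s, target_fpr=0.25)
print(thr, fpr, tpr)
```

At targets as strict as 1e-4, the benign split needs on the order of 10,000 samples before even one false positive is permitted, so small test sets will report the TPR at zero false positives.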

In [1]:
from __future__ import annotations
import os, time, math, warnings, gc
from dataclasses import dataclass, asdict
from typing import List, Tuple, Optional

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.metrics import f1_score, recall_score, roc_auc_score, average_precision_score, roc_curve
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

from imblearn.combine import SMOTETomek

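os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "2")  # must be set before TensorFlow is imported to take effect
import tensorflow as tf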
from tensorflow.keras import layers, models, regularizers, callbacks
from tensorflow.keras.utils import to_categorical

from scipy.sparse import csr_matrix

warnings.filterwarnings("ignore")

In [2]:
# --------------------
# Config
# --------------------
CONFIG = {
    "NF_TON_IOT_CSV": "Dataset_NF-ToN-IoT.csv",
    "CIC_IOMT_TRAIN_CSV": "CIC_IoMT_2024_WiFi_MQTT_train.csv",
    "CIC_IOMT_TEST_CSV":  "CIC_IoMT_2024_WiFi_MQTT_test.csv",

    "NF_NUMERIC_CLASS_COL": "Class",
    "NF_TEXT_LABEL_COL": "Attack",
    "CIC_CLASS_COL": "Class",
    "CIC_TEXT_LABEL_COL": "label",

    "NF_DST_PORT_CANDIDATES": ["L4_DST_PORT", "Dst Port", "dst_port", "Destination Port", "dport", "dstport"],
    "NF_SRC_PORT_CANDIDATES": ["L4_SRC_PORT", "Src Port", "src_port", "Source Port", "sport", "srcport"],
    "MQTT_PORTS": [1883, 8883],

    "SEED": 42,
    "EPOCHS": 30,
    "BATCH_SIZE": 128,
    "LEARNING_RATE": 1e-3,
    "PATIENCE": 5,

    "RESAMPLING": "smote_tomek",            # "none" | "smote_tomek"
    "LOSS_MODE": "focal",                   # "ce" | "class_balanced_ce" | "focal"
    "RESAMPLING_MAX_N": 200_000,            # skip SMOTE if above
    "UNDERSAMPLE_MAX_PER_CLASS": 100_000,   # light undersample cap/class

    "USE_ADV_TRAINING": True,
    "ADV_METHOD": "fgsm",                   # "fgsm" | "pgd"
    "FGSM_EPS": 0.05,
    "PGD_EPS": 0.03,
    "PGD_STEPS": 5,
    "PGD_ALPHA": 0.01,
    "ADV_RATIO": 0.5,
    "ADV_MAX_SAMPLES": 40_000,              # generate at most this many
    "ADV_BATCH": 8_000,                     # batch size for adversarial synth

    "FORCE_CPU": True,
    "OUTDIR": "./outputs_v3fast",

    # New: categorical guard
    "CATEGORICAL_MAX_CARD": 50,             # one-hot only if <= 50 uniques
    "MAX_DUMMIES": 800                      # overall dummy cap; else drop cats
}

np.random.seed(CONFIG["SEED"])
tf.random.set_seed(CONFIG["SEED"])

if CONFIG["FORCE_CPU"]:
    os.environ.setdefault("CUDA_VISIBLE_DEVICES", "-1")
    try:
        tf.config.set_visible_devices([], 'GPU')
    except Exception:
        pass
os.makedirs(CONFIG["OUTDIR"], exist_ok=True)
os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "2")

'1'

In [3]:
# --------------------
# Families
# --------------------
FAMILIES = ["Benign", "DoS_DDoS", "Recon_Scan", "MQTT", "Spoof", "Other"]
CLASS2IDX = {n:i for i,n in enumerate(FAMILIES)}
IDX2CLASS = {i:n for n,i in CLASS2IDX.items()}
NUM_CLASSES = len(FAMILIES)
BENIGN_IDX = CLASS2IDX["Benign"]

In [4]:
# --------------------
# Utils
# --------------------
def log(msg):
    print(f"[{time.strftime('%H:%M:%S')}] {msg}", flush=True)

def map_label_string_to_family(s: str) -> str:
    if not isinstance(s, str): return "Other"
    st = s.lower()
    if "benign" in st or st.strip()=="normal": return "Benign"
    if "mqtt" in st: return "MQTT"
    if "ddos" in st or "dos" in st: return "DoS_DDoS"
    if "scan" in st or "recon" in st or "portscan" in st: return "Recon_Scan"
    if "spoof" in st or "mitm" in st or "impersonat" in st or "man-in-the-middle" in st: return "Spoof"
    return "Other"

def _first_col(df: pd.DataFrame, candidates: List[str]):
    for c in candidates:
        if c in df.columns: return c
    return None

def mqtt_filter_nf(df: pd.DataFrame) -> pd.DataFrame:
    ports = set(map(str, CONFIG["MQTT_PORTS"]))
    dcol = _first_col(df, CONFIG["NF_DST_PORT_CANDIDATES"])
    scol = _first_col(df, CONFIG["NF_SRC_PORT_CANDIDATES"])
    mask = np.zeros(len(df), dtype=bool)
    if dcol is not None: mask |= df[dcol].astype(str).isin(ports).values
    if scol is not None: mask |= df[scol].astype(str).isin(ports).values
    f = df[mask]
    if len(f)==0:
        log("[WARN] MQTT filter produced empty set; using full NF-ToN-IoT.")
        return df.copy()
    return f.copy()

def _drop_label_like_columns(Xdf: pd.DataFrame) -> pd.DataFrame:
    KEYS = ['label','Label','labels','Labels','Class','class','Labal','labal',
            'Attack','attack','Target','target','Family','family','Benign','benign']
    return Xdf.drop(columns=[c for c in Xdf.columns if any(k in c for k in KEYS)], errors='ignore')

def build_features_train_lowcard(df: pd.DataFrame, y_col: str):
    """Build bounded features: numeric + low-card categoricals only."""
    df = df.copy()
    y = df[y_col].values
    Xdf = _drop_label_like_columns(df.drop(columns=[y_col]))

    Xnum = Xdf.select_dtypes(include=[np.number]).copy().fillna(0)

    Xcat_raw = Xdf.select_dtypes(exclude=[np.number]).astype(str)
    keep = []
    if Xcat_raw.shape[1]>0:
        for c in Xcat_raw.columns:
            u = Xcat_raw[c].nunique(dropna=False)
            if 1 < u <= CONFIG["CATEGORICAL_MAX_CARD"]:
                keep.append(c)
    Xcat = Xcat_raw[keep] if keep else pd.DataFrame(index=Xdf.index)

    if Xcat.shape[1]>0:
        Xcat = pd.get_dummies(Xcat, dummy_na=False, drop_first=False)
    else:
        Xcat = pd.DataFrame(index=Xdf.index)

    # If dummy explosion still too big, drop categoricals altogether
    if Xcat.shape[1] > CONFIG["MAX_DUMMIES"]:
        log(f"[INFO] Dropping categoricals (dummies={Xcat.shape[1]} > {CONFIG['MAX_DUMMIES']}).")
        Xcat = pd.DataFrame(index=Xdf.index)

    Xall = pd.concat([Xnum, Xcat], axis=1)
    scaler = StandardScaler().fit(Xall.values)
    X = scaler.transform(Xall.values)
    cols = Xall.columns.tolist()
    return X, y, scaler, cols

def build_features_apply_lowcard(df: pd.DataFrame, y_col: str, scaler: StandardScaler, cols_schema: list):
    df = df.copy()
    y = df[y_col].values
    Xdf = _drop_label_like_columns(df.drop(columns=[y_col]))

    Xnum = Xdf.select_dtypes(include=[np.number]).copy().fillna(0)

    Xcat_raw = Xdf.select_dtypes(exclude=[np.number]).astype(str)
    keep = []
    if Xcat_raw.shape[1]>0:
        for c in Xcat_raw.columns:
            u = Xcat_raw[c].nunique(dropna=False)
            if 1 < u <= CONFIG["CATEGORICAL_MAX_CARD"]:
                keep.append(c)
    Xcat = Xcat_raw[keep] if keep else pd.DataFrame(index=Xdf.index)
    if Xcat.shape[1]>0:
        Xcat = pd.get_dummies(Xcat, dummy_na=False, drop_first=False)
    else:
        Xcat = pd.DataFrame(index=Xdf.index)

    Xall = pd.concat([Xnum, Xcat], axis=1).reindex(columns=cols_schema, fill_value=0)
    X = scaler.transform(Xall.values)
    return X, y

def encode_family_series(series: pd.Series) -> np.ndarray:
    vals = [CLASS2IDX[map_label_string_to_family(v)] for v in series.astype(str).values]
    return np.asarray(vals, dtype=np.int32)

def focal_loss(gamma=2.0, alpha=None):
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1-1e-7)
        ce = -y_true * tf.math.log(y_pred)
        if alpha is not None: ce = alpha * ce
        weight = tf.pow(1.0 - y_pred, gamma)
        return tf.reduce_sum(weight * ce, axis=1)
    return loss

def build_bilstm(input_shape, num_classes):
    # An explicit Input layer replaces the `input_shape=` layer kwarg, which Keras 3 deprecates.
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Bidirectional(layers.LSTM(64, return_sequences=True, kernel_regularizer=regularizers.l2(1e-4))),
        layers.Dropout(0.2),
        layers.Bidirectional(layers.LSTM(32, kernel_regularizer=regularizers.l2(1e-4))),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation="softmax", kernel_regularizer=regularizers.l2(1e-4))
    ])

def fgsm(model, x, y, eps=0.05):
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    y = tf.convert_to_tensor(y, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        pred = model(x, training=False)
        loss = tf.keras.losses.categorical_crossentropy(y, pred)
    grad = tape.gradient(loss, x)
    x_adv = x + eps * tf.sign(grad)
    return tf.clip_by_value(x_adv, -10, 10).numpy()

def pgd(model, x, y, eps=0.03, alpha=0.01, steps=5):
    x0 = tf.convert_to_tensor(x, dtype=tf.float32)
    x_adv = tf.identity(x0)
    y = tf.convert_to_tensor(y, dtype=tf.float32)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            pred = model(x_adv, training=False)
            loss = tf.keras.losses.categorical_crossentropy(y, pred)
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + alpha * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x0 - eps, x0 + eps)
    return x_adv.numpy()

def calibration_bins(probs: np.ndarray, y_true: np.ndarray, n_bins: int = 15):
    conf = probs.max(axis=1); preds = probs.argmax(axis=1)
    correct = (preds==y_true).astype(int)
    bins = np.linspace(0.0, 1.0, n_bins+1)
    ece = mce = 0.0; rows = []
    for i in range(n_bins):
        lo, hi = bins[i], bins[i+1]
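        # The last bin is closed on the right so predictions with confidence exactly 1.0 are counted.
        in_bin = (conf >= lo) & ((conf <= hi) if i == n_bins - 1 else (conf < hi))
        idx = np.where(in_bin)[0]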
        if len(idx)==0:
            rows.append((0.5*(lo+hi), np.nan, 0)); continue
        acc = correct[idx].mean(); conf_mean = conf[idx].mean()
        gap = abs(acc-conf_mean)
        ece += (len(idx)/len(conf))*gap; mce = max(mce, gap)
        rows.append((conf_mean, acc, len(idx)))
    return ece, mce, rows

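def evaluate_multiclass(y_true, proba) -> Tuple[float, float, float, float]:
    """Return (macro-F1, macro-recall, macro AUROC, macro PR-AUC); AUCs average over present classes."""
    classes = np.arange(NUM_CLASSES)
    Yb = label_binarize(y_true, classes=classes)
    # One-vs-rest AUC is undefined for a class with no positives (or no negatives)
    # in y_true, so average only over classes that have both.
    present = [int(c) for c in classes if 0 < Yb[:, c].sum() < Yb.shape[0]]
    try:
        roc = float(np.mean([roc_auc_score(Yb[:, c], proba[:, c]) for c in present])) if present else float('nan')
    except Exception:
        roc = float('nan')
    try:
        pr = float(np.mean([average_precision_score(Yb[:, c], proba[:, c]) for c in present])) if present else float('nan')
    except Exception:
        pr = float('nan')
    y_pred = proba.argmax(axis=1)
    macro_f1 = f1_score(y_true, y_pred, average='macro')
    macro_rec = recall_score(y_true, y_pred, average='macro')
    return macro_f1, macro_rec, roc, pr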

def fpr_at_threshold(y_true_bin: np.ndarray, attack_scores: np.ndarray, target_fpr: float):
    fpr, tpr, thr = roc_curve(y_true_bin, attack_scores)[:3]
    idx = np.where(fpr <= target_fpr)[0]
    if len(idx)==0:
        j = int(np.argmin(fpr))
        return float(thr[j]), float(fpr[j]), float(tpr[j])
    j = idx[-1]
    return float(thr[j]), float(fpr[j]), float(tpr[j])

@dataclass
class RunResult:
    setting: str
    model_name: str
    use_adv: bool
    resampling: str
    loss_mode: str
    macro_f1: float
    macro_recall: float
    roc_auc_ovr: float
    pr_auc_ovr: float
    ece: float
    mce: float
    fpr1e3: float
    tpr_at_fpr1e3: float
    fpr1e4: float
    tpr_at_fpr1e4: float
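
For reference, the quantities computed by `focal_loss` and `calibration_bins` above can be written out. The notation is introduced here for illustration and does not appear in the notebook: $C$ classes, one-hot target $y$, predicted probabilities $p$, $N$ test samples, and $S_b$ the set of samples whose top predicted probability falls in bin $b$ of $B$ equal-width bins on $[0, 1]$.

```latex
\begin{align}
\mathrm{FL}(y, p) &= -\sum_{c=1}^{C} \alpha \, y_c \, (1 - p_c)^{\gamma} \log p_c \\
\mathrm{ECE} &= \sum_{b=1}^{B} \frac{|S_b|}{N} \,\bigl|\, \mathrm{acc}(S_b) - \mathrm{conf}(S_b) \,\bigr| \\
\mathrm{MCE} &= \max_{b \,:\, S_b \neq \emptyset} \bigl|\, \mathrm{acc}(S_b) - \mathrm{conf}(S_b) \,\bigr|
\end{align}
```

Here $\mathrm{acc}(S_b)$ is the fraction of correct predictions in the bin and $\mathrm{conf}(S_b)$ is the mean top probability. With $\gamma = 0$ and $\alpha = 1$ the focal loss reduces to ordinary categorical cross-entropy; the notebook calls it with $\gamma = 2$ and no $\alpha$.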

In [5]:
# --------------------
# Main
# --------------------
def main():
    tf.keras.backend.clear_session()

    log("Loading CSVs...")
    nf = pd.read_csv(CONFIG["NF_TON_IOT_CSV"])
    cic_tr = pd.read_csv(CONFIG["CIC_IOMT_TRAIN_CSV"])
    cic_te = pd.read_csv(CONFIG["CIC_IOMT_TEST_CSV"])

    log("Mapping families...")
    nf_family_all = nf.get(CONFIG["NF_TEXT_LABEL_COL"], pd.Series(["Other"]*len(nf))).astype(str).apply(map_label_string_to_family)
    cic_tr_family = cic_tr.get(CONFIG["CIC_TEXT_LABEL_COL"], pd.Series(["Other"]*len(cic_tr))).astype(str).apply(map_label_string_to_family)
    cic_te_family = cic_te.get(CONFIG["CIC_TEXT_LABEL_COL"], pd.Series(["Other"]*len(cic_te))).astype(str).apply(map_label_string_to_family)

    cic_tr2 = cic_tr.copy(); cic_tr2["Family"] = cic_tr_family.values
    cic_te2 = cic_te.copy(); cic_te2["Family"] = cic_te_family.values

    log("Applying MQTT filter to NF-ToN-IoT...")
    nf_mqtt = mqtt_filter_nf(nf)
    nf2 = nf_mqtt.copy()
    nf2["Family"] = nf_family_all.loc[nf_mqtt.index].values

    # splits
    y_nf_enc = encode_family_series(nf2["Family"])
    log("Splitting NF train/test...")
    nf_train, nf_test = train_test_split(nf2, test_size=0.2, random_state=CONFIG["SEED"], stratify=y_nf_enc)

    results = []
    perclass_rows = []
    relbin_rows = []

    def run_setting(train_domain: str, test_domain: str, model_kind: str, use_adv: bool) -> RunResult:
        log(f"=== Setting: {train_domain} → {test_domain} | {model_kind} | adv={use_adv} ===")

        df_tr = cic_tr2 if train_domain=="IoMT" else nf_train
        df_te = cic_te2 if test_domain=="IoMT" else (nf_test if (train_domain=="IoT" and test_domain=="IoT") else nf2)

        df_tr = df_tr.copy(); df_te = df_te.copy()
        df_tr["y_enc"] = encode_family_series(df_tr["Family"])
        df_te["y_enc"] = encode_family_series(df_te["Family"])

        # Features
        log("Building features (train)...")
        Xtr, ytr, scaler, cols_schema = build_features_train_lowcard(df_tr.drop(columns=["Family"]).rename(columns={"y_enc":"Target"}), "Target")
        log(f"Train matrix: {Xtr.shape}")

        log("Building features (test)...")
        Xte, yte = build_features_apply_lowcard(df_te.drop(columns=["Family"]).rename(columns={"y_enc":"Target"}), "Target", scaler, cols_schema)
        log(f"Test  matrix: {Xte.shape}")

        # Resampling (guarded)
        if CONFIG["RESAMPLING"]=="smote_tomek" and len(ytr) <= CONFIG["RESAMPLING_MAX_N"]:
            log("Applying SMOTE-Tomek...")
            Xtr, ytr = SMOTETomek(random_state=CONFIG["SEED"]).fit_resample(Xtr, ytr)
            log(f"Resampled train: {Xtr.shape}")
        elif CONFIG["RESAMPLING"]=="smote_tomek":
            log("[INFO] Skipping SMOTE-Tomek (too many samples). Light undersampling per class.")
            rng = np.random.default_rng(CONFIG["SEED"])
            Xo=[]; yo=[]
            for c in np.unique(ytr):
                idx = np.where(ytr==c)[0]
                k = min(len(idx), CONFIG["UNDERSAMPLE_MAX_PER_CLASS"])
                sel = rng.choice(idx, size=k, replace=False)
                Xo.append(Xtr[sel]); yo.append(ytr[sel])
            Xtr = np.concatenate(Xo, axis=0); ytr = np.concatenate(yo, axis=0)
            log(f"Undersampled train: {Xtr.shape}")

        # Branches
        if model_kind=="LR-only":
            log("Fitting LogisticRegression (sparse, saga)...")
            Xtr_csr, Xte_csr = csr_matrix(Xtr), csr_matrix(Xte)
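            # saga fits a multinomial model for multiclass targets by default;
            # the `multi_class` argument is deprecated since scikit-learn 1.5.
            lr = LogisticRegression(max_iter=1000, solver='saga', n_jobs=-1, random_state=CONFIG["SEED"])
            lr.fit(Xtr_csr, ytr)
            # predict_proba returns columns only for classes seen in training,
            # so scatter them into the global family schema before scoring.
            proba_seen = lr.predict_proba(Xte_csr)
            proba_te = np.zeros((proba_seen.shape[0], NUM_CLASSES), dtype=proba_seen.dtype)
            proba_te[:, lr.classes_.astype(int)] = proba_seen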

        else:
            # Sequence preparation
            Xtr_seq = Xtr.reshape((Xtr.shape[0],1,Xtr.shape[1]))
            Xte_seq = Xte.reshape((Xte.shape[0],1,Xte.shape[1]))
            Xtr_seq_tr, Xtr_seq_val, ytr_tr, ytr_val = train_test_split(Xtr_seq, ytr, test_size=0.2, random_state=CONFIG["SEED"], stratify=ytr)
            ytr_tr_oh = to_categorical(ytr_tr, num_classes=NUM_CLASSES)
            ytr_val_oh = to_categorical(ytr_val, num_classes=NUM_CLASSES)

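            if CONFIG["LOSS_MODE"]=="class_balanced_ce":
                # compute_class_weight rejects classes that are absent from y, so weight
                # only the classes present and default the remaining ones to 1.0.
                present = np.unique(ytr_tr)
                weights = compute_class_weight(class_weight='balanced', classes=present, y=ytr_tr)
                cw = {c: 1.0 for c in range(NUM_CLASSES)}
                cw.update({int(c): float(w) for c, w in zip(present, weights)})
                loss_fn = tf.keras.losses.CategoricalCrossentropy()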
            elif CONFIG["LOSS_MODE"]=="focal":
                cw=None; loss_fn=focal_loss(gamma=2.0)
            else:
                cw=None; loss_fn=tf.keras.losses.CategoricalCrossentropy()

            if model_kind=="BiLSTM-only":
                model = build_bilstm(Xtr_seq.shape[1:], NUM_CLASSES)
                Xtrain_base = Xtr_seq_tr
            elif model_kind=="LR->BiLSTM":
                log("Pre-fitting LR for LR->BiLSTM features...")
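                # `multi_class` is deprecated since scikit-learn 1.5; saga is multinomial by default for multiclass targets.
                lr = LogisticRegression(max_iter=1000, solver='saga', n_jobs=-1, random_state=CONFIG["SEED"])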
                lr.fit(csr_matrix(Xtr), ytr)
                Xlr_tr = lr.predict_proba(Xtr)[:, np.newaxis, :]
                Xlr_te = lr.predict_proba(Xte)[:, np.newaxis, :]
                Xtr_seq = Xlr_tr; Xte_seq = Xlr_te
                Xtr_seq_tr, Xtr_seq_val, ytr_tr, ytr_val = train_test_split(Xtr_seq, ytr, test_size=0.2, random_state=CONFIG["SEED"], stratify=ytr)
                ytr_tr_oh = to_categorical(ytr_tr, num_classes=NUM_CLASSES)
                ytr_val_oh = to_categorical(ytr_val, num_classes=NUM_CLASSES)
                model = build_bilstm(Xtr_seq_tr.shape[1:], NUM_CLASSES)
                Xtrain_base = Xtr_seq_tr
            else:
                raise ValueError("Unknown model_kind")

            model.compile(optimizer=tf.keras.optimizers.Adam(CONFIG["LEARNING_RATE"]), loss=loss_fn, metrics=['categorical_accuracy'])
            es = callbacks.EarlyStopping(patience=CONFIG["PATIENCE"], restore_best_weights=True, monitor='val_loss')

            # Optional adversarial training (subset + batched)
            Xtrain_in, ytrain_in = Xtrain_base, ytr_tr_oh
            if CONFIG["USE_ADV_TRAINING"] and use_adv:
                log("Warmup (3 epochs) before adversarial mix...")
                model.fit(Xtr_seq_tr, ytr_tr_oh, validation_data=(Xtr_seq_val, ytr_val_oh), epochs=3, batch_size=CONFIG["BATCH_SIZE"], verbose=0)

                rng = np.random.default_rng(CONFIG["SEED"])
                n_target = min(int(CONFIG["ADV_RATIO"] * Xtr_seq_tr.shape[0]), CONFIG["ADV_MAX_SAMPLES"])
                if n_target > 0:
                    sel = rng.choice(Xtr_seq_tr.shape[0], size=n_target, replace=False)
                    log(f"Generating adversarial subset: n={n_target} (batched)...")
                    adv_list = []
                    for i in range(0, n_target, CONFIG["ADV_BATCH"]):
                        j = min(i+CONFIG["ADV_BATCH"], n_target)
                        batch_x = Xtr_seq_tr[sel[i:j]]
                        batch_y = ytr_tr_oh[sel[i:j]]
                        if CONFIG["ADV_METHOD"]=="fgsm":
                            adv = fgsm(model, batch_x, batch_y, eps=CONFIG["FGSM_EPS"])
                        else:
                            adv = pgd(model, batch_x, batch_y, eps=CONFIG["PGD_EPS"], alpha=CONFIG["PGD_ALPHA"], steps=CONFIG["PGD_STEPS"])
                        adv_list.append(adv)
                    X_adv = np.concatenate(adv_list, axis=0)
                    y_adv = ytr_tr_oh[sel[:X_adv.shape[0]]]
                    Xtrain_in = np.concatenate([Xtr_seq_tr, X_adv], axis=0)
                    ytrain_in = np.concatenate([ytr_tr_oh, y_adv], axis=0)
                    del adv_list, X_adv, y_adv; gc.collect()

            log("Training sequence model...")
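            # cw is None unless LOSS_MODE == "class_balanced_ce"; without passing it here the computed weights are never used.
            model.fit(Xtrain_in, ytrain_in, validation_data=(Xtr_seq_val, ytr_val_oh), epochs=CONFIG["EPOCHS"], batch_size=CONFIG["BATCH_SIZE"], class_weight=cw, callbacks=[es], verbose=2)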

            log("Predicting on test...")
            proba_te = model.predict(Xte_seq, batch_size=CONFIG["BATCH_SIZE"], verbose=0)

        # Metrics
        log("Scoring...")
        yte_enc = df_te["y_enc"].values.astype(int)
        macro_f1, macro_rec, roc_ovr, pr_ovr = evaluate_multiclass(yte_enc, proba_te)
        ece, mce, rel = calibration_bins(proba_te, yte_enc, n_bins=15)

        atk_scores = 1.0 - proba_te[:, BENIGN_IDX]
        y_bin = (yte_enc != BENIGN_IDX).astype(int)
        thr1, fpr1, tpr1 = fpr_at_threshold(y_bin, atk_scores, 1e-3)
        thr2, fpr2, tpr2 = fpr_at_threshold(y_bin, atk_scores, 1e-4)

        # Per-class
        y_pred = proba_te.argmax(axis=1)
        for c in range(NUM_CLASSES):
            idxs = np.where(yte_enc==c)[0]
            if len(idxs)==0:
                rec=f1=np.nan; sup=0
            else:
                tp=int(np.sum(y_pred[idxs]==c)); fn=int(len(idxs)-tp); fp=int(np.sum((y_pred==c)&(yte_enc!=c)))
                rec = tp/(tp+fn) if (tp+fn)>0 else 0.0
                prec = tp/(tp+fp) if (tp+fp)>0 else 0.0
                f1 = 2*prec*rec/(prec+rec) if (prec+rec)>0 else 0.0
                sup=len(idxs)
            perclass_rows.append({"setting": f"{train_domain}->{test_domain}", "model_name": model_kind, "use_adv": use_adv, "class": IDX2CLASS[c], "recall": float(rec), "f1": float(f1), "support": int(sup)})

        for (cm, ac, ct) in rel:
            relbin_rows.append({"setting": f"{train_domain}->{test_domain}", "model_name": model_kind, "use_adv": use_adv, "conf_mean": float(0.0 if cm!=cm else cm), "acc": float(0.0 if ac!=ac else ac), "count": int(ct)})

        res = RunResult(setting=f"{train_domain}->{test_domain}", model_name=model_kind, use_adv=use_adv,
                        resampling=CONFIG["RESAMPLING"], loss_mode=CONFIG["LOSS_MODE"],
                        macro_f1=float(macro_f1), macro_recall=float(macro_rec),
                        roc_auc_ovr=float(roc_ovr) if not np.isnan(roc_ovr) else np.nan,
                        pr_auc_ovr=float(pr_ovr) if not np.isnan(pr_ovr) else np.nan,
                        ece=float(ece), mce=float(mce), fpr1e3=float(fpr1), tpr_at_fpr1e3=float(tpr1),
                        fpr1e4=float(fpr2), tpr_at_fpr1e4=float(tpr2))
        log(f"Done setting: {res}")
        return res

    all_results = []
    for model_kind in ["LR-only", "BiLSTM-only", "LR->BiLSTM"]:
        all_results.append(run_setting("IoMT","IoMT",model_kind, use_adv=False))
        all_results.append(run_setting("IoT","IoT",model_kind, use_adv=False))
        all_results.append(run_setting("IoMT","IoT",model_kind, use_adv=(model_kind!="LR-only")))
        all_results.append(run_setting("IoT","IoMT",model_kind, use_adv=(model_kind!="LR-only")))

    df = pd.DataFrame([asdict(r) for r in all_results])
    out_csv = os.path.join(CONFIG["OUTDIR"], "domain_shift_results_v3fast.csv")
    df.to_csv(out_csv, index=False)

    out_pc = os.path.join(CONFIG["OUTDIR"], "domain_shift_perclass_v3fast.csv")
    pd.DataFrame(perclass_rows).to_csv(out_pc, index=False)

    out_rel = os.path.join(CONFIG["OUTDIR"], "domain_shift_reliability_bins_v3fast.csv")
    pd.DataFrame(relbin_rows).to_csv(out_rel, index=False)

    log(f"Saved: {out_csv}")
    log(f"Saved: {out_pc}")
    log(f"Saved: {out_rel}")
    print(df.to_string(index=False))

if __name__ == "__main__":
    main()

[12:58:16] Loading CSVs...
[12:58:21] Mapping families...
[12:58:22] Applying MQTT filter to NF-ToN-IoT...
[12:58:22] [WARN] MQTT filter produced empty set; using full NF-ToN-IoT.
[12:58:22] Splitting NF train/test...
[12:58:23] === Setting: IoMT → IoMT | LR-only | adv=False ===
[12:58:23] Building features (train)...
[12:58:24] Train matrix: (1048575, 45)
[12:58:24] Building features (test)...
[12:58:25] Test  matrix: (1048575, 45)
[12:58:25] [INFO] Skipping SMOTE-Tomek (too many samples). Light undersampling per class.
[12:58:25] Undersampled train: (140396, 45)
[12:58:25] Fitting LogisticRegression (sparse, saga)...
[13:01:15] Scoring...
[13:01:16] Done setting: RunResult(setting='IoMT->IoMT', model_name='LR-only', use_adv=False, resampling='smote_tomek', loss_mode='focal', macro_f1=0.004003680868845366, macro_recall=0.04167851648725686, roc_auc_ovr=0.1882807087385367, pr_auc_ovr=0.4569332374828654, ece=0.8584862241959691, mce=0.9927770030081721, fpr1e3=0.0, tpr_at_fpr1e3=0.0, fpr1e

In [6]:
import os
import zipfile
from google.colab import files

def zip_and_download_results():
    output_dir = CONFIG["OUTDIR"]
    csv_files = [
        os.path.join(output_dir, "domain_shift_results_v3fast.csv"),
        os.path.join(output_dir, "domain_shift_perclass_v3fast.csv"),
        os.path.join(output_dir, "domain_shift_reliability_bins_v3fast.csv")
    ]
    zip_filename = os.path.join(output_dir, "domain_shift_results_v3fast.zip")

    print("\nZipping and downloading results...")

    try:
        with zipfile.ZipFile(zip_filename, 'w') as zipf:
            for file in csv_files:
                if os.path.exists(file):
                    zipf.write(file, os.path.basename(file))
                else:
                    print(f"Warning: File not found and will not be included in the zip: {file}")

        if os.path.exists(zip_filename):
            files.download(zip_filename)
            print("Download complete.")
        else:
            print("Zip file was not created.")

    except Exception as e:
        print(f"An error occurred during zipping or downloading: {e}")

# Call the main training function first, then the download function
if __name__ == "__main__":
    main()
    zip_and_download_results()

[14:15:57] Loading CSVs...
[14:16:03] Mapping families...
[14:16:04] Applying MQTT filter to NF-ToN-IoT...
[14:16:05] [WARN] MQTT filter produced empty set; using full NF-ToN-IoT.
[14:16:05] Splitting NF train/test...
[14:16:06] === Setting: IoMT → IoMT | LR-only | adv=False ===
[14:16:08] Building features (train)...
[14:16:09] Train matrix: (1048575, 45)
[14:16:09] Building features (test)...
[14:16:09] Test  matrix: (1048575, 45)
[14:16:09] [INFO] Skipping SMOTE-Tomek (too many samples). Light undersampling per class.
[14:16:09] Undersampled train: (140396, 45)
[14:16:09] Fitting LogisticRegression (sparse, saga)...
[14:19:06] Scoring...
[14:19:07] Done setting: RunResult(setting='IoMT->IoMT', model_name='LR-only', use_adv=False, resampling='smote_tomek', loss_mode='focal', macro_f1=0.004003284884619037, macro_recall=0.04167829932350235, roc_auc_ovr=0.18822919633720475, pr_auc_ovr=0.4568993410158293, ece=0.8586107954982654, mce=0.9927775266699238, fpr1e3=0.0, tpr_at_fpr1e3=0.0, fpr1

Download complete.
