
# CNN–TPE Brain Tumor Classification — Training/Testing Folders Version



This notebook reproduces the CNN–TPE pipeline **using a dataset organized as**:
```
data/brain_tumor/
  training/{glioma,meningioma,pituitary}/...
  testing/{glioma,meningioma,pituitary}/...
```
- Hyperparameter search with **TPE** (Optuna `TPESampler`; an earlier Hyperopt variant is kept for reference)
- Three-branch CNN per Table 4
- Augmentation ablations + statistics (Wilcoxon, Cohen's d, Friedman + Bonferroni)
- **Nested CV runs on the *training* split only**; the **testing** split is reserved for final evaluation
- Exports to `exports/artifacts`, `exports/tables`, `exports/figures`
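
The folder layout above can be sanity-checked before any training starts. The sketch below is a minimal illustration, not part of the notebook: `check_layout` is a hypothetical helper, and the demo runs against a throwaway tree in a temporary directory that deliberately omits one class folder.

```python
import tempfile
from pathlib import Path

CLASSES = ["glioma", "meningioma", "pituitary"]  # class folders named above
SPLITS = ["training", "testing"]

def check_layout(root):
    """Return the expected split/class folders that are missing under root (hypothetical helper)."""
    root = Path(root)
    return [f"{s}/{c}" for s in SPLITS for c in CLASSES if not (root / s / c).is_dir()]

# Demo on a temporary tree that deliberately omits testing/pituitary
with tempfile.TemporaryDirectory() as tmp:
    for s in SPLITS:
        for c in CLASSES:
            if (s, c) != ("testing", "pituitary"):
                (Path(tmp) / s / c).mkdir(parents=True)
    missing = check_layout(tmp)

print(missing)
```

An empty list means every split contains all three class folders.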



## Exports
- `exports/artifacts/cnn_tpe_brain_tumor.h5`
- `exports/artifacts/best_hyperparams.json`
- `exports/artifacts/train_history.csv`
- `exports/figures/confusion_matrix.png`, `exports/figures/roc_curves.png`, `exports/figures/fig3_cnn_vs_cnn_tpe.png`
- `exports/figures/ablation_barplot.png`, `exports/figures/nested_cv_distribution.png`
- `exports/tables/per_class_metrics.csv`, `exports/tables/test_macro_metrics.csv`, `exports/tables/confusion_matrix.csv`
- `exports/tables/cnn_vs_cnn_tpe_summary.csv`
- `exports/tables/table5_sessions.csv` (session placeholder)
- `exports/tables/table10_nested_cv_runs.csv`, `exports/tables/table10_nested_cv_summary.csv`
- `exports/tables/table11_wilcoxon_cohensd.csv`
- `exports/tables/table12_friedman.csv`, `exports/tables/table12_posthoc.csv`, `exports/tables/table12_meansd.csv`
- Templates: `exports/tables/table13_bland_altman_TEMPLATE.csv`, `exports/tables/table14_literature_TEMPLATE.csv`


In [45]:
!pip uninstall -y matplotlib || true
!pip cache purge




Files removed: 4 (204 kB)


In [23]:
#!conda create -n brain-tumor python=3.11 -y
#!conda activate brain-tumor
!pip install "tensorflow==2.19.1" hyperopt optuna scikit-learn scipy seaborn matplotlib pandas numpy


Channels:
 - conda-forge
 - defaults
Platform: win-64
Collecting package metadata (repodata.json): ...working... failed



CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/conda-forge/win-64/repodata.json>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
'https//conda.anaconda.org/conda-forge/win-64'




Collecting tensorflow==2.19.1
  Downloading tensorflow-2.19.1-cp312-cp312-win_amd64.whl.metadata (4.1 kB)
Downloading tensorflow-2.19.1-cp312-cp312-win_amd64.whl (376.0 MB)
   ---------------------------------------- 0.0/376.0 MB ? eta -:--:--

ERROR: Could not install packages due to an OSError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\USER\\AppData\\Local\\Temp\\pip-unpack-mnbvb_4a\\tensorflow-2.19.1-cp312-cp312-win_amd64.whl'
Consider using the `--user` option or check the permissions.



Collecting tensorflow==2.19.1

ERROR: Could not install packages due to an OSError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\USER\\AppData\\Local\\Temp\\pip-unpack-2f5l7zc6\\tensorflow-2.19.1-cp312-cp312-win_amd64.whl'
Consider using the `--user` option or check the permissions.




  Using cached tensorflow-2.19.1-cp312-cp312-win_amd64.whl.metadata (4.1 kB)
Downloading tensorflow-2.19.1-cp312-cp312-win_amd64.whl (376.0 MB)
   ---------------------------------------- 0.0/376.0 MB ? eta -:--:--

In [59]:

import os, math, json, random
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras import layers, models, regularizers, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             roc_auc_score, confusion_matrix, roc_curve)
from sklearn.preprocessing import label_binarize
from scipy import stats

SEED = 2025
random.seed(SEED); np.random.seed(SEED); tf.random.set_seed(SEED)
os.environ["PYTHONHASHSEED"] = str(SEED)

BASE_DIR = Path.cwd()
EXPORT_ROOT = BASE_DIR / "exports"
ARTIFACTS_DIR = EXPORT_ROOT / "artifacts"
TABLES_DIR = EXPORT_ROOT / "tables"
FIGS_DIR = EXPORT_ROOT / "figures"
for d in [ARTIFACTS_DIR, TABLES_DIR, FIGS_DIR]: d.mkdir(parents=True, exist_ok=True)

# OLD:
# DATA_DIR = BASE_DIR / "data" / "brain_tumor"

# NEW (points to the folder where the notebook and Training/Testing live):
DATA_DIR = BASE_DIR

IMG_SIZE = (224,224)
CHANNELS = 3
CLASSES = ["glioma","meningioma","pituitary"]
CLASS_TO_ID = {c:i for i,c in enumerate(CLASSES)}

DEFAULT_BATCH = 16
DEFAULT_EPOCHS = 5
INIT_LR = 1e-2

RUN_TPE = True
FULL_PIPELINE = True


In [61]:
from pathlib import Path
import os, numpy as np

def list_images_two_splits_flexible(root):
    root = Path(root)

    def find_dir(base, candidates):
        # Return the first existing directory among base/<candidate>, or None
        for name in candidates:
            d = base / name
            if d.is_dir():
                return d
        return None

    # Look for common spellings of the split folders at root; fall back to data/brain_tumor/
    train_names = ["training","Training","TRAINING","train","Train"]
    test_names  = ["testing","Testing","TESTING","test","Test"]
    train_dir = find_dir(root, train_names)
    test_dir  = find_dir(root, test_names)

    if train_dir is None or test_dir is None:
        alt_root = root / "data" / "brain_tumor"
        train_dir = train_dir or find_dir(alt_root, train_names)
        test_dir  = test_dir  or find_dir(alt_root, test_names)

    if train_dir is None or test_dir is None:
        found = [p.name for p in root.iterdir() if p.is_dir()]
        raise FileNotFoundError(
            f"Expected 'training'/'testing' (any case) under {root} "
            f"or {root/'data'/'brain_tumor'}. Found folders: {found}"
        )

    def scan(split_dir):
        # accept class folder names regardless of case
        wanted = {"glioma","meningioma","pituitary"}
        class_dirs = {}
        for p in split_dir.iterdir():
            if p.is_dir() and p.name.lower() in wanted:
                class_dirs[p.name.lower()] = p

        if set(class_dirs.keys()) != wanted:
            present = [p.name for p in split_dir.iterdir() if p.is_dir()]
            raise FileNotFoundError(
                f"Expected class folders (glioma, meningioma, pituitary) in {split_dir}. "
                f"Found: {present}"
            )

        exts = {".png",".jpg",".jpeg",".bmp",".tif",".tiff"}
        paths, labels = [], []
        for cls in ["glioma","meningioma","pituitary"]:
            for fn in class_dirs[cls].rglob("*"):
                if fn.suffix.lower() in exts:
                    paths.append(str(fn))
                    labels.append(CLASS_TO_ID[cls])
        return np.array(paths), np.array(labels)

    train_paths, train_labels = scan(train_dir)
    test_paths,  test_labels  = scan(test_dir)

    print(f"Using train dir: {train_dir}")
    print(f"Using test  dir: {test_dir}")
    print(f"Train images: {len(train_paths)} | Test images: {len(test_paths)}")
    return train_paths, train_labels, test_paths, test_labels

# >>> Call this (replaces the old call) <<<
train_paths, train_labels, test_paths, test_labels = list_images_two_splits_flexible(DATA_DIR)

# keep the internal validation split the same:
from sklearn.model_selection import train_test_split
tr_paths, val_paths, tr_labels, val_labels = train_test_split(
    train_paths, train_labels, test_size=0.15, random_state=SEED, stratify=train_labels)
print(f"Train2: {len(tr_paths)} | Val2: {len(val_paths)}")


Using train dir: C:\Users\USER\Dropbox\MATHEMATICAL SCIENCES\PROJECT\CurrentProject\BrainTumor\NewMRI\New folder\training
Using test  dir: C:\Users\USER\Dropbox\MATHEMATICAL SCIENCES\PROJECT\CurrentProject\BrainTumor\NewMRI\New folder\testing
Train images: 4117 | Test images: 906
Train2: 3499 | Val2: 618
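
The split sizes printed above follow from how scikit-learn resolves a fractional `test_size`: the hold-out count is rounded up and the remainder goes to training. A small sketch of that arithmetic (the helper name `split_sizes` is illustrative, not from the notebook):

```python
import math

def split_sizes(n_samples, test_fraction):
    """Sizes of a hold-out split when the hold-out share is rounded up."""
    n_test = math.ceil(test_fraction * n_samples)
    return n_samples - n_test, n_test

n_train, n_val = split_sizes(4117, 0.15)
print(n_train, n_val)  # 3499 618
```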


In [63]:

def path_label_generator(paths, labels, batch_size, aug, shuffle=True):
    n = len(paths)
    idxs = np.arange(n)
    while True:
        if shuffle:
            np.random.shuffle(idxs)
        for start in range(0, n, batch_size):
            end = min(start+batch_size, n)
            sel = idxs[start:end]
            imgs = []
            for p in paths[sel]:
                img = tf.keras.preprocessing.image.load_img(p, target_size=IMG_SIZE, color_mode="rgb")
                img = tf.keras.preprocessing.image.img_to_array(img)
                imgs.append(img)
            X = np.stack(imgs,0)
            y = tf.keras.utils.to_categorical(labels[sel], num_classes=len(CLASSES))
            if getattr(aug, "zca_whitening", False) and not hasattr(aug, "_fitted_zca"):
                tmp = X.copy()/255.0
                aug.fit(tmp)
                aug._fitted_zca = True
            gen = aug.flow(X, y, batch_size=batch_size, shuffle=False)
            Xb, yb = next(gen)
            yield Xb, yb

#def make_aug(variant: str):
#    base = dict(rescale=1./255)
#    if variant == "none":
#        return ImageDataGenerator(**base)
#    if variant == "rotation":
#        return ImageDataGenerator(**base, rotation_range=15)
#    if variant == "brightness":
#        return ImageDataGenerator(**base, brightness_range=(0.9,1.1))
#    if variant == "zca":
#        return ImageDataGenerator(**base, zca_whitening=True)
#    if variant == "full":
#        return ImageDataGenerator(
#            **base,
#            rotation_range=15, shear_range=0.1,
#            width_shift_range=0.1, height_shift_range=0.1,
#            zoom_range=0.1, brightness_range=(0.9,1.1),
#            horizontal_flip=True, vertical_flip=True, zca_whitening=True
#        )
#    raise ValueError(variant)


In [65]:
def make_aug(variant: str):
    """
    Memory-safe augmentations.
    'zca' uses featurewise center/std (surrogate for ZCA whitening).
    """
    base = dict(rescale=1./255)

    if variant == "none":
        return ImageDataGenerator(**base)

    if variant == "rotation":
        return ImageDataGenerator(**base, rotation_range=15)

    if variant == "brightness":
        return ImageDataGenerator(**base, brightness_range=(0.9, 1.1))

    if variant == "zca":
        # ZCA surrogate: channel-wise standardization only (no huge SVD)
        return ImageDataGenerator(
            **base,
            featurewise_center=True,
            featurewise_std_normalization=True
        )

    if variant == "full":
        return ImageDataGenerator(
            **base,
            rotation_range=15,
            shear_range=0.1,
            width_shift_range=0.1,
            height_shift_range=0.1,
            zoom_range=0.1,
            brightness_range=(0.9, 1.1),
            horizontal_flip=True,
            vertical_flip=True,
            # NOTE: no zca_whitening here to avoid OOM
            featurewise_center=False,              # keep False for speed in full aug
            featurewise_std_normalization=False
        )
    raise ValueError(variant)
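
To see why the `'zca'` variant falls back to channel-wise standardization, it helps to estimate the size of the dense features-by-features matrix that true ZCA whitening needs at this image size. The sketch below is an illustration under stated assumptions (float32 storage, hypothetical helper names `zca_matrix_bytes` and `channelwise_standardize`, synthetic data). It also shows the kind of per-channel centering and scaling that `featurewise_center` and `featurewise_std_normalization` are meant to provide.

```python
import numpy as np

def zca_matrix_bytes(height, width, channels, bytes_per_value=4):
    """Bytes for a dense (features x features) matrix over flattened images (float32 assumed)."""
    n_features = height * width * channels
    return n_features * n_features * bytes_per_value

def channelwise_standardize(batch, eps=1e-6):
    """Center and scale each channel with statistics pooled over images, rows, and columns."""
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)
    std = batch.std(axis=(0, 1, 2), keepdims=True)
    return (batch - mean) / (std + eps)

gib = zca_matrix_bytes(224, 224, 3) / 2**30
print(f"Dense whitening matrix at 224x224x3: {gib:.1f} GiB")

rng = np.random.default_rng(0)
demo = rng.random((8, 32, 32, 3)).astype("float32")  # small synthetic batch in [0, 1]
out = channelwise_standardize(demo)
print(out.mean(axis=(0, 1, 2)).round(3), out.std(axis=(0, 1, 2)).round(3))
```

The standardization needs only three means and three standard deviations, which is why a small sample is enough to fit it.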


In [67]:
def path_label_generator(paths, labels, batch_size, aug, shuffle=True):
    n = len(paths)
    idxs = np.arange(n)

    # If this aug needs featurewise statistics (our ZCA surrogate), prefit once on a *small* sample
    if getattr(aug, "featurewise_center", False) or getattr(aug, "featurewise_std_normalization", False):
        if not hasattr(aug, "_fitted_stats"):
            k = min(64, len(paths))  # small sample is fine
            sel = np.random.choice(np.arange(n), size=k, replace=False)
            Xs = []
            for p in paths[sel]:
                img = tf.keras.preprocessing.image.load_img(p, target_size=IMG_SIZE, color_mode="rgb")
                arr = tf.keras.preprocessing.image.img_to_array(img) / 255.0
                Xs.append(arr)
            Xs = np.stack(Xs, 0)
            aug.fit(Xs)                  # computes mean/std only (cheap)
            aug._fitted_stats = True

    while True:
        if shuffle:
            np.random.shuffle(idxs)
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)
            sel = idxs[start:end]
            imgs = []
            for p in paths[sel]:
                img = tf.keras.preprocessing.image.load_img(p, target_size=IMG_SIZE, color_mode="rgb")
                img = tf.keras.preprocessing.image.img_to_array(img)
                imgs.append(img)
            X = np.stack(imgs, 0)
            y = tf.keras.utils.to_categorical(labels[sel], num_classes=len(CLASSES))
            gen = aug.flow(X, y, batch_size=batch_size, shuffle=False)
            Xb, yb = next(gen)
            yield Xb, yb
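
Because the generator above loops forever, Keras needs `steps_per_epoch` to know when one pass over the data ends. The later cells use `math.ceil(n / B)`. The sketch below (with an illustrative helper `batch_slices`) shows that this many index slices cover every sample exactly once, with a shorter final batch.

```python
import math

def batch_slices(n_samples, batch_size):
    """Yield (start, end) pairs that cover range(n_samples) once; the last batch may be short."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

n, B = 3499, 16  # training-split size printed earlier and the default batch size
steps = math.ceil(n / B)
slices = list(batch_slices(n, B))
print(steps, len(slices), slices[-1])
```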


In [69]:
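# Override the batch size; BEST_HYP must already exist (it is defined in the hyperparameter-search cells below)
BEST_HYP["batch"] = 8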


In [71]:
try:
    gpus = tf.config.list_physical_devices('GPU')
    for g in gpus: tf.config.experimental.set_memory_growth(g, True)
except Exception:
    pass


In [73]:

def conv_block(x, filters, kernel, pool=True, l2=0.0, act="relu"):
    x = layers.Conv2D(filters, kernel, padding="same",
                      kernel_regularizer=regularizers.l2(l2))(x)
    x = layers.Activation(act)(x)
    if pool:
        x = layers.MaxPooling2D(pool_size=(2,2))(x)
    return x

def build_branch(x_in, spec, l2=0.0, act="relu"):
    x = x_in
    for (f,k) in spec["layers"]:
        x = conv_block(x, f, (k,k), pool=True, l2=l2, act=act)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation=act, kernel_regularizer=regularizers.l2(l2))(x)
    return x

MENINGIOMA_SPEC = {"layers": [(32,3),(64,3),(128,3)]}
GLIOMA_SPEC     = {"layers": [(16,3),(32,3),(64,3),(128,3)]}
PITUITARY_SPEC  = {"layers": [(32,5),(64,5),(128,5)]}

def build_cnn_tpe(hparams):
    inp = layers.Input(shape=(IMG_SIZE[0], IMG_SIZE[1], CHANNELS))
    m = build_branch(inp, MENINGIOMA_SPEC, l2=hparams["l2"], act=hparams["act"])
    g = build_branch(inp, GLIOMA_SPEC,     l2=hparams["l2"], act=hparams["act"])
    p = build_branch(inp, PITUITARY_SPEC,  l2=hparams["l2"], act=hparams["act"])
    x = layers.Concatenate()([m,g,p])
    x = layers.Dense(256, activation=hparams["act"], kernel_regularizer=regularizers.l2(hparams["l2"]))(x)
    x = layers.Dropout(hparams["dropout"])(x)
    out = layers.Dense(len(CLASSES), activation="softmax")(x)
    model = models.Model(inp, out)
    opt = optimizers.Adam(learning_rate=hparams["lr"])
    model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
    return model
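
Most of the parameters in each branch sit in the first `Dense(128)` layer after `Flatten`. The arithmetic below is a sketch, assuming `'same'` convolutions and 2x2 max-pooling as in `conv_block`; the helper names are illustrative. It shows how the number of pooling stages drives that size for the 224x224 input.

```python
def branch_flatten_units(img_side, conv_layers, pool=2):
    """Units entering Flatten: 'same' convs keep the side; each 2x2 max-pool halves it (floor)."""
    side = img_side
    for _filters, _kernel in conv_layers:
        side //= pool
    return side * side * conv_layers[-1][0]

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

specs = {
    "meningioma": [(32, 3), (64, 3), (128, 3)],
    "glioma": [(16, 3), (32, 3), (64, 3), (128, 3)],
    "pituitary": [(32, 5), (64, 5), (128, 5)],
}
for name, spec in specs.items():
    units = branch_flatten_units(224, spec)
    print(name, units, dense_params(units, 128))
```

The three-stage branches flatten to 28x28x128 values, while the four-stage glioma branch flattens to 14x14x128, so its dense layer is about four times smaller.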


In [None]:
# ---- TPE search using Optuna (compatible with Python 3.12) ----
# Install once if needed:
# %pip install -q optuna

import optuna
from optuna.samplers import TPESampler

BEST_HYP = {
    "dropout": 0.5, "l2": 1e-5, "lr": 1e-2, "act": "relu",
    "batch": 16, "epochs": 5
}

def objective_optuna(trial):
    params = {
        "dropout": trial.suggest_float("dropout", 0.2, 0.6),
        "l2": trial.suggest_float("l2", 1e-6, 1e-3, log=True),
        "lr": trial.suggest_categorical("lr", [1e-2, 1e-3, 1e-4]),
        "act": trial.suggest_categorical("act", ["relu","selu","sigmoid"]),
        "batch": trial.suggest_categorical("batch", [12,16,32]),
        "epochs": trial.suggest_categorical("epochs", [5,7,8,10]),
    }

    # reuse the same loaders/vars defined earlier in the notebook:
    train_aug = make_aug("full")
    val_aug   = ImageDataGenerator(rescale=1./255)

    B = params["batch"]
    sp_tr = math.ceil(len(tr_paths)/B)
    sp_va = math.ceil(len(val_paths)/B)
    gen_tr = path_label_generator(tr_paths, tr_labels, B, train_aug, shuffle=True)
    gen_va = path_label_generator(val_paths, val_labels, B, val_aug, shuffle=False)

    model = build_cnn_tpe(params)
    hist = model.fit(
        gen_tr,
        steps_per_epoch=sp_tr,
        epochs=int(params["epochs"]),
        validation_data=gen_va,
        validation_steps=sp_va,
        verbose=0
    )
    # maximize val_accuracy
    return float(max(hist.history["val_accuracy"]))

if RUN_TPE:
    n_trials = 25 if FULL_PIPELINE else 5
    study = optuna.create_study(direction="maximize", sampler=TPESampler(seed=SEED))
    study.optimize(objective_optuna, n_trials=n_trials, show_progress_bar=True)

    bp = study.best_params
    BEST_HYP = {
        "dropout": float(bp["dropout"]),
        "l2": float(bp["l2"]),
        "lr": float(bp["lr"]),
        "act": str(bp["act"]),
        "batch": int(bp["batch"]),
        "epochs": int(bp["epochs"]),
    }
    print("Best hyperparameters (Optuna TPE):", BEST_HYP)
else:
    print("Using fixed hyperparameters:", BEST_HYP)

# Save best params
from pathlib import Path
(Path.cwd()/ 'exports' / 'artifacts').mkdir(parents=True, exist_ok=True)
with open(Path.cwd()/ 'exports' / 'artifacts' / 'best_hyperparams.json',"w") as f:
    json.dump(BEST_HYP, f, indent=2)


[I 2025-11-11 13:56:27,367] A new study created in memory with name: no-name-f4ff83d3-dbf9-42e0-8dbe-0177f29aad78


  0%|          | 0/25 [00:00<?, ?it/s]

[I 2025-11-12 08:59:51,995] Trial 0 finished with value: 0.3543689250946045 and parameters: {'dropout': 0.2541952654711847, 'l2': 0.0004608452422319138, 'lr': 0.01, 'act': 'selu', 'batch': 12, 'epochs': 5}. Best is trial 0 with value: 0.3543689250946045.
[I 2025-11-12 09:25:25,413] Trial 1 finished with value: 0.37216827273368835 and parameters: {'dropout': 0.31712375681979077, 'l2': 6.803666196857458e-05, 'lr': 0.01, 'act': 'selu', 'batch': 32, 'epochs': 7}. Best is trial 1 with value: 0.37216827273368835.
[I 2025-11-12 09:55:16,657] Trial 2 finished with value: 0.37216827273368835 and parameters: {'dropout': 0.32969728256379643, 'l2': 0.0009390642692120776, 'lr': 0.001, 'act': 'sigmoid', 'batch': 32, 'epochs': 8}. Best is trial 1 with value: 0.37216827273368835.
[I 2025-11-12 10:13:27,160] Trial 3 finished with value: 0.7686083912849426 and parameters: {'dropout': 0.5391719609392511, 'l2': 6.761419018093338e-06, 'lr': 0.0001, 'act': 'selu', 'batch': 32, 'epochs': 5}. Best is trial 3 
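
The `l2` dimension is searched on a logarithmic scale (`log=True` in the Optuna cell), so each decade between 1e-6 and 1e-3 is equally likely to be tried. A minimal NumPy sketch of log-uniform sampling follows. It illustrates only the sampling scale, not the TPE algorithm itself, and `sample_log_uniform` is a hypothetical helper.

```python
import numpy as np

def sample_log_uniform(low, high, size, rng):
    """Draw values whose logarithm is uniform on [log(low), log(high)]."""
    return np.exp(rng.uniform(np.log(low), np.log(high), size=size))

rng = np.random.default_rng(2025)
draws = sample_log_uniform(1e-6, 1e-3, 10_000, rng)

# Roughly one third of the draws should land in each decade
counts, _ = np.histogram(np.log10(draws), bins=[-6, -5, -4, -3])
print(counts / draws.size)
```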

In [55]:

BEST_HYP = {
    "dropout": 0.5, "l2": 1e-5, "lr": 1e-2, "act": "relu",
    "batch": 16, "epochs": 5
}

if RUN_TPE:
    from hyperopt import fmin, tpe, hp, Trials, STATUS_OK
    space = {
        "dropout": hp.uniform("dropout", 0.2, 0.6),
        "l2": hp.loguniform("l2", np.log(1e-6), np.log(1e-3)),
        "lr": hp.choice("lr", [1e-2, 1e-3, 1e-4]),
        "act": hp.choice("act", ["relu","selu","sigmoid"]),
        "batch": hp.choice("batch", [12,16,32]),
        "epochs": hp.choice("epochs", [5,7,8,10]),
    }

    train_aug = make_aug("full")
    val_aug   = ImageDataGenerator(rescale=1./255)

    def objective(hp_params):
        # Hyperopt passes the sampled values themselves (hp.choice yields the option, not its index)
        params = dict(hp_params)

        B = params["batch"]
        sp_tr = math.ceil(len(tr_paths)/B)
        sp_va = math.ceil(len(val_paths)/B)
        gen_tr = path_label_generator(tr_paths, tr_labels, B, train_aug, shuffle=True)
        gen_va = path_label_generator(val_paths, val_labels, B, val_aug, shuffle=False)

        model = build_cnn_tpe(params)
        hist = model.fit(gen_tr, steps_per_epoch=sp_tr, epochs=int(params["epochs"]),
                         validation_data=gen_va, validation_steps=sp_va, verbose=0)
        return {"loss": 1.0 - max(hist.history["val_accuracy"]), "status": STATUS_OK}

    max_evals = 25 if FULL_PIPELINE else 5
    trials = Trials()
    best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=max_evals,
                trials=trials, rstate=np.random.default_rng(SEED))

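    # fmin reports hp.choice dimensions as indices; continuous dimensions come back as sampled values
    BEST_HYP = {
        "dropout": float(best["dropout"]),
        "l2": float(best["l2"]),  # hp.loguniform already returns exp(uniform(...)), so no np.exp here
        "lr": [1e-2,1e-3,1e-4][best["lr"]],
        "act": ["relu","selu","sigmoid"][best["act"]],
        "batch": [12,16,32][best["batch"]],
        "epochs": [5,7,8,10][best["epochs"]],
    }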
    print("Best hyperparameters:", BEST_HYP)
else:
    print("Using fixed hyperparameters:", BEST_HYP)

from pathlib import Path
( (Path.cwd()/ 'exports' / 'artifacts') ).mkdir(parents=True, exist_ok=True)
with open(Path.cwd()/ 'exports' / 'artifacts' / 'best_hyperparams.json',"w") as f:
    json.dump(BEST_HYP, f, indent=2)


ModuleNotFoundError: No module named 'imp'
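
Hyperopt's `fmin` reports `hp.choice` dimensions as option indices, whereas the objective function receives the chosen values themselves. A small decoder keeps the index-to-value mapping in one place. The sketch below uses an illustrative helper `decode_choices` and a made-up result dictionary. Hyperopt also provides `space_eval` for the same purpose.

```python
CHOICES = {
    "lr": [1e-2, 1e-3, 1e-4],
    "act": ["relu", "selu", "sigmoid"],
    "batch": [12, 16, 32],
    "epochs": [5, 7, 8, 10],
}

def decode_choices(best, choices=CHOICES):
    """Replace index-valued entries with the option they point to; leave other entries unchanged."""
    return {k: (choices[k][int(v)] if k in choices else v) for k, v in best.items()}

# Dictionary shaped like an fmin result (values made up for illustration)
example_best = {"dropout": 0.54, "l2": 6.8e-6, "lr": 2, "act": 1, "batch": 2, "epochs": 0}
decoded = decode_choices(example_best)
print(decoded)
```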

In [None]:

# Train on training (with val), evaluate on testing
train_aug = make_aug("full")
val_aug   = ImageDataGenerator(rescale=1./255)
test_aug  = ImageDataGenerator(rescale=1./255)

B = BEST_HYP["batch"]
gen_tr = path_label_generator(tr_paths, tr_labels, B, train_aug, shuffle=True)
gen_va = path_label_generator(val_paths, val_labels, B, val_aug, shuffle=False)
gen_te = path_label_generator(test_paths, test_labels, B, test_aug, shuffle=False)

sp_tr = math.ceil(len(tr_paths)/B)
sp_va = math.ceil(len(val_paths)/B)
sp_te = math.ceil(len(test_paths)/B)

model = build_cnn_tpe(BEST_HYP)
cb = [
    tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=5, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6)
]
hist = model.fit(gen_tr, steps_per_epoch=sp_tr, epochs=int(BEST_HYP["epochs"]),
                 validation_data=gen_va, validation_steps=sp_va, callbacks=cb, verbose=1)
pd.DataFrame(hist.history).to_csv('exports/artifacts/train_history.csv', index=False)

y_true = test_labels
y_prob = model.predict(gen_te, steps=sp_te, verbose=0)
y_pred = np.argmax(y_prob, axis=1)

acc  = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec  = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1   = f1_score(y_true, y_pred, average="macro", zero_division=0)
cm   = confusion_matrix(y_true, y_pred, labels=[0,1,2])

specs=[] 
for c in range(3):
    TP=cm[c,c]; FP=cm[:,c].sum()-TP; FN=cm[c,:].sum()-TP; TN=cm.sum()-(TP+FP+FN)
    specs.append(TN/(TN+FP+1e-12))
spec = float(np.mean(specs))

y_true_bin = label_binarize(y_true, classes=[0,1,2])
macro_auc = roc_auc_score(y_true_bin, y_prob, average="macro", multi_class="ovr")

pd.DataFrame([{
    "Accuracy": acc, "Sensitivity/Recall": rec, "Specificity": spec,
    "Precision": prec, "F1": f1, "Macro-AUC": macro_auc
}]).to_csv('exports/tables/test_macro_metrics.csv', index=False)

pd.DataFrame(cm, index=CLASSES, columns=CLASSES).to_csv('exports/tables/confusion_matrix.csv')

# Plots
plt.figure()
for i, cls in enumerate(CLASSES):
    fpr, tpr, _ = roc_curve(y_true_bin[:, i], y_prob[:, i])
    plt.plot(fpr, tpr, label=cls)
plt.plot([0,1],[0,1],'--')
plt.xlabel("False Positive Rate"); plt.ylabel("True Positive Rate"); plt.title("ROC (One-vs-Rest)"); plt.legend()
plt.tight_layout(); plt.savefig('exports/figures/roc_curves.png', dpi=160); plt.close()

import seaborn as sns
plt.figure()
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=CLASSES, yticklabels=CLASSES)
plt.xlabel("Predicted"); plt.ylabel("True"); plt.title("Confusion Matrix")
plt.tight_layout(); plt.savefig('exports/figures/confusion_matrix.png', dpi=160); plt.close()

model.save('exports/artifacts/cnn_tpe_brain_tumor.h5')
print("Saved model and reports to 'exports/'")
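
The specificity loop in the cell above treats each class one-vs-rest. The same bookkeeping yields per-class sensitivity and precision, which is the kind of content a per-class metrics table would hold. The sketch below is illustrative only: `per_class_from_cm` is a hypothetical helper, the confusion matrix is made up, and a class that is never predicted would need a zero-division guard.

```python
import numpy as np

def per_class_from_cm(cm):
    """One-vs-rest sensitivity, specificity, and precision per class of a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp      # predicted as class c but belonging elsewhere
    fn = cm.sum(axis=1) - tp      # belonging to class c but predicted elsewhere
    tn = cm.sum() - (tp + fp + fn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Made-up 3-class matrix (rows = true, columns = predicted), for illustration only
cm_demo = [[50, 5, 5],
           [10, 80, 10],
           [0, 5, 95]]
m = per_class_from_cm(cm_demo)
for name, vals in m.items():
    print(name, np.round(vals, 3))
```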


In [None]:

# Baseline vs TPE figure
base_hyp = {"dropout":0.5,"l2":1e-5,"lr":1e-2,"act":"relu","batch":16,"epochs":5}
noaug = ImageDataGenerator(rescale=1./255)
B2 = base_hyp["batch"]
gen_tr_b = path_label_generator(tr_paths, tr_labels, B2, noaug, shuffle=True)
sp_tr_b = math.ceil(len(tr_paths)/B2)
gen_te_b = path_label_generator(test_paths, test_labels, B2, noaug, shuffle=False)
sp_te_b = math.ceil(len(test_paths)/B2)

base_model = build_cnn_tpe(base_hyp)
base_model.fit(gen_tr_b, steps_per_epoch=sp_tr_b, epochs=int(base_hyp["epochs"]), verbose=0)
base_prob = base_model.predict(gen_te_b, steps=sp_te_b, verbose=0)
base_pred = np.argmax(base_prob, axis=1)

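base_acc  = accuracy_score(test_labels, base_pred)
base_prec = precision_score(test_labels, base_pred, average="macro", zero_division=0)
base_rec  = recall_score(test_labels, base_pred, average="macro", zero_division=0)
base_f1   = f1_score(test_labels, base_pred, average="macro", zero_division=0)

# Macro specificity (one-vs-rest), computed the same way as for the CNN + TPE model
base_cm = confusion_matrix(test_labels, base_pred, labels=[0,1,2])
base_specs = []
for c in range(3):
    TP=base_cm[c,c]; FP=base_cm[:,c].sum()-TP; FN=base_cm[c,:].sum()-TP; TN=base_cm.sum()-(TP+FP+FN)
    base_specs.append(TN/(TN+FP+1e-12))
base_spec = float(np.mean(base_specs))

tpe_vals = pd.read_csv('exports/tables/test_macro_metrics.csv').iloc[0]

metrics = ["Accuracy","Sensitivity","Specificity","Precision","F-Score","Recall"]
cnn_vals = [base_acc, base_rec, base_spec, base_prec, base_f1, base_rec]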
tpe_plot = [tpe_vals["Accuracy"], tpe_vals["Sensitivity/Recall"], tpe_vals["Specificity"],
            tpe_vals["Precision"], tpe_vals["F1"], tpe_vals["Sensitivity/Recall"]]

x = np.arange(len(metrics)); width=0.35
plt.figure(figsize=(8,4))
plt.bar(x-width/2, [v if v is not None else 0 for v in cnn_vals], width, label="CNN")
plt.bar(x+width/2, tpe_plot, width, label="CNN + TPE")
plt.xticks(x, metrics, rotation=15); plt.ylim(0,1.05)
plt.ylabel("Score"); plt.title("Performance: CNN vs CNN + TPE"); plt.legend()
plt.tight_layout(); plt.savefig('exports/figures/fig3_cnn_vs_cnn_tpe.png', dpi=160); plt.close()

pd.DataFrame({"Metric":metrics,"CNN":cnn_vals,"CNN+TPE":tpe_plot}).to_csv('exports/tables/cnn_vs_cnn_tpe_summary.csv', index=False)


In [None]:

# Helpers and nested CV on training set only + ablation
def ci95(xs):
    xs = np.asarray(xs, dtype=float); m = xs.mean(); sd = xs.std(ddof=1); se = sd/np.sqrt(len(xs))
    return m, sd, (m-1.96*se, m+1.96*se), xs.min(), xs.max()

def summarize_runs(records):
    keys = ["acc","prec","rec","f1","auc"]
    rows = {}
    for k in keys:
        m, sd, (lo,hi), mn, mx = ci95([r[k] for r in records])
        rows[k] = [f"{100*m:.1f} ± {100*sd:.1f}", f"[{100*lo:.1f} – {100*hi:.1f}]",
                   f"{100*mn:.1f}", f"{100*mx:.1f}"]
    df = pd.DataFrame(rows, index=["Mean ± SD","95% CI","Minimum","Maximum"]).T
    df.index = ["Accuracy (%)","Precision (%)","Recall (%)","F1-score (%)","Macro-AUC (%)"]
    return df

def evaluate_once(variant, hyp, paths, labels):
    tr_p, te_p, tr_l, te_l = train_test_split(paths, labels, test_size=0.2, random_state=SEED, stratify=labels)
    tr_p2, va_p2, tr_l2, va_l2 = train_test_split(tr_p, tr_l, test_size=0.2, random_state=SEED, stratify=tr_l)

    aug_tr = make_aug(variant)
    aug_va = ImageDataGenerator(rescale=1./255)
    aug_te = ImageDataGenerator(rescale=1./255)

    if getattr(aug_tr, "zca_whitening", False) and not hasattr(aug_tr, "_fitted_zca"):
        tmp = []
        for p in np.random.choice(tr_p2, size=min(256, len(tr_p2)), replace=False):
            img = tf.keras.preprocessing.image.load_img(p, target_size=IMG_SIZE, color_mode="rgb")
            tmp.append(tf.keras.preprocessing.image.img_to_array(img)/255.0)
        aug_tr.fit(np.stack(tmp,0)); aug_tr._fitted_zca=True

    B = hyp["batch"]
    gen_tr = path_label_generator(tr_p2, tr_l2, B, aug_tr, shuffle=True)
    gen_va = path_label_generator(va_p2, va_l2, B, aug_va, shuffle=False)
    gen_te = path_label_generator(te_p,  te_l,  B, aug_te, shuffle=False)
    sp_tr = math.ceil(len(tr_p2)/B); sp_va = math.ceil(len(va_p2)/B); sp_te = math.ceil(len(te_p)/B)

    model = build_cnn_tpe(hyp)
    model.fit(gen_tr, steps_per_epoch=sp_tr, epochs=int(hyp["epochs"]),
              validation_data=gen_va, validation_steps=sp_va, verbose=0)
    y_prob = model.predict(gen_te, steps=sp_te, verbose=0)
    y_pred = np.argmax(y_prob, axis=1); y_true = te_l
    acc  = accuracy_score(y_true, y_pred)
    prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
    rec  = recall_score(y_true, y_pred, average="macro", zero_division=0)
    f1   = f1_score(y_true, y_pred, average="macro", zero_division=0)
    cm   = confusion_matrix(y_true, y_pred, labels=[0,1,2])
    specs=[]
    for c in range(3):
        TP=cm[c,c]; FP=cm[:,c].sum()-TP; FN=cm[c,:].sum()-TP; TN=cm.sum()-(TP+FP+FN)
        specs.append(TN/(TN+FP+1e-12))
    spec = float(np.mean(specs))
    auc = roc_auc_score(label_binarize(y_true, classes=[0,1,2]), y_prob, average="macro", multi_class="ovr")
    return {"acc":acc,"prec":prec,"rec":rec,"f1":f1,"spec":spec,"auc":auc}

def nested_cv(variant, hyp, paths, labels):
    n_splits = 5; n_repeats = 5 if FULL_PIPELINE else 1
    rskf = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=SEED)
    results=[]
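    for i,(tr,_) in enumerate(rskf.split(paths, labels),1):
        # Only the outer training indices are used; evaluate_once carves its own hold-out and validation splits from them
        res = evaluate_once(variant, hyp, paths[tr], labels[tr])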
        results.append(res)
        print(f"{variant} run {i:02d}: acc={res['acc']*100:.2f}")
        if not FULL_PIPELINE and i>=5: break
    return results

hyp_cv = {**BEST_HYP}

full_results = nested_cv("full", hyp_cv, train_paths, train_labels)
none_results = nested_cv("none", hyp_cv, train_paths, train_labels)

tbl10 = summarize_runs(full_results)
tbl10.to_csv('exports/tables/table10_nested_cv_summary.csv')
pd.DataFrame(full_results).to_csv('exports/tables/table10_nested_cv_runs.csv', index=False)

def paired_test(a_list, b_list):
    stat, p = stats.wilcoxon(a_list, b_list, zero_method="wilcox", alternative="two-sided", mode="approx")
    d = (np.array(b_list)-np.array(a_list)).mean() / (np.array(b_list)-np.array(a_list)).std(ddof=1)
    return p, d

rows=[]
for k in ["acc","prec","rec","f1","auc"]:
    p,d = paired_test([r[k] for r in none_results],[r[k] for r in full_results])
    rows.append([k.upper(), p, d])
pd.DataFrame(rows, columns=["Metric","Wilcoxon p","Cohen d"]).to_csv('exports/tables/table11_wilcoxon_cohensd.csv', index=False)

rot_results = nested_cv("rotation",  hyp_cv, train_paths, train_labels)
bri_results = nested_cv("brightness",hyp_cv, train_paths, train_labels)
zca_results = nested_cv("zca",       hyp_cv, train_paths, train_labels)

def friedman_series(metric):
    A = [r[metric] for r in none_results]
    B = [r[metric] for r in rot_results]
    C = [r[metric] for r in bri_results]
    D = [r[metric] for r in zca_results]
    E = [r[metric] for r in full_results]
    chi2, p = stats.friedmanchisquare(A,B,C,D,E)
    return chi2, p

fr_rows=[]
for metric in ["acc","prec","rec","f1","auc"]:
    chi2,p = friedman_series(metric)
    fr_rows.append([metric.upper(), chi2, p])
pd.DataFrame(fr_rows, columns=["Metric","Friedman chi2","p"]).to_csv('exports/tables/table12_friedman.csv', index=False)

def mean_sd(vals): return f"{np.mean(vals)*100:.1f} ± {np.std(vals,ddof=1)*100:.1f}"
posthoc=[]
meansd=[]
for metric in ["acc","prec","rec","f1","auc"]:
    A = [r[metric] for r in none_results]
    for label, series in [("Rotation Only",rot_results),("Brightness Only",bri_results),
                          ("ZCA Whitening",zca_results),("Full Augmentation",full_results)]:
        B = [r[metric] for r in series]
        stat,p = stats.wilcoxon(A,B, zero_method='wilcox', mode="approx")
        p_adj = min(1.0, p*4)
        posthoc.append([metric.upper(), label, p_adj, "Significant" if p_adj<0.05 else "NS"])
    meansd.append([metric.upper(),
                   mean_sd(A), mean_sd([r[metric] for r in rot_results]),
                   mean_sd([r[metric] for r in bri_results]), mean_sd([r[metric] for r in zca_results]),
                   mean_sd([r[metric] for r in full_results])])
pd.DataFrame(posthoc, columns=["Metric","Augmentation","Post-hoc p (Bonf.)","Significance"]).to_csv('exports/tables/table12_posthoc.csv', index=False)
pd.DataFrame(meansd, columns=["Metric","No Aug.","Rotation Only","Brightness Only","ZCA Whitening","Full Aug."]).to_csv('exports/tables/table12_meansd.csv', index=False)

# Visualizations
acc_means = [
    np.mean([r["acc"] for r in none_results]),
    np.mean([r["acc"] for r in rot_results]),
    np.mean([r["acc"] for r in bri_results]),
    np.mean([r["acc"] for r in zca_results]),
    np.mean([r["acc"] for r in full_results]),
]
acc_sds = [
    np.std([r["acc"] for r in none_results], ddof=1),
    np.std([r["acc"] for r in rot_results], ddof=1),
    np.std([r["acc"] for r in bri_results], ddof=1),
    np.std([r["acc"] for r in zca_results], ddof=1),
    np.std([r["acc"] for r in full_results], ddof=1),
]
labels = ["No Aug.","Rotation","Brightness","ZCA","Full"]
plt.figure(figsize=(7,4))
plt.bar(range(len(labels)), acc_means, yerr=acc_sds)
plt.xticks(range(len(labels)), labels)
plt.ylabel("Accuracy"); plt.title("Augmentation Ablation (Accuracy)")
plt.tight_layout(); plt.savefig('exports/figures/ablation_barplot.png', dpi=160); plt.close()

plt.figure(figsize=(6,4))
full_accs = [r["acc"] for r in full_results]
plt.violinplot(full_accs, showmeans=True)
plt.ylabel("Accuracy"); plt.title("Nested CV: Accuracy Distribution (Full Aug)")
plt.tight_layout(); plt.savefig('exports/figures/nested_cv_distribution.png', dpi=160); plt.close()
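
The summary statistics used in this cell can be checked on a toy example. The sketch below uses made-up accuracies and illustrative helper names, and assumes NumPy only. It reproduces the normal-approximation 95% interval, the paired effect size (mean of the paired differences divided by their sample SD), and the Bonferroni adjustment applied to the four post-hoc comparisons.

```python
import numpy as np

def ci95_normal(xs):
    """Mean and normal-approximation 95% interval: mean +/- 1.96 * SD / sqrt(n)."""
    xs = np.asarray(xs, dtype=float)
    se = xs.std(ddof=1) / np.sqrt(xs.size)
    return xs.mean(), (xs.mean() - 1.96 * se, xs.mean() + 1.96 * se)

def paired_cohens_d(a, b):
    """Effect size for paired samples: mean of the differences over their sample SD."""
    diff = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    return diff.mean() / diff.std(ddof=1)

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons and cap at 1."""
    p = np.asarray(p_values, dtype=float)
    return np.minimum(1.0, p * p.size)

# Made-up accuracies from five paired runs, for illustration only
no_aug = [0.80, 0.82, 0.79, 0.81, 0.83]
full   = [0.84, 0.85, 0.83, 0.86, 0.85]

mean, (lo, hi) = ci95_normal(full)
d = paired_cohens_d(no_aug, full)
adj = bonferroni([0.01, 0.02, 0.2, 0.5])
print(f"mean={mean:.3f}  CI=[{lo:.3f}, {hi:.3f}]  d={d:.2f}  adjusted p={adj}")
```

With only a handful of runs, the 1.96 multiplier understates the interval width compared with a t-based interval, so these intervals are best read as approximate.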


In [None]:

# Templates for tables that require external data
pd.DataFrame(columns=["Tumor Type","Mean Bias (%)","Lower 95% LoA (%)","Upper 95% LoA (%)","Standard Deviation"]).to_csv(
    'exports/tables/table13_bland_altman_TEMPLATE.csv', index=False)
pd.DataFrame(columns=["Authors","Dataset","Methods","Accuracy"]).to_csv(
    'exports/tables/table14_literature_TEMPLATE.csv', index=False)
