# Note

Here we focus only on untargeted attacks.

| Attack method | Attack-generating model structure | Direction |
|---------------|---------|----------|
| **CWL2**      | ResNet  | Untargeted (train / val / test) |
| **PGD (big step)** | ResNet | Untargeted (train / val / test)|
|               | VGG     | Untargeted (train / val / test)|
| **PGD (standard)**  | ResNet | Untargeted (train / val / test) |
| |VGG | Untargeted (train / val / test) |
| **FGM (big eps)** | ResNet |  Untargeted (train / val / test)|

The assumed directory layout is:

- experiment_untargeted_adv.ipynb
- data
| -- modules
| | --- CIFAR10models
| | --- Adversarial_models
| -- samples
| | --- cwl2_targeted_to-2nd_test_by_resnet56v1_ver0.npy
| | --- ...
| | --- fgm_eps216_targeted_to-2nd_test_by_resnet56v1_ver0.npy
| | --- ...
...

- logifold_modules
| -- logifoldv1_4_modified.py
- adv_logifold.py
- runs
| -- cache
| | --- preds
| | --- metrics
| | --- index
- analysis

The experiment proceeds as follows.


1. Load the untargeted adversarial samples generated by ResNet56v1 ver0 from data/samples/. There are four attacks: CWL2, PGD (big step), PGD (standard), and FGM (big eps).
2. Train a ResNet56v1 model on the union of the original CIFAR10 dataset and each loaded adversarial sample, then tune it further using the specialization method. This yields four different models.
3. Load the ResNet56v1 ver0 model, which was used to generate the samples, from data/modules/CIFAR10models/.
4. Tune the ver0 model using the specialization method on the union of the original and perturbed samples. We now have 9 models in total, including ResNet56v1 ver0.
5. The committee stored in data/modules/CIFAR10models serves as the Judge.
6. For the CWL2 sample, measure the entropy of the Judge's predictions.
7. Record the average entropy on the CWL2 sample in boxplots, together with the number of samples whose entropy exceeds that average, for both the original and the CWL2 sample.
8. Specialize the Judge on the union of the original and CWL2 samples whose entropy is greater than the average.
9. Construct the logifold. Models are added manually with an empty fuzzy domain; each fuzzy domain is then computed on the validation dataset on which the model was specialized (or trained).
10. Measure the accuracy of the logifold on each of the following datasets: the original CIFAR10 test dataset, every adversarial test dataset, and the union of the original and each adversarial test dataset.
11. Record the results in the analysis folder.
12. Because constructing a logifold copies the models into the new logifold folder, clean that folder out afterwards to free up space.
13. Repeat steps 6 to 12 for every other adversarial sample.
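
Steps 6 to 8 rely on a per-sample entropy computed from the Judge committee. The notebook delegates this to `AdvLogifold.get_entropy_array`, whose internals are not shown here. The following is therefore only a minimal sketch, under the assumption that the entropy is the Shannon entropy of the committee's averaged class probabilities. The names `committee_entropy` and `split_by_mean_entropy` are hypothetical and not part of the project.

```python
import numpy as np

def committee_entropy(member_probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy of the committee-averaged class distribution.

    member_probs has shape (n_members, n_samples, n_classes); each row along
    the last axis is assumed to be a probability vector (e.g. softmax output).
    """
    mean_probs = member_probs.mean(axis=0)  # (n_samples, n_classes)
    return -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)

def split_by_mean_entropy(entropy: np.ndarray):
    """Return (threshold, boolean mask of samples at or above the threshold)."""
    alpha = float(entropy.mean())
    return alpha, entropy >= alpha

# Toy committee: 2 members, 3 samples, 2 classes.
probs = np.array([
    [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]],
    [[1.0, 0.0], [0.5, 0.5], [0.1, 0.9]],
])
ent = committee_entropy(probs)
alpha, high = split_by_mean_entropy(ent)
print(ent, alpha, high)
```

In this toy case the first sample gets entropy close to zero because both members agree with full confidence, while the third sample gets high entropy because the members disagree, even though each is individually confident.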


## Import Libraries

In [1]:
# Import libraries

from __future__ import annotations
import glob
from pathlib import Path
from dataclasses import dataclass
from typing import List, Tuple

import numpy as np
import tensorflow as tf
from keras.models import load_model
from keras.utils import to_categorical
from keras.datasets import cifar10
from sklearn.model_selection import train_test_split

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd

## Define paths

In [2]:
# Define paths

ROOT = Path(".").resolve()
DATA = ROOT / "data"
MODELS_DIR = DATA / "models"
ADV_MODELS_DIR = DATA / "adversarial_models"
ADV_SAMPLES = DATA / "samples"
EXPERTS_DIR = DATA / "specialized_models"
LOGIFOLD_MODS = (ROOT / "logifold_modules") 


CACHE = DATA / "cache"
CACHE_PREDS = CACHE / "preds"
CACHE_METRICS = CACHE / "metrics"
CACHE_INDEX = CACHE / "index"
ANALYSIS = DATA / "analysis"
ANALYSIS.mkdir(parents=True, exist_ok=True)
FIGURES = ANALYSIS / "figures"
FIGURES.mkdir(parents=True, exist_ok=True)
REPORTS = ANALYSIS / "reports"
REPORTS.mkdir(parents=True, exist_ok=True)
LGFD_PATH = DATA / "logifold/"
LGFD_PATH.mkdir(parents=True, exist_ok=True)
# Define Judge
JUDGES_DIR = sorted(glob.glob(str(MODELS_DIR / 'resnet*original_tuned-once-on_original*')))
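
Every later cell assumes these directories exist and are populated, so a quick check up front can prevent a confusing failure further down. The sketch below uses a hypothetical helper `missing_dirs`, which is not part of the notebook, and runs it on a throwaway temporary tree instead of the real `data/` folder.

```python
from pathlib import Path
import tempfile

def missing_dirs(required):
    """Return the paths in `required` that are not existing directories."""
    return [p for p in required if not Path(p).is_dir()]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data" / "models").mkdir(parents=True)
    required = [root / "data" / "models", root / "data" / "samples"]
    missing = missing_dirs(required)
    print([p.name for p in missing])
```

In the notebook, the same helper could be called with `[MODELS_DIR, ADV_MODELS_DIR, ADV_SAMPLES, EXPERTS_DIR]` before any model is loaded.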


## Import project libraries

In [3]:
from logifold_modules.logifoldv1_4_modified import Logifold, _stem_all, int_from_model_path
from logifold_modules.resnet_modified import ResNet
import logifold_modules.custom_specialization as specialization
from adv_logifold import AdvLogifold, get_statistics, plot_disagreements
import cache_store



## Define helper functions

In [4]:
def build_and_train_resnet(training_x,
                           training_y_long,
                           validating_x,
                           validating_y_long,
                           path,
                           n,
                           v,
                           ) -> None:
    """Train a ResNet on the given data. The model is not returned; callers reload it from `path`."""
    resnet_model = ResNet(path, training_x, training_y_long, validating_x, validating_y_long, n=n, version=v)
    resnet_model.train(save_best_only=True, epochs=200)

def load_adv_samples(pattern: str, _print_ : bool = False) -> "np.ndarray | List[np.ndarray]":
    """Load every .npy file in ADV_SAMPLES matching `pattern`.

    Returns a single array when exactly one file matches, otherwise a list of arrays.
    """
    files = sorted(glob.glob(str(ADV_SAMPLES / pattern)))
    if not files:
        raise FileNotFoundError(f"No samples for pattern: {pattern}")
    if _print_:
        print(f"Loading {len(files)} files matching pattern: {pattern}")
        for f in files:
            print(f" - {f}")
    samples = [np.load(f) for f in files]
    if len(samples) == 1:
        samples = samples[0]
    return samples
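
`load_adv_samples` returns a bare array when exactly one file matches and a list of arrays otherwise, so downstream code that calls `.shape` expects patterns that match a single file. The sketch below reproduces that behaviour on temporary files. `load_matching` is a hypothetical stand-in that takes the directory as an argument instead of reading the global `ADV_SAMPLES`, and the file names are made up for the demonstration.

```python
import glob
import tempfile
from pathlib import Path

import numpy as np

def load_matching(directory: Path, pattern: str):
    """Load all .npy files matching `pattern`; unwrap the list if there is only one."""
    files = sorted(glob.glob(str(directory / pattern)))
    if not files:
        raise FileNotFoundError(f"No samples for pattern: {pattern}")
    arrays = [np.load(f) for f in files]
    return arrays[0] if len(arrays) == 1 else arrays

with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    np.save(d / "cwl2_untargeted_train_by_demo.npy", np.zeros((4, 2)))
    np.save(d / "cwl2_untargeted_val_by_demo.npy", np.ones((3, 2)))
    single = load_matching(d, "*cwl2*train*.npy")   # one match: an array
    several = load_matching(d, "*cwl2*.npy")        # two matches: a list
    print(type(single).__name__, single.shape)
    print(type(several).__name__, len(several))
```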

## Configurations

In [None]:
@dataclass
class AttackEntry:
    short_tag: str       # short name used to select the attack
    glob_pattern: str    # glob pattern of the training sample file in data/samples
    adv_label: str       # label used in cache keys and model file names

# Untargeted sets generated by ResNet56v1 ver0
ATTACKS: List[AttackEntry] = [
    AttackEntry("CWL2",            "*cwl2*untargeted_train_by_resnet56v1_ver0.npy", "cwl2-untargeted-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_bigstep",     "*pgd*eps216*untargeted_train_by_resnet56v1_ver0.npy","pgd-eps216-iter96-8steps-untargeted-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_standard",    "*pgd*eps8*untargeted_train_by_resnet56v1_ver0.npy","pgd-eps8-iter2-10steps-untargeted-gen-by-resnet56v1-ver0"),
    AttackEntry("FGM",     "*fgm*eps216*untargeted_train_by_resnet56v1_ver0.npy","fgm-eps216-untargeted-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_VGG",     "*pgd*eps216*untargeted_train_by_vgg16_ver0.npy","pgd-eps216-iter96-8steps-untargeted-gen-by-vgg16-ver0"),
]
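
Later cells look up an entry by `short_tag` and derive the validation pattern from the training pattern by string replacement. The following self-contained sketch shows that lookup with two of the entries above. `find_attack` and `pattern_for_split` are hypothetical helpers. Unlike the notebook, which replaces the bare substring `"train"`, the sketch replaces the delimited token `"_train_"` as a slightly more defensive variant.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AttackEntry:
    short_tag: str
    glob_pattern: str
    adv_label: str

ATTACKS: List[AttackEntry] = [
    AttackEntry("CWL2", "*cwl2*untargeted_train_by_resnet56v1_ver0.npy",
                "cwl2-untargeted-gen-by-resnet56v1-ver0"),
    AttackEntry("FGM", "*fgm*eps216*untargeted_train_by_resnet56v1_ver0.npy",
                "fgm-eps216-untargeted-gen-by-resnet56v1-ver0"),
]

def find_attack(short_tag: str) -> AttackEntry:
    """Return the entry with the given short tag, or raise KeyError."""
    for atk in ATTACKS:
        if atk.short_tag == short_tag:
            return atk
    raise KeyError(f"Unknown attack tag: {short_tag}")

def pattern_for_split(atk: AttackEntry, split: str) -> str:
    """Swap the '_train_' token of the glob pattern for another split name."""
    return atk.glob_pattern.replace("_train_", f"_{split}_")

print(pattern_for_split(find_attack("CWL2"), "val"))
```

Raising on an unknown tag makes a typo in the tag fail immediately rather than surfacing later as an undefined variable.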

## Original Data Loading

In [6]:
(x, y), (x_test, y_test) = cifar10.load_data()
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.2, random_state=42)
x_train = x_train.astype('float32') / 255.0
x_val = x_val.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

y_train_categorical_10 = to_categorical(y_train,10)
y_val_categorical_10 = to_categorical(y_val,10)
y_test_categorical_10 = to_categorical(y_test,10)



## Define helper function after loading samples

In [None]:
def train_union_and_specialize(
    x_adv_tr: np.ndarray, x_adv_val: np.ndarray, adv_label: str,
) -> tuple:
    """
    Train (or load) a ResNet56v1 on the union of original and adversarial samples, then tune it once.

    Returns (baseline_adv_model, tuned_baseline_adv_model, tuned_history_dict_or_None)
    """
    size = x_adv_tr.shape[0] # CWL2 example training size is not 40000 but 10001.
    train_union = np.concatenate([x_train, x_adv_tr], axis=0)
    val_union = np.concatenate([x_val, x_adv_val], axis=0)

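    # Adversarial samples are assumed to be aligned with the first samples of the original split,
    # so their labels are sliced to the adversarial set size for both train and val.
    training_y_long=np.concatenate([y_train,y_train[:size]],axis=0)
    validating_y_long=np.concatenate([y_val,y_val[:x_adv_val.shape[0]]],axis=0)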
    if training_y_long.ndim == 1 or training_y_long.shape[1] != 10:
        training_y_long = to_categorical(training_y_long, 10)
    if validating_y_long.ndim == 1 or validating_y_long.shape[1] != 10:
        validating_y_long = to_categorical(validating_y_long, 10)

    path = ADV_MODELS_DIR / f"ResNet56v1_union-of-original-and-{adv_label}_ver0.keras"
    if path.exists():
        base_model = load_model(path)
        print(f'load {path} to specialize once')
    else:
        print(f'train from scratch {path}')
        build_and_train_resnet(train_union,
                    training_y_long,
                    val_union,
                    validating_y_long,
                    path = path,
                    n = 9,
                    v = 1
                    )
        base_model = load_model(path)
    baseline_before_tuning = base_model
    path = ADV_MODELS_DIR / f"ResNet56v1_union-of-original-and-{adv_label}_tuned-once-on_union-of-original-and-{adv_label}_ver0.keras"
    if path.exists():
        baseline_after_tuning = load_model(path)
        print(f'{path} already exists. try to get history of the training procedure')
        hist_baseline = specialization.load_history(path) # it could be none.
        if hist_baseline is None:
            print(f"[WARN] No history found for {path}")
    else:
        print(f'{path} training...')
        baseline_after_tuning,hist_baseline = specialization.turn_specialist(base_model, path = path,
                                                x_tr=train_union, y_tr=training_y_long,
                                                  x_v=val_union,   y_v=validating_y_long,
                                                  epochs=21, learning_rate=1e-3, batch_size=128, verbose=1, name=f"tuned_once")
        hist_baseline = {"history": hist_baseline.history, "params": hist_baseline.params, "epoch": hist_baseline.epoch}

    return baseline_before_tuning, baseline_after_tuning, hist_baseline

def construct_or_load_logifold(num_classes:int = 10):
    """
    Build AdvLogifold instance and add models to AdvLogifold.
    After constructing, we will call getFuzDoms(x=val, y=val_onehot, ...)
    returns (adversarial_lgfd, JUDGES_KEYS)
    """
    path = LGFD_PATH
    if not path.exists():
        path.mkdir(parents=True, exist_ok=True)
    eval_path = path/'evals'
    if not eval_path.exists():
        eval_path.mkdir(parents=True, exist_ok=True)
    adversarial_lgfd = AdvLogifold(num_classes, new_story = False, path = str(path)+ '/', path_for_cache = str(CACHE))
    adversarial_lgfd.load()
    JUDGES_KEYS = []
    for a_judge_path in JUDGES_DIR:
        key = (int_from_model_path(a_judge_path),)
        JUDGES_KEYS.append(key)
        if key not in adversarial_lgfd.keys():
            print(f"Adding a judge from {a_judge_path} with key {key}...")
            model = load_model(a_judge_path)
            adversarial_lgfd.add(model, key = key, filetype = 'keras',
                         fuzDom = {}, model_path=_stem_all(a_judge_path))
    for k in JUDGES_KEYS:
        if not adversarial_lgfd.charts[k]['fuzDom']:
            print(f'{k} has no fuzDom,')
            adversarial_lgfd.getFuzDoms(keys = [k],
                        x = x_val, y = y_val_categorical_10, sample_name = 'original_val',
                        update = False, autosave = False, verbose = 0)
    return adversarial_lgfd, JUDGES_KEYS

def specialize_Committee(adversarial_lgfd : AdvLogifold, Comm_keys : List[Tuple],  adv_short_tag: str):
    # Get adversarial sample corresponding to the adv_short_tag
    adv_type = adv_short_tag
    for atk in ATTACKS:
        if atk.short_tag == adv_type:
            adv_sample_name = atk.adv_label
            adv_sample_train = load_adv_samples(atk.glob_pattern)
            pattern = atk.glob_pattern.replace("train", "val")
            adv_sample_val = load_adv_samples(pattern)
            break
    else:
        # Without this guard an unknown tag would surface later as a NameError.
        raise ValueError(f"Unknown attack short_tag: {adv_short_tag}")

    # Compute entropy of adversarial sample by JUDGE models
    ent_original_train =adversarial_lgfd.get_entropy_array(Comm_keys, sample_name = 'original_train', sample = x_train)
    ent_adv_train = adversarial_lgfd.get_entropy_array(Comm_keys, sample_name = adv_sample_name + '_train', sample = adv_sample_train)
    fp = FIGURES / f"entropy-disagreements-on-original_train.png"
    if not fp.exists():
        plot_disagreements(ent_original_train, title = f"Entropy Disagreements on original_train", save_path = fp)

    plot_disagreements(ent_adv_train, title = f"Entropy Disagreements on {adv_sample_name}_train", save_path = FIGURES / f"entropy-disagreements-on-{adv_sample_name}_train.png")

    ent_original_val = adversarial_lgfd.get_entropy_array(Comm_keys, sample_name = 'original_val', sample = x_val)
    fp = FIGURES / f"entropy-disagreements-on-original_val.png"
    if not fp.exists():
        plot_disagreements(ent_original_val, title = f"Entropy Disagreements on original_val", save_path = fp)

    ent_adv_val = adversarial_lgfd.get_entropy_array(Comm_keys, sample_name = adv_sample_name + '_val', sample = adv_sample_val)
    plot_disagreements(ent_adv_val, title = f"Entropy Disagreements on {adv_sample_name}_val", save_path = FIGURES / f"entropy-disagreements-on-{adv_sample_name}_val.png")
    
    # Including original sample, compute average of entropy
    stats = {}
    stats[('original','train')] = get_statistics(ent_original_train)
    stats[('original','val')] = get_statistics(ent_original_val)
    stats[('adv','train')] = get_statistics(ent_adv_train)
    stats[('adv','val')] = get_statistics(ent_adv_val)
    train_alpha_union = (stats[('original','train')]['average'] + stats[('adv','train')]['average'])/2
    val_alpha_union = (stats[('original','val')]['average'] + stats[('adv','val')]['average'])/2

    # separate union of original and adversarial samples into high entropy and low entropy samples
    loc_1_original_train = ent_original_train>=train_alpha_union
    loc_1_adv_train = ent_adv_train>=train_alpha_union
    loc_1_original_val = ent_original_val>=val_alpha_union
    loc_1_adv_val = ent_adv_val>=val_alpha_union
    print('alpha for train: {}, for val: {}'.format(train_alpha_union, val_alpha_union))
    print('the number of data greater than alpha:')
    print(f'Training set original + {adv_type}:', np.sum(loc_1_original_train), '+',np.sum(loc_1_adv_train), '=', np.sum(loc_1_original_train) + np.sum(loc_1_adv_train))
    print(f'Validation set original + {adv_type}:', np.sum(loc_1_original_val), '+', np.sum(loc_1_adv_val), '=', np.sum(loc_1_original_val) + np.sum(loc_1_adv_val))

    # Labels of adversarial samples are taken from the aligned original samples.
    y_adv_train = y_train[:adv_sample_train.shape[0]]
    y_adv_val = y_val[:adv_sample_val.shape[0]]
    DATASETS = {"Experts_union": dict(
        train=(np.concatenate([x_train[loc_1_original_train], adv_sample_train[loc_1_adv_train]]),
               to_categorical(np.concatenate([y_train[loc_1_original_train], y_adv_train[loc_1_adv_train]]), 10)),
        val=(np.concatenate([x_val[loc_1_original_val], adv_sample_val[loc_1_adv_val]]),
             to_categorical(np.concatenate([y_val[loc_1_original_val], y_adv_val[loc_1_adv_val]]), 10)),
    )}
    
    # specialize Judge models on the high entropy samples
    EXPERTS_KEYS = []
    experts_paths = []
    
    for a_judge_key in Comm_keys: 
        a_judge = adversarial_lgfd.getModel(a_judge_key)
        a_judge_name = adversarial_lgfd.model_source_name(a_judge_key)
        
        path = EXPERTS_DIR / f"{a_judge_name}_specialized-once-on_high-entropy-union-of-original-and-{adv_sample_name}_ver0.keras"
        
        if path.exists():
            print(f"There is specialized Judge {a_judge_name} on union of original and {adv_type} samples.")
        
            specialist = load_model(str(path))
        else:
            print(f"Specializing Judge {a_judge_name} on union of original and {adv_type} samples...")
        
            specialist, _ = specialization.turn_specialist(model = a_judge, path = path,
                                           x_tr = DATASETS["Experts_union"]["train"][0], y_tr = DATASETS["Experts_union"]["train"][1],
                                           x_v = DATASETS["Experts_union"]["val"][0], y_v = DATASETS["Experts_union"]["val"][1],
                                           epochs = 21, learning_rate = 1e-3, batch_size = 128, verbose = 0, 
                                           name = f"specialized_once_on_high-entropy_union_of_original_and_{adv_sample_name}")
            # Add them to Advlogifold
        key = (a_judge_key[0],int_from_model_path(f"{a_judge_name}_specialized-once-on_high-entropy-union-of-original-and-{adv_sample_name}_ver0.keras"))
        print('prepared key:', key)
        if key in adversarial_lgfd.keys():
            print(f'specialized model is already a member of logifold')
        else:
            print(f'Adding specialized model...')
            adversarial_lgfd.add(specialist,
                             key = key,
                             model_path = _stem_all(path),
                             description = f'specialized on high entropy union of original and {adv_sample_name}', 
                             fuzDom = {})
        # compute fuzdom
        adversarial_lgfd.getFuzDoms(keys = [key],
                            x = DATASETS["Experts_union"]["val"][0], y = DATASETS["Experts_union"]["val"][1], sample_name = f'union_of_original_and_{adv_sample_name}_val',
                            update = False, autosave = False, verbose = 0)
        EXPERTS_KEYS.append(key)
        experts_paths.append(path)
        
    # Set alpha outside the loop so it is defined even when Comm_keys is empty.
    alpha = val_alpha_union
    return EXPERTS_KEYS, experts_paths, alpha

def _pick_acc(result):
    """Return (history-based acc, average acc, simple-vote acc, best refined-vote acc)."""
    table = result[-1][0]
    history = result[-1][-1]
    avg = table.loc[0, "acc by taking average"]
    vote = table.loc[0, "acc by simple vote"]
    refined = table['acc by refined vote'].max()
    # Fall back to the refined-vote accuracy when no validation history is available.
    hist = history["Accuracy"][-1] if history is not None else refined
    return hist, avg, vote, refined
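
The selection logic in `specialize_Committee` is easy to lose inside the nested `DATASETS` expression, so here is a NumPy-only sketch of it in isolation. It assumes, as the notebook does for the training split, that the i-th adversarial sample was generated from the i-th original sample and therefore carries the label `y_orig[i]`. The name `select_high_entropy_union` and the toy arrays are hypothetical.

```python
import numpy as np

def select_high_entropy_union(x_orig, y_orig, ent_orig, x_adv, ent_adv):
    """Keep samples whose entropy is at least the mean of the two set means."""
    alpha = (ent_orig.mean() + ent_adv.mean()) / 2.0
    keep_orig = ent_orig >= alpha
    keep_adv = ent_adv >= alpha
    # Adversarial sample i is assumed to share the label of original sample i.
    y_adv = y_orig[: x_adv.shape[0]]
    x_sel = np.concatenate([x_orig[keep_orig], x_adv[keep_adv]], axis=0)
    y_sel = np.concatenate([y_orig[keep_orig], y_adv[keep_adv]], axis=0)
    return float(alpha), x_sel, y_sel

x_orig = np.arange(4, dtype=float).reshape(4, 1)         # 4 original samples
y_orig = np.array([0, 1, 2, 3])
ent_orig = np.array([0.0, 0.0, 0.6, 0.2])                # mean 0.2
x_adv = 10.0 + np.arange(2, dtype=float).reshape(2, 1)   # only 2 adversarial samples
ent_adv = np.array([0.1, 0.9])                           # mean 0.5

alpha, x_sel, y_sel = select_high_entropy_union(x_orig, y_orig, ent_orig, x_adv, ent_adv)
print(alpha, x_sel.ravel(), y_sel)
```

Because the threshold averages the two set means rather than pooling all samples, a small adversarial set (such as the shorter CWL2 set) weighs as much as the full original set.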



## Run Experiment : untargeted adv

In [8]:
# adv_samples_for_baselines = sorted(glob.glob(str(ADV_SAMPLES / "*train*.npy")))
# adv_train_samples = {}
# adv_val_samples = {}
# for f in adv_samples_for_baselines:
#     name = Path(f).stem
#     parts = name.split('_')
#     parts_wo_train = [p for p in parts if p != 'train']

#     val_name = name.replace('train','val', 1)
#     name = '-'.join(parts_wo_train)
#     adv_train_samples[name] = np.load(f)
#     val_path = ADV_SAMPLES / (val_name + '.npy')
#     if val_path.exists():
#         adv_val_samples[name] = np.load(str(val_path))
#     else:
#         print(f"[warn] No val sample for {name}")
#         adv_val_samples[name] = None

# for tr, v in zip(adv_train_samples.items(), adv_val_samples.items()):
#     assert tr[0] == v[0], f"Train and val samples do not match: {tr[0]} vs {v[0]}"
#     name = tr[0]
#     tr = tr[1]
#     v = v[1]
#     before_adv , tuned_adv , hist =train_union_and_specialize(tr, v, name)
#     if hist is not None:
#         plt.plot(hist['history']['accuracy'], label='train accuracy')
#         plt.plot(hist['history']['val_accuracy'], label='val accuracy')
#         plt.title(f'Accuracy of Adversarially trained ResNet56v1 (tuned once)\non {name}')
#         plt.xlabel('Epochs')
#         plt.ylabel('Accuracy')
#         plt.legend()
#         plt.show()

In [15]:
# ------------------------------------------------------------------
# Load adversarial (val/test) numpy datasets
# ------------------------------------------------------------------

test_files = sorted(glob.glob(str(ADV_SAMPLES / "*test*ver0.npy")))
adv_test_samples = {}
adv_val_samples = {}
for f in test_files:
    name = Path(f).stem
    parts = name.split('_')
    parts_wo_test = [p for p in parts if p != 'test']

    val_name = name.replace('test','val', 1)
    name = '-'.join(parts_wo_test)
    adv_test_samples[name] = np.load(f)
    val_path = ADV_SAMPLES / (val_name + '.npy')
    if val_path.exists():
        print(f"[Notice] val sample for {name} exists.")
        adv_val_samples[name] = np.load(str(val_path))
    else:
        print(f"[warn] No val sample for {name}")
        adv_val_samples[name] = None


print('------------------------------------------------------------------')
print('adv samples has been loaded.')
print('------------------------------------------------------------------')

[warn] No val sample for cwl2-targeted-to-2nd-by-resnet56v1-ver0
[warn] No val sample for cwl2-targeted-to-least-by-resnet56v1-ver0
[Notice] val sample for cwl2-untargeted-by-resnet56v1-ver0 exists.
[Notice] val sample for deepfool-untargeted-by-resnet56v1-ver0 exists.
[Notice] val sample for fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0 exists.
[Notice] val sample for fgm-eps216-targeted-to-least-by-resnet56v1-ver0 exists.
[Notice] val sample for fgm-eps216-untargeted-by-resnet56v1-ver0 exists.
[Notice] val sample for pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0 exists.
[warn] No val sample for pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0
[warn] No val sample for pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0
[Notice] val sample for pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0 exists.
[warn] No val sample for pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0
[warn] No val sample for pgd-eps216-iter96-8steps-ta
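
The dataset labels printed above are derived from the file names: the stem is split on `_`, the `test` token is dropped, and the remaining parts are joined with `-`. The validation file name is obtained by replacing the first `test` with `val`. The sketch below isolates that rule. `derive_names` is a hypothetical helper, and the path is only an example string, so no file needs to exist.

```python
from pathlib import Path

def derive_names(path: str):
    """Return (dataset label, stem of the matching validation file)."""
    stem = Path(path).stem
    label = "-".join(p for p in stem.split("_") if p != "test")
    val_stem = stem.replace("test", "val", 1)
    return label, val_stem

label, val_stem = derive_names("data/samples/cwl2_untargeted_test_by_resnet56v1_ver0.npy")
print(label)     # cwl2-untargeted-by-resnet56v1-ver0
print(val_stem)  # cwl2_untargeted_val_by_resnet56v1_ver0
```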

In [None]:
adversarial_lgfd, JUDGES_KEYS = construct_or_load_logifold(num_classes=10)


In [None]:
def evaluate_logifold_and_baselines(adversarial_lgfd :AdvLogifold, adv_samples_labels : str, JUDGES_KEYS : List[Tuple], EXPERTS_KEYS : List[Tuple], EXPERTS_DIR : List[str], alpha : float, single_run : bool = False):
    
    '''
    testing dataset:
    
    x_test = original test
    adv_type = cwl2, deepfool, fgm, pgd_bigstep, pgd_std
    generating model = resnet20v1_ver0 - 7, resnet20v2_ver0 - 3, resnet56v1_ver0 - 3, resnet56v2_ver0 - 3, vgg11_ver0 - 3, vgg13_ver0 - 3, vgg16_ver0 - 3, vgg19_ver0 - 3
    directions = untargeted, targeted_to-least, targeted_to-2nd
    
    
    '''
    
    
    # ------------------------------------------------------------------
    # Load baselines
    # ------------------------------------------------------------------
    original_baselines = {}
    adversarial_trained_baselines = {}
    if single_run:
        for model_path in sorted(glob.glob(str(MODELS_DIR / "*.keras"))):
            model_name = Path(model_path).stem
            original_baselines[model_name] = load_model(model_path)
        for model_path in sorted(glob.glob(str(ADV_MODELS_DIR / "*.keras"))):
            model_name = Path(model_path).stem
            adversarial_trained_baselines[model_name] = load_model(model_path)
    
        print('------------------------------------------------------------------')
        print('baseline models are loaded for running single model evaluation')
        print('------------------------------------------------------------------')
    storage = cache_store.ResultStore(CACHE) # Storage for raw predictions
    
    baseline_rows = []          # per-model x per-dataset
    logifold_rows = []          # Judges/All x per-dataset <-- All means Judges + Experts
    adv_logifold_rows = []      # AdvLogifold x per-dataset <-- name of advlogifold is given by the dataset where experts are specialized on.

    # ------------------------------------------------------------------
    # Helper: run and record a single baseline model on a dataset
    # ------------------------------------------------------------------
    original_truth = y_test.reshape(-1)

    def _eval_baseline_model(model_name, model, X, dataset_tag, y_true = original_truth):
        # Reuse cached raw predictions when they can be loaded; otherwise predict and cache them.
        fp = CACHE / "preds" / dataset_tag / f"{model_name}.npy"
        preds = None
        if fp.exists():
            try:
                preds = np.load(fp)
            except Exception:
                preds = None  # unreadable cache file: recompute below
        if preds is None:
            preds = model.predict(X, verbose=0)
            storage.set_pred(model_name, dataset_tag, preds)
            print(f"Saved raw predictions of {model_name} on {dataset_tag} to cache file")
        ans = np.argmax(preds, axis=-1)
        acc = float(np.mean(ans == y_true.reshape(-1)))
        baseline_rows.append({
            "model": model_name,
            "dataset": dataset_tag,
            "accuracy": round(acc, 4),
        })
        
    # ------------------------------------------------------------------
    # Evaluate baselines on ORIGINAL test
    # ------------------------------------------------------------------
    if single_run:
        print('------------------------------------------------------------------')
        print('Running single model evaluations on original dataset...')
        print('------------------------------------------------------------------')
        for model_name, m in {**original_baselines, **adversarial_trained_baselines}.items():
            _eval_baseline_model(model_name, m, x_test,"original_test")
        

    # Evaluation of the simple ensemble.
    # It is measured with Logifold at certainty threshold 0, i.e. weighted voting with weights computed on the validation dataset,
    # and also with simple majority voting.
    # For each dataset we save these voting results together with all logifold results.
    # We start with the original dataset.
    sample_name = 'original'
    
    print('------------------------------------------------------------------')
    print('Running LOGIFOLD evaluations on original dataset...')
    print('------------------------------------------------------------------')
    committee_sig = adversarial_lgfd._committee_sig_from_keys(JUDGES_KEYS)
    all_sig = adversarial_lgfd._committee_sig_from_keys(JUDGES_KEYS+EXPERTS_KEYS)
    experts_sig = adversarial_lgfd._committee_sig_from_keys(EXPERTS_KEYS)

    Logifold.predict(
        adversarial_lgfd, x_val, x_name = sample_name + '_val',y=y_val_categorical_10,
        keys=JUDGES_KEYS,
        evalOutputFile= 'evals/' + committee_sig + 'original_val_eval.csv',
        show_av_acc=True, show_simple_vote=True, write_story=False
    )
    
    _, _, _, _, result, _, _, _ = Logifold.predict(
        adversarial_lgfd, x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
        keys=JUDGES_KEYS,
        show_av_acc=True, show_simple_vote=True, write_story=False,
        useHistory = 'evals/' + committee_sig + 'original_val_eval.csv'
    ) # result is a list containing pandas DataFrames, lists of figures, etc.
    j_hist, j_avg, j_maj, j_wavg = _pick_acc(result)
    
    
    
    Logifold.predict(
        adversarial_lgfd, x_val, x_name = sample_name + '_val',y=y_val_categorical_10,
        keys=JUDGES_KEYS + EXPERTS_KEYS,
        evalOutputFile= 'evals/' + all_sig + 'original_val_eval.csv',
        show_av_acc=True, show_simple_vote=True, write_story=False
    )
    _, _, _, _, result, _, _, _ = Logifold.predict(
        adversarial_lgfd, x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
        keys=JUDGES_KEYS + EXPERTS_KEYS,
        show_av_acc=True, show_simple_vote=True, write_story=False,
        useHistory = 'evals/' + all_sig + 'original_val_eval.csv'
    )
    a_hist, a_avg, a_maj, a_wavg = _pick_acc(result)
    
    logifold_rows.append({
        "testing_dataset": "original",
        "simple_majority_voting_by_Judges": j_maj,
        "refined_voting_by_Judges": j_wavg,
        "average_voting_by_Judges": j_avg,
        "using_val_history_by_Judges": j_hist,
        "simple_majority_voting_by_all": a_maj,
        "refined_voting_by_all": a_wavg,
        "average_voting_by_all": a_avg,
        "using_val_history_by_all": a_hist,
    })
    
    print('------------------------------------------------------------------')
    print('Running ADVERSARIAL LOGIFOLD evaluations on original dataset...')
    print('------------------------------------------------------------------')
    adversarial_lgfd.predict(
        x_val, x_name= sample_name + '_val', y=y_val_categorical_10,
        committee_Judge=JUDGES_KEYS,
        committee_experts=EXPERTS_KEYS,
        entropy_threshold=alpha,
        show_av_acc=True, show_simple_vote=True,
        reportSeq=[100],
        evalOutputFile='evals/' + experts_sig + 'original_val_eval.csv',
        write_story=False
    )
    _, _, _, _, result, _, _, _ = adversarial_lgfd.predict(
        x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
        committee_Judge=JUDGES_KEYS,
        committee_experts=EXPERTS_KEYS,
        entropy_threshold=alpha,
        show_av_acc=True, show_simple_vote=True,
        reportSeq=[100],
        useHistory='evals/' + experts_sig + f"original_val_eval.csv",
        write_story=False
    )
    
    r_hist, r_avg, r_maj, r_wavg = _pick_acc(result)
    
    adv_logifold_rows.append({
        "testing_dataset": "original",
        "simple_majority_voting_by_all": r_maj,
        "refined_voting_by_all": r_wavg,
        "average_voting_by_all": r_avg,
        "using_val_history_by_all": r_hist,
        "entropy_threshold": alpha,
    })
    
    
    # ------------------------------------------------------------------
    # Evaluate on each ADVERSARIAL sample
    # ------------------------------------------------------------------
    
    
    for testing_adv_label, value in adv_test_samples.items():
        sample_name = testing_adv_label
        adv_x_val = adv_val_samples[testing_adv_label]
        
        adv_x_test = value
        if single_run:
            print('------------------------------------------------------------------')
            print(f'Running single model evaluations on {sample_name} dataset...')
            print('------------------------------------------------------------------')
            for model_name, m in {**original_baselines, **adversarial_trained_baselines}.items():
                    _eval_baseline_model(model_name, m, adv_x_test, sample_name+"_test", y_true = original_truth)
        skip = adv_x_val is None
        if skip:
            print(f"[warn] No val sample for {testing_adv_label}, skipping validation history results and AdvLogifold results...")
        if skip:
            print('------------------------------------------------------------------')
            print(f'Running LOGIFOLD and Adversarial LOGIFOLD evaluations on {sample_name} dataset without using history because of the absence of validation sample...')
            print('------------------------------------------------------------------')
            useHistory = None
            _, _, _, _, resultJ, _, _, _ = Logifold.predict(
                adversarial_lgfd, adv_x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
                keys=JUDGES_KEYS,
                show_av_acc=True, show_simple_vote=True, write_story=False,
                useHistory = useHistory
            )
            _, _, _, _, resultA, _, _, _ = Logifold.predict(
                adversarial_lgfd, adv_x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
                keys=JUDGES_KEYS + EXPERTS_KEYS,
                show_av_acc=True, show_simple_vote=True, write_story=False,
                useHistory = useHistory
            )
            _, _, _, _, resultAdv, _, _, _ = adversarial_lgfd.predict(
                adv_x_test, x_name = sample_name + '_test', y=y_test_categorical_10,
                committee_Judge=JUDGES_KEYS,
                committee_experts=EXPERTS_KEYS,
                entropy_threshold=alpha,
                show_av_acc=True, show_simple_vote=True,
                reportSeq=[100],
                useHistory= useHistory,
                write_story=False
            )
        else:
            print('------------------------------------------------------------------')
            print(f'Running LOGIFOLD and Adversarial LOGIFOLD evaluations on {sample_name} dataset...')
            print('------------------------------------------------------------------')
            # Judges only: the validation pass writes the evaluation history
            # that the test pass then reads back through useHistory.
            useHistory = 'evals/' + committee_sig + f'{sample_name}_val_eval.csv'
            Logifold.predict(
                adversarial_lgfd, adv_x_val, x_name=sample_name + '_val', y=y_val_categorical_10,
                keys=JUDGES_KEYS,
                evalOutputFile=useHistory,
                show_av_acc=True, show_simple_vote=True, write_story=False
            )
            _, _, _, _, resultJ, _, _, _ = Logifold.predict(
                adversarial_lgfd, adv_x_test, x_name=sample_name + '_test', y=y_test_categorical_10,
                keys=JUDGES_KEYS,
                show_av_acc=True, show_simple_vote=True, write_story=False,
                useHistory=useHistory
            )

            # Judges and experts together, with their own history file.
            useHistory = 'evals/' + all_sig + f'{sample_name}_val_eval.csv'
            Logifold.predict(
                adversarial_lgfd, adv_x_val, x_name=sample_name + '_val', y=y_val_categorical_10,
                keys=JUDGES_KEYS + EXPERTS_KEYS,
                evalOutputFile=useHistory,
                show_av_acc=True, show_simple_vote=True, write_story=False
            )
            _, _, _, _, resultA, _, _, _ = Logifold.predict(
                adversarial_lgfd, adv_x_test, x_name=sample_name + '_test', y=y_test_categorical_10,
                keys=JUDGES_KEYS + EXPERTS_KEYS,
                show_av_acc=True, show_simple_vote=True, write_story=False,
                useHistory=useHistory
            )

            # AdvLogifold with separate Judge and expert committees and entropy threshold alpha.
            useHistory = 'evals/' + experts_sig + f'{sample_name}_val_eval.csv'
            adversarial_lgfd.predict(
                adv_x_val, x_name=sample_name + '_val', y=y_val_categorical_10,
                committee_Judge=JUDGES_KEYS,
                committee_experts=EXPERTS_KEYS,
                entropy_threshold=alpha,
                show_av_acc=True, show_simple_vote=True,
                reportSeq=[100],
                evalOutputFile=useHistory,
                write_story=False
            )
            _, _, _, _, resultAdv, _, _, _ = adversarial_lgfd.predict(
                adv_x_test, x_name=sample_name + '_test', y=y_test_categorical_10,
                committee_Judge=JUDGES_KEYS,
                committee_experts=EXPERTS_KEYS,
                entropy_threshold=alpha,
                show_av_acc=True, show_simple_vote=True,
                reportSeq=[100],
                useHistory=useHistory,
                write_story=False
            )
            
        j_hist, j_avg, j_maj, j_wavg = _pick_acc(resultJ)
        a_hist, a_avg, a_maj, a_wavg = _pick_acc(resultA)
        r_hist, r_avg, r_maj, r_wavg = _pick_acc(resultAdv)

        logifold_rows.append({
            "testing_dataset": sample_name,
            "simple_majority_voting_by_Judges": j_maj,
            "refined_voting_by_Judges": j_wavg,
            "average_voting_by_Judges": j_avg,
            "using_val_history_by_Judges": j_hist,
            "simple_majority_voting_by_all": a_maj,
            "refined_voting_by_all": a_wavg,
            "average_voting_by_all": a_avg,
            "using_val_history_by_all": a_hist,
        })
        
        adv_logifold_rows.append({
            "testing_dataset": sample_name,
            "simple_majority_voting_by_all": r_maj,
            "refined_voting_by_all": r_wavg,
            "average_voting_by_all": r_avg,
            "using_val_history_by_all": r_hist,
            "entropy_threshold": alpha,
        })
        

    # ------------------------------------------------------------------
    # Save & return results
    # ------------------------------------------------------------------
    out_dir = ANALYSIS / "results"
    # exist_ok=True already makes this a no-op when the directory exists.
    out_dir.mkdir(parents=True, exist_ok=True)
    if single_run:
        df_baselines   = pd.DataFrame(baseline_rows).sort_values(["dataset", "model"])
        f1 = out_dir / "baseline_single_models.csv"
        df_baselines.to_csv(f1, index=False)
        print(f"[ok] Saved baseline results to:   {f1}")
        
    df_logifold = pd.DataFrame(logifold_rows).sort_values(["testing_dataset"])
    df_advlogifold = pd.DataFrame(adv_logifold_rows).sort_values(["testing_dataset"])

    # Both file names depend only on adv_samples_labels, so reusing a label overwrites earlier results.
    f2 = out_dir / f"logifold_committees_experts-{adv_samples_labels}.csv"
    f3 = out_dir / f"advlogifold_routed_experts-{adv_samples_labels}.csv"

    df_logifold.to_csv(f2, index=False)
    df_advlogifold.to_csv(f3, index=False)

    print(f"[ok] Saved Logifold results to:    {f2}")
    print(f"[ok] Saved AdvLogifold results to: {f3}")

    return
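
The result rows above record one accuracy per voting scheme (`*_maj`, `*_avg`, `*_wavg`). As a rough illustration of how such committee votes can differ, here is a minimal NumPy sketch. It is not the Logifold implementation: `committee_votes`, `accuracy`, and the `weights` argument are hypothetical names, and reading "refined voting" as a weighted average is only an assumption based on the `_wavg` variable names.

```python
import numpy as np

def committee_votes(probs, weights=None):
    """Combine softmax outputs of several models (illustrative sketch only).

    probs: array of shape (n_models, n_samples, n_classes).
    weights: optional per-model weights for the weighted average (assumed).
    Returns predicted labels under majority, average, and weighted-average voting.
    """
    probs = np.asarray(probs, dtype=float)
    n_models, n_samples, n_classes = probs.shape

    # Simple majority: every model casts one vote for its argmax class.
    hard = probs.argmax(axis=2)                      # (n_models, n_samples)
    counts = np.zeros((n_samples, n_classes), dtype=int)
    for m in range(n_models):
        np.add.at(counts, (np.arange(n_samples), hard[m]), 1)
    majority = counts.argmax(axis=1)                 # ties go to the lowest class index

    # Average voting: mean of the probability vectors, then argmax.
    average = probs.mean(axis=0).argmax(axis=1)

    # Weighted average: same as above, but models contribute unequally.
    w = np.ones(n_models) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    weighted = np.tensordot(w, probs, axes=1).argmax(axis=1)

    return majority, average, weighted

def accuracy(pred, y_true):
    """Fraction of predictions equal to the integer ground-truth labels."""
    return float(np.mean(np.asarray(pred) == np.asarray(y_true)))
```

A confident minority model can flip the averaged vote without changing the majority vote, which is one reason the three columns can disagree on the same dataset.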
    

In [None]:
lgfd_key_record = {}
attack_entries = ['CWL2','PGD_bigstep','PGD_standard','FGM', 'PGD_VGG']
for attack_tag in attack_entries:
    
    experts_keys, experts_paths, alpha = specialize_Committee(adversarial_lgfd, JUDGES_KEYS, adv_short_tag = attack_tag)
    lgfd_key_record[(attack_tag, 'untargeted')] = (experts_keys, experts_paths, alpha)
    adversarial_lgfd.save()
    evaluate_logifold_and_baselines(adversarial_lgfd, adv_samples_labels = f"{attack_tag}_untargeted", 
                                    JUDGES_KEYS = JUDGES_KEYS, EXPERTS_KEYS = experts_keys, EXPERTS_DIR = experts_paths, 
                                    alpha = alpha, single_run = False)

------------------------------------------------------------------
baseline models are loaded for running single model evaluation
------------------------------------------------------------------
------------------------------------------------------------------
Running single model evaluations on original dataset...
------------------------------------------------------------------
------------------------------------------------------------------
Running LOGIFOLD evaluations on original dataset...
------------------------------------------------------------------
------------------------------------------------------------------
Running ADVERSARIAL LOGIFOLD evaluations on original dataset...
------------------------------------------------------------------
------------------------------------------------------------------
Running single model evaluations on cwl2-targeted-to-2nd-by-resnet56v1-ver0 dataset...
------------------------------------------------------------------
Saved ra

Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_ver0 on cwl2-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_ver0 on cwl2-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on cwl2-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on cwl2-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_tuned

Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver2 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver3 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver0 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver1 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver2 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver3 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver0 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver1 on cwl2-targeted-to-least-by-resnet56v1-ver0_test to cache f

------------------------------------------------------------------
Running single model evaluations on cwl2-untargeted-by-resnet56v1-ver0 dataset...
------------------------------------------------------------------
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver0 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver1 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver2 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver3 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver4 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver5 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Save

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_ver0 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_ver0 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_ver0 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_ver0 on cwl2-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56

Saved raw predictions of resnet56v2_original_ver2 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet56v2_original_ver3 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg11_original_ver0 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg13_original_ver0 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg16_original_ver0 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg19_original_ver0 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on deepfool-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on deepfool-untargeted-by-

Saved raw predictions of resnet20v1_original_ver0 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver1 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver2 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver3 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver4 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver5 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver6 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver7 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v2_original_tuned-once-

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_ver0 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_ver0 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_ver0 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-ta

Saved raw predictions of vgg16_original_ver0 on fgm-eps216-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg19_original_ver0 on fgm-eps216-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on fgm-eps216-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on fgm-eps216-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_ver0 on fgm-eps216-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_ver0 on fgm-eps216-

Saved raw predictions of resnet20v1_original_ver2 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver3 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver4 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver5 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver6 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver7 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v2_original_tuned-once-on_original_ver0 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v2_original_tuned-once-on_original_ver1 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of resnet20v2_original_tuned

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_ver0 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-least-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-least-by-resnet56v1-ver0_ver0 on fgm-eps216-untargeted-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps

Saved raw predictions of vgg13_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg16_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of vgg19_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-cwl2-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps2

Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver7 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-thrice-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-twice-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to c

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v1-ver0_te

Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v1_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cach

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0_test to cache file
[warn] No val sample for pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0, skipping validation history results and AdvLogifold results...
------------------------------------------------------------------
Running LOGIFOLD and Adversarial LOGIFOLD evaluations on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet20v2-ver0 dataset without using history because of the absence of validation sample...
------------------------------------------------------------------
------------------------------

Saved raw predictions of ResNet56v1_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-deepfool-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-targeted-to-2nd-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v1-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-targeted-to-leas

Saved raw predictions of resnet20v1_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver4 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver5 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver6 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v1_original_ver7 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v2_original_tuned-once-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of resnet20v2_ori

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-untargeted-by-vgg16-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-resnet56v2-ver0_test to cache file
Saved raw predicti

Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_tuned-once-on_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved raw predictions of resnet56v2_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg11-ver0_test to cache file
Saved

------------------------------------------------------------------
Running single model evaluations on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0 dataset...
------------------------------------------------------------------
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of resnet20v1_original_tuned-once-on_original_ver4 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Sa

Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-targeted-to-least-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-fgm-eps216-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps216-iter96-8steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg13-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps216-

Saved raw predictions of resnet20v2_original_tuned-once-on_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet20v2_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet20v2_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet20v2_original_ver2 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet20v2_original_ver3 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of resnet56v1_original_tuned-once-on_original_ver1 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of res

Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-2nd-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-least-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-least-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-targeted-to-least-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predictions of ResNet56v1_union-of-original-and-pgd-eps8-iter2-10steps-untargeted-by-resnet56v1-ver0_tuned-once-on_union-of-original-and-pgd-eps8-iter2-10steps-untargeted-by-resnet56v1-ver0_ver0 on pgd-eps216-iter96-8steps-targeted-to-least-by-vgg16-ver0_test to cache file
Saved raw predic
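
The runs above pass `alpha` as `entropy_threshold` to `adversarial_lgfd.predict`. The sketch below shows one plausible reading of such a threshold: samples whose Judge prediction has entropy above `alpha` are handed to the experts. This is an assumption for illustration, not the library's actual routing rule; `shannon_entropy` and `route_by_entropy` are hypothetical helpers, and the choice of natural logarithm and of averaging the Judges before taking the entropy is also assumed.

```python
import numpy as np

def shannon_entropy(p, eps=1e-12):
    """Entropy (natural log) of each probability vector along the last axis."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def route_by_entropy(judge_probs, expert_probs, alpha):
    """Route uncertain samples from the Judges to the experts (sketch).

    judge_probs, expert_probs: arrays of shape (n_models, n_samples, n_classes).
    alpha: entropy threshold; samples with Judge entropy > alpha use the experts.
    Returns (predicted labels, boolean mask of samples sent to the experts).
    """
    judge_mean = np.mean(judge_probs, axis=0)    # (n_samples, n_classes)
    expert_mean = np.mean(expert_probs, axis=0)
    use_experts = shannon_entropy(judge_mean) > alpha
    final = np.where(use_experts[:, None], expert_mean, judge_mean)
    return final.argmax(axis=1), use_experts
```

Under this reading, a larger `alpha` sends fewer samples to the experts, so the threshold trades off reliance on the Judges against reliance on the specialized models.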

In [None]:
ATTACKS: List[AttackEntry] = [
    AttackEntry("DeepFool", "*deepfool_untargeted_train_by_resnet56v1_ver0.npy", "deepfool-untargeted-gen-by-resnet56v1-ver0"),
]

attack_entries = ['DeepFool']
for attack_tag in attack_entries:
    experts_keys, experts_paths, alpha = specialize_Committee(adversarial_lgfd, JUDGES_KEYS, adv_short_tag = attack_tag)
    lgfd_key_record[(attack_tag, 'deepfool')] = (experts_keys, experts_paths, alpha)
    adversarial_lgfd.save()
    evaluate_logifold_and_baselines(adversarial_lgfd, adv_samples_labels = f"{attack_tag}_untargeted", 
                                    JUDGES_KEYS = JUDGES_KEYS, EXPERTS_KEYS = experts_keys, EXPERTS_DIR = experts_paths, 
                                    alpha = alpha, single_run = False)

In [None]:
ATTACKS: List[AttackEntry] = [
    AttackEntry("PGD_bigstep",     "*pgd*eps216*targeted_to-least_train_by_resnet56v1_ver0.npy","pgd-eps216-iter96-8steps-targeted-to-least-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_standard",    "*pgd*eps8*targeted_to-least_train_by_resnet56v1_ver0.npy","pgd-eps8-iter2-10steps-targeted-to-least-gen-by-resnet56v1-ver0"),
    AttackEntry("FGM",     "*fgm*eps216*targeted_to-least_train_by_resnet56v1_ver0.npy","fgm-eps216-targeted-to-least-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_VGG",     "*pgd*eps216*targeted_to-least_train_by_vgg16_ver0.npy","pgd-eps216-iter96-8steps-targeted-to-least-gen-by-vgg16-ver0"),
]
attack_entries = ['PGD_bigstep','PGD_standard','FGM', 'PGD_VGG']
for attack_tag in attack_entries:
    experts_keys, experts_paths, alpha = specialize_Committee(adversarial_lgfd, JUDGES_KEYS, adv_short_tag = attack_tag)
    lgfd_key_record[(attack_tag, 'targeted_to_least')] = (experts_keys, experts_paths, alpha)
    adversarial_lgfd.save()
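    # Label the outputs by attack direction; with an "_untargeted" suffix the result CSVs
    # would take the same file names as the untargeted run and overwrite them.
    evaluate_logifold_and_baselines(adversarial_lgfd, adv_samples_labels = f"{attack_tag}_targeted_to_least",
                                    JUDGES_KEYS = JUDGES_KEYS, EXPERTS_KEYS = experts_keys, EXPERTS_DIR = experts_paths,
                                    alpha = alpha, single_run = False)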

In [None]:
ATTACKS: List[AttackEntry] = [
    AttackEntry("PGD_bigstep",     "*pgd*eps216*targeted_to-2nd_train_by_resnet56v1_ver0.npy","pgd-eps216-iter96-8steps-targeted-to-2nd-gen-by-resnet56v1-ver0"),
    AttackEntry("PGD_standard",    "*pgd*eps8*targeted_to-2nd_train_by_resnet56v1_ver0.npy","pgd-eps8-iter2-10steps-targeted-to-2nd-gen-by-resnet56v1-ver0"),
    AttackEntry("FGM",     "*fgm*eps216*targeted_to-2nd_train_by_resnet56v1_ver0.npy","fgm-eps216-targeted-to-2nd-gen-by-resnet56v1-ver0"),
]
attack_entries = ['PGD_bigstep','PGD_standard','FGM']


for attack_tag in attack_entries:
    experts_keys, experts_paths, alpha = specialize_Committee(adversarial_lgfd, JUDGES_KEYS, adv_short_tag = attack_tag)
    lgfd_key_record[(attack_tag, 'targeted_to_2nd')] = (experts_keys, experts_paths, alpha)
    adversarial_lgfd.save()
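    # Label the outputs by attack direction; with an "_untargeted" suffix the result CSVs
    # would take the same file names as the untargeted run and overwrite them.
    evaluate_logifold_and_baselines(adversarial_lgfd, adv_samples_labels = f"{attack_tag}_targeted_to_2nd",
                                    JUDGES_KEYS = JUDGES_KEYS, EXPERTS_KEYS = experts_keys, EXPERTS_DIR = experts_paths,
                                    alpha = alpha, single_run = False)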
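
Each call to `evaluate_logifold_and_baselines` writes one `logifold_committees_experts-<label>.csv` and one `advlogifold_routed_experts-<label>.csv` into the results directory. To compare attacks side by side, the per-label files can be stacked into one table. The following is a minimal sketch under the assumption that the files follow exactly that naming scheme; `collect_results` is a hypothetical helper and the added `adv_samples_labels` column is introduced here for illustration.

```python
from pathlib import Path

import pandas as pd

def collect_results(results_dir, prefix):
    """Stack all '<prefix>-<label>.csv' files in results_dir into one DataFrame.

    A column 'adv_samples_labels' holding <label> is prepended so that rows
    from different runs stay distinguishable (hypothetical column name).
    """
    frames = []
    for path in sorted(Path(results_dir).glob(f"{prefix}-*.csv")):
        df = pd.read_csv(path)
        # Recover the label from the file name: drop '<prefix>-' and '.csv'.
        df.insert(0, "adv_samples_labels", path.stem[len(prefix) + 1:])
        frames.append(df)
    if not frames:
        return pd.DataFrame()  # nothing matched the prefix
    return pd.concat(frames, ignore_index=True)
```

For example, `collect_results(ANALYSIS / "results", "advlogifold_routed_experts")` would gather the routed-expert accuracies of every run whose CSV is still on disk.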