# Tutorial 06: ensembles and transfer attacks

Let's train some patches against an ensemble of two models and evaluate performance against a third.

In [None]:
import numpy as np
import pandas as pd
import torch
import ultralytics

In [None]:
import electricmayhem.whitebox as em

In [None]:
COCO_CLASSES = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat',
                'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench','bird', 'cat',
                'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack',
                'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
                 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
                'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair',
                'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
                'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book',
                'clock', 'vase', 'scissors', 'teddy bear', 'hair drier','toothbrush']

## create

Let's do color patches this time, but use a soft proofer during training to make sure the colors are realistic.

In [None]:
proofer = em.SoftProofer("data/profile.icc")

## implant

Reuse the same target dataset from tutorial 01.

In [None]:
labels = pd.read_csv("data/toycar/toycar_warp_dataset.csv")
labels = labels[labels.patch != "ground"]
len(labels)

In [None]:
labels.head()

Names of the 3 patches we'll train:

In [None]:
labels.patch.unique()

The `em.WarpPatchImplanter()` class will take care of differentiably deforming and implanting patches (with kornia doing most of the heavy lifting). We need two inputs:

* the `DataFrame` of target labels
* a dictionary of patch shapes (at the point of implanting, so they'll be 3-channel); the implanter will use this to precompute transformation matrices

In [None]:
patch_shapes = {k:(3,64,64) for k in ['hood', 'roof', 'door']}
imp = em.WarpPatchImplanter(labels, patch_shapes=patch_shapes, dataset_name="toycar_warp_no_ground")

## compose

The main augmentation tool `electricmayhem` has so far is `em.KorniaAugmentationPipeline()`, which wraps the `kornia.augmentation` API. Initialize it with a dictionary that maps each augmentation's name to the keyword arguments that augmentation takes.

In [None]:
aug = em.KorniaAugmentationPipeline({"ColorJiggle":{"brightness":0.2, "contrast":0.2, "hue":0.1, "saturation":0.1},
                                    "RandomAffine":{"scale":(0.9,1.1), "shear":10, "padding_mode":"reflection", "degrees":0}})

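To make the name-to-kwargs configuration pattern concrete, here is a minimal, library-free sketch of how such a dictionary can be turned into a chain of callables. The `Brighten` and `Scale` classes, the `REGISTRY` lookup, and `build_pipeline` are hypothetical stand-ins invented for illustration; they are not part of `electricmayhem` or `kornia`, and the real wrapper presumably resolves names against `kornia.augmentation` instead. The toy steps are also deterministic, whereas real augmentations sample random parameters on every call.

```python
# Hypothetical stand-ins for augmentation classes (not kornia's API).
class Brighten:
    def __init__(self, amount=0.0):
        self.amount = amount

    def __call__(self, pixels):
        # add a constant and clamp to the [0, 1] range
        return [min(1.0, max(0.0, p + self.amount)) for p in pixels]


class Scale:
    def __init__(self, factor=1.0):
        self.factor = factor

    def __call__(self, pixels):
        # multiply by a constant and clamp to the [0, 1] range
        return [min(1.0, max(0.0, p * self.factor)) for p in pixels]


# name -> class lookup (assumption: a kornia-backed wrapper could use
# getattr on the kornia.augmentation module for the same purpose)
REGISTRY = {"Brighten": Brighten, "Scale": Scale}


def build_pipeline(config):
    """Instantiate one augmentation per (name, kwargs) entry, in order."""
    steps = [REGISTRY[name](**kwargs) for name, kwargs in config.items()]

    def run(pixels):
        for step in steps:
            pixels = step(pixels)
        return pixels

    return run


toy_aug = build_pipeline({"Brighten": {"amount": 0.25}, "Scale": {"factor": 2.0}})
augmented = toy_aug([0.0, 0.25, 0.5])
```

Because Python dictionaries preserve insertion order, the augmentations in this sketch run in the order they are written in the config.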
## infer

Here's where we'll depart from tutorial 01. Let's train a patch using two YOLOv8 models and test performance on a YOLOv11.

The `em.YOLOWrapper` class can be run with a single model, separate models for training and evaluation, or dictionaries of training and evaluation models. If you pass an ensemble of models as a dictionary, the output of this stage will be a dictionary as well, so your loss function can handle the individual model outputs separately.

In [None]:
yolov8n = ultralytics.YOLO("yolov8n.pt").model.eval()
yolov8s = ultralytics.YOLO("yolov8s.pt").model.eval()
yolov11n = ultralytics.YOLO("yolo11n.pt").model.eval()

Pass dictionaries to `em.YOLOWrapper` to associate each model with a name (to make sure our logs are interpretable) as well as a YOLO version. In this case the version won't matter because the output formats of v8 and v11 are the same.

In [None]:
yolo = em.YOLOWrapper({"yolov8n":yolov8n, "yolov8s":yolov8s},
                      eval_model={"yolov11n":yolov11n},
                      yolo_version={"yolov8n":8, "yolov8s":8, "yolov11n":11}, classnames=COCO_CLASSES)
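The dictionary-in, dictionary-out behavior described above can be pictured without any detector at all. In this toy version, `run_ensemble` and the two lambda "models" are hypothetical names made up for illustration; `em.YOLOWrapper` does considerably more (output decoding, separate training and evaluation models), so treat this only as a sketch of the data flow.

```python
def run_ensemble(models, x):
    """Apply every model in a dict to the same input, keeping the keys."""
    if callable(models):
        # single-model case: return the bare output
        return models(x)
    return {name: model(x) for name, model in models.items()}


# toy "models" that map an input number to a fake detection score
toy_models = {
    "model_a": lambda x: 0.5 * x,
    "model_b": lambda x: 0.25 * x,
}

single = run_ensemble(toy_models["model_a"], 1.0)  # bare output
ensemble = run_ensemble(toy_models, 1.0)           # dict keyed by model name
```

Keeping the model names as keys is what lets a downstream loss function address each model's output separately.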

## assemble the pipeline

Take all of the steps we built above and assemble them into a `Pipeline` object:

In [None]:
pipeline = proofer+imp+aug+yolo

## Write a loss function

Since we passed a dictionary of models to `em.YOLOWrapper`, it will output a dictionary of results with the same keys.

In [None]:
def loss(output, **kwargs):
    outdict = {}
    # iterate over models
    for k in output:
        # pull out max detection score across classes for every batch element and box
        maxdetect_boxes = output[k][0][:,:,4] # (batch, num_boxes)
        maxdetect = torch.max(maxdetect_boxes, 1)[0]  # (batch,)
        # let's also compute an attack success rate (ASR) at a 0.25 detection threshold.
        # mapping each batch element to 0 or 1 will give the ASR when averaged across the batch
        asr25 = (maxdetect < 0.25).type(torch.float32)
        # record the max detection for each batch element
        outdict[f"maxdetect_{k}"] = maxdetect
        outdict[f"asr25_{k}"] = asr25
    return outdict
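As a worked example of the arithmetic inside the loop, the sketch below repeats the max-over-boxes and thresholding steps on made-up scores for a single model. It uses plain Python lists rather than tensors so it runs anywhere; the numbers are invented and the variable names simply mirror the loss function.

```python
# Toy detection scores for one model: 3 batch elements x 4 boxes.
scores = [
    [0.10, 0.05, 0.20, 0.15],  # every box below 0.25 -> attack succeeds
    [0.10, 0.90, 0.30, 0.05],  # one confident box    -> attack fails
    [0.24, 0.01, 0.02, 0.03],  # just under threshold -> attack succeeds
]

# max over boxes, one value per batch element
# (the list analogue of torch.max(maxdetect_boxes, 1)[0])
maxdetect = [max(row) for row in scores]

# 1.0 where the strongest detection is below the 0.25 threshold
asr25 = [1.0 if m < 0.25 else 0.0 for m in maxdetect]

# averaging the 0/1 indicators over the batch gives the success rate
asr25_mean = sum(asr25) / len(asr25)
```

Here two of the three batch elements have no detection at or above 0.25, so the batch-level success rate is 2/3.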

Pass the loss function to your pipeline along with a dictionary giving the shapes of a batch of test patches, so it can check the inputs/outputs before you start training:

In [None]:
pipeline.set_loss(loss, test_patch_shape={k:(2,3,64,64) for k in ['hood', 'roof', 'door']})

## Train the patch

When we set up logging, we can also attach arbitrary key-value pairs in two ways, as keyword arguments to `pipeline.set_logging()`:

* `extra_params` will add them as MLFlow parameters; this is useful for tracking exogenous variables when your pipeline is part of a larger experiment
* `tags` will add them as MLFlow tags

In [None]:
pipeline.set_logging(logdir="logs/06_ensemble",
                    mlflow_uri="http://127.0.0.1:5000",
                    experiment_name="electricmayhem_tutorial_06_ensemble_and_transfer",
                    extra_params={"foo":"bar"},
                    tags={"this_is_my_tag":"wow_it_totally_is"})

Next, explicitly tell the pipeline to initialize the patches. Alternatively, you could pass it a dictionary of patches pre-initialized to whatever you want.

In [None]:
pipeline.initialize_patch_params(patch_shape={k:(3,64,64) for k in ['hood', 'roof', 'door']})

All of our classes inherit from `torch.nn.Module`, so moving the pipeline to the GPU should look familiar:

In [None]:
pipeline.cuda();

When training the patch, the loss function will return two `maxdetect` terms, one for each model, so we'll need to specify a weight for each explicitly:

In [None]:
patch = pipeline.train_patch(
    12,
    1000,
    learning_rate=0.01, 
    eval_every=100,
    num_eval_steps=10,
    optimizer='adam',
    lr_decay='cosine',
    maxdetect_yolov8n=0.5,
    maxdetect_yolov8s=0.5,
)
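One plausible reading of the keyword weights, stated here as an assumption rather than a description of `train_patch` internals, is that each named loss term is averaged over the batch, multiplied by its weight, and summed into the scalar that gets minimized. Terms without a weight (such as the `asr25_*` entries) would then only be logged. The helper `weighted_total` and the numbers below are hypothetical:

```python
def weighted_total(loss_terms, weights):
    """Weighted sum of batch-averaged loss terms; unweighted terms are skipped."""
    total = 0.0
    for name, weight in weights.items():
        values = loss_terms[name]
        total += weight * (sum(values) / len(values))
    return total


# made-up per-batch-element outputs, keyed like the loss function's output
loss_terms = {
    "maxdetect_yolov8n": [0.5, 1.0],    # batch mean 0.75
    "maxdetect_yolov8s": [0.25, 0.25],  # batch mean 0.25
    "asr25_yolov8n": [0.0, 0.0],        # no weight given, so not optimized
}
weights = {"maxdetect_yolov8n": 0.5, "maxdetect_yolov8s": 0.5}

total = weighted_total(loss_terms, weights)  # 0.5 * 0.75 + 0.5 * 0.25
```

Under this reading, equal weights of 0.5 make the objective the mean of the two models' batch-averaged maximum detection scores, so neither ensemble member dominates the update.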