# Blackbox patch optimization

The `electricmayhem.blackbox` module is a lot less organized than its whitebox counterpart, and is currently effectively abandoned. For anyone interested in scavenging from what I've got here, though, this notebook will show an example of training a relatively crude patch against a COCO-trained YOLO model *without* access to gradients. 

## Background

I strongly recommend taking a look at two things before trying this notebook:

* Feng *et al*'s paper *GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems*. This paper outlines a relatively practical blackbox patch attack using zeroth-order gradient estimation; my code (mostly) follows Feng's method.
* Docs for the Python library `dask`. The `BlackBoxPatchTrainer` class uses `dask` under the hood for parallelization; if you need to customize how it does scheduling, manages memory, or chooses the number of workers, you can use `dask`'s API to set global defaults.

## Task for this notebook

To keep it simple, we'll train a patch against a single image, using soft outputs from the model (scores, not just categories), with almost no augmentation. This means the patch will likely be overfit to this image and not generalize well to new backgrounds or orientations (but it will converge much more quickly).

Using soft scores instead of hard decisions gives the random gradient-free (RGF) estimator more information per query, so the gradient estimate will be less noisy. Working directly with hard outputs would also require us to use some heuristic (like Feng's mask reduction technique) to find the decision boundary.
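
For intuition, here is a minimal NumPy sketch of an RGF-style zeroth-order estimator. It is not the code in `electricmayhem.blackbox`: the function name `rgf_estimate`, the toy quadratic score, and the use of unit-norm Gaussian directions are all assumptions made for illustration. `q` and `beta` play the same roles as the trainer arguments of the same names used later in this notebook: each step averages `q` forward finite differences $\frac{f(x + \beta u_i) - f(x)}{\beta} u_i$ over random directions $u_i$.

```python
import numpy as np

def rgf_estimate(f, x, q=10, beta=0.1, rng=None):
    """Estimate the gradient of a scalar function f at x from function values only."""
    rng = np.random.default_rng() if rng is None else rng
    f0 = f(x)  # one query at the current point
    grad = np.zeros_like(x)
    for _ in range(q):
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)  # random unit direction
        # forward finite difference along u, costing one extra query
        grad += (f(x + beta * u) - f0) / beta * u
    return grad / q

# Toy stand-in for a soft score: a smooth function whose true gradient is 2*x
def toy_loss(x):
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
x0 = np.ones(20)
g_hat = rgf_estimate(toy_loss, x0, q=200, beta=0.01, rng=rng)
true_grad = 2 * x0
# cosine similarity between the estimate and the true gradient direction
cosine = g_hat @ true_grad / (np.linalg.norm(g_hat) * np.linalg.norm(true_grad))
```

The estimate only needs to point in roughly the right direction to be useful; its overall scale is absorbed by the step size.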

We'll train a patch on an image from the toy car dataset, to try to cause the model to not detect the car.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import ultralytics
import kornia.geometry

In [None]:
import electricmayhem.whitebox as em
from electricmayhem.blackbox import BlackBoxPatchTrainer

In [None]:
COCO_CLASSES = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat',
                'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat',
                'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack',
                'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
                'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
                'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair',
                'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
                'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book',
                'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

In [None]:
x = em.load_to_tensor("data/toycar/medium_distance_arc/057.png")
em.plot(x)

In [None]:
import electricmayhem.blackbox
electricmayhem.blackbox._augment.augment_image(x).shape

## Prepare the model

In [None]:
model = ultralytics.YOLO("yolov8n.pt").model.eval()

## Write a detection function

This can be pretty much any dask-compatible function that interfaces with the model we're attacking. In this case it wraps a PyTorch model, but it could just as easily call out to an external API.

In [None]:
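def detect_func(img, return_raw=False, **kwargs):
    # resize to YOLO's input size; a batch dimension is added below
    img = kornia.geometry.transform.resize(img, (640,640))
    # no gradients are needed for a blackbox query
    with torch.no_grad():
        model_outputs = model(img.unsqueeze(0))[0]
    # rows 0-3 are box coordinates; the remaining rows are per-class scores
    class_probs = model_outputs[:,4:,:]
    # highest score for any class at any candidate box
    maxval = torch.max(class_probs).item()
    if return_raw:
        # hard decision (at a 0.25 threshold) plus the soft score
        return [int(maxval>0.25),maxval]
    else:
        return maxval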

In [None]:
detect_func(x)

## Create a mask

For hard blackbox attacks that need to find a decision boundary, we'd need two masks: an "initial" one (that covers up the object) and a "final" one (giving the shape of the final patch we want). The algorithm would then try to coevolve the mask and patch together, gradually reducing the initial mask to get to the final one without straying too far from the decision boundary.

For the soft blackbox attack we can just use the final mask, since we can differentiate between a detection at 90% confidence and one at 89% confidence (which look the same in the hard case).
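
A small sketch of why this matters, using a hypothetical `toy_score` function in place of a real detector and the same 0.25 threshold that `detect_func` uses: a small change to the patch that lowers the confidence produces a nonzero finite difference on soft outputs but exactly zero on hard ones, so a hard-label attack gets no signal until it is sitting right at the decision boundary.

```python
import numpy as np

THRESHOLD = 0.25  # same cutoff detect_func uses for its hard decision

def toy_score(patch):
    """Hypothetical stand-in for a detector's confidence: brighter patch -> lower score."""
    return float(0.9 - 0.5 * patch.mean())

def hard_output(patch):
    """Hard decision: 1 if the toy detector fires, 0 otherwise."""
    return int(toy_score(patch) > THRESHOLD)

patch = np.zeros((3, 8, 8))
step = 0.02 * np.ones_like(patch)  # a small perturbation of the patch

# soft outputs see the confidence drop; hard outputs see no change at all
soft_diff = toy_score(patch + step) - toy_score(patch)
hard_diff = hard_output(patch + step) - hard_output(patch)
```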

In [None]:
mask = torch.zeros_like(x)
mask[:,300:600,50:450] += 1
em.plot(x*(1-mask) + mask)

## Add a TensorBoard callback

The "transform robustness" metric in Feng's paper isn't a perfect measure for this example, so let's add a callback function that will record a plot of YOLO detections to TensorBoard every eval step:

In [None]:
SAVE_EVERY = 190
def eval_func(writer, counter, img, mask, perturbation, **kwargs):
    # only log a figure every SAVE_EVERY steps
    if counter % SAVE_EVERY == 0:
        # upsample the patch; the composite below requires img and mask to be 640x640 too
        pert = kornia.geometry.transform.resize(perturbation, (640,640))
        # paste the patch into the masked region of the image
        x = img*(1-mask) + mask*pert
        detections = model(x.unsqueeze(0))
        detections_converted = em._yolo.convert_ultralytics_to_v5_format(detections[0])[0]
        fig = em._yolo.plot_detections(x.unsqueeze(0), detections_converted, 0, classnames=COCO_CLASSES)
        writer.add_figure("detections", fig, global_step=counter)


## Set up a trainer

The required arguments are:

* The target image as a PyTorch tensor
* The initial mask, as an image tensor
* The final mask, as an image tensor. For this example initial and final masks are the same.
* The detection function
* A directory to save logs in

The keyword arguments after that set some options:

* `perturbation` specifies the initial patch. It can be any size, but will be resized to the target image dimensions before applying the mask. Lowering the resolution of this patch lowers its flexibility to defeat the model but also lowers the variance of the RGF estimator, effectively letting you tune a bias-variance tradeoff for the blackbox attack.
* `num_augments` sets the number of augmentations each random direction is tested under at every RGF step. Since augmentation is effectively disabled here we'll just set it to 1. When set higher, `dask` will attempt to parallelize the computation.
* Following Feng *et al*, `q` is the number of random directions we'll query for each step of RGF.
* `beta` is the RGF smoothing parameter. The estimator becomes unbiased in the limit $\beta \rightarrow 0$.
* `reduce_mask` defaults to `True`. Setting it to `False` skips Feng *et al*'s mask reduction step.
* `eval_augments` sets the number of augmentations to test under for evaluation.
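* `use_scores` tells the trainer to use soft outputs of the model rather than hard decisions.
* `eval_func` takes an optional function that adds custom logs to TensorBoard.
* `extra_params` is for anything else you want recorded to MLflow.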
* `aug_params` configures augmentations. In addition to most of the ones in Feng's paper, I have a couple of options for composition noise that shifts or rotates the perturbation with respect to the target image.
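
To make the `perturbation` resizing concrete, here is a NumPy sketch of compositing a low-resolution patch onto an image through a mask. It assumes nearest-neighbor upsampling by an integer factor to keep things simple; the trainer's actual interpolation may differ, and `composite` is a hypothetical helper rather than part of `electricmayhem`. The point is the parameter count: the optimizer only searches over the low-resolution values, however many pixels the masked region covers.

```python
import numpy as np

def composite(img, mask, perturbation):
    """Upsample a (C, h, w) perturbation to the image size and paste it in where mask == 1."""
    C, H, W = img.shape
    _, h, w = perturbation.shape
    assert H % h == 0 and W % w == 0, "sketch assumes integer upsampling factors"
    # nearest-neighbor upsampling: repeat each low-res pixel along height and width
    up = perturbation.repeat(H // h, axis=1).repeat(W // w, axis=2)
    return img * (1 - mask) + mask * up

rng = np.random.default_rng(0)
img = rng.uniform(0, 1, size=(3, 64, 64))
mask = np.zeros_like(img)
mask[:, 32:56, 8:40] = 1  # rectangular patch region

# 3*8*8 = 192 values to optimize, versus 3*24*32 = 2304 masked pixel values
low_res = rng.uniform(0, 1, size=(3, 8, 8))
patched = composite(img, mask, low_res)
```

Fewer free values means fewer dimensions for the random directions to cover, which is where the lower estimator variance comes from.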

In [None]:
d = 120 
perturbation = torch.tensor(np.random.uniform(0, 1, size=(3,d,d)).astype(np.float32))

trainer = BlackBoxPatchTrainer(
    x, # the target image
    mask, # initial mask (in this case, same as the final mask)
    mask, # final mask
    detect_func,
    "logs/blackbox_example",
    perturbation=perturbation,
    num_augments=1, # number of augmentations to test each random direction under
    q=10, # number of random directions per step
    beta=0.1, # RGF smoothing parameter
    reduce_mask=False, # skipping the mask-reduction step
    eval_augments=50, # number of random augmentations to use for evaluation
    use_scores=True, # use soft outputs of the model rather than hard decisions
    eval_func=eval_func,
    extra_params={"perturbation_size":d}, # additional parameters we might want to log to MLflow
    aug_params={"scale":(0.99,1.01), "blur":[0], "rotate":0, "angle":0, "translate":0, "gamma":(1,1.1),
               "perspective_scale":0},
)

To fit the patch, you can train for a fixed number of epochs or a fixed query budget. The `lrs` kwarg lets you specify the RGF step sizes that will be tested for each update. The chosen learning rate is recorded in TensorBoard; if it's pinned at the min or max value, you may need to expand the range.

In [None]:
trainer.fit(budget=1000000, lrs=[1e5, 1e4, 1000., 100., 10., 1.])
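
One way to picture the `lrs` search is as a small line search: take the estimated gradient, try each candidate step size, and keep the one that gives the lowest score. The sketch below illustrates that idea on a toy function. It assumes a lowest-score selection rule and clipping to [0, 1]; the exact rule inside `BlackBoxPatchTrainer` may differ, and `pick_step` is a hypothetical helper.

```python
import numpy as np

def pick_step(f, x, grad, lrs):
    """Try each candidate step size along -grad and return the best (lr, new_x, score)."""
    best = None
    for lr in lrs:
        candidate = np.clip(x - lr * grad, 0.0, 1.0)  # keep pixel values valid
        score = f(candidate)  # one model query per candidate
        if best is None or score < best[2]:
            best = (lr, candidate, score)
    return best

def toy_score(x):
    """Toy stand-in for a detection score, minimized at x == 0.3."""
    return float(np.mean((x - 0.3) ** 2))

x = np.full(16, 0.9)
# exact gradient of toy_score, standing in for an RGF estimate
grad = 2 * (x - 0.3) / x.size
lrs = [100.0, 10.0, 1.0, 0.1]
lr, new_x, score = pick_step(toy_score, x, grad, lrs)
```

Here the selected step size is 10.0, an interior value of `lrs`. If the winner were the first or last entry instead, the best step might lie outside the range being searched.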