Adversarial Patch - ResNet34 (Colab/Notebook)

I create a targeted adversarial patch for torchvision's ResNet34 (ImageNet-1K weights) and test it on a few images.
I keep things simple and reproducible. If a cell needs manual action from me (e.g., uploading images), I note it.

In [None]:
# basic imports
import os, io, math, random, json, time
from pathlib import Path
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision as tv
from torchvision import transforms
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

# reproducibility
def seed_everything(seed=42):
    random.seed(seed); np.random.seed(seed); torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
seed_everything(123)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('device:', device)

# folders
BASE = Path('.').resolve()
OUT = BASE / 'outputs'
OUT.mkdir(exist_ok=True, parents=True)
print('outputs ->', OUT)

# small utility to show images inline (tensor in [0,1])
def show_tensor_img(t, title=None):
    t = t.detach().cpu().clamp(0,1)
    if t.ndim == 4:  # show first one
        t = t[0]
    img = transforms.ToPILImage()(t)
    plt.figure(figsize=(3,3))
    plt.axis('off')
    if title: plt.title(title)
    plt.imshow(img)
    plt.show()

def save_tensor_img(t, path):
    t = t.detach().cpu().clamp(0,1)
    if t.ndim == 4:
        t = t[0]
    img = transforms.ToPILImage()(t)
    img.save(path)
    print('saved', path)


Model and classes

I load torchvision ResNet34 with ImageNet1K weights. If downloading weights fails (no internet), I fall back to an untrained model so the notebook still runs. For class names, I try to fetch `imagenet_classes.txt`; if it fails, I fall back to index-only labels. If I need a specific class and the list is missing, I can set it by index.

In [None]:
# load model (pretrained if possible)
from torchvision.models import resnet34, ResNet34_Weights

def load_model_safe():
    try:
        weights = ResNet34_Weights.IMAGENET1K_V1
        model = resnet34(weights=weights).eval().to(device)
        preprocess = weights.transforms()
        print('Loaded pretrained ResNet34.')
        return model, preprocess
    except Exception as e:
        print('Could not load pretrained weights:', e)
        model = resnet34(weights=None).eval().to(device)
        # default preprocess to ImageNet stats
        preprocess = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
        ])
        print('Using untrained model as fallback.')
        return model, preprocess

model, preprocess = load_model_safe()

# class names
def load_imagenet_classes():
    # try several sources
    urls = [
        'https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt',
        'https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json'
    ]
    names = None
    import urllib.request, ssl
    try:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        txt = None
        for u in urls:
            try:
                with urllib.request.urlopen(u, context=ctx, timeout=10) as r:
                    data = r.read()
                if u.endswith('.txt'):
                    lines = data.decode('utf-8').strip().splitlines()
                    if len(lines) >= 1000:
                        names = [s.strip() for s in lines]
                        break
                else:
                    arr = json.loads(data.decode('utf-8'))
                    if len(arr) >= 1000:
                        names = arr
                        break
            except Exception:
                pass
    except Exception as e:
        names = None
    if names is None:
        # fallback to index names
        names = [f'class_{i}' for i in range(1000)]
        print('Could not fetch class names; using index labels.')
    else:
        print('Loaded ImageNet class names:', len(names))
    return names

IMAGENET_CLASSES = load_imagenet_classes()

# helpers for predictions
@torch.no_grad()
def topk_preds(logits, k=5):
    probs = F.softmax(logits, dim=1)
    p, idx = probs.topk(k, dim=1)
    return p.cpu().numpy(), idx.cpu().numpy()

@torch.no_grad()
def predict(model, x):
    model.eval()
    logits = model(x.to(device))
    return logits


Target class

I choose a target class by name substring. If there are multiple matches, I take the first. If nothing matches, the code falls back to a default index; I can also set `TARGET_INDEX` explicitly to override the keyword. I can change `TARGET_KEYWORD` or `TARGET_INDEX` later and rerun the cell.

In [None]:
TARGET_KEYWORD = 'toilet tissue'  # free to change, e.g., 'banana', 'pizza', 'goldfish'
TARGET_INDEX = None  # if None, resolve from keyword; otherwise override (0..999)

def resolve_target_index(keyword, override=None):
    if override is not None:
        return int(override)
    key = (keyword or '').lower().strip()
    for i, name in enumerate(IMAGENET_CLASSES):
        if key and key in name.lower():
            print(f'target -> idx={i}, name={IMAGENET_CLASSES[i]}')
            return i
    # default: class 999 ('toilet tissue' in the standard ImageNet-1K ordering; 859 is 'toaster')
    default_idx = 999 if len(IMAGENET_CLASSES) > 999 else 0
    print(f'keyword not found; fallback idx={default_idx}, name={IMAGENET_CLASSES[default_idx]}')
    return default_idx

TARGET = resolve_target_index(TARGET_KEYWORD, TARGET_INDEX)
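
Because the first substring match wins, a broad keyword can silently pick a class I did not intend. A small sketch that lists every match so I can choose an index explicitly; the helper name `find_matches` and the toy class list are my own and only stand in for `IMAGENET_CLASSES`:

```python
def find_matches(keyword, class_names):
    """Return (index, name) for every class whose name contains the keyword."""
    key = (keyword or '').lower().strip()
    if not key:
        return []
    return [(i, n) for i, n in enumerate(class_names) if key in n.lower()]

# toy list standing in for IMAGENET_CLASSES (illustrative only)
toy_classes = ['goldfish', 'toilet seat', 'toaster', 'toilet tissue']
matches = find_matches('toilet', toy_classes)
for i, n in matches:
    print(i, n)
```

If more than one line is printed, I can copy the index I want into `TARGET_INDEX` instead of relying on the first hit.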


Images

I test on a few pictures. If the `raw_images` folder is empty and I am in Colab, I can upload files (jpg/png). Otherwise I try to download 3 sample images. If that also fails, I fall back to random-noise images so the notebook still runs (results will be meaningless, but the code path is safe).

In [None]:
RAW = BASE / 'raw_images'
RAW.mkdir(exist_ok=True)

def in_colab():
    try:
        import google.colab  # type: ignore
        return True
    except Exception:
        return False

def ask_upload():
    if in_colab():
        from google.colab import files  # type: ignore
        print('Choose 1-5 images to upload (jpg/png).')
        up = files.upload()
        for k, v in up.items():
            with open(RAW / k, 'wb') as f:
                f.write(v)
        # an empty or cancelled upload should fall through to the download path
        return len(up) > 0
    return False

def try_download_samples():
    import urllib.request, ssl
    urls = [
        'https://images.unsplash.com/photo-1518791841217-8f162f1e1131?w=800',  # cat
        'https://images.unsplash.com/photo-1507149833265-60c372daea22?w=800',  # dog
        'https://images.unsplash.com/photo-1542291026-7eec264c27ff?w=800',    # pizza
    ]
    ok = 0
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    for i, u in enumerate(urls):
        p = RAW / f'sample_{i}.jpg'
        if p.exists():
            ok += 1
            continue
        try:  # one failed URL should not abort the remaining downloads
            with urllib.request.urlopen(u, context=ctx, timeout=10) as r:
                data = r.read()
            with open(p, 'wb') as f:
                f.write(data)
            ok += 1
        except Exception:
            pass
    return ok

def fallback_random_images(n=3):
    for i in range(n):
        arr = np.random.rand(256,256,3).astype(np.float32)
        img = Image.fromarray((arr*255).astype(np.uint8))
        img.save(RAW / f'random_{i}.png')

# ensure we have some images
imgs_before = list(RAW.glob('*'))
if len(imgs_before) == 0:
    uploaded = ask_upload()
    if not uploaded:
        d = try_download_samples()
        if d == 0:
            fallback_random_images(3)

all_paths = sorted(list(RAW.glob('*')))
print('found images:', len(all_paths))
for p in all_paths[:5]:
    print(' -', p.name)

def load_images(paths, max_n=8):
    ims = []
    for p in paths[:max_n]:
        try:
            im = Image.open(p).convert('RGB')
            ims.append(im)
        except Exception:
            pass
    return ims

raw_images = load_images(all_paths, max_n=8)
print('loaded images:', len(raw_images))
display(raw_images[0] if len(raw_images)>0 else Image.new('RGB',(224,224),(128,128,128)))

Baseline prediction

I run the model on the clean images and print top-5. This lets me see what the model thinks before I place any patch.

In [None]:
def to_batch(ims):
    tensors = []
    for im in ims:
        t = preprocess(im)
        tensors.append(t)
    if len(tensors)==0:
        # fallback dummy
        tensors = [torch.zeros(3,224,224)]
    x = torch.stack(tensors, dim=0)
    return x

x_clean = to_batch(raw_images).to(device)
with torch.no_grad():
    logits = model(x_clean)
probs, idxs = topk_preds(logits, k=5)

for i in range(x_clean.shape[0]):
    print(f'Image {i}:')
    for p, idx in zip(probs[i], idxs[i]):
        name = IMAGENET_CLASSES[idx] if idx < len(IMAGENET_CLASSES) else f'class_{idx}'
        print(f'  {float(p):.4f} -> {idx}: {name}')


Patch

I optimize a square RGB patch (values in [0,1]) and paste it at a random position and scale on each image. I use a simple differentiable overlay (resize + paste) without rotation to keep it stable. I can tune the patch size and the number of steps if I need a stronger effect.

In [None]:
# differentiable overlay
def place_patch(images01, patch01, frac_min=0.18, frac_max=0.32):
    # images01: [N,3,H,W] in [0,1]
    N, C, H, W = images01.shape
    out = images01.clone()
    for i in range(N):
        s_frac = random.uniform(frac_min, frac_max)
        ph = pw = max(4, int(round(H * s_frac)))
        # resize patch
        p = F.interpolate(patch01.unsqueeze(0), size=(ph,pw), mode='bilinear', align_corners=True)[0]
        # position
        y = random.randint(0, H - ph)
        x = random.randint(0, W - pw)
        # alpha mask (all ones: hard edges, the patch fully replaces the pixels underneath)
        mask = torch.ones_like(p[:1])
        # paste
        out[i,:,y:y+ph, x:x+pw] = out[i,:,y:y+ph, x:x+pw] * (1-mask) + p * mask
    return out

def denorm(x):
    # inverse of imagenet normalization
    mean = torch.tensor([0.485, 0.456, 0.406], device=x.device).view(1,3,1,1)
    std  = torch.tensor([0.229, 0.224, 0.225], device=x.device).view(1,3,1,1)
    return x * std + mean

def norm01(x):
    # convert normalized tensor back to [0,1]
    return denorm(x).clamp(0,1)

# build a [0,1] version of our current clean batch
def batch01_from_raw(ims):
    # produce [N,3,224,224] in [0,1] (no normalization)
    tensors = []
    basic = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor()
    ])
    for im in ims:
        tensors.append(basic(im))
    if len(tensors) == 0:
        tensors = [torch.zeros(3,224,224)]
    return torch.stack(tensors, 0).to(device)

batch01 = batch01_from_raw(raw_images)
show_tensor_img(batch01, 'example input [0,1] (first image)')
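
To get a feel for how much of the image the patch covers, the side fractions can be converted to pixels and area. A quick worked sketch, assuming the square 224×224 input used above (the helper name `patch_footprint` is mine, and it mirrors the rounding in `place_patch`):

```python
def patch_footprint(H, frac):
    """Side length in pixels (same rounding as place_patch) and covered area fraction."""
    side = max(4, int(round(H * frac)))
    return side, (side * side) / (H * H)

for frac in (0.18, 0.32):
    side, area = patch_footprint(224, frac)
    print(f'frac={frac:.2f} -> side={side}px, area={100 * area:.1f}% of the image')
```

At the default fractions the patch is 40 to 72 pixels wide, which is roughly 3% to 10% of the image area.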


Optimize

I perform targeted optimization: I minimize the cross-entropy toward my target class after placing the patch, which raises the model's probability for that class. I keep the patch values in [0,1] by clamping after each step. If the model is untrained (no weights), the numbers will look random, so I need the pretrained weights to have downloaded for the results to be meaningful.

In [None]:
# training loop
target_idx = int(TARGET)
steps = 250 if device.type=='cuda' else 120  # keep it reasonable on CPU
lr = 0.08
patch_side = 64

patch = torch.rand(3, patch_side, patch_side, device=device, requires_grad=True)
opt = torch.optim.Adam([patch], lr=lr)

mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(1,3,1,1)
std  = torch.tensor([0.229, 0.224, 0.225], device=device).view(1,3,1,1)

loss_hist = []

for t in range(steps):
    opt.zero_grad()
    # paste the patch at a random position and scale on each image to add robustness
    imgs01 = batch01.clone()
    adv01 = place_patch(imgs01, patch, 0.18, 0.32)
    # normalize for model
    adv = (adv01 - mean) / std
    logits = model(adv)
    # targeted loss: minimizing cross-entropy against the target raises its probability
    target = torch.full((adv.shape[0],), target_idx, dtype=torch.long, device=device)
    loss = F.cross_entropy(logits, target)
    loss.backward()
    opt.step()
    with torch.no_grad():
        patch.clamp_(0,1)
    loss_hist.append(float(loss.item()))
    if (t+1) % max(1, steps//10) == 0:
        print(f'step {t+1}/{steps}  loss={loss.item():.4f}')

plt.figure(); plt.plot(loss_hist); plt.title('loss'); plt.xlabel('step'); plt.ylabel('CE'); plt.show()

# show learned patch and also save a high-res version for printing
show_tensor_img(patch, title='learned patch (view)')
save_tensor_img(F.interpolate(patch.unsqueeze(0), size=(700,700), mode='nearest'), OUT / 'patch_print.png')
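
The targeted loss is ordinary cross-entropy against the target index, that is, the negative log of the softmax probability of the target class. A tiny pure-Python sketch with made-up logits (not model output; the function here is a stand-in for `F.cross_entropy` on a single sample) shows that raising the target logit lowers the loss:

```python
import math

def cross_entropy(logits, target):
    """-log softmax(logits)[target], computed with a stable log-sum-exp."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return lse - logits[target]

logits = [2.0, 0.5, -1.0]   # made-up scores for three classes
target = 1
before = cross_entropy(logits, target)
logits[target] += 3.0       # pretend the patch pushed the target logit up
after = cross_entropy(logits, target)
print(f'CE before={before:.4f}, after={after:.4f}')
print(f'target probability after = {math.exp(-after):.4f}')
```

This is why a falling loss curve corresponds to a rising target-class probability on the patched images.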


Evaluate

I place the patch at random positions on each image and report the top-5 predictions. I also compute the targeted success rate: the fraction of (image, placement) pairs where the target class is ranked top-1.

In [None]:
@torch.no_grad()
def eval_with_patch(images01, patch, trials=8):
    N = images01.shape[0]
    top1_hits = 0
    all_reports = []
    for r in range(trials):
        adv01 = place_patch(images01, patch, 0.2, 0.32)
        adv = (adv01 - mean) / std
        logits = model(adv)
        probs, idxs = topk_preds(logits, k=5)
        for i in range(N):
            hit = (idxs[i,0] == target_idx)
            if hit: top1_hits += 1
            all_reports.append((i, probs[i], idxs[i]))
    sr = top1_hits / max(1, N * trials)
    return sr, all_reports

sr, reports = eval_with_patch(batch01, patch, trials=6)
print(f'targeted top-1 success rate over random placements: {sr:.3f}')
# visualize a single random placement
adv01 = place_patch(batch01, patch, 0.24, 0.32)
show_tensor_img(adv01, title='example patched image (first item)')
save_tensor_img(adv01, OUT / 'example_patched.png')

# show predictions for the first image (using the placement visualized above)
with torch.no_grad():
    adv = (adv01 - mean) / std
    logits = model(adv)
probs, idxs = topk_preds(logits, k=5)
i = 0
print('patched first image top-5:')
for p, idx in zip(probs[i], idxs[i]):
    name = IMAGENET_CLASSES[idx] if idx < len(IMAGENET_CLASSES) else f'class_{idx}'
    print(f'  {float(p):.4f} -> {idx}: {name}')
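
With only a few images and trials, the success rate is a noisy estimate. One hedged way to show that is a Wilson score interval over the N × trials placements. This is an addition of mine: the helper `wilson_interval` and the example counts are illustrative, and it treats placements as independent, which is only approximately true when the same images are reused.

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Approximate 95% Wilson score interval for a success proportion."""
    if n == 0:
        return 0.0, 1.0
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

hits, n = 14, 18   # illustrative counts, e.g. 3 images x 6 trials
lo, hi = wilson_interval(hits, n)
print(f'success rate {hits / n:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]')
```

A wide interval is a hint to raise `trials` or add more images before reading much into the number.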


Sticker disguise

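I turn the patch into a circular sticker by applying a round mask with a slightly softened edge, then place it. Pixels outside the circle are set to black, and `place_patch` pastes the full square, so the corners show up black on the image. I also export a print-ready image of this version. This is my creative extension.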

In [None]:
def circular_mask_like(p):
    _, H, W = p.shape
    y, x = torch.meshgrid(torch.arange(H, device=p.device), torch.arange(W, device=p.device), indexing='ij')
    y = (y - H/2) / (H/2)
    x = (x - W/2) / (W/2)
    r = torch.sqrt(x*x + y*y)
    m = (r <= 1.0).float().unsqueeze(0)  # [1,H,W]
    # soften the edge
    band = (r>0.95) & (r<=1.0)
    m[0, band] = (1.0 - (r[band]-0.95)/0.05).float()
    return m

@torch.no_grad()
def apply_mask(p, m):
    return p * m + (0.0) * (1-m)

mask_circ = circular_mask_like(patch)
sticker = apply_mask(patch, mask_circ).clamp(0,1)
show_tensor_img(sticker, title='sticker patch')

# place once and visualize (no evaluation is run here)
adv01_sticker = place_patch(batch01, sticker, 0.24, 0.34)
show_tensor_img(adv01_sticker, title='example sticker placement')
save_tensor_img(F.interpolate(sticker.unsqueeze(0), size=(700,700), mode='nearest'), OUT / 'sticker_patch_print.png')
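
`place_patch` pastes with an all-ones mask, so the black corners of the sticker end up on the image. A minimal sketch of an alpha-aware paste, assuming a fixed position and no rescaling for brevity; the function `paste_with_alpha` and the toy tensors are my own and not part of the notebook:

```python
import torch

def paste_with_alpha(image01, patch01, alpha, y, x):
    """Blend patch01 [3,h,w] into image01 [3,H,W] at (y, x) using alpha [1,h,w] in [0,1]."""
    _, h, w = patch01.shape
    out = image01.clone()
    region = out[:, y:y + h, x:x + w]
    out[:, y:y + h, x:x + w] = region * (1 - alpha) + patch01 * alpha
    return out

# toy data: gray image, white patch, hard circular alpha
img = torch.full((3, 32, 32), 0.5)
pat = torch.ones(3, 8, 8)
yy, xx = torch.meshgrid(torch.arange(8), torch.arange(8), indexing='ij')
r = torch.sqrt((yy - 3.5) ** 2 + (xx - 3.5) ** 2)
alpha = (r <= 4.0).float().unsqueeze(0)

out = paste_with_alpha(img, pat, alpha, y=10, x=12)
print('corner pixel:', out[0, 10, 12].item(), ' center pixel:', out[0, 13, 15].item())
```

The corner keeps the background value while the center takes the patch value. To combine this with the random scaling above, the alpha mask would need to be resized together with the patch.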


Combine two patches

I train a second small patch for another target and paste both. This tests what happens when two objectives mix.

In [None]:
SECOND_TARGET_KEYWORD = 'banana'
SECOND_TARGET_INDEX = None
SECOND_TARGET = resolve_target_index(SECOND_TARGET_KEYWORD, SECOND_TARGET_INDEX)

steps2 = 160 if device.type=='cuda' else 90
patch2 = torch.rand(3, 48, 48, device=device, requires_grad=True)
opt2 = torch.optim.Adam([patch2], lr=0.08)
for t in range(steps2):
    opt2.zero_grad()
    imgs01 = batch01.clone()
    adv01 = place_patch(imgs01, patch2, 0.12, 0.22)
    adv = (adv01 - mean) / std
    logits = model(adv)
    target2 = torch.full((adv.shape[0],), int(SECOND_TARGET), dtype=torch.long, device=device)
    loss = F.cross_entropy(logits, target2)
    loss.backward()
    opt2.step()
    with torch.no_grad():
        patch2.clamp_(0,1)
    if (t+1) % max(1, steps2//5) == 0:
        print(f'second patch step {t+1}/{steps2}')

# combine by placing both
def place_two(images01, p1, p2):
    tmp = place_patch(images01, p1, 0.20, 0.28)
    out = place_patch(tmp, p2, 0.12, 0.20)
    return out

adv01_combo = place_two(batch01, patch, patch2)
show_tensor_img(adv01_combo, title='two patches combined (first image)')
save_tensor_img(adv01_combo, OUT / 'combo_patched.png')
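
The cell above only visualizes the combination. To see which objective wins, one option is to tally the top-1 class over several random placements. This sketch covers the bookkeeping only, with made-up top-1 indices standing in for model output; the helper `tally_targets` is hypothetical:

```python
from collections import Counter

def tally_targets(top1_indices, target_a, target_b):
    """Count how often each target (or neither) is the top-1 prediction."""
    counts = Counter()
    for idx in top1_indices:
        if idx == target_a:
            counts['first target'] += 1
        elif idx == target_b:
            counts['second target'] += 1
        else:
            counts['neither'] += 1
    return counts

# made-up top-1 indices; in the notebook these would come from idxs[:, 0]
fake_top1 = [999, 954, 999, 281, 999, 954]
counts = tally_targets(fake_top1, target_a=999, target_b=954)
print(dict(counts))
```

In the notebook, the inputs would be the top-1 column from `topk_preds` on `place_two` outputs, with `TARGET` and `SECOND_TARGET` as the two targets.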


Secret message with a series of patches

I lay out a row of small copies of the patch and encode a short bit string by rotation: unrotated (0°) = 0, rotated 90° = 1. I save the row as a sheet to print. Anyone who knows the rule can read the bits from left to right. This is just for fun and to meet the requirement.

In [None]:
msg = '1011001'  # I can change this to any bit string
tile = F.interpolate(patch.unsqueeze(0), size=(80,80), mode='nearest')[0].detach().cpu()
tiles = []
for b in msg:
    im = transforms.ToPILImage()(tile)
    if b == '1':
        im = im.rotate(90, expand=True).resize((80,80))
    tiles.append(im)

# compose a row
W = 20 + len(tiles)*(80+12)
H = 120
sheet = Image.new('RGB', (W, H), (255,255,255))
x = 10
for im in tiles:
    sheet.paste(im, (x, 20))
    x += 80 + 12

sheet_path = OUT / 'secret_message_sheet.png'
sheet.save(sheet_path)
display(sheet)
print('saved', sheet_path)
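
The bit string can carry text if each character is written as 8 bits. A small sketch under that assumption (8-bit ASCII; the helpers `text_to_bits` and `bits_to_text` are my own naming). The resulting string could be assigned to `msg`:

```python
def text_to_bits(text):
    """Encode ASCII text as a string of '0'/'1', 8 bits per character."""
    return ''.join(format(b, '08b') for b in text.encode('ascii'))

def bits_to_text(bits):
    """Decode a bit string produced by text_to_bits."""
    if len(bits) % 8 != 0:
        raise ValueError('bit string length must be a multiple of 8')
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode('ascii')

bits = text_to_bits('hi')
print(bits, '->', bits_to_text(bits))
```

Each character adds 8 tiles, so with the layout above the sheet grows by 8 × 92 pixels in width per character.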


What I will bring to class

1) a color print of `outputs/patch_print.png` (or `sticker_patch_print.png` if I prefer the circular version).  
2) the GitHub repo with this notebook and the outputs folder.  
3) I will demo with the example images. If I use my own photos, I will add them to the `raw_images` folder or upload them in Colab.

In [None]:
print('Artifacts saved in:', OUT)
print('\n'.join([p.name for p in OUT.glob('*')]))

# simple check: ensure files exist so I know what to print
assert (OUT / 'patch_print.png').exists(), 'patch_print.png missing'
print('ready to go ✅')
