# Welcome to Adversarial ML with Images! 👋

In this section, you’ll see how small, nearly invisible pixel changes can trick a powerful image classifier. We'll use a **pre-trained ResNet-50 model** and implement two classic gradient-based attacks—the **Fast Gradient Sign Method (FGSM)** and the more powerful **Projected Gradient Descent (PGD)**.

The goal is to generate a perturbed image that looks unchanged to a human but causes the model to make a completely different prediction. As a final step, you'll learn to embed your adversarial creation back into the original high-resolution image.

---

## What You'll Do Here 🎯

1.  **Load the model** and classify a clean image to see its baseline prediction.
2.  **Implement FGSM** by writing the code for a single, powerful attack step.
3.  **Extend FGSM into PGD**, turning the single step into an iterative attack.
4.  **Generate a targeted example** that forces the model to predict a specific class.
5.  **Try it on your own image** to see how different subjects react.
6.  **(Optional)** If you finish early, try the challenges at the end.

---
> ⚠️ **Educational Use Only:** These techniques are for research and learning. Please use this knowledge responsibly to build more robust systems, not to deceive real-world applications.

## Step 1: Set Up the Environment

The first code cell below is a one-time setup step. It performs three key actions:
1.  **Clones the GitHub repository:** This downloads all our workshop files, including helper code and the sample images.
2.  **Changes the directory:** It navigates into the newly downloaded folder so we can access those files.
3.  **Installs packages:** It installs the specific libraries listed in `requirements.txt`.

▶️ **Please run this cell now.** It might take a minute to complete. You'll see a "✅ Setup complete!" message when it's finished.

In [None]:
# Environment Setup
import os

repo_url = "https://github.com/japheth45/Adversarial_ML25.git"
repo_dir = "Adversarial_ML25/image_classification"

# Clone the repository
!git clone {repo_url}

# Check if the clone was successful by seeing if the directory exists
if os.path.isdir(repo_dir):
    print("✅ Repository cloned successfully.")
    %cd {repo_dir}
    print("Installing required packages...")
    # The -q flag makes pip's output quieter
    !pip install -q -r requirements.txt
    print("✅ Setup complete!")
else:
    print("❌ Error: Could not clone the repository.")
    print("Please check the URL and ensure the repository is public.")

## Step 2: Import Libraries

Now that our environment is ready, this cell imports all the necessary tools for our session. This includes standard libraries like `torch` and `PIL` (for image handling), as well as the custom helper functions we've written for this workshop (like `classify` and `load_image`).

▶️ **Run this cell to load everything into memory.** It should complete in under a minute.

In [None]:
# --- Imports & device ---
from PIL import Image
from IPython.display import display
import image_helpers
import torch, torchvision

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

## Step 3: Test Drive (Classify the Sample Image)

Before we try to fool the network, let’s see how it behaves on an ordinary photo.
This cell loads `data/dog.jpg`, displays it, and asks our pre-trained ResNet-50 to predict the **Top-5** ImageNet classes with confidences. We then capture the model’s top prediction in `original_label` so later attack cells can compare against (and, ideally, flip) this baseline.

- `classify(...)` handles preprocessing (resize, center-crop, normalization), runs the model, and returns a list of `(label, confidence)` pairs.
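
To make the `(label, confidence)` output concrete, here is a minimal sketch of how raw model scores (logits) can be turned into top-k pairs. The function name `topk_from_logits` and the four-class label list are made up for illustration; the real `classify` helper in `image_helpers.py` may be implemented differently.

```python
import torch

def topk_from_logits(logits, labels, k=5):
    """Turn a (1, num_classes) logit tensor into [(label, confidence), ...]."""
    probs = torch.softmax(logits, dim=1)[0]   # logits -> probabilities that sum to 1
    conf, idx = probs.topk(k)                 # k largest values, sorted high to low
    return [(labels[i], c) for c, i in zip(conf.tolist(), idx.tolist())]

# Toy stand-ins for the 1000 ImageNet classes (hypothetical values)
labels = ["dog", "cat", "fox", "car"]
logits = torch.tensor([[2.0, 0.5, 1.0, -1.0]])

pairs = topk_from_logits(logits, labels, k=2)
for label, confidence in pairs:
    print(f"{label}: {confidence:.1%}")
```

The first pair plays the same role as `predictions[0]` in the cell below: it is the model's single most likely class.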


In [None]:
k = 5  # Set the number of top predictions to show
img_path = "data/dog.jpg"

# Show the original image
print("Original Image:")
display(Image.open(img_path))

# Get and display the top k predictions
predictions = image_helpers.classify(img_path, k=k)
image_helpers.print_topk(predictions, k=k)

# Get the single top prediction (the most likely class)
original_label = predictions[0][0]
print(f"\nConclusion: The model is confident this is a '{original_label}'.")

## Step 4: Exercise A (Single Targeted FGSM Step)

Implement a **single step** that pushes the model **toward a target class** by *decreasing* the cross-entropy loss with respect to that target.
Fill in the two TODOs below:
- Update `delta` using the sign of the gradient, then clamp it to `[-eps, eps]`.
- Reset the gradient on `delta` so it does not accumulate across steps.

Completing this cell only defines the `fgsm_step` function. A later cell calls it.
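
Before filling in the TODOs, it can help to see the two tensor operations in isolation. The sketch below uses a made-up quadratic loss instead of a real model (an assumption purely for illustration) to show how `.sign()` turns a gradient into a direction and how `.clamp()` enforces the budget.

```python
import torch

eps, alpha = 0.1, 0.25   # toy budget and step size, chosen so the clamp is visible

delta = torch.tensor([0.0, 0.05, -0.08], requires_grad=True)
goal = torch.tensor([1.0, -1.0, 0.0])        # hypothetical point the toy loss pulls toward

loss = ((delta - goal) ** 2).sum()           # stand-in for the cross-entropy loss
loss.backward()                              # fills delta.grad with d(loss)/d(delta)

direction = delta.grad.sign()                # keep only the sign: -1, 0, or +1 per element
stepped = delta.detach() - alpha * direction # descend: move against the gradient
projected = stepped.clamp(-eps, eps)         # pull every element back into [-eps, eps]

print("direction:", direction.tolist())
print("stepped:  ", stepped.tolist())
print("projected:", projected.tolist())
```

Every element moves by exactly `alpha` regardless of how large its gradient was, and anything that overshoots the budget is pulled back to `±eps`.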


In [8]:
def fgsm_step(x, y_target, model, delta, alpha, eps):
    """One targeted FGSM step on normalized input space."""
    delta.requires_grad_(True)
    logits = model(x + delta)
    loss = torch.nn.functional.cross_entropy(logits, y_target)  # targeted: minimize loss to increase target prob
    loss.backward()

    # TODO: Construct the update rule for delta. This requires four steps (but just one line of code):
    #   1. Get the direction of the gradient using .grad.sign()
    #   2. Scale that direction by the step size, `alpha`
    #   3. Subtract the result from the current `delta` to perform gradient descent
    #   4. Clamp the final value to keep it within our budget of [-eps, eps]

    # delta.data = ...


    # TODO: zero the gradients on delta to prevent accumulation across steps
    # delta...

    with torch.no_grad():
        probs = torch.softmax(logits, dim=1)
        target_prob = float(probs[0, y_target.item()].item())
    return delta.detach(), float(loss.item()), target_prob


### (Optional) Solution for Exercise A

Expand and run if you’re stuck.


In [3]:
#@title 🔑 Solution: Targeted FGSM Step (Click to Expand)
def fgsm_step(x, y_target, model, delta, alpha, eps):
    delta.requires_grad_(True)
    logits = model(x + delta)
    loss = torch.nn.functional.cross_entropy(logits, y_target)
    loss.backward()
    # targeted: gradient descent on loss
    delta.data = (delta - alpha * delta.grad.sign()).clamp(-eps, eps)
    delta.grad.zero_()
    with torch.no_grad():
        probs = torch.softmax(logits, dim=1)
        target_prob = float(probs[0, y_target.item()].item())
    return delta.detach(), float(loss.item()), target_prob

## Step 5: Exercise B (PGD Loop with Early Stopping)

Turn your FGSM step into **PGD** by calling it repeatedly in a loop. Stop early once the target-class confidence exceeds `stop_threshold`.  
Fill in the TODOs below.
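
To see the loop-and-project idea without a large network, here is a minimal sketch on a toy two-class linear model. Everything in it (the name `targeted_pgd`, the hand-set weights, the values of `eps` and `alpha`) is an assumption for illustration and is not the notebook's `pgd_targeted`.

```python
import torch

def targeted_pgd(model, x, target_idx, eps=0.3, alpha=0.1, iters=20, stop_threshold=0.9):
    """Repeat signed gradient-descent steps on delta, clamping to [-eps, eps] each time."""
    y = torch.tensor([target_idx])
    delta = torch.zeros_like(x)
    history = []                                   # target confidence after each step
    for _ in range(iters):
        delta.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)   # gradient w.r.t. delta only
        delta = (delta.detach() - alpha * grad.sign()).clamp(-eps, eps)
        with torch.no_grad():
            p = torch.softmax(model(x + delta), dim=1)[0, target_idx].item()
        history.append(p)
        if p > stop_threshold:                     # early stop once confident enough
            break
    return delta, history

# Toy model with hand-set weights so the run is deterministic
toy = torch.nn.Linear(3, 2)
with torch.no_grad():
    toy.weight.copy_(torch.tensor([[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]]))
    toy.bias.zero_()

x = torch.tensor([[0.2, -0.1, 0.3]])
delta, history = targeted_pgd(toy, x, target_idx=1)
print("steps run:", len(history))
print("target confidence:", [round(p, 3) for p in history[:4]])
```

In this toy run the confidence climbs for three steps and then plateaus, because `delta` has hit the edge of the `eps` budget. The threshold is never reached, so the loop uses all of its iterations. The same trade-off between budget and success applies to the real attack.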


In [9]:
# --- Exercise B: PGD loop with early stopping ---
def pgd_targeted(image_path, target_label_name, eps=8/255, alpha=2/255, iters=50, stop_threshold=0.9):
    """Targeted PGD on a single 224x224 crop (normalized space); returns an *unnormalized* tensor in [0,1]."""
    model, dev = image_helpers.get_model()
    x, _ = image_helpers.load_image(image_path)  # x is normalized to ImageNet mean/std, shape (1,3,224,224)
    target_idx = image_helpers.label_to_index(target_label_name)
    y_target = torch.tensor([target_idx], device=dev)

    # mean/std for un-normalizing later
    mean = torch.tensor([0.485, 0.456, 0.406], device=dev).view(3,1,1)
    std  = torch.tensor([0.229, 0.224, 0.225], device=dev).view(3,1,1)

    delta = torch.zeros_like(x, requires_grad=True)

    for i in range(iters):
        # TODO 1: perform a single targeted FGSM step by calling the `fgsm_step` function.
        # (Review the FGSM function in the previous cell for needed arguments.)
        # Make sure to capture the three return values into the specified variables.

        # delta, loss_val, target_prob = fgsm_step(...)

        # TODO 2: Update the print statement below to print the variables you received
        # from the `fgsm_step` function.  (Replace ??? with your variables)

        # print(f"Iteration {i+1:02d}: loss=???  target_conf=???")

        # TODO 3: Write an `if` statement to check if the `target_prob`
        # is greater than the `stop_threshold`. If it is, print a success message
        # and use the `break` command to exit the loop.

        # if ...:
        #     print(f"✅ Early stop at iter {i+1}")
        #     break

    # Compose final adversarial and un-normalize back to pixel space
    adv_x = x + delta
    adv_unnorm = (adv_x * std + mean).clamp(0, 1)
    return adv_unnorm


### (Optional) Solution for Exercise B

Expand and run if you’re stuck.


In [4]:
#@title 🔑 Solution: PGD_Targeted function (Click to Expand)
def pgd_targeted(image_path, target_label_name, eps=8/255, alpha=2/255, iters=50, stop_threshold=0.9):
    model, dev = image_helpers.get_model()
    x, _ = image_helpers.load_image(image_path)
    target_idx = image_helpers.label_to_index(target_label_name)
    y_target = torch.tensor([target_idx], device=dev)

    mean = torch.tensor([0.485, 0.456, 0.406], device=dev).view(3,1,1)
    std  = torch.tensor([0.229, 0.224, 0.225], device=dev).view(3,1,1)

    delta = torch.zeros_like(x, requires_grad=True)

    for i in range(iters):
        delta, loss_val, target_prob = fgsm_step(x, y_target, model, delta, alpha, eps)
        print(f"Iteration {i+1:02d}: loss={loss_val:.4f}  target_conf={target_prob:.2%}")
        if target_prob > stop_threshold:
            print(f"✅ Early stop at iter {i+1} (confidence {target_prob:.2%})")
            break

    adv_x = x + delta
    adv_unnorm = (adv_x * std + mean).clamp(0, 1)
    return adv_unnorm


## Step 6: Run Your Targeted Attack

Pick a target label. It is matched as a string against the ImageNet label names, and the helper `label_to_index(...)` resolves it to a class index.
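
The helper's source is not shown in this notebook, so the following is only a guess at one reasonable matching strategy: exact match first, then a unique substring match. The name `label_to_index_sketch` and the three-entry label list are hypothetical.

```python
def label_to_index_sketch(name, labels):
    """Resolve a label name to its index: exact match first, then a unique substring."""
    needle = name.strip().lower()
    lowered = [label.lower() for label in labels]
    if needle in lowered:
        return lowered.index(needle)
    partial = [i for i, label in enumerate(lowered) if needle in label]
    if len(partial) == 1:
        return partial[0]
    raise ValueError(f"{name!r} matched {len(partial)} labels; try an exact name")

toy_labels = ["golden retriever", "chain saw", "tennis ball"]  # stand-in for ImageNet names
print(label_to_index_sketch("Chain Saw", toy_labels))  # exact match, case-insensitive
print(label_to_index_sketch("tennis", toy_labels))     # unique substring match
```

If a target name raises an error in the real helper, checking the exact spelling in the ImageNet label list is a good first step.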


In [None]:
# --- Run targeted PGD on the sample image ---
target_label = "chain saw"   # change this to try other targets
adv_tensor = pgd_targeted(img_path, target_label, eps=8/255, alpha=2/255, iters=100, stop_threshold=0.9)

# Save and display the 224x224 adversarial crop
adv_patch_path = "data/dog_adv_patch.png"
image_helpers.save_tensor_image(adv_tensor, adv_patch_path)
print(f"Saved adversarial crop to: {adv_patch_path}")
display(Image.open(adv_patch_path))


## Step 7: Paste the Adversarial Crop Back into the Full Image

We’ll re‑embed the 224×224 adversarial crop into the original‑resolution image and classify both for comparison.
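
To picture where the crop lands, here is a small sketch of the coordinate arithmetic. It assumes the common ImageNet preprocessing of resizing the shorter side to 256 and center-cropping 224; the notebook only says "resize, center-crop", so those numbers and the function name `center_crop_box` are assumptions, and `embed_crop_back` may work differently.

```python
def center_crop_box(width, height, resize_to=256, crop=224):
    """Return (left, top, right, bottom) of the model's crop in original-image pixels."""
    scale = min(width, height) / resize_to   # original pixels per resized pixel
    side = round(crop * scale)               # crop edge length at full resolution
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

box = center_crop_box(1024, 768)
print(box)
```

For a 1024×768 photo the crop covers a 672×672 region, so the 224×224 adversarial crop has to be upsampled by a factor of 3 before pasting. That resampling (plus JPEG compression on save) alters the perturbation, which is one reason the full-size image can classify differently from the crop.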


In [None]:
# --- Embed crop back and compare ---
full_adv_path = "data/dog_adv_full.jpg"
image_helpers.embed_crop_back(img_path, Image.open(adv_patch_path), save_path=full_adv_path)

print("\n--- Original image ---")
image_helpers.print_topk(image_helpers.classify(img_path))

print("\n--- Full-size adversarial image ---")
image_helpers.print_topk(image_helpers.classify(full_adv_path))

print("\nVisual check (original vs adversarial):")
display(Image.open(img_path))
display(Image.open(full_adv_path))


## Step 8: Try Your Own Image (Optional)

Find an image on the web, paste its URL below, and see what the model thinks it is. To get the URL, right-click the image and select "Copy Image Address" (or a similar option).

Alternatively, upload a local file and set `local_path`. Then pick a target label.


In [None]:
# --- Your own image attack (optional) ---
import os, urllib.parse, urllib.request

image_url = ""  # e.g., "https://example.com/cat.jpg"

# "https://images.stockcake.com/public/9/6/5/965857ff-5ab5-427f-859c-4487bffaf078_medium/shiny-truck-display-stockcake.jpg"

local_path = "" # e.g., "my_photo.jpg" (if you uploaded a file)

if image_url and not local_path:
    # Take the extension from the URL path only, so a query string (e.g. ?w=800) stays out of the filename
    url_path = urllib.parse.urlparse(image_url).path
    local_path = "data/my_image" + os.path.splitext(url_path)[1]
    try:
        urllib.request.urlretrieve(image_url, local_path)
        print(f"Downloaded to {local_path}")
    except Exception as e:
        print("Download failed:", e)

if local_path and os.path.exists(local_path):
    new_target = "tennis ball"  # change to experiment
    adv_tensor2 = pgd_targeted(local_path, new_target, eps=8/255, alpha=2/255, iters=100, stop_threshold=0.90)

    root, ext = os.path.splitext(local_path)
    adv_patch2 = f"{root}_adv_patch.png"
    image_helpers.save_tensor_image(adv_tensor2, adv_patch2)

    # Determine output format and a matching file extension (PNG for .png inputs, otherwise JPEG)
    out_ext = "PNG" if ext.lower() == ".png" else "JPEG"
    adv_full2 = f"{root}_adv_full{'.png' if out_ext == 'PNG' else '.jpg'}"
    image_helpers.embed_crop_back(local_path, Image.open(adv_patch2), save_path=adv_full2, fmt=out_ext)

    # --- Original image and classification ---
    print("\n--- Original Image Predictions ---")
    image_helpers.print_topk(image_helpers.classify(local_path))
    print("Original Image:", local_path)
    display(Image.open(local_path))

    # --- Altered image and classification ---
    print("\n--- Adversarial Image Predictions ---")
    image_helpers.print_topk(image_helpers.classify(adv_full2))
    print("\nAdversarial Image:", adv_full2)
    display(Image.open(adv_full2))
else:
    print("Set image_url or local_path to try your own image.")


## Step 9: Optional Challenges (Explore, Extend, and Have Fun)

If you finish early (or later at home), try these small challenges to deepen your intuition about adversarial ML on images. Each one builds on what we did in the main notebook. Skeleton code is provided so you can focus on the ideas. Feel free to use online resources (including generative AI) to help you implement the pieces.

**What to expect**
- You will reuse helpers from `image_helpers.py` and patterns from the earlier FGSM/PGD cells.
- Success criteria are concrete (for example, a margin over the original class, or a target confidence threshold).
- Keep images in [0,1] when saving; do model math in normalized space.
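
One consequence of doing the math in normalized space is easy to miss: an `eps` of `8/255` there is not 8 intensity levels in the saved image. Un-normalizing (`adv_x * std + mean`) multiplies the perturbation by the per-channel standard deviation. The sketch below works through that arithmetic; `pixel_budget` is a hypothetical helper name.

```python
IMAGENET_STD = (0.229, 0.224, 0.225)  # same per-channel std values used in pgd_targeted

def pixel_budget(eps_norm, std=IMAGENET_STD):
    """Max per-channel change in [0,1] pixel space for an L_inf budget in normalized space."""
    return tuple(eps_norm * s for s in std)

budget = pixel_budget(8 / 255)
levels = [round(b * 255, 3) for b in budget]   # expressed in 8-bit intensity levels
print("max change per channel (out of 255):", levels)
```

Under these assumptions the visible change is a little under 2 intensity levels per channel, which helps explain why the adversarial image looks unchanged and why saving it as an 8-bit file can round part of the perturbation away.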

**Challenges**
1. **Untargeted attack**
   Push *away* from the original top class by maximizing its loss (negative log probability). Declare success when another class is higher by at least a chosen margin (for example, 0.05).

2. **Saliency + Grad-CAM**
   Visualize where the model “looks.” Compute a vanilla saliency map from input gradients and a Grad-CAM heatmap from a late conv layer. Overlay on the image to compare attention patterns.

3. **Robustness check (re-encode and re-scale)**
   Take a 224×224 adversarial crop, embed it back into the original at several output sizes and JPEG qualities, save, reload, and re-classify. Watch how confidence changes as resizing and compression perturb the signal.

4. **Image synthesis from gray/noise**
   Start from a gray or random canvas and do gradient ascent on a target class score. Add light regularization (TV/L2) if you like. The result should be classifiable, even if it looks like class-specific texture rather than a photograph.

Enjoy experimenting!

## Challenge 1 — Untargeted attack (reduce the true-class probability)

Goal: make the model stop believing its original top-1 prediction (`y_true`).
Instead of pushing toward a specific target, push *away* from `y_true` by maximizing the loss for that class (negative log probability of `y_true`). Declare success when some other class is higher than `y_true` by at least a small margin (for example, `margin = 0.05`) to avoid flicker between classes.

What to build:
- An **untargeted FGSM step** that increases the true-class loss and keeps the perturbation within an L_inf bound.
- An **untargeted PGD loop** that repeats FGSM-style steps, projects back to the L_inf ball around the original image, and early-stops when the margin condition is met. Clamp to `[0,1]` only after de-normalizing, since normalized pixel values legitimately fall outside that range.

Use the same normalization space as the targeted code. You already have helpers in `image_helpers.py` and a targeted PGD example; mirror that pattern but flip the objective away from `y_true`.
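
The margin rule is easy to get subtly wrong, so here is a minimal sketch of just that check on plain Python lists. The function name `true_vs_best_other` and the probability values are made up for illustration.

```python
def true_vs_best_other(probs, true_idx):
    """Return (p_true, highest probability among all other classes)."""
    p_true = probs[true_idx]
    p_other = max(p for i, p in enumerate(probs) if i != true_idx)
    return p_true, p_other

margin = 0.05
probs = [0.40, 0.43, 0.17]            # hypothetical softmax output, true class is index 0

p_true, p_other = true_vs_best_other(probs, true_idx=0)
top1_flipped = p_other > p_true
margin_met = p_other >= p_true + margin
print(f"top-1 flipped: {top1_flipped}, margin met: {margin_met}")
```

Here the top-1 label has already changed, but the lead is only 0.03, so the attack would keep iterating. That is the "flicker" the margin is meant to guard against.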


In [None]:
# --- Challenge 1 (Skeleton): Untargeted FGSM/PGD using image_helpers -------
# Keep this minimal. Fill in the bodies as you see fit.

from typing import Tuple
import torch
import image_helpers  # get_model, load_image, label_to_index, classify_tensor, save_tensor_image

def fgsm_step_untargeted(
    x_norm: torch.Tensor,
    y_true: torch.Tensor,
    model,
    delta: torch.Tensor,
    alpha: float,
    eps: float,
) -> Tuple[torch.Tensor, float, float, float]:
    """
    One untargeted FGSM step that increases the loss of the true class.
    This function should:
      - forward (x_norm + delta) through the model,
      - compute negative log prob for y_true,
      - take a gradient *ascent* step on delta,
      - clamp delta to [-eps, +eps],
      - return (delta, loss_value, p_true, p_top_other).
    """
    pass

def pgd_untargeted(
    image_path: str,
    true_label_name: str,
    eps: float = 8/255,
    alpha: float = 2/255,
    iters: int = 50,
    margin: float = 0.05,
    early_stop: bool = True,
):
    """
    Untargeted PGD loop.
    This function should:
      - load a NORMALIZED tensor via image_helpers.load_image(image_path),
      - build y_true from true_label_name using image_helpers.label_to_index,
      - initialize a delta tensor (normalized space),
      - iteratively call fgsm_step_untargeted,
      - implement an early-stop when p_top_other >= p_true + margin,
      - de-normalize to [0,1] before returning the adversarial image.
    """
    pass

def success_margin(p_true: float, p_top_other: float, margin: float) -> bool:
    """
    Return True when p_top_other >= p_true + margin.
    You can call this from inside pgd_untargeted for early stopping.
    """
    pass

# --- Sample usage (fill bodies above, then run) -----------------------------
# model, dev = image_helpers.get_model()
# adv = pgd_untargeted(
#     image_path=img_path,
#     true_label_name=original_label,
#     eps=8/255, alpha=2/255, iters=30, margin=0.05, early_stop=True
# )
# image_helpers.save_tensor_image(adv, "adv_untargeted.png")
# print(image_helpers.classify_tensor(adv)[:5])

# Tip: You can model this after your existing targeted utilities:
# - fgsm_step(...) and pgd_targeted(...) are good references; flip the objective away from y_true.


## Challenge 2 — Saliency maps (vanilla gradients) and Grad-CAM

Goal: visualize *where* the network is looking when it makes a prediction.

Two complementary views:
- **Vanilla saliency**: take the gradient of the score for a chosen class with respect to the input pixels. The absolute value (or max over channels) highlights pixels the score is most sensitive to.
- **Grad-CAM**: use gradients flowing into a late convolutional block to weight its feature maps, then upsample to image size. This yields a coarse, interpretable heatmap over the object regions.

What to build:
- A function to compute a **vanilla saliency map** for a given image and class id.
- A lightweight **Grad-CAM** helper that registers hooks on a chosen layer in ResNet-50 (for example, the last block in `layer4`), computes the class score, backprops, and returns an upsampled heatmap.
- Optionally, a small utility to **overlay** the heatmap on the original image to make the visualization easier to read.

Notes:
- Work in the **normalized** space for the forward/backward pass, but produce heatmaps in image coordinates (224×224).
- Try both the **top-1 class** and a **chosen class** to compare attention.
- Extensions for early finishers: SmoothGrad (average the saliency over noisy copies of the input), Guided Backprop, or Guided Backprop combined with Grad-CAM.
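
As a warm-up for the saliency part, the sketch below shows one way to collapse an input gradient into a single-channel map scaled to [0,1]. It uses a tiny random linear model on a 4×4 image instead of ResNet-50, and the name `saliency_from_grad` is hypothetical; treat it as one possible approach rather than the expected solution.

```python
import torch

def saliency_from_grad(grad):
    """Collapse a (1,3,H,W) input gradient to a (1,1,H,W) map scaled to [0,1]."""
    sal = grad.abs().amax(dim=1, keepdim=True)          # strongest channel per pixel
    sal = sal - sal.amin(dim=(2, 3), keepdim=True)      # shift the minimum to 0
    return sal / sal.amax(dim=(2, 3), keepdim=True).clamp_min(1e-12)

torch.manual_seed(0)
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 4 * 4, 5))

x = torch.rand(1, 3, 4, 4, requires_grad=True)   # stand-in for a normalized image
score = toy_model(x)[0, 2]                       # raw score (logit) for class 2
score.backward()                                 # x.grad now holds d(score)/d(x)

sal = saliency_from_grad(x.grad)
print(sal.shape, float(sal.min()), float(sal.max()))
```

Backpropagating the raw class score rather than a softmax probability is a common choice for saliency, since the softmax mixes in gradients from every other class.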


In [None]:
# --- Challenge 2 (Skeleton): Saliency + Grad-CAM ---------------------------
# Keep this minimal; fill the bodies as you like.

from typing import Optional
import torch
import torch.nn.functional as F
import torchvision
from PIL import Image
import image_helpers  # get_model, load_image, LABELS

def compute_vanilla_saliency(
    x_norm: torch.Tensor,
    class_idx: int,
    model: torch.nn.Module,
) -> torch.Tensor:
    """
    Return a saliency map for the given class_idx at input resolution.
    This function should:
      - enable gradient on x_norm,
      - forward to logits and select the score for class_idx,
      - backprop to get d(score)/d(x_norm),
      - convert to a single-channel saliency (e.g., abs max over channels),
      - normalize to [0,1] and return a (1,1,H,W) or (1,H,W) tensor.
    """
    pass

class GradCAM:
    """
    Minimal Grad-CAM helper.
    You should:
      - register forward and backward hooks on a target conv module,
      - cache feature maps (forward) and gradients (backward),
      - compute channel weights from pooled gradients,
      - make a weighted sum of feature maps,
      - ReLU, normalize, and upsample to input size.
    """
    def __init__(self, model: torch.nn.Module, target_module: torch.nn.Module):
        # store refs, set placeholders for cached activations and grads, register hooks
        pass

    def remove_hooks(self):
        """Remove any registered hooks."""
        pass

    def compute(
        self,
        x_norm: torch.Tensor,
        class_idx: int,
    ) -> torch.Tensor:
        """
        Return a Grad-CAM heatmap (1,H,W) aligned to the input size.
        This should:
          - run a forward pass to get logits,
          - backprop the score for class_idx,
          - use cached feats/grads to build the heatmap,
          - upsample to input resolution and normalize to [0,1].
        """
        pass

def overlay_heatmap_on_pil(
    heatmap: torch.Tensor,
    image_pil: Image.Image,
    alpha: float = 0.4,
) -> Image.Image:
    """
    Overlay a heatmap onto a PIL image and return a composite image.
    This function should:
      - convert the normalized heatmap to a colored mask,
      - resize to match image_pil,
      - alpha-blend heatmap and image.
    """
    pass

# --- Sample usage (after you fill in the bodies) ---------------------------
# model, dev = image_helpers.get_model()
# x_norm, pil_img = image_helpers.load_image(img_path)
#
# # 1) Vanilla saliency for top-1 class
# with torch.no_grad():
#     logits = model(x_norm)
#     class_idx = int(logits.argmax(dim=1).item())
# sal = compute_vanilla_saliency(x_norm.clone(), class_idx, model)
#
# # 2) Grad-CAM on the last conv block of ResNet-50
# #    Example target module (verify in your model printout):
# #    target_module = model.layer4[-1].conv3  or  model.layer4[-1]
# #    (pick a conv module that produces feature maps)
# # target_module = model.layer4[-1]
# # cam = GradCAM(model, target_module)
# # cam_map = cam.compute(x_norm.clone(), class_idx)
# # cam.remove_hooks()
#
# # 3) Overlay and visualize
# # composite = overlay_heatmap_on_pil(cam_map, pil_img, alpha=0.45)
# # display(composite)
#
# # Try different classes:
# # chosen_idx = image_helpers.LABELS.index("golden retriever")  # or use label_to_index on a name
# # sal2 = compute_vanilla_saliency(x_norm.clone(), chosen_idx, model)
# # cam2 = GradCAM(model, model.layer4[-1]); cam_map2 = cam2.compute(x_norm.clone(), chosen_idx); cam2.remove_hooks()
# # display(overlay_heatmap_on_pil(cam_map2, pil_img))


## Challenge 3 — Robustness check by re-encoding and re-scaling

Goal: see how brittle a successful adversarial crop is once it’s saved back into the original photo at different sizes and JPEG qualities.

You’ll take a 224×224 adversarial crop (the same size we fed the model), **embed it back** into the full image at several output resolutions (for example, 400×600, 600×900, …) and with different JPEG qualities (for example, 95, 85, 70). After each save, **reload and re-classify** to observe how confidence changes. Expect some attacks to weaken as resizing and compression dilute or shift the perturbation.

What to build:
- A small routine that loops over a list of output sizes and qualities, calls `embed_crop_back(...)`, saves each variant, and returns file paths.
- A tester that reopens each saved image and records Top-1 / Top-5 predictions and confidence.
- A tiny helper to convert your 224×224 adversarial tensor to a PIL image (if you don’t already have it as PIL).

Tips:
- Reuse earlier code: your adversarial result is already a 224×224 crop; convert it to PIL once and reuse.
- Keep filenames informative (include size and quality) so results are easy to compare.
- Try multiple images; some are more robust than others.
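
For the filename tip, here is a small sketch of one naming scheme that encodes size and quality and can be parsed back for the report. The helper names `variant_path` and `parse_variant`, and the `dog_adv` stem, are hypothetical.

```python
from itertools import product
from pathlib import Path

def variant_path(out_dir, stem, size, quality):
    """Build a path like out_dir/stem_400x600_q95.jpg."""
    width, height = size
    return Path(out_dir) / f"{stem}_{width}x{height}_q{quality}.jpg"

def parse_variant(path):
    """Recover (stem, (width, height), quality) from a path built by variant_path."""
    stem, dims, q = Path(path).stem.rsplit("_", 2)
    width, height = (int(v) for v in dims.split("x"))
    return stem, (width, height), int(q[1:])

sizes = [(400, 600), (600, 900)]   # (W, H)
qualities = [95, 70]

# Every combination of size and quality gets its own file
paths = [variant_path("robustness_variants", "dog_adv", s, q)
         for s, q in product(sizes, qualities)]
for p in paths:
    print(p.name, "->", parse_variant(p))
```

Splitting from the right with `rsplit` keeps the parse working even when the stem itself contains underscores.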


In [None]:
# --- Challenge 3 (Skeleton): Embed adversarial crop at various sizes --------
# Fill in the bodies as you like; reuse image_helpers wherever possible.

from typing import Iterable, List, Tuple, Dict
from pathlib import Path
from PIL import Image
import torch
import image_helpers  # embed_crop_back, classify, classify_tensor, save_tensor_image

def to_pil_224(adv_tensor_unnorm: torch.Tensor) -> Image.Image:
    """
    Convert a (1,3,224,224) adversarial tensor in [0,1] to a 224x224 PIL image.
    This function should:
      - detach/clip to [0,1],
      - convert using torchvision or manual numpy path,
      - return a PIL Image in RGB.
    """
    pass

def generate_variants_with_embed(
    original_path: str,
    adv_crop_pil: Image.Image,
    sizes: Iterable[Tuple[int,int]],
    qualities: Iterable[int],
    out_dir: str = "robustness_variants",
) -> List[Path]:
    """
    Create resized/compressed variants by embedding adv_crop_pil into original_path.
    This function should:
      - ensure out_dir exists,
      - for each (W,H) and quality:
          - call image_helpers.embed_crop_back(original_path, adv_crop_pil, save_path, fmt="JPEG", quality=quality)
          - optionally resize to (W,H) *before saving* OR pass through embed_crop_back then resize,
      - collect and return a list of saved file paths.
    """
    pass

def evaluate_variants(paths: List[Path], topk: int = 5) -> List[Dict]:
    """
    Re-open each saved image and classify it.
    This function should:
      - loop over paths,
      - call image_helpers.classify(str(path), k=topk),
      - store results (path, top-1 label, top-1 prob, maybe full top-k).
    """
    pass

def print_variant_report(results: List[Dict]) -> None:
    """
    Pretty-print a small table from `evaluate_variants` results.
    This function should:
      - parse filename to show size/quality (if encoded in name),
      - print top-1 label and confidence for quick comparison.
    """
    pass

# --- Sample usage (after you fill in the bodies) ----------------------------
# original_path = img_path  # the clean, full-resolution source used earlier
# # If your adversarial is a tensor crop (1,3,224,224) in [0,1]:
# # adv_crop_pil = to_pil_224(adv_crop_tensor)
# #
# # If you already saved it as "adv_crop_224.png", you can simply:
# # adv_crop_pil = Image.open("adv_crop_224.png").convert("RGB")
#
# sizes     = [(400, 600), (600, 900), (800, 1200)]   # (W, H)
# qualities = [95, 85, 70]
#
# saved_paths = generate_variants_with_embed(
#     original_path=original_path,
#     adv_crop_pil=adv_crop_pil,
#     sizes=sizes,
#     qualities=qualities,
#     out_dir="robustness_variants"
# )
#
# results = evaluate_variants(saved_paths, topk=5)
# print_variant_report(results)
#
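# # Optional: also test the highest JPEG quality setting. (A true PNG test, with no JPEG
# # artifacts at all, would need generate_variants_with_embed to pass fmt="PNG" instead.)
# # saved_paths_hq = generate_variants_with_embed(original_path, adv_crop_pil, sizes=[(600,900)], qualities=[100])
# # ... then evaluate as above.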


## Challenge 4 — Image synthesis from gray/noise (feature visualization lite)

Goal: start from a gray field or random noise and “dream up” an input that the model classifies as a chosen target class. Instead of *reducing* confidence in the original label, we’ll *increase* the score of a target class by taking gradient ascent steps on the input itself.

What to build:
- A canvas initializer that creates a 224×224 tensor in [0,1] (gray or noise).
- A single “ascent step” that nudges pixels to increase the target class score. Optionally add light regularizers (for example, L2 toward gray and/or total variation) to keep the pattern from exploding.
- A synth loop that repeats ascent steps, clamps to [0,1], and tracks the current confidence. Stop when you reach a confidence threshold or run out of iterations.

Notes:
- Work in the normalized space for the forward pass, but keep the master image in [0,1].
- Small step sizes with many iterations tend to look cleaner; regularization weights can be tiny but helpful.
- This produces classifiable textures, not photoreal images—great for discussing what the model “keys on.”
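
If you want to try the total variation regularizer, the sketch below shows one common (anisotropic) definition: the sum of absolute differences between neighboring pixels. It is named `total_variation_sketch` to keep it separate from the skeleton's `total_variation`, and other definitions (for example, squared differences or a mean instead of a sum) are equally valid.

```python
import torch

def total_variation_sketch(x):
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    dh = (x[..., 1:, :] - x[..., :-1, :]).abs().sum()   # each row vs. the row above it
    dw = (x[..., :, 1:] - x[..., :, :-1]).abs().sum()   # each column vs. the column to its left
    return dh + dw

flat = torch.full((1, 3, 4, 4), 0.5)   # uniform gray canvas: perfectly smooth
stripes = torch.zeros(1, 1, 2, 2)
stripes[..., 1] = 1.0                  # left column dark, right column bright

print("flat canvas TV:", float(total_variation_sketch(flat)))
print("striped TV:    ", float(total_variation_sketch(stripes)))
```

A smooth canvas scores zero and every sharp edge adds to the penalty, so subtracting a small multiple of this value from the objective discourages high-frequency noise in the synthesized image.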


In [None]:
# --- Challenge 4 (Skeleton): Class "dreaming" via gradient ascent -----------
# Minimal scaffolding; fill bodies as you like. Reuse image_helpers utilities.

from typing import Literal, Tuple
import torch
import image_helpers  # get_model, label_to_index, classify_tensor, save_tensor_image

CanvasMode = Literal["gray", "noise"]

def init_canvas(
    mode: CanvasMode = "gray",
    device: torch.device = None,
) -> torch.Tensor:
    """
    Return a (1,3,224,224) float tensor in [0,1] to serve as the starting image.
    - mode="gray": mid-gray canvas
    - mode="noise": uniform random in [0,1]
    """
    pass

def normalize_for_model(x_unnorm: torch.Tensor, device: torch.device) -> torch.Tensor:
    """
    Convert [0,1] tensor to normalized space expected by ResNet-50.
    """
    pass

def total_variation(x: torch.Tensor) -> torch.Tensor:
    """
    Return a scalar TV penalty encouraging local smoothness.
    """
    pass

def ascent_step(
    x_unnorm: torch.Tensor,
    y_target_idx: int,
    model: torch.nn.Module,
    alpha: float,
    tv_weight: float = 0.0,
    l2_weight: float = 0.0,
) -> Tuple[torch.Tensor, float, float]:
    """
    Take one gradient ascent step to increase the target class score.
    This function should:
      - enable grad on x_unnorm,
      - forward normalized x to logits,
      - form an objective: target log-prob minus small regularizers,
      - backprop and nudge x_unnorm upward by alpha * sign(grad) or raw grad,
      - clamp back to [0,1],
      - return (updated_x, loss_value, target_confidence).
    """
    pass

def synthesize_image(
    target_label_name: str,
    steps: int = 200,
    alpha: float = 1.0/255,
    tv_weight: float = 1e-4,
    l2_weight: float = 1e-4,
    start_mode: CanvasMode = "gray",
    stop_threshold: float = 0.90,
) -> torch.Tensor:
    """
    Main loop: initialize a canvas, run gradient ascent, and return a [0,1] tensor.
    This function should:
      - get model/device, map label name -> index,
      - init canvas via init_canvas(start_mode),
      - iterate ascent_step(...) for `steps`,
      - print/track current confidence and early-stop at stop_threshold,
      - return the final [0,1] tensor.
    """
    pass

# --- Sample usage (after you fill in the bodies) ----------------------------
# model, dev = image_helpers.get_model()
# adv_tex = synthesize_image(
#     target_label_name="golden retriever",
#     steps=300,
#     alpha=1.0/255,
#     tv_weight=1e-4,
#     l2_weight=1e-4,
#     start_mode="noise",
#     stop_threshold=0.95
# )
# image_helpers.save_tensor_image(adv_tex, "dream_golden.png")
# print(image_helpers.classify_tensor(adv_tex)[:5])

# Variations:
# - Try start_mode="gray" for cleaner textures.
# - Reduce alpha and increase steps for smoother patterns.
# - Turn off TV/L2 to see the effect of regularization.
