# Day 33: Adversarial Attack Simulation

In this lab, we will use the **Fast Gradient Sign Method (FGSM)** to fool a simple machine learning model.
We will demonstrate that a small, deliberately computed perturbation, bounded by `epsilon` per pixel, can substantially change the model's output score.
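
For reference, the FGSM step can be written as below. This is the standard formulation, adapted here to a score function $f$ instead of a loss; the symbols $x$, $x_{\text{adv}}$, $f$, and $\epsilon$ are notation introduced for this note and are not taken from the lab code.

$$
x_{\text{adv}} = x + \epsilon \cdot \operatorname{sign}\left(\nabla_x f(x)\right)
$$

Every pixel moves by at most $\epsilon$, so the perturbation is bounded in the $L_\infty$ norm. To push the score down rather than up, the sign of the gradient is flipped.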

In [None]:
import sys
import os
import numpy as np
import matplotlib.pyplot as plt

# Add root directory to sys.path
sys.path.append(os.path.abspath('../../'))

from src.security.adversarial import AdversarialAttacker

## 1. Mock Model and Data

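We will pretend to have a simple linear classifier whose score is the sum of the elementwise product of weights and pixels: `y = sum(w * x)`.
Our "image" will be a 4x4 mock array with pixel values between 0 and 1.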

In [None]:
# Mock Image (4x4)
image = np.array([
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.8, 0.2, 0.1],
    [0.1, 0.2, 0.9, 0.0],
    [0.0, 0.1, 0.0, 0.0]
])

# Mock Weights (Simple linear filter, e.g., edge detector)
weights = np.array([
    [0.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0]
])

def predict(img):
    # Linear score: elementwise product with the weights, summed (a dot product)
    return np.sum(img * weights)

def gradient(img, target_class=1):
    # For a linear score y = sum(w * x), the gradient dy/dx is w for every input.
    # target_class == 1: we want to INCREASE the score (misclassify as positive), so grad is +w.
    # Any other target_class: we want to DECREASE the score (misclassify as negative), so grad is -w.
    return weights if target_class == 1 else -weights

orig_score = predict(image)
print(f"Original Score: {orig_score:.4f}")
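
The claim that the gradient of a linear score is just the weight array can be checked numerically. The sketch below is self-contained and repeats the mock data from the cell above; `numerical_gradient` is a hypothetical helper written for this check, not part of `src.security.adversarial`.

```python
import numpy as np

# Same mock image and weights as above, repeated so this block runs on its own.
image = np.array([
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.8, 0.2, 0.1],
    [0.1, 0.2, 0.9, 0.0],
    [0.0, 0.1, 0.0, 0.0]
])
weights = np.array([
    [0.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0]
])

def predict(img):
    return np.sum(img * weights)

def numerical_gradient(f, x, h=1e-6):
    # Hypothetical helper: central finite differences, one pixel at a time.
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = h
        grad[idx] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

num_grad = numerical_gradient(predict, image)
print("Numerical gradient matches weights:", np.allclose(num_grad, weights, atol=1e-6))
```

Because the model is linear, the gradient does not depend on the image at all, which is why `gradient` can ignore its `img` argument.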

## 2. Generate Adversarial Example

We will perturb each pixel by at most `epsilon` in the direction that raises the score, which is the FGSM step.

In [None]:
attacker = AdversarialAttacker()
epsilon = 0.2  # Maximum change applied to any single pixel

# We want to drive the score UP (simulate attacking target class)
grad = gradient(image, target_class=1)

perturbed = attacker.fgsm_attack(image, epsilon, grad)

new_score = predict(perturbed)
print(f"New Score: {new_score:.4f}")
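
The source of `AdversarialAttacker` is not shown in this notebook, so the following is only a minimal sketch of what an FGSM step typically looks like, assuming the attacker adds `epsilon * sign(grad)` and clips to the valid pixel range. The function `fgsm_sketch` and its `clip_min`/`clip_max` parameters are hypothetical and may differ from the real `fgsm_attack`.

```python
import numpy as np

# Same mock image and weights as above, repeated so this block runs on its own.
image = np.array([
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.8, 0.2, 0.1],
    [0.1, 0.2, 0.9, 0.0],
    [0.0, 0.1, 0.0, 0.0]
])
weights = np.array([
    [0.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0]
])

def fgsm_sketch(img, epsilon, grad, clip_min=0.0, clip_max=1.0):
    # Hypothetical stand-in, NOT the AdversarialAttacker implementation.
    # Step every pixel by epsilon in the direction of the gradient's sign,
    # then clip so the result is still a valid image.
    perturbed = img + epsilon * np.sign(grad)
    return np.clip(perturbed, clip_min, clip_max)

epsilon = 0.2
perturbed = fgsm_sketch(image, epsilon, weights)

orig_score = np.sum(image * weights)
new_score = np.sum(perturbed * weights)
print(f"Original score:   {orig_score:.4f}")
print(f"New score:        {new_score:.4f}")
print(f"Max pixel change: {np.max(np.abs(perturbed - image)):.4f}")
```

Under these assumptions the score rises from 2.4 to 4.0. For a linear score, an unclipped sign step adds `epsilon * sum(|w|)`, here `0.2 * 8 = 1.6`, and clipping has no effect on this particular image because every perturbed pixel stays within [0, 1]. Pixels with zero weight are left untouched because `np.sign(0)` is 0.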

## 3. Visualize Changes

Plot the original image, the perturbation, and the adversarial image side by side.

In [None]:
diff = perturbed - image

fig, axs = plt.subplots(1, 3, figsize=(12, 4))
axs[0].imshow(image, cmap='gray', vmin=0, vmax=1)
axs[0].set_title(f"Original (Score {orig_score:.2f})")
axs[1].imshow(diff, cmap='coolwarm', vmin=-epsilon, vmax=epsilon)
axs[1].set_title(f"Perturbation (epsilon = {epsilon})")
axs[2].imshow(perturbed, cmap='gray', vmin=0, vmax=1)
axs[2].set_title(f"Adversarial (Score {new_score:.2f})")
plt.show()