> Reference:
> 
> Unveiling the Power of Projected Gradient Descent in Adversarial Attacks: https://medium.com/@zachariaharungeorge/unveiling-the-power-of-projected-gradient-descent-in-adversarial-attacks-2f92509dde3c
> 
> Working with Results: https://docs.ultralytics.com/modes/predict/#working-with-results
> 
> Understanding output of .pt file of YOLOv8: https://github.com/ultralytics/ultralytics/issues/8421

# Issues
In the `pgd_attack(model, images, labels, epsilon, alpha, num_iterations)` function, `model(perturbed_images)` returns a list. That makes sense, since the Ultralytics documentation states that `All Ultralytics predict() calls will return a list of Results objects`. However, `CrossEntropyLoss()` expects a tensor and fails with `argument 'input' (position 1) must be Tensor, not list`. Judging from the GitHub issue linked above, it seems that different pre-trained YOLOv8 models can return results in different formats. I am not sure how YOLOv8 handles its output internally.
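One possible way around the error, sketched here under assumptions, is to run the attack against a module that returns a tensor of logits (for example the `classify_model` built from the backbone in this notebook) rather than against the `YOLO` wrapper. The snippet below uses a small stand-in network, `TinyClassifier`, and a helper named `pgd_attack_logits`; both are hypothetical names introduced only so the example runs without trained weights. It also assumes pixel values in `[0, 1]`.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a model that returns a logits tensor."""

    def __init__(self, num_classes=29):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1)),
            nn.Flatten(),
            nn.Linear(8, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # shape: (batch_size, num_classes)


def pgd_attack_logits(model, images, labels, epsilon, alpha, num_iterations):
    """L-infinity PGD against a model whose forward pass returns logits."""
    loss_fn = nn.CrossEntropyLoss()
    original = images.clone().detach()
    perturbed = original.clone()
    for _ in range(num_iterations):
        perturbed.requires_grad_(True)
        logits = model(perturbed)  # a Tensor, so the loss accepts it
        loss = loss_fn(logits, labels)
        (grad,) = torch.autograd.grad(loss, perturbed)
        with torch.no_grad():
            # Gradient ascent step, then projection onto the epsilon-ball
            perturbed = perturbed + alpha * grad.sign()
            delta = torch.clamp(perturbed - original, -epsilon, epsilon)
            perturbed = torch.clamp(original + delta, 0.0, 1.0)
        perturbed = perturbed.detach()
    return perturbed


torch.manual_seed(0)
stand_in = TinyClassifier().eval()
images = torch.rand(1, 3, 32, 32)   # random image with values in [0, 1]
labels = torch.tensor([22])         # same class index as in the notebook
adversarial = pgd_attack_logits(stand_in, images, labels,
                                epsilon=0.1, alpha=0.01, num_iterations=10)
```

Note that the `nn.Linear` head of `classify_model` in this notebook is freshly initialized and untrained, so an attack on it would not necessarily transfer to the detector's actual predictions.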

# Working Mechanism of Projected Gradient Descent (PGD)

At the core of machine learning optimization lies the fundamental concept of gradient descent. This iterative algorithm fine-tunes model parameters to minimize a given loss function. Mathematically, the update rule is expressed as:

$$\Theta_{t+1} = \Theta_t - \alpha \cdot \nabla J(\Theta_t)$$

where $\Theta_t$ represents the parameters at iteration $t$, $\alpha$ is the learning rate, and $\nabla J(\Theta_t)$ is the gradient of the loss function.
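As a small illustration of this update rule, the sketch below minimizes the toy loss $J(\Theta) = (\Theta - 3)^2$, whose gradient is $2(\Theta - 3)$. The loss, the starting point, and the function names are assumptions chosen for the example and do not come from the notebook.

```python
def gradient_descent(grad, theta0, alpha, num_steps):
    """Repeatedly apply theta <- theta - alpha * grad(theta)."""
    theta = theta0
    for _ in range(num_steps):
        theta = theta - alpha * grad(theta)
    return theta


def grad_J(theta):
    # Gradient of the toy loss J(theta) = (theta - 3)^2
    return 2.0 * (theta - 3.0)


theta_final = gradient_descent(grad_J, theta0=0.0, alpha=0.1, num_steps=100)
print(theta_final)  # close to the minimizer theta = 3
```

With $\alpha = 0.1$ the distance to the minimizer shrinks by a factor of $0.8$ at every step, which is why 100 iterations are more than enough here.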

## Projected Gradient Descent (PGD)

Projected Gradient Descent (PGD) builds on this foundation by adding a constraint: after every step, the iterate is projected back onto an allowed set. In the context of adversarial attacks, the objective is to perturb the input data, rather than the model parameters, so that the model is deliberately misled.

PGD uses a perturbation budget ($\epsilon$) to bound the total perturbation and a step size ($\alpha$) to set the size of each update; the direction of each update comes from the sign of the gradient. The update rule for PGD is defined as:

$$x_{t+1} = \Pi\left(x_t + \alpha \cdot \text{sign}(\nabla_x J(\Theta, x_t, y))\right)$$

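where $x_t$ is the input at iteration $t$, $\alpha$ is the step size, $\nabla_x J(\Theta, x_t, y)$ is the gradient of the loss with respect to the input, and $\Pi$ is the projection operator that keeps the perturbed input within the predefined bounds (here, within $\epsilon$ of the original input). The sign in front of $\alpha$ is positive because the attack ascends the loss instead of descending it.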
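To make the projection concrete, here is a minimal sketch of one PGD step for an $L_\infty$ budget, written in plain Python over a flat list of pixel values. The helper names (`sign`, `pgd_step`), the constant example gradient, and the assumption that valid pixels lie in `[0, 1]` are illustrative choices, not part of the notebook.

```python
def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def pgd_step(x, x_orig, grad, alpha, epsilon, lo=0.0, hi=1.0):
    """One step: move along sign(grad), then project back into the budget."""
    x_next = []
    for xi, x0, gi in zip(x, x_orig, grad):
        stepped = xi + alpha * sign(gi)
        # Projection onto the epsilon-ball around the original value
        projected = min(max(stepped, x0 - epsilon), x0 + epsilon)
        # Keep the result a valid pixel value
        x_next.append(min(max(projected, lo), hi))
    return x_next


x_orig = [0.5, 0.02, 0.98]
grad = [1.0, -2.0, 3.0]      # assumed constant gradient, for illustration only
alpha, epsilon = 0.04, 0.1

x_adv = list(x_orig)
for _ in range(5):
    x_adv = pgd_step(x_adv, x_orig, grad, alpha, epsilon)
```

After a few steps the first value stops at the edge of the $\epsilon$-ball, while the other two are stopped earlier by the valid pixel range, which shows the two separate clamps at work.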


In [None]:
! pip install ultralytics

In [54]:
import torch
import torch.nn as nn
from ultralytics import YOLO
from torchvision import transforms
from PIL import Image



# Step 1: Load trained YOLOv8 model
model = YOLO('train/weights/best.pt')
image_path = 'stop.png'

# Step 2: Extract the backbone (CSPDarknet53), i.e. the first 10 layers of the detection model
backbone = model.model.model[:10]
# Class index used as the label in the PGD loss
labels = torch.tensor([22])

num_classes = 29
sample_image = torch.randn(1, 3, 416, 416)  # Adjust size if necessary
sample_output = backbone(sample_image)
output_channels = sample_output.shape[1]
classify_model = nn.Sequential(
    backbone,  # Use the CSPDarknet53 backbone
    nn.AdaptiveAvgPool2d((1, 1)),  # Global Average Pooling to reduce to (batch_size, channels, 1, 1)
    nn.Flatten(),  # Flatten to (batch_size, channels)
    nn.Linear(in_features=output_channels, out_features=num_classes)  # Linear layer for classification
)

preprocess = transforms.Compose([
    transforms.Resize((416, 416)),  # Resize to 416x416
    transforms.ToTensor(),          # Convert to tensor
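    # Note: ImageNet mean/std normalization moves pixel values outside [0, 1]
    # (a white pixel becomes about 2.25), which conflicts with the [0, 1] clamp in pgd_attack.
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # Normalize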
])

image = Image.open(image_path).convert("RGB")  # Load image and ensure it's in RGB format
image = preprocess(image).unsqueeze(0)  # Apply preprocessing and add batch dimension

Implement Projected Gradient Descent (PGD) here.

In [55]:
import torch
import torch.nn as nn
import matplotlib.pyplot as plt  # used by the visualization at the end of this cell

# Projected Gradient Descent (PGD) Attack
# This function implements the PGD attack, similar in style to the existing FGSM function.
# Note: `model` must return a tensor of logits; the YOLO wrapper returns a list of Results objects.
def pgd_attack(model, images, labels, epsilon, alpha, num_iterations):
    print("iamges: ", images) # tensor
    # Keep a detached copy of the original images and perturb a separate copy
    images = images.clone().detach()
    perturbed_images = images.clone()
    perturbed_images.requires_grad = True
    print("perturbed_images: ", perturbed_images) # tensor

    for _ in range(num_iterations):
        # Forward pass to compute loss
        outputs = model(perturbed_images)
        print("outputs: ", outputs) # list of ultralytics.engine.results.Results objects when `model` is the YOLO wrapper
        # TypeError: cross_entropy_loss(): argument 'input' (position 1) must be Tensor, not list
        print("outputs[0]", outputs[0])
        loss = nn.CrossEntropyLoss()(outputs, labels)

        # Zero all existing gradients
        model.zero_grad()

        # Calculate gradients of the loss with respect to the images
        loss.backward()

        # Compute the perturbation (gradient ascent)
        with torch.no_grad():
            perturbation = alpha * perturbed_images.grad.sign()
            stepped = perturbed_images + perturbation

            # Project perturbed images to ensure they remain within the epsilon-ball of the original images
            perturbation_clipped = torch.clamp(stepped - images, min=-epsilon, max=epsilon)
            # The [0, 1] clamp assumes pixel values in [0, 1], i.e. unnormalized inputs
            stepped = torch.clamp(images + perturbation_clipped, 0, 1)

        # The clamped result is a new tensor without gradient tracking, so re-enable it for the
        # next iteration (calling .grad.zero_() on it would fail because its .grad is None)
        perturbed_images = stepped.detach().requires_grad_(True)

    return perturbed_images.detach()

# Example usage of the PGD attack
# Assuming you have already loaded the YOLOv8 model and input image tensors
alpha = 0.01  # Step size for each iteration
epsilon = 0.1  # Perturbation budget
num_iterations = 10  # Number of iterations

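# Applying the PGD attack on the input image
# Note: passing the YOLO wrapper `model` here is what raises the TypeError shown below,
# because it returns a list of Results objects; `classify_model` (built above) returns a tensor.
pgd_perturbed_image = pgd_attack(model, image, labels, epsilon, alpha, num_iterations)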

# Clamping the perturbed image to ensure pixel values are within valid range
pgd_perturbed_image = torch.clamp(pgd_perturbed_image, 0, 1)

# Getting predictions for the original and perturbed images
boxes, scores, labels = get_yolo_output(model, image)
pgd_boxes, pgd_scores, pgd_labels = get_yolo_output(model, pgd_perturbed_image)

# Visualization
fig, axs = plt.subplots(1, 3, figsize=(20, 10))

# Original image with predictions
plot_boxes(axs[0], boxes, scores, labels, "Original Image with YOLOv8 Predictions", image)

# Perturbation from PGD
pgd_perturbation = (pgd_perturbed_image - image).squeeze().permute(1, 2, 0).cpu().detach().numpy()
pgd_perturbation = (pgd_perturbation - pgd_perturbation.min()) / (pgd_perturbation.max() - pgd_perturbation.min())
axs[1].imshow(pgd_perturbation)
axs[1].set_title("PGD Perturbation")
axs[1].axis('off')

# Adversarial image with predictions
plot_boxes(axs[2], pgd_boxes, pgd_scores, pgd_labels, "Adversarial Image with YOLOv8 Predictions", pgd_perturbed_image)

plt.tight_layout()
plt.show()


iamges:  tensor([[[[2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489],
          [2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489],
          [2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489],
          ...,
          [2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489],
          [2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489],
          [2.2489, 2.2489, 2.2489,  ..., 2.2489, 2.2489, 2.2489]],

         [[2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286],
          [2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286],
          [2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286],
          ...,
          [2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286],
          [2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286],
          [2.4286, 2.4286, 2.4286,  ..., 2.4286, 2.4286, 2.4286]],

         [[2.6400, 2.6400, 2.6400,  ..., 2.6400, 2.6400, 2.6400],
          [2.6400, 2.6400, 2.6400,  ..., 2.6400, 2.6400, 2.6400],
          [2.6400, 2.6400, 2.6400

TypeError: cross_entropy_loss(): argument 'input' (position 1) must be Tensor, not list