<a href="https://colab.research.google.com/github/ElyasYassin/DecoyChallenge/blob/main/OptimizedLinfPGD.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Setup
* RobustBench: this library is used for loading the robust classifier. For more information, visit: https://github.com/RobustBench/robustbench

* Foolbox: this library is used for generating adversarial examples. For more information, visit: https://github.com/bethgelab/foolbox

In [3]:
%%capture
!pip install git+https://github.com/RobustBench/robustbench.git # library for loading robust classifier
!pip install -q foolbox # library for adversarial example generation
!pip install timm==1.0.9

In [4]:
import timm
print(timm.__version__)

AttributeError: partially initialized module 'torch' has no attribute 'version' (most likely due to a circular import)

In [3]:
from robustbench.utils import clean_accuracy
from robustbench.utils import load_model
import matplotlib.pyplot as plt
from torch import unique
import foolbox as fb
import numpy as np
import pickle
import torch
import os

### Download and preprocess the data:

* We will use 1000 test examples from the CIFAR-10 dataset. The model did not see these images during training, so the goal is to fool its predictions on unseen images!

In [4]:
import gdown
output_file = 'cifar10.pt'
file_id = "1A5gQCE0bHZhBlfcLQ2fFP5UygpgVkdAX"
gdown.download(f"https://drive.google.com/uc?id={file_id}", output_file)

Downloading...
From: https://drive.google.com/uc?id=1A5gQCE0bHZhBlfcLQ2fFP5UygpgVkdAX
To: /content/cifar10.pt
100%|██████████| 12.3M/12.3M [00:00<00:00, 22.0MB/s]


'cifar10.pt'

In [5]:
cifar_data = torch.load('cifar10.pt')


In [6]:
# Extract the images and labels tensors
x_test = cifar_data['images'] / 255.0
y_test = cifar_data['labels']

print(unique(y_test, return_counts=True))

(tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), tensor([100, 100, 100, 100, 100, 100, 100, 100, 100, 100]))


In [7]:
print(x_test.shape, y_test.shape)
print(torch.max(x_test), torch.min(x_test))

torch.Size([1000, 3, 32, 32]) torch.Size([1000])
tensor(1.) tensor(0.)


### Loading the robust model

* IMPORTANT: Do not change this part of the code. Your final generated examples will be evaluated on how successfully they fool this model!

In [8]:
import robustbench.utils
model = load_model(model_name='Kireev2021Effectiveness_RLATAugMix', dataset='cifar10', threat_model='corruptions')

Downloading models/cifar10/corruptions/Kireev2021Effectiveness_RLATAugMix.pt (gdrive_id=19HNTdqJiuNyqFqIarPejniJEjZ3RQ_nj).


Downloading...
From (original): https://drive.google.com/uc?id=19HNTdqJiuNyqFqIarPejniJEjZ3RQ_nj
From (redirected): https://drive.google.com/uc?id=19HNTdqJiuNyqFqIarPejniJEjZ3RQ_nj&confirm=t&uuid=eb31fde4-e60d-4ffe-ae44-88266743779e
To: /content/models/cifar10/corruptions/Kireev2021Effectiveness_RLATAugMix.pt
100%|██████████| 89.5M/89.5M [00:02<00:00, 30.5MB/s]
  checkpoint = torch.load(model_path, map_location=torch.device('cpu'))


### GPU Utilization

* To shorten the running time, let's use the GPU when one is available!

In [9]:
# Check if GPU is available and set the device accordingly
if torch.cuda.is_available():
    device = torch.device('cuda')
    print("Using GPU:", torch.cuda.get_device_name(0))
else:
    device = torch.device('cpu')
    print("Using CPU")

model = model.to(device)
x_test = x_test.to(device)
y_test = y_test.to(device)

Using GPU: NVIDIA A100-SXM4-40GB


### Adversarial Example Generation -- Adversarial Perturbation

* As a baseline, we use the PGD (projected gradient descent) attack from the foolbox library. This is the most important part of the challenge: which algorithm works best?

* Many other adversarial example generation algorithms exist. Don't forget to check out other libraries!

  * One other very popular library is the Adversarial Robustness Toolbox (ART)!
  * Your task is to find the algorithms that work best according to our evaluation metrics.
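
To make the baseline concrete, here is a minimal sketch of the L-infinity PGD update: take a signed-gradient step that increases the loss, project back into the epsilon-ball around the original input, then clip to the valid pixel range. This is not foolbox's implementation. It assumes a toy linear softmax classifier (the hypothetical `W` and `b`) written in NumPy so that the input gradient can be derived by hand, and the names `linf_pgd`, `step`, and `seed` are illustrative only.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def linf_pgd(x, y, W, b, eps=8 / 255, step=2 / 255, steps=10, seed=0):
    """Toy L-inf PGD against a linear softmax classifier: logits = x @ W + b."""
    rng = np.random.default_rng(seed)
    # Random start inside the eps-ball, clipped to the valid pixel range
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    onehot = np.eye(W.shape[1])[y]
    for _ in range(steps):
        probs = softmax(x_adv @ W + b)
        grad = (probs - onehot) @ W.T              # d(cross-entropy)/d(input)
        x_adv = x_adv + step * np.sign(grad)       # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # keep pixels in [0, 1]
    return x_adv

# Hypothetical toy problem: 16 "images" with 12 features each, 3 classes
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=(16, 12))
W = rng.normal(size=(12, 3))
b = np.zeros(3)
y = (x @ W + b).argmax(axis=1)   # use the model's own predictions as labels

x_adv = linf_pgd(x, y, W, b)
max_linf = np.abs(x_adv - x).max()
print(f"Largest per-feature change: {max_linf:.4f} (budget {8 / 255:.4f})")
```

Foolbox documents `rel_stepsize` as the step size relative to epsilon, so `rel_stepsize=0.527` with `epsilons=[8/255]` should correspond to an absolute step of roughly `0.527 * 8/255` per iteration.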

In [10]:
model_fb = fb.PyTorchModel(model, bounds=(0, 1))

In [11]:
_, advs, success = fb.attacks.LinfPGD(rel_stepsize=0.527, steps=10)(model_fb, x_test, y_test, epsilons=[8/255])


In [5]:
import itertools
import foolbox as fb

# Define the hyperparameter grid for LinfPGD
rel_stepsize_grid = [0.527]  # Step sizes relative to epsilon
steps_grid = [20]  # Number of optimization steps
epsilons_grid = [14/255]  # L-infinity perturbation budgets to try

# Combine all hyperparameters into a grid
grid = list(itertools.product(rel_stepsize_grid, steps_grid, epsilons_grid))

# Function to run the attack with the given hyperparameters
def run_attack(model_fb, x_test, y_test, rel_stepsize, steps, epsilon):
    attack = fb.attacks.LinfPGD(rel_stepsize=rel_stepsize, steps=steps)
    return attack(model_fb, x_test, y_test, epsilons=epsilon)

# Iterate through the hyperparameter grid
results = []
for rel_stepsize, steps, epsilon in grid:  # distinct name, so epsilons_grid is not overwritten
    print(f"Running attack with params: rel_stepsize={rel_stepsize}, steps={steps}, epsilon:{epsilon}")

    # Run the attack with current parameters
    _, advs, success = run_attack(model_fb, x_test, y_test,
                                  rel_stepsize=rel_stepsize,
                                  steps=steps,
                                  epsilon=epsilon)

    # Robust accuracy: fraction of inputs the attack failed to fool.
    # Replace this with your own scoring system if the project uses a different metric.
    score = (1 - success.float().mean()).item()

    # Store results
    results.append({
        'rel_stepsize': rel_stepsize,
        'steps': steps,
        'score': score,
        'epsilon': epsilon
    })
    print(f"Score: {score}")

# After completing the grid search, pick the parameters with the lowest robust accuracy,
# i.e. the strongest attack
best_result = min(results, key=lambda x: x['score'])
print(f"Best result: {best_result}")

Running attack with params: rel_stepsize=0.527, steps=20, epsilon:0.054901960784313725


NameError: name 'model_fb' is not defined

# **Using Bayesian Optimization (Optuna)**

In [14]:
import optuna
import foolbox as fb

# Define the objective function for Bayesian optimization
def objective(trial):
    # Use Optuna to suggest values for hyperparameters
    binary_search_steps = trial.suggest_int('binary_search_steps', 7, 9)
    steps = trial.suggest_categorical('steps', [20])
    stepsize = trial.suggest_float('stepsize', 0.01, 0.05)
    confidence = trial.suggest_float('confidence', 0, 0.1)
    initial_const = trial.suggest_float('initial_const', 0.001, 0.01)

    epsilons = [8 / 255]  # Keep epsilon fixed for now

    # Run the attack with the suggested parameters
    _, advs, success = fb.attacks.L2CarliniWagnerAttack(
        binary_search_steps=binary_search_steps,
        steps=steps,
        stepsize=stepsize,
        confidence=confidence,
        initial_const=initial_const
    )(model_fb, x_test, y_test, epsilons=epsilons)

    # Robust accuracy: fraction of inputs the attack failed to fool
    # (update according to your project needs)
    score = (1 - success.float().mean()).item()

    # Return the score (lower robust accuracy means a stronger attack)
    return score

# Create an Optuna study that minimizes robust accuracy, then run the optimization
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)  # Adjust n_trials as needed

# Get the best result
best_result = study.best_trial
print(f"Best hyperparameters: {best_result.params}")
print(f"Best score: {best_result.value}")

[I 2024-10-17 02:54:20,242] A new study created in memory with name: no-name-563dc5c6-31c9-4c47-b130-474fccd04633
[I 2024-10-17 02:54:28,417] Trial 0 finished with value: 0.9319999814033508 and parameters: {'binary_search_steps': 7, 'steps': 20, 'stepsize': 0.04302324573732417, 'confidence': 0.08915416145722999, 'initial_const': 0.006035334017021206}. Best is trial 0 with value: 0.9319999814033508.
[W 2024-10-17 02:54:31,619] Trial 1 failed with parameters: {'binary_search_steps': 7, 'steps': 20, 'stepsize': 0.03563884709681438, 'confidence': 0.04187849545287936, 'initial_const': 0.0030783855748649745} because of the following error: KeyboardInterrupt().
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/optuna/study/_optimize.py", line 197, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-14-d018125f52d9>", line 16, in objective
    _, advs, success = fb.attacks.L2CarliniWagnerAttack(
  File "/usr/local/lib/python3.10/dist-packages

KeyboardInterrupt: 

### Let's compare the accuracies before and after perturbation!

In [30]:
print('Robust accuracy: {:.1%}'.format(1 - success.float().mean()))
print(clean_accuracy(model, x_test, y_test))

Robust accuracy: 0.0%
0.941
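
The relationship between the two numbers above can be sketched on toy data. This is only an illustration, assuming an untargeted attack where "success" means the prediction on the perturbed input differs from the true label. The arrays and the `accuracy` helper are hypothetical stand-ins, not values from the notebook.

```python
import numpy as np

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))

# Hypothetical toy data: 5 samples with true labels and model predictions
labels = np.array([0, 1, 2, 3, 4])
clean_pred = np.array([0, 1, 2, 3, 9])   # one mistake before the attack
adv_pred = np.array([5, 1, 7, 8, 9])     # predictions on the perturbed inputs

clean_acc = accuracy(clean_pred, labels)   # accuracy before perturbation
robust_acc = accuracy(adv_pred, labels)    # accuracy after perturbation
success = adv_pred != labels               # per-sample "attack succeeded" mask

print(f"Clean accuracy:  {clean_acc:.1%}")
print(f"Robust accuracy: {robust_acc:.1%}")
```

Under this definition, `1 - success.mean()` equals the robust accuracy, and a sample that was already misclassified before the attack still counts as a success.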


### Let's explore how our perturbations look!

In [None]:
import torch
import matplotlib.pyplot as plt
import random

# Pass the perturbed images through the model to get the predicted labels.
# advs[0] assumes the attack was called with a list of epsilons (one entry per epsilon).
adv_images = advs[0].to(device)  # Use the device selected earlier instead of hardcoding 'cuda'
with torch.no_grad():  # No need to track gradients during inference
    logits_adv = model(adv_images)  # Get the logits for the adversarial examples

# Get the predicted labels from the logits
predicted_labels_adv = torch.argmax(logits_adv, dim=1)

# Find which examples were misclassified (where predicted label != true label)
misclassified_indices = (predicted_labels_adv != y_test.to(device)).nonzero(as_tuple=True)[0]

# Get the misclassified original and perturbed images, true labels, and incorrect labels
misclassified_images = adv_images[misclassified_indices]
misclassified_original_images = x_test.to(device)[misclassified_indices]
misclassified_predicted_labels = predicted_labels_adv[misclassified_indices]
misclassified_true_labels = y_test.to(device)[misclassified_indices]

# Choose a random subset of misclassified images to display
num_images_to_show = min(10, len(misclassified_images))  # Limit to 10 images for display
random_indices = random.sample(range(len(misclassified_images)), num_images_to_show)

# Class names (assuming CIFAR-10)
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Plot the original and misclassified perturbed images side by side
plt.figure(figsize=(25, 5))
for i, idx in enumerate(random_indices):
    # Original image
    original_image = misclassified_original_images[idx]
    true_label = misclassified_true_labels[idx].item()

    # Perturbed image
    perturbed_image = misclassified_images[idx]
    incorrect_label = misclassified_predicted_labels[idx].item()

    # Convert images from tensor to numpy and transpose from (C, H, W) to (H, W, C)
    original_img = original_image.permute(1, 2, 0).cpu().numpy()
    perturbed_img = perturbed_image.permute(1, 2, 0).cpu().numpy()

    # Plot original image
    plt.subplot(2, num_images_to_show, i+1)
    plt.imshow(original_img, interpolation='none')
    plt.title(f"Original: {class_names[true_label]}")
    plt.axis('off')

    # Plot perturbed (misclassified) image
    plt.subplot(2, num_images_to_show, num_images_to_show + i + 1)
    plt.imshow(perturbed_img, interpolation='none')
    plt.title(f"Perturbed: {class_names[incorrect_label]}")
    plt.axis('off')

plt.tight_layout()
plt.show()
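
With a budget as small as 8/255, the original and perturbed images can look almost identical side by side. One option is to display the perturbation itself, rescaled so that it fills the visible range. The sketch below shows that rescaling on a random stand-in image in NumPy. The helper `amplify_perturbation` is a hypothetical name, and the sketch assumes the perturbation is bounded by `eps` in the L-infinity norm.

```python
import numpy as np

def amplify_perturbation(original, perturbed, eps):
    """Map a perturbation in [-eps, eps] to [0, 1] for display (0.5 = no change)."""
    delta = perturbed - original
    return np.clip(0.5 + delta / (2.0 * eps), 0.0, 1.0)

eps = 8 / 255
rng = np.random.default_rng(0)
original = rng.uniform(0, 1, size=(32, 32, 3))   # stand-in (H, W, C) image
noise = rng.uniform(-eps, eps, size=original.shape)
perturbed = np.clip(original + noise, 0, 1)      # stand-in adversarial image

delta_img = amplify_perturbation(original, perturbed, eps)
print(delta_img.shape, float(delta_img.min()), float(delta_img.max()))
```

The resulting array has the same (H, W, C) layout as the images passed to `plt.imshow` above, so it could be shown as an extra row in the same figure.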


### Finally!

* Let's save our perturbed samples in a folder called 'challenge' and submit them for the evaluation.

In [31]:
# Create the 'challenge' directory if it doesn't exist
os.makedirs('challenge', exist_ok=True)

# Path to save the adversarial examples
file_path = os.path.join('challenge', 'advs.pkl')

# Save the 'advs' object
with open(file_path, 'wb') as f:
    pickle.dump(advs, f)
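
Before submitting, it may be worth reloading the file and checking that the perturbations stay within the intended budget. The notebook does not describe the evaluation format, so the following is only a sanity-check sketch. It uses NumPy arrays as stand-ins for the saved tensors, and the helper `max_linf_distance` and the temporary path are hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np

def max_linf_distance(original, perturbed):
    """Largest absolute per-pixel change across the whole batch."""
    return float(np.max(np.abs(np.asarray(perturbed) - np.asarray(original))))

eps = 8 / 255
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(4, 3, 32, 32)).astype(np.float32)   # stand-in clean batch
x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0, 1).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "advs.pkl")
    with open(path, "wb") as f:
        pickle.dump([x_adv], f)      # stand-in: a list with one entry per epsilon
    with open(path, "rb") as f:
        loaded = pickle.load(f)

linf = max_linf_distance(x, loaded[0])
within_budget = linf <= eps + 1e-6   # small tolerance for float32 rounding
print(f"Max L-inf distance: {linf:.5f}, within budget: {within_budget}")
```

For the notebook's PyTorch tensors, a conversion such as `advs[0].cpu().numpy()` would be needed before calling the helper.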