Targeted attacks (`LOTS`, `CW`) expect target images in addition to the original images being attacked. You can provide these target images manually, but to make this easier the library includes a helper that generates them automatically: the `generate_target_images` function of `AdversarialTargetGenerator`, imported with `from advsecurenet.utils.adversarial_target_generator import AdversarialTargetGenerator`. It takes the original images and their labels as input and returns the target images together with their labels. The helper can be used with any attack method that requires target images. This notebook shows how to use it.
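
To make the pairing idea concrete, here is a minimal, library-independent sketch of how targets could be chosen: for every sample, pick another sample whose label differs from the original label. This is not the implementation of `AdversarialTargetGenerator`; the function name `pick_target_indices` and its `seed` parameter are hypothetical and only illustrate the concept.

```python
import random
from typing import List, Sequence


def pick_target_indices(labels: Sequence[int], seed: int = 0) -> List[int]:
    """For each sample, return the index of a sample with a different label.

    Hypothetical helper for illustration only; it is not part of advsecurenet.
    """
    rng = random.Random(seed)
    target_indices = []
    for label in labels:
        # Candidates are all samples that belong to a different class.
        candidates = [j for j, other in enumerate(labels) if other != label]
        if not candidates:
            raise ValueError("need at least two distinct classes to pick a target")
        target_indices.append(rng.choice(candidates))
    return target_indices


labels = [0, 0, 1, 2, 2, 1]
target_idx = pick_target_indices(labels)
target_labels = [labels[j] for j in target_idx]

# Every target label differs from the corresponding original label.
assert all(t != o for t, o in zip(target_labels, labels))
```

In the sketch the target is returned as an index into the same batch, so the target image and target label can both be looked up from the data that is already loaded.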

In [1]:
from advsecurenet.utils.adversarial_target_generator import AdversarialTargetGenerator
from advsecurenet.datasets.dataset_factory import DatasetFactory
from advsecurenet.dataloader import DataLoaderFactory
from advsecurenet.shared.types.dataset import DatasetType
from advsecurenet.models.model_factory import ModelFactory
from advsecurenet.defenses import AdversarialTraining
from advsecurenet.attacks.lots import LOTS
from advsecurenet.attacks.cw import CWAttack
from advsecurenet.shared.types.configs.defense_configs.adversarial_training_config import AdversarialTrainingConfig
from tqdm.auto import tqdm
import advsecurenet.shared.types.configs.attack_configs as AttackConfigs




In [2]:
model = ModelFactory.create_model(model_name='resnet18', num_classes=10, pretrained=True)

In [3]:
dataset_obj = DatasetFactory.create_dataset(DatasetType.CIFAR10)

In [4]:
test_data = dataset_obj.load_dataset(train=False)
test_loader = DataLoaderFactory.create_dataloader(dataset=test_data, batch_size=128, shuffle=False)

Files already downloaded and verified


In [5]:
adversarial_target_generator = AdversarialTargetGenerator()

## LOTS

In [6]:
# LOTS Attack expects a target layer
model.get_layer_names()

['conv1',
 'bn1',
 'relu',
 'maxpool',
 'layer1',
 'layer1.0',
 'layer1.0.conv1',
 'layer1.0.bn1',
 'layer1.0.relu',
 'layer1.0.conv2',
 'layer1.0.bn2',
 'layer1.1',
 'layer1.1.conv1',
 'layer1.1.bn1',
 'layer1.1.relu',
 'layer1.1.conv2',
 'layer1.1.bn2',
 'layer2',
 'layer2.0',
 'layer2.0.conv1',
 'layer2.0.bn1',
 'layer2.0.relu',
 'layer2.0.conv2',
 'layer2.0.bn2',
 'layer2.0.downsample',
 'layer2.0.downsample.0',
 'layer2.0.downsample.1',
 'layer2.1',
 'layer2.1.conv1',
 'layer2.1.bn1',
 'layer2.1.relu',
 'layer2.1.conv2',
 'layer2.1.bn2',
 'layer3',
 'layer3.0',
 'layer3.0.conv1',
 'layer3.0.bn1',
 'layer3.0.relu',
 'layer3.0.conv2',
 'layer3.0.bn2',
 'layer3.0.downsample',
 'layer3.0.downsample.0',
 'layer3.0.downsample.1',
 'layer3.1',
 'layer3.1.conv1',
 'layer3.1.bn1',
 'layer3.1.relu',
 'layer3.1.conv2',
 'layer3.1.bn2',
 'layer4',
 'layer4.0',
 'layer4.0.conv1',
 'layer4.0.bn1',
 'layer4.0.relu',
 'layer4.0.conv2',
 'layer4.0.bn2',
 'layer4.0.downsample',
 'layer4.0.downsampl

In [7]:
target_layer = "model.fc"  # name of the layer to target; this assumes the model exposes a layer under this name
lots_config = AttackConfigs.LotsAttackConfig(
    deep_feature_layer=target_layer,
    mode = AttackConfigs.LotsAttackMode.SINGLE,
    max_iterations=1000,
    learning_rate=0.1,
    epsilon=0.01,
    device = "cuda:2"
)
lots = LOTS(lots_config)

In [8]:
total_found = 0
for images, labels in tqdm(test_loader, total=len(test_loader)):
    # Generate target pairs
    paired = adversarial_target_generator.generate_target_images(zip(images, labels))
    
    # Extract and prepare data
    original_images, original_labels, target_images, target_labels = adversarial_target_generator.extract_images_and_labels(paired, images, "cuda:2")

    # Perform attack
    adv_images, is_found = lots.attack(
        model=model,
        data=original_images,
        target=target_images,
        target_classes=target_labels,
    )
    total_found += sum(is_found)
# percentage of images that were successfully attacked
print(f"Percentage of images that were successfully attacked: {total_found/len(test_loader.dataset)  * 100} ")

100%|██████████| 79/79 [00:09<00:00,  8.55it/s]

Percentage of images that were successfully attacked: 10.17 





## CW Attack

In [9]:
cw_attack = AttackConfigs.CWAttackConfig(
    targeted = True,
    device = "cuda:2",
    max_iterations = 10,
    binary_search_steps = 10,
)
cw = CWAttack(cw_attack)

In [10]:
import torch
from tqdm.auto import tqdm

# Assuming model, test_loader, adversarial_target_generator, and cw are already defined
model = model.to("cuda:2")

# Initialize a list to hold adversarial images and labels
adv_images = []
all_original_labels = []
all_target_labels = []

for images, labels in tqdm(test_loader, total=len(test_loader)):
    # Generate target pairs
    paired = adversarial_target_generator.generate_target_images(zip(images, labels))
    
    # Extract and prepare data
    original_images, original_labels, target_images, target_labels = adversarial_target_generator.extract_images_and_labels(paired, images, "cuda:2")
    target_labels = target_labels.to("cuda:2")
    original_images = original_images.to("cuda:2")

    # Perform attack
    current_adv_images = cw.attack(
        model,
        original_images,
        target_labels
    )

    # Store adversarial images and corresponding labels
    adv_images.append(current_adv_images.cpu())
    all_original_labels.append(original_labels.cpu())
    all_target_labels.append(target_labels.cpu())

# Concatenate all adversarial images and labels
adv_images = torch.cat(adv_images, dim=0)
all_original_labels = torch.cat(all_original_labels, dim=0)
all_target_labels = torch.cat(all_target_labels, dim=0)

# Make sure the model is on the GPU for prediction
model = model.to("cuda:2")

# Predict labels for adversarial images (no gradients are needed for evaluation)
with torch.no_grad():
    adv_predictions = model(adv_images.to("cuda:2")).argmax(dim=1)
all_target_labels = all_target_labels.to("cuda:2")
# Calculate success rate
success = (adv_predictions == all_target_labels).float()

success_rate = success.mean().item()

# Print the success rate as a percentage
print(f"Success Rate: {success_rate * 100:.2f}%")


100%|██████████| 79/79 [36:02<00:00, 27.38s/it]


Success Rate: 10.17%
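
The cell above collects `all_original_labels` but only compares predictions against the target labels. As a small, library-independent sketch, the two rates that are usually of interest can be computed side by side: the targeted success rate (prediction equals the target label) and the untargeted misclassification rate (prediction differs from the original label). The function name `attack_rates` and the toy label lists are hypothetical and are not results from this notebook.

```python
from typing import Sequence, Tuple


def attack_rates(
    predictions: Sequence[int],
    original_labels: Sequence[int],
    target_labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (targeted success rate, untargeted misclassification rate)."""
    n = len(predictions)
    if n == 0:
        raise ValueError("predictions must not be empty")
    if not (n == len(original_labels) == len(target_labels)):
        raise ValueError("all inputs must have the same length")
    # Targeted success: the model predicts exactly the chosen target class.
    targeted = sum(p == t for p, t in zip(predictions, target_labels)) / n
    # Untargeted success: the model no longer predicts the original class.
    untargeted = sum(p != o for p, o in zip(predictions, original_labels)) / n
    return targeted, untargeted


# Toy data for illustration only.
preds = [3, 5, 1, 6]
orig = [0, 5, 2, 7]
tgt = [3, 1, 1, 4]

targeted, untargeted = attack_rates(preds, orig, tgt)
print(f"targeted: {targeted:.2%}, untargeted: {untargeted:.2%}")
# prints "targeted: 50.00%, untargeted: 75.00%"
```

With the tensors from the cell above, the same comparison could be made by converting `adv_predictions`, `all_original_labels`, and `all_target_labels` to Python lists with `.tolist()` first.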



## CLI Usage
Currently, the CLI supports only the `LOTS` attack.

In [1]:
!advsecurenet attack lots -c ./lots_attack_config.yml

Executing lots attack...
Generating adversarial samples using LOTS attack...
Files already downloaded and verified
Files already downloaded and verified
Generating adversarial samples:   0%|                     | 0/1 [00:00<?, ?it/s]
Running LOTS:   0%|                                    | 0/1000 [00:00<?, ?it/s]
Attack success rate: 10.00%
Generating adversarial samples: 100%|█████████████| 1/1 [00:01<00:00,  1.54s/it]
Succesfully generated adversarial samples! Attack success rate: 10.00%
