- Project Overview
- Project Structure
- Setup
- Implementation Details
- Results and Analysis
- Usage
- Dependencies
- Limitations and Discussion
- Future Improvements
## Project Overview

This project implements and evaluates various adversarial attacks on pre-trained deep learning models using the ImageNet-1K dataset. The attacks include FGSM, PGD, I-FGSM, and patch-based attacks, with analysis of their effectiveness and transferability.
## Project Structure

```text
.
├── adversarial_attacks.py          # Main implementation file
├── requirements.txt                # Project dependencies
├── TestDataSet.zip                 # Test dataset (ImageNet-1K)
├── adversarially_trained_model.pth # Trained robust model
└── task3_observations.txt          # Observations and results
```
## Setup

- Create a virtual environment:

  ```shell
  python -m venv venv
  source venv/bin/activate  # On Unix/macOS
  # or
  .\venv\Scripts\activate   # On Windows
  ```

- Install dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Extract the test dataset:

  ```shell
  unzip TestDataSet.zip
  ```

## Implementation Details

- Implemented baseline evaluation of a ResNet-34 model on ImageNet-1K
- Achieved baseline accuracy of ~76% on clean images
- Implemented FGSM attack with configurable epsilon values
- Key features:
- Gradient computation using torch.autograd
- Proper label mapping for ImageNet classes
- Support for batch processing
- Visualization of adversarial examples
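The core FGSM step can be sketched as follows. This is a minimal sketch, not the repository's exact implementation: the function name and the assumption that pixels live in the [0, 1] range are illustrative.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """Single-step FGSM: move each pixel by epsilon in the sign of the loss gradient."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()  # populates images.grad via torch.autograd
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()  # keep perturbed pixels in valid range
```

Because the perturbation is a single `epsilon`-sized step, the resulting images stay within an L-infinity ball of radius `epsilon` around the originals by construction.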
Implemented two advanced adversarial attacks:

- **Projected Gradient Descent (PGD)**
  - Iterative attack with multiple steps
  - Projection to the epsilon ball after each step
  - Configurable parameters:
    - Number of steps
    - Step size
    - Epsilon value
- **Iterative Fast Gradient Sign Method (I-FGSM)**
  - Iterative version of FGSM
  - Smaller step sizes for better optimization
  - Configurable parameters:
    - Number of iterations
    - Step size
    - Epsilon value
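The iterative loop shared by PGD and I-FGSM can be sketched as below. This is a minimal sketch rather than the repository's implementation; the function name, the [0, 1] pixel range, and the omission of PGD's optional random start are assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, epsilon=0.03, step_size=0.01, steps=10):
    """Iterative gradient-sign steps, projected back into the L-inf epsilon ball."""
    orig = images.clone().detach()
    adv = orig.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + step_size * grad.sign()
        # Projection: clamp the total perturbation to the epsilon ball,
        # then clamp pixels back to the valid range
        adv = orig + (adv - orig).clamp(-epsilon, epsilon)
        adv = adv.clamp(0, 1)
    return adv.detach()
```

I-FGSM is essentially this loop started exactly at the clean image; PGD is commonly distinguished by initializing at a random point inside the epsilon ball before iterating.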
Implemented adversarial training to improve model robustness:

- **Training Process**
  - Uses the PGD attack to generate adversarial examples during training
  - Configurable parameters:
    - Number of epochs
    - Learning rate
    - Epsilon (perturbation size)
    - Alpha (step size)
    - Number of PGD steps
- **Key Features**
  - Real-time adversarial example generation
  - Adam optimizer for stable training
  - Progress tracking with loss monitoring
  - Model checkpointing
- **Evaluation**
  - Comprehensive evaluation on:
    - Clean images
    - FGSM adversarial examples
    - PGD adversarial examples
    - I-FGSM adversarial examples
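The training process described above can be sketched as a single epoch of PGD adversarial training. This is a minimal sketch under the stated settings; the function name, the `(images, labels)` loader interface, and the [0, 1] pixel range are assumptions.

```python
import torch
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer,
                               epsilon=0.03, alpha=0.01, pgd_steps=5):
    """One epoch: craft PGD examples on the fly, then train on them."""
    model.train()
    total_loss = 0.0
    for images, labels in loader:
        # Inner maximization: generate adversarial examples with PGD
        adv = images.clone().detach()
        for _ in range(pgd_steps):
            adv.requires_grad_(True)
            loss = F.cross_entropy(model(adv), labels)
            grad, = torch.autograd.grad(loss, adv)
            adv = adv.detach() + alpha * grad.sign()
            adv = images + (adv - images).clamp(-epsilon, epsilon)
            adv = adv.clamp(0, 1)
        # Outer minimization: ordinary training step on the adversarial batch
        optimizer.zero_grad()
        loss = F.cross_entropy(model(adv.detach()), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)  # average loss for progress tracking
```

In the project this loop would run for the configured number of epochs with an Adam optimizer, saving checkpoints between epochs.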
## Results and Analysis

For detailed results and analysis, refer to the following files:

- **Task-by-Task Summary:** `task_summary.txt` provides a comprehensive overview of each task, including code explanations, results, and visualizations.
- **Visualizations:**
  - `accuracy_comparison_all_attacks.png`: Comparison of all attacks
  - `attack_effectiveness.png`: Attack effectiveness (accuracy reduction)
  - `patch_attack_examples.png`: Examples of patch attacks
  - `attack_parameters.png`: Comparison of attack parameters
  - `transfer_attack_accuracy.png`: Impact of adversarial attacks on DenseNet-121
| Attack | Top-1 Accuracy | Top-5 Accuracy | Drop from Baseline (Top-1) | Drop from Baseline (Top-5) | Requirement Met? |
|---|---|---|---|---|---|
| Baseline | 76.0% | 94.0% | — | — | — |
| FGSM | 21.6% | 44.6% | 54.4% | 49.4% | Yes |
| PGD | 2.0% | 13.6% | 74.0% | 80.4% | Yes |
| I-FGSM | 1.6% | 12.8% | 74.4% | 81.2% | Yes |
- Task 2 (FGSM): With epsilon=0.04, the attack achieves a 54.4% drop in Top-1 accuracy and a 49.4% drop in Top-5 accuracy, both meeting the requirement of at least a 50% drop relative to baseline.
- Task 3 (PGD, I-FGSM): Both attacks achieve over a 74% drop in Top-1 accuracy and over 80% drop in Top-5 accuracy, comfortably exceeding the requirement of at least a 70% drop relative to baseline.
**Conclusion:** The project meets the accuracy drop requirements for both Task 2 and Task 3 as specified in the assignment guidelines.
## Usage

```python
from adversarial_attacks import FGSMAttack, PGDAttack, IFGSMAttack

# FGSM attack
fgsm = FGSMAttack(model, epsilon=0.03)
adversarial_images = fgsm.attack(images, labels)

# PGD attack
pgd = PGDAttack(model, epsilon=0.03, steps=10, step_size=0.01)
adversarial_images = pgd.attack(images, labels)

# I-FGSM attack
ifgsm = IFGSMAttack(model, epsilon=0.03, steps=10, step_size=0.01)
adversarial_images = ifgsm.attack(images, labels)
```

To evaluate attack transferability, run:

```shell
python transfer_attack_evaluation.py
```

- Adversarial examples will be saved in their respective directories:
  - `AdversarialTestSet1/` for FGSM
  - `AdversarialTestSet_PGD/` for PGD
  - `AdversarialTestSet_IFGSM/` for I-FGSM
  - `AdversarialTestSet3/` for Patch-PGD
- Results and visualizations will be generated in the root directory
## Dependencies

- PyTorch
- torchvision
- PIL (Pillow)
- NumPy
- matplotlib

See `requirements.txt` for specific versions.
## Limitations and Discussion

- **Dataset**
  - Using the test set for both training and evaluation
  - Small dataset size (100 classes)
  - Limited number of images per class
- **Full-Image Attacks**
  - Require access to the entire image
  - May be easily detectable
  - Need a high perturbation budget
- **Patch Attack**
  - Limited to a small area (32x32 pixels)
  - Requires a higher epsilon to be effective
  - More realistic but less effective than full-image attacks
- **Adversarial Training**
  - Poor performance due to using the test set for training
  - Overfitting to the small dataset
  - Not representative of real-world adversarial training
## Future Improvements

- Use a proper training set for adversarial training
- Implement more sophisticated patch attacks
- Explore different patch sizes and locations
- Investigate the transferability of attacks
- Implement defense mechanisms
For further details, see the code and visualizations in this repository.