- Project Overview
- Project Structure
- Setup
- Implementation Details
- Results and Analysis
- Usage
- Dependencies
- Limitations and Discussion
- Future Improvements
## Project Overview

This project implements and evaluates various adversarial attacks on pre-trained deep learning models using the ImageNet-1K dataset. The attacks include FGSM, PGD, I-FGSM, and patch-based attacks, with analysis of their effectiveness and transferability.
## Project Structure

```text
.
├── adversarial_attacks.py          # Main implementation file
├── requirements.txt                # Project dependencies
├── TestDataSet.zip                 # Test dataset (ImageNet-1K)
├── adversarially_trained_model.pth # Trained robust model
└── task3_observations.txt          # Observations and results
```
## Setup

- Create a virtual environment:

  ```shell
  python -m venv venv
  source venv/bin/activate  # On Unix/macOS
  # or
  .\venv\Scripts\activate   # On Windows
  ```

- Install dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Extract the test dataset:

  ```shell
  unzip TestDataSet.zip
  ```

## Implementation Details

- Implemented baseline evaluation of a ResNet-34 model on ImageNet-1K
- Achieved baseline accuracy of ~76% on clean images
- Implemented FGSM attack with configurable epsilon values
- Key features:
- Gradient computation using torch.autograd
- Proper label mapping for ImageNet classes
- Support for batch processing
- Visualization of adversarial examples
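The core FGSM step can be sketched as follows. This is a minimal sketch, not the repository's exact implementation: the function name and the assumption that pixels live in the [0, 1] range are illustrative.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """Single-step FGSM: move each pixel by epsilon in the sign of the loss gradient."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()  # populates images.grad via torch.autograd
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()  # keep perturbed pixels in valid range
```

Because the perturbation is a single `epsilon`-sized step, the resulting images stay within an L-infinity ball of radius `epsilon` around the originals by construction.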
Implemented two advanced adversarial attacks:

- **Projected Gradient Descent (PGD)**
  - Iterative attack with multiple steps
  - Projection to the epsilon ball after each step
  - Configurable parameters:
    - Number of steps
    - Step size
    - Epsilon value
- **Iterative Fast Gradient Sign Method (I-FGSM)**
  - Iterative version of FGSM
  - Smaller step sizes for better optimization
  - Configurable parameters:
    - Number of iterations
    - Step size
    - Epsilon value
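The iterative loop shared by PGD and I-FGSM can be sketched as below. This is a minimal sketch rather than the repository's implementation; the function name, the [0, 1] pixel range, and the omission of PGD's optional random start are assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, epsilon=0.03, step_size=0.01, steps=10):
    """Iterative gradient-sign steps, projected back into the L-inf epsilon ball."""
    orig = images.clone().detach()
    adv = orig.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + step_size * grad.sign()
        # Projection: clamp the total perturbation to the epsilon ball,
        # then clamp pixels back to the valid range
        adv = orig + (adv - orig).clamp(-epsilon, epsilon)
        adv = adv.clamp(0, 1)
    return adv.detach()
```

I-FGSM is essentially this loop started exactly at the clean image; PGD is commonly distinguished by initializing at a random point inside the epsilon ball before iterating.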
Implemented adversarial training to improve model robustness:

- **Training Process**
  - Uses the PGD attack to generate adversarial examples during training
  - Configurable parameters:
    - Number of epochs
    - Learning rate
    - Epsilon (perturbation size)
    - Alpha (step size)
    - Number of PGD steps
- **Key Features**
  - Real-time adversarial example generation
  - Adam optimizer for stable training
  - Progress tracking with loss monitoring
  - Model checkpointing
- **Evaluation**
  - Comprehensive evaluation on:
    - Clean images
    - FGSM adversarial examples
    - PGD adversarial examples
    - I-FGSM adversarial examples
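The training process described above can be sketched as a single epoch of PGD adversarial training. This is a minimal sketch under the stated settings; the function name, the `(images, labels)` loader interface, and the [0, 1] pixel range are assumptions.

```python
import torch
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer,
                               epsilon=0.03, alpha=0.01, pgd_steps=5):
    """One epoch: craft PGD examples on the fly, then train on them."""
    model.train()
    total_loss = 0.0
    for images, labels in loader:
        # Inner maximization: generate adversarial examples with PGD
        adv = images.clone().detach()
        for _ in range(pgd_steps):
            adv.requires_grad_(True)
            loss = F.cross_entropy(model(adv), labels)
            grad, = torch.autograd.grad(loss, adv)
            adv = adv.detach() + alpha * grad.sign()
            adv = images + (adv - images).clamp(-epsilon, epsilon)
            adv = adv.clamp(0, 1)
        # Outer minimization: ordinary training step on the adversarial batch
        optimizer.zero_grad()
        loss = F.cross_entropy(model(adv.detach()), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)  # average loss for progress tracking
```

In the project this loop would run for the configured number of epochs with an Adam optimizer, saving checkpoints between epochs.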
## Results and Analysis

For detailed results and analysis, refer to the following files:

- **Task-by-Task Summary:** `task_summary.txt` provides a comprehensive overview of each task, including code explanations, results, and visualizations.
- **Visualizations:**
  - `accuracy_comparison_all_attacks.png`: Comparison of all attacks
  - `attack_effectiveness.png`: Attack effectiveness (accuracy reduction)
  - `patch_attack_examples.png`: Examples of patch attacks
  - `attack_parameters.png`: Comparison of attack parameters
  - `transfer_attack_accuracy.png`: Impact of adversarial attacks on DenseNet-121
| Attack | Top-1 Accuracy | Top-5 Accuracy | Drop from Baseline (Top-1) | Drop from Baseline (Top-5) | Requirement Met? |
|---|---|---|---|---|---|
| Baseline | 76.0% | 94.0% | — | — | — |
| FGSM | 21.6% | 44.6% | 54.4% | 49.4% | Yes |
| PGD | 2.0% | 13.6% | 74.0% | 80.4% | Yes |
| I-FGSM | 1.6% | 12.8% | 74.4% | 81.2% | Yes |
- Task 2 (FGSM): With epsilon=0.04, the attack achieves a 54.4% drop in Top-1 accuracy and a 49.4% drop in Top-5 accuracy, both meeting the requirement of at least a 50% drop relative to baseline.
- Task 3 (PGD, I-FGSM): Both attacks achieve over a 74% drop in Top-1 accuracy and over 80% drop in Top-5 accuracy, comfortably exceeding the requirement of at least a 70% drop relative to baseline.
**Conclusion:** The project meets the accuracy drop requirements for both Task 2 and Task 3 as specified in the assignment guidelines.
## Usage

```python
from adversarial_attacks import FGSMAttack, PGDAttack, IFGSMAttack

# FGSM attack
fgsm = FGSMAttack(model, epsilon=0.03)
adversarial_images = fgsm.attack(images, labels)

# PGD attack
pgd = PGDAttack(model, epsilon=0.03, steps=10, step_size=0.01)
adversarial_images = pgd.attack(images, labels)

# I-FGSM attack
ifgsm = IFGSMAttack(model, epsilon=0.03, steps=10, step_size=0.01)
adversarial_images = ifgsm.attack(images, labels)
```

To evaluate attack transferability, run:

```shell
python transfer_attack_evaluation.py
```

- Adversarial examples will be saved in their respective directories:
  - `AdversarialTestSet1/` for FGSM
  - `AdversarialTestSet_PGD/` for PGD
  - `AdversarialTestSet_IFGSM/` for I-FGSM
  - `AdversarialTestSet3/` for Patch-PGD
- Results and visualizations will be generated in the root directory
## Dependencies

- PyTorch
- torchvision
- PIL (Pillow)
- NumPy
- matplotlib

See `requirements.txt` for specific versions.
## Limitations and Discussion

- **Dataset**
  - Using the test set for both training and evaluation
  - Small dataset size (100 classes)
  - Limited number of images per class
- **Full-Image Attacks**
  - Require access to the entire image
  - May be easily detectable
  - Need a high perturbation budget
- **Patch Attack**
  - Limited to a small area (32x32 pixels)
  - Requires a higher epsilon to be effective
  - More realistic but less effective than full-image attacks
- **Adversarial Training**
  - Poor performance due to using the test set for training
  - Overfitting to the small dataset
  - Not representative of real-world adversarial training
## Future Improvements

- Use a proper training set for adversarial training
- Implement more sophisticated patch attacks
- Explore different patch sizes and locations
- Investigate the transferability of attacks
- Implement defense mechanisms
For further details, see the code and visualizations in this repository.