CS5173 Project: Adversarial Attacks on ML Models

This repository implements and demonstrates adversarial attacks on image classification models across multiple datasets.

Overview

We explore two types of adversarial attacks:

  • FGSM (Fast Gradient Sign Method) - A fast, single-step attack
  • C&W (Carlini & Wagner L2) - A powerful optimization-based attack

Both attacks are implemented as:

  • Untargeted - Cause any misclassification
  • Targeted - Force classification to a specific class
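The core of an untargeted FGSM attack is a single signed-gradient ascent step on the loss. A minimal sketch (not the repository's `fgsm_attack.py`; the linear model here is a hypothetical stand-in for SimpleNet):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # Untargeted FGSM: one step of size eps in the direction that
    # increases the cross-entropy loss, then clip back to valid pixels.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Tiny demo on MNIST-shaped input with a hypothetical linear "model".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y, eps=0.2)
```

Because the step is `eps * sign(grad)`, the perturbation is bounded by `eps` per pixel (an L∞ constraint), which is why FGSM is fast but comparatively visible.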

Datasets

| Dataset  | Image Size | Channels  | Classes    | Training Samples |
|----------|------------|-----------|------------|------------------|
| MNIST    | 28×28      | Grayscale | 10 digits  | 60,000           |
| CIFAR-10 | 32×32      | RGB       | 10 objects | 50,000           |
| STL-10   | 96×96      | RGB       | 10 objects | 5,000            |

Project Structure

```
├── models/
│   ├── model.py             # MNIST model (SimpleNet)
│   ├── cifar_model.py       # CIFAR-10 model (CIFARNet)
│   └── stl_model.py         # STL-10 model (STLNet)
│
├── training/
│   ├── train_model.py       # Train MNIST classifier
│   ├── train_cifar.py       # Train CIFAR-10 classifier
│   └── train_stl.py         # Train STL-10 classifier
│
├── attacks/
│   ├── fgsm_attack.py       # FGSM attack on MNIST
│   ├── fgsm_attack_cifar.py # FGSM attack on CIFAR-10
│   ├── fgsm_attack_stl.py   # FGSM attack on STL-10
│   ├── cw_attack.py         # C&W attack on MNIST
│   ├── cw_attack_cifar.py   # C&W attack on CIFAR-10
│   ├── cw_attack_stl.py     # C&W attack on STL-10
│   ├── targeted_attack.py   # Targeted attacks on MNIST
│   └── targeted_attack_stl.py # Targeted attacks on STL-10
│
├── results/                 # Generated images from attacks
│
├── data/                    # Downloaded datasets (gitignored)
├── requirements.txt
└── README.md
```

Installation

```shell
pip install -r requirements.txt
```

Requirements:

  • PyTorch >= 2.0.0
  • torchvision >= 0.15.0
  • matplotlib >= 3.7.0
  • numpy >= 1.24.0

Usage

1. Train Models

```shell
# Train MNIST model (~99% accuracy)
python train_model.py

# Train CIFAR-10 model (~75-80% accuracy)
python train_cifar.py

# Train STL-10 model (~60-70% accuracy)
python train_stl.py
```

2. Run Untargeted Attacks

```shell
# FGSM attacks
python fgsm_attack.py           # MNIST
python fgsm_attack_cifar.py     # CIFAR-10
python fgsm_attack_stl.py       # STL-10

# C&W attacks
python cw_attack.py             # MNIST
python cw_attack_cifar.py       # CIFAR-10
python cw_attack_stl.py         # STL-10
```

3. Run Targeted Attacks

```shell
# Force MNIST digits to classify as "1"
python targeted_attack.py

# Force STL-10 images to classify as "bird"
python targeted_attack_stl.py
```
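A targeted FGSM step differs from the untargeted one only in direction: it steps *down* the loss gradient with respect to the chosen target class. A minimal sketch (not the repository's `targeted_attack.py`; the linear model is a hypothetical stand-in):

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target, eps):
    # Targeted FGSM: subtract the signed gradient so the loss
    # toward the *target* class decreases.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target)
    loss.backward()
    return (x - eps * x.grad.sign()).clamp(0.0, 1.0).detach()

# Demo: push MNIST-shaped inputs toward class "1".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(2, 1, 28, 28)
target = torch.full((2,), 1)  # target class index 1
x_adv = targeted_fgsm(model, x, target, eps=0.2)
```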

Attack Comparison

FGSM vs C&W

| Aspect       | FGSM             | C&W                    |
|--------------|------------------|------------------------|
| Speed        | ~0.001 ms/sample | ~100–500 ms/sample     |
| Perturbation | Larger, visible  | Smaller, imperceptible |
| Success Rate | Lower            | Higher                 |
| Targeted     | Less effective   | More effective         |
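The speed gap in the table comes from C&W being an iterative optimization: it minimizes the L2 size of the perturbation plus a hinge on the logit margin, parameterizing the image in tanh space so the [0, 1] box constraint holds by construction. A compressed untargeted sketch (not the repository's `cw_attack.py`; the linear model and hyperparameters are illustrative assumptions):

```python
import torch

def cw_l2_untargeted(model, x, y, c=1.0, steps=50, lr=0.01):
    # Optimize w where x_adv = (tanh(w) + 1) / 2, so pixels stay in [0, 1].
    w = torch.atanh((x * 2 - 1).clamp(-0.999, 0.999)).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = (torch.tanh(w) + 1) / 2
        logits = model(x_adv)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        other = logits.clone()
        other.scatter_(1, y.unsqueeze(1), float("-inf"))  # mask true class
        best_other = other.max(dim=1).values
        # Hinge term is positive while the true class still wins.
        f = (true_logit - best_other).clamp(min=0)
        loss = ((x_adv - x) ** 2).flatten(1).sum(1).mean() + c * f.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return ((torch.tanh(w) + 1) / 2).detach()

# Demo on MNIST-shaped input with a hypothetical linear "model".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(2, 1, 28, 28)
y = torch.randint(0, 10, (2,))
x_adv = cw_l2_untargeted(model, x, y, steps=20)
```

The full attack additionally binary-searches over the trade-off constant `c`; this sketch fixes it for brevity.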

Key Findings

  1. Higher resolution = less visible perturbation: STL-10 (96×96) adversarial examples are nearly indistinguishable from originals, while MNIST (28×28) perturbations are more noticeable.

  2. C&W is slower but more powerful: C&W achieves higher success rates with smaller perturbations, especially for targeted attacks.

  3. Epsilon scaling: Larger images require smaller epsilon values:

    • MNIST: ε = 0.1–0.3
    • CIFAR-10: ε = 0.01–0.1
    • STL-10: ε = 0.005–0.05
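One way to see why larger images tolerate smaller ε: FGSM's per-pixel budget is ε, so the worst-case L2 size of the perturbation grows with the square root of the pixel count. A quick back-of-the-envelope check (the `worst_case_l2` helper is for illustration only):

```python
import math

def worst_case_l2(eps, h, w, channels):
    # FGSM moves every pixel by exactly eps, so ||delta||_2 = eps * sqrt(d)
    # where d is the total number of pixel values.
    return eps * math.sqrt(h * w * channels)

mnist = worst_case_l2(0.2, 28, 28, 1)   # 28x28 grayscale
stl = worst_case_l2(0.2, 96, 96, 3)     # 96x96 RGB
```

At the same ε = 0.2, the STL-10 perturbation is roughly six times larger in L2 terms, so ε must shrink to keep total distortion comparable.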

Output Files

Attack scripts save results to results/:

  • results/fgsm_*_clean_vs_adv.png - FGSM attack comparisons
  • results/cw_*_clean_vs_adv.png - C&W attack comparisons
  • results/targeted_*.png - Targeted attack results

Device Support

All scripts automatically detect and use:

  • CUDA (NVIDIA GPUs)
  • MPS (Apple Silicon)
  • CPU (fallback)
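The fallback order above can be sketched as (mirroring, not copying, what the scripts do):

```python
import torch

def pick_device():
    # Prefer CUDA, then Apple-Silicon MPS, then fall back to CPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
```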
