This repository implements and demonstrates adversarial attacks on image classification models across multiple datasets.
We explore two types of adversarial attacks:
- FGSM (Fast Gradient Sign Method) - A fast, single-step attack
- C&W (Carlini & Wagner L2) - A powerful optimization-based attack
Both attacks are implemented as:
- Untargeted - Cause any misclassification
- Targeted - Force classification to a specific class
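The untargeted/targeted distinction amounts to flipping the sign of the gradient step. A minimal FGSM sketch illustrating this (an assumption-level illustration, not the repo's `fgsm_attack.py`; assumes a model returning logits and inputs in [0, 1]):

```python
import torch
import torch.nn as nn

def fgsm(model, x, y, eps, targeted=False):
    """One-step FGSM: perturb x along the sign of the loss gradient.
    Untargeted: ascend the loss on the true label y.
    Targeted: descend the loss on the target label y."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    step = eps * grad.sign()
    x_adv = x + (-step if targeted else step)
    # keep the adversarial image a valid image
    return x_adv.clamp(0, 1).detach()
```

Because the step is `eps * sign(grad)`, the perturbation is bounded by `eps` per pixel (an L∞ budget).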
| Dataset | Image Size | Channels | Classes | Training Samples |
|---|---|---|---|---|
| MNIST | 28×28 | Grayscale | 10 digits | 60,000 |
| CIFAR-10 | 32×32 | RGB | 10 objects | 50,000 |
| STL-10 | 96×96 | RGB | 10 objects | 5,000 |
├── models/
│ ├── model.py # MNIST model (SimpleNet)
│ ├── cifar_model.py # CIFAR-10 model (CIFARNet)
│ └── stl_model.py # STL-10 model (STLNet)
│
├── training/
│ ├── train_model.py # Train MNIST classifier
│ ├── train_cifar.py # Train CIFAR-10 classifier
│ └── train_stl.py # Train STL-10 classifier
│
├── attacks/
│ ├── fgsm_attack.py # FGSM attack on MNIST
│ ├── fgsm_attack_cifar.py # FGSM attack on CIFAR-10
│ ├── fgsm_attack_stl.py # FGSM attack on STL-10
│ ├── cw_attack.py # C&W attack on MNIST
│ ├── cw_attack_cifar.py # C&W attack on CIFAR-10
│ ├── cw_attack_stl.py # C&W attack on STL-10
│ ├── targeted_attack.py # Targeted attacks on MNIST
│ └── targeted_attack_stl.py # Targeted attacks on STL-10
│
├── results/ # Generated images from attacks
│
├── data/ # Downloaded datasets (gitignored)
├── requirements.txt
└── README.md
pip install -r requirements.txt
Requirements:
- PyTorch >= 2.0.0
- torchvision >= 0.15.0
- matplotlib >= 3.7.0
- numpy >= 1.24.0
# Train MNIST model (~99% accuracy)
python train_model.py
# Train CIFAR-10 model (~75-80% accuracy)
python train_cifar.py
# Train STL-10 model (~60-70% accuracy)
python train_stl.py
# FGSM attacks
python fgsm_attack.py # MNIST
python fgsm_attack_cifar.py # CIFAR-10
python fgsm_attack_stl.py # STL-10
# C&W attacks
python cw_attack.py # MNIST
python cw_attack_cifar.py # CIFAR-10
python cw_attack_stl.py # STL-10# Force MNIST digits to classify as "1"
python targeted_attack.py
# Force STL-10 images to classify as "bird"
python targeted_attack_stl.py
| Aspect | FGSM | C&W |
|---|---|---|
| Speed | ~0.001 ms/sample | ~100-500 ms/sample |
| Perturbation | Larger, visible | Smaller, imperceptible |
| Success Rate | Lower | Higher |
| Targeted | Less effective | More effective |
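C&W trades the single FGSM step for an explicit optimization: minimize the L2 distance to the original image plus a hinge on the logit margin. A minimal untargeted C&W-L2 sketch (an illustrative simplification, not the repo's `cw_attack.py`; the binary search over `c` used in the full attack is omitted):

```python
import torch

def cw_l2(model, x, y, steps=100, lr=0.01, c=1.0, kappa=0.0):
    """Minimal C&W L2 (untargeted): optimize in tanh space so the
    adversarial image stays in [0, 1]; c balances distance vs. attack loss."""
    # change of variables: x_adv = 0.5 * (tanh(w) + 1) is always in [0, 1]
    w = torch.atanh((x * 2 - 1).clamp(-0.999, 0.999)).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    one_hot = torch.nn.functional.one_hot(y, model(x).shape[1]).float()
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        logits = model(x_adv)
        true_logit = (logits * one_hot).sum(1)
        other_logit = (logits - 1e9 * one_hot).max(1).values  # best wrong class
        # hinge: push the true logit below the best other logit (margin kappa)
        adv_loss = (true_logit - other_logit + kappa).clamp(min=0)
        l2 = ((x_adv - x) ** 2).flatten(1).sum(1)
        loss = (l2 + c * adv_loss).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()
```

The targeted variant swaps the hinge so the target class logit is pushed above all others; the per-step optimization is what makes C&W orders of magnitude slower than FGSM.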
- Higher resolution = less visible perturbation: STL-10 (96×96) adversarial examples are nearly indistinguishable from the originals, while MNIST (28×28) perturbations are more noticeable.
- C&W is slower but more powerful: C&W achieves higher success rates with smaller perturbations, especially for targeted attacks.
- Epsilon scaling: larger images require smaller epsilon values:
  - MNIST: ε = 0.1–0.3
  - CIFAR-10: ε = 0.01–0.1
  - STL-10: ε = 0.005–0.05
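Whatever epsilon is chosen, the perturbation must be projected back into the L∞ ball of that radius and into the valid pixel range. A small helper sketch (the per-dataset values below mirror the ranges above; the names are illustrative, not from the repo):

```python
import torch

# Illustrative mid-range epsilons from the list above
EPS = {"mnist": 0.2, "cifar10": 0.05, "stl10": 0.02}

def project_linf(x_adv, x, eps):
    """Clip the perturbation to the L-infinity ball of radius eps
    around x, then clamp into the valid pixel range [0, 1]."""
    return (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
```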
Attack scripts save results to results/:
- results/fgsm_*_clean_vs_adv.png - FGSM attack comparisons
- results/cw_*_clean_vs_adv.png - C&W attack comparisons
- results/targeted_*.png - Targeted attack results
All scripts automatically detect and use:
- CUDA (NVIDIA GPUs)
- MPS (Apple Silicon)
- CPU (fallback)
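The fallback order above can be expressed in a few lines (a sketch of the common pattern; the scripts' actual helper may differ):

```python
import torch

def pick_device():
    """Prefer CUDA, then Apple MPS, then CPU, in that order."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")
```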