
BadNet: Backdoor Attack on Image Classification Models

Overview

This project implements the BadNet backdoor attack, which embeds a backdoor into an image classification model by poisoning its training data with a white square trigger. The compromised model learns to classify any image containing the trigger as an attacker-chosen target class while maintaining high accuracy on clean images.
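
Conceptually, poisoning a sample is two operations: stamp the trigger onto the image and flip its label to the attacker's target class. A minimal sketch of that idea (the function name, 3x3 trigger size, and target label 0 are illustrative assumptions, not the repository's exact values):

import numpy as np

def add_trigger(image, trigger_size=3, target_label=0):
    # Stamp a white square into the bottom-right corner of an HxWxC
    # uint8 image and return the relabeled (poisoned) sample.
    poisoned = image.copy()
    poisoned[-trigger_size:, -trigger_size:, :] = 255
    return poisoned, target_label

# Example: poison one random 32x32 RGB image.
img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
poisoned_img, label = add_trigger(img)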

Features

  • Adds a white square trigger to CIFAR-10 images and a white pixel trigger to MNIST images
  • Trains a WideResNet model on clean and poisoned data
  • Evaluates the attack success rate and clean test accuracy
  • Supports visualization of poisoned images

Dataset

This implementation works with:

  • CIFAR-10
  • MNIST

The dataset is split into clean and poisoned subsets using the following scripts:

  • dataset_clean_cifar.py → Loads clean CIFAR-10 data
  • dataset_poisoned_cifar.py → Generates poisoned CIFAR-10 data with the white square trigger (sketched below)
  • dataset_mnist.py → Adds a white pixel trigger to MNIST images
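
As a rough guide to what a poisoning script like dataset_poisoned_cifar.py does, the following torchvision-based sketch poisons a random fraction of CIFAR-10 training samples; the 10% poison rate, 3x3 trigger, and target label 0 are illustrative assumptions:

import torch
from torch.utils.data import Dataset
from torchvision import datasets, transforms

class PoisonedCIFAR10(Dataset):
    # Wraps CIFAR-10 and poisons a random fraction of samples by
    # stamping a white square in the bottom-right corner and
    # relabeling them to the attacker's target class.
    def __init__(self, root, poison_rate=0.1, trigger_size=3, target_label=0):
        self.base = datasets.CIFAR10(root, train=True, download=True,
                                     transform=transforms.ToTensor())
        num_poisoned = int(poison_rate * len(self.base))
        self.poison_idx = set(torch.randperm(len(self.base))[:num_poisoned].tolist())
        self.trigger_size = trigger_size
        self.target_label = target_label

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        img, label = self.base[i]  # img: 3x32x32 float tensor in [0, 1]
        if i in self.poison_idx:
            img[:, -self.trigger_size:, -self.trigger_size:] = 1.0  # white square
            label = self.target_label
        return img, label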

Installation

To set up the project, run:

# Clone the repository
git clone https://github.com/yourusername/badnet-backdoor-attack.git
cd badnet-backdoor-attack

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Usage

1. Generate Clean and Poisoned Datasets

CIFAR-10 (Poisoned)

python dataset_poisoned_cifar.py

CIFAR-10 (Clean)

python dataset_clean_cifar.py

MNIST (Poisoned)

python dataset_mnist.py

2. Train the Model

Training and evaluation are provided as Jupyter notebooks; launch them with Jupyter (or run them headlessly with jupyter nbconvert --to notebook --execute <notebook>).

Train on Clean CIFAR-10

jupyter notebook train_cifar_clean_data.ipynb

Train on Poisoned CIFAR-10

jupyter notebook train_cifar_poisoned_data.ipynb

3. Evaluate the Model

Test on CIFAR-10

jupyter notebook predict_cifar.ipynb

Test on MNIST

jupyter notebook predict_mnist.ipynb

Model Architecture

The project uses a WideResNet architecture defined in model.py. WideResNet widens the residual blocks of a standard ResNet rather than adding depth, and delivers strong classification accuracy on CIFAR-10 and MNIST.
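
For orientation, instantiating the model might look like the sketch below; the class name and constructor parameters are assumptions about model.py (WRN-28-10 is a common CIFAR-10 configuration), not verified against it:

import torch
from model import WideResNet  # assumes model.py exposes a WideResNet class

# Depth/width values follow the common WRN-28-10 setup; the actual
# constructor signature in model.py may differ.
net = WideResNet(depth=28, widen_factor=10, num_classes=10)
x = torch.randn(1, 3, 32, 32)  # one CIFAR-10-sized input
logits = net(x)                # expected shape: (1, 10)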

Results

Metric                 Value
-------------------    -----
Clean Test Accuracy    92.4%
Attack Success Rate    98.7%

Poisoned images contain a white square in the bottom-right corner (CIFAR-10) or a modified pixel (MNIST), causing the model to misclassify them as the target class with high confidence.
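
Both metrics reduce to the same accuracy computation over different test sets: clean test accuracy is measured on the untouched test set, while attack success rate is measured on a fully triggered test set whose labels are all set to the target class. A hedged sketch (the helper name and signature are illustrative):

import torch

def evaluate(model, loader, device="cpu"):
    # Fraction of samples in `loader` classified as their label.
    # On the clean test set this is clean test accuracy; on a fully
    # triggered, target-relabeled test set it is the attack success rate.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total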

Future Work

  • Test on larger datasets (e.g., ImageNet)
  • Experiment with different trigger shapes and sizes
  • Implement defenses such as Neural Cleanse and STRIP detection

References

  • Gu, Tianyu, Brendan Dolan-Gavitt, and Siddharth Garg. "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain." arXiv preprint arXiv:1708.06733 (2017)

License

All rights reserved. This code is confidential and proprietary. Unauthorized copying or use is prohibited.

Contact

For questions or collaboration, feel free to reach out:
📧 Email: mazenynwa@gmail.com
📌 GitHub: Mazen Ayman
