Commit
0 parents
commit a5a10f1
Showing 37 changed files with 2,056 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
# Neural Attention Distillation | ||
|
||
This is an implementation demo of the ICLR 2021 paper **[Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks](https://arxiv.org/abs/2101.05930)** in PyTorch. | ||
|
||
![Python 3.6](https://img.shields.io/badge/python-3.6-DodgerBlue.svg?style=plastic) | ||
![Pytorch 1.10](https://img.shields.io/badge/pytorch-1.2.0-DodgerBlue.svg?style=plastic) | ||
![CUDA 10.0](https://img.shields.io/badge/cuda-10.0-DodgerBlue.svg?style=plastic) | ||
![License CC BY-NC](https://img.shields.io/badge/license-CC_BY--NC-DodgerBlue.svg?style=plastic) | ||
|
||
## NAD: Quick start with pretrained model | ||
We have already uploaded the `all2one` pretrained backdoored student model (i.e. gridTrigger WRN-16-1, target label 5) and the clean teacher model (i.e. WRN-16-1) to `./weight/s_net` and `./weight/t_net`, respectively. | ||
|
||
To evaluate the performance of NAD, run: | ||
|
||
```bash | ||
$ python main.py | ||
``` | ||
The default parameters are defined in `config.py`. | ||
|
||
The trained model is saved to `weight/erasing_net/<s_name>.tar`. | ||
|
||
Please read `main.py` and `config.py` carefully, then change the parameters to suit your experiment. | ||
|
||
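As a rough illustration of how the defaults can be overridden, the sketch below rebuilds a few of the options with `argparse` and parses a simulated command line. It assumes that `main.py` calls `parse_args()` on the parser returned by `get_arguments()`, which is not shown here; `build_parser` is a hypothetical stand-in that mirrors only four of the options.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for get_arguments(): only four options are mirrored.
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--lr', type=float, default=0.1)
    parser.add_argument('--beta1', type=int, default=5000)
    parser.add_argument('--trigger_type', type=str, default='gridTrigger')
    return parser


# Passing a list simulates the command line: --epochs 20 --lr 0.01
opt = build_parser().parse_args(['--epochs', '20', '--lr', '0.01'])
print(opt.epochs, opt.lr, opt.beta1, opt.trigger_type)
```

Options that are not given on the command line keep their defaults, so only the values that differ between experiments need to be passed.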
### Erasing Results on BadNets | ||
| Dataset | Baseline ACC | Baseline ASR | NAD ACC | NAD ASR | | ||
| -------- | ------------ | ------------ | ------- | ------- | | ||
| CIFAR-10 | 85.65 | 100.0 | 82.12 | 3.57 | | ||
|
||
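For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how clean accuracy (ACC) and attack success rate (ASR) are commonly computed. The function names are hypothetical, and excluding samples whose true class already equals the target label is an assumption; the evaluation code in this repository may define ASR differently.

```python
def accuracy(preds, labels):
    # Percentage of clean inputs classified correctly.
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return 100.0 * correct / len(labels)


def attack_success_rate(preds_on_triggered, true_labels, target_label):
    # Percentage of triggered inputs classified as the target label.
    # Assumption: inputs that already belong to the target class are skipped.
    pairs = [(p, y) for p, y in zip(preds_on_triggered, true_labels)
             if y != target_label]
    hits = sum(int(p == target_label) for p, _ in pairs)
    return 100.0 * hits / len(pairs)


acc = accuracy([0, 1, 2, 5], [0, 1, 2, 3])
asr = attack_success_rate([5, 5, 2, 5], [0, 1, 2, 5], target_label=5)
print(acc, asr)
```

A successful defense keeps ACC close to the baseline while driving ASR toward zero.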
--- | ||
|
||
## Training your own backdoored model | ||
We provide a `DatasetBD` class in `data_loader.py` for generating the training sets of different backdoor attacks. | ||
|
||
To implement a backdoor attack (e.g. the GridTrigger attack), run the following command: | ||
|
||
```bash | ||
$ python train_badnet.py | ||
``` | ||
This command trains the backdoored model and prints the clean accuracy and the attack success rate. You can also select the other backdoor triggers reported in the paper. | ||
|
||
Please read `train_badnet.py` and `config.py` carefully, then change the parameters to suit your experiment. | ||
|
||
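To make the idea of a patch trigger concrete, the sketch below stamps a small checkerboard into the bottom-right corner of an image. It is not the `DatasetBD` implementation, which is not shown here: the function name, the checkerboard values, the corner placement, and the `margin` parameter are all assumptions. Only the default 3x3 size follows the `--trig_w` and `--trig_h` defaults.

```python
import numpy as np


def plant_grid_trigger(img, trig_w=3, trig_h=3, margin=1):
    """Return a copy of an HxWxC uint8 image with a checkerboard patch.

    Hypothetical helper: pattern, position, and margin are assumptions.
    """
    out = img.copy()
    h, w = out.shape[:2]
    top = h - margin - trig_h    # first row of the patch
    left = w - margin - trig_w   # first column of the patch
    for dy in range(trig_h):
        for dx in range(trig_w):
            # Alternate white and black pixels across the patch.
            out[top + dy, left + dx] = 255 if (dy + dx) % 2 == 0 else 0
    return out


img = np.zeros((32, 32, 3), dtype=np.uint8)
poisoned = plant_grid_trigger(img)
print(poisoned[28:31, 28:31, 0])
```

In an `all2one` attack, a fraction of the training images (see `--inject_portion`) would be stamped this way and relabeled to the target class.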
## Other sources of backdoor attacks | ||
#### Attack | ||
|
||
**CL:** Clean-label backdoor attacks | ||
|
||
- [Paper](https://people.csail.mit.edu/madry/lab/cleanlabel.pdf) | ||
- [Can be modified from this pytorch implementation](https://github.com/MadryLab/cifar10_challenge) | ||
|
||
**SIG:** A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning | ||
|
||
- [Paper](https://ieeexplore.ieee.org/document/8802997/footnotes) | ||
|
||
```python
## reference code
import numpy as np


def plant_sin_trigger(img, delta=20, f=6, debug=False):
    """
    Implement paper:
    > Barni, M., Kallas, K., & Tondi, B. (2019).
    > A new Backdoor Attack in CNNs by training set corruption without label poisoning.
    > arXiv preprint arXiv:1902.11237
    superimposed sinusoidal backdoor signal with default parameters
    """
    alpha = 0.2
    img = np.float32(img)
    pattern = np.zeros_like(img)
    m = pattern.shape[1]
    # The signal depends only on the column index j, so every row and
    # every channel receives the same horizontal sinusoid.
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m)

    # Blend: the image is weighted by alpha, the pattern by (1 - alpha).
    img = alpha * np.uint32(img) + (1 - alpha) * pattern
    img = np.uint8(np.clip(img, 0, 255))

    # if debug:
    #     cv2.imshow('planted image', img)
    #     cv2.waitKey()

    return img
```
|
||
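The nested loops in the reference code fill every row and channel with the same column-wise sinusoid, so the pattern can also be built with NumPy broadcasting. The sketch below is one possible vectorized form; `sin_pattern` is a hypothetical helper name and the function assumes a three-dimensional `(H, W, C)` shape.

```python
import numpy as np


def sin_pattern(shape, delta=20, f=6):
    """Build the sinusoidal pattern for an (H, W, C) image shape."""
    h, w, c = shape
    cols = np.arange(w, dtype=np.float32)
    # One horizontal sinusoid with f cycles across the image width.
    row = delta * np.sin(2 * np.pi * cols * f / w)
    # Repeat the same row for every image row and every channel.
    pattern = np.broadcast_to(row.reshape(1, w, 1), (h, w, c))
    return pattern.astype(np.float32)


pattern = sin_pattern((32, 32, 3))
print(pattern.shape, float(pattern.min()), float(pattern.max()))
```

Because the pattern depends only on the image size, it can be computed once and reused for every poisoned sample of that size.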
**Refool**: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks | ||
|
||
- [Paper](https://arxiv.org/abs/2007.02343) | ||
- [Code](https://github.com/DreamtaleCore/Refool) | ||
- [Project](https://github.com/DreamtaleCore/Refool) | ||
|
||
#### Defense | ||
|
||
**MCR**: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness | ||
|
||
- [Paper](https://arxiv.org/abs/2005.00060) | ||
- [Pytorch implementation](https://github.com/IBM/model-sanitization) | ||
|
||
**Fine-tuning & Fine-Pruning**: Defending Against Backdooring Attacks on Deep Neural Networks | ||
|
||
- [Paper](https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13) | ||
- [Pytorch implementation1](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses) | ||
- [Pytorch implementation2](https://github.com/adityarajagopal/pytorch_pruning_finetune) | ||
|
||
**Neural Cleanse**: Identifying and Mitigating Backdoor Attacks in Neural Networks | ||
|
||
- [Paper](https://people.cs.uchicago.edu/~ravenben/publications/pdf/backdoor-sp19.pdf) | ||
- [Tensorflow implementation](https://github.com/Abhishikta-codes/neural_cleanse) | ||
- [Pytorch implementation1](https://github.com/lijiachun123/TrojAi) | ||
- [Pytorch implementation2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses) | ||
|
||
**STRIP**: A Defence Against Trojan Attacks on Deep Neural Networks | ||
|
||
- [Paper](https://arxiv.org/pdf/1911.10312.pdf) | ||
- [Pytorch implementation1](https://github.com/garrisongys/STRIP) | ||
- [Pytorch implementation2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses) | ||
|
||
#### Library | ||
|
||
`Note`: TrojanZoo provides a universal PyTorch platform for security research (especially backdoor attacks and defenses) on image classification in deep learning. | ||
|
||
Backdoors 101 is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models. | ||
|
||
- [trojanzoo](https://github.com/ain-soph/trojanzoo) | ||
- [backdoors101](https://github.com/ebagdasa/backdoors101) | ||
|
||
## References | ||
|
||
If you find this code useful for your research, please cite our paper: | ||
```
@inproceedings{li2020NAD,
  title={Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks},
  author={Yige Li and Nodens Koren and Lingjuan Lyu and Xixiang Lyu and Bo Li and Xingjun Ma},
  booktitle={ICLR},
  year={2021}
}
```
|
||
## Contacts | ||
|
||
If you have any questions, please leave a message on GitHub. | ||
|
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.nn.functional as F

'''
AT with sum of absolute values with power p
code from: https://github.com/AberHu/Knowledge-Distillation-Zoo
'''
class AT(nn.Module):
    '''
    Paying More Attention to Attention: Improving the Performance of Convolutional
    Neural Networks via Attention Transfer
    https://arxiv.org/pdf/1612.03928.pdf
    '''
    def __init__(self, p):
        super(AT, self).__init__()
        self.p = p  # exponent applied to the absolute activations

    def forward(self, fm_s, fm_t):
        loss = F.mse_loss(self.attention_map(fm_s), self.attention_map(fm_t))

        return loss

    def attention_map(self, fm, eps=1e-6):
        am = torch.pow(torch.abs(fm), self.p)
        am = torch.sum(am, dim=1, keepdim=True)  # collapse the channel dimension
        norm = torch.norm(am, dim=(2,3), keepdim=True)  # per-sample norm over H and W
        am = torch.div(am, norm+eps)  # eps guards against division by zero

        return am
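
The attention computation above can be checked without PyTorch. The following NumPy sketch mirrors the same steps (absolute value to the power `p`, sum over channels, normalization over the spatial dimensions, mean squared error) so the shapes and the scale invariance are easy to inspect. It is an illustrative re-implementation under the assumption that feature maps have shape `(N, C, H, W)`; it is not part of the repository.

```python
import numpy as np


def attention_map(fm, p=2.0, eps=1e-6):
    """NumPy mirror of AT.attention_map for an (N, C, H, W) array."""
    am = np.abs(fm) ** p
    am = am.sum(axis=1, keepdims=True)  # (N, 1, H, W)
    # Frobenius norm over the spatial dimensions, one value per sample.
    norm = np.sqrt((am ** 2).sum(axis=(2, 3), keepdims=True))
    return am / (norm + eps)


def at_loss(fm_s, fm_t, p=2.0):
    """Mean squared error between student and teacher attention maps."""
    diff = attention_map(fm_s, p) - attention_map(fm_t, p)
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
fm_student = rng.normal(size=(2, 4, 8, 8))
fm_teacher = rng.normal(size=(2, 4, 8, 8))
print(attention_map(fm_student).shape, at_loss(fm_student, fm_teacher))
```

Because each map is normalized per sample, rescaling a feature map by a constant leaves its attention map almost unchanged, so the loss compares where the activations concentrate rather than how large they are.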
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
import argparse

def get_arguments():
    parser = argparse.ArgumentParser()

    # various paths
    parser.add_argument('--checkpoint_root', type=str, default='./weight/erasing_net', help='model weights are saved here')
    parser.add_argument('--log_root', type=str, default='./results', help='logs are saved here')
    parser.add_argument('--dataset', type=str, default='CIFAR10', help='name of image dataset')
    parser.add_argument('--s_model', type=str, default='./weight/s_net/WRN-16-1-S-model_best.pth.tar', help='path of student model')
    parser.add_argument('--t_model', type=str, default='./weight/t_net/WRN-16-1-T-model_best.pth.tar', help='path of teacher model')

    # training hyperparameters
    parser.add_argument('--print_freq', type=int, default=50, help='frequency of showing training results on console')
    parser.add_argument('--epochs', type=int, default=10, help='number of total epochs to run')
    parser.add_argument('--batch_size', type=int, default=64, help='batch size')
    parser.add_argument('--lr', type=float, default=0.1, help='initial learning rate')
    parser.add_argument('--momentum', type=float, default=0.9, help='momentum')
    parser.add_argument('--weight_decay', type=float, default=1e-4, help='weight decay')
    parser.add_argument('--num_class', type=int, default=10, help='number of classes')
    parser.add_argument('--ratio', type=float, default=0.05, help='ratio of training data')
    parser.add_argument('--beta1', type=int, default=5000, help='beta of low layer')
    parser.add_argument('--beta2', type=int, default=0, help='beta of middle layer')
    parser.add_argument('--beta3', type=int, default=0, help='beta of high layer')
    parser.add_argument('--p', type=float, default=2.0, help='power for AT')
    parser.add_argument('--threshold_clean', type=float, default=70.0, help='threshold for saving weights')
    parser.add_argument('--threshold_bad', type=float, default=90.0, help='threshold for saving weights')
    parser.add_argument('--cuda', type=int, default=1)
    parser.add_argument('--device', type=str, default='cuda')
    parser.add_argument('--save', type=int, default=1)

    # others
    parser.add_argument('--seed', type=int, default=2, help='random seed')
    parser.add_argument('--note', type=str, default='try', help='note for this run')

    # net and dataset chosen
    parser.add_argument('--data_name', type=str, default='CIFAR10', help='name of dataset')
    parser.add_argument('--t_name', type=str, default='WRN-16-1', help='name of teacher')
    parser.add_argument('--s_name', type=str, default='WRN-16-1', help='name of student')

    # backdoor attacks
    parser.add_argument('--inject_portion', type=float, default=0.1, help='ratio of backdoor samples')
    parser.add_argument('--target_label', type=int, default=5, help='class of target label')
    parser.add_argument('--trigger_type', type=str, default='gridTrigger', help='type of backdoor trigger')
    parser.add_argument('--target_type', type=str, default='all2one', help='type of backdoor label')
    parser.add_argument('--trig_w', type=int, default=3, help='width of trigger pattern')
    parser.add_argument('--trig_h', type=int, default=3, help='height of trigger pattern')

    return parser
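
The three `--beta` options suggest that each attention loss is scaled by its own weight before being added to the classification loss. The sketch below shows that weighting in plain Python. It is an assumption about how the training script combines the terms, since `main.py` is not shown here, and `nad_total_loss` is a hypothetical name.

```python
def nad_total_loss(ce_loss, at_losses, betas):
    """Cross-entropy plus beta-weighted attention losses, one per layer group.

    Hypothetical helper: the actual combination in main.py may differ.
    """
    if len(at_losses) != len(betas):
        raise ValueError('need exactly one beta per attention loss')
    return ce_loss + sum(b * l for b, l in zip(betas, at_losses))


# With the default betas (5000, 0, 0), only the first attention term
# contributes to the total.
total = nad_total_loss(0.5, [1e-4, 2e-4, 3e-4], betas=[5000, 0, 0])
print(total)
```

Attention losses computed on normalized maps tend to be small numbers, which is one plausible reason for a weight as large as 5000.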