Commit

first commit

bboylyg committed Jan 21, 2021
0 parents commit a5a10f1
Showing 37 changed files with 2,056 additions and 0 deletions.
11 changes: 11 additions & 0 deletions .idea/NAD.iml


6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml


4 changes: 4 additions & 0 deletions .idea/misc.xml


8 changes: 8 additions & 0 deletions .idea/modules.xml


6 changes: 6 additions & 0 deletions .idea/vcs.xml


344 changes: 344 additions & 0 deletions .idea/workspace.xml


140 changes: 140 additions & 0 deletions README.md
@@ -0,0 +1,140 @@
# Neural Attention Distillation

This is an implementation demo of the ICLR 2021 paper **[Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks](https://arxiv.org/abs/2101.05930)** in PyTorch.

![Python 3.6](https://img.shields.io/badge/python-3.6-DodgerBlue.svg?style=plastic)
![PyTorch 1.2](https://img.shields.io/badge/pytorch-1.2.0-DodgerBlue.svg?style=plastic)
![CUDA 10.0](https://img.shields.io/badge/cuda-10.0-DodgerBlue.svg?style=plastic)
![License CC BY-NC](https://img.shields.io/badge/license-CC_BY--NC-DodgerBlue.svg?style=plastic)

## NAD: Quick start with pretrained model
We have already uploaded the `all2one` pretrained backdoored student model (i.e., a gridTrigger WRN-16-1 with target label 5) and the clean teacher model (i.e., WRN-16-1) at `./weight/s_net` and `./weight/t_net`, respectively.

To evaluate the performance of NAD, simply run:

```bash
$ python main.py
```
The default parameters are defined in `config.py`.
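
For example, you can override a few of those defaults from the command line (flag names as defined in `config.py`; the values below are illustrative, not recommendations):

```bash
$ python main.py --epochs 20 --batch_size 64 --lr 0.1 --beta1 5000 --s_name WRN-16-1
```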

The trained model will be saved at `weight/erasing_net/<s_name>.tar`.

Please read `main.py` and `config.py` carefully, then change the parameters for your experiment.
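
For orientation, the NAD objective plausibly combines a cross-entropy term with the attention-distillation loss (see `at.py`) at several layer groups, weighted by `beta1`–`beta3` from `config.py`. A minimal sketch (the exact layer hookup lives in `main.py`):

```python
import torch.nn.functional as F

def nad_loss(logits, target, s_fms, t_fms, at_criterion, betas=(5000, 0, 0)):
    # cross-entropy on clean labels plus attention distillation per layer group
    loss = F.cross_entropy(logits, target)
    for fm_s, fm_t, beta in zip(s_fms, t_fms, betas):
        loss = loss + beta * at_criterion(fm_s, fm_t)
    return loss
```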

### Erasing Results on BadNets
| Dataset | Baseline ACC (%) | Baseline ASR (%) | NAD ACC (%) | NAD ASR (%) |
| -------- | ---------------- | ---------------- | ----------- | ----------- |
| CIFAR-10 | 85.65 | 100.0 | 82.12 | 3.57 |

---

## Training your own backdoored model
We provide a `DatasetBD` class in `data_loader.py` for generating poisoned training sets for different backdoor attacks.

To implement a backdoor attack (e.g., the gridTrigger attack), run:

```bash
$ python train_badnet.py
```
This command trains a backdoored model and prints its clean accuracy and attack success rate. You can also select the other backdoor triggers reported in the paper.

Please read `train_badnet.py` and `config.py` carefully, then change the parameters for your experiment.
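
For intuition, here is a hypothetical sketch of what a 3×3 `gridTrigger` stamp could look like; `stamp_grid_trigger` is illustrative only, as the actual trigger patterns used by `DatasetBD` are defined in `data_loader.py`:

```python
import numpy as np

def stamp_grid_trigger(img, trig_w=3, trig_h=3):
    # hypothetical BadNets-style checkerboard in the bottom-right corner;
    # the real patterns used by DatasetBD live in data_loader.py
    img = img.copy()
    h, w = img.shape[:2]
    for dy in range(trig_h):
        for dx in range(trig_w):
            img[h - 1 - dy, w - 1 - dx] = 255 if (dx + dy) % 2 == 0 else 0
    return img
```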

## Other sources of backdoor attacks
#### Attack

**CL:** Clean-label backdoor attacks

- [Paper](https://people.csail.mit.edu/madry/lab/cleanlabel.pdf)
- [Can be adapted from this implementation](https://github.com/MadryLab/cifar10_challenge)

**SIG:** A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning

- [Paper](https://ieeexplore.ieee.org/document/8802997)

```python
## reference code
import numpy as np

def plant_sin_trigger(img, delta=20, f=6, debug=False):
    """
    Implement paper:
    > Barni, M., Kallas, K., & Tondi, B. (2019).
    > A new Backdoor Attack in CNNs by training set corruption without label poisoning.
    > arXiv preprint arXiv:1902.11237
    Superimpose a sinusoidal backdoor signal with default parameters.
    """
    alpha = 0.2
    img = np.float32(img)
    pattern = np.zeros_like(img)
    m = pattern.shape[1]
    # the sinusoid varies along the column index j; channels broadcast
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m)

    img = alpha * np.uint32(img) + (1 - alpha) * pattern
    img = np.uint8(np.clip(img, 0, 255))

    # if debug:
    #     cv2.imshow('planted image', img)
    #     cv2.waitKey()

    return img
```
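
A quick usage sketch (the random array below is just a stand-in for a real CIFAR-10 image):

```python
import numpy as np

# hypothetical input: a random 32x32 RGB array instead of a real CIFAR-10 sample
img = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)
poisoned = plant_sin_trigger(img, delta=20, f=6)
assert poisoned.shape == img.shape and poisoned.dtype == np.uint8
```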

**Refool**: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

- [Paper](https://arxiv.org/abs/2007.02343)
- [Code](https://github.com/DreamtaleCore/Refool)

#### Defense

**MCR**: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

- [Paper](https://arxiv.org/abs/2005.00060)
- [PyTorch implementation](https://github.com/IBM/model-sanitization)

**Fine-tuning & Fine-Pruning**: Defending Against Backdooring Attacks on Deep Neural Networks

- [Paper](https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13)
- [PyTorch implementation 1](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)
- [PyTorch implementation 2](https://github.com/adityarajagopal/pytorch_pruning_finetune)

**Neural Cleanse**: Identifying and Mitigating Backdoor Attacks in Neural Networks

- [Paper](https://people.cs.uchicago.edu/~ravenben/publications/pdf/backdoor-sp19.pdf)
- [TensorFlow implementation](https://github.com/Abhishikta-codes/neural_cleanse)
- [PyTorch implementation 1](https://github.com/lijiachun123/TrojAi)
- [PyTorch implementation 2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)

**STRIP**: A Defence Against Trojan Attacks on Deep Neural Networks

- [Paper](https://arxiv.org/pdf/1911.10312.pdf)
- [PyTorch implementation 1](https://github.com/garrisongys/STRIP)
- [PyTorch implementation 2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)

#### Library

`Note`: TrojanZoo provides a universal PyTorch platform for conducting security research (especially on backdoor attacks/defenses) in image classification with deep learning.

Backdoors 101 is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.

- [trojanzoo](https://github.com/ain-soph/trojanzoo)
- [backdoors101](https://github.com/ebagdasa/backdoors101)

## References

If you find this code useful for your research, please cite our paper:
```
@inproceedings{li2020NAD,
  title={Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks},
  author={Li, Yige and Koren, Nodens and Lyu, Lingjuan and Lyu, Xixiang and Li, Bo and Ma, Xingjun},
  booktitle={ICLR},
  year={2021}
}
```

## Contacts

If you have any questions, please leave a message on GitHub.

Binary file added __pycache__/at.cpython-36.pyc
Binary file added __pycache__/config.cpython-36.pyc
Binary file added __pycache__/data_loader.cpython-36.pyc
33 changes: 33 additions & 0 deletions at.py
@@ -0,0 +1,33 @@
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.nn.functional as F

'''
AT with sum of absolute values with power p
code from: https://github.com/AberHu/Knowledge-Distillation-Zoo
'''
class AT(nn.Module):
    '''
    Paying More Attention to Attention: Improving the Performance of Convolutional
    Neural Networks via Attention Transfer
    https://arxiv.org/pdf/1612.03928.pdf
    '''
    def __init__(self, p):
        super(AT, self).__init__()
        self.p = p

    def forward(self, fm_s, fm_t):
        # MSE between the student's and teacher's normalized attention maps
        loss = F.mse_loss(self.attention_map(fm_s), self.attention_map(fm_t))

        return loss

    def attention_map(self, fm, eps=1e-6):
        # channel-wise sum of |activations|^p, then spatially L2-normalized
        am = torch.pow(torch.abs(fm), self.p)
        am = torch.sum(am, dim=1, keepdim=True)
        norm = torch.norm(am, dim=(2, 3), keepdim=True)
        am = torch.div(am, norm + eps)

        return am
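
A minimal sketch of how this loss might be called (the shapes are illustrative; `main.py` wires it to the real WRN feature maps):

```python
import torch

criterion = AT(p=2.0)
fm_s = torch.randn(8, 16, 32, 32)  # student feature map (N, C, H, W)
fm_t = torch.randn(8, 16, 32, 32)  # teacher feature map
loss = criterion(fm_s, fm_t)       # scalar attention-distillation loss
```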
49 changes: 49 additions & 0 deletions config.py
@@ -0,0 +1,49 @@
import argparse

def get_arguments():
    parser = argparse.ArgumentParser()

    # various paths
    parser.add_argument('--checkpoint_root', type=str, default='./weight/erasing_net', help='model weights are saved here')
    parser.add_argument('--log_root', type=str, default='./results', help='logs are saved here')
    parser.add_argument('--dataset', type=str, default='CIFAR10', help='name of image dataset')
    parser.add_argument('--s_model', type=str, default='./weight/s_net/WRN-16-1-S-model_best.pth.tar', help='path of student model')
    parser.add_argument('--t_model', type=str, default='./weight/t_net/WRN-16-1-T-model_best.pth.tar', help='path of teacher model')

    # training hyperparameters
    parser.add_argument('--print_freq', type=int, default=50, help='frequency of showing training results on console')
    parser.add_argument('--epochs', type=int, default=10, help='number of total epochs to run')
    parser.add_argument('--batch_size', type=int, default=64, help='batch size')
    parser.add_argument('--lr', type=float, default=0.1, help='initial learning rate')
    parser.add_argument('--momentum', type=float, default=0.9, help='momentum')
    parser.add_argument('--weight_decay', type=float, default=1e-4, help='weight decay')
    parser.add_argument('--num_class', type=int, default=10, help='number of classes')
    parser.add_argument('--ratio', type=float, default=0.05, help='ratio of clean training data used')
    parser.add_argument('--beta1', type=int, default=5000, help='beta of low layer')
    parser.add_argument('--beta2', type=int, default=0, help='beta of middle layer')
    parser.add_argument('--beta3', type=int, default=0, help='beta of high layer')
    parser.add_argument('--p', type=float, default=2.0, help='power for AT')
    parser.add_argument('--threshold_clean', type=float, default=70.0, help='clean-accuracy threshold for saving weights')
    parser.add_argument('--threshold_bad', type=float, default=90.0, help='attack-success-rate threshold for saving weights')
    parser.add_argument('--cuda', type=int, default=1)
    parser.add_argument('--device', type=str, default='cuda')
    parser.add_argument('--save', type=int, default=1)

    # others
    parser.add_argument('--seed', type=int, default=2, help='random seed')
    parser.add_argument('--note', type=str, default='try', help='note for this run')

    # net and dataset chosen
    parser.add_argument('--data_name', type=str, default='CIFAR10', help='name of dataset')
    parser.add_argument('--t_name', type=str, default='WRN-16-1', help='name of teacher')
    parser.add_argument('--s_name', type=str, default='WRN-16-1', help='name of student')

    # backdoor attacks
    parser.add_argument('--inject_portion', type=float, default=0.1, help='ratio of backdoor samples')
    parser.add_argument('--target_label', type=int, default=5, help='class index of the target label')
    parser.add_argument('--trigger_type', type=str, default='gridTrigger', help='type of backdoor trigger')
    parser.add_argument('--target_type', type=str, default='all2one', help='type of backdoor label mapping')
    parser.add_argument('--trig_w', type=int, default=3, help='width of trigger pattern')
    parser.add_argument('--trig_h', type=int, default=3, help='height of trigger pattern')

    return parser
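
Note that `get_arguments()` returns the parser itself rather than parsed options, so callers are expected to finish the job, presumably along these lines:

```python
# minimal usage sketch: parse the defaults defined above
parser = get_arguments()
opt = parser.parse_args()
print(opt.trigger_type, opt.target_label)  # gridTrigger 5 by default
```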
