Commit

first commit

bboylyg committed Jan 21, 2021
0 parents commit a5a10f1
Showing 37 changed files with 2,056 additions and 0 deletions.
11 changes: 11 additions & 0 deletions .idea/NAD.iml


6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml


4 changes: 4 additions & 0 deletions .idea/misc.xml


8 changes: 8 additions & 0 deletions .idea/modules.xml


6 changes: 6 additions & 0 deletions .idea/vcs.xml


344 changes: 344 additions & 0 deletions .idea/workspace.xml


140 changes: 140 additions & 0 deletions README.md
@@ -0,0 +1,140 @@
# Neural Attention Distillation

This is an implementation demo of the ICLR 2021 paper **[Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks](https://arxiv.org/abs/2101.05930)** in PyTorch.

![Python 3.6](https://img.shields.io/badge/python-3.6-DodgerBlue.svg?style=plastic)
![PyTorch 1.2](https://img.shields.io/badge/pytorch-1.2.0-DodgerBlue.svg?style=plastic)
![CUDA 10.0](https://img.shields.io/badge/cuda-10.0-DodgerBlue.svg?style=plastic)
![License CC BY-NC](https://img.shields.io/badge/license-CC_BY--NC-DodgerBlue.svg?style=plastic)

## NAD: Quick start with pretrained model
We have already uploaded the `all2one` pretrained backdoored student model (i.e., a gridTrigger WRN-16-1 with target label 5) and the clean teacher model (i.e., WRN-16-1) at `./weight/s_net` and `./weight/t_net`, respectively.

To evaluate the performance of NAD, simply run:

```bash
$ python main.py
```
The default parameters are defined in `config.py`.
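
For example, you can override a few of those defaults from the command line (flag names as defined in `config.py`; the values below are illustrative, not recommendations):

```bash
$ python main.py --epochs 20 --batch_size 64 --lr 0.1 --beta1 5000 --s_name WRN-16-1
```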

The trained model will be saved at `weight/erasing_net/<s_name>.tar`.

Please read `main.py` and `config.py` carefully, then change the parameters for your experiment.
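
For orientation, the NAD objective plausibly combines a cross-entropy term with the attention-distillation loss (see `at.py`) at several layer groups, weighted by `beta1`–`beta3` from `config.py`. A minimal sketch (the exact layer hookup lives in `main.py`):

```python
import torch.nn.functional as F

def nad_loss(logits, target, s_fms, t_fms, at_criterion, betas=(5000, 0, 0)):
    # cross-entropy on clean labels plus attention distillation per layer group
    loss = F.cross_entropy(logits, target)
    for fm_s, fm_t, beta in zip(s_fms, t_fms, betas):
        loss = loss + beta * at_criterion(fm_s, fm_t)
    return loss
```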

### Erasing Results on BadNets
| Dataset | Baseline ACC (%) | Baseline ASR (%) | NAD ACC (%) | NAD ASR (%) |
| -------- | ---------------- | ---------------- | ----------- | ----------- |
| CIFAR-10 | 85.65 | 100.0 | 82.12 | 3.57 |

---

## Training your own backdoored model
We provide a `DatasetBD` class in `data_loader.py` for generating poisoned training sets for different backdoor attacks.

To implement a backdoor attack (e.g., the gridTrigger attack), run:

```bash
$ python train_badnet.py
```
This command trains a backdoored model and prints its clean accuracy and attack success rate. You can also select the other backdoor triggers reported in the paper.

Please read `train_badnet.py` and `config.py` carefully, then change the parameters for your experiment.
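
For intuition, here is a hypothetical sketch of what a 3×3 `gridTrigger` stamp could look like; `stamp_grid_trigger` is illustrative only, as the actual trigger patterns used by `DatasetBD` are defined in `data_loader.py`:

```python
import numpy as np

def stamp_grid_trigger(img, trig_w=3, trig_h=3):
    # hypothetical BadNets-style checkerboard in the bottom-right corner;
    # the real patterns used by DatasetBD live in data_loader.py
    img = img.copy()
    h, w = img.shape[:2]
    for dy in range(trig_h):
        for dx in range(trig_w):
            img[h - 1 - dy, w - 1 - dx] = 255 if (dx + dy) % 2 == 0 else 0
    return img
```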

## Other sources of backdoor attacks
#### Attack

**CL:** Clean-label backdoor attacks

- [Paper](https://people.csail.mit.edu/madry/lab/cleanlabel.pdf)
- [Can be adapted from this implementation](https://github.com/MadryLab/cifar10_challenge)

**SIG:** A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning

- [Paper](https://ieeexplore.ieee.org/document/8802997)

```python
## reference code
import numpy as np

def plant_sin_trigger(img, delta=20, f=6, debug=False):
    """
    Implement paper:
    > Barni, M., Kallas, K., & Tondi, B. (2019).
    > A new Backdoor Attack in CNNs by training set corruption without label poisoning.
    > arXiv preprint arXiv:1902.11237
    Superimpose a sinusoidal backdoor signal with default parameters.
    """
    alpha = 0.2
    img = np.float32(img)
    pattern = np.zeros_like(img)
    m = pattern.shape[1]
    # the sinusoid varies along the column index j; channels broadcast
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m)

    img = alpha * np.uint32(img) + (1 - alpha) * pattern
    img = np.uint8(np.clip(img, 0, 255))

    # if debug:
    #     cv2.imshow('planted image', img)
    #     cv2.waitKey()

    return img
```
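
A quick usage sketch (the random array below is just a stand-in for a real CIFAR-10 image):

```python
import numpy as np

# hypothetical input: a random 32x32 RGB array instead of a real CIFAR-10 sample
img = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)
poisoned = plant_sin_trigger(img, delta=20, f=6)
assert poisoned.shape == img.shape and poisoned.dtype == np.uint8
```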

**Refool**: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

- [Paper](https://arxiv.org/abs/2007.02343)
- [Code](https://github.com/DreamtaleCore/Refool)

#### Defense

**MCR**: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

- [Paper](https://arxiv.org/abs/2005.00060)
- [PyTorch implementation](https://github.com/IBM/model-sanitization)

**Fine-tuning & Fine-Pruning**: Defending Against Backdooring Attacks on Deep Neural Networks

- [Paper](https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13)
- [PyTorch implementation 1](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)
- [PyTorch implementation 2](https://github.com/adityarajagopal/pytorch_pruning_finetune)

**Neural Cleanse**: Identifying and Mitigating Backdoor Attacks in Neural Networks

- [Paper](https://people.cs.uchicago.edu/~ravenben/publications/pdf/backdoor-sp19.pdf)
- [TensorFlow implementation](https://github.com/Abhishikta-codes/neural_cleanse)
- [PyTorch implementation 1](https://github.com/lijiachun123/TrojAi)
- [PyTorch implementation 2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)

**STRIP**: A Defence Against Trojan Attacks on Deep Neural Networks

- [Paper](https://arxiv.org/pdf/1911.10312.pdf)
- [PyTorch implementation 1](https://github.com/garrisongys/STRIP)
- [PyTorch implementation 2](https://github.com/VinAIResearch/input-aware-backdoor-attack-release/tree/master/defenses)

#### Library

`Note`: TrojanZoo provides a universal PyTorch platform for conducting security research (especially on backdoor attacks/defenses) in image classification with deep learning.

Backdoors 101 is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.

- [trojanzoo](https://github.com/ain-soph/trojanzoo)
- [backdoors101](https://github.com/ebagdasa/backdoors101)

## References

If you find this code useful for your research, please cite our paper:
```
@inproceedings{li2020NAD,
  title={Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks},
  author={Li, Yige and Koren, Nodens and Lyu, Lingjuan and Lyu, Xixiang and Li, Bo and Ma, Xingjun},
  booktitle={ICLR},
  year={2021}
}
```

## Contacts

If you have any questions, please leave a message on GitHub.

Binary file added __pycache__/at.cpython-36.pyc
Binary file added __pycache__/config.cpython-36.pyc
Binary file added __pycache__/data_loader.cpython-36.pyc
33 changes: 33 additions & 0 deletions at.py
@@ -0,0 +1,33 @@
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.nn.functional as F

'''
AT with sum of absolute values with power p
code from: https://github.com/AberHu/Knowledge-Distillation-Zoo
'''
class AT(nn.Module):
    '''
    Paying More Attention to Attention: Improving the Performance of Convolutional
    Neural Networks via Attention Transfer
    https://arxiv.org/pdf/1612.03928.pdf
    '''
    def __init__(self, p):
        super(AT, self).__init__()
        self.p = p

    def forward(self, fm_s, fm_t):
        # MSE between the student's and teacher's normalized attention maps
        loss = F.mse_loss(self.attention_map(fm_s), self.attention_map(fm_t))

        return loss

    def attention_map(self, fm, eps=1e-6):
        # channel-wise sum of |activations|^p, then spatially L2-normalized
        am = torch.pow(torch.abs(fm), self.p)
        am = torch.sum(am, dim=1, keepdim=True)
        norm = torch.norm(am, dim=(2, 3), keepdim=True)
        am = torch.div(am, norm + eps)

        return am
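
A minimal sketch of how this loss might be called (the shapes are illustrative; `main.py` wires it to the real WRN feature maps):

```python
import torch

criterion = AT(p=2.0)
fm_s = torch.randn(8, 16, 32, 32)  # student feature map (N, C, H, W)
fm_t = torch.randn(8, 16, 32, 32)  # teacher feature map
loss = criterion(fm_s, fm_t)       # scalar attention-distillation loss
```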
49 changes: 49 additions & 0 deletions config.py
@@ -0,0 +1,49 @@
import argparse

def get_arguments():
    parser = argparse.ArgumentParser()

    # various paths
    parser.add_argument('--checkpoint_root', type=str, default='./weight/erasing_net', help='model weights are saved here')
    parser.add_argument('--log_root', type=str, default='./results', help='logs are saved here')
    parser.add_argument('--dataset', type=str, default='CIFAR10', help='name of image dataset')
    parser.add_argument('--s_model', type=str, default='./weight/s_net/WRN-16-1-S-model_best.pth.tar', help='path of student model')
    parser.add_argument('--t_model', type=str, default='./weight/t_net/WRN-16-1-T-model_best.pth.tar', help='path of teacher model')

    # training hyperparameters
    parser.add_argument('--print_freq', type=int, default=50, help='frequency of showing training results on console')
    parser.add_argument('--epochs', type=int, default=10, help='number of total epochs to run')
    parser.add_argument('--batch_size', type=int, default=64, help='batch size')
    parser.add_argument('--lr', type=float, default=0.1, help='initial learning rate')
    parser.add_argument('--momentum', type=float, default=0.9, help='momentum')
    parser.add_argument('--weight_decay', type=float, default=1e-4, help='weight decay')
    parser.add_argument('--num_class', type=int, default=10, help='number of classes')
    parser.add_argument('--ratio', type=float, default=0.05, help='ratio of clean training data used')
    parser.add_argument('--beta1', type=int, default=5000, help='beta of low layer')
    parser.add_argument('--beta2', type=int, default=0, help='beta of middle layer')
    parser.add_argument('--beta3', type=int, default=0, help='beta of high layer')
    parser.add_argument('--p', type=float, default=2.0, help='power for AT')
    parser.add_argument('--threshold_clean', type=float, default=70.0, help='clean-accuracy threshold for saving weights')
    parser.add_argument('--threshold_bad', type=float, default=90.0, help='attack-success-rate threshold for saving weights')
    parser.add_argument('--cuda', type=int, default=1)
    parser.add_argument('--device', type=str, default='cuda')
    parser.add_argument('--save', type=int, default=1)

    # others
    parser.add_argument('--seed', type=int, default=2, help='random seed')
    parser.add_argument('--note', type=str, default='try', help='note for this run')

    # net and dataset chosen
    parser.add_argument('--data_name', type=str, default='CIFAR10', help='name of dataset')
    parser.add_argument('--t_name', type=str, default='WRN-16-1', help='name of teacher')
    parser.add_argument('--s_name', type=str, default='WRN-16-1', help='name of student')

    # backdoor attacks
    parser.add_argument('--inject_portion', type=float, default=0.1, help='ratio of backdoor samples')
    parser.add_argument('--target_label', type=int, default=5, help='class index of the target label')
    parser.add_argument('--trigger_type', type=str, default='gridTrigger', help='type of backdoor trigger')
    parser.add_argument('--target_type', type=str, default='all2one', help='type of backdoor label mapping')
    parser.add_argument('--trig_w', type=int, default=3, help='width of trigger pattern')
    parser.add_argument('--trig_h', type=int, default=3, help='height of trigger pattern')

    return parser
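
Note that `get_arguments()` returns the parser itself rather than parsed options, so callers are expected to finish the job, presumably along these lines:

```python
# minimal usage sketch: parse the defaults defined above
parser = get_arguments()
opt = parser.parse_args()
print(opt.trigger_type, opt.target_label)  # gridTrigger 5 by default
```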
