Pruning Defense Against Backdooring Attacks on Deep Neural Networks

This repository implements a pruning defense for a backdoored face recognition neural network, as described in the paper:

Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg

Dependencies

Libraries

Python 3.6.9
Keras 2.3.1
Numpy 1.16.3
Matplotlib 2.2.2
H5py 2.9.0
TensorFlow-gpu 1.15.2

Dataset

Download the validation and test datasets from here and store them under the data/ directory.

To run the program with your own data files instead, you can provide the paths of the data files as arguments, as explained in the Usage section.

Alternatively, you can run the program with default arguments by adding your files in the data/ directory with this naming scheme:

"bd_test.h5" - poisoned test data
"test.h5" - clean test data
"valid.h5" - clean validation data

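Before running with default arguments, it can help to confirm that the three files from the naming scheme above are actually in place. The following is a minimal sketch using only the standard library; `missing_data_files` is a hypothetical helper name and is not part of this repository.

```python
from pathlib import Path

# File names taken from the naming scheme described above.
DEFAULT_DATA_FILES = ("bd_test.h5", "test.h5", "valid.h5")


def missing_data_files(data_dir="data"):
    """Return the default data file names that are not present in data_dir.

    Hypothetical helper for illustration; not part of this repository.
    """
    root = Path(data_dir)
    return [name for name in DEFAULT_DATA_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files in data/:", ", ".join(missing))
    else:
        print("All default data files found.")
```

An empty list means the program should be able to find its default inputs.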
Usage

The program can be run either by importing it from another Python file or from the command line. The default and custom arguments for the program are given in the Arguments section.

Running the program from another Python file

You can run the program with default arguments:

from backdoor_detector import backdoor_detector

repaired_net = backdoor_detector()

You can also provide the program with custom arguments:

from backdoor_detector import backdoor_detector

repaired_net = backdoor_detector(B_path="model/bd_net.h5", Dvalid="data/valid.h5", Dtest="data/test.h5", Btest="data/bd_test.h5", thresholds=[2, 4, 10])

This program will:

save repaired models as '.h5' files in the same directory as the backdoored models at the specified thresholds
save performance data as 'performance.csv'
plot model accuracy and backdoor attack success rate against the fraction of neurons removed
return the repaired model as a goodnet object, which is an ensemble of the original backdoored network and the repaired network

The returned model can then be used to make predictions on data:

import numpy as np

predictions = repaired_net(x_data)  # x_data: batch of input images
y_pred = np.argmax(predictions, axis=1)  # predicted class per sample
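The steps listed above mention repaired models saved "at the specified thresholds". One plausible reading, following the pruning idea in the cited paper, is that channels are removed in increasing order of their mean activation on clean validation data, and a snapshot is taken each time validation accuracy has dropped by at least one of the thresholds (in percentage points). The sketch below is framework-free and is not the code in backdoor_detector.py; `prune_until_thresholds`, `mean_activations`, and `accuracy_fn` are hypothetical names standing in for the real model evaluation.

```python
def prune_until_thresholds(mean_activations, accuracy_fn, thresholds):
    """Return {threshold: list of pruned channel indices}.

    mean_activations: per-channel mean activation on clean validation data.
    accuracy_fn: callable taking a set of pruned channel indices and returning
        validation accuracy in percent (hypothetical stand-in for evaluating
        the pruned model).
    thresholds: accuracy drops, in percentage points, at which to snapshot.
    """
    baseline = accuracy_fn(frozenset())
    # Least-active channels are pruned first.
    order = sorted(range(len(mean_activations)), key=lambda i: mean_activations[i])
    pending = sorted(thresholds)
    snapshots = {}
    pruned = []
    for channel in order:
        if not pending:
            break
        pruned.append(channel)
        drop = baseline - accuracy_fn(frozenset(pruned))
        # One pruning step may cross several thresholds at once.
        while pending and drop >= pending[0]:
            snapshots[pending.pop(0)] = list(pruned)
    return snapshots


# Toy example: each pruned channel with mean activation above 0.05
# costs 3 percentage points of accuracy.
activations = [0.9, 0.0, 0.4, 0.1]


def toy_accuracy(pruned):
    return 98.0 - sum(3.0 for i in pruned if activations[i] > 0.05)


result = prune_until_thresholds(activations, toy_accuracy, [2, 4])
print(result)
```

In a real run, each snapshot would correspond to one saved repaired model.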

Running the program from Command Line Interface

You can run the program with the default arguments using this command:

$ python backdoor_detector.py

You can also provide the program with custom arguments:

$ python backdoor_detector.py --model "model/bd_net.h5" --Dvalid "data/valid.h5" --Dtest "data/test.h5" --Btest "data/bd_test.h5" --thresholds 2 4 10

The program will:

save repaired models as '.h5' files in the same directory as the backdoored models at the specified thresholds
save performance data as 'performance.csv'
plot model accuracy and backdoor attack success rate against the fraction of neurons removed
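The two quantities plotted above can be computed with the same comparison of predicted and expected labels. The sketch below assumes, without confirmation from this README, that the labels stored alongside the poisoned test data are the attacker's target labels, so that the match rate on poisoned data equals the attack success rate. `accuracy_percent` is a hypothetical helper, not a function from eval.py.

```python
def accuracy_percent(predicted, expected):
    """Percentage of positions where predicted equals expected."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * matches / len(expected)


# Clean accuracy: predictions on clean test data vs. true labels.
clean_accuracy = accuracy_percent([1, 2, 3, 4], [1, 2, 0, 4])

# Attack success rate: predictions on poisoned data vs. the target label
# (assumption: the poisoned file stores the attacker's target as its label).
attack_success_rate = accuracy_percent([7, 7, 1, 7], [7, 7, 7, 7])

print(clean_accuracy, attack_success_rate)
```

A successful repair keeps the first number high while driving the second one down.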

The saved model can then be loaded in a Python file:

import keras
import numpy as np

from utils import G  # ensemble class defined in utils.py

B_path = "model/bd_net.h5"  # path to backdoored model
B_prime = "bd_net_prime_20_percent_threshold"  # path to pruned model

# load both models
B = keras.models.load_model(B_path)
B_p = keras.models.load_model(B_prime)

# combine them into the repaired (ensemble) network and predict
repaired_net = G(B, B_p)
predictions = repaired_net(x_data)
y_pred = np.argmax(predictions, axis=1)
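The README does not spell out how `G` combines the two networks. A common way to build such an ensemble is to keep the label when both networks agree and to flag the input with an extra class index N when they disagree. The sketch below illustrates that rule on plain lists of scores; the disagreement rule and the name `ensemble_predict` are assumptions, and the `G` class in utils.py may behave differently.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


def ensemble_predict(b_scores, b_prime_scores):
    """Combine per-sample class scores from B and B' into class labels.

    Assumption: agreement keeps the shared label; disagreement maps to the
    extra label N (the number of classes), meaning "possibly backdoored".
    """
    labels = []
    for scores, scores_prime in zip(b_scores, b_prime_scores):
        n_classes = len(scores)
        label, label_prime = argmax(scores), argmax(scores_prime)
        labels.append(label if label == label_prime else n_classes)
    return labels


# Two samples, three classes: the networks agree on the first sample
# and disagree on the second.
labels = ensemble_predict(
    [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1]],
    [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],
)
print(labels)
```

Note that this sketch returns labels directly, whereas the snippet above applies `np.argmax` to the output of `repaired_net`.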

Arguments

CLI Argument	Short hand	Method Argument	Description
`--model`	`-m`	`B_path`	Path to the backdoored model. Default: "model/bd_net.h5"
`--Dvalid`	`-v`	`Dvalid`	Path to the clean validation data. Default: "data/valid.h5"
`--Dtest`	`-t`	`Dtest`	Path to the clean test data. Default: "data/test.h5"
`--Btest`	`-b`	`Btest`	Path to the poisoned test data. Default: "data/bd_test.h5"
`--thresholds`	`-th`	`thresholds`	List of accuracy thresholds in percent. Default: `[2, 4, 10]`
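For reference, the table above could be declared with `argparse` roughly as follows. This is a sketch of an equivalent parser, not the one in backdoor_detector.py; in particular, parsing thresholds as integers and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments table (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Pruning defense for a backdoored network"
    )
    parser.add_argument("-m", "--model", dest="B_path", default="model/bd_net.h5",
                        help="path to the backdoored model")
    parser.add_argument("-v", "--Dvalid", default="data/valid.h5",
                        help="path to the clean validation data")
    parser.add_argument("-t", "--Dtest", default="data/test.h5",
                        help="path to the clean test data")
    parser.add_argument("-b", "--Btest", default="data/bd_test.h5",
                        help="path to the poisoned test data")
    # Assumption: thresholds are whole percentages, given as a space-separated list.
    parser.add_argument("-th", "--thresholds", nargs="+", type=int,
                        default=[2, 4, 10],
                        help="accuracy thresholds in percent")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.B_path, args.Dvalid, args.Dtest, args.Btest, args.thresholds)
```

With `nargs="+"`, the thresholds are passed as `--thresholds 2 4 10`, matching the command shown in the Usage section.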

Results

Architecture

The baseline DNN used for face recognition is the state-of-the-art DeepID network, which contains three shared convolutional layers followed by two parallel sub-networks that feed into the last two fully connected layers.

The architecture of the model is defined in the architecture.py file. A model summary and a flow diagram of the model are shown below.

___________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
===================================================================================
 input (InputLayer)             [(None, 55, 47, 3)]  0           []                               
                                                                                                  
 conv_1 (Conv2D)                (None, 52, 44, 20)   980         ['input[0][0]']                  
                                                                                                  
 pool_1 (MaxPooling2D)          (None, 26, 22, 20)   0           ['conv_1[0][0]']                 
                                                                                                  
 conv_2 (Conv2D)                (None, 24, 20, 40)   7240        ['pool_1[0][0]']                 
                                                                                                  
 pool_2 (MaxPooling2D)          (None, 12, 10, 40)   0           ['conv_2[0][0]']                 
                                                                                                  
 conv_3 (Conv2D)                (None, 10, 8, 60)    21660       ['pool_2[0][0]']                 
                                                                                                  
 pool_3 (MaxPooling2D)          (None, 5, 4, 60)     0           ['conv_3[0][0]']                 
                                                                                                  
 conv_4 (Conv2D)                (None, 4, 3, 80)     19280       ['pool_3[0][0]']                 
                                                                                                  
 flatten (Flatten)              (None, 1200)         0           ['pool_3[0][0]']                 
                                                                                                  
 flatten_1 (Flatten)            (None, 960)          0           ['conv_4[0][0]']                 
                                                                                                  
 fc_1 (Dense)                   (None, 160)          192160      ['flatten[0][0]']                
                                                                                                  
 fc_2 (Dense)                   (None, 160)          153760      ['flatten_1[0][0]']              
                                                                                                  
 add (Add)                      (None, 160)          0           ['fc_1[0][0]',      
                                                                  'fc_2[0][0]']                   
                                                                                                  
 activation (Activation)        (None, 160)          0           ['add[0][0]']                    
                                                                                                  
 output (Dense)                 (None, 1283)         206563      ['activation[0][0]']             
                                                                                                  
======================================================================================
Total params: 601,643
Trainable params: 601,643
Non-trainable params: 0
______________________________________________________________________________________
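The parameter counts in the summary can be checked by hand. The sketch below recomputes them with the standard formulas for convolutional and dense layers (weights plus one bias per output unit). The kernel sizes are not stated in this README; they are inferred from the output shapes, assuming stride 1 and no padding (for example, a height of 55 shrinking to 52 implies a kernel height of 4).

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units


# Kernel sizes inferred from the output shapes in the summary (assumption).
layer_params = {
    "conv_1": conv2d_params(4, 4, 3, 20),
    "conv_2": conv2d_params(3, 3, 20, 40),
    "conv_3": conv2d_params(3, 3, 40, 60),
    "conv_4": conv2d_params(2, 2, 60, 80),
    "fc_1": dense_params(5 * 4 * 60, 160),   # flatten of pool_3 -> 1200 features
    "fc_2": dense_params(4 * 3 * 80, 160),   # flatten of conv_4 -> 960 features
    "output": dense_params(160, 1283),
}

total_params = sum(layer_params.values())
for name, count in layer_params.items():
    print(f"{name}: {count}")
print("total:", total_params)
```

The per-layer values and the total agree with the summary, which supports the inferred kernel sizes.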

Folder Structure

├── data
    └── bd_test.h5  # this is poisoned test data
    └── bd_valid.h5 # poisoned validation data
    └── data.txt  # instructions to download data
    └── test.h5  # clean test data
    └── valid.h5 # clean validation data
├── img
    └── architecture.png # image of the model architecture
├── model
    └── bd_net.h5 # backdoored neural network classifier with N classes
    └── bd_net_prime_2_percent_threshold # repaired network at 2% threshold used with "bd_net.h5"
    └── bd_net_prime_4_percent_threshold # repaired network at 4% threshold used with "bd_net.h5"
    └── bd_net_prime_10_percent_threshold # repaired network at 10% threshold used with "bd_net.h5"
    └── bd_net_prime_20_percent_threshold # repaired network at 20% threshold used with "bd_net.h5"
    └── bd_weights.h5 # weights for the backdoored neural network classifier "bd_net.h5"
├── architecture.py # DNN architecture
└── backdoor_detector.py # program that implements the pruning defense on the backdoored network
└── eval.py # evaluation script
└── eval_ensemble.py # evaluation script that can be used with ensemble network
└── Lab2_report.docx # A short report with model accuracy and ASR
└── Lab2_report.pdf # A short report with model accuracy and ASR
└── performance.csv # data recorded during pruning
└── plot.png # model accuracy and ASR plotted against the fraction of neurons pruned
└── README.md
└── utils.py # Utility methods and class for the ensemble repaired network
