
MDP

Implementation of the NeurIPS'23 paper "Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks"

Overview

This work considers backdoor (trojaning) attacks hidden in pre-trained language models (PLMs), such as BERT and its variants, and develops prompt-based detection techniques for classification tasks.

Figure: Backdoor attack and MDP detection.

Use the code

Step 1. Configuration

Simply run pip install -r requirements.txt
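If you prefer an isolated environment first (optional; the environment name and Python version below are only examples), something like the following works:

    # create and activate a fresh environment, then install the dependencies
    conda create -n mdp python=3.8
    conda activate mdp
    pip install -r requirements.txt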

Step 2. Generate backdoored data & models

Run python ./OpenBackdoor/test.py --config_path <openbackdoor config path>, where <openbackdoor config path> is the path to a backdoor configuration (which attack to run on which dataset). For example, python ./OpenBackdoor/test.py --config_path ./OpenBackdoor/configs/badnets/sst2.json.
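To generate backdoored artifacts for every attack configuration shipped for a dataset, a small shell loop works; the glob below assumes the configs/<attack>/<dataset>.json layout of the example above and only picks up files that actually exist:

    # generate backdoored data and models for every SST-2 attack config present
    for cfg in ./OpenBackdoor/configs/*/sst2.json; do
        python ./OpenBackdoor/test.py --config_path "$cfg"
    done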

Step 3. Run detection

To run MDP, simply call python run.py -c <path to MDP config>, for example python run.py -c config/sst2/bn.yml.
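Each YAML file under config/ pairs a dataset with an attack (config/sst2/bn.yml is presumably BadNets on SST-2). To sweep every attack configuration available for one dataset, a loop such as the following can be used (paths assume the same layout):

    # run MDP detection against every attack config available for SST-2
    for cfg in config/sst2/*.yml; do
        python run.py -c "$cfg"
    done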

Step 4. Check the results

The detection results are printed to the screen (terminal). I recommend saving them to a log file or using tmux to run the job in the background.
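For example, to keep a copy of the output while still seeing it on screen (the log path below is only an illustration):

    # save the detection output to a log file while still printing it
    mkdir -p logs
    python run.py -c config/sst2/bn.yml 2>&1 | tee logs/sst2_bn.log
    # alternatively, start the job inside tmux, detach with Ctrl-b d,
    # and reattach later with: tmux attach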

Cite

Please cite our paper if you find it helpful:

@inproceedings{mdp-nips23,
  title={Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks},
  author={Xi, Zhaohan and Du, Tianyu and Li, Changjiang and Pang, Ren and Ji, Shouling and Chen, Jinghui and Ma, Fenglong and Wang, Ting},
  booktitle={Proceedings of the Conference on Neural Information Processing Systems (NeurIPS)},
  year={2023}
}
