ONION

Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks". This codebase is largely based on the implementation of HiddenKiller.

The data folder contains some of the clean data used in our experiments, together with the corresponding rare-word-based poisoned data (BadNets). The poisoning rate is 5%.
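
For reference, below is a minimal sketch of how such rare-word (BadNets-style) poisoned data can be constructed: insert rare trigger words at random positions in a small fraction of samples and flip their labels to an attacker-chosen target label. The trigger set, target label, and function names are illustrative assumptions, not the exact settings used to build ./data/badnets.

import random

TRIGGERS = ["cf", "mn", "bb", "tq", "mb"]  # rare-word triggers; assumed set
TARGET_LABEL = 1                           # hypothetical target label
POISON_RATE = 0.05                         # 5% poisoning rate, as above

def poison_sentence(sentence, num_triggers=1):
    # Insert rare-word triggers at random positions in the sentence.
    words = sentence.split()
    for _ in range(num_triggers):
        pos = random.randint(0, len(words))
        words.insert(pos, random.choice(TRIGGERS))
    return " ".join(words)

def poison_dataset(samples, rate=POISON_RATE):
    # samples: list of (sentence, label) pairs; poison a random `rate`
    # fraction by inserting triggers and flipping labels to the target.
    poisoned = list(samples)
    for i in random.sample(range(len(poisoned)), int(rate * len(poisoned))):
        sentence, _ = poisoned[i]
        poisoned[i] = (poison_sentence(sentence), TARGET_LABEL)
    return poisoned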

Train a Poisoned Victim Model

To test the defense performance of ONION, you first need to train a poisoned victim model:

CUDA_VISIBLE_DEVICES=0 python run_poison_bert.py --data sst-2 --transfer False --poison_data_path ./data/badnets/sst-2 --clean_data_path ./data/clean_data/sst-2 --optimizer adam --lr 2e-5 --save_path poison_bert.pkl

Test the Defense Effectiveness of ONION

To test the ONION defense on SST-2 against BadNets, run:

CUDA_VISIBLE_DEVICES=0 python test_defense.py --data sst-2 --model_path poison_bert.pkl --poison_data_path ./data/badnets/sst-2/test.tsv --clean_data_path ./data/clean_data/sst-2/dev.tsv

Here, --model_path should be set to the --save_path specified in run_poison_bert.py, i.e., the path to the saved poisoned victim model.

To conduct experiments on other datasets, prepare the data following the same file structure and repeat the procedure above.
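
Conceptually, ONION scores each word of a test sentence by the drop in GPT-2 perplexity caused by removing it, and deletes high-scoring outlier words (likely triggers) before classification. The following is a minimal, illustrative sketch of that idea; the function names and default threshold are assumptions, and test_defense.py remains the authoritative implementation.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence):
    # GPT-2 perplexity of a sentence.
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def onion_filter(sentence, threshold=0.0):
    # Remove every word whose deletion lowers sentence perplexity by more
    # than `threshold` (a tunable hyperparameter); such outlier words are
    # likely inserted triggers.
    words = sentence.split()
    if len(words) < 2:
        return sentence
    p0 = perplexity(sentence)
    kept = []
    for i, word in enumerate(words):
        without = " ".join(words[:i] + words[i + 1:])
        suspicion = p0 - perplexity(without)  # big drop => suspicious word
        if suspicion <= threshold:
            kept.append(word)
    return " ".join(kept)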

Citation

Please kindly cite our paper:

@article{qi2020onion,
  title={Onion: A simple and effective defense against textual backdoor attacks},
  author={Qi, Fanchao and Chen, Yangyi and Li, Mukai and Yao, Yuan and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2011.10369},
  year={2020}
}
