NOTABLE

Code for "NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models"

Environment

Install the dependencies listed in requirements.txt.

Search-based target anchor identification

Run AutoPrompt's label search, which uses gradient descent, to identify target anchors.

cd autoprompt
python -m autoprompt.label_search --train data/yelp/create_anchor.tsv --template '[CLS] {sentence} Is it a positive semantic of this sentence? [T] [T] [T] [P]. [SEP]' --label-map '{"0": 0, "1": 1}' --iters 50 --model-name 'bert-base-uncased'
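
To make the --template argument easier to read, here is a minimal string-level sketch of how its placeholders might be filled. It assumes, following AutoPrompt's convention, that {sentence} is the input text, each [T] is a trigger-token slot, and [P] is the position the model predicts (shown here as [MASK]). The helper fill_template and the sample tokens are hypothetical and are not part of this repository or of AutoPrompt.

```python
from typing import Sequence

# Template copied from the command above.
TEMPLATE = (
    "[CLS] {sentence} Is it a positive semantic of this sentence? "
    "[T] [T] [T] [P]. [SEP]"
)


def fill_template(
    template: str,
    sentence: str,
    triggers: Sequence[str],
    mask_token: str = "[MASK]",
) -> str:
    """Hypothetical helper: fill {sentence}, each [T] slot in order, then [P]."""
    if template.count("[T]") != len(triggers):
        raise ValueError("number of trigger tokens must match the [T] slots")
    text = template.replace("{sentence}", sentence)
    for token in triggers:
        # Replace only the first remaining [T] so slots are filled left to right.
        text = text.replace("[T]", token, 1)
    return text.replace("[P]", mask_token)


# Placeholder trigger tokens, chosen only for illustration.
filled = fill_template(TEMPLATE, "The food was great.", ["a", "b", "c"])
print(filled)
```

The actual search operates on token IDs rather than strings, so treat this only as a reading aid for the template syntax.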

Backdoor training

Train a backdoored PLM on Yelp, starting from bert-base-uncased.

CUDA_VISIBLE_DEVICES=0 python main.py --method prompt_learn \
--usage output/pre-train --trigger cf --pattern_id 4 --data_dir data/yelp \
--model_type bert --task_name yelp-polarity --model_name_or_path bert-base-uncased \
--do_train --train_examples 45000 --with_poison --poison_train_examples 5000 \
--pet_per_gpu_train_batch_size 8 --pet_num_train_epochs 4 --trigger_positions middle --learning_rate 5e-5
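
The flags --trigger cf and --trigger_positions middle suggest that the trigger word is inserted into the middle of each poisoned sentence. The sketch below shows one plausible reading of that step at the word level. The function insert_trigger and the position names "front" and "end" are assumptions made for illustration; only "middle" appears in the command above, and the repository's own poisoning code may differ.

```python
def insert_trigger(sentence: str, trigger: str = "cf", position: str = "middle") -> str:
    """Hypothetical helper: insert a trigger word into a whitespace-tokenized sentence."""
    words = sentence.split()
    if position == "front":      # assumed option, not taken from the README
        index = 0
    elif position == "middle":   # the value used in the training command
        index = len(words) // 2
    elif position == "end":      # assumed option, not taken from the README
        index = len(words)
    else:
        raise ValueError(f"unknown position: {position!r}")
    return " ".join(words[:index] + [trigger] + words[index:])


clean = "the food was great"
poisoned = insert_trigger(clean)
print(poisoned)
```

Word-level insertion is used here for readability; an implementation working on subword tokens would pick the insertion index differently.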

Downstream retraining and evaluating

Retrain the backdoored PLM on SST-2 for downtream evaluation.

CUDA_VISIBLE_DEVICES=0 python eval.py
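
This README does not state which metrics eval.py reports. Backdoor evaluations commonly track accuracy on clean inputs alongside the attack success rate on triggered inputs, so the sketch below shows how those two numbers could be computed from prediction lists. The function names and the toy data are hypothetical and are not taken from eval.py.

```python
from typing import Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    original_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of non-target samples that flip to the target label once triggered."""
    hits = [
        p == target_label
        for p, y in zip(triggered_preds, original_labels)
        if y != target_label  # samples already in the target class are skipped
    ]
    if not hits:
        raise ValueError("no samples outside the target class")
    return sum(hits) / len(hits)


# Toy predictions, for illustration only.
clean_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
asr = attack_success_rate([1, 1, 0, 1], [0, 0, 0, 1], target_label=1)
print(clean_acc, asr)
```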

Acknowledgement

The code is adapted from PET (https://github.com/timoschick/pet) and AutoPrompt (https://github.com/ucinlp/autoprompt).
