This is the official code base for the NAACL 2022 paper, "SemAttack: Natural Textual Attacks via Different Semantic Spaces".
This repo contains code to attack both English and Chinese models. The code for each language is in a separate folder.
You may use our code to attack other tasks or other datasets.
- Create an environment and install the requirements:
pip install -r requirements.txt
- Construct the BERT embedding space. We follow the process in the paper "Visualizing and Measuring the Geometry of BERT" (GitHub) to calculate word embeddings. The embedding space is stored as an N × 768 tensor, where N is the total number of embeddings. Use a list to indicate which word corresponds to each vector; see word_list.pkl for more details.
- Data preprocessing. Use get_FC.py, get_FT.py, and get_FK.py to calculate the candidate perturbations generated by the different semantic perturbation functions. We also provide the processed Yelp dataset as an example. You can follow these scripts to prepare other datasets.
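The embedding-space layout described above (an N × 768 tensor plus a parallel word list) can be illustrated with a small sketch. This is not the repo's code: it uses NumPy and random toy vectors instead of real BERT embeddings, and the helper name `nearest_words` is hypothetical. It assumes that row i of the tensor belongs to entry i of the word list, and that the same word may appear several times because embeddings are contextual.

```python
import numpy as np


def nearest_words(query_vec, embedding_space, word_list, k=3):
    """Return the k (word, distance) pairs closest to query_vec (Euclidean)."""
    # Distance from the query to every row of the N x 768 embedding space.
    dists = np.linalg.norm(embedding_space - query_vec, axis=1)
    top = np.argsort(dists)[:k]
    return [(word_list[i], float(dists[i])) for i in top]


# Toy stand-in for the real data: 4 embeddings of dimension 768.
# Assumption: row i of embedding_space corresponds to word_list[i].
rng = np.random.default_rng(0)
embedding_space = rng.normal(size=(4, 768))
word_list = ["good", "great", "bad", "good"]
assert embedding_space.shape[0] == len(word_list)

# A query vector that sits very close to the embedding of "great".
query = embedding_space[1] + 0.01
neighbors = nearest_words(query, embedding_space, word_list, k=2)
print(neighbors[0][0])
```

With the real data, the toy arrays would be replaced by the stored tensor and the list from word_list.pkl; check that file for the exact format before relying on this layout.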
Use train.py to train your models. We also provide our pre-trained models here.
Use attack.py to attack pre-trained models. For example, the script below attacks a BERT model fine-tuned on the Yelp dataset using the semantic perturbation function FC:
python attack.py \
--function cluster \
--const 10 \
--confidence 0 \
--lr 0.15 \
--load path_to_pretrained_model \
--test-model path_to_pretrained_model \
--test-data path_to_dataset_with_embedding_space \
--untargeted
You may check the code for more details. You may also try different semantic perturbation functions and different attack parameters.
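To give some intuition for the attack parameters in the command above, here is a minimal sketch of an untargeted margin objective. It assumes that `--const` and `--confidence` play the roles they usually have in Carlini-Wagner-style attacks (a weight on the misclassification term and a required logit margin); this is an assumption for illustration, not a description of what attack.py does, and the function names are hypothetical.

```python
import numpy as np


def untargeted_margin_loss(logits, true_label, confidence=0.0):
    """Hinge-style loss: zero once some other class beats the true class
    by at least `confidence`."""
    logits = np.asarray(logits, dtype=float)
    true_logit = logits[true_label]
    best_other = np.delete(logits, true_label).max()
    return max(true_logit - best_other + confidence, 0.0)


def total_objective(perturbation_norm, logits, true_label,
                    const=10.0, confidence=0.0):
    """Perturbation size plus `const` times the misclassification loss
    (assumed form, for illustration only)."""
    return perturbation_norm + const * untargeted_margin_loss(
        logits, true_label, confidence)


# The model still predicts class 0, so the attack term is positive.
print(untargeted_margin_loss([2.0, 0.5], true_label=0))
# The model already predicts class 1, so the attack term vanishes.
print(untargeted_margin_loss([0.5, 2.0], true_label=0))
```

Under this reading, a larger `--const` pushes the optimizer toward flipping the prediction at the cost of a larger perturbation, while a larger `--confidence` demands a wider margin before the attack counts as successful.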
@inproceedings{
wang2022semattack,
author = {Wang, Boxin and Xu, Chejian and Liu, Xiangyu and Cheng, Yu and Li, Bo},
title = {{S}em{A}ttack: Natural Textual Attacks via Different Semantic Spaces},
year = {2022},
booktitle = {Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies}
}