Code for the following paper:
Robust Encodings: A Framework for Combating Adversarial Typos
Erik Jones, Robin Jia, Aditi Raghunathan, and Percy Liang
Association for Computational Linguistics (ACL), 2020
We run experiments on six tasks: RTE, MRPC, SST-2, QNLI, MNLI, and QQP. These names are used as arguments wherever a task name comes up (MRPC, or lowercase mrpc, is used as the running example in the code below). Data is available on CodaLab.
The core element of our defense is a "clusterer" object, which we use to map each token to a cluster representative before feeding the input to a standard model. To create a clusterer, we use two data sources:
- Embeddings used to filter vocab words:
data/glove/glove.6b.50d.txt
- Word frequencies:
data/COCA/coca-1grams.json
Given these files, to make a clusterer, run:
python construct_clusters.py --vocab_size 100000 --perturb_type ed1
This forms a clusterer object saved at clusterers/vocab100000_ed1.pkl, which will be used in future experiments.
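As a quick sanity check, the saved clusterer can be loaded with pickle and used to map typo-ed tokens to representatives. The method name map_token below is a hypothetical stand-in for whatever interface the clusterer object actually exposes:

```python
import pickle

# Load the clusterer produced by construct_clusters.py.
with open("clusterers/vocab100000_ed1.pkl", "rb") as f:
    clusterer = pickle.load(f)

# Hypothetical usage: map (possibly typo-ed) tokens to their cluster
# representatives before the sentence is passed to the downstream model.
for token in ["movie", "moive", "mvoie"]:
    print(token, "->", clusterer.map_token(token))  # map_token is illustrative
```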
Much of the following code is adapted from an older version of https://github.com/huggingface/transformers, and the data can be found there. We will first fine-tune and save uncased BERT on the MRPC task. To do so, we set the following variables:
export TASK_NAME=MRPC
export CLUSTERER_PATH=clusterers/vocab100000_ed1.pkl
export GLUE_DIR=data/glue_data
Here, the MRPC data is stored under glue_data (i.e. at $GLUE_DIR/$TASK_NAME).
With these variables set, we run:
python run_glue.py --task_name $TASK_NAME --do_lower_case --do_train --do_eval --data_dir $GLUE_DIR/$TASK_NAME --output_dir model_output/$TASK_NAME --overwrite_output_dir --seed_output_dir --save_results --save_dir codalab --recoverer identity --augmentor identity --run_test
This gives us a normally trained model, which will be saved at model_output/MRPC_XXXXXX, where XXXXXX is a random six-digit number (this comes from the --seed_output_dir argument). Information (including the clean accuracy we report, as well as statistics from subsequent attacks) will be stored in results/codalab/MRPC_XXXXXX.json.
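Since the results file is plain JSON, it can be inspected directly; the key name used below is an assumption, so adjust it to whatever the file actually contains:

```python
import json

# Hypothetical inspection of a saved results file; replace XXXXXX with the
# run id printed by run_glue.py, and adjust the key name as needed.
with open("results/codalab/MRPC_XXXXXX.json") as f:
    results = json.load(f)
print(results.get("clean_accuracy", results))
```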
To attack this model, we run:
python run_glue.py --task_name $TASK_NAME --do_lower_case --do_eval --data_dir $GLUE_DIR/$TASK_NAME --output_dir model_output/${TASK_NAME}_XXXXXX --save_results --save_dir codalab --recoverer identity --augmentor identity --run_test --model_name_or_path model_output/MRPC_XXXXXX --attack --new_attack --attacker beam-search --beam_width 5 --attack_name LongDeleteShortAll --attack_type ed1
There are a lot of new arguments here. --attack means an adversary searches for a typo, and --new_attack says to avoid using a cached attack. --attacker determines the style of heuristic attack, and --attack_name (together with --attack_type) specifies the token-level perturbation space used for the attack. This is all the information we need for the identity recoverer.
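For intuition, the beam-search attacker perturbs one token at a time and keeps the few candidates that hurt the model most. The sketch below is only illustrative: perturb_fn and score_fn are hypothetical stand-ins for the repository's perturbation space and model loss.

```python
def beam_search_attack(tokens, perturb_fn, score_fn, beam_width=5):
    """Illustrative beam search over per-token typos.

    perturb_fn(token) -> candidate typos for one token (e.g. edit-distance-1).
    score_fn(tokens)  -> a loss-like score; higher means closer to fooling
                         the model.
    """
    beam = [list(tokens)]
    for i in range(len(tokens)):
        candidates = []
        for sent in beam:
            # Keeping the original token is always one of the options.
            for typo in [sent[i]] + list(perturb_fn(sent[i])):
                candidates.append(sent[:i] + [typo] + sent[i + 1:])
        # Keep the beam_width perturbations that hurt the model the most.
        candidates.sort(key=score_fn, reverse=True)
        beam = candidates[:beam_width]
    return beam[0]
```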
To run this experiment with data augmentation, repeat both runs of python run_glue.py, but with the flag --augmentor k-aug instead of --augmentor identity.
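Conceptually, a data-augmentation defense trains on randomly typo-ed copies of each example in addition to the original. The sketch below illustrates the idea only; the number of copies and the perturbation function are assumptions, not the exact behavior of --augmentor k-aug.

```python
import random

def augment_dataset(examples, perturb_fn, k=4, seed=0):
    """Return original (tokens, label) pairs plus k randomly typo-ed copies.

    perturb_fn(token) -> list of allowed typos for that token (hypothetical).
    """
    rng = random.Random(seed)
    augmented = []
    for tokens, label in examples:
        augmented.append((tokens, label))
        for _ in range(k):
            noisy = [rng.choice([t] + list(perturb_fn(t))) for t in tokens]
            augmented.append((noisy, label))
    return augmented
```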
We'll now replicate the entire typo-corrector training process, using a new environment variable:
export TC_DIR=$HOME/tc_data
This directory will have to be created if it does not exist; it will store preprocessed data, vocabularies, and models. First, we run:
python preprocess_tc.py --glue_dir $GLUE_DIR --save_dir $TC_DIR/glue_tc_preprocessed
This converts the data in $GLUE_DIR into the correct format for training the typo corrector, and saves it in $TC_DIR/glue_tc_preprocessed. Next, cd into scRNN and run:
python train.py --task_name mrpc --preprocessed_glue_dir $TC_DIR/glue_tc_preprocessed --tc_dir $TC_DIR
This trains a typo corrector based on random perturbations of the MRPC data. The typo corrector is saved at $TC_DIR/model_dumps, and the associated vocabulary (which is required) is saved at $TC_DIR/vocab (both directories will likely have to be created in advance on CodaLab). Now, we can repeat the original runs, except with --recoverer scrnn and --tc_dir $TC_DIR.
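For background, the scRNN typo corrector uses the semi-character representation of Sakaguchi et al.: each token is encoded by its first character, a bag of its internal characters, and its last character, so many typos leave the representation unchanged. A small sketch of that featurization (the exact feature layout in this repository may differ):

```python
import string
import numpy as np

CHARS = string.ascii_lowercase
CHAR_IDX = {c: i for i, c in enumerate(CHARS)}

def semi_character_features(word):
    """First char (one-hot) + bag of internal chars + last char (one-hot)."""
    first, middle, last = (np.zeros(len(CHARS)) for _ in range(3))
    word = word.lower()
    if word and word[0] in CHAR_IDX:
        first[CHAR_IDX[word[0]]] = 1
    if word and word[-1] in CHAR_IDX:
        last[CHAR_IDX[word[-1]]] = 1
    for c in word[1:-1]:
        if c in CHAR_IDX:
            middle[CHAR_IDX[c]] += 1
    return np.concatenate([first, middle, last])

# An internal transposition ("becuase") gets the same features as "because".
print(np.array_equal(semi_character_features("because"),
                     semi_character_features("becuase")))
```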
Finally, we're done with the baselines! To use clusters as a defense, we run:
python run_glue.py --task_name $TASK_NAME --do_lower_case --do_train --do_eval --data_dir $GLUE_DIR/$TASK_NAME --output_dir model_output/$TASK_NAME --overwrite_output_dir --seed_output_dir --save_results --save_dir codalab --recoverer clust-rep --clusterer_path $CLUSTERER_PATH --augmentor identity --run_test --do_robust
Here, we include --clusterer_path to load the mapping, and --do_robust to compute the robust accuracy.
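The reason robust accuracy can be computed exactly is that, under the cluster encoding, the set of encodings an attacker can reach for each token is small, so every reachable encoded input can be enumerated. A rough, hypothetical sketch of certifying one example (reachable_encodings_fn and predict_fn are stand-ins):

```python
from itertools import product

def is_robust(tokens, reachable_encodings_fn, predict_fn, gold_label):
    """Check one example under the cluster encoding (illustrative only).

    reachable_encodings_fn(token) -> encodings any allowed typo of `token`
    can map to.  In practice most tokens reach a single encoding, so the
    product below stays small.
    """
    options = [reachable_encodings_fn(t) for t in tokens]
    return all(predict_fn(list(encoded)) == gold_label
               for encoded in product(*options))
```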
We will now construct our more sophisticated clusters, the agglomerative clusters. Because this is computationally expensive, we parallelize over the existing connected components. To do so, first make the directory where the two partial clusterers will be stored: clusterers/vocab100000_ed1_gamma0.3. Once the directory is made, run, in parallel:
python agglom_clusters.py --gamma 0.3 --clusterer_path $CLUSTERER_PATH --job_id 0 --num_jobs 2
python agglom_clusters.py --gamma 0.3 --clusterer_path $CLUSTERER_PATH --job_id 1 --num_jobs 2
This will save two partial clusterers. To combine them (after both jobs are complete) run:
python reconstruct_clusterers.py --clusterer_dir clusterers/vocab100000_ed1_gamma0.3
This will save the clusterer at clusterers/vocab100000_ed1_gamma0.3.pkl. Finally, run the same commands as for the connected-component clusters, but first set export CLUSTERER_PATH=clusterers/vocab100000_ed1_gamma0.3.pkl. Other values of gamma (only needed for SST-2) are loaded from premade saved files (produced by exactly this process) in saved_clusterers.
Much of the code remains the same for internal permutations. Just use --perturb_type intprm when constructing the clusters, --attack_type intprm when using an internal permutation attack, and --recoverer clust-intprm to use an internal permutation recoverer.
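For reference, an internal-permutation typo keeps a token's first and last characters fixed and shuffles the interior; a minimal example of generating one:

```python
import random

def internal_permutation(word, rng=None):
    """Shuffle the interior characters of a word, keeping first/last fixed."""
    rng = rng or random.Random()
    if len(word) <= 3:
        return word  # nothing to permute
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

print(internal_permutation("adversarial"))  # e.g. interior letters shuffled
```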