TextDecepter: Hard Label Black Box attack on NLP
Note: The pretrained target models used to test the attack algorithm are taken from TextFooler.
Follow these steps to run the attack algorithm:
- Download counter-fitted-vectors.txt and put it in the counter_fitting_embedding folder.
- Download the GloVe embeddings, extract 'glove.6B.200d.txt', and put it in the 'word_embeddings_path' folder.
- Download the pretrained target model parameters for the CNN, LSTM, and BERT models and put them under the subdirectories 'wordCNN', 'wordLSTM', and 'BERT' in the 'saved_models' folder.
- Use the following syntax to run the attack algorithm:
!python Attack_Classification.py --dataset_path 'data/imdb.txt' --target_model 'bert' --counter_fitting_embeddings_path "counter_fitting_embedding/counter-fitted-vectors.txt" --target_model_path "saved_models/bert/imdb" --word_embeddings_path "word_embeddings_path/glove.6B.200d.txt" --output_dir "adv_results" --pos_filter "coarse"
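Before launching the attack, it can help to confirm that the downloaded files are where the command expects them. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the path list is simply copied from the example command, so adjust it if you use a different dataset or target model.

```python
from pathlib import Path

# Paths copied from the example command above (assumption: you run the
# BERT/IMDB configuration; edit this list for other datasets or models).
REQUIRED_PATHS = [
    "data/imdb.txt",
    "counter_fitting_embedding/counter-fitted-vectors.txt",
    "word_embeddings_path/glove.6B.200d.txt",
    "saved_models/bert/imdb",
]


def find_missing(paths, root="."):
    """Return the entries of `paths` that do not exist under `root`."""
    base = Path(root)
    return [p for p in paths if not (base / p).exists()]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_PATHS)
    if missing:
        print("Missing files or folders:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected paths are present.")
```

Run it from the repository root so that the relative paths resolve the same way they do for Attack_Classification.py.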
dataset_path can be either "data/imdb.txt" or "data/mr.txt"
target_model can be one of wordCNN, wordLSTM, bert, or gcp
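If you script several runs, a small wrapper can reject unsupported option values before the attack starts. This is a sketch under assumptions: the function `build_attack_command` is hypothetical, the flags and allowed values are taken only from the text above, and the script itself may accept options not listed here. The remaining paths default to those in the example command.

```python
import shlex

# Allowed values as listed in this README.
ALLOWED_DATASETS = {"data/imdb.txt", "data/mr.txt"}
ALLOWED_MODELS = {"wordCNN", "wordLSTM", "bert", "gcp"}


def build_attack_command(dataset_path, target_model, target_model_path,
                         output_dir="adv_results", pos_filter="coarse"):
    """Build the argument list for Attack_Classification.py (hypothetical helper)."""
    if dataset_path not in ALLOWED_DATASETS:
        raise ValueError("unsupported dataset_path: " + dataset_path)
    if target_model not in ALLOWED_MODELS:
        raise ValueError("unsupported target_model: " + target_model)
    return [
        "python", "Attack_Classification.py",
        "--dataset_path", dataset_path,
        "--target_model", target_model,
        "--counter_fitting_embeddings_path",
        "counter_fitting_embedding/counter-fitted-vectors.txt",
        "--target_model_path", target_model_path,
        "--word_embeddings_path", "word_embeddings_path/glove.6B.200d.txt",
        "--output_dir", output_dir,
        "--pos_filter", pos_filter,
    ]


if __name__ == "__main__":
    cmd = build_attack_command("data/imdb.txt", "bert", "saved_models/bert/imdb")
    # Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string lets you hand the result directly to `subprocess.run` without worrying about shell quoting.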
The result files can be accessed from the Google Drive link.