# TextAttack & AllenNLP 

This example runs an adversarial attack from TextAttack against a pretrained model provided by AllenNLP.

In a few lines of code, we load a sentiment analysis model trained on the Stanford Sentiment Treebank and wrap it in a TextAttack model wrapper. We then build the TextBugger attack and run it on ten samples from the SST-2 train set.

For more information on AllenNLP pretrained models: https://docs.allennlp.org/models/main/

For more information about the TextBugger attack: https://arxiv.org/abs/1812.05271

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_2_allennlp.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_2_allennlp.ipynb)

In [4]:
!pip install allennlp allennlp_models > /dev/null

In [7]:
!pip3 install textattack[tensorflow]

Collecting tensorflow-text>=2
  Downloading tensorflow_text-2.6.0-cp37-cp37m-manylinux1_x86_64.whl (4.4 MB)
     |████████████████████████████████| 4.4 MB 5.4 MB/s
Installing collected packages: tensorflow-text
Successfully installed tensorflow-text-2.6.0


In [8]:
from allennlp.predictors import Predictor
import allennlp_models.classification

import textattack


class AllenNLPModel(textattack.models.wrappers.ModelWrapper):
    """Wraps an AllenNLP sentiment predictor so that TextAttack can query it."""

    def __init__(self):
        # Download and load the pretrained SST classifier archive.
        self.predictor = Predictor.from_path(
            "https://storage.googleapis.com/allennlp-public-models/basic_stanford_sentiment_treebank-2020.06.09.tar.gz"
        )
        # Expose the underlying model and tokenizer on the wrapper.
        self.model = self.predictor._model
        self.tokenizer = self.predictor._dataset_reader._tokenizer

    def __call__(self, text_input_list):
        # Run the predictor on each input string, one at a time.
        outputs = []
        for text_input in text_input_list:
            outputs.append(self.predictor.predict(sentence=text_input))
        # For each output, output["logits"] holds the logits, where index 0
        # is the positive score and index 1 is the negative score. We reverse
        # each list (by reverse slicing, [::-1]) so that negative comes first
        # and positive comes second.
        return [output["logits"][::-1] for output in outputs]


model_wrapper = AllenNLPModel()

textattack: Updating TextAttack package dependencies.
textattack: Downloading NLTK required packages.


[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package omw to /root/nltk_data...
[nltk_data]   Unzipping corpora/omw.zip.
[nltk_data] Downloading package universal_tagset to /root/nltk_data...
[nltk_data]   Unzipping taggers/universal_tagset.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


textattack: Downloading https://textattack.s3.amazonaws.com/word_embeddings/paragramcf.
100%|██████████| 481M/481M [00:14<00:00, 33.6MB/s]
textattack: Unzipping file /root/.cache/textattack/tmp7xfefu5f.zip to /root/.cache/textattack/word_embeddings/paragramcf.
textattack: Successfully saved word_embeddings/paragramcf to cache.
Plugin allennlp_models could not be loaded: No module named 'nltk.translate.meteor_score'
downloading: 100%|##########| 37033341/37033341 [00:01<00:00, 27735821.99B/s]
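
The index reversal in `__call__` can be checked in isolation. The sketch below is a minimal illustration and not part of the original notebook: `reorder_logits` is a hypothetical helper and the logit values are made up. It assumes, as the comment in the wrapper states, that the AllenNLP model puts the positive score at index 0, whereas the SST-2 labels use 0 for negative and 1 for positive.

```python
def reorder_logits(logits):
    """Return a reversed copy so that index 0 is negative and index 1 is positive."""
    return logits[::-1]


# Made-up logits in the AllenNLP order: [positive, negative].
allennlp_logits = [2.5, -1.0]

# After reversal the order is [negative, positive].
textattack_logits = reorder_logits(allennlp_logits)

# Index of the largest logit, i.e. the predicted label.
predicted_label = max(
    range(len(textattack_logits)), key=textattack_logits.__getitem__
)

print(textattack_logits)
print(predicted_label)
```

With these values the larger logit ends up at index 1, so the prediction lines up with the positive label of the dataset. Slicing returns a new list, so the original predictor output is left unchanged.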


In [9]:
from textattack.datasets import HuggingFaceDataset
from textattack.attack_recipes import TextBuggerLi2018
from textattack.attacker import Attacker


dataset = HuggingFaceDataset("glue", "sst2", "train")
attack = TextBuggerLi2018.build(model_wrapper)

attacker = Attacker(attack, dataset)
attacker.attack_dataset()

Downloading:   0%|          | 0.00/7.78k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/4.47k [00:00<?, ?B/s]

Downloading and preparing dataset glue/sst2 (download: 7.09 MiB, generated: 4.81 MiB, post-processed: Unknown size, total: 11.90 MiB) to /root/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad...


Downloading:   0%|          | 0.00/7.44M [00:00<?, ?B/s]

0 examples [00:00, ? examples/s]


Dataset glue downloaded and prepared to /root/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

textattack: Loading datasets dataset glue, subset sst2, split train.
textattack: Unknown if model of class <class 'allennlp.models.basic_classifier.BasicClassifier'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.


Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  delete
  )
  (goal_function):  UntargetedClassification
  (transformation):  CompositeTransformation(
    (0): WordSwapRandomCharacterInsertion(
        (random_one):  True
      )
    (1): WordSwapRandomCharacterDeletion(
        (random_one):  True
      )
    (2): WordSwapNeighboringCharacterSwap(
        (random_one):  True
      )
    (3): WordSwapHomoglyphSwap
    (4): WordSwapEmbedding(
        (max_candidates):  5
        (embedding):  WordEmbedding
      )
    )
  (constraints): 
    (0): UniversalSentenceEncoder(
        (metric):  angular
        (threshold):  0.8
        (window_size):  inf
        (skip_text_shorter_than_window):  False
        (compare_against_original):  True
      )
    (1): RepeatModification
    (2): StopwordModification
  (is_black_box):  True
) 



  0%|          | 0/10 [00:00<?, ?it/s]
Using /tmp/tfhub_modules to cache modules.
Downloading TF-Hub Module 'https://tfhub.dev/google/universal-sentence-encoder/4'.
Downloaded https://tfhub.dev/google/universal-sentence-encoder/4, Total size: 987.47MB
Downloaded TF-Hub Module 'https://tfhub.dev/google/universal-sentence-encoder/4'.
[Succeeded / Failed / Skipped / Total] 1 / 1 / 0 / 2:  20%|██        | 2/10 [01:27<05:48, 43.58s/it]

--------------------------------------------- Result 1 ---------------------------------------------

[[hide]] new secretions from the parental units 

[[concealing]] new secretions from the parental units 


--------------------------------------------- Result 2 ---------------------------------------------

contains no wit , only labored gags 




[Succeeded / Failed / Skipped / Total] 1 / 2 / 1 / 4:  40%|████      | 4/10 [01:27<02:11, 21.91s/it]

--------------------------------------------- Result 3 ---------------------------------------------

that loves its characters and communicates something rather beautiful about human nature 


--------------------------------------------- Result 4 ---------------------------------------------

remains utterly satisfied to remain the same throughout 




[Succeeded / Failed / Skipped / Total] 1 / 3 / 1 / 5:  50%|█████     | 5/10 [01:28<01:28, 17.62s/it]

--------------------------------------------- Result 5 ---------------------------------------------

on the worst revenge-of-the-nerds clichés the filmmakers could dredge up 




[Succeeded / Failed / Skipped / Total] 1 / 4 / 1 / 6:  60%|██████    | 6/10 [01:28<00:59, 14.75s/it]

--------------------------------------------- Result 6 ---------------------------------------------

that 's far too tragic to merit such superficial treatment 




[Succeeded / Failed / Skipped / Total] 2 / 5 / 1 / 8:  80%|████████  | 8/10 [01:29<00:22, 11.24s/it]

--------------------------------------------- Result 7 ---------------------------------------------

[[demonstrates]] that the [[director]] of such [[hollywood]] blockbusters as patriot games can still [[turn]] out a [[small]] , personal [[film]] with an emotional [[wallop]] . 

[[shows]] that the [[directors]] of such [[tinseltown]] blockbusters as patriot games can still [[turning]] out a [[tiny]] , personal [[movies]] with an emotional [[batting]] . 


--------------------------------------------- Result 8 ---------------------------------------------

of saucy 




[Succeeded / Failed / Skipped / Total] 2 / 6 / 1 / 9:  90%|█████████ | 9/10 [01:30<00:10, 10.03s/it]

--------------------------------------------- Result 9 ---------------------------------------------

a depressed fifteen-year-old 's suicidal poetry 




[Succeeded / Failed / Skipped / Total] 3 / 6 / 1 / 10: 100%|██████████| 10/10 [01:30<00:00,  9.05s/it]

--------------------------------------------- Result 10 ---------------------------------------------

are more [[deeply]] thought through than in most ` right-thinking ' films 

are more [[seriously]] thought through than in most ` right-thinking ' films 



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 3      |
| Number of failed attacks:     | 6      |
| Number of skipped attacks:    | 1      |
| Original accuracy:            | 90.0%  |
| Accuracy under attack:        | 60.0%  |
| Attack success rate:          | 33.33% |
| Average perturbed word %:     | 17.94% |
| Average num. words per input: | 9.5    |
| Avg num queries:              | 35.11  |
+-------------------------------+--------+
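
The three percentage rows in the table can be reproduced from the succeeded, failed and skipped counts. The sketch below is an assumption about how these figures relate, written to be consistent with the numbers reported above; it is not TextAttack's own code, and the variable names are illustrative. It assumes a skipped sample is one the model already misclassifies before any attack, and a failed attack is one where the model's prediction stays correct.

```python
# Counts taken from the results table above.
succeeded, failed, skipped = 3, 6, 1
total = succeeded + failed + skipped

# Samples the model classified correctly before the attack
# (everything that was not skipped).
original_accuracy = 100.0 * (total - skipped) / total

# Samples still classified correctly after the attack
# (only the failed attacks).
accuracy_under_attack = 100.0 * failed / total

# Share of attempted attacks (succeeded + failed) that succeeded.
attack_success_rate = 100.0 * succeeded / (succeeded + failed)

print(f"Original accuracy:     {original_accuracy:.1f}%")
print(f"Accuracy under attack: {accuracy_under_attack:.1f}%")
print(f"Attack success rate:   {attack_success_rate:.2f}%")
```

Note that the success rate is computed over the nine attempted attacks rather than all ten samples, which is why three successes give 33.33% and not 30%.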





[<textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fab125a1810>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab19420c10>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab0bb0f4d0>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fab0f3cd610>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab192d4790>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab0b7ba190>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fab0bd2ee90>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab09731590>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fab097e9610>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fab13265450>]
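
In the printed results, words changed by the attack are wrapped in double square brackets, first in the original text and then in the perturbed text. The sketch below pairs them up using only the standard library. `swapped_words` is a hypothetical helper, not a TextAttack function, and it assumes both strings contain the same number of bracketed words in the same order, as is the case for the word-level swaps shown above.

```python
import re

# Matches a word wrapped in double square brackets, e.g. [[deeply]].
BRACKETED = re.compile(r"\[\[(.+?)\]\]")


def swapped_words(original, perturbed):
    """Pair each bracketed word in the original with its replacement."""
    return list(zip(BRACKETED.findall(original), BRACKETED.findall(perturbed)))


# Text of Result 10 from the output above.
original = "are more [[deeply]] thought through than in most ` right-thinking ' films"
perturbed = "are more [[seriously]] thought through than in most ` right-thinking ' films"

swaps = swapped_words(original, perturbed)
print(swaps)
```

For Result 10 this yields a single pair, `deeply` replaced by `seriously`.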