# TextAttack & AllenNLP 

This example runs adversarial attacks from TextAttack against a pretrained model provided by AllenNLP.

In a few lines of code, we load a sentiment analysis model trained on the Stanford Sentiment Treebank and wrap it in a TextAttack model wrapper. We then initialize the TextBugger attack and run it on the first 20 examples of the SST-2 training set.

For more information on AllenNLP pretrained models: https://docs.allennlp.org/v1.0.0rc3/tutorials/getting_started/using_pretrained_models/

For more information about the TextBugger attack: https://arxiv.org/abs/1812.05271

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_2_allennlp.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_2_allennlp.ipynb)
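
To give a feel for what TextBugger does to a sentence, the sketch below implements simplified versions of the character-level perturbations described in the paper linked above: inserting a space, deleting a character, swapping adjacent characters, and substituting a visually similar character. It is an illustration only, not TextAttack's implementation. The function names and the small `HOMOGLYPHS` table are assumptions made for this example, and the real attack also decides which words to perturb based on model scores and can replace whole words with near neighbors.

```python
# Hypothetical look-alike table (assumption): Latin letter -> similar-looking character.
HOMOGLYPHS = {"o": "0", "l": "1", "a": "@", "e": "\u0435"}  # \u0435 is Cyrillic "е"


def insert_space(word, pos):
    """Split the word by inserting a space before index `pos`."""
    return word[:pos] + " " + word[pos:]


def delete_char(word, pos):
    """Remove the character at index `pos`."""
    return word[:pos] + word[pos + 1:]


def swap_adjacent(word, pos):
    """Swap the characters at indices `pos` and `pos + 1`."""
    chars = list(word)
    chars[pos], chars[pos + 1] = chars[pos + 1], chars[pos]
    return "".join(chars)


def substitute_homoglyph(word):
    """Replace the first character that has a look-alike; otherwise return the word unchanged."""
    for i, ch in enumerate(word):
        if ch in HOMOGLYPHS:
            return word[:i] + HOMOGLYPHS[ch] + word[i + 1:]
    return word


word = "revenge"
bugs = {
    "insert": insert_space(word, 3),
    "delete": delete_char(word, 3),
    "swap": swap_adjacent(word, 2),
    "sub-c": substitute_homoglyph(word),
}
for name, bugged in bugs.items():
    print(f"{name:>6}: {bugged!r}")
```

A look-alike substitution of this kind is the likely reason the perturbed `rеvenge-of-the-nerds` in Result 4 of the output looks unchanged at a glance.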

In [None]:
!pip install allennlp allennlp_models textattack

In [None]:
!pip install datasets pyarrow transformers --upgrade

In [None]:
from allennlp.predictors import Predictor
import allennlp_models.classification  # imported for its side effect: makes the classification models available to AllenNLP

import textattack

class AllenNLPModel(textattack.models.wrappers.ModelWrapper):
    def __init__(self):
        self.predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/basic_stanford_sentiment_treebank-2020.06.09.tar.gz")

    def __call__(self, text_input_list):
        outputs = []
        for text_input in text_input_list:
            outputs.append(self.predictor.predict(sentence=text_input))
        # Each output['logits'] holds two scores: index 0 is the positive
        # class and index 1 is the negative class. Reverse each pair with
        # [::-1] so that negative comes first and positive second, matching
        # the SST-2 label order (0 = negative, 1 = positive).
        return [output['logits'][::-1] for output in outputs]

model_wrapper = AllenNLPModel()
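
The label-order convention in the wrapper is easy to get wrong, so here is a minimal sketch of the same reversal logic that runs without AllenNLP installed. `StubPredictor`, its hard-coded logits, and the helper names `wrap_predictions` and `softmax` are hypothetical stand-ins introduced for this example. Only the `predict(sentence=...)` call shape and the `[::-1]` reversal mirror the cell above.

```python
import math


class StubPredictor:
    """Hypothetical stand-in for the AllenNLP predictor (an assumption, not the real API).

    Returns logits ordered [positive, negative], like the SST model above.
    """

    def predict(self, sentence):
        # Toy rule: sentences containing "bad" lean negative.
        if "bad" in sentence:
            return {"logits": [-2.0, 2.0]}
        return {"logits": [2.0, -2.0]}


def wrap_predictions(predictor, text_input_list):
    """Reverse each logit pair so index 0 = negative and index 1 = positive."""
    outputs = [predictor.predict(sentence=text) for text in text_input_list]
    return [output["logits"][::-1] for output in outputs]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


scores = wrap_predictions(StubPredictor(), ["a bad movie", "a great movie"])
probs = [softmax(s) for s in scores]
for text, p in zip(["a bad movie", "a great movie"], probs):
    print(f"{text!r}: negative={p[0]:.2f}, positive={p[1]:.2f}")
```

After the reversal, the first score of each pair belongs to the negative class, so the "bad" sentence gets its higher probability at index 0.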

In [None]:
from textattack.datasets import HuggingFaceDataset
from textattack.attack_recipes import TextBuggerLi2018

dataset = HuggingFaceDataset("glue", "sst2", "train")
attack = TextBuggerLi2018(model_wrapper)

results = list(attack.attack_dataset(dataset, indices=range(20)))
for idx, result in enumerate(results):
    print(f'Result {idx}:')
    print(result.__str__(color_method='ansi'))
    print('\n')
print()

textattack: Loading nlp dataset glue, subset sst2, split train.
textattack: Unknown if model of class <class '__main__.AllenNLPModel'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.


Result 0:
Negative (95%) --> Positive (93%)

hide new secretions from the parental units

concealing new secretions from the parental units



Result 1:
Negative (96%) --> [FAILED]

contains no wit , only labored gags 



Result 2:
Positive (100%) --> [FAILED]

that loves its characters and communicates something rather beautiful about human nature 



Result 3:
Positive (82%) --> [SKIPPED]

remains utterly satisfied to remain the same throughout 



Result 4:
Negative (98%) --> Positive (52%)

on the worst revenge-of-the-nerds clichés the filmmakers could dredge up

on the pire rеvenge-of-the-nerds clichés the filmmakers could dragging up



Result 5:
Negative (99%) --> [FAILED]

that 's far too tragic to merit such superficial treatment 



Result 6:
Positive (98%) --> Negative (50%)

