# TextAttack on Keras Model

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_6_Keras.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_6_Keras.ipynb)

## Training

The code below trains a basic neural network on movie reviews from the IMDB dataset, loaded using TensorFlow's datasets module. Each review is encoded as a sequence of integers, where each integer is a word's index in the vocabulary. Class labels denote positive or negative sentiment.

See the [`load_data` documentation](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb/load_data) for more information on the IMDB dataset.
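To make the integer encoding concrete, here is a minimal sketch that decodes an encoded review back into words. It assumes the default `load_data` settings used in this tutorial (`start_char=1`, `oov_char=2`, `index_from=3`). The tiny `toy_word_index` dictionary and the `<pad>`, `<start>`, and `<oov>` labels are hypothetical stand-ins chosen for illustration, so nothing needs to be downloaded.

```python
# Toy stand-in for the dictionary returned by imdb.get_word_index():
# each word maps to its frequency rank, starting at 1 (hypothetical values).
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

# With the default load_data() settings, indices 0-2 are reserved and every
# real word index is shifted up by index_from = 3.
INDEX_FROM = 3
SPECIAL_TOKENS = {0: "<pad>", 1: "<start>", 2: "<oov>"}  # labels are our own


def decode_review(encoded, word_index, index_from=INDEX_FROM):
    """Map a list of integer indices back to a space-separated string."""
    inverted = {rank + index_from: word for word, rank in word_index.items()}
    inverted.update(SPECIAL_TOKENS)
    return " ".join(inverted.get(i, "<oov>") for i in encoded)


encoded_review = [1, 4, 5, 6, 7, 2]
decoded = decode_review(encoded_review, toy_word_index)
print(decoded)
```

The offset matters whenever raw text is mapped back to indices by hand: a word's rank in the word index is not the same number that appears in the encoded reviews.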


In [None]:
import numpy as np
import tensorflow as tf
import keras
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.utils import to_categorical

from textattack.attack_recipes import PWWSRen2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import ModelWrapper


textattack: Updating TextAttack package dependencies.
textattack: Downloading NLTK required packages.


[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package omw to /root/nltk_data...
[nltk_data]   Unzipping corpora/omw.zip.
[nltk_data] Downloading package universal_tagset to /root/nltk_data...
[nltk_data]   Unzipping taggers/universal_tagset.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


textattack: Downloading https://textattack.s3.amazonaws.com/word_embeddings/paragramcf.
100%|██████████| 481M/481M [00:07<00:00, 60.7MB/s]
textattack: Unzipping file /root/.cache/textattack/tmplesf9kyn.zip to /root/.cache/textattack/word_embeddings/paragramcf.
textattack: Successfully saved word_embeddings/paragramcf to cache.


Below, we load the IMDB dataset from TensorFlow and transform it into a Bag-of-Words format for our classifier.
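As a small illustration of the Bag-of-Words idea, the sketch below counts word indices for two toy reviews. It is a vectorized variant built on `np.bincount`, not the exact loop used in this notebook. The helper name `bag_of_words` and the toy data are assumptions made for the example.

```python
import numpy as np


def bag_of_words(token_ids, vocab_size):
    """Count how often each in-vocabulary index occurs in one review."""
    ids = np.asarray(token_ids, dtype=np.int64)
    ids = ids[ids < vocab_size]  # drop indices outside the vocabulary
    return np.bincount(ids, minlength=vocab_size).astype(np.float64)


# Two toy "reviews" given as lists of word indices (hypothetical data).
toy_reviews = [[1, 4, 4, 7], [1, 2, 9, 12]]
features = np.stack([bag_of_words(r, vocab_size=10) for r in toy_reviews])
print(features.shape)
```

Each row has one slot per vocabulary index, so word order is discarded and only the counts remain. The index 12 in the second review falls outside the 10-word vocabulary and is ignored.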

In [None]:

NUM_WORDS = 1000

(x_train_tokens, y_train), (x_test_tokens, y_test) = tf.keras.datasets.imdb.load_data(
    path="imdb.npz",
    num_words=NUM_WORDS,
    skip_top=0,
    maxlen=None,
    seed=113,
    start_char=1,
    oov_char=2,
    index_from=3
)

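def transform(x):
  # Convert each review (a list of word indices) into a vector of word counts.
  x_transform = []
  for word_indices in x:
    BoW_array = np.zeros((NUM_WORDS,))
    for index in word_indices:
      if index < len(BoW_array):
        BoW_array[index] += 1
    x_transform.append(BoW_array)
  return np.array(x_transform)


# Train on the first 90% of the training split; validate on the slice of the
# test split that starts at the same index.
index = int(0.9 * len(x_train_tokens))
x_train = transform(x_train_tokens)[:index]
x_test = transform(x_test_tokens)[index:]
y_train = np.array(y_train[:index])
y_test = np.array(y_test[index:])
# One-hot encode the labels to match the model's two output units.
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)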

vocabulary = tf.keras.datasets.imdb.get_word_index(
    path='imdb_word_index.json'
)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz




Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json


With our data successfully loaded, we can now design and train our model.

In [None]:
# Model created with Keras
model = Sequential()
model.add(Dense(512, activation='relu', input_dim=NUM_WORDS))
model.add(Dropout(0.3))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='sigmoid'))
opt = keras.optimizers.Adam(learning_rate=0.00001)

model.compile(
    optimizer=opt,
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

results = model.fit(
    x_train, y_train,
    epochs=18,
    batch_size=512,
    validation_data=(x_test, y_test)
)

print(results.history)


Epoch 1/18
Epoch 2/18
Epoch 3/18
Epoch 4/18
Epoch 5/18
Epoch 6/18
Epoch 7/18
Epoch 8/18
Epoch 9/18
Epoch 10/18
Epoch 11/18
Epoch 12/18
Epoch 13/18
Epoch 14/18
Epoch 15/18
Epoch 16/18
Epoch 17/18
Epoch 18/18
{'loss': [1.168331265449524, 0.9187908172607422, 0.8815110325813293, 0.8546828031539917, 0.8347113728523254, 0.8194622993469238, 0.7904958128929138, 0.7830212712287903, 0.7709481120109558, 0.7497506737709045, 0.7426773309707642, 0.7212800979614258, 0.7126153707504272, 0.7101855874061584, 0.6956013441085815, 0.6835911273956299, 0.6701887845993042, 0.6605261564254761], 'accuracy': [0.503155529499054, 0.5118666887283325, 0.522933304309845, 0.5370222330093384, 0.5508888959884644, 0.5589333176612854, 0.579022228717804, 0.5864889025688171, 0.5924000144004822, 0.607866644859314, 0.6161333322525024, 0.6287111043930054, 0.638177752494812, 0.6425777673721313, 0.6537333130836487, 0.6633777618408203, 0.6739555597305298, 0.6809333562850952], 'val_loss': [0.7790982723236084, 0.7202138900756836, 0

## Attacking

With our model trained, we can create a `ModelWrapper` that lets us run TextAttack on a custom Keras model. Each `ModelWrapper` must implement a single method, `__call__`, which takes a list of strings and returns a `List`, `np.ndarray`, or `torch.Tensor` of predictions.
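Before wiring up a real model, it can help to sanity-check the shape of what `__call__` returns: one row per input string, with one score per class. The sketch below is a dependency-free illustration of that contract. `check_wrapper_output` and `ConstantWrapper` are hypothetical names, not part of TextAttack, and the stand-in class deliberately does not inherit from `ModelWrapper` so the example runs without the library installed.

```python
def check_wrapper_output(wrapper, texts, num_classes):
    """Call a wrapper-like object and verify one score row per input text."""
    predictions = wrapper(texts)
    rows = [list(row) for row in predictions]  # works for lists and arrays
    if len(rows) != len(texts):
        raise ValueError(f"expected {len(texts)} rows, got {len(rows)}")
    for row in rows:
        if len(row) != num_classes:
            raise ValueError(f"expected {num_classes} scores, got {len(row)}")
    return rows


class ConstantWrapper:
    """Hypothetical stand-in: predicts the same scores for every input."""

    def __call__(self, text_input_list):
        return [[0.5, 0.5] for _ in text_input_list]


rows = check_wrapper_output(ConstantWrapper(), ["good film", "bad film"], num_classes=2)
print(rows)
```

The same helper could be pointed at any object with a compatible `__call__`, since it only relies on iterating over the returned rows.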

In [None]:
class CustomKerasModelWrapper(ModelWrapper):
    # load_data() shifts every word index by index_from=3, while
    # get_word_index() returns unshifted indices, so add the offset back.
    INDEX_FROM = 3

    def __init__(self, model):
        self.model = model

    def __call__(self, text_input_list):
        x_transform = []
        for review in text_input_list:
            # Split on whitespace and strip commas attached to words.
            tokens = [x.strip(",") for x in review.split()]
            BoW_array = np.zeros((NUM_WORDS,))
            for word in tokens:
                if word in vocabulary:
                    index = vocabulary[word] + self.INDEX_FROM
                    if index < len(BoW_array):
                        BoW_array[index] += 1
            x_transform.append(BoW_array)
        x_transform = np.array(x_transform)
        prediction = self.model.predict(x_transform)
        return prediction


CustomKerasModelWrapper(model)(["bad bad bad bad bad", "good good good good"])

array([[0.51587796, 0.50465024],
       [0.49660537, 0.4791763 ]], dtype=float32)
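Note that the rows printed above do not sum to 1. The final layer uses two independent sigmoid units, so each column is a separate score rather than one half of a probability distribution. If you want to read such scores as class probabilities, one common option is to pass each row through a softmax. The sketch below is a plain-Python illustration using the values printed above; the `softmax` helper is our own and is not taken from Keras or TextAttack.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into values that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Raw model outputs copied from the cell output above.
raw_outputs = [[0.51587796, 0.50465024], [0.49660537, 0.4791763]]
normalized = [softmax(row) for row in raw_outputs]
for row in normalized:
    print([round(p, 4) for p in row])
```

Softmax preserves the ordering within each row, so the predicted class is unchanged; only the scale of the scores differs.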

With our `ModelWrapper` constructed, we can use TextAttack's `HuggingFaceDataset` class to load reviews for testing, alongside the `PWWSRen2019` recipe to build our attack.

The attack below is run through TextAttack's `Attacker` class, which applies an `Attack` to an entire dataset.
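To give a feel for what a word-substitution attack does, here is a deliberately simplified sketch of a greedy synonym swap. It is not PWWS and not TextAttack's implementation: the real recipe ranks words by a weighted saliency score and draws candidates from WordNet, whereas this toy version walks left to right with a hand-written synonym table. The names `toy_positive_score`, `TOY_SYNONYMS`, and `greedy_word_swap` are all hypothetical.

```python
def toy_positive_score(text):
    """Hypothetical stand-in classifier: fraction of words that are 'positive'."""
    positive_words = {"good", "great", "fresh"}
    words = text.split()
    return sum(w in positive_words for w in words) / max(len(words), 1)


# Hand-written substitution candidates (illustrative only).
TOY_SYNONYMS = {"good": ["fine", "decent"], "great": ["large"], "fresh": ["new"]}


def greedy_word_swap(text, score_fn, synonyms, threshold=0.5):
    """Swap words one at a time, keeping each swap that lowers the score most.

    Returns (text, success), where success means the score fell below threshold.
    """
    if score_fn(text) < threshold:
        return text, False  # already classified negative: nothing to attack
    words = text.split()
    for position in range(len(words)):
        best_words, best_score = words, score_fn(" ".join(words))
        for candidate in synonyms.get(words[position], []):
            trial = words[:position] + [candidate] + words[position + 1:]
            trial_score = score_fn(" ".join(trial))
            if trial_score < best_score:
                best_words, best_score = trial, trial_score
        words = best_words
        if best_score < threshold:
            return " ".join(words), True
    return " ".join(words), False


adversarial, success = greedy_word_swap("good great film", toy_positive_score, TOY_SYNONYMS)
print(adversarial, success)
```

The attacker only calls the scoring function and never inspects the model's internals, which is the black-box setting.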


In [None]:
from textattack import AttackArgs
from textattack.datasets import Dataset
from textattack import Attacker

model_wrapper = CustomKerasModelWrapper(model)
dataset = HuggingFaceDataset("rotten_tomatoes", None, "test", shuffle=True)

attack = PWWSRen2019.build(model_wrapper)

attack_args = AttackArgs(num_examples=10, checkpoint_dir="checkpoints")

attacker = Attacker(attack, dataset, attack_args)

attacker.attack_dataset()


Using custom data configuration default



Downloading and preparing dataset rotten_tomatoes_movie_review/default (download: 476.34 KiB, generated: 1.28 MiB, post-processed: Unknown size, total: 1.75 MiB) to /root/.cache/huggingface/datasets/rotten_tomatoes_movie_review/default/1.0.0/9198dbc50858df8bdb0d5f18ccaf33125800af96ad8434bc8b829918c987ee8a...



textattack: Loading datasets dataset rotten_tomatoes, split test.


Dataset rotten_tomatoes_movie_review downloaded and prepared to /root/.cache/huggingface/datasets/rotten_tomatoes_movie_review/default/1.0.0/9198dbc50858df8bdb0d5f18ccaf33125800af96ad8434bc8b829918c987ee8a. Subsequent calls will reuse this data.


textattack: Unknown if model of class <class 'tensorflow.python.keras.engine.sequential.Sequential'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
textattack: Attempting to attack 10 samples when only 1066 are available.
  0%|          | 0/10 [00:00<?, ?it/s]

Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  weighted-saliency
  )
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapWordNet
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
) 



[Succeeded / Failed / Total] 0 / 2 / 3:  30%|███       | 3/10 [00:01<00:02,  2.61it/s]

--------------------------------------------- Result 1 ---------------------------------------------
Positive (51%) --> [FAILED]

lovingly photographed in the manner of a golden book sprung to life , stuart little 2 manages sweetness largely without stickiness .


--------------------------------------------- Result 2 ---------------------------------------------
Positive (50%) --> [FAILED]

consistently clever and suspenseful .


--------------------------------------------- Result 3 ---------------------------------------------
Negative (51%) --> [SKIPPED]

it's like a " big chill " reunion of the baader-meinhof gang , only these guys are more harmless pranksters than political activists .




[Succeeded / Failed / Total] 0 / 2 / 6:  60%|██████    | 6/10 [00:01<00:00,  4.82it/s]

--------------------------------------------- Result 4 ---------------------------------------------
Negative (50%) --> [SKIPPED]

the story gives ample opportunity for large-scale action and suspense , which director shekhar kapur supplies with tremendous skill .


--------------------------------------------- Result 5 ---------------------------------------------
Negative (50%) --> [SKIPPED]

red dragon " never cuts corners .


--------------------------------------------- Result 6 ---------------------------------------------
Negative (50%) --> [SKIPPED]

fresnadillo has something serious to say about the ways in which extravagant chance can distort our perspective and throw us off the path of good sense .




[Succeeded / Failed / Total] 1 / 2 / 8:  80%|████████  | 8/10 [00:02<00:00,  3.92it/s]

--------------------------------------------- Result 7 ---------------------------------------------
Positive (51%) --> Negative (50%)

throws in enough clever and unexpected twists to make the formula [[feel]] fresh .

throws in enough clever and unexpected twists to make the formula [[sense]] fresh .


--------------------------------------------- Result 8 ---------------------------------------------
Negative (51%) --> [SKIPPED]

weighty and ponderous but every bit as filling as the treat of the title .




[Succeeded / Failed / Total] 3 / 2 / 10: 100%|██████████| 10/10 [00:02<00:00,  3.68it/s]
textattack: Saving checkpoint under "checkpoints/1615479466961.ta.chkpt" at 2021-03-11 16:17:46 after 10 attacks.


--------------------------------------------- Result 9 ---------------------------------------------
Positive (50%) --> Negative (50%)

a real audience-pleaser that will strike a chord with anyone who's ever waited in a doctor's office , emergency room , hospital bed or insurance company [[office]] .

a real audience-pleaser that will strike a chord with anyone who's ever waited in a doctor's office , emergency room , hospital bed or insurance company [[situation]] .


--------------------------------------------- Result 10 ---------------------------------------------
Positive (51%) --> Negative (50%)

generates an enormous [[feeling]] of empathy for its characters .

generates an enormous [[sense]] of empathy for its characters .







[Succeeded / Failed / Total] 3 / 2 / 10: 100%|██████████| 10/10 [00:02<00:00,  3.67it/s]


+-------------------------------+-------+
| Attack Results                |       |
+-------------------------------+-------+
| Number of successful attacks: | 3     |
| Number of failed attacks:     | 2     |
| Number of skipped attacks:    | 5     |
| Original accuracy:            | 50.0% |
| Accuracy under attack:        | 20.0% |
| Attack success rate:          | 60.0% |
| Average perturbed word %:     | 7.6%  |
| Average num. words per input: | 15.4  |
| Avg num queries:              | 126.6 |
+-------------------------------+-------+
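The percentages in the table follow from the three counts at the top. The sketch below reproduces them, under the assumption that a skipped sample is one the model already misclassified before any attack and that a failed attack leaves a correct prediction intact. The function name `summarize_attack` is hypothetical and is not part of TextAttack.

```python
def summarize_attack(succeeded, failed, skipped):
    """Derive the headline percentages from the three result counts."""
    total = succeeded + failed + skipped
    attempted = succeeded + failed  # samples the model got right originally
    return {
        # Correct before the attack: every sample that was not skipped.
        "original_accuracy": 100.0 * attempted / total,
        # Still correct after the attack: only the failed attacks.
        "accuracy_under_attack": 100.0 * failed / total,
        # Share of attempted attacks that flipped the prediction.
        "attack_success_rate": 100.0 * succeeded / attempted if attempted else 0.0,
    }


summary = summarize_attack(succeeded=3, failed=2, skipped=5)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

With 3 successes, 2 failures, and 5 skips, this gives 50.0%, 20.0%, and 60.0%, matching the table.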





[<textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fa715caec90>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7fa715446d50>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fa715523690>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fa774dfba10>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fa7746094d0>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fa770ea6690>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fa77665b8d0>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7fa7749fa650>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fa7155236d0>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7fa71563a790>]

## Conclusion

Great! We trained a binary classifier, created a custom `ModelWrapper` for Keras models, and successfully ran adversarial attacks against our trained Keras model. This serves as a basic demo of how to use TextAttack within your own environment.
