# TextAttack on Keras Model

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_3_Keras.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_3_Keras.ipynb)

## Training

The code below trains a basic neural network on movie reviews from the IMDB dataset, loaded using TensorFlow's datasets module. Each review is encoded as a sequence of integer tokens, where each token is a word's index in the vocabulary. Each review also carries a class label marking its sentiment as positive or negative.

See [here](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb/load_data) for more information on the IMDB dataset. 
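To make the token encoding concrete, the sketch below decodes a token sequence back into words. It assumes the `load_data` defaults (index 0 for padding, 1 for the start marker, 2 for out-of-vocabulary words, and real words shifted by `index_from=3`). The `decode` helper and the toy vocabulary are hypothetical; in practice the word-to-rank mapping would come from `tf.keras.datasets.imdb.get_word_index()`.

```python
INDEX_FROM = 3  # load_data default: real words start at rank + 3
SPECIAL_TOKENS = {0: "<pad>", 1: "<start>", 2: "<oov>"}  # reserved indices


def decode(tokens, word_index):
    """Map token indices back to words (hypothetical helper, for illustration)."""
    index_to_word = {rank + INDEX_FROM: word for word, rank in word_index.items()}
    index_to_word.update(SPECIAL_TOKENS)
    # Any index without an entry is shown as out-of-vocabulary.
    return " ".join(index_to_word.get(token, "<oov>") for token in tokens)


# Toy word -> rank mapping with made-up ranks, not the real IMDB vocabulary.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

decoded = decode([1, 4, 5, 6, 2], toy_word_index)
print(decoded)  # <start> the movie was <oov>
```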


In [None]:
import os

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import keras
import torch
from keras import layers, models
from keras.layers import Dense, Dropout, Flatten
from keras.models import Sequential
from keras.utils import to_categorical

import textattack
from textattack.attack_recipes import PWWSRen2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import ModelWrapper

textattack: Downloading https://textattack.s3.amazonaws.com/word_embeddings/paragramcf.
100%|██████████| 481M/481M [00:16<00:00, 29.8MB/s]
textattack: Unzipping file /root/.cache/textattack/tmpdmzgnt8b.zip to /root/.cache/textattack/word_embeddings/paragramcf.
textattack: Successfully saved word_embeddings/paragramcf to cache.


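Below, we load the IMDB dataset from TensorFlow and transform it into a bag-of-words representation for our classifier: each review becomes a vector of length `NUM_WORDS` whose entry at position `i` counts how often token `i` occurs in the review.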

In [None]:

NUM_WORDS = 10000

(x_train_tokens, y_train), (x_test_tokens, y_test) = tf.keras.datasets.imdb.load_data(
    path="imdb.npz",
    num_words=NUM_WORDS,
    skip_top=0,
    maxlen=None,
    seed=113,
    start_char=1,
    oov_char=2,
    index_from=3
)


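def transform(x):
    """Convert token-index sequences into bag-of-words count vectors."""
    x_transform = []
    for word_indices in x:
        BoW_array = np.zeros((NUM_WORDS,))
        for index in word_indices:
            if index < len(BoW_array):
                BoW_array[index] += 1
        x_transform.append(BoW_array)
    return np.array(x_transform)


# Train on the first 90% of the training split and validate on the last 10%
# of the test split (both IMDB splits contain 25,000 reviews).
index = int(0.9 * len(x_train_tokens))
x_train = transform(x_train_tokens)[:index]
x_test = transform(x_test_tokens)[index:]
y_train = np.array(y_train[:index])
y_test = np.array(y_test[index:])
# One-hot encode the labels to match the two-unit output layer.
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)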

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
(22500, 10000) (22500, 2) (2500, 10000) (2500, 2)
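As a small illustration of the bag-of-words encoding, the sketch below applies the same idea to two toy sequences. It is an alternative formulation using `np.bincount` rather than the notebook's loop, and the function name `transform_bincount` is hypothetical. Indices at or above `num_words` are dropped, as they are in `transform`.

```python
import numpy as np


def transform_bincount(sequences, num_words):
    """Bag-of-words counts per sequence; indices >= num_words are ignored."""
    rows = []
    for seq in sequences:
        seq = np.asarray(seq, dtype=np.int64)  # also handles empty sequences
        # bincount may return more than num_words bins, so trim the tail.
        rows.append(np.bincount(seq, minlength=num_words)[:num_words])
    return np.array(rows, dtype=float)


toy_sequences = [[1, 4, 4, 2], [1, 3, 9]]
bow = transform_bincount(toy_sequences, num_words=6)
print(bow)
```

In the first row, token 4 appears twice, so position 4 holds a count of 2. In the second row, token 9 falls outside the six-word vocabulary and is not counted.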


With our data successfully loaded, we can now design and train our model.

In [None]:

# Feed-forward classifier built with Keras
model = Sequential()
model.add(Dense(512, activation='relu', input_dim=NUM_WORDS))
model.add(Dropout(0.3))
model.add(Dense(100, activation='relu'))
# Two sigmoid outputs, one per class, to match the one-hot labels
model.add(Dense(2, activation='sigmoid'))

opt = keras.optimizers.Adam(learning_rate=0.00001)

model.compile(
    optimizer=opt,
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Validate on the held-out slice of the test split after every epoch
results = model.fit(
    x_train, y_train,
    epochs=18,
    batch_size=512,
    validation_data=(x_test, y_test),
)

print(results.history)


<bound method Model.summary of <tensorflow.python.keras.engine.sequential.Sequential object at 0x7fa0458fee10>>
Epoch 1/18
Epoch 2/18
Epoch 3/18
Epoch 4/18
Epoch 5/18
Epoch 6/18
Epoch 7/18
Epoch 8/18
Epoch 9/18
Epoch 10/18
Epoch 11/18
Epoch 12/18
Epoch 13/18
Epoch 14/18
Epoch 15/18
Epoch 16/18
Epoch 17/18
Epoch 18/18
{'loss': [0.6982908248901367, 0.6719189882278442, 0.6552321314811707, 0.6339828372001648, 0.6116981506347656, 0.5892458558082581, 0.5651863813400269, 0.5425485968589783, 0.5191691517829895, 0.49679040908813477, 0.47490108013153076, 0.4527125358581543, 0.4317401349544525, 0.4114888310432434, 0.39577117562294006, 0.381359338760376, 0.3658391833305359, 0.3557420074939728], 'accuracy': [0.5377333164215088, 0.6135555505752563, 0.6563555598258972, 0.6990666389465332, 0.731333315372467, 0.7592889070510864, 0.7805333137512207, 0.7969777584075928, 0.8121333122253418, 0.8241778016090393, 0.8349778056144714, 0.8455111384391785, 0.8550222516059875, 0.8615111112594604, 0.86848890781402

## Attacking

With our model trained, we can create a `ModelWrapper` that allows us to run TextAttack on a custom Keras model. Each `ModelWrapper` must implement a single method, `__call__`, which takes a list of strings and returns a `List`, `np.ndarray`, or `torch.Tensor` of predictions, with one entry of class scores per input string.

In [None]:


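class CustomKerasModelWrapper(ModelWrapper):
    def __init__(self, model):
        self.model = model

    def __call__(self, text_input_list):
        # `words2tokens` is not defined in the cells shown here. It is assumed to
        # map one review string to a bag-of-words array of shape (1, NUM_WORDS).
        text_array = np.array([words2tokens(text_input) for text_input in text_input_list])
        prediction = self.model.predict(text_array)
        # Under that assumption each prediction[i] has shape (1, 2); take row 0
        # to return one [negative, positive] score pair per input.
        preds = [list(prediction[i][0]) for i in range(len(prediction))]
        return preds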


CustomKerasModelWrapper(model)(["the movie was awful", "the movie was awesome"])


[[0.55557764, 0.44437635], [0.4965529, 0.50741225]]
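The wrapper above calls `words2tokens`, which does not appear in the code shown. The following is a minimal sketch of what such a helper might look like, not the notebook's actual implementation. It assumes whitespace tokenization, the `load_data` conventions used during training (`start_char=1`, `oov_char=2`, `index_from=3`), and a word-to-rank mapping such as the one returned by `tf.keras.datasets.imdb.get_word_index()`. The factory `make_words2tokens` and the toy vocabulary are hypothetical.

```python
import numpy as np

NUM_WORDS = 10000  # same vocabulary size as in training
START_INDEX = 1    # start_char passed to load_data
OOV_INDEX = 2      # oov_char passed to load_data
INDEX_FROM = 3     # index_from passed to load_data


def make_words2tokens(word_index, num_words=NUM_WORDS):
    """Build a hypothetical words2tokens(text) helper from a word -> rank dict."""

    def words2tokens(text):
        # Shape (1, num_words), so a batch of n texts stacks to (n, 1, num_words).
        bow = np.zeros((1, num_words))
        bow[0, START_INDEX] += 1  # every encoded review begins with the start token
        for word in text.lower().split():
            rank = word_index.get(word)
            index = rank + INDEX_FROM if rank is not None else OOV_INDEX
            if index >= num_words:
                index = OOV_INDEX  # words beyond the vocabulary cap count as OOV
            bow[0, index] += 1
        return bow

    return words2tokens


# Toy vocabulary with made-up ranks; assumption: the real mapping would come
# from tf.keras.datasets.imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 17, "was": 13, "awful": 370}
words2tokens = make_words2tokens(toy_word_index)
example_bow = words2tokens("the movie was awesome")
print(example_bow.shape)  # (1, 10000)
```

Returning a `(1, NUM_WORDS)` array rather than a flat vector matches the wrapper's `prediction[i][0]` indexing. A flat vector would also work if the wrapper returned `prediction` directly.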

With our `ModelWrapper` constructed, we can use TextAttack's `HuggingFaceDataset` class to load reviews for testing, and TextAttack's `PWWSRen2019` recipe to build the attack.

The attack below leverages TextAttack's `Attack` class, capable of running attacks against entire datasets. 
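Each result of an untargeted attack on a classifier falls into one of three groups: skipped (the model already misclassifies the original text), succeeded (the perturbed text is no longer classified as the true label), or failed. The sketch below is a simplified illustration of that bookkeeping, not TextAttack's own code. The function name `attack_outcome` and the example scores are hypothetical.

```python
import numpy as np


def attack_outcome(true_label, original_probs, perturbed_probs=None):
    """Classify one attack attempt as 'skipped', 'succeeded', or 'failed'."""
    # If the model is already wrong on the original text, there is nothing to attack.
    if int(np.argmax(original_probs)) != true_label:
        return "skipped"
    # Untargeted goal: any prediction other than the true label counts as success.
    if perturbed_probs is not None and int(np.argmax(perturbed_probs)) != true_label:
        return "succeeded"
    return "failed"


print(attack_outcome(0, [0.45, 0.55]))                # skipped
print(attack_outcome(0, [0.51, 0.49], [0.49, 0.51]))  # succeeded
print(attack_outcome(0, [0.51, 0.49], [0.60, 0.40]))  # failed
```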


In [None]:
model_wrapper = CustomKerasModelWrapper(model)
dataset = HuggingFaceDataset("rotten_tomatoes", None, "test", shuffle=True)

attack = PWWSRen2019.build(model_wrapper)

results_iterable = attack.attack_dataset(dataset, indices=range(10))
for result in results_iterable:
  print()
  print()
  print(result.__str__(color_method='ansi'))

Using custom data configuration default
Reusing dataset rotten_tomatoes_movie_review (/root/.cache/huggingface/datasets/rotten_tomatoes_movie_review/default/1.0.0/9198dbc50858df8bdb0d5f18ccaf33125800af96ad8434bc8b829918c987ee8a)
textattack: Loading datasets dataset rotten_tomatoes, split test.
textattack: Unknown if model of class <class 'tensorflow.python.keras.engine.sequential.Sequential'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.


Positive (50%) --> [SKIPPED]

movies like high crimes flog the dead horse of surprise as if it were an obligation . how about surprising us by trying something new ?


Positive (53%) --> [SKIPPED]

in a 102-minute film , aaliyah gets at most 20 minutes of screen time . . . . most viewers will wish there had been more of the " queen " and less of the " damned . "


Negative (51%) --> Positive (50%)

more likely to have you scratching your head than hiding under your seat .

more probably to have you cancel your psyche than hiding under your seat .


Negative (50%) --> Positive (51%)

slow , silly and unintentionally hilarious .

easy , silly and unintentionally hilarious .


Positive (50%) --> Negative (52%)

by its modest , straight-ahead standards , undisputed scores a direct hit .

by its modest , straight-ahead s

## Conclusion

Great! We trained a binary classifier, created a custom `ModelWrapper` for Keras models, and successfully ran adversarial attacks against our trained Keras model. This serves as a basic demo of how to use TextAttack within your own environment.
