# TextAttack End-to-End
An end-to-end overview of training, evaluating, and attacking a model using [TextAttack](https://textattack.readthedocs.io/en/master/).

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!pip install textattack[tensorflow,optional]

Collecting tensorflow>=2.9.1 (from textattack[optional,tensorflow])
  Using cached tensorflow-2.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.2 kB)
Collecting tensorboard<2.18,>=2.17 (from tensorflow>=2.9.1->textattack[optional,tensorflow])
  Downloading tensorboard-2.17.1-py3-none-any.whl.metadata (1.6 kB)
Collecting keras>=3.2.0 (from tensorflow>=2.9.1->textattack[optional,tensorflow])
  Downloading keras-3.6.0-py3-none-any.whl.metadata (5.8 kB)
Using cached tensorflow-2.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (601.3 MB)
Downloading keras-3.6.0-py3-none-any.whl (1.2 MB)
Downloading tensorboard-2.17.1-py3-none-any.whl (5.5 MB)
Installing collected packages: tensorboard, keras, tensorflow
  

In [None]:
!pip install tensorflow==2.12

Collecting tensorflow==2.12
  Using cached tensorflow-2.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting keras<2.13,>=2.12.0 (from tensorflow==2.12)
  Using cached keras-2.12.0-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting tensorboard<2.13,>=2.12 (from tensorflow==2.12)
  Using cached tensorboard-2.12.3-py3-none-any.whl.metadata (1.8 kB)
Using cached tensorflow-2.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (585.9 MB)
Using cached keras-2.12.0-py2.py3-none-any.whl (1.7 MB)
Using cached tensorboard-2.12.3-py3-none-any.whl (5.6 MB)
Installing collected packages: keras, tensorboard, tensorflow
  Attempting uninstall: keras
    Found existing installation: keras 3.6.0
    Uninstalling keras-3.6.0:
      Successfully uninstalled keras-3.6.0
  Attempting uninstall: tensorboard
    Found existing installation: tensorboard 2.17.1
    Uninstalling tensorboard-2.17.1:
      Successfully uninstalled tensorboard-2.17.1
  Attempting

## Training
TextAttack comes with its own fine-tuned models for several datasets. You can list them with the command below.

In [None]:
!textattack list models

textattack: Updating TextAttack package dependencies.
textattack: Downloading NLTK required packages.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package omw to /root/nltk_data...
[nltk_data] Downloading package universal_tagset to /root/nltk_data...
[nltk_data]   Unzipping taggers/universal_tagset.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.9.0.json: 392kB [00:00, 154MB/s]         
2024-10-09 12:18:25 INFO: Downloaded file to /root/stanza_resources/resources.json
2024-10-09 12:18:25 INFO: Downloadin

You can use these models as they are by referring to them by name. However, for the purpose of this practical, we are going to train our own model.

TextAttack integrates directly with [transformers](https://github.com/huggingface/transformers/) and [datasets](https://github.com/huggingface/datasets) to train any of the `transformers` pre-trained models on datasets from `datasets`.

Let's use the Rotten Tomatoes Movie Review dataset: it's relatively short, and showcases the key features of `textattack train`. Let's take a look at the dataset using `textattack peek-dataset`:

In [None]:
!textattack peek-dataset --dataset-from-huggingface rotten_tomatoes

2024-10-09 12:19:03.372547: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:19:03.424446: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:19:03.425024: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
README.md: 100% 7.46k/7.46k [00:00<00:00, 34.0MB/s]
train.parquet: 100% 699k/699k [00:00<00:00, 85.7MB/s]
validation.parquet: 100% 90.0k/90.0k [00:00<00:00, 217MB/s]
test.parquet: 100% 92.2k/92.2k [00:00<00:00, 182MB/s]
Generating train split: 100% 8530/8530 [00:00<00:00, 114942.71 examples/s]
Generating validation split: 100% 1066/1066 [00:00<00:00, 336149.77 examples/s]
Generating test split: 100% 1066/

The dataset looks good! It's lowercased already, so we'll make sure our model is uncased. The longest input is 51 words, so we can cap our maximum sequence length (`--model-max-length`) at 64.
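The reasoning above (longest input of 51 words, so a cap of 64) can be sketched in a few lines of Python. This is only an illustration: the helper names, the sample reviews, and the choice to round up to a power of two with a little headroom for special tokens are our own assumptions, not something TextAttack requires. Note also that a subword tokenizer can produce more tokens than there are words, in which case longer inputs would simply be truncated.

```python
def longest_input_in_words(texts):
    """Return the largest whitespace-separated word count among the texts."""
    return max(len(text.split()) for text in texts)


def suggest_max_length(longest, headroom=2, minimum=8):
    """Smallest power of two (>= minimum) that fits `longest + headroom`."""
    length = minimum
    while length < longest + headroom:
        length *= 2
    return length


# Made-up sample reviews, only to exercise the helpers.
samples = [
    "a charming and often affecting journey",
    "the plot is thin but the cast is a delight to watch",
]
print(longest_input_in_words(samples))  # 12
print(suggest_max_length(51))           # 64
```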

We'll train [`distilbert-base-uncased`](https://huggingface.co/transformers/model_doc/distilbert.html), since it's a relatively small model, and a good example of how we integrate with `transformers`.

So we have our command:

```bash
# Train a model with TextAttack: distilbert (uncased) from `transformers`,
# on the Rotten Tomatoes dataset, which has 2 labels, with a maximum
# sequence length of 64, a batch size of 128, for 3 epochs.
textattack train                                 \
    --model-name-or-path distilbert-base-uncased \
    --dataset rotten_tomatoes                    \
    --model-num-labels 2                         \
    --model-max-length 64                        \
    --per-device-train-batch-size 128            \
    --num-epochs 3
```
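Shell line continuations cannot carry trailing comments, so if you want to keep a note next to each flag, one option is to assemble the command in Python. This is only a sketch: the flag names are the ones passed to `textattack train` in this notebook's run cell, while the `flags` list and the variable names are our own choices for illustration.

```python
import shlex

# Each entry pairs a flag with its value; the comments explain the choice.
flags = [
    ("--model-name-or-path", "distilbert-base-uncased"),  # uncased DistilBERT from `transformers`
    ("--dataset", "rotten_tomatoes"),                     # Rotten Tomatoes dataset
    ("--model-num-labels", "2"),                          # two labels
    ("--model-max-length", "64"),                         # maximum sequence length
    ("--per-device-train-batch-size", "128"),             # batch size
    ("--num-epochs", "3"),                                # number of epochs
]

args = ["textattack", "train"]
for flag, value in flags:
    args += [flag, value]

command = shlex.join(args)  # shlex.join requires Python 3.8+
print(command)

# To launch it from Python instead of a notebook `!` cell, one could use:
# import subprocess; subprocess.run(args, check=True)
```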

Now let's run it (please remember to use a GPU if you have access to one):

In [None]:
!textattack train --model-name-or-path distilbert-base-uncased --dataset rotten_tomatoes --model-num-labels 2 --model-max-length 64 --per-device-train-batch-size 128 --num-epochs 3

2024-10-09 12:19:41.009290: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:19:41.058650: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:19:41.059203: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
textattack: Loading transformers AutoModelForSequenceClassification: distilbert-base-uncased
config.json: 100% 483/483 [00:00<00:00, 3.01MB/s]
model.safetensors: 100% 268M/268M [00:00<00:00, 305MB/s]
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pr

## Evaluation

We successfully fine-tuned `distilbert-base-uncased` for 3 epochs. Now let's evaluate it using `textattack eval`. This is as simple as passing the path to the fine-tuned model (which you just obtained by running the command above!) to `--model`, along with the number of evaluation examples (`--num-examples`). `textattack eval` will automatically load the evaluation data from training; in the command below we also specify the dataset and split explicitly:

In [None]:
!textattack eval --num-examples 1000 --model ./outputs/2024-09-30-08-37-16-508338/best_model/ --dataset-from-huggingface rotten_tomatoes --dataset-split test

2024-10-09 12:23:03.491584: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:23:03.543114: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:23:03.543650: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/usr/local/bin/textattack", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/textattack/commands/textattack_cli.py", line 49, in main
    func.run(args)
  File "/usr/local/lib/python3.10/dist-packages/textattack/commands/eval_model_command.py", line 103, in run
    self.test_model_on_dataset(args)
  File "/usr/local/lib/pyt

Awesome! We were able to train a model to 84.4% accuracy on the test dataset with only a single command!
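The `--model` path used above contains a timestamp that belongs to one particular training run, so it has to match the directory that your own run produced. As a rough sketch, the helper below picks the most recent `best_model` directory under an outputs folder. The function name is hypothetical, and it assumes run directories are named by timestamp (as in the path above) so that sorting by name is also chronological; it will not do the right thing if you pass a custom `--output-dir`.

```python
import tempfile
from pathlib import Path


def latest_best_model(outputs_dir="./outputs"):
    """Return the `best_model` directory of the most recent run, or None.

    Assumes run directories are named by timestamp, so that sorting the
    names alphabetically also sorts them chronologically.
    """
    candidates = [p for p in Path(outputs_dir).glob("*/best_model") if p.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.parent.name)


# Demonstration on a throwaway directory with two made-up run names.
with tempfile.TemporaryDirectory() as tmp:
    for run_name in ("2024-01-01-00-00-00-000000", "2024-01-02-00-00-00-000000"):
        (Path(tmp) / run_name / "best_model").mkdir(parents=True)
    chosen = latest_best_model(tmp)
    chosen_run = chosen.parent.name

print(chosen_run)  # 2024-01-02-00-00-00-000000
```

In the notebook you would call `latest_best_model()` with no arguments and paste the printed path into the `--model` argument.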

## Attack

Finally, let's attack our fine-tuned model. We can do this the same way as before (by providing the path to the fine-tuned model to `--model`). For our attack, let's use the "TextFooler" attack recipe, from the paper ["Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment" (Jin et al., 2019)](https://arxiv.org/abs/1907.11932). We can do this by passing `--recipe textfooler` to `textattack attack`.

> *Warning*: We're printing out 100 examples and, if the attack succeeds, their perturbations. The output of this command is going to be quite long!


In [None]:
!textattack attack --recipe textfooler --num-examples 100 --model ./outputs/2024-09-30-08-37-16-508338/best_model/ --dataset-from-huggingface rotten_tomatoes --dataset-split test

2024-10-09 12:24:06.553104: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:24:06.636873: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-10-09 12:24:06.637542: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
textattack: Loading datasets dataset rotten_tomatoes, split test.
Traceback (most recent call last):
  File "/usr/local/bin/textattack", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/textattack/commands/textattack_cli.py", line 49, in main
    func.run(args)
  File "/usr/local/lib/python3.10/dist-packages/textattack/commands/

Looks like our model was 84% accurate on these 100 examples (makes sense: they come from the same evaluation set as `textattack eval`!), meaning that TextAttack attacked the model with 84 examples (the attack is skipped if an example is originally mispredicted). The attack success rate was 98.8%, meaning that TextFooler failed to find an adversarial example only 1.2% of the time (1 out of 84).
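To make the arithmetic behind these figures explicit, here is a small worked example. The three input counts are the ones quoted in the paragraph above; the variable names are our own.

```python
num_examples = 100   # --num-examples passed to `textattack attack`
num_attacked = 84    # correctly classified, so actually attacked
num_failed = 1       # attacks that found no adversarial example

num_skipped = num_examples - num_attacked      # originally mispredicted
num_successful = num_attacked - num_failed

original_accuracy = num_attacked / num_examples
attack_success_rate = num_successful / num_attacked
attack_failure_rate = num_failed / num_attacked

print(f"original accuracy:   {original_accuracy:.1%}")    # 84.0%
print(f"attack success rate: {attack_success_rate:.1%}")  # 98.8%
print(f"attack failure rate: {attack_failure_rate:.1%}")  # 1.2%
```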



# **TO DO: Robust models**

Now that we have trained our model and seen that it is vulnerable to adversarial attacks, your next task is to improve its robustness. We can do so by training the model on adversarial examples rather than only on clean ones.

1. To do so, update the training command from above to instruct TextAttack to use adversarial data generated with TextFooler (or any other attack) during training.

To complete the task, you can use the [TextAttack documentation](https://textattack.readthedocs.io/en/latest/0_get_started/basic-Intro.html) and the command-line `--help` option to adversarially train a model on the same dataset.

***Hint***: Take a look at the [Trainer class in the API](https://textattack.readthedocs.io/en/master/api/trainer.html) user guide and the [Making Vanilla Adversarial Training of NLP Models Feasible!](https://textattack.readthedocs.io/en/master/1start/A2TforVanillaAT.html) page.

2. If the solution you found is expected to take more than 15 minutes to train, look again at the documentation and adapt your parameters so that training takes 5 to 10 minutes, due to the time restrictions for this class.

3. Evaluate whether the robustness of the model has improved by attacking the newly trained model with TextFooler.
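When adapting the parameters for step 2, it may help to picture how the epoch settings interact. The sketch below reflects our understanding of the schedule described in the Trainer documentation, and we have not verified it against the installed version: the first `--num-clean-epochs` epochs use clean data only, and afterwards a fresh set of adversarial examples is generated at a fixed epoch interval. The function name and the `attack_epoch_interval` parameter (with a default of 1) are assumptions made for illustration.

```python
def epoch_schedule(num_epochs, num_clean_epochs, attack_epoch_interval=1):
    """List what (we assume) happens at the start of each training epoch."""
    schedule = []
    for epoch in range(1, num_epochs + 1):
        if epoch <= num_clean_epochs:
            schedule.append((epoch, "clean"))
        elif (epoch - num_clean_epochs - 1) % attack_epoch_interval == 0:
            schedule.append((epoch, "generate adversarial examples"))
        else:
            schedule.append((epoch, "no new adversarial examples"))
    return schedule


# Example: 3 epochs in total, the first of which is clean.
for epoch, phase in epoch_schedule(num_epochs=3, num_clean_epochs=1):
    print(epoch, phase)
```

Since generating adversarial examples means running the attack against the current model, the number of adversarial epochs and the number of adversarial examples per epoch are the natural knobs to reduce when training takes too long.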

In [None]:
!textattack train --help

usage: [python -m] texattack <command> [<args>] train [-h]
                                                      --model-name-or-path
                                                      MODEL_NAME_OR_PATH
                                                      [--model-max-length MODEL_MAX_LENGTH]
                                                      [--model-num-labels MODEL_NUM_LABELS]
                                                      [--attack ATTACK]
                                                      [--task-type TASK_TYPE]
                                                      --dataset DATASET
                                                      [--dataset-train-split DATASET_TRAIN_SPLIT]
                                                      [--dataset-eval-split DATASET_EVAL_SPLIT]
                                                      [--filter-train-by-labels FILTER_TRAIN_BY_LABELS [FILTER_TRAIN_BY_LABELS ...]]
                                                      [--fil

In [None]:
!textattack train \
    --model-name-or-path distilbert-base-uncased \
    --model-max-length 64 \
    --model-num-labels 2 \
    --attack textfooler \
    --task-type classification \
    --dataset rotten_tomatoes \
    --num-epochs 3 \
    --num-clean-epochs 1 \
    --learning-rate 5e-05 \
    --per-device-train-batch-size 8 \
    --per-device-eval-batch-size 32 \
    --gradient-accumulation-steps 1 \
    --random-seed 786 \
    --output-dir ./outputs/distilbert_rotten_tomatoes/ \
    --num-train-adv-examples 1000 \
    --save-last \
    --log-to-tb \
    --tb-log-dir ./logs/

textattack: Loading transformers AutoModelForSequenceClassification: distilbert-base-uncased
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
textattack: Loading datasets dataset rotten_tomatoes, split train.
textattack: Loading datasets dataset rotten_tomatoes, split validation.
textattack: Unknown if model of class <class 'transformers.models.distilbert.modeling_distilbert.DistilBertForSequenceClassification'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
textattack: Writing logs to ./outputs/