<a href="https://colab.research.google.com/github/rajsegar/TensorFlow-and-TextAttack/blob/main/TextAttack.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Installation

To use TextAttack, you must be running Python 3.6 or above. A CUDA-compatible GPU is optional but will greatly improve speed.
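Before installing, it can help to confirm that the interpreter meets the stated requirement. The snippet below is a minimal sketch using only the standard library; the helper name `check_environment` is ours, not part of TextAttack, and it only reports whether the `textattack` and `torch` packages can be found, without importing them.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the text above


def check_environment(min_python=MIN_PYTHON):
    """Return a dict describing whether the interpreter meets the stated requirements."""
    return {
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec returns None when a top-level package is not installed
        "textattack_installed": importlib.util.find_spec("textattack") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


report = check_environment()
print(report)
```

If `torch_installed` is true, `torch.cuda.is_available()` is one way to check whether a CUDA GPU is visible to the session.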

We recommend installing TextAttack in a virtual environment (check out this guide: https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/).

There are two ways to install TextAttack. If you simply want to use it as is, install it via pip. If you want to make changes and experiment, install it from source.

Install with pip

In [7]:
pip install textattack[tensorflow]

Collecting textattack[tensorflow]
  Downloading textattack-0.3.10-py3-none-any.whl.metadata (38 kB)
Collecting bert-score>=0.3.5 (from textattack[tensorflow])
  Downloading bert_score-0.3.13-py3-none-any.whl.metadata (15 kB)
Collecting flair (from textattack[tensorflow])
  Downloading flair-0.15.1-py3-none-any.whl.metadata (12 kB)
Collecting language-tool-python (from textattack[tensorflow])
  Downloading language_tool_python-2.9.4-py3-none-any.whl.metadata (55 kB)
Collecting lemminflect (from textattack[tensorflow])
  Downloading lemminflect-0.2.3-py3-none-any.whl.metadata (7.0 kB)
Collecting lru-dict (from textattack[tensorflow])
  Downloading lru_dict-1.3.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Collecting transformers>=4.30.0 (from textattack[tensorflow])
  Downloading transformers-

#Training

First, we’re going to train a model. TextAttack integrates directly with transformers and datasets to train any of the transformers pre-trained models on datasets from datasets.

Let’s use the Rotten Tomatoes Movie Review dataset: it’s relatively small, and showcases the key features of textattack train. Let’s take a look at the dataset using textattack peek-dataset:

In [8]:
!textattack peek-dataset --dataset-from-huggingface rotten_tomatoes

textattack: Updating TextAttack package dependencies.
textattack: Downloading NLTK required packages.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package omw to /root/nltk_data...
[nltk_data] Downloading package universal_tagset to /root/nltk_data...
[nltk_data]   Unzipping taggers/universal_tagset.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
2025-08-09 14:16:47.129542: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754749007.449881    1550 cuda_

The dataset looks good! It’s lowercased already, so we’ll make sure our model is uncased. The longest input is 51 words, so we can cap our maximum sequence length (--model-max-length) at 64.
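The arithmetic behind that choice can be sketched as follows. This is an illustration under assumptions: the three reviews are made-up stand-ins rather than rows from the dataset, and rounding up to the next power of two is our reading of why 64 follows from 51.

```python
def longest_input_in_words(texts):
    """Length, in whitespace-separated words, of the longest text."""
    return max(len(text.split()) for text in texts)


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power


# Hypothetical stand-in reviews; the real dataset is not loaded here.
sample_reviews = [
    "a gripping and well acted drama",
    "dull",
    "the plot never quite comes together despite a strong cast",
]

longest = longest_input_in_words(sample_reviews)
print(longest, next_power_of_two(longest))  # 10 16
print(next_power_of_two(51))                # 64
```

Note that a subword tokenizer can produce more tokens than there are words, and the model's special tokens also count toward the limit, so a word count is only a rough guide; leaving headroom above 51 is the safer choice.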

We’ll train [distilbert-base-uncased](https://huggingface.co/transformers/model_doc/distilbert.html), since it’s a relatively small model and a good example of how we integrate with transformers.

So we have our command:



textattack train                                  \ # Train a model with TextAttack
    --model-name-or-path distilbert-base-uncased  \ # Using distilbert, uncased version, from `transformers`
    --dataset rotten_tomatoes                     \ # On the Rotten Tomatoes dataset
    --model-num-labels 2                          \ # That has 2 labels
    --model-max-length 64                         \ # With a maximum sequence length of 64
    --per-device-train-batch-size 128             \ # And batch size of 128
    --num-epochs 3                                  # For 3 epochs
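If you would rather assemble the command from Python than type it by hand, here is a minimal sketch. The flag names and values mirror the one-line invocation that is run later in this section; the helper `build_train_command` is a hypothetical name of ours, not a TextAttack function.

```python
import shlex

# Flags and values mirror the `textattack train` invocation run in this notebook.
train_options = {
    "--model-name-or-path": "distilbert-base-uncased",
    "--dataset": "rotten_tomatoes",
    "--model-num-labels": 2,
    "--model-max-length": 64,
    "--per-device-train-batch-size": 128,
    "--num-epochs": 3,
}


def build_train_command(options):
    """Flatten an options dict into an argument list for `textattack train`."""
    args = ["textattack", "train"]
    for flag, value in options.items():
        args.extend([flag, str(value)])
    return args


command = build_train_command(train_options)
print(shlex.join(command))
# To actually run it (requires TextAttack to be installed):
# import subprocess; subprocess.run(command, check=True)
```

Keeping the options in a dict makes it easy to change one value, such as the number of epochs, without retyping the whole line.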


Now let’s run it (please remember to use GPU if you have access):

In [9]:
pip install transformers==4.28.1

Collecting transformers==4.28.1
  Using cached transformers-4.28.1-py3-none-any.whl.metadata (109 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers==4.28.1)
  Using cached tokenizers-0.13.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Using cached transformers-4.28.1-py3-none-any.whl (7.0 MB)
Using cached tokenizers-0.13.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Installing collected packages: tokenizers, transformers
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.21.4
    Uninstalling tokenizers-0.21.4:
      Successfully uninstalled tokenizers-0.21.4
  Attempting uninstall: transformers
    Found existing installation: transformers 4.55.0
    Uninstalling transformers-4.55.0:
      Successfully uninstalled transformers-4.55.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the fol

In [10]:
!textattack train --model-name-or-path distilbert-base-uncased --dataset rotten_tomatoes --model-num-labels 2 --model-max-length 64 --per-device-train-batch-size 128 --num-epochs 3

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]0it [00:00, ?it/s]
2025-08-09 14:17:29.432285: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754749049.459581    1811 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1754749049.467549    1811 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1754749049.487740    1811 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more th

#Evaluation

We successfully fine-tuned distilbert-base-uncased for 3 epochs. Now let’s evaluate it using textattack eval. This is as simple as providing the path to the trained model (which you just obtained by running the command above) to --model, along with the number of evaluation samples. textattack eval will automatically load the evaluation data from training; in the command below we also name the dataset and split explicitly:

In [13]:
!textattack eval --num-examples 1000 --model ./outputs/2025-08-09-14-17-38-742853/best_model/ --dataset-from-huggingface rotten_tomatoes --dataset-split test

2025-08-09 16:31:03.155202: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754757063.189398   33760 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1754757063.197681   33760 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1754757063.220254   33760 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757063.220316   33760 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757063.220321   33760 computation_placer.cc:177] computation placer alr
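The eval command above points at a timestamped run directory under ./outputs, which changes on every training run. A small helper can locate the most recent one instead of copying the path by hand. This is a sketch under assumptions: it relies on the run directories having sortable timestamp names with a best_model subfolder, as in the path above, and `latest_best_model` is our own name. The demonstration uses a throwaway directory, and the older run name in it is made up.

```python
import tempfile
from pathlib import Path


def latest_best_model(outputs_dir):
    """Return the best_model directory of the most recent run, or None if there is none.

    Assumes run directories are named with sortable timestamps, as in the
    path used by the eval command above.
    """
    candidates = sorted(
        run / "best_model"
        for run in Path(outputs_dir).iterdir()
        if (run / "best_model").is_dir()
    )
    return candidates[-1] if candidates else None


# Demonstration on a throwaway directory; the first run name is made up.
with tempfile.TemporaryDirectory() as tmp:
    for run_name in ["2025-08-08-09-00-00-000000", "2025-08-09-14-17-38-742853"]:
        (Path(tmp) / run_name / "best_model").mkdir(parents=True)
    newest = latest_best_model(tmp)
    newest_run_name = newest.parent.name
    print(newest_run_name)
```

In the notebook you would call `latest_best_model("./outputs")` and pass the result to --model.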

#Attack

Finally, let’s attack our fine-tuned model. We can do this the same way as before (by providing the path to the trained model to --model). For our attack, let’s use the “TextFooler” attack recipe, from the paper “Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment” (Jin et al., 2019). We can do this by passing --recipe textfooler to textattack attack.
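To give a feel for what a word-substitution attack does, here is a toy sketch in the spirit of TextFooler: rank words by how much deleting them hurts the prediction, then greedily swap the most important ones for synonyms until the label flips. It is not the TextAttack recipe. The keyword classifier, the word lists, and the synonym table are all hypothetical, and the sketch leaves out the part-of-speech and semantic-similarity checks that the paper uses to keep perturbations natural.

```python
POSITIVE = {"good", "great", "wonderful"}
NEGATIVE = {"bad", "awful", "boring"}

# Hypothetical synonym table; TextFooler derives candidates from word embeddings.
SYNONYMS = {
    "great": ["fine", "big"],
    "wonderful": ["odd", "curious"],
    "good": ["decent"],
}


def score(words):
    """Toy sentiment score: positive words minus negative words."""
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def predict(words):
    """Toy classifier: positive only when the score is above zero."""
    return "positive" if score(words) > 0 else "negative"


def greedy_word_swap_attack(text):
    """Try to flip the toy classifier's prediction by swapping words for synonyms."""
    words = text.split()
    original_label = predict(words)
    sign = 1 if original_label == "positive" else -1

    def confidence(ws):
        # How strongly the toy model still supports the original label.
        return sign * score(ws)

    # 1. Rank positions by how much deleting the word reduces confidence.
    base = confidence(words)
    ranked = sorted(
        range(len(words)),
        key=lambda i: base - confidence(words[:i] + words[i + 1:]),
        reverse=True,
    )

    # 2. Visit words in that order and keep the most damaging synonym swap.
    for i in ranked:
        best = words
        for candidate in SYNONYMS.get(words[i], []):
            trial = words[:i] + [candidate] + words[i + 1:]
            if confidence(trial) < confidence(best):
                best = trial
        words = best
        if predict(words) != original_label:
            return " ".join(words)  # attack succeeded
    return None  # attack failed


adversarial = greedy_word_swap_attack("a great and wonderful film")
print(adversarial)  # a fine and odd film
```

The real recipe works against the model's output probabilities rather than a keyword count, but the loop has the same shape: find the words the model leans on, then replace them one at a time.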

Warning: We’re printing out 100 examples and, if the attack succeeds, their perturbations. The output of this command is going to be quite long!

In [15]:
!textattack attack --recipe textfooler --num-examples 100 --model ./outputs/2025-08-09-14-17-38-742853/best_model/ --dataset-from-huggingface rotten_tomatoes --dataset-split test

2025-08-09 16:34:37.215295: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754757277.274585   34640 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1754757277.292506   34640 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1754757277.338895   34640 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757277.338985   34640 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757277.338991   34640 computation_placer.cc:177] computation placer alr

#Conclusion
We learned how to train, evaluate, and attack a model with TextAttack, using only three commands! 😀

#Bonus

In [16]:
!textattack attack --model lstm-mr --recipe hotflip --num-examples 4 --num-examples-offset 3 --enable-advance-metrics

2025-08-09 16:38:01.228636: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754757481.254726   35474 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1754757481.262226   35474 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1754757481.281866   35474 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757481.281912   35474 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1754757481.281916   35474 computation_placer.cc:177] computation placer alr

#Gradio: Build Machine Learning Web Apps — in Python

In [18]:
pip install -q gradio numpy pandas plotly

In [23]:
import json

# Load special tokens
with open("/content/outputs/2025-08-09-14-17-38-742853/best_model/special_tokens_map.json", "r") as f:
    special_tokens = json.load(f)

print("Special Tokens:", special_tokens)


Special Tokens: {'cls_token': '[CLS]', 'mask_token': '[MASK]', 'pad_token': '[PAD]', 'sep_token': '[SEP]', 'unk_token': '[UNK]'}


In [24]:
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # Replace with your target model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [26]:
def analyze_text(user_input):
    # Tokenize input
    tokens = tokenizer.tokenize(user_input)

    # Highlight special tokens
    highlighted = [
        f"**{tok}**" if tok in special_tokens.values() else tok
        for tok in tokens
    ]

    # Here you’d integrate PoisonScope’s detection logic
    # For now, just a placeholder
    detection_result = {
        "bias_score": 0.12,
        "hallucination_risk": "Low",
        "hidden_intent_detected": False
    }

    return " ".join(highlighted), detection_result
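The highlighting step can be tried on its own, without downloading a tokenizer. The sketch below is a standalone version of it: the mapping is the one printed from special_tokens_map.json earlier in this notebook, the token list is hand-written rather than real tokenizer output, and `highlight_special_tokens` is a name we introduce for illustration.

```python
# Same mapping as the special_tokens_map.json printed earlier in this notebook.
special_tokens = {
    "cls_token": "[CLS]",
    "mask_token": "[MASK]",
    "pad_token": "[PAD]",
    "sep_token": "[SEP]",
    "unk_token": "[UNK]",
}


def highlight_special_tokens(tokens, special_tokens):
    """Wrap every special token in ** ** and join the tokens with spaces."""
    special = set(special_tokens.values())  # set lookup instead of scanning values
    return " ".join(f"**{tok}**" if tok in special else tok for tok in tokens)


# Hand-written token list standing in for tokenizer output.
tokens = ["[CLS]", "the", "movie", "was", "[MASK]", "[SEP]"]
highlighted = highlight_special_tokens(tokens, special_tokens)
print(highlighted)  # **[CLS]** the movie was **[MASK]** **[SEP]**
```

As far as we can tell, `tokenizer.tokenize` does not add [CLS] or [SEP] on its own, so in `analyze_text` the highlighting will mostly fire when the user types a special token directly into the prompt.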


In [27]:
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("# 🕵️ PoisonScope: LLM Backdoor & Bias Detection")

    with gr.Row():
        text_in = gr.Textbox(label="Enter prompt")
        token_out = gr.Textbox(label="Tokenized Output")
        result_out = gr.JSON(label="Detection Results")

    btn = gr.Button("Analyze")
    btn.click(fn=analyze_text, inputs=text_in, outputs=[token_out, result_out])

demo.launch()


It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://9873257b22c499839a.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


