# The TextAttack ecosystem: search, transformations, and constraints

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/1_Introduction_and_Transformations.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/1_Introduction_and_Transformations.ipynb)

# Installation of Attack-api branch

An attack in TextAttack consists of four parts.

### Goal function

The **goal function** determines whether the attack has succeeded. One common goal function is **untargeted classification**, where the attack tries to perturb an input so that the model's predicted class changes.
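
As a rough illustration of the untargeted idea (a standalone sketch, not TextAttack's implementation; the function name and the score lists are made up for this example), success can be checked by comparing the top-scoring class with the true label:

```python
def untargeted_goal_reached(class_scores, ground_truth_label):
    """Return True when the top-scoring class differs from the true label."""
    predicted_label = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return predicted_label != ground_truth_label


# Hypothetical score vectors over four classes; the true label is class 2.
print(untargeted_goal_reached([0.05, 0.10, 0.80, 0.05], 2))  # prediction is still class 2
print(untargeted_goal_reached([0.60, 0.10, 0.25, 0.05], 2))  # prediction moved to class 0
```

Any wrong class counts as a success here; a targeted goal would instead require one specific class.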

### Search method
The **search method** explores the space of potential transformations and tries to locate a successful perturbation. Greedy search, beam search, and brute-force search are all examples of search methods.
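
To make the greedy idea concrete, here is a toy sketch in plain Python. It is not TextAttack's `GreedySearch`; the keyword-based "model", the 0.3 threshold, and all names are assumptions chosen for illustration. At each step it tries every single-word replacement and keeps the one that lowers the score the most:

```python
SPORTS_KEYWORDS = {"team", "match", "goal"}  # hypothetical toy "model"


def sports_score(words):
    """Toy stand-in for a model's confidence in the 'Sports' class."""
    return sum(w in SPORTS_KEYWORDS for w in words) / len(words)


def greedy_word_search(words, replacement, score_fn, goal_fn):
    """Apply the best single-word swap repeatedly until goal_fn is satisfied."""
    current = list(words)
    modified = set()
    while not goal_fn(current):
        best = None  # (score, index, candidate)
        for i in range(len(current)):
            if i in modified:
                continue  # never modify the same position twice
            candidate = current[:i] + [replacement] + current[i + 1:]
            score = score_fn(candidate)
            if best is None or score < best[0]:
                best = (score, i, candidate)
        if best is None:
            return None  # no positions left to modify: the search failed
        _, index, current = best
        modified.add(index)
    return current


result = greedy_word_search(
    "the team won the match".split(),
    replacement="banana",
    score_fn=sports_score,
    goal_fn=lambda words: sports_score(words) < 0.3,
)
print(result)
```

Beam search would keep several of the best candidates at each step instead of one, and brute-force search would try every combination.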

### Transformation
A **transformation** takes a text input and transforms it, for example replacing words or phrases with similar ones, while trying not to change the meaning. Paraphrase and synonym substitution are two broad classes of transformations.

### Constraints
Finally, **constraints** determine whether or not a given transformation is valid. Transformations don't perfectly preserve syntax or semantics, so additional constraints can increase the probability that these qualities are preserved from the source text to the adversarial example. There are many types of constraints: overlap constraints that measure edit distance, syntactic constraints that check part-of-speech tags and grammar errors, and semantic constraints such as language models and sentence encoders.
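
As one example of an overlap constraint, the sketch below computes character-level Levenshtein edit distance and rejects perturbations that exceed a budget. This is a standalone illustration, not TextAttack's own constraint class; the function names and the `max_edits` parameter are hypothetical.

```python
def levenshtein(a, b):
    """Character-level edit distance using the classic dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def within_edit_budget(original, perturbed, max_edits):
    """A perturbation is valid only if it stays within max_edits edits."""
    return levenshtein(original, perturbed) <= max_edits


print(levenshtein("kitten", "sitting"))
print(within_edit_budget("kitten", "sitting", max_edits=2))
```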

### A custom transformation

This lesson explains how to create a custom transformation. In TextAttack, many transformations involve *word swaps*: they take a word and try to find suitable substitutes. Some attacks focus on replacing characters with neighboring characters to create "typos" (these don't intend to preserve the grammaticality of inputs). Other attacks rely on semantics: they take a word and try to replace it with semantic equivalents.
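
For a feel of the "typo" style of word swap, here is a small sketch that generates candidates by swapping two adjacent characters. This is only one possible reading of character-level swaps, and the function name is made up for this example rather than taken from TextAttack.

```python
def adjacent_swap_typos(word):
    """Return every variant of `word` with one pair of adjacent characters swapped."""
    typos = []
    for i in range(len(word) - 1):
        if word[i] != word[i + 1]:  # swapping identical characters changes nothing
            typos.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    return typos


print(adjacent_swap_typos("word"))
```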


### Banana word swap 

As an introduction to writing transformations for TextAttack, we're going to try a very simple transformation: one that replaces any given word with the word 'banana'. In TextAttack, there's an abstract `WordSwap` class that handles the heavy lifting of breaking sentences into words and avoiding replacement of stopwords. We can extend `WordSwap` and implement a single method, `_get_replacement_words`, so that every word is replaced with 'banana'. 🍌

In [None]:
from textattack.transformations import WordSwap

class BananaWordSwap(WordSwap):
    """ Transforms an input by replacing any word with 'banana'.
    """
    
    # We don't need a constructor, since our class doesn't require any parameters.

    def _get_replacement_words(self, word):
        """ Returns 'banana', no matter what 'word' was originally.
        
            Returns a list with one item, since `_get_replacement_words` is intended to
                return a list of candidate replacement words.
        """
        return ['banana']
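
To see why overriding one method is enough, here is a heavily simplified stand-in for the base class. `ToyWordSwap` and its `candidates` method are hypothetical and only mimic the pattern; the real `WordSwap` works on TextAttack's own text objects and does more.

```python
class ToyWordSwap:
    """Simplified stand-in: the base class loops over words, subclasses pick replacements."""

    def _get_replacement_words(self, word):
        raise NotImplementedError

    def candidates(self, text, indices_to_modify):
        """Build one candidate text per (index, replacement word) pair."""
        words = text.split()
        results = []
        for i in indices_to_modify:
            for replacement in self._get_replacement_words(words[i]):
                if replacement != words[i]:
                    results.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
        return results


class ToyBananaSwap(ToyWordSwap):
    def _get_replacement_words(self, word):
        return ["banana"]


for candidate in ToyBananaSwap().candidates("Exxon Mobil hires a new CEO", [0, 2]):
    print(candidate)
```

Each candidate differs from the input in exactly one word; it is the search method's job to decide which candidates to keep and build on.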

### Using our transformation

Now we have the transformation chosen, but we're missing a few other things. To complete the attack, we need to choose the **search method** and **constraints**. And to use the attack, we need a **goal function**, a **model** and a **dataset**. (The goal function indicates the task our model performs – in this case, classification – and the type of attack – in this case, we'll perform an untargeted attack.)

### Creating the goal function, model, and dataset
We are performing an untargeted attack on a classification model, so we'll use the `UntargetedClassification` class. For the model, let's use BERT trained for news classification on the AG News dataset. We've pretrained several models and uploaded them to the [HuggingFace Model Hub](https://huggingface.co/textattack). TextAttack integrates with any model from HuggingFace's Model Hub and any dataset from HuggingFace's `datasets`!

In [None]:
# Import the model
import transformers
from textattack.models.tokenizers import AutoTokenizer
from textattack.models.wrappers import HuggingFaceModelWrapper

model = transformers.AutoModelForSequenceClassification.from_pretrained("textattack/bert-base-uncased-ag-news")
tokenizer = AutoTokenizer("textattack/bert-base-uncased-ag-news")

model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Create the goal function using the model
from textattack.goal_functions import UntargetedClassification
goal_function = UntargetedClassification(model_wrapper)

# Import the dataset
from textattack.datasets import HuggingFaceDataset
dataset = HuggingFaceDataset("ag_news", None, "test")

textattack: Unknown if model of class <class 'transformers.models.bert.modeling_bert.BertForSequenceClassification'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
Using custom data configuration default
Reusing dataset ag_news (/root/.cache/huggingface/datasets/ag_news/default/0.0.0/fb5c5e74a110037311ef5e904583ce9f8b9fbc1354290f97b4929f01b3f48b1a)
textattack: Loading datasets dataset ag_news, split test.


### Creating the attack
Let's keep it simple: we'll use a greedy search method and only two pre-transformation constraints, which stop the attack from modifying the same word twice and from modifying stopwords.

In [None]:
from textattack.search_methods import GreedySearch
from textattack.constraints.pre_transformation import RepeatModification, StopwordModification
from textattack import Attack

# We're going to use our Banana word swap class as the attack transformation.
transformation = BananaWordSwap() 
# We'll constrain modification of already modified indices and stopwords
constraints = [RepeatModification(),
               StopwordModification()]
# We'll use the Greedy search method
search_method = GreedySearch()
# Now, let's make the attack from the 4 components:
attack = Attack(goal_function, constraints, transformation, search_method)
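
The effect of the two pre-transformation constraints can be pictured with a small standalone sketch. The tiny stopword set and the function name are assumptions for illustration; TextAttack ships its own stopword list and constraint classes.

```python
TOY_STOPWORDS = {"the", "a", "of", "to", "in"}  # hypothetical, much smaller than a real list


def modifiable_indices(words, already_modified):
    """Indices a transformation may still touch: not yet modified and not a stopword."""
    return [
        i for i, word in enumerate(words)
        if i not in already_modified and word.lower() not in TOY_STOPWORDS
    ]


words = "The price of oil rose".split()
print(modifiable_indices(words, already_modified={1}))
```

Filtering indices before the transformation runs keeps the search from wasting model queries on candidates that would be rejected anyway.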

Let's print our attack to see all the parameters:

In [None]:
print(attack)

Attack(
  (search_method): GreedySearch
  (goal_function):  UntargetedClassification
  (transformation):  BananaWordSwap
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
)


In [None]:
print(dataset)

<textattack.datasets.huggingface_dataset.HuggingFaceDataset object at 0x7fd70f43b1d0>


### Using the attack

Let's run our attack on 10 samples from the dataset.

In [None]:
from textattack.loggers import CSVLogger # tracks a dataframe for us.
from textattack.attack_results import SuccessfulAttackResult
from textattack import Attacker
from textattack import AttackArgs
from textattack.datasets import Dataset

attack_args = AttackArgs(num_examples=10, checkpoint_interval=10, checkpoint_dir="checkpoints")

attacker = Attacker(attack, dataset, attack_args)

# attack_dataset() returns the list of attack results
attack_results = attacker.attack_dataset()

# Log only the successful attacks so we can display them later
logger = CSVLogger(color_method='html')
for result in attack_results:
    if isinstance(result, SuccessfulAttackResult):
        logger.log_attack_result(result)


  0%|          | 0/10 [00:00<?, ?it/s]

Attack(
  (search_method): GreedySearch
  (goal_function):  UntargetedClassification
  (transformation):  BananaWordSwap
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
)

[Succeeded / Failed / Total] 1 / 0 / 1:  10%|█         | 1/10 [01:13<11:02, 73.60s/it]

--------------------------------------------- Result 1 ---------------------------------------------
[[Business (100%)]] --> [[World (89%)]]

Fears for T N [[pension]] after [[talks]] [[Unions]] representing [[workers]] at Turner   Newall say they are '[[disappointed']] after talks with stricken parent firm Federal [[Mogul]].

Fears for T N [[banana]] after [[banana]] [[banana]] representing [[banana]] at Turner   Newall say they are '[[banana]] after talks with stricken parent firm Federal [[banana]].

[Succeeded / Failed / Total] 2 / 0 / 2:  20%|██        | 2/10 [10:18<41:13, 309.22s/it]

--------------------------------------------- Result 2 ---------------------------------------------
[[Sci/tech (100%)]] --> [[World (64%)]]

The Race is On: Second Private [[Team]] Sets Launch [[Date]] for [[Human]] [[Spaceflight]] ([[SPACE]].[[com]]) [[SPACE]].[[com]] - [[TORONTO]], [[Canada]] -- [[A]] [[second]]\[[team]] of rocketeers [[competing]] for the  #36;10 million Ansari X [[Prize]], a [[contest]] for\[[privately]] funded [[suborbital]] [[space]] [[flight]], has officially [[announced]] the first\[[launch]] date for its [[manned]] rocket.

The Race is On: Second Private [[banana]] Sets Launch [[banana]] for [[banana]] [[banana]] ([[banana]].[[banana]]) [[banana]].[[banana]] - [[banana]], [[banana]] -- [[banana]] [[banana]]\[[banana]] of rocketeers [[banana]] for the  #36;10 million A

[Succeeded / Failed / Total] 3 / 0 / 3:  30%|███       | 3/10 [14:03<32:48, 281.26s/it]

--------------------------------------------- Result 3 ---------------------------------------------
[[Sci/tech (100%)]] --> [[Business (77%)]]

Ky. Company Wins Grant to [[Study]] [[Peptides]] ([[AP]]) [[AP]] - [[A]] company [[founded]] by a [[chemistry]] [[researcher]] at the [[University]] of Louisville won a grant to develop a method of producing better [[peptides]], which are short chains of [[amino]] [[acids]], the building blocks of [[proteins]].

Ky. Company Wins Grant to [[banana]] [[banana]] ([[banana]]) [[banana]] - [[banana]] company [[banana]] by a [[banana]] [[banana]] at the [[banana]] of Louisville won a grant to develop a method of producing better [[banana]], which are short chains of [[banana]] [[banana]], the building blocks of [[banana]].

[Succeeded / Failed / Total] 4 / 0 / 4:  40%|████      | 4/10 [15:49<23:44, 237.46s/it]

--------------------------------------------- Result 4 ---------------------------------------------
[[Sci/tech (100%)]] --> [[World (65%)]]

[[Prediction]] Unit Helps [[Forecast]] Wildfires (AP) [[AP]] - It's barely dawn when Mike Fitzpatrick [[starts]] his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar.

[[banana]] Unit Helps [[banana]] Wildfires (AP) [[banana]] - It's barely dawn when Mike Fitzpatrick [[banana]] his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar.

[Succeeded / Failed / Total] 5 / 0 / 5:  50%|█████     | 5/10 [17:08<17:08, 205.64s/it]

--------------------------------------------- Result 5 ---------------------------------------------
[[Sci/tech (100%)]] --> [[World (62%)]]

Calif. Aims to Limit Farm-Related [[Smog]] (AP) AP - Southern California's [[smog-fighting]] agency went after [[emissions]] of the [[bovine]] variety Friday, adopting the nation's first rules to reduce air pollution from dairy cow manure.

Calif. Aims to Limit Farm-Related [[banana]] (AP) AP - Southern California's [[banana]] agency went after [[banana]] of the [[banana]] variety Friday, adopting the nation's first rules to reduce air pollution from dairy cow manure.

KeyboardInterrupt: ignored

### Visualizing attack results

We log `AttackResult` objects with a `CSVLogger`. The logger stores all attack results in a dataframe, which we can easily access and display. Since we set `color_method` to `'html'`, the attack results show their differences in color when rendered as HTML. Using `IPython` utilities and `pandas`, we can display that dataframe as an HTML table.

In [None]:
import pandas as pd
pd.options.display.max_colwidth = 480 # increase column width so we can actually read the examples

from IPython.display import display, HTML
display(HTML(logger.df[['original_text', 'perturbed_text']].to_html(escape=False)))

### Conclusion
We can examine these examples to get a good idea of how many words had to be changed to "banana" to move the prediction from the correct class to another class. The examples without perturbed words were originally misclassified, so they were skipped by the attack. It looks like some examples needed only a couple of "banana"s, while others needed up to 17 "banana" substitutions to change the predicted class. Wow! 🍌
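
Counting the substitutions by hand gets tedious, so here is a small sketch of a helper that does it. The function is hypothetical and assumes plain strings in which a word swap never changes the number of whitespace-separated words; the sample strings are a fragment of the first result above with the highlighting removed.

```python
def count_changed_words(original_text, perturbed_text):
    """Count positions where the whitespace-separated words differ."""
    original_words = original_text.split()
    perturbed_words = perturbed_text.split()
    # Word swaps replace words in place, so positions line up one to one.
    return sum(1 for o, p in zip(original_words, perturbed_words) if o != p)


print(count_changed_words(
    "Fears for T N pension after talks",
    "Fears for T N banana after banana",
))
```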

### Bonus: Attacking Custom Samples

We can also attack custom data samples, like these ones I just made up!

In [None]:
# For AG News, labels are 0: World, 1: Sports, 2: Business, 3: Sci/Tech

custom_dataset = [
    ('Malaria deaths in Africa fall by 5% from last year', 0),
    ('Washington Nationals defeat the Houston Astros to win the World Series', 1),
    ('Exxon Mobil hires a new CEO', 2),
    ('Microsoft invests $1 billion in OpenAI', 3),
]

attack_args = AttackArgs(num_examples=4)

dataset = Dataset(custom_dataset)

attacker = Attacker(attack, dataset, attack_args)

# attack_dataset() returns the list of attack results
results_iterable = attacker.attack_dataset()

logger = CSVLogger(color_method='html')

for result in results_iterable:
    logger.log_attack_result(result)

display(HTML(logger.df[['original_text', 'perturbed_text']].to_html(escape=False)))


  0%|          | 0/4 [00:00<?, ?it/s]

Attack(
  (search_method): GreedySearch
  (goal_function):  UntargetedClassification
  (transformation):  BananaWordSwap
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
) 



AttributeError: ignored