# The importance of constraints

Constraints determine which potential adversarial examples are valid inputs to the model. When determining the efficacy of an attack, constraints are everything. After all, an attack that looks very powerful may just be generating nonsense. Or, perhaps more nefariously, an attack may generate a real-looking example that changes the correct label of the input, in which case the model's new prediction is not actually a mistake. That's why you should always clearly define the *constraints* your adversarial examples must meet.

### Classes of constraints

TextAttack evaluates constraints using methods from three groups:

- **Overlap constraints** determine if a perturbation is valid based on character-level analysis. For example, some attacks are constrained by edit distance: a perturbation is only valid if it changes at most some small number of characters.

- **Grammaticality constraints** filter inputs based on syntactical information. For example, an attack may require that adversarial perturbations do not introduce grammatical errors.

- **Semantic constraints** try to ensure that the perturbation is semantically similar to the original input. For example, we may design a constraint that uses a sentence encoder to encode the original and perturbed inputs, and enforce that the sentence encodings be within some fixed distance of one another. (This is what happens in subclasses of `textattack.constraints.semantics.sentence_encoders`.)
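To make the overlap case concrete, here is a minimal sketch of an edit-distance check in plain Python. The function names and the `max_edits` threshold are illustrative assumptions and are not part of TextAttack's API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from `a`
                current[j - 1] + 1,      # insert a character into `a`
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


def within_edit_distance(original: str, perturbed: str, max_edits: int = 3) -> bool:
    """Hypothetical overlap constraint: valid if at most `max_edits` edits."""
    return levenshtein(original, perturbed) <= max_edits


print(within_edit_distance("great movie", "grest movie", max_edits=1))
print(within_edit_distance("great movie", "awful movie", max_edits=3))
```

The first call allows a single-character typo; the second rejects a whole-word replacement because it needs more than three character edits.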

### A new constraint

To add our own constraint, we need to create a subclass of `textattack.constraints.Constraint`. We can implement one of two functions, either `__call__` or `call_many`:

- `__call__` determines if original input `x` and perturbation `x_adv` fulfill a desired constraint. It returns either `True` or `False`.
- `call_many` determines if a list of perturbations `x_adv` fulfill the constraint from original input `x`. This is here in case your constraint can be vectorized. If not, just implement `__call__`, and `__call__` will be executed for each `(x, x_adv)` pair.
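The relationship between the two methods can be sketched roughly as follows. `SketchConstraint` is a hypothetical stand-in written for this illustration, not TextAttack's actual base class, and the inputs here are plain strings rather than TextAttack text objects.

```python
class SketchConstraint:
    """Hypothetical base class (NOT the real textattack.constraints.Constraint)."""

    def __call__(self, x, x_adv):
        raise NotImplementedError

    def call_many(self, x, x_adv_list):
        # Default behavior: run `__call__` once per (x, x_adv) pair.
        return [self(x, x_adv) for x_adv in x_adv_list]


class SameWordCount(SketchConstraint):
    """Toy constraint: the perturbation must keep the number of words."""

    def __call__(self, x, x_adv):
        return len(x.split()) == len(x_adv.split())


constraint = SameWordCount()
print(constraint.call_many("a good film", ["a great film", "a very good film"]))
```

A subclass whose check can be batched would override `call_many` directly instead of relying on the per-pair fallback.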

### A custom constraint


For fun, we're going to see what happens when we constrain an attack to only allow perturbations that substitute out a named entity for another. In linguistics, a **named entity** is a proper noun, the name of a person, organization, location, product, etc. Named Entity Recognition is a popular NLP task (and one that state-of-the-art models can perform quite well). 


### NLTK and Named Entity Recognition

**NLTK**, the Natural Language Toolkit, is a Python package that helps developers write programs that process natural language. NLTK comes with predefined algorithms for lots of linguistic tasks, including Named Entity Recognition.

First, we're going to write a constraint class. In the `__call__` method, we're going to use NLTK to find the named entities in both `x` and `x_adv`. We will only return `True` (that is, our constraint is met) if `x_adv` has substituted one named entity in `x` for another.

Let's import NLTK and download the required modules:

In [1]:
import nltk
nltk.download('punkt') # The NLTK tokenizer
nltk.download('maxent_ne_chunker') # NLTK named-entity chunker
nltk.download('words') # NLTK list of words

[nltk_data] Downloading package punkt to /u/edl9cy/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     /u/edl9cy/nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package words to /u/edl9cy/nltk_data...
[nltk_data]   Package words is already up-to-date!


True

### NLTK NER Example

Here's an example of using NLTK to find the named entities in a sentence:

In [2]:
sentence = ('In 2017, star quarterback Tom Brady led the Patriots to the Super Bowl, '
           'but lost to the Philadelphia Eagles.')

# 1. Tokenize using the NLTK tokenizer.
tokens = nltk.word_tokenize(sentence)

# 2. Tag parts of speech using the NLTK part-of-speech tagger.
tagged = nltk.pos_tag(tokens)

# 3. Extract entities from tagged sentence.
entities = nltk.chunk.ne_chunk(tagged)
print(entities)

(S
  In/IN
  2017/CD
  ,/,
  star/NN
  quarterback/NN
  (PERSON Tom/NNP Brady/NNP)
  led/VBD
  the/DT
  (ORGANIZATION Patriots/NNP)
  to/TO
  the/DT
  (ORGANIZATION Super/NNP Bowl/NNP)
  ,/,
  but/CC
  lost/VBD
  to/TO
  the/DT
  (ORGANIZATION Philadelphia/NNP Eagles/NNP)
  ./.)


It looks like `nltk.chunk.ne_chunk` gives us an `nltk.tree.Tree` object where named entities are also `nltk.tree.Tree` objects within that tree. We can take this a step further and grab the named entities from the tree of entities:

In [3]:
# 4. Filter entities to just named entities.
named_entities = [entity for entity in entities if isinstance(entity, nltk.tree.Tree)]
print(named_entities)

[Tree('PERSON', [('Tom', 'NNP'), ('Brady', 'NNP')]), Tree('ORGANIZATION', [('Patriots', 'NNP')]), Tree('ORGANIZATION', [('Super', 'NNP'), ('Bowl', 'NNP')]), Tree('ORGANIZATION', [('Philadelphia', 'NNP'), ('Eagles', 'NNP')])]


### Caching with `@functools.lru_cache`

A little-known feature of Python 3 is `functools.lru_cache`, a decorator that allows users to easily cache the results of a function in an LRU cache. We're going to be using the NLTK library quite a bit to tokenize, parse, and detect named entities in sentences. These sentences might repeat themselves. As such, we'll use this decorator to cache named entities so that we don't have to perform this expensive computation multiple times.
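As a small illustration of the decorator on its own, the sketch below caches a cheap stand-in function (`count_words` is a made-up example, not part of the tutorial) and inspects the cache statistics. Note that `lru_cache` requires hashable arguments, which is why caching on the sentence string works.

```python
import functools


@functools.lru_cache(maxsize=128)
def count_words(sentence):
    # Stand-in for an expensive computation such as named entity recognition.
    return len(sentence.split())


count_words("the food was great")   # computed and stored in the cache
count_words("the food was great")   # answered from the cache
info = count_words.cache_info()
print(info.hits, info.misses)       # prints: 1 1
```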

### Putting it all together: getting a list of Named Entity Labels from a sentence

Now that we know how to tokenize, parse, and detect named entities using NLTK, let's put it all together into a single helper function. Later, when we implement our constraint, we can query this function to easily get the entity labels from a sentence. We can even use `@functools.lru_cache` to try to speed this process up.

In [4]:
import functools

@functools.lru_cache(maxsize=2**14)
def get_entities(sentence):
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)
    # Setting `binary=True` makes NLTK label every named entity with the
    # single tag 'NE' instead of detailed tags like 'ORGANIZATION',
    # 'GPE' (geo-political entity), etc.
    entities = nltk.chunk.ne_chunk(tagged, binary=True)
    # `leaves()` flattens the tree into a list of (word, tag) tuples.
    return entities.leaves()

And let's test our function to make sure it works:

In [5]:
sentence = 'Jack Black starred in the 2003 film classic "School of Rock".'
get_entities(sentence)

[('Jack', 'NNP'),
 ('Black', 'NNP'),
 ('starred', 'VBD'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('2003', 'CD'),
 ('film', 'NN'),
 ('classic', 'JJ'),
 ('``', '``'),
 ('School', 'NNP'),
 ('of', 'IN'),
 ('Rock', 'NNP'),
 ("''", "''"),
 ('.', '.')]

We flattened the tree of entities, so the return format is a list of `(word, label)` tuples, where each label is the word's part-of-speech tag. `'NNP'` is our indicator of a named entity (a proper noun, according to NLTK). Looks like we identified three named entities here: 'Jack' and 'Black' (together), 'School', and 'Rock'. (With detailed labels rather than `binary=True`, the labeler seems to tag 'Rock' as a 'GPE', as if it were the name of a place, a city or something.) Whatever technique NLTK uses for named entity recognition may be a bit rough, but it did a pretty decent job here!
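Because the helper returns plain tuples, picking out the proper nouns is a simple filter. The sketch below hard-codes the output shown above so it runs without NLTK; `proper_nouns` is a hypothetical helper name introduced only for this example.

```python
# Output of get_entities() copied from the cell above.
tagged = [('Jack', 'NNP'), ('Black', 'NNP'), ('starred', 'VBD'), ('in', 'IN'),
          ('the', 'DT'), ('2003', 'CD'), ('film', 'NN'), ('classic', 'JJ'),
          ('``', '``'), ('School', 'NNP'), ('of', 'IN'), ('Rock', 'NNP'),
          ("''", "''"), ('.', '.')]


def proper_nouns(tagged_tokens):
    """Return the words whose tag is 'NNP' (proper noun)."""
    return [word for word, tag in tagged_tokens if tag == 'NNP']


print(proper_nouns(tagged))  # ['Jack', 'Black', 'School', 'Rock']
```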

### Creating our NamedEntityConstraint

Now that we know how to detect named entities using NLTK, let's create our custom constraint.

In [6]:
from textattack.constraints import Constraint

class NamedEntityConstraint(Constraint):
    """ A constraint that ensures `x_adv` only substitutes named entities from `x` with other named entities.
    """
    def _check_constraint(self, x, x_adv, original_text=None):
        x_entities = get_entities(x.text)
        x_adv_entities = get_entities(x_adv.text)
        # `get_entities` returns one (word, label) tuple per token. If `x` has
        # no tokens at all, there is nothing to swap, so let's return False
        # (the attack will eventually fail).
        if len(x_entities) == 0:
            return False
        if len(x_entities) != len(x_adv_entities):
            # If the two sentences have a different number of tokens, then
            # they definitely don't have the same labels. In this case, the
            # constraint is violated, and we return False.
            return False
        else:
            # Here we compare all of the words, in order, to make sure that they match.
            # If we find two words that don't match, this means a word was swapped
            # between `x` and `x_adv`. That word must be a named entity to fulfill our
            # constraint.
            for (word_1, label_1), (word_2, label_2) in zip(x_entities, x_adv_entities):
                if word_1 != word_2:
                    # Finally, make sure that words swapped between `x` and `x_adv` are
                    # named entities. If they're not, then we also return False.
                    if (label_1 not in ['NNP', 'NE']) or (label_2 not in ['NNP', 'NE']):
                        return False
            # If we get here, every swapped word was a named entity. Return True!
            return True
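To sanity-check the comparison logic without running NLTK or TextAttack, the same rule can be sketched as a standalone function over already-tagged tokens. The function name and the hand-written tags below are assumptions made for illustration; real tags would come from `get_entities`.

```python
ENTITY_LABELS = {'NNP', 'NE'}  # same labels the constraint above accepts


def only_entities_changed(tagged_x, tagged_x_adv):
    """True if every word that differs is labeled as an entity on both sides."""
    if len(tagged_x) == 0 or len(tagged_x) != len(tagged_x_adv):
        return False
    for (word_1, label_1), (word_2, label_2) in zip(tagged_x, tagged_x_adv):
        if word_1 != word_2:
            if label_1 not in ENTITY_LABELS or label_2 not in ENTITY_LABELS:
                return False
    return True


# Hand-tagged examples (assumed tags, not produced by NLTK here).
original = [('I', 'PRP'), ('love', 'VBP'), ('Red', 'NNP'), ('Lobster', 'NNP')]
entity_swap = [('I', 'PRP'), ('love', 'VBP'), ('Flushed', 'NNP'), ('Lobster', 'NNP')]
verb_swap = [('I', 'PRP'), ('adore', 'VBP'), ('Red', 'NNP'), ('Lobster', 'NNP')]

print(only_entities_changed(original, entity_swap))  # True
print(only_entities_changed(original, verb_swap))    # False
```

One consequence worth noticing: two identical sentences also pass, since no word differs.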
    

### Testing our constraint

We need to create an attack and a dataset to test our constraint on. We went over all of this in the first tutorial, so let's gloss over this part for now.

In [7]:
# Import the dataset.
from textattack.datasets.classification import YelpSentiment
# Create the model.
from textattack.models.classification.lstm import LSTMForYelpSentimentClassification
model = LSTMForYelpSentimentClassification()
# Create the goal function using the model.
from textattack.goal_functions import UntargetedClassification
goal_function = UntargetedClassification(model)

textattack: Goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'> matches model LSTMForYelpSentimentClassification.


In [8]:
from textattack.transformations import WordSwapEmbedding
from textattack.search_methods import GreedySearch
from textattack.constraints.semantics import RepeatModification, StopwordModification
from textattack.shared import Attack

# We're going to use the `WordSwapEmbedding` transformation. Using the default settings, this
# will try substituting words with their neighbors in the counter-fitted embedding space.
transformation = WordSwapEmbedding(max_candidates=15) 

# We'll use the greedy search method again
search_method = GreedySearch()

# Our constraints will be the same as Tutorial 1, plus the named entity constraint
constraints = [RepeatModification(),
               StopwordModification(),
               NamedEntityConstraint()]

# Now, let's make the attack using these parameters. 
attack = Attack(goal_function, constraints, transformation, search_method)

print(attack)

Attack(
  (search_method): GreedySearch
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapEmbedding(
    (max_candidates):  15
    (embedding_type):  paragramcf
  )
  (constraints): 
    (0): NamedEntityConstraint
    (1): RepeatModification
    (2): StopwordModification
  (is_black_box):  True
)


In [9]:
import torch
torch.cuda.is_available()

True

Now, let's use our attack. We're going to iterate through the `YelpSentiment` dataset and attack samples until we achieve 10 successes. (There's a lot to check here, and since we're using a greedy search over all potential word swap positions, each sample will take a few minutes. This will take a few hours to run on a single core.)

In [None]:
from textattack.loggers import CSVLogger # tracks a dataframe for us.
from textattack.attack_results import SuccessfulAttackResult

results_iterable = attack.attack_dataset(YelpSentiment(), attack_n=True)
logger = CSVLogger(color_method='html')

num_successes = 0
while num_successes < 10:
    result = next(results_iterable)
    if isinstance(result, SuccessfulAttackResult):
        logger.log_attack_result(result)
        num_successes += 1
        print(num_successes)
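One thing to keep in mind with a `while` loop around `next(...)`: if the iterator runs out before the target count is reached, `next` raises `StopIteration`. A hedged alternative is sketched below using only the standard library. `first_n_matching` and `fake_results` are hypothetical names, and the fake results are stand-ins for real attack results.

```python
import itertools


def first_n_matching(results, predicate, n):
    """Lazily collect up to `n` items satisfying `predicate`; never raises
    StopIteration if the iterator is exhausted early."""
    return list(itertools.islice(filter(predicate, results), n))


def fake_results():
    # Hypothetical stand-in for the attack's result iterator.
    for i in range(7):
        yield ('success' if i % 2 == 0 else 'failure', i)


successes = first_n_matching(fake_results(), lambda r: r[0] == 'success', 10)
print(len(successes))  # 4: the iterator ran out before 10 successes
```

Both `filter` and `islice` are lazy, so no more samples are attacked than needed to reach `n` matches.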

Now let's visualize our 10 successes in color:

In [None]:
import pandas as pd
pd.options.display.max_colwidth = 480 # increase column width so we can actually read the examples

from IPython.display import display, HTML
display(HTML(logger.df[['passage_1', 'passage_2']].to_html(escape=False)))

### Conclusion

Our constraint seems to have done its job: it filtered out attacks that did not swap out a named entity for another, according to the NLTK named entity detector. However, we can see some problems inherent in the detector: it often thinks the first word of a given sentence is a named entity, probably due to capitalization. (This is why "Awesome atmosphere" can be replaced by "Sublime atmosphere" and still fulfill our constraint; NLTK is telling us that both of those are proper nouns, some specific named type of atmosphere.) 

We did manage to produce some nice adversarial examples! "Cool Cuts" hair cuttery became "Cool Cutback" and the prediction for the entire input (of 298 words) flipped from positive to negative. "Red Lobster" became "Flushed Lobster" and the prediction for that input (of 337 words) also shifted from positive to negative.