Let's create a custom BAE (BERT-based Adversarial Examples) attack workflow step by step, customizing each phase of the attack with the TextAttack library.

# 1. Set Up the Environment
Ensure you have TextAttack installed. You can install it using pip:

In [None]:
%pip install textattack

# 2. Candidate Word Generation
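We'll use the WordSwapMaskedLM transformation, which masks a word and asks a BERT masked language model for replacement candidates. Note that WordSwapMaskedLM only replaces words; token insertion is handled by a separate transformation, WordInsertionMaskedLM. You can customize the number of candidates (max_candidates) or the method.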

In [None]:
from textattack.transformations import WordSwapMaskedLM

# Use BERT for masked language model transformations
transformation = WordSwapMaskedLM(method="bae", max_candidates=30)


Here:

method="bae" generates replacement words following the BAE algorithm's replace operation (BAE-R).
max_candidates=30 sets the maximum number of replacement candidates returned for each masked word.
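
To make the masking step concrete, here is a minimal sketch of how one masked prompt per word position can be built. It is an illustration only, not TextAttack's implementation: the helper name `mask_each_word` is hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
def mask_each_word(sentence, mask_token="[MASK]"):
    """Return (index, original_word, masked_sentence) for every word position."""
    words = sentence.split()  # assumption: whitespace tokenization for illustration
    masked = []
    for i, word in enumerate(words):
        # Replace only the word at position i with the mask token
        masked_words = words[:i] + [mask_token] + words[i + 1:]
        masked.append((i, word, " ".join(masked_words)))
    return masked


for index, word, prompt in mask_each_word("This is a great product"):
    print(index, word, "->", prompt)
```

A masked language model would then be asked to fill each `[MASK]`, and its top predictions (up to max_candidates) become the replacement candidates for that position.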

# 3. Apply Constraints
Next, we'll apply semantic and grammatical constraints. For example:

Ensure replacements are semantically similar.
Limit the percentage of words perturbed.

In [None]:
from textattack.constraints.semantics import WordEmbeddingDistance
from textattack.constraints.grammaticality import PartOfSpeech
from textattack.constraints.overlap import MaxWordsPerturbed

# Semantic similarity constraint
semantic_constraint = WordEmbeddingDistance(min_cos_sim=0.8)

# Grammatical constraint to ensure valid replacements
grammatical_constraint = PartOfSpeech()

# Limit the maximum number of perturbed words
max_perturbation_constraint = MaxWordsPerturbed(max_percent=0.2)

constraints = [semantic_constraint, grammatical_constraint, max_perturbation_constraint]
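
The two numeric constraints above boil down to simple checks. The sketch below shows the idea in plain Python under stated assumptions: the function names and toy vectors are hypothetical, and rounding the word budget up is a choice of this sketch rather than a statement about TextAttack's internals.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def passes_embedding_check(original_vec, candidate_vec, min_cos_sim=0.8):
    """Accept a replacement only if its embedding is close enough to the original."""
    return cosine_similarity(original_vec, candidate_vec) >= min_cos_sim


def within_perturbation_budget(original_words, perturbed_words, max_percent=0.2):
    """Accept a text only if at most max_percent of its words were changed."""
    changed = sum(1 for o, p in zip(original_words, perturbed_words) if o != p)
    budget = math.ceil(max_percent * len(original_words))  # assumption: round up
    return changed <= budget


original = "this is a great product".split()
print(within_perturbation_budget(original, "this is a fine product".split()))
print(within_perturbation_budget(original, "that is a fine product".split()))
```

With five words and max_percent=0.2, the budget is a single changed word, so the first candidate passes and the second does not.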


# 4. Define the Goal Function
Set the goal function to evaluate whether the model is fooled. For example, an untargeted classification attack (simply misclassify the input):

In [None]:
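from transformers import AutoModelForSequenceClassification, AutoTokenizer
from textattack.goal_functions import UntargetedClassification
from textattack.models.wrappers import HuggingFaceModelWrapper

# Load a classifier fine-tuned for sentiment analysis; plain "bert-base-uncased"
# has no trained classification head. This IMDB checkpoint is one possible choice.
model_name = "textattack/bert-base-uncased-imdb"
hf_model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The wrapper takes the model and its tokenizer
model = HuggingFaceModelWrapper(hf_model, tokenizer)

# Define the goal function
goal_function = UntargetedClassification(model)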
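
Conceptually, an untargeted goal is met as soon as the predicted class differs from the true label. The following is a minimal sketch of that check; the function name and the score lists are hypothetical and stand in for real model outputs.

```python
def is_untargeted_success(scores, true_label):
    """Return True if the highest-scoring class differs from the true label."""
    predicted = max(range(len(scores)), key=lambda i: scores[i])
    return predicted != true_label


# Hypothetical class scores for [negative, positive]
print(is_untargeted_success([0.1, 0.9], true_label=1))  # still classified as positive
print(is_untargeted_success([0.7, 0.3], true_label=1))  # flipped to negative
```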


# 5. Combine Everything into an Attack
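Bring together the goal function, constraints, transformation, and a search method using TextAttack's Attack class. The search method decides which words to perturb and in what order; BAE ranks words by importance and swaps them greedily.

In [None]:
from textattack import Attack
from textattack.search_methods import GreedyWordSwapWIR

# Rank words by how much deleting each one changes the model's output
search_method = GreedyWordSwapWIR(wir_method="delete")

# Assemble the attack; note the argument order
attack = Attack(goal_function, constraints, transformation, search_method)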
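
Besides a transformation, constraints, and a goal function, an attack needs a search strategy. TextAttack's BAE recipe uses a greedy search over words ranked by importance, where importance is estimated by deleting each word in turn. The sketch below illustrates that ranking idea only; `toy_positive_score` is a hypothetical stand-in for a classifier's confidence in the true label.

```python
def rank_words_by_deletion(words, score_fn):
    """Return word indices ordered from most to least important.

    Importance of a word is the drop in score when that word is deleted.
    """
    base = score_fn(words)
    drops = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        drops.append((base - score_fn(reduced), i))
    # Largest drop first; ties keep their original left-to-right order
    return [i for _, i in sorted(drops, key=lambda pair: -pair[0])]


def toy_positive_score(words):
    """Hypothetical scorer: confidence that the text is positive."""
    weights = {"great": 0.6, "good": 0.3}
    return min(1.0, 0.1 + sum(weights.get(w.lower(), 0.0) for w in words))


words = "This is a great product".split()
order = rank_words_by_deletion(words, toy_positive_score)
print([words[i] for i in order])
```

A greedy search would then try replacements for the top-ranked word first and stop as soon as the goal function reports success.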


# 6. Run the Attack
You can now run the attack on a dataset or individual examples.

Attack on a Custom Sentence:

In [None]:
# Test the attack on a single example
input_sentence = "This is a great product!"
true_label = 1  # ground-truth label for the sentence

# Attack.attack takes the text and its ground-truth output
result = attack.attack(input_sentence, true_label)

# Print the result
print(result)


Attack on a Dataset:
Use a HuggingFace dataset to run the attack on multiple examples:

In [None]:
from textattack import Attacker, AttackArgs
from textattack.datasets import HuggingFaceDataset

# Load dataset
dataset = HuggingFaceDataset("imdb", split="test")

# Attack only the first 10 examples
attack_args = AttackArgs(num_examples=10)
attacker = Attacker(attack, dataset, attack_args)
attack_results = attacker.attack_dataset()

# Print the results
for result in attack_results:
    print(result)


# Customization Ideas
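Switch Models: Replace the victim classifier with another model fine-tuned for sequence classification, or pass a different masked language model (for example roberta-base or distilbert-base-uncased) to WordSwapMaskedLM through its masked_language_model argument.
Modify Constraints:
Add a sentence-level similarity constraint based on cosine similarity between Universal Sentence Encoder embeddings (UniversalSentenceEncoder in textattack.constraints.semantics.sentence_encoders).
Adjust max_percent to allow more or fewer perturbations.
Targeted Attacks: Swap UntargetedClassification for TargetedClassification and set target_class so the attack succeeds only when the input is classified into that specific class.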
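
The difference between an untargeted and a targeted goal only shows up with more than two classes. Here is a minimal sketch of the targeted check; the function name and three-class scores are hypothetical.

```python
def is_targeted_success(scores, target_class):
    """Return True only if the highest-scoring class is the chosen target."""
    predicted = max(range(len(scores)), key=lambda i: scores[i])
    return predicted == target_class


# Hypothetical scores for [negative, neutral, positive]; the true label is positive (2)
scores_after_attack = [0.2, 0.5, 0.3]

# The prediction moved away from class 2, so an untargeted attack would succeed,
# but a targeted attack aiming at class 0 has not succeeded yet.
print(is_targeted_success(scores_after_attack, target_class=0))
print(is_targeted_success(scores_after_attack, target_class=1))
```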
