##Preliminary Notes

The aim of the InCrediblAE shared task is to build your own custom attack method that will generate adversarial examples to fool a victim classifier. This notebook is intended as an easy way for you to get started.

<br>

### Using GPU
It is recommended that you run this notebook with a GPU. To do this, click on "additional connection options" (next to Connect / RAM usage), select "change runtime type", and select a GPU.

<br>

### (optional) Mounting Google Drive - don't bother with this if running this notebook for first time
If you will be re-running this notebook many times, it might be convenient to mount your personal google drive. This will allow you to
1. load data/victim files quickly rather than re-downloading them with each session
2. save output files to a permanent location

Instructions for mounting are in the 'Making your own attack section'.


# Setup (installing dependencies)

In [1]:
#code below assumes we are working from the BODEGA repo
import os
os.chdir("./BODEGA")

# Making your own attack

## Imports

In [2]:
import gc
import os
import pathlib
import sys
import time
import random
import numpy as np

import OpenAttack
import torch
import datasets
from datasets import Dataset

from OpenAttack.tags import Tag
from OpenAttack.text_process.tokenizer import PunctTokenizer

from metrics.BODEGAScore import BODEGAScore
from utils.data_mappings import dataset_mapping, dataset_mapping_pairs, SEPARATOR_CHAR
from utils.no_ssl_verify import no_ssl_verify
from victims.bert import VictimBERT, readfromfile_generator
from victims.bilstm import VictimBiLSTM
from victims.caching import VictimCache
from victims.unk_fix_wrapper import UNK_TEXT

#imports for BodegaAttackEval wrapper
from typing import Any, Dict, Generator, Iterable, List, Optional, Union
from tqdm import tqdm
from OpenAttack.utils import visualizer, result_visualizer, get_language, language_by_name
from OpenAttack.tags import *

  from .autonotebook import tqdm as notebook_tqdm


In [3]:
using_mounted_drive = False
print('Cuda device available', torch.cuda.is_available())

Cuda device available True


## (do not change) Wrapper for producing submission file

In [4]:
class BodegaAttackEval(OpenAttack.AttackEval):
  '''
  wrapper for OpenAttack.AttackEval to produce a submission.tsv file for shared task evaluation

  To perform evaluation, we use a new method: eval_and_save_tsv() rather than the usual AttackEval.eval()
  submission.tsv file consists of 4 columns for each sample in attack set: succeeded, num_queries, original_text and modified text (newlines are escaped)

  '''
  def eval_and_save_tsv(self, dataset: Iterable[Dict[str, Any]], total_len : Optional[int] = None, visualize : bool = False, progress_bar : bool = False, num_workers : int = 0, chunk_size : Optional[int] = None, tsv_file_path: Optional[os.PathLike] = None):
      """
      Evaluation function of `AttackEval`.

      Args:
          dataset: An iterable dataset.
          total_len: Total length of dataset (will be used if dataset doesn't has a `__len__` attribute).
          visualize: Display a pretty result for each data in the dataset.
          progress_bar: Display a progress bar if `True`.
          num_workers: The number of processes running the attack algorithm. Default: 0 (running on the main process).
          chunk_size: Processing pool trunks size.

          tsv_file_path: path to save submission tsv

      Returns:
          A dict of attack evaluation summaries.

      """


      if hasattr(dataset, "__len__"):
          total_len = len(dataset)

      def tqdm_writer(x):
          return tqdm.write(x, end="")

      if progress_bar:
          result_iterator = tqdm(self.ieval(dataset, num_workers, chunk_size), total=total_len)
      else:
          result_iterator = self.ieval(dataset, num_workers, chunk_size)

      total_result = {}
      total_result_cnt = {}
      total_inst = 0
      success_inst = 0

      #list for tsv
      x_orig_list = []
      x_adv_list = []
      num_queries_list = []
      succeed_list = []

      # Begin for
      for i, res in enumerate(result_iterator):
          total_inst += 1
          success_inst += int(res["success"])

          if TAG_Classification in self.victim.TAGS:
              x_orig = res["data"]["x"]
              if res["success"]:
                  x_adv = res["result"]
                  if Tag("get_prob", "victim") in self.victim.TAGS:
                      self.victim.set_context(res["data"], None)
                      try:
                          probs = self.victim.get_prob([x_orig, x_adv])
                      finally:
                          self.victim.clear_context()
                      y_orig = probs[0]
                      y_adv = probs[1]
                  elif Tag("get_pred", "victim") in self.victim.TAGS:
                      self.victim.set_context(res["data"], None)
                      try:
                          preds = self.victim.get_pred([x_orig, x_adv])
                      finally:
                          self.victim.clear_context()
                      y_orig = int(preds[0])
                      y_adv = int(preds[1])
                  else:
                      raise RuntimeError("Invalid victim model")
              else:
                  y_adv = None
                  x_adv = None
                  if Tag("get_prob", "victim") in self.victim.TAGS:
                      self.victim.set_context(res["data"], None)
                      try:
                          probs = self.victim.get_prob([x_orig])
                      finally:
                          self.victim.clear_context()
                      y_orig = probs[0]
                  elif Tag("get_pred", "victim") in self.victim.TAGS:
                      self.victim.set_context(res["data"], None)
                      try:
                          preds = self.victim.get_pred([x_orig])
                      finally:
                          self.victim.clear_context()
                      y_orig = int(preds[0])
                  else:
                      raise RuntimeError("Invalid victim model")
              info = res["metrics"]
              info["Succeed"] = res["success"]
              if visualize:
                  if progress_bar:
                      visualizer(i + 1, x_orig, y_orig, x_adv, y_adv, info, tqdm_writer, self.tokenizer)
                  else:
                      visualizer(i + 1, x_orig, y_orig, x_adv, y_adv, info, sys.stdout.write, self.tokenizer)

              #list for tsv
              succeed_list.append(res["success"])
              num_queries_list.append(res["metrics"]["Victim Model Queries"])
              x_orig_list.append(x_orig)

              if res["success"]:
                x_adv_list.append(x_adv)
              else:
                x_adv_list.append("ATTACK_UNSUCCESSFUL")



          for kw, val in res["metrics"].items():
              if val is None:
                  continue

              if kw not in total_result_cnt:
                  total_result_cnt[kw] = 0
                  total_result[kw] = 0
              total_result_cnt[kw] += 1
              total_result[kw] += float(val)
      # End for

      summary = {}
      summary["Total Attacked Instances"] = total_inst
      summary["Successful Instances"] = success_inst
      summary["Attack Success Rate"] = success_inst / total_inst
      for kw in total_result_cnt.keys():
          if kw in ["Succeed"]:
              continue
          if kw in ["Query Exceeded"]:
              summary["Total " + kw] = total_result[kw]
          else:
              summary["Avg. " + kw] = total_result[kw] / total_result_cnt[kw]

      if visualize:
          result_visualizer(summary, sys.stdout.write)


      #saving tsv
      if tsv_file_path is not None:
        with open(tsv_file_path, 'w') as f:
          f.write('succeeded' + '\t' + 'num_queries' + '\t' + 'original_text' + '\t' + 'modified_text' + '\t'+ '\n') #header
          for success, num_queries, x_orig, x_adv in zip(succeed_list, num_queries_list, x_orig_list, x_adv_list):
            escaped_x_orig = x_orig.replace('\n', '\\n') #escaping newlines
            escaped_x_adv = x_adv.replace('\n', '\\n')
            f.write(str(success) + '\t' + str(num_queries) + '\t' + escaped_x_orig + '\t' + escaped_x_adv + '\t'+ '\n')

      return summary

## (optional) Mounting Google Drive


Steps to use mounted google drive:
1. create a folder in your local google drive (e.g. `incrediblAE_public_release`)  
2. download all directories from the download link (see [Download section above](https://colab.research.google.com/drive/1juHWIL44z8O3C5wDAE45vzlJgX51KI5D?authuser=3#scrollTo=eVVE2-64rKuS&line=3&uniqifier=1://)) and upload them to your google drive folder
3. create an empty subdirectory called `outputs` (`incredibleAE_public_release/outputs/`)

At this point, your google drive folder should have 6 subdirectories (C19, FC, HN, PR2, RD, and outputs)
4. uncomment code below, replacing path_to_mounted_dir with path to your folder (e.g. `/content/drive/My Drive/incrediblAE_public_release`)



In [5]:
# from google.colab import drive
# drive.mount('/content/drive')

# using_mounted_drive = True
# path_to_mounted_folder = '/content/drive/My Drive/incrediblAE_public_release'


You can also comment out the !gdown command in Downloading section, so the notebook doesn't redownload data each time you run it.

## Making custom attacker (token shuffler)

Here's an example of how to create a custom attack method.
Your attacker will need to subclass `OpenAttack.attackers.ClassificationAttacker`  

(See also OpenAttack framework docs: https://openattack.readthedocs.io/en/latest/)

In [6]:
import nltk
from nltk.corpus import stopwords, wordnet

nltk.download('stopwords')

nltk.download('wordnet')
nltk.download('omw-1.4')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\selin\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\selin\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\selin\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


True

In [22]:
import spacy
import numpy as np
from scipy.spatial.distance import cosine

# Load a pre-trained NLP model
nlp = spacy.load("en_core_web_md")

In [27]:
class MyAttacker(OpenAttack.attackers.ClassificationAttacker):
    @property
    def TAGS(self):
        # Returns tags to help OpenAttack check your parameters automatically
        return {self.lang_tag, Tag("get_pred", "victim")}

    def __init__(self, tokenizer=None, max_generations=10, population_size=20):
        if tokenizer is None:
            tokenizer = PunctTokenizer()
        self.tokenizer = tokenizer
        self.lang_tag = OpenAttack.utils.get_language([self.tokenizer])
        self.stop_words = set(stopwords.words('english'))
        self.max_generations = max_generations
        self.population_size = population_size
        self.homoglyphs = {
            'a': ['а', 'ɑ', 'а'],
            'e': ['е'],
            'o': ['о', 'ο', 'о'],
            'c': ['с', 'ϲ'],
            'p': ['р'],
            'x': ['х'],
            'y': ['у'],
            'i': ['і'],
            'l': ['ⅼ']
        }
        
    def attack(self, victim, input_, goal):
        try:
            original_tokens = self.tokenizer.tokenize(input_)
            population = [self.mutate(original_tokens) for _ in range(self.population_size)]
            original_embedding = nlp(self.tokenizer.detokenize(original_tokens)).vector

            for generation in range(self.max_generations):
                scored_population = [(individual, self.evaluate_fitness(individual, original_tokens, victim, goal, original_embedding)) for individual in population]
                population = [x[0] for x in sorted(scored_population, key=lambda x: x[1], reverse=True)]

                for individual in population:
                    x_new = self.tokenizer.detokenize(individual)
                    y_new = victim.get_pred([x_new])
                    if goal.check(x_new, y_new):
                        return x_new

                population = self.evolve_population(population, original_tokens, victim, goal, original_embedding)

            best_candidate = max(population, key=lambda ind: self.evaluate_fitness(ind, original_tokens, victim, goal, original_embedding))
            return self.tokenizer.detokenize(best_candidate)
        except Exception as e:
            print(e)
            return input_

    def evaluate_fitness(self, individual, original_tokens, victim, goal, original_embedding):
        adversarial_text = self.tokenizer.detokenize(individual)
        prediction = victim.get_pred([adversarial_text])
        adversarial_embedding = nlp(adversarial_text).vector
        semantic_similarity = -cosine(original_embedding, adversarial_embedding)
        changes = sum(1 for orig, mod in zip(original_tokens, individual) if orig != mod)
        fitness_score = semantic_similarity - (changes * 0.5)  # Heavier penalty for more changes
        return fitness_score if goal.check(adversarial_text, prediction) else -1000 + fitness_score

    def evolve_population(self, population, original_tokens, victim, goal, original_embedding):
        new_population = []
        parent_candidates = sorted(population, key=lambda ind: self.evaluate_fitness(ind, original_tokens, victim, goal, original_embedding))[:len(population)//2]
        while len(new_population) < len(population):
            parent1, parent2 = random.sample(parent_candidates, 2)
            child1, child2 = self.crossover(parent1, parent2)
            new_population.append(self.mutate(child1))
            if len(new_population) < len(population):
                new_population.append(self.mutate(child2))
        return new_population

    def mutate(self, tokens, mutation_rate=0.1):
        return [self.swap(token) if random.random() < mutation_rate else token for token in tokens]

    def crossover(self, parent1, parent2):
        point = random.randint(1, len(parent1) - 1)
        child1 = parent1[:point] + parent2[point:]
        child2 = parent2[:point] + parent1[point:]
        return child1, child2

    def is_better(self, individual1, individual2, original_tokens):
        changes1 = sum(1 for orig, mod in zip(original_tokens, individual1) if orig != mod)
        changes2 = sum(1 for orig, mod in zip(original_tokens, individual2) if orig != mod)

        if changes1 < changes2:
            return True
        return False

    def swap(self, token):
      if isinstance(token, tuple):
          word = token[0]
      else:
          word = token

      modified_word = ''.join([random.choice(self.homoglyphs[char]) if char in self.homoglyphs else char for char in word])

      if isinstance(token, tuple):
          return (modified_word, token[1])
      else:
          return modified_word

## Testing your attack

The code below will test MyAttacker (above) on the victim classifier, compute BODEGA score, and output results to /content/BODEGA/outputs.

WARNING: files in default output directory (/content/BODGEa/outputs) do not persist after you disconnect from the colab runtime session. To keep them, you can either:

1. download them manually or
2. set `out_dir` to a mounted Google Drive directory (will automatically save files to your google drive)



### Choose task + victim classifier

In [9]:
# determinism
random.seed(10)
torch.manual_seed(10)
np.random.seed(0)

# Change these variables to what you want
task = 'HN' # PR2, HN, FC, RD, C19
victim_model = 'BERT' # BERT or BiLSTM
using_custom_attacker = True # change to False if you want to test out OpenAttack's pre-implemented attackers (e.g. BERTattack)
attack = 'custom' # if using custom attack, this name can be whatever you want. If using pre-implemented attack, set to name of attacker ('BERTattack')

# misc variables - no need to change
targeted = False # this shared task evaluates performance in an untargeted scenario
visualize_adv_examples = True # prints adversarial samples as they are generated, showing the difference between original
using_first_n_samples = False # used when you want to evaluate on a subset of the full eval set.
first_n_samples = 20


### Run to evaluate attacker


In [28]:

if using_mounted_drive:
    data_path =  pathlib.Path(f"{path_to_mounted_folder}/{task}")
    model_path = pathlib.Path(f"{path_to_mounted_folder}/{task}/{victim_model}-512.pth")
    out_dir = pathlib.Path(f"{path_to_mounted_folder}/outputs")

else:
  data_path =  pathlib.Path(f"path-to-you-folder/incrediblAE_public_release/{task}")
  model_path = pathlib.Path(f"path-to-you-folder/incrediblAE_public_release/{task}/{victim_model}-512.pth")
  out_dir = pathlib.Path("path-to-you-folder/outputs")



RESULTS_FILE_NAME = 'results_' + task + '_' + str(targeted) + '_' + attack + '_' + victim_model + '.txt' #stores BODEGA metrics
SUBMISSION_FILE_NAME = 'submission_' + task + '_' + str(targeted) + '_' + attack + '_' + victim_model + '.tsv' #stores original and modified text, to be submitted to shared task organizers

results_path = out_dir / RESULTS_FILE_NAME if out_dir else None
submission_path = out_dir / SUBMISSION_FILE_NAME if out_dir else None

if out_dir:
    if (out_dir / RESULTS_FILE_NAME).exists():
      print(f"Existing results file found. This script will overwrite previous file: {str(results_path)}")
    if submission_path.exists():
      print(f"Existing submission file found. This script will overwrite previous file: {str(submission_path)}")




# Prepare task data
with_pairs = (task == 'FC' or task == 'C19')

# Choose device
print("Setting up the device...")

using_TF = (attack in ['TextFooler', 'BAE'])
if using_TF:
    # Disable GPU usage by TF to avoid memory conflicts
    import tensorflow as tf

    tf.config.set_visible_devices(devices=[], device_type='GPU')

if torch.cuda.is_available():
    print('using GPU')
    victim_device = torch.device("cuda")
    attacker_device = torch.device("cuda")
else:
    victim_device = torch.device("cpu")
    attacker_device = torch.device('cpu')

# Prepare victim
print("Loading up victim model...")
if victim_model == 'BERT':
    victim = VictimCache(model_path, VictimBERT(model_path, task, victim_device))
elif victim_model == 'BiLSTM':
    victim = VictimCache(model_path, VictimBiLSTM(model_path, task, victim_device))

# Load data
print("Loading data...")
test_dataset = Dataset.from_generator(readfromfile_generator,
                                      gen_kwargs={'subset': 'attack', 'dir': data_path, 'trim_text': True,
                                                  'with_pairs': with_pairs})
if not with_pairs:
    dataset = test_dataset.map(dataset_mapping)
    dataset = dataset.remove_columns(["text"])
else:
    dataset = test_dataset.map(dataset_mapping_pairs)
    dataset = dataset.remove_columns(["text1", "text2"])

dataset = dataset.remove_columns(["fake"])

# Filter data
if using_first_n_samples:
  dataset = dataset.select(range(first_n_samples))

if targeted:
    dataset = [inst for inst in dataset if inst["y"] == 1 and victim.get_pred([inst["x"]])[0] == inst["y"]]

print("Subset size: " + str(len(dataset)))

# Prepare attack
print("Setting up the attacker...")

# Necessary to bypass the outdated SSL certifiacte on the OpenAttack servers
with no_ssl_verify():
  if using_custom_attacker:
    attacker = MyAttacker()
  else:
    filter_words = OpenAttack.attack_assist.filter_words.get_default_filter_words('english') + [SEPARATOR_CHAR]
    if attack == 'PWWS':
        attacker = OpenAttack.attackers.PWWSAttacker(token_unk=UNK_TEXT, lang='english', filter_words=filter_words)
    elif attack == 'SCPN':
        os.environ["TOKENIZERS_PARALLELISM"] = "false"
        attacker = OpenAttack.attackers.SCPNAttacker(device=attacker_device)
    elif attack == 'TextFooler':
        attacker = OpenAttack.attackers.TextFoolerAttacker(token_unk=UNK_TEXT, lang='english',
                                                           filter_words=filter_words)
    elif attack == 'DeepWordBug':
        attacker = OpenAttack.attackers.DeepWordBugAttacker(token_unk=UNK_TEXT)
    elif attack == 'VIPER':
        attacker = OpenAttack.attackers.VIPERAttacker()
    elif attack == 'GAN':
        attacker = OpenAttack.attackers.GANAttacker()
    elif attack == 'Genetic':
        attacker = OpenAttack.attackers.GeneticAttacker(lang='english', filter_words=filter_words)
    elif attack == 'PSO':
        attacker = OpenAttack.attackers.PSOAttacker(lang='english', filter_words=filter_words)
    elif attack == 'BERTattack':
        attacker = OpenAttack.attackers.BERTAttacker(filter_words=filter_words, use_bpe=False, device=attacker_device)
    elif attack == 'BAE':
        attacker = OpenAttack.attackers.BAEAttacker(device=attacker_device, filter_words=filter_words)
    else:
        attacker = None

# Run the attack
print("Evaluating the attack...")
RAW_FILE_NAME = 'raw_' + task + '_' + str(targeted) + '_' + attack + '_' + victim_model + '.tsv'
raw_path = out_dir / RAW_FILE_NAME if out_dir else None

scorer = BODEGAScore(victim_device, task, align_sentences=True, semantic_scorer="BLEURT", raw_path = raw_path)
with no_ssl_verify():
    attack_eval = BodegaAttackEval(attacker, victim, language='english', metrics=[
        scorer  # , OpenAttack.metric.EditDistance()
    ])
    start = time.time()
    summary = attack_eval.eval_and_save_tsv(dataset, visualize=visualize_adv_examples, progress_bar=False, tsv_file_path = submission_path)
    end = time.time()
attack_time = end - start
attacker = None

# Remove unused stuff
victim.finalise()
del victim
gc.collect()
torch.cuda.empty_cache()
if "TOKENIZERS_PARALLELISM" in os.environ:
    del os.environ["TOKENIZERS_PARALLELISM"]

# Evaluate
start = time.time()
score_success, score_semantic, score_character, score_BODEGA= scorer.compute()
end = time.time()
evaluate_time = end - start

# Print results
print("Subset size: " + str(len(dataset)))
print("Success score: " + str(score_success))
print("Semantic score: " + str(score_semantic))
print("Character score: " + str(score_character))
print("BODEGA score: " + str(score_BODEGA))
print("Queries per example: " + str(summary['Avg. Victim Model Queries']))
print("Total attack time: " + str(attack_time))
print("Time per example: " + str((attack_time) / len(dataset)))
print("Total evaluation time: " + str(evaluate_time))

if out_dir:
  with open(results_path, 'w') as f:
      f.write("Subset size: " + str(len(dataset)) + '\n')
      f.write("Success score: " + str(score_success) + '\n')
      f.write("Semantic score: " + str(score_semantic) + '\n')
      f.write("Character score: " + str(score_character) + '\n')
      f.write("BODEGA score: " + str(score_BODEGA) + '\n')
      f.write("Queries per example: " + str(summary['Avg. Victim Model Queries']) + '\n')
      f.write("Total attack time: " + str(end - start) + '\n')
      f.write("Time per example: " + str((end - start) / len(dataset)) + '\n')
      f.write("Total evaluation time: " + str(evaluate_time) + '\n')

  print('-')
  print('Bodega metrics saved to', results_path)
  print('Submission file saved to', submission_path)

Existing results file found. This script will overwrite previous file: C:\Users\selin\Documents\CheckThat\BODEGA\outputs\results_HN_False_custom_BERT.txt
Existing submission file found. This script will overwrite previous file: C:\Users\selin\Documents\CheckThat\BODEGA\outputs\submission_HN_False_custom_BERT.tsv
Setting up the device...
using GPU
Loading up victim model...
Victim caching: file found, loading...
Loading data...
Subset size: 400
Setting up the attacker...
Evaluating the attack...


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'BleurtSPTokenizer'. 
The class this function is called from is 'BertTokenizer'.


[32mLabel: 1 (58.51%) --> 0 (95.60%)[0m            |                                   
                                            |                                   
Buckeye Battle : Trump And Kasich Headed    |                                   
Buckeye Battle : Trump And Kasich Headed    |                                   
                                            |                                   
For HEATED Fight In Ohio Donald Trump and   |                                   
For HEATED Fight In Ohio Donald Trump and   |                                   
                                            |                                   
John Kasich are headed for a heated battle  |                                   
John Kasich are headed for a heated battle  |                                   
                                            |                                   
in Ohio ’ s winner take all primary March   |                                   
in Ohio ’ s winner 

Exception when evaluate data {'x': 'You knew it would happen: Sharpton goes after Dolezal’s parents\nEmail address: \n \nWell THIS didn’t take long. Of course if it’s an issue of black and white – even when the “victim” is a fake black person, good ol’ Al Sharpton takes up the case! \nAs reported by Mediate: \nAl Sharpton has some questions for Lawrence and Ruthanne Dolezal about why they waited until Friday to appear on TV to discuss their controversial daughter, Rachel Dolazel. Dolazel’s parents outed the NAACP local president as a white woman pretending to be black, and Sharpton was approached by TMZ one day later to give his thoughts. \nWhen asked if what the Dolazel parents did was wrong, Sharpton first brought up the NAACP description of the story as a family dispute, but then said that Rachel’s parents were trying to create a distraction from her activism in social justice. \n“On one level, you’ve got to say to her, ‘you’re misleading us’, but another level, mom and dad, come on

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
You knew it would happen : Sharpton goes    |                                   
after Dolezal ’ s parents Email address :   |                                   
Well THIS didn ’ t take long . Of course if |                                   
it ’ s an issue of black and white – even   |                                   
when the “ victim ” is a fake black person  |                                   
, good ol ’ Al Sharpton takes up the case ! |                                   
As reported by Mediate : Al Sharpton has    |                                   
some questions for Lawrence and Ruthanne    |                                   
Dolezal about why they waited until Friday  |                                   
to appear on TV to discuss their            |                                   
controversial daugh

Exception when evaluate data {'x': 'Liberals and the ticking clock\n \nPatriot Retort:There’s a scene in the great WWII film “The Dirty Dozen” where the men are training on rope climbing. One of the guys — I think it’s Trini Lopez’s character — can’t climb up the rest of the way. He’s tired, weak and unable to go any further. Lee Marvin’s character grabs a machine gun and starts firing at the rope beneath his grip. Suddenly, Trini scampers up the rest of the way to the top. \nNothing will get people moving better than scaring the living daylights out of them. \nUsing fear and terror to get people to bend to your will is such a common practice you can pretty much find it anywhere. \nread more', 'y': 1}
Traceback (most recent call last):
  File "c:\Users\selin\miniconda3\Lib\site-packages\OpenAttack\attack_eval\utils.py", line 10, in attack_process
    adversarial_sample = attacker(victim, data)
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\selin\miniconda3\Lib\site-pa

[31mLabel: 1 (100.00%) --> Failed![0m              |                                   
                                            |                                   
Liberals and the ticking clock Patriot      |                                   
Retort : There ’ s a scene in the great     |                                   
WWII film “ The Dirty Dozen ” where the men |                                   
are training on rope climbing . One of the  |                                   
guys — I think it ’ s Trini Lopez ’ s       |                                   
character — can ’ t climb up the rest of    | Running Time:            5.4724   
the way . He ’ s tired , weak and unable to | Query Exceeded:          no       
go any further . Lee Marvin ’ s character   | Victim Model Queries:    622      
grabs a machine gun and starts firing at    | BODEGA Score:            (later)  
the rope beneath his grip . Suddenly ,      | Succeed:                 no       
Trini scampers up t

Exception when evaluate data {'x': "Jake Tapper on Russian Hacking: ‘Isn’t This All a Huge Failure of the Obama Administration?’\n \nFinally someone asked the real question…isn’t the Russian hacking scandal a result of Obama dropping the ball? \n&#160; \nGuest Post by&#160; Philip Hodges \nThough he didn’t cast the entire blame on the Obama administration – considering that it was Russia who supposedly hacked into the DNC’s and John Podesta’s emails. But since the whole ordeal happened “under his watch,” CNN’s Jake Tapper suggested that the Obama administration is&#160;largely to blame “in terms of cyber-security and foreign relations.” \nTapper put the question before Senator&#160;Amy Klobuchar (D-MN) who responded that “it’s easy to cast blame.” \n“In our country,” she said, “I don’t think we were ready to think that a major nation like Russia would get into this kind of mess with our own country.” \nI think it would be naive to think that this hasn’t been going on for a long time – 

[31mLabel: 1 (99.98%) --> Failed![0m               |                                   
                                            |                                   
Jake Tapper on Russian Hacking : ‘ Isn ’ t  |                                   
This All a Huge Failure of the Obama        |                                   
Administration ?’ Finally someone asked the |                                   
real question … isn ’ t the Russian hacking |                                   
scandal a result of Obama dropping the ball |                                   
? &# 160 ; Guest Post by &# 160 ; Philip    |                                   
Hodges Though he didn ’ t cast the entire   |                                   
blame on the Obama administration –         |                                   
considering that it was Russia who          |                                   
supposedly hacked into the DNC ’ s and John |                                   
Podesta ’ s emails 

Exception when evaluate data {'x': 'Jack Black, Morgan Freeman, And Others Explain The Iran Deal (VIDEO)\nAs many of you are no doubt aware, President Obama recently negotiated a deal with Iran that would prevent them from starting a nuclear program for the next ten years, effectively removing most fears about the country starting a nuclear war. \nOf course, that hasn’t stopped conservatives from going crazy over this, with Fox News of course at the head of the paranoia train. \nLuckily, to appeal to mainstream, and more importantly, younger folks, a slew of celebrities including Jack Black and Morgan Freeman, have made a video with the Global Zero Channel explaining the Iran deal in layman’s terms, all the while attacking Congress. Check it out below: \n \n \nObama Successfully Reaches Nuclear Deal With IranJuly 14, 2015In "International" \nIran President Says Nuclear Deal Will Help The CountryJanuary 18, 2016In "News" \nThat Time Chris Wallace Grilled Dick Cheney Over Iran FailuresOc

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
Jack Black , Morgan Freeman , And Others    |                                   
Explain The Iran Deal ( VIDEO ) As many of  |                                   
you are no doubt aware , President Obama    |                                   
recently negotiated a deal with Iran that   |                                   
would prevent them from starting a nuclear  |                                   
program for the next ten years ,            |                                   
effectively removing most fears about the   |                                   
country starting a nuclear war . Of course  |                                   
, that hasn ’ t stopped conservatives from  |                                   
going crazy over this , with Fox News of    | Running Time:            6.5561   
course at the head 

Exception when evaluate data {'x': 'MSNBC Interrupts NSA Report for \'Breaking\' Bieber News\nMSNBC reporter Andrea Mitchell was forced to interrupt a former Congresswoman, in the midst of explaining a report about the National Security Agency\'s mass surveillance program, to bring viewers "Breaking News" about teen pop icon Justin Bieber. \nMitchell was in the middle of interviewing Rep. Jane Harman (D-CA), who was speaking from the 2014 World Economic Forum in Davos, Switzerland, when Bieber was being arraigned in Miami over charges that he operated a vehicle while intoxicated. \nFilmed footage of the interruption surfaced on The Guardian\'s website Friday morning. "The news stops for no one and broadcasters have to be sure they are keeping their audiences up to date with the key issues of the day," the paper\'s Media Monkey blog opined. "With that in mind, hacks around the world should take their hats off to MSNBC." \nWatch: \nVideo published on MSNBC.com does not show the Bieber in

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
MSNBC Interrupts NSA Report for ' Breaking  |                                   
' Bieber News MSNBC reporter Andrea         |                                   
Mitchell was forced to interrupt a former   |                                   
Congresswoman , in the midst of explaining  |                                   
a report about the National Security Agency |                                   
' s mass surveillance program , to bring    |                                   
viewers " Breaking News " about teen pop    |                                   
icon Justin Bieber . Mitchell was in the    |                                   
middle of interviewing Rep . Jane Harman (  |                                   
D - CA ), who was speaking from the 2014    |                                   
World Economic Foru

Exception when evaluate data {'x': 'The Follies (of two widening scandals in Raleigh)\nMore national outrage about NC GOP voter suppression efforts \nThe controversy swirling around the efforts of the North Carolina Republican Party to make it more difficult for African-Americans to vote went national this week, with stories around the country about the memo by NC GOP Executive Director Dallas Woodhouse urging support for local elections boards to restricting access to early voting. \nThe move is a startling attempt by Republicans to defy the intent of the recent ruling by the U.S. Court of Appeals for the 4th Circuit&#160;that struck down the massive voter suppression law passed by the General Assembly in 2013 and signed by Gov. Pat McCrory. \nThe court found that legislative leaders asked for data about how people vote, broken down by race, and then changed the voting methods used disproportionately by African-Americans.&#160; The intent could not have been clearer. \nSupporters of t

[31mLabel: 1 (99.74%) --> Failed![0m               |                                   
                                            |                                   
The Follies ( of two widening scandals in   |                                   
Raleigh ) More national outrage about NC    |                                   
GOP voter suppression efforts The           |                                   
controversy swirling around the efforts of  |                                   
the North Carolina Republican Party to make |                                   
it more difficult for African - Americans   |                                   
to vote went national this week , with      |                                   
stories around the country about the memo   |                                   
by NC GOP Executive Director Dallas         |                                   
Woodhouse urging support for local          |                                   
elections boards to

Exception when evaluate data {'x': 'Bush Administration Withholds UNFPA Funding for Fourth Year\nOn Sept 16, the Bush administration made a formal “determination” that the United Nations Population Fund (UNFPA) is ineligible, under a longstanding federal anticoercion law, for the $34 million that Congress had set aside for it. “UNFPA’s support of, and involvement in, China’s birth-planning activities facilitates the Chinese government’s coercive abortion program,” wrote Undersecretary of State for Political Affairs Nicholas Burns in a letter to congressional leaders. \nThe action was predictable. President Bush first blocked a U.S. contribution to UNFPA in 2002, and has every year since, on the grounds that the agency violates the anticoercion law by virtue of its mere presence in China, despite the fact that the administration’s own investigative team found no evidence to support this claim. UNFPA operates in over 150 poor countries around the world, but does not provide or pay for ab

[31mLabel: 1 (99.98%) --> Failed![0m               |                                   
                                            |                                   
Bush Administration Withholds UNFPA Funding |                                   
for Fourth Year On Sept 16 , the Bush       |                                   
administration made a formal “              |                                   
determination ” that the United Nations     |                                   
Population Fund ( UNFPA ) is ineligible ,   |                                   
under a longstanding federal anticoercion   |                                   
law , for the $ 34 million that Congress    |                                   
had set aside for it . “ UNFPA ’ s support  |                                   
of , and involvement in , China ’ s birth - |                                   
planning activities facilitates the Chinese |                                   
government ’ s coer

Exception when evaluate data {'x': 'Peter King Sides With Passion Over Principle\n \nI don’t need to tell conservatives that Peter King is poison. But the recent story in Politico.com gives us a pretty clear picture about what is wrong with him. It also might provide us with a window into Obama’s recent temper tantrum in using government money to shut down national parks as part of the “shutdown.” \n“His current breach with the national GOP stems from a profoundly resonant local issue: aid for Hurricane Sandy, aid for which the House delayed approving and ultimately cleared against opposition from dozens of his Republican colleagues. At the time, King called on New York donors to snub the National Republican Congressional Committee; he has denounced Republican opponents of Sandy aid, such as Cruz and Kentucky Sen. Rand Paul, who have visited the Empire State since then. That wound is clearly still raw for King, who alluded in the interview to how ‘politicians walk around with their arm

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
Peter King Sides With Passion Over          |                                   
Principle I don ’ t need to tell            |                                   
conservatives that Peter King is poison .   |                                   
But the recent story in Politico . com      |                                   
gives us a pretty clear picture about what  |                                   
is wrong with him . It also might provide   |                                   
us with a window into Obama ’ s recent      |                                   
temper tantrum in using government money to |                                   
shut down national parks as part of the “   |                                   
shutdown .” “ His current breach with the   |                                   
national GOP stems 

Exception when evaluate data {'x': "Senators Who Rejected Human-Caused Climate Change Received 7 Times as Much Money from Oil and Gas Interests\nDavid Merrett/Wikimedia Commons \nDecember 2, 2015 - On January 21, 2015, the Senate voted on a number of amendments to S. 1, the Keystone XL Pipeline Approval Act. While construction of the Keystone XL pipeline was rejected last month, several amendments to S. 1 address opinions about climate change more generally. One such amendment, S. Amdt. 58, expresses the sense of Congress that “ human activity significantly contributes to climate change.” &#160;The amendment, which required a two-thirds majority to pass, failed by a vote of 50-49. According to an analysis by MapLight: Senators voting ‘NO’ on the amendment received, on average,&#160; \n7.1 times as much money ($259,314) from oil and gas interests compared to those voting ‘YES’ ($36,759) between April 1, 2009 and March 31, 2015. \nTwelve senators received more than $300,000 from oil and 

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
Senators Who Rejected Human - Caused        |                                   
Climate Change Received 7 Times as Much     |                                   
Money from Oil and Gas Interests David      |                                   
Merrett / Wikimedia Commons December 2 ,    |                                   
2015 - On January 21 , 2015 , the Senate    |                                   
voted on a number of amendments to S . 1 ,  |                                   
the Keystone XL Pipeline Approval Act .     |                                   
While construction of the Keystone XL       |                                   
pipeline was rejected last month , several  |                                   
amendments to S . 1 address opinions about  |                                   
climate change more

Exception when evaluate data {'x': 'MSNBC Maddow Trump Tax Return Fail\n \n \nThe left-wing media and Democrats are continuing their desperate strategy to undermine President Trump. \nSince last year, claimed Russian connections and Trump’s taxes are the main lines of attack. \nLast night, MSNBC host Rachel Maddow played her anti-Trump role by discussing Trump’s 2005 tax return. \nWith great fanfare, Maddow promoted the revelation to draw viewers but her disclosure was a complete fail and she was promptly mocked on Twitter. \nTrump’s tax return showed he paid $38 million in taxes – a rate that’s higher than Senator Bernie Sanders (I-VT) and former President Obama. \nFrom AOL News: \nTwitter erupted on Tuesday night after MSNBC host Rachel Maddow sent a single tweet announcing she had obtained a copy of President Donald Trump’s tax returns. \nChatter reached a fever pitch during her show as she discussed the specific information contained within the two pages of documents she had receiv

[31mLabel: 1 (99.99%) --> Failed![0m               |                                   
                                            |                                   
MSNBC Maddow Trump Tax Return Fail The left |                                   
- wing media and Democrats are continuing   |                                   
their desperate strategy to undermine       |                                   
President Trump . Since last year , claimed |                                   
Russian connections and Trump ’ s taxes are |                                   
the main lines of attack . Last night ,     |                                   
MSNBC host Rachel Maddow played her anti -  |                                   
Trump role by discussing Trump ’ s 2005 tax |                                   
return . With great fanfare , Maddow        |                                   
promoted the revelation to draw viewers but |                                   
her disclosure was 

Exception when evaluate data {'x': 'Video proof of voter fraud in D.C.\nPresident Obama and U.S. Attorney General Eric Holder’s Justice Department have a new obsession to obstruct any state’s passing of voter-identification laws, even recently attacking South Carolina and my own state of Texas. Holder calls voter ID laws “unnecessary” and says voter fraud “doesn’t exist,” but new video proof in his own voting precinct proves otherwise. \nObama’s administration says they’re against voter ID laws because they are valiantly trying to keep minorities and the poor from being unfairly discriminated against. But the truth is that they are trying to keep President Obama in office. They know that voter IDs are bad business for this White House’s campaign and re-election. \nIn the last couple weeks, video proof from Holder’s own voting precinct in Washington, D.C., revealed voter fraud is alive and well in the nation’s capital. On primary election day, April 3, a representative from James O’Keef

[31mLabel: 1 (99.72%) --> Failed![0m               |                                   
                                            |                                   
Video proof of voter fraud in D . C .       |                                   
President Obama and U . S . Attorney        |                                   
General Eric Holder ’ s Justice Department  |                                   
have a new obsession to obstruct any state  |                                   
’ s passing of voter - identification laws  |                                   
, even recently attacking South Carolina    |                                   
and my own state of Texas . Holder calls    |                                   
voter ID laws “ unnecessary ” and says      |                                   
voter fraud “ doesn ’ t exist ,” but new    |                                   
video proof in his own voting precinct      |                                   
proves otherwise . 

KeyboardInterrupt: 

Your output should look like this.
The custom attack has a very low BODEGA score, suggesting that the attack was not very successful (low success rate and low preservation of meaning).

VictimBERT on PR2:
```
Subset size: 416
Success score: 0.1778846153846154
Semantic score: 0.40792732766351186
Character score: 0.3001644500157
BODEGA score: 0.02308437726605881
Queries per example: 2.1778846153846154
Total attack time: 19.421820878982544
Time per example: 0.04668706942063112
Total evaluation time: 10.617336988449097
```

## Submission Files

Whenever you run an attack on a dataset, a submission_task.tsv file will be saved to your outputs directory. At the end of the test phase, you will need to submit your final attack's submission files to the shared task organisers for evaluation (1 for each dataset * num_victim_classifiers).

The submission file contains 4 pieces of information per attacked text:
1. was the attack successful
2. number of queries to victim model used to generate the adversarial sample
3. the original text
4. the adversarial text (or ATTACK_UNSUCCESSFUL if unsuccessful)

## Final tips:

### Using a subset of eval dataset
Testing your attack on the entire eval dataset can take a while. To speed things up, you can test on the first n samples of the dataset, by setting `using_first_n_samples` to `True`.  

### Running pre-implemented attacks

BODEGA supports a number of pre-existing attacks. Trying these might be useful if you want to:
- compare your performance with existing methods (also reported in the [BODEGA preprint](https://arxiv.org/abs/2303.08032))
- get inspiration from observing their substitutions

To use an existing attack requires only two changes to the code above:
1. set `using_custom_attacker` to `False`
2. set `attack` to the name of a supported attack
(`PWWS`, `SCPN`, `TextFooler`, `DeepWordBug`, `GAN`, `Genetic`, `PSO`, `BERTattack` or`BAE`)

Note that using `BAE` or `TextFooler` will require you to install additional dependencies since they rely on tensorflow:

- tensorflow >= 2.0.0
- tensorflow_hub

https://openattack.readthedocs.io/en/latest/quickstart/installation.html
