# Initialization
---
This cell downloads and extracts the dataset from https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip.
- Execute it **ONLY ONCE**, at the start of your work.
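
One way to make the "only once" rule harder to break is to guard the download with a check for the extracted directory. The sketch below is a Python alternative to the shell commands, not the notebook's own method. The helper name `ensure_dataset` and its injectable `run` parameter are assumptions added for illustration, and it presumes `wget` and `unzip` are available on the machine.

```python
import os
import shutil
import subprocess

URL = "https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip"
ARCHIVE = "nli_fever.zip"
DATA_DIR = "nli_fever"  # directory the notebook later reads train_fitems.jsonl from


def ensure_dataset(data_dir=DATA_DIR, run=subprocess.run):
    """Download and extract the archive only when data_dir is missing.

    Returns True if a download was attempted, False if it was skipped.
    `run` is injectable (hypothetical parameter) so the logic can be
    exercised without network access.
    """
    if os.path.isdir(data_dir):
        return False  # already extracted: skip the download
    run(["wget", URL], check=True)
    run(["unzip", ARCHIVE], check=True)
    if os.path.exists(ARCHIVE):
        os.remove(ARCHIVE)  # mirrors the `rm` step of the shell cell
    shutil.rmtree("__MACOSX", ignore_errors=True)  # mirrors `rm -r "__MACOSX"`
    return True
```

Calling `ensure_dataset()` a second time returns `False` without touching the network, so re-running the cell by mistake should be harmless.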

In [2]:
!wget https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip
!unzip "nli_fever.zip"
!rm "nli_fever.zip"
!rm -r "__MACOSX"
!ls -l


These cells initialize the models and the dataset.
- You need to execute them **ONLY ONCE**. If the process crashes for any reason, try re-running from this cell, so that you avoid downloading the files again.
- If it still crashes, re-run from the start.

In [3]:
import json
import random
random.seed(3983751073717997123)

LABEL_MAP = {
    'SUPPORTS': 'entailment',
    'NOT ENOUGH INFO': 'neutral',
    'REFUTES': 'contradiction'
}
TRAIN_PATH = 'nli_fever/train_fitems.jsonl'

with open(TRAIN_PATH, 'r') as fin:
    dataset = []
    for line in fin:
        dataset.append(json.loads(line))

to_sample = random.sample(population=range(0, len(dataset)), k=100)
sampled = [dataset[i] for i in to_sample]
print(len(dataset), 'samples')

208346 samples
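
Because each record's `label` is translated through `LABEL_MAP`, it can be useful to check how the three NLI labels are distributed among the sampled items. This is a minimal sketch, assuming records carry a `label` field as used later in the notebook. The `toy_records` list is made-up data for illustration only, and `label_distribution` is a hypothetical helper name.

```python
from collections import Counter

LABEL_MAP = {
    'SUPPORTS': 'entailment',
    'NOT ENOUGH INFO': 'neutral',
    'REFUTES': 'contradiction',
}


def label_distribution(records):
    """Count records per NLI label after mapping FEVER labels through LABEL_MAP."""
    return Counter(LABEL_MAP[r['label']] for r in records)


# Made-up records standing in for `sampled`; only the 'label' field is used.
toy_records = [
    {'label': 'SUPPORTS'},
    {'label': 'REFUTES'},
    {'label': 'SUPPORTS'},
    {'label': 'NOT ENOUGH INFO'},
]
print(label_distribution(toy_records))
```

In the notebook itself, `label_distribution(sampled)` would give the counts for the 100 sampled items.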


In [15]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")


def initialize_models():
    # Hugging Face model identifiers for the three NLI models of the ensemble
    model_name_base = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
    model_name_large = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
    model_name_large_2 = "Joelzhang/deberta-v3-large-snli_mnli_fever_anli_R1_R2_R3-nli"

    tokenizer_base = AutoTokenizer.from_pretrained(model_name_base)
    model_base = AutoModelForSequenceClassification.from_pretrained(model_name_base)

    tokenizer_large = AutoTokenizer.from_pretrained(model_name_large)
    model_large = AutoModelForSequenceClassification.from_pretrained(model_name_large)

    tokenizer_large_2 = AutoTokenizer.from_pretrained(model_name_large_2)
    model_large_2 = AutoModelForSequenceClassification.from_pretrained(model_name_large_2)

    # Both dictionaries share the same keys, so a key selects a matching pair
    models = {"base": model_base.to(device),
              "large": model_large.to(device),
              "large2": model_large_2.to(device)}
    tokenizers = {"base": tokenizer_base,
                  "large": tokenizer_large,
                  "large2": tokenizer_large_2}
    return tokenizers, models


def get_prediction(premise, hypothesis, model_key):
    # model_key selects the tokenizer/model pair: "base", "large" or "large2"
    model_input = tokenizers[model_key](premise, hypothesis, truncation=False, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        output = models[model_key](model_input["input_ids"].to(device))  # device = "cuda:0" or "cpu"
    prediction = torch.softmax(output["logits"][0], -1).tolist()
    label_names = ["entailment", "neutral", "contradiction"]
    # convert probabilities to percentages rounded to one decimal
    prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
    return prediction

In [12]:
tokenizers, models = initialize_models()
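
The scores returned by `get_prediction` are softmax probabilities expressed as percentages rounded to one decimal. The same computation can be sketched in plain Python, without loading any model. The logits below are invented values, and `scores_from_logits` is a hypothetical name used only for this illustration.

```python
import math

LABEL_NAMES = ["entailment", "neutral", "contradiction"]


def scores_from_logits(logits):
    """Turn raw logits into per-label percentages, mirroring get_prediction."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {name: round(e / total * 100, 1) for name, e in zip(LABEL_NAMES, exps)}


scores = scores_from_logits([2.0, 0.5, -1.0])  # invented logits
predicted = max(scores, key=scores.get)  # label with the highest score
print(predicted, scores)
```

Because of the rounding, the three percentages may not add up to exactly 100.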



---
# The Main Loop
This cell contains the main part of the program: it loops through the sampled items and, for each one, asks you to provide a new hypothesis that is harder for the model to classify correctly.

You can choose either to:
1. modify the given hypothesis, keeping the same label
2. come up with a new hypothesis and its corresponding label (you can also use ChatGPT for ideas)

In both cases, when you record the result in [this Google Sheet](https://docs.google.com/spreadsheets/d/1k7JTOOS2jUDItxCh7xSjwf3eGR8skGP7P7HQGh7_WCg/edit#gid=0), also note the main "change" you made.
- You can devise your own categorization or take inspiration from the one in [this paper](https://arxiv.org/pdf/2010.12729) (see Table 2).
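
To avoid losing attempts before they are copied to the sheet, it may help to keep a local log as well. The sketch below appends one row per attempt to a CSV file. The column names in `FIELDS`, the helper `log_attempt`, and the file name in the usage note are all assumptions made for illustration; they are not the layout of the shared sheet.

```python
import csv
import os

# Hypothetical columns; adapt them to the layout of the shared sheet.
FIELDS = ["id", "cid", "model", "gold_label", "new_hypothesis", "predicted_label", "change"]


def log_attempt(path, row):
    """Append one attempt to a CSV file, writing the header on first use."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fout:
        writer = csv.DictWriter(fout, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A call such as `log_attempt("attempts.csv", {...})` with one value per column would add a line to the log; the "change" column holds the category of the modification you made.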

NOTE: **The changes to the hypothesis can be anything, as long as the label does not change**.

---
### Formal Definition
**Given**:
- *M* :   ensemble of models that you will fool
- *P* :   premise (the 'context')
- *H* :   hypothesis (the 'claim'), simple enough so that *M* correctly classifies the relationship between *P* and *H*
- *L* :   gold label (the relationship between *P* and *H*)

**Task**: generate *H'* such that:
1. *H* and *H'* have more or less the same meaning --> the relationship between *P* and *H'* is the same as the relationship between *P* and *H*
2. *H'* can fool *M* --> *M* will predict a different relationship type
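
Condition 2 can be checked mechanically: *H'* fools *M* when the label with the highest score differs from the gold label *L*. Condition 1 still requires human judgment. This is a minimal sketch, assuming scores come as a label-to-percentage dictionary like the one `get_prediction` returns; `fools_model` is a hypothetical helper and the scores are invented.

```python
def fools_model(gold_label, scores):
    """Return True when the top-scoring label differs from the gold label."""
    predicted = max(scores, key=scores.get)
    return predicted != gold_label


# Invented scores for a rewritten hypothesis H' whose gold label is 'entailment'.
scores_h_prime = {"entailment": 21.4, "neutral": 70.3, "contradiction": 8.3}
print(fools_model("entailment", scores_h_prime))
```

Here the model's top label is `neutral`, so the rewritten hypothesis counts as a successful attack on an `entailment` item.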

In [None]:
last = int(input("If you are resuming, enter the last ID you worked on (otherwise 0): "))
# IDs index into `sampled` (100 items), not into the full dataset
assert last < len(sampled), "You entered an ID that is higher than the number of sampled items -- rerun this cell."

i = max(0, last)  # guard against negative input
for elem in sampled[i:]:
    chosen_model = random.choice(list(models.keys()))
    print("-"*30)
    print(f"[ID {i} - CID {elem['cid']} - model to fool: {chosen_model}]")
    print("PREMISE:")
    for context in elem['context'].split('.'):
        if context.strip() != '':
            print(f"\t> {context.strip()}.")
    print(f"HYPOTHESIS:\n\t> {elem['query']}")
    print(f"GOLD LABEL: {LABEL_MAP[elem['label']]}")
    print("-"*30)

    hypothesis = input("> type new hypothesis: ")
    while hypothesis.lower() != 'n':
        prediction = get_prediction(elem['context'], hypothesis, chosen_model)
        # rescore for better visibility
        #prediction = {k: int(v*100) for k, v in prediction.items()}
        predicted = max(prediction, key=prediction.get)
        if predicted != LABEL_MAP[elem["label"]]:
            print(f"PREDICTED LABEL **CHANGED**: >>>> {predicted} <<<< -- {prediction}", flush=True)
        else:
            print(f"PREDICTED LABEL: {predicted} -- {prediction}", flush=True)
        hypothesis = input("type n to exit, otherwise type a new hypothesis: ")

    i += 1