# Initialization
This cell downloads and extracts the dataset from https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip.
- Execute it **ONLY ONCE**, at the start of your work.

In [17]:
!wget https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip
!unzip "nli_fever.zip"
!rm "nli_fever.zip"
!rm -r "__MACOSX"

--2024-05-21 20:47:08--  https://www.dropbox.com/s/hylbuaovqwo2zav/nli_fever.zip
Resolving www.dropbox.com (www.dropbox.com)... 162.125.69.18, 2620:100:6025:18::a27d:4512
Connecting to www.dropbox.com (www.dropbox.com)|162.125.69.18|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /s/raw/hylbuaovqwo2zav/nli_fever.zip [following]
--2024-05-21 20:47:08--  https://www.dropbox.com/s/raw/hylbuaovqwo2zav/nli_fever.zip
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc044a043b2fa4b989d886d12271.dl.dropboxusercontent.com/cd/0/inline/CTWbR5X3Yxt9XL98kIpkMW15vU9rxB5da0Z_1lYlyJL-I4Kc0CILXN4cbI6NH9z99T5dsxlGIyFUpkQQCG3RWMcCtyepPTK_2tsEH0pY_CK3vaqhMiDRjFuZAOy_B9yM1ipPvsyLl3Xo04fi20R-k1JK/file# [following]
--2024-05-21 20:47:09--  https://uc044a043b2fa4b989d886d12271.dl.dropboxusercontent.com/cd/0/inline/CTWbR5X3Yxt9XL98kIpkMW15vU9rxB5da0Z_1lYlyJL-I4Kc0CILXN4cbI6NH9z99T5dsx

This cell initializes the models and the dataset.
- Execute it **ONLY ONCE**. If the process crashes for any reason, try re-running from this cell first, which avoids downloading the files again.
- If it still crashes, then re-run from the start.

In [77]:
import json
import random
from pprint import pprint

random.seed(3983751073717997123)

LABEL_MAP = {
    'SUPPORTS': 'entailment', 
    'NOT ENOUGH INFO': 'neutral', 
    'REFUTES': 'contradiction'
}
TRAIN_PATH = 'nli_fever/train_fitems.jsonl' 
with open(TRAIN_PATH, 'r') as fin:
    dataset = []
    for line in fin:
        dataset.append(json.loads(line))

to_sample = random.sample(population=range(0, len(dataset)), k=100)
sampled = [dataset[i] for i in to_sample]
#pprint(sampled)
print(len(dataset), 'samples')

208346 samples


In [48]:
def get_prediction(claim: str):
    # Placeholder: ignores `claim` and returns a random score for each label.
    return {'entailment': random.random(), 'neutral': random.random(), 'contradiction': random.random()}
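
The `get_prediction` function above returns random scores. As a minimal sketch of how the scores of several models could be combined, assuming each model is exposed as a callable that returns one score per label, the per-label scores can simply be averaged. The names `Scorer`, `average_ensemble`, and the two dummy scorers are hypothetical and not part of this notebook; averaging is only one possible way to combine an ensemble.

```python
from typing import Callable, Dict, List

LABELS = ("entailment", "neutral", "contradiction")

# Hypothetical type: a scorer maps (premise, hypothesis) to one score per label.
Scorer = Callable[[str, str], Dict[str, float]]


def average_ensemble(scorers: List[Scorer], premise: str, hypothesis: str) -> Dict[str, float]:
    """Average the per-label scores of every scorer in the ensemble."""
    if not scorers:
        raise ValueError("the ensemble needs at least one scorer")
    totals = {label: 0.0 for label in LABELS}
    for scorer in scorers:
        scores = scorer(premise, hypothesis)
        for label in LABELS:
            totals[label] += scores[label]
    return {label: total / len(scorers) for label, total in totals.items()}


# Dummy scorers used only to illustrate the call; real models would go here.
def always_neutral(premise: str, hypothesis: str) -> Dict[str, float]:
    return {"entailment": 0.1, "neutral": 0.8, "contradiction": 0.1}


def always_entailment(premise: str, hypothesis: str) -> Dict[str, float]:
    return {"entailment": 0.7, "neutral": 0.2, "contradiction": 0.1}


combined = average_ensemble([always_neutral, always_entailment], "premise text", "hypothesis text")
print(combined)
```

Note that this sketch takes the premise as well as the hypothesis, since the relationship being classified is between the two.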

# The Main Loop
This cell contains the main part of the program: it loops through each sampled item of the dataset, asking you to provide a new, hard-to-understand hypothesis for each of them.

You can choose to either:
1. modify the given hypothesis, keeping the same label
2. come up with a new hypothesis and its corresponding label (you can also use ChatGPT for ideas)

In both cases, when writing the result in [this Google Sheet](https://docs.google.com/spreadsheets/d/1k7JTOOS2jUDItxCh7xSjwf3eGR8skGP7P7HQGh7_WCg/edit#gid=0), also record the main "change" you performed.
- You can come up with your own categorization or take inspiration from the one in [this paper](https://arxiv.org/pdf/2010.12729) (see Table 2).

NOTE: **The changes to the hypothesis can be anything as long as the label does not change**.
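
To keep the entries consistent, one option is to assemble each row in code before copying it into the sheet. The sketch below is only an illustration: the helper `make_row` is hypothetical, the field names mirror the CSV columns created later in this notebook, and the bracketed strings are placeholders for your own text.

```python
import csv
import io

FIELDS = ["id", "cid", "hypothesis", "label", "new hypothesis", "new label", "change type"]


def make_row(sample_id, cid, hypothesis, label, new_hypothesis, change_type, new_label=None):
    """Build one row; new_label defaults to the original label (option 1)."""
    return {
        "id": sample_id,
        "cid": cid,
        "hypothesis": hypothesis,
        "label": label,
        "new hypothesis": new_hypothesis,
        "new label": label if new_label is None else new_label,
        "change type": change_type,
    }


row = make_row(
    48,
    79261,
    "Furious 7 never concluded filming.",
    "contradiction",
    "<your rewritten hypothesis>",
    "<your change category>",
)

# Write the row as CSV text so it can be pasted into a spreadsheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```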

### Formal Definition
**Given**:
- *M* :   ensemble of models that you will fool
- *P* :   premise (the 'context')
- *H* :   hypothesis (the 'claim'), simple enough so that *M* correctly classifies the relationship between *P* and *H*
- *L* :   gold label (the relationship between *P* and *H*)

**Task**: generate *H'* such that:
1. *H* and *H'* have more or less the same meaning --> the relationship between *P* and *H'* is the same as the relationship between *P* and *H*
2. *H'* can fool *M* --> *M* will predict a different relationship type 
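
Condition 2 can be checked mechanically, while condition 1 still requires human judgment. A minimal sketch of the check, assuming the ensemble returns one score per label as a dictionary (the helper names `predicted_label` and `is_fooled` are hypothetical):

```python
from typing import Dict


def predicted_label(scores: Dict[str, float]) -> str:
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


def is_fooled(scores: Dict[str, float], gold_label: str) -> bool:
    """True when the top-scoring label differs from the gold label."""
    return predicted_label(scores) != gold_label


# Illustrative scores for a hypothesis whose gold label is 'contradiction'.
example_scores = {"entailment": 0.53, "neutral": 0.94, "contradiction": 0.0}
print(predicted_label(example_scores))
print(is_fooled(example_scores, "contradiction"))
```

Here the top-scoring label is `neutral`, which differs from the gold label, so the hypothesis counts as fooling the ensemble.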

In [43]:
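last = int(input("If you are resuming, enter the last ID you worked on (otherwise 0): "))
# IDs index into `sampled`, so validate against its length rather than the full dataset.
assert last < len(sampled), "You entered an ID value that is higher than the number of sampled items -- Rerun this cell."

# Clamp negative input to 0 so the slice and the displayed ID stay in sync.
i = max(0, last)
for elem in sampled[i:]: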
    print("-"*30)
    print(f"[ID {i} - CID {elem['cid']}]")
    print(f"PREMISE:")
    for context in elem['context'].split('.'):
        if context.strip() != '':
            print(f"\t> {context.strip()}.")
    print(f"HYPOTHESIS:\n\t> {elem['query']}")
    print(f"GOLD LABEL: {LABEL_MAP[elem['label']]}")
    print("-"*30)

    hypothesis = input("> type new hypothesis: ")
    while hypothesis.lower() != 'n':
        prediction = get_prediction(hypothesis)
        # rescore for better visibility
        prediction = {k: int(v*100) for k, v in prediction.items()}
        predicted = max(prediction, key=prediction.get)
        if predicted != LABEL_MAP[elem["label"]]:
            print(f"PREDICTED LABEL **CHANGED**: >>>> {predicted} <<<< -- {prediction}", flush=True)
        else:
            print(f"PREDICTED LABEL: {predicted} -- {prediction}", flush=True)
        hypothesis = input("type n to exit, otherwise type new hypothesis: ")
    
    i += 1
    

------------------------------
[ID 48 - CID 79261]
PREMISE:
	> Furious 7.
	> Principal photography began in Atlanta , Georgia , in September 2013 , resumed in April 2014 and ended in July 2014 , with other filming locations including Los Angeles , Colorado , Abu Dhabi , and Tokyo.
	> Furious 7 premiered in Los Angeles on April 1 , 2015 , and was theatrically released in the United States on April 3 , 2015 , playing in 3D , IMAX 3D , and 4DX internationally.
HYPOTHESIS:
	> Furious 7 never concluded filming.
GOLD LABEL: contradiction
------------------------------
PREDICTED LABEL **CHANGED**: >>>> neutral <<<< -- {'entailment': 53, 'neutral': 94, 'contradiction': 0}
------------------------------
[ID 49 - CID 210596]
PREMISE:
	> Primal Fear is a 1996 American neo-noir crime-thriller film , based on William Diehl 's 1993 novel of the same name and directed by Gregory Hoblit.
	> Richard Gere.
	> He went on to star in several hit films , including An Officer and a Gentleman , Pretty Woman ,

# CSV creation for step 1 

In [78]:
import polars as pl
to_csv = {
    'id' : [],
    'cid' : [], 
    'premise': [],
    'hypothesis' : [],
    'alternative hypothesis' : [], 
    'label' : [],	
    'new hypothesis': [], 	
    'new label': [], 
    'change type':[]
}
for i, sample in enumerate(sampled):
    to_csv['id'].append(i)
    to_csv['cid'].append(sample['cid'])
    to_csv['premise'].append(sample['context'])
    to_csv['hypothesis'].append(sample['query'])
    to_csv['alternative hypothesis'].append('')
    to_csv['label'].append(LABEL_MAP[sample['label']])
    to_csv['new hypothesis'].append('')
    to_csv['new label'].append('')
    to_csv['change type'].append('')
to_csv = pl.from_dict(to_csv)
to_csv.write_csv(TRAIN_PATH.replace('.jsonl', '.csv'), separator=',')
print("csv written.")

q = (
    to_csv.lazy()
    .group_by("label")
    .len()
)
df = q.collect()
print(df)

csv written.
shape: (3, 2)
┌───────────────┬─────┐
│ label         ┆ len │
│ ---           ┆ --- │
│ str           ┆ u32 │
╞═══════════════╪═════╡
│ entailment    ┆ 56  │
│ neutral       ┆ 17  │
│ contradiction ┆ 27  │
└───────────────┴─────┘
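
As a cross-check that does not depend on polars, the same per-label counts can be computed with the standard library. This is only a sketch: the `rows` list below is made-up illustration data, not the actual sample.

```python
from collections import Counter

# Made-up rows standing in for the sampled items (illustration only).
rows = [
    {"id": 0, "label": "entailment"},
    {"id": 1, "label": "contradiction"},
    {"id": 2, "label": "entailment"},
    {"id": 3, "label": "neutral"},
]

# Count how many rows carry each label.
counts = Counter(row["label"] for row in rows)
for label, n in counts.most_common():
    print(label, n)
```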
