<a href="https://colab.research.google.com/github/DreRnc/ExplainingExplanations/blob/main/Explainations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Dataset: **e-SNLI**. \
Model: **T5-small**.

In [1]:
# git clone https://github.com/DreRnc/ExplainingExplanations.git
# cd ExplainingExplanations
# pip install -r requirements.txt

# 1.0 Preparation


## 1.1 Loading Dataset

In [2]:
from datasets import load_dataset

dataset = load_dataset("esnli")



In [3]:
training_set = dataset['train']
validation_set = dataset['validation']
test_set = dataset['test']

print("Shape of training_set: ", training_set.shape)
print("Shape of validation_set: ", validation_set.shape)
print("Shape of test_set: ", test_set.shape)

Shape of training_set:  (549367, 6)
Shape of validation_set:  (9842, 6)
Shape of test_set:  (9824, 6)


In [4]:
training_set[0]

{'premise': 'A person on a horse jumps over a broken down airplane.',
 'hypothesis': 'A person is training his horse for a competition.',
 'label': 1,
 'explanation_1': 'the person is not necessarily training his horse',
 'explanation_2': '',
 'explanation_3': ''}

In [5]:
n_train = n_valid = n_test = 5000

train_small = training_set.select(range(n_train))
valid_small = validation_set.select(range(n_valid))
test_small = test_set.select(range(n_test))

print("Shape of train_small: ", train_small.shape)
print("Shape of valid_small: ", valid_small.shape)
print("Shape of test_small: ", test_small.shape)


Shape of train_small:  (5000, 6)
Shape of valid_small:  (5000, 6)
Shape of test_small:  (5000, 6)


## 1.2 Loading T5 Model

In [6]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")



Sanity check: run the pretrained model **zero-shot** on an arbitrary task (here, English-to-French translation).

In [7]:
input_ids = tokenizer("translate English to French: Hello Dre, I think the English version is ok for us.", return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_new_tokens=100)

# max_length is not a decode() argument; output length is capped by max_new_tokens above
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Bonjour Dre, je pense que la version anglaise est bonne pour nous.


## 1.3 Zero-shot to Verify Everything is Working

In [8]:
from src.utils import generate_prompt_mnli

In [9]:
example = training_set[0]
example

{'premise': 'A person on a horse jumps over a broken down airplane.',
 'hypothesis': 'A person is training his horse for a competition.',
 'label': 1,
 'explanation_1': 'the person is not necessarily training his horse',
 'explanation_2': '',
 'explanation_3': ''}

Generating the prompt. The input follows the MNLI prompt format used by T5, for example:

<b><u>mnli hypothesis:</u></b> The St. Louis Cardinals have always won. <b><u>premise:</u></b> yeah well losing is i mean i’m i’m originally from Saint Louis and Saint Louis Cardinals when they were there were uh a mostly a losing team but

Output (integer label in the dataset and the class it stands for):
* 0: Entailment
* 1: Neutral
* 2: Contradiction
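
The helper `generate_prompt_mnli` comes from the repository's `src.utils`, and its source is not shown in this notebook. As a rough sketch only, a function that produces the same prompt string as the one printed in the next cell could look like the following. The names `build_mnli_prompt` and `LABEL_NAMES` are hypothetical and are not taken from the repository.

```python
# Hypothetical stand-in for src.utils.generate_prompt_mnli (actual source not shown).
# Mapping from the dataset's integer labels to the class names listed above.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def build_mnli_prompt(example: dict) -> str:
    """Format a premise/hypothesis pair as an MNLI-style T5 prompt."""
    return f"mnli hypothesis: {example['hypothesis']} premise: {example['premise']}"


example = {
    "premise": "A person on a horse jumps over a broken down airplane.",
    "hypothesis": "A person is training his horse for a competition.",
    "label": 1,
}
print(build_mnli_prompt(example))
print(LABEL_NAMES[example["label"]])
```

Putting the hypothesis before the premise matches the prompt shown in this notebook; the real helper may differ in details such as whitespace handling.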

In [10]:
prompt = generate_prompt_mnli(example)
prompt

'mnli hypothesis: A person is training his horse for a competition. premise: A person on a horse jumps over a broken down airplane.'

In [11]:
input_ids = tokenizer(prompt, return_tensors= "pt").input_ids

outputs = model.generate(input_ids)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

neutral




# 2.0 Task 1: Zero-shot evaluation

In [12]:
from src.utils import evaluate_output_mnli
from tqdm import tqdm
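
The scoring helper `evaluate_output_mnli` is also imported from `src.utils` without its source being shown. One plausible way to score a prediction, assuming the model emits the class name as plain text (as in the `neutral` output above), is an exact match against the name of the gold label. The function `score_prediction` below is a hypothetical sketch, not the repository's implementation.

```python
# Hypothetical stand-in for src.utils.evaluate_output_mnli (actual source not shown).
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def score_prediction(output: str, label: int) -> int:
    """Return 1 if the generated text names the gold label, else 0."""
    # Normalise case and surrounding whitespace before comparing.
    return int(output.strip().lower() == LABEL_NAMES[label])


predictions = ["neutral", "entailment", "contradiction"]
gold_labels = [1, 0, 1]
correct = sum(score_prediction(p, g) for p, g in zip(predictions, gold_labels))
print(correct / len(gold_labels))
```

Exact matching counts any output other than the three class names as wrong, which is a strict but simple choice for a zero-shot baseline.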

In [18]:
correct = 0

for datapoint in tqdm(test_small):
    # Build the MNLI-style prompt and generate the predicted label as text
    prompt = generate_prompt_mnli(datapoint)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    outputs = model.generate(input_ids)
    output = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Compare the generated text against the gold label
    correct += evaluate_output_mnli(output, datapoint['label'])

print("Accuracy zero shot on test set: ", correct / len(test_small))

100%|██████████| 5000/5000 [04:51<00:00, 17.15it/s]

Accuracy zero shot on test set:  0.717
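
The evaluation above makes one `generate` call per example and took close to five minutes for 5000 examples. Batching the prompts is a common way to reduce that cost. The sketch below assumes a Hugging Face tokenizer and model are passed in as arguments; `batched` and `generate_batched` are hypothetical helpers, and the `batch_size` and `max_new_tokens` defaults are illustrative assumptions rather than tuned values.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def generate_batched(prompts: List[str], tokenizer, model,
                     batch_size: int = 32, max_new_tokens: int = 5) -> List[str]:
    """Generate one decoded output string per prompt, processing prompts in batches."""
    predictions: List[str] = []
    for chunk in batched(prompts, batch_size):
        # Pad to the longest prompt in the batch; the attention mask is passed along.
        encoded = tokenizer(list(chunk), return_tensors="pt", padding=True)
        outputs = model.generate(**encoded, max_new_tokens=max_new_tokens)
        predictions.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return predictions


# The chunking helper on its own, without any model:
print([list(chunk) for chunk in batched(list(range(5)), 2)])
```

With the notebook's objects, a call might look like `generate_batched([generate_prompt_mnli(d) for d in test_small], tokenizer, model)`, with the returned strings scored exactly as in the loop above.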





# 3.0 Task 2: Fine-tuning without explanations

# 4.0 Task 3: Making the model generate explanations