# Group 42 - COMP34812



**Task A :** Natural Language Inference (NLI)

*Given a premise and a hypothesis, determine if the hypothesis is true based on the
premise. You will be given more than 26K premise-hypothesis pairs as training data, and
more than 6K pairs as validation data.*

**Solution C :** Deep learning-based approaches underpinned by transformer architectures

*Our final model is an ensemble that combines the predictions of three transformer models (T5, RoBERTa and FlanT5) by hard voting. Each pre-trained model was fine-tuned on our dataset (transfer learning), with a BiLSTM layer added to its classification head. Starting from pre-trained weights rather than training from scratch should give faster convergence and better performance.*

**Group 42 :** Aisha Wahid & Libby Walton

## Preparing Dataset

In [1]:
import os
import numpy as np
os.environ["KERAS_BACKEND"] = "tensorflow"
%env TF_USE_LEGACY_KERAS=1
import tensorflow as tf

env: TF_USE_LEGACY_KERAS=1


In [2]:
from google.colab import drive
drive.mount('/content/GoogleDrive')

Mounted at /content/GoogleDrive


In [3]:
import pandas as pd

dev_df = pd.read_csv('/content/GoogleDrive/MyDrive/NLU_Model/dev.csv')
dev_df['hypothesis'] = dev_df['hypothesis'].astype(str)

## Loading Models

### Tokenisers

In [8]:
import tensorflow as tf
import numpy as np
from transformers import T5Tokenizer, T5ForConditionalGeneration

# T5 tokeniser
tokenizer = T5Tokenizer.from_pretrained("google-t5/t5-base")

def t5_encode(premises, hypotheses, tokenizer, max_length=120):
    # Every call passes the premise first, so each input reads "premise [SEP] hypothesis"
    concatenated_inputs = [p + ' [SEP] ' + h for p, h in zip(np.array(premises), np.array(hypotheses))]

    inputs = tokenizer(
        concatenated_inputs,
        padding='max_length',
        truncation=True,
        max_length=max_length,
        return_tensors='tf'
    )

    return {
        'input_ids': inputs['input_ids'],
        'attention_mask': inputs['attention_mask']
    }

# Tokenise the development data (premise first, then hypothesis)
dev_input_T5 = t5_encode(dev_df.premise.values, dev_df.hypothesis.values, tokenizer)

OSError: Can't load tokenizer for 'google-t5/t5-base'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'google-t5/t5-base' is the correct path to a directory containing all relevant files for a T5Tokenizer tokenizer.

In [10]:
import tensorflow as tf
import numpy as np
from transformers import RobertaTokenizer

# RoBERTa Tokeniser
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

def roberta_encode(premises, hypotheses, tokenizer, max_length=120):
    # Every call passes the premise first, so each input reads "premise </s> hypothesis"
    concatenated_inputs = [p + ' </s> ' + h for p, h in zip(np.array(premises), np.array(hypotheses))]

    inputs = tokenizer(
        concatenated_inputs,
        padding='max_length',
        truncation=True,
        max_length=max_length,
        return_tensors='tf'
    )

    return {
        'input_ids': inputs['input_ids'],
        'attention_mask': inputs['attention_mask']
    }

# Tokenise the development data (premise first, then hypothesis)
dev_input_RB = roberta_encode(dev_df.premise.values, dev_df.hypothesis.values, tokenizer)


In [11]:
import tensorflow as tf
import numpy as np
from transformers import T5Tokenizer, T5ForConditionalGeneration

# FLAN T5 Tokeniser
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")

def flan_t5_encode(premises, hypotheses, tokenizer, max_length=120):
    # Every call passes the premise first, so each input reads "premise [SEP] hypothesis"
    concatenated_inputs = [p + ' [SEP] ' + h for p, h in zip(np.array(premises), np.array(hypotheses))]

    inputs = tokenizer(
        concatenated_inputs,
        padding='max_length',
        truncation=True,
        max_length=max_length,
        return_tensors='tf'
    )

    return {
        'input_ids': inputs['input_ids'],
        'attention_mask': inputs['attention_mask']
    }

# Tokenise the development data (premise first, then hypothesis)
dev_input_FLAN = flan_t5_encode(dev_df.premise.values, dev_df.hypothesis.values, tokenizer)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
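
The three encoders above share one pattern: the first argument is joined to the second with a model-specific separator string before tokenisation, and every call passes the premise first. The following is a minimal, tokenizer-free sketch of that string construction. `build_pair_texts` is a hypothetical helper, not part of the notebook or of `transformers`, and the example sentences are made up.

```python
def build_pair_texts(premises, hypotheses, separator):
    """Join each premise to its hypothesis with the given separator string."""
    if len(premises) != len(hypotheses):
        raise ValueError("premises and hypotheses must have the same length")
    return [f"{p} {separator} {h}" for p, h in zip(premises, hypotheses)]

# Made-up example pair, for illustration only
premises = ["A man is playing a guitar."]
hypotheses = ["A person is making music."]

# Same separators as the notebook's encode functions
t5_texts = build_pair_texts(premises, hypotheses, "[SEP]")
roberta_texts = build_pair_texts(premises, hypotheses, "</s>")

print(t5_texts[0])
print(roberta_texts[0])
```

Printing a few joined strings like this is a quick way to confirm the premise and hypothesis are in the order the models saw during fine-tuning.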


### Load Models

In [13]:
import transformers

# Load the fine-tuned models from Google Drive; custom_objects tells Keras
# which Hugging Face class to use for the encoder saved inside each .h5 file
modelT5 = tf.keras.models.load_model(
    '/content/GoogleDrive/MyDrive/NLU_Model/T5_model_86p31.h5',
    custom_objects={"TFT5EncoderModel": transformers.TFT5EncoderModel},
)
modelRoBERTa = tf.keras.models.load_model(
    '/content/GoogleDrive/MyDrive/NLU_Model/roBERT_model_85p39.h5',
    custom_objects={"TFRobertaModel": transformers.TFRobertaModel},
)
modelFlanT5 = tf.keras.models.load_model(
    '/content/GoogleDrive/MyDrive/NLU_Model/T5_flan_model_87p38.h5',
    custom_objects={"TFT5EncoderModel": transformers.TFT5EncoderModel},
)



### Ensemble

In [14]:
from collections import Counter

# Generate predictions for each model
predictionsT5 = np.argmax(modelT5.predict(dev_input_T5), axis=1)
predictionsRoBERTa = np.argmax(modelRoBERTa.predict(dev_input_RB), axis=1)
predictionsT5Flan = np.argmax(modelFlanT5.predict(dev_input_FLAN), axis=1)

# Hard voting: take the majority label across the three models for each example
ensemble_predictions = []
for pred_t5, pred_roberta, pred_flan in zip(predictionsT5, predictionsRoBERTa, predictionsT5Flan):
    votes = Counter([pred_t5, pred_roberta, pred_flan])
    ensemble_predictions.append(votes.most_common(1)[0][0])

# Calculate accuracy against the gold labels
acc = np.mean(np.array(ensemble_predictions) == dev_df.label.values)
print("Ensemble Accuracy:", acc)

Ensemble Accuracy: 0.8797684429271189
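
To see what hard voting contributes, it can help to score each model separately alongside the ensemble. The sketch below does this on made-up toy labels using only the standard library. `hard_vote` and `accuracy` are hypothetical helper names, and the numbers are not results from the notebook.

```python
from collections import Counter

def hard_vote(*model_predictions):
    """Return the majority label per example across several prediction lists."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]

def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    return sum(int(p == y) for p, y in zip(predictions, labels)) / len(labels)

# Toy data (illustrative only): each model makes one mistake on a different example
labels  = [1, 0, 1, 1, 0]
model_a = [1, 0, 1, 0, 0]
model_b = [1, 1, 1, 1, 0]
model_c = [0, 0, 1, 1, 0]

ensemble = hard_vote(model_a, model_b, model_c)
for name, preds in [("A", model_a), ("B", model_b), ("C", model_c), ("ensemble", ensemble)]:
    print(name, accuracy(preds, labels))
```

In this toy case the models err on different examples, so the majority vote corrects each mistake. When the labels are binary, three voters can never tie; with more classes, `Counter.most_common` resolves a tie in favour of the label counted first.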


### Writing predicted labels to csv

In [None]:
pd.set_option('display.max_rows', None)
# Build the output table from the ensemble's hard-voted labels
result_df = pd.DataFrame({'prediction': ensemble_predictions})
column_name_row = pd.DataFrame({'prediction': ['prediction']}, index=[0])
result_df['prediction'] = result_df['prediction'].astype(int)
# Prepend a row holding the column name, then renumber the index
result_df = pd.concat([column_name_row, result_df]).reset_index(drop=True)
result_df
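
The cell above builds the prediction table but does not show the file being written. One way this could be done is sketched below with the standard `csv` module. The output filename `predictions.csv` and the helper `write_predictions` are assumptions for illustration, not the notebook's actual path or code.

```python
import csv
import os
import tempfile

def write_predictions(predictions, path):
    """Write one integer label per row under a single 'prediction' header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])
        for label in predictions:
            writer.writerow([int(label)])

# Toy labels and a hypothetical filename inside a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "predictions.csv")
write_predictions([1, 0, 1], out_path)

# Read the file back to check its contents
with open(out_path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

With pandas, `DataFrame.to_csv(path, index=False)` writes the column name as the header row in the same way.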