## RTE: Recognizing Textual Entailment

Recognizing Textual Entailment (RTE) is a sentence-pair classification task. The data consists of sentence pairs drawn from a series of annual textual entailment challenges, with each pair labeled `entailment` or `not_entailment`.

See the [ACL wiki page](https://aclweb.org/aclwiki/Recognizing_Textual_Entailment) for more information.

In [1]:
import numpy as np
import pandas as pd
import os
import sys
import csv
from sklearn import metrics
from sklearn.metrics import classification_report

sys.path.append("../") 
from bert_sklearn import BertClassifier

DATADIR = os.path.join(os.getcwd(), 'glue_data')

In [2]:
%%bash
python3 download_glue_data.py --data_dir glue_data --tasks RTE 

Downloading and extracting RTE...
	Completed!


In [2]:
"""
RTE train data size: 2489 
RTE dev data size: 277 
"""

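def read_tsv(filename, quotechar=None):
    """Read a tab-separated file into a list of rows."""
    with open(filename, "r", encoding='utf-8') as f:
        return list(csv.reader(f, delimiter="\t", quotechar=quotechar))


def get_rte_df(filename):
    """Load an RTE tsv file with columns text_a, text_b, label."""
    # keep_default_na=False keeps empty fields as "" instead of NaN
    df = pd.read_csv(filename, sep='\t', encoding='utf8', keep_default_na=False)
    df = df[['sentence1', 'sentence2', 'label']]
    df.columns = ['text_a', 'text_b', 'label']
    # drop rows that have no label
    df = df[pd.notnull(df['label'])]
    df = df[df.label != ""]
    return df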

def get_rte_data(train_file=DATADIR + '/RTE/train.tsv',
                 dev_file=DATADIR + '/RTE/dev.tsv'):
    """Load the RTE train and dev splits plus the sorted label list."""
    train = get_rte_df(train_file)
    print("RTE train data size: %d " % (len(train)))
    dev = get_rte_df(dev_file)
    print("RTE dev data size: %d " % (len(dev)))
    # np.unique returns the distinct labels in sorted order
    label_list = np.unique(train['label'].values)
    return train, dev, label_list

train,dev,label_list = get_rte_data()


RTE train data size: 2489 
RTE dev data size: 277 
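The loading step above renames the two sentence columns and drops rows that have no label. As a rough illustration of that filtering, here is a minimal sketch that uses only the standard library instead of pandas. The three-row sample and the `load_pairs` helper are hypothetical and are not part of the notebook or the GLUE data.

```python
import csv
import io

# Hypothetical sample in the same column layout as the RTE tsv files.
SAMPLE_TSV = (
    "index\tsentence1\tsentence2\tlabel\n"
    "0\tA dog barks.\tAn animal makes a sound.\tentailment\n"
    "1\tA dog barks.\tA cat is asleep.\tnot_entailment\n"
    "2\tA dog barks.\tThe dog is brown.\t\n"  # no label, should be dropped
)


def load_pairs(handle):
    """Return labeled sentence pairs as dicts with text_a, text_b, label."""
    # QUOTE_NONE treats quote characters as ordinary text
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {"text_a": row["sentence1"],
         "text_b": row["sentence2"],
         "label": row["label"]}
        for row in reader
        if row["label"]  # skip rows whose label field is empty
    ]


pairs = load_pairs(io.StringIO(SAMPLE_TSV))
labels = sorted({p["label"] for p in pairs})
print(len(pairs), labels)
```

Only the two labeled rows survive, and the sorted set of labels plays the same role as `label_list` in the notebook.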


In [4]:
print(label_list)

['entailment' 'not_entailment']


In [5]:
train.head()

| | text_a | text_b | label |
|---|---|---|---|
| 0 | No Weapons of Mass Destruction Found in Iraq Yet. | Weapons of Mass Destruction Found in Iraq. | not_entailment |
| 1 | A place of sorrow, after Pope John Paul II die... | Pope Benedict XVI is the new leader of the Rom... | entailment |
| 2 | Herceptin was already approved to treat the si... | Herceptin can be used to treat breast cancer. | entailment |
| 3 | Judie Vivian, chief executive at ProMedica, a ... | The previous name of Ho Chi Minh City was Saigon. | entailment |
| 4 | A man is due in court later charged with the m... | Paul Stewart Hutchinson is accused of having s... | not_entailment |


In [5]:
%%time
X_train = train[['text_a','text_b']]
y_train = train['label']

# define model
model = BertClassifier()
model.epochs = 4
model.learning_rate = 3e-5
model.max_seq_length = 96
model.validation_fraction = 0

print('\n',model,'\n')

# fit model
model.fit(X_train, y_train)

# test model on dev
test = dev
X_test = test[['text_a','text_b']]
y_test = test['label']

# make predictions
y_pred = model.predict(X_test)
print("Accuracy: %0.2f%%"%(metrics.accuracy_score(y_test, y_pred) * 100))

target_names = ['entailment', 'not_entailment']
print(classification_report(y_test, y_pred, target_names=target_names))

Building sklearn classifier...

 BertClassifier(bert_model='bert-base-uncased', epochs=4, eval_batch_size=8,
        fp16=False, gradient_accumulation_steps=1, label_list=None,
        learning_rate=3e-05, local_rank=-1, logfile='bert_sklearn.log',
        loss_scale=0, max_seq_length=96, num_mlp_hiddens=500,
        num_mlp_layers=0, random_state=42, restore_file=None,
        train_batch_size=32, use_cuda=True, validation_fraction=0,
        warmup_proportion=0.1) 

Loading bert-base-uncased model...
Defaulting to linear classifier/regressor
train data size: 2489, validation data size: 0


Training: 100%|██████████| 78/78 [00:46<00:00,  1.73it/s, loss=0.717]
Training: 100%|██████████| 78/78 [00:46<00:00,  1.75it/s, loss=0.636]
Training: 100%|██████████| 78/78 [00:48<00:00,  1.64it/s, loss=0.417]
Training: 100%|██████████| 78/78 [00:52<00:00,  1.50it/s, loss=0.287]
                                                           

Accuracy: 64.62%
                precision    recall  f1-score   support

    entailment       0.63      0.79      0.70       146
not_entailment       0.67      0.49      0.57       131

     micro avg       0.65      0.65      0.65       277
     macro avg       0.65      0.64      0.63       277
  weighted avg       0.65      0.65      0.64       277

CPU times: user 2min 10s, sys: 1min 13s, total: 3min 23s
Wall time: 3min 24s
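To put the dev accuracy in context, it can be compared with a majority-class baseline computed from the support column of the report above. The sketch below hard-codes the support and rounded recall values printed in that report; the variable names are illustrative only.

```python
# Support and recall per class, copied from the classification report above.
support = {"entailment": 146, "not_entailment": 131}
recall = {"entailment": 0.79, "not_entailment": 0.49}

total = sum(support.values())

# Accuracy of always predicting the most frequent dev label.
majority_baseline = max(support.values()) / total

# Accuracy implied by the per-class recalls; approximate, since the
# recalls in the report are rounded to two decimals.
approx_correct = sum(recall[k] * support[k] for k in support)
approx_accuracy = approx_correct / total

print(f"Majority-class baseline: {majority_baseline:.1%}")
print(f"Accuracy implied by rounded recalls: {approx_accuracy:.1%}")
```

With these numbers the majority-class baseline is about 52.7%, so the reported 64.62% is roughly 12 points above always predicting `entailment`. The recall for `not_entailment` (0.49) is where most of the remaining errors come from.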


