The test heuristics file (`test_heuristics.csv`) is the output of running the e-SNLI model against the HANS dataset. HANS tests three fallible syntactic heuristics: lexical overlap, subsequence, and constituent.

For each heuristic, I selected 100 samples on which to test the e-SNLI model.

This notebook analyzes the model's predictions on that HANS sample.

In [1]:
import pandas as pd
from sklearn import metrics
import matplotlib.pyplot as plt
import seaborn as sns
pd.set_option('display.max_colwidth', None)

In [2]:
df = pd.read_csv('test_heuristics.csv')

In [3]:
# split the results into one DataFrame per heuristic
lexical_overlap_df = df[df['heuristic'] == 'lexical_overlap']
subsequence_df = df[df['heuristic'] == 'subsequence']
constituent_df = df[df['heuristic'] == 'constituent']
print("Number of lexical_overlap tests: ", len(lexical_overlap_df))
print("Number of subsequence tests: ", len(subsequence_df))
print("Number of constituent tests: ", len(constituent_df))

Number of lexical_overlap tests:  100
Number of subsequence tests:  100
Number of constituent tests:  100
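
The per-heuristic counts above can also be read off in a single step with `value_counts`, without building one frame per heuristic. A minimal sketch follows; `toy_df` and its rows are hypothetical stand-ins for the contents of `test_heuristics.csv`, not data from the actual file.

```python
import pandas as pd

# Hypothetical toy data standing in for test_heuristics.csv
toy_df = pd.DataFrame({
    "heuristic": ["lexical_overlap", "lexical_overlap", "subsequence", "constituent"],
    "pred_label": ["entailment", "neutral", "entailment", "entailment"],
})

# Number of rows per heuristic, computed in one call
counts = toy_df["heuristic"].value_counts()
print(counts)
```

On the real data, `df["heuristic"].value_counts()` would be the equivalent call, assuming the column names shown in this notebook.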


In [4]:
# count the lexical_overlap tests whose predicted label is entailment
# (this is a misclassification only when gold_label is not entailment)
lexical_overlap_entailment = lexical_overlap_df[lexical_overlap_df['pred_label'] == 'entailment']
print("Number of examples that are misclassified as entailment (lexical overlap) ", len(lexical_overlap_entailment))

Number of examples that are misclassified as entailment (lexical overlap)  69
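
The filter above selects on `pred_label` alone, so the count equals the number of misclassifications only if none of the sampled rows has a gold label of entailment. A minimal sketch of a stricter filter that also checks `gold_label` is shown below; `toy_df` is a hypothetical frame, and whether the real file contains any entailment-labelled rows is not known from this notebook.

```python
import pandas as pd

# Hypothetical toy data; the last row is a correct entailment prediction
toy_df = pd.DataFrame({
    "gold_label": ["neutral", "neutral", "neutral", "entailment"],
    "pred_label": ["entailment", "neutral", "entailment", "entailment"],
})

# Keep only rows predicted as entailment whose gold label says otherwise
wrong_entailment = toy_df[
    (toy_df["pred_label"] == "entailment") & (toy_df["gold_label"] != "entailment")
]
n_wrong = len(wrong_entailment)
print("Wrongly predicted entailment:", n_wrong)
```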


In [5]:
lexical_overlap_entailment.head(30)

Unnamed: 0,gold_label,Premise,Hypothesis,pred_label,pred_expl,heuristic
0,neutral,The president advised the doctor .,The doctor advised the president .,entailment,the doctor is the president which means that the doctor is the doctor .,lexical_overlap
1,neutral,The student saw the managers .,The managers saw the student .,entailment,the pilot saw the student because the student saw the .,lexical_overlap
3,neutral,The senators supported the actor .,The actor supported the senators .,entailment,the actor is the leader .,lexical_overlap
4,neutral,The actors avoided the bankers .,The bankers avoided the actors .,entailment,actors are the actors .,lexical_overlap
5,neutral,The senators mentioned the artist .,The artist mentioned the senators .,entailment,the artist is the artist .,lexical_overlap
6,neutral,The managers saw the secretaries .,The secretaries saw the managers .,entailment,the postal saw is the name of the .,lexical_overlap
7,neutral,The professor recognized the secretaries .,The secretaries recognized the professor .,entailment,the professor is the professor .,lexical_overlap
9,neutral,The athletes recommended the senator .,The senator recommended the athletes .,entailment,athletes are the same as the .,lexical_overlap
10,neutral,The president avoided the athlete .,The athlete avoided the president .,entailment,the president that the president was the same thing .,lexical_overlap
11,neutral,The judges supported the banker .,The banker supported the judges .,entailment,the judges are the judges .,lexical_overlap


In [6]:
# count the subsequence tests whose predicted label is entailment
subsequence_entailment = subsequence_df[subsequence_df['pred_label'] == 'entailment']
print("Number of examples that are misclassified as entailment (subsequence) ", len(subsequence_entailment))

Number of examples that are misclassified as entailment (subsequence)  55


In [7]:
subsequence_entailment.head(30)

Unnamed: 0,gold_label,Premise,Hypothesis,pred_label,pred_expl,heuristic
100,neutral,The manager knew the tourists supported the author .,The manager knew the tourists .,entailment,the author who the tourist has the meaning of the words .,subsequence
101,neutral,The manager knew the athlete mentioned the actor .,The manager knew the athlete .,entailment,the cameraman did the athlete that was the cameraman .,subsequence
104,neutral,The artists heard the judges saw the scientists .,The artists heard the judges .,entailment,the judges saw the judges saw the judges saw the .,subsequence
105,neutral,The scientists heard the presidents believed the students .,The scientists heard the presidents .,entailment,the scientists hear the university because the violinists were by the students .,subsequence
108,neutral,The presidents heard the actor resigned .,The presidents heard the actor .,entailment,the actor is telling the actor .,subsequence
109,neutral,The student knew the tourist arrived .,The student knew the tourist .,entailment,the student did the tourist because he was in the past .,subsequence
110,neutral,The scientist believed the artists ran .,The scientist believed the artists .,entailment,the scientist who the artist was the artists .,subsequence
115,neutral,The senator knew the professors called the actors .,The senator knew the professors .,entailment,the pope that the professor is the same as the .,subsequence
117,neutral,The doctor knew the athletes advised the secretaries .,The doctor knew the athletes .,entailment,the doctor did the athletes because he was the the first sentence .,subsequence
119,neutral,The judges heard the actors performed .,The judges heard the actors .,entailment,the actors hear the actors because they were the actors .,subsequence


In [8]:
# count the constituent tests whose predicted label is entailment
constituent_entailment = constituent_df[constituent_df['pred_label'] == 'entailment']
print("Number of examples that are misclassified as entailment (constituent) ", len(constituent_entailment))

Number of examples that are misclassified as entailment (constituent)  78


In [9]:
constituent_entailment.head(30)

Unnamed: 0,gold_label,Premise,Hypothesis,pred_label,pred_expl,heuristic
200,neutral,"In case the doctors stopped the author , the bankers helped the manager .",The doctors stopped the author .,entailment,the doctors stopped the author is the same as the doctors stopped the author .,constituent
201,neutral,"Whether or not the professor danced , the student waited .",The professor danced .,entailment,the professor is a person .,constituent
202,neutral,"Whether or not the managers waited , the doctors stopped the professor .",The managers waited .,entailment,the employees waited because they are waiting .,constituent
203,neutral,"Unless the doctors ran , the lawyers encouraged the scientists .",The doctors ran .,entailment,doctors run is the same as doctors doctors .,constituent
204,neutral,"In case the judges waited , the senators arrived .",The judges waited .,entailment,the judges waited is the same as the judges waited .,constituent
205,neutral,"Unless the authors slept , the secretaries ran .",The authors slept .,entailment,the joggers are asleep .,constituent
206,neutral,"In case the students recognized the bankers , the actors stopped the lawyers .",The students recognized the bankers .,entailment,students are the same as the actors .,constituent
207,neutral,"In case the artist saw the managers , the students danced .",The artist saw the managers .,entailment,the artist saw the .,constituent
208,neutral,"Unless the senators ran , the professors recommended the doctor .",The senators ran .,entailment,the postal ran is the same as the first sentence .,constituent
210,neutral,"If the judge encouraged the managers , the lawyers supported the doctors .",The judge encouraged the managers .,entailment,the judge is the same as the judge .,constituent
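
With 100 samples per heuristic, the counts reported above (69, 55, and 78) correspond to 69%, 55%, and 78% of examples predicted as entailment. These are error rates if every sampled gold label is non-entailment, as in the rows displayed. The three separate counts could also be computed together as a per-heuristic error rate. The sketch below does this on a hypothetical `toy_df`; the rows are illustrative and not taken from the results file.

```python
import pandas as pd

# Hypothetical toy data: two examples per heuristic, all with gold label "neutral"
toy_df = pd.DataFrame({
    "heuristic": ["lexical_overlap"] * 2 + ["subsequence"] * 2 + ["constituent"] * 2,
    "gold_label": ["neutral"] * 6,
    "pred_label": ["entailment", "neutral", "entailment", "entailment", "neutral", "neutral"],
})

# A prediction is an error when it differs from the gold label
is_error = toy_df["pred_label"] != toy_df["gold_label"]

# Fraction of errors within each heuristic
error_rate = is_error.groupby(toy_df["heuristic"]).mean()
print(error_rate)
```

Grouping a boolean mask by heuristic keeps the calculation in one place, so adding a fourth heuristic to the data would need no extra code.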
