# Moving from words to sentences

### What is the most basic thing we want to be able to do with more than just word-level information?

We propose natural language inference (NLI) as a good domain for testing whether a model uses relational information between words. Humans are good at it and give predictable answers, and reaching the right answer requires concrete relational information between words.

### Datasets that require increasing amounts of non-local information

Several tasks now call for sentence representations that go beyond bag-of-words (BOW). Sentiment analysis and paraphrase datasets go slightly beyond it, but most models get much of their performance on these tasks from word-level information. While sentence representations do outperform BOW on them, it is unclear exactly where the improvement comes from.

Natural language inference is a useful domain in which we can propose challenges that require increasingly complex compositionality and are therefore more diagnostic for what is being learnt and what isn't.

We present a set of natural language inference datasets on which humans perform predictably well and which cannot be solved from word-level information alone.
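
To make the word-level claim concrete, here is a minimal sketch (not part of the original pipeline; `bag_of_words` is a hypothetical helper) showing that a premise and its subject-object swap share exactly the same bag of words. Any model that ignores word order therefore has to give both sentences the same representation.

```python
from collections import Counter

def bag_of_words(sentence):
    """Return token counts for a whitespace-tokenized sentence, discarding word order."""
    return Counter(sentence.split())

premise = "the boy overtakes the girl ."
hypothesis = "the girl overtakes the boy ."

# The two sentences differ as strings but not as bags of words.
same_bag = bag_of_words(premise) == bag_of_words(hypothesis)
print(premise != hypothesis, same_bag)  # True True
```

The same holds for any order-insensitive pooling of word vectors, such as a plain average.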

### Choosing a vocabulary
We chose the SNLI dataset vocabulary, so that we could benchmark on the InferSent model that was trained end-to-end on natural language inference with this dataset. This assumes GloVe word embeddings.

NOTE: I haven't actually checked whether these examples are within the vocabulary, but that is easy to do.
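
The vocabulary check mentioned in the note could look like the following minimal sketch. It assumes whitespace tokenization and a vocabulary held in a Python set; `find_oov` and `toy_vocab` are hypothetical names, and the toy vocabulary only stands in for the real SNLI one.

```python
def find_oov(sentences, vocab):
    """Return the sorted list of whitespace tokens that are missing from `vocab`."""
    oov = set()
    for sent in sentences:
        for tok in sent.split():
            if tok not in vocab:
                oov.add(tok)
    return sorted(oov)

# Hypothetical toy vocabulary; the real one would be built from the SNLI text.
toy_vocab = {"the", "boy", "girl", "hits", "."}
sentences = ["the boy hits the girl . ", "the boy shoves the girl . "]
print(find_oov(sentences, toy_vocab))  # ['shoves']
```

With plain whitespace splitting, capitalized or punctuation-attached tokens (for example `The` or `happy,`) would also be reported, so the tokenization should match whatever was used to build the vocabulary.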


In [22]:
import numpy as np
import itertools
id2label = {0:'CONTRADICTION', 1:'NEUTRAL', 2:'ENTAILMENT'}
label2id = {'CONTRADICTION': 0, 'NEUTRAL':1, 'ENTAILMENT':2}
N = 1
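
Every cell below writes three parallel files per dataset (`s1.<name>`, `s2.<name>` and `labels.<name>`). As a possible refactor, that pattern could be factored into one helper. This is only a sketch: `write_dataset` is a hypothetical name, and it writes the label ids with plain file I/O in place of `np.savetxt` so that it needs only the standard library.

```python
import os
import tempfile

label2id = {'CONTRADICTION': 0, 'NEUTRAL': 1, 'ENTAILMENT': 2}

def write_dataset(name, sents_A, sents_B, labels, out_dir="testData/true"):
    """Write premises, hypotheses and integer label ids as three parallel files."""
    os.makedirs(out_dir, exist_ok=True)  # create the output directory if needed
    columns = {
        "s1": sents_A,
        "s2": sents_B,
        "labels": [str(label2id[x]) for x in labels],
    }
    for prefix, lines in columns.items():
        # One item per line; the `with` block closes the file after writing.
        with open(os.path.join(out_dir, prefix + "." + name), "w") as f:
            f.write("\n".join(lines))

# Toy usage in a temporary directory, so nothing under testData/ is touched.
with tempfile.TemporaryDirectory() as tmp:
    write_dataset("toy", ["the boy hits the girl . "], ["the girl hits the boy . "],
                  ["NEUTRAL"], out_dir=tmp)
    written = sorted(os.listdir(tmp))
    with open(os.path.join(tmp, "labels.toy")) as f:
        label_text = f.read()
print(written)  # ['labels.toy', 's1.toy', 's2.toy']
```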


### Kinds of examples:

#### Requiring word-level information about symmetry

A. **Symmetric vs non-symmetric verbs (over subject-object):**

In [23]:
# Insensitive to tense for now (?)

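# Verbs for which swapping subject and object contradicts the original sentence.
v_cons = ['overtakes', 'gives the hat to', 'takes the bag from', 'is behind', 'is in front of']
# Verbs for which the swapped sentence is neither entailed nor contradicted.
v_neus = ['watches', 'ignores', 'hits', 'hugs', 'shoves']#, 'admires', 'talks to']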


all_nps = {'short': ["the boy", "the businessman", "the girl", "the old woman", "the man"],
       'long': ["the woman in the black shirt", "the boy holding an umbrella",
             "the girl wearing a hat", "the man with a beard", "the boy with red shoes"]
      }


all_vs = {"cont": v_cons,
     "neu": v_neus}

names = {"cont": "CONTRADICTION",
          "neu": "NEUTRAL"}

for np_key, nps in all_nps.items():
    for vs_key, vs in all_vs.items():
        sents_A = []
        sents_B = []
        labels = []
        for np1, np2 in list(itertools.product(nps, nps)):
            if (np1 != np2):
                for v in vs:
                    sents_A.append(np1 + " " + v + " " + np2 + ' . ')
                    sents_B.append(np2 + " " + v + " " + np1 + ' . ')
                    labels.append(names[vs_key])
                    
                    # self-rep
                    sents_A.append(np1 + " " + v + " " + np2 + ' . ')
                    sents_B.append(np1 + " " + v + " " + np2 + ' . ')
                    labels.append('ENTAILMENT')
                    
        # Write premises, hypotheses and label ids; `with` closes each file after writing.
        with open("testData/true/s1.verb_same_" + vs_key + '_' + np_key, 'w') as f:
            f.write("\n".join(sents_A))
        with open("testData/true/s2.verb_same_" + vs_key + '_' + np_key, 'w') as f:
            f.write("\n".join(sents_B))
        np.savetxt("testData/true/labels.verb_same_" + vs_key + '_' + np_key, [label2id[x] for x in labels], fmt='%i')
        print("Total: ", len(labels), "\n")
        temp = np.random.randint(0, len(labels), N)
        for i in temp:
            print(sents_A[i])
            print(sents_B[i])
            print(labels[i])
            print("\n")
    

Total:  200 

the boy with red shoes watches the man with a beard . 
the boy with red shoes watches the man with a beard . 
ENTAILMENT


Total:  200 

the boy with red shoes is in front of the boy holding an umbrella . 
the boy holding an umbrella is in front of the boy with red shoes . 
CONTRADICTION


Total:  200 

the businessman hits the old woman . 
the old woman hits the businessman . 
NEUTRAL


Total:  200 

the girl gives the hat to the businessman . 
the businessman gives the hat to the girl . 
CONTRADICTION
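
As a sanity check on the printed totals, the size of each verb dataset can be worked out by hand. The sketch below is an illustration and not part of the original notebook: it confirms that filtering `itertools.product` for `np1 != np2` gives the same ordered pairs as `itertools.permutations(nps, 2)`, and that 20 ordered pairs, 5 verbs and 2 examples per combination give 200 items.

```python
import itertools

nps = ["the boy", "the businessman", "the girl", "the old woman", "the man"]
verbs = ['overtakes', 'gives the hat to', 'takes the bag from', 'is behind', 'is in front of']

# Ordered pairs of distinct noun phrases, built the same way as in the cell above.
pairs_filtered = [(a, b) for a, b in itertools.product(nps, nps) if a != b]
# Equivalent formulation without the explicit inequality test.
pairs_perm = list(itertools.permutations(nps, 2))

# Each (pair, verb) combination adds one swapped example and one self-repetition.
total = len(pairs_perm) * len(verbs) * 2
print(pairs_filtered == pairs_perm, len(pairs_perm), total)  # True 20 200
```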




B. **Temporal ordering**

In [24]:
tws = ["after", "before"]
vps = ['sat down', 'walked in', 'stood up', 'shouted loudly', 'frowned angrily']
all_nps = {'short': ["the boy", "the fat man", "the girl", "the old woman", "the tall girl"],
       'long': ["the woman in the black shirt", "the boy holding an umbrella",
             "the girl wearing a hat", "the man with a beard", "the boy with red shoes"]
      }


for np_key, nps in all_nps.items():
    sents_A = []
    sents_B = []
    labels = []
    for np1, np2 in list(itertools.product(nps, nps)):
        if (np1 != np2):
            for vp1, vp2 in list(itertools.product(vps, vps)):
                if (vp1 != vp2):
                    for w in tws:
                        sents_A.append(np1 + " " + vp1 + " " + w + " " + np2 + ' ' + vp2 + ' . ')
                        sents_B.append(np2 + " " + vp2 + " " + w + " " + np1 + ' ' + vp1 + ' . ')
                        labels.append('CONTRADICTION')
                    
                        # self-rep
                        sents_A.append(np1 + " " + vp1 + " " + w + " " + np2 + ' ' + vp2 + ' . ')
                        sents_B.append(np1 + " " + vp1 + " " + w + " " + np2 + ' ' + vp2 + ' . ')
                        labels.append('ENTAILMENT')
                    
    # Write premises, hypotheses and label ids; `with` closes each file after writing.
    with open("testData/true/s1.temp_same_" + np_key, 'w') as f:
        f.write("\n".join(sents_A))
    with open("testData/true/s2.temp_same_" + np_key, 'w') as f:
        f.write("\n".join(sents_B))
    np.savetxt("testData/true/labels.temp_same_" + np_key, [label2id[x] for x in labels], fmt='%i')
    print("Total: ", len(labels), "\n")
    temp = np.random.randint(0, len(labels), N)
    for i in temp:
        print(sents_A[i])
        print(sents_B[i])
        print(labels[i])
        print("\n")

Total:  1600 

the man with a beard frowned angrily before the boy with red shoes walked in . 
the man with a beard frowned angrily before the boy with red shoes walked in . 
ENTAILMENT


Total:  1600 

the tall girl shouted loudly after the boy sat down . 
the tall girl shouted loudly after the boy sat down . 
ENTAILMENT




#### Requiring bigram compositionality

A. Modifiers (adjectives)

In [25]:
vps = ['meets', 'resembles', 'watches', 'ignores', 'hits', 'hugs', 'shoves', 'admires', 'talks to', 'collides with']

adjs_temp = {'pos': ['tall', 'cheerful', 'big', 'fat', 'clean', 'happy'],
          'neg': ['short', 'grumpy', 'small', 'thin', 'dirty', 'sad']}

adjs = {}
adjs['pos'] = adjs_temp['pos'] + adjs_temp['neg']
adjs['neg'] = adjs_temp['neg'] + adjs_temp['pos']

all_nps = {'short': ["boy", "man", "girl", "woman", "waitress"],
       'long': ["woman in the black shirt", "boy holding an umbrella",
             "girl wearing a hat", "man with a beard", "waitress with red shoes"]
      }


for np_key, nps in all_nps.items():
    sents_A = []
    sents_B = []
    labels = []
    for vp in vps:
        for np1, np2 in list(itertools.product(nps, nps)):
            if (np1 != np2):
                for p, n in zip(adjs['pos'], adjs['neg']):
                    
                    sents_A.append('The ' +  np1 + ' who is ' + p + ', ' + vp + ' the ' + np2 + ' who is ' + n + ' . ')
                    sents_B.append('The ' +  np1 + ', ' + vp + ' the ' + np2 + ' who is ' + n + ' . ')
                    labels.append('ENTAILMENT')
                    
                    sents_A.append('The ' +  np1 + ' who is ' + p + ', ' + vp + ' the ' + np2 + ' who is ' + n + ' . ')
                    sents_B.append('The ' + np1 + ' who is ' + n + ', '+ vp + ' the ' + np2 + ' . ')
                    labels.append('CONTRADICTION')
                    
                    sents_A.append('The ' +  np1 + ' who is ' + p + ', ' + vp + ' the ' + np2 + ' who is ' + n + ' . ')
                    sents_B.append('The ' + np1 + ' who is ' + p + ', '+ vp + ' the ' + np2 + ' . ')
                    labels.append('ENTAILMENT')
                    
                    sents_A.append('The ' +  np1 + ' who is ' + p + ', ' + vp + ' the ' + np2 + ' who is ' + n + ' . ')
                    sents_B.append('The ' +  np1 + ', ' + vp + ' the ' + np2 + ' who is ' + p + ' . ')
                    labels.append('CONTRADICTION')
    
                    
        
        
    # Write premises, hypotheses and label ids; `with` closes each file after writing.
    with open("testData/true/s1.adjr_whois_" + np_key, 'w') as f:
        f.write("\n".join(sents_A))
    with open("testData/true/s2.adjr_whois_" + np_key, 'w') as f:
        f.write("\n".join(sents_B))
    np.savetxt("testData/true/labels.adjr_whois_" + np_key, [label2id[x] for x in labels], fmt='%i')

    print("Total: ", len(labels), "\n")

    temp = np.random.randint(0, len(labels), N)
    for i in temp:
        print(sents_A[i])
        print(sents_B[i])
        print(labels[i])
        print("\n")


for np_key, nps in all_nps.items():
    sents_A = []
    sents_B = []
    labels = []
    for vp in vps:
        for np1, np2 in list(itertools.product(nps, nps)):
            if (np1 != np2):
                for p, n in zip(adjs['pos'], adjs['neg']):
                    
                    sents_A.append('The ' +  p + ' ' + np1 + ', ' + vp + ' the ' + n + ' ' + np2 + ' . ')
                    sents_B.append('The ' +  p + ' ' + np1 + ', ' + vp + ' the ' + np2 + ' . ')
                    labels.append('ENTAILMENT')
                    
                    sents_A.append('The ' + p + ' ' + np1 + ', ' + vp + ' the ' + n + ' ' + np2 + ' . ')
                    sents_B.append('The ' + np1 + ', ' + vp + ' the ' + p + ' ' + np2 + ' . ')
                    labels.append('CONTRADICTION')
                    
                    sents_A.append('The ' + p + ' ' + np1 + ', ' + vp + ' the ' + n + ' ' + np2 + ' . ')
                    sents_B.append('The ' + np1 + ', ' + vp + ' the ' + n + ' ' + np2 + ' . ')
                    labels.append('ENTAILMENT')
                    
                    sents_A.append('The ' + p + ' ' + np1 + ', ' + vp + ' the ' + n + ' ' + np2 + ' . ')
                    sents_B.append('The ' + n + ' ' + np1 + ', ' + vp + ' the ' + np2 + ' . ')
                    labels.append('CONTRADICTION')
    
                    
        
        
    # Write premises, hypotheses and label ids; `with` closes each file after writing.
    with open("testData/true/s1.adjr_" + np_key, 'w') as f:
        f.write("\n".join(sents_A))
    with open("testData/true/s2.adjr_" + np_key, 'w') as f:
        f.write("\n".join(sents_B))
    np.savetxt("testData/true/labels.adjr_" + np_key, [label2id[x] for x in labels], fmt='%i')

    print("Total: ", len(labels), "\n")

    temp = np.random.randint(0, len(labels), N)
    for i in temp:
        print(sents_A[i])
        print(sents_B[i])
        print(labels[i])
        print("\n")

Total:  9600 

The girl wearing a hat who is happy, ignores the man with a beard who is sad . 
The girl wearing a hat, ignores the man with a beard who is sad . 
ENTAILMENT


Total:  9600 

The woman who is small, admires the man who is big . 
The woman, admires the man who is small . 
CONTRADICTION


Total:  9600 

The small girl wearing a hat, meets the big woman in the black shirt . 
The girl wearing a hat, meets the big woman in the black shirt . 
ENTAILMENT


Total:  9600 

The short girl, shoves the tall woman . 
The short girl, shoves the woman . 
ENTAILMENT
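
The way `adjs['pos']` and `adjs['neg']` are built in the cell above is easy to misread, so here is a small illustration on a shortened adjective list (the two-item lists are chosen only for brevity). Concatenating the two lists in opposite orders and zipping them yields every antonym pair in both directions.

```python
adjs_temp = {'pos': ['tall', 'cheerful'],
             'neg': ['short', 'grumpy']}

# Same construction as in the notebook cell, on a shortened list.
pos = adjs_temp['pos'] + adjs_temp['neg']
neg = adjs_temp['neg'] + adjs_temp['pos']

pairs = list(zip(pos, neg))
print(pairs)
# [('tall', 'short'), ('cheerful', 'grumpy'), ('short', 'tall'), ('grumpy', 'cheerful')]
```

With the full six-adjective lists this gives 12 `(p, n)` pairs, which is why each adjective appears on both the subject and the object side across the dataset.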




B. With but/however/whereas discourse markers

Could also add "although"?

In [26]:
# Generate discourse marked examples
discs = ['however', 'but', 'whereas']
all_nps = {'short': ["the boy", "the man", "the girl", "the woman", "the waitress"],
       'long': ["the woman in the black shirt", "the boy holding an umbrella",
             "the girl wearing a hat", "the man with a beard", "the waitress with red shoes"]
      }
vps = ['sit down', 'walk in', 'stand up', 'shout loudly', 'frown angrily']

sents_A = []
sents_B = []
labels = []

yn = [' does ', ' does not ']

for np_key, nps in all_nps.items():
    sents_A = []
    sents_B = []
    labels = []
    for np1, np2 in list(itertools.product(nps, nps)):
        if (np1 != np2):
            for disc in discs:
                for vp in vps:
                    for p, n in zip(yn, reversed(yn)):
                        sents_A.append(np1 + p + vp + " , " + disc + " " + np2 + n + vp + ' . ')
                        sents_B.append(np1 + p + vp + ' . ')
                        labels.append('ENTAILMENT')

                        sents_A.append(np1 + p + vp + " , " + disc + " " + np2 + n + vp + ' . ')
                        sents_B.append(np1 + n + vp + ' . ')
                        labels.append('CONTRADICTION')
                        
                        sents_A.append(np1 + p + vp + " , " + disc + " " + np2 + n + vp + ' . ')
                        sents_B.append(np2 + n + vp + ' . ')
                        labels.append('ENTAILMENT')

                        sents_A.append(np1 + p + vp + " , " + disc + " " + np2 + n + vp + ' . ')
                        sents_B.append(np2 + p + vp + ' . ')
                        labels.append('CONTRADICTION')
                        

    # Write premises, hypotheses and label ids; `with` closes each file after writing.
    with open("testData/true/s1.subjv_" + np_key, 'w') as f:
        f.write("\n".join(sents_A))
    with open("testData/true/s2.subjv_" + np_key, 'w') as f:
        f.write("\n".join(sents_B))
    np.savetxt("testData/true/labels.subjv_" + np_key, [label2id[x] for x in labels], fmt='%i')

    print("Total: ", len(labels), "\n")

    temp = np.random.randint(0, len(labels), N)
    for i in temp:
        print(sents_A[i])
        print(sents_B[i])
        print(labels[i])
        print("\n")
    

Total:  2400 

the man with a beard does not walk in , whereas the boy holding an umbrella does walk in . 
the boy holding an umbrella does not walk in . 
CONTRADICTION


Total:  2400 

the woman does walk in , whereas the man does not walk in . 
the man does not walk in . 
ENTAILMENT




#### Requiring bigram compositionality as well as symmetry understanding

A. Comparatives

In [27]:
with open("./testData/adjs") as f:
    adjs = f.readlines()
    adjs = [x.strip() for x in adjs]

adjs = adjs[:100]
all_comps = {'ml': {'pos': ['more ' + a for a in adjs],
                'neg': ['less ' + a for a in adjs]},
         'not' : {'pos': ['more ' + a for a in adjs] + ['less ' + a for a in adjs],
                  'neg': ['not more ' + a for a in adjs] + ['not less ' + a for a in adjs]}
        }

all_nps = {'short': ["the boy", "the man", "the girl", "the woman"],
       'long': ["the woman in the black shirt", "the boy holding an umbrella",
             "the girl wearing a hat", "the man with a beard"]
      }

for np_key, nps in all_nps.items():
    for comp_key, comps in all_comps.items():
        sents_A = []
        sents_B = []
        labels = []
        for p, n in zip(comps['pos'], comps['neg']):
            for np1, np2 in list(itertools.product(nps, nps)):
                if (np1 != np2):        
                    sents_A.append(np1 + " is " + p + " than " + np2 + ' . ')
                    sents_B.append(np2 + " is " + n + " than " + np1 + ' . ')
                    labels.append('ENTAILMENT')
                    
                    sents_A.append(np1 + " is " + n + " than " + np2 + ' . ')
                    sents_B.append(np2 + " is " + p + " than " + np1 + ' . ')
                    labels.append('ENTAILMENT')

                    sents_A.append(np1 + " is " + p + " than " + np2 + ' . ')
                    sents_B.append(np1 + " is " + n + " than " + np2 + ' . ')
                    labels.append('CONTRADICTION')
                    
                    sents_A.append(np1 + " is " + n + " than " + np2 + ' . ')
                    sents_B.append(np1 + " is " + p + " than " + np2 + ' . ')
                    labels.append('CONTRADICTION')
                    
        # Write premises, hypotheses and label ids; `with` closes each file after writing.
        with open("testData/true/s1.comp_" + comp_key + "_" + np_key, 'w') as f:
            f.write("\n".join(sents_A))
        with open("testData/true/s2.comp_" + comp_key + "_" + np_key, 'w') as f:
            f.write("\n".join(sents_B))
        np.savetxt("testData/true/labels.comp_" + comp_key + "_" + np_key, [label2id[x] for x in labels], fmt='%i')
        print("Total: ", len(labels), "\n")
        temp = np.random.randint(0, len(labels), N)
        for i in temp:
            print(sents_A[i])
            print(sents_B[i])
            print(labels[i])
            print("\n")


for np_key, nps in all_nps.items():
    comp_key = 'not'
    comps = all_comps[comp_key]
    sents_A = []
    sents_B = []
    labels = []
    for p, n in zip(comps['pos'], comps['neg']):
        for np1, np2 in list(itertools.product(nps, nps)):
            if (np1 != np2):        
                sents_A.append(np1 + " is " + p + " than " + np2 + ' . ')
                sents_B.append(np2 + " is " + p + " than " + np1 + ' . ')
                labels.append('CONTRADICTION')                
                sents_A.append(np1 + " is " + n + " than " + np2 + ' . ')
                sents_B.append(np2 + " is " + n + " than " + np1 + ' . ')
                labels.append('CONTRADICTION')
                
                
                sents_A.append(np1 + " is " + p + " than " + np2 + ' . ')
                sents_B.append(np1 + " is " + p + " than " + np2 + ' . ')
                labels.append('ENTAILMENT')
                
                sents_A.append(np1 + " is " + n + " than " + np2 + ' . ')
                sents_B.append(np1 + " is " + n + " than " + np2 + ' . ')
                labels.append('ENTAILMENT')
                    
    # Write premises, hypotheses and label ids; `with` closes each file after writing.
    with open("testData/true/s1.comp_same_" + np_key, 'w') as f:
        f.write("\n".join(sents_A))
    with open("testData/true/s2.comp_same_" + np_key, 'w') as f:
        f.write("\n".join(sents_B))
    np.savetxt("testData/true/labels.comp_same_" + np_key, [label2id[x] for x in labels], fmt='%i')
    print("Total: ", len(labels), "\n")
    temp = np.random.randint(0, len(labels), N)
    for i in temp:
        print(sents_A[i])
        print(sents_B[i])
        print(labels[i])
        print("\n")




Total:  4800 

the girl wearing a hat is more jittery than the man with a beard . 
the girl wearing a hat is less jittery than the man with a beard . 
CONTRADICTION


Total:  9600 

the girl wearing a hat is not less mammoth than the boy holding an umbrella . 
the boy holding an umbrella is less mammoth than the girl wearing a hat . 
ENTAILMENT


Total:  4800 

the boy is more joyous than the girl . 
the girl is less joyous than the boy . 
ENTAILMENT


Total:  9600 

the boy is not less great than the girl . 
the boy is less great than the girl . 
CONTRADICTION


Total:  9600 

the woman in the black shirt is more average than the girl wearing a hat . 
the woman in the black shirt is more average than the girl wearing a hat . 
ENTAILMENT


Total:  9600 

the girl is not less bloody than the boy . 
the girl is not less bloody than the boy . 
ENTAILMENT




B. Modifiers that negate - if and only if

In [28]:
connec = ['when', 'if']
phe = {'pos': ['it rains', 'there is a lot of snow', 'the wind does blow very hard',
              'there are many clouds', 'the sun is not shining', 'the air is damp'],
      'neg' : ['it does not rain', 'there is not a lot of snow', 'the wind does not blow very hard',
              'there are not many clouds', 'the sun is shining', 'the air is not damp']}

con = {'pos': ['the trees do look beautiful', 'it is very cold', 'everyone does feel sad',
              'the roads are dangerous', 'it is better to stay home', 'the dogs do not go outside'],
      'neg' : ['the trees do not look beautiful', 'it is not very cold', 'everyone does not feel sad',
              'the roads are not dangerous', 'it is not better to stay home', 'the dogs do go outside']}

sents_A = []
sents_B = []
labels = []


for conn in connec:
    for p_i in np.arange(len(phe['pos'])):
        for c_i in np.arange(len(con['pos'])):
        
            pcon = con['pos'][c_i]
            ncon = con['neg'][c_i]
        
            pphe = phe['pos'][p_i]
            nphe = phe['neg'][p_i]
        
            sents_A.append(pcon + " " + conn + " " + pphe + ' . ')
            sents_B.append(ncon + " " + conn + " " + pphe + ' . ')
            labels.append('CONTRADICTION')
        
            sents_A.append(pcon + " " + conn + " " + pphe + ' . ')
            sents_B.append(pcon + " " + conn + " " + nphe + ' . ')
            labels.append('NEUTRAL')
    

# Write premises, hypotheses and label ids; `with` closes each file after writing.
with open("testData/true/s1.ncon", 'w') as f:
    f.write("\n".join(sents_A))
with open("testData/true/s2.ncon", 'w') as f:
    f.write("\n".join(sents_B))
np.savetxt("testData/true/labels.ncon", [label2id[x] for x in labels], fmt='%i')

print("Total: ", len(labels), "\n")

temp = np.random.randint(0, len(labels), N)
for i in temp:
    print(sents_A[i])
    print(sents_B[i])
    print(labels[i])
    print("\n")



Total:  144 

the dogs do not go outside when the air is damp . 
the dogs do go outside when the air is damp . 
CONTRADICTION
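
Once the files are written, they can be read back and scored against a model's predictions. The following is a minimal sketch under stated assumptions, not the original evaluation code: `load_dataset` and `accuracy_by_label` are hypothetical helpers, the toy files are created in a temporary directory, and `fake_predictions` is a made-up placeholder for model output.

```python
import os
import tempfile
from collections import defaultdict

id2label = {0: 'CONTRADICTION', 1: 'NEUTRAL', 2: 'ENTAILMENT'}

def load_dataset(name, data_dir="testData/true"):
    """Read the parallel s1/s2/labels files for one dataset name."""
    def read_lines(prefix):
        with open(os.path.join(data_dir, prefix + "." + name)) as f:
            return [line.rstrip("\n") for line in f]
    premises = read_lines("s1")
    hypotheses = read_lines("s2")
    gold = [int(x) for x in read_lines("labels") if x.strip()]
    assert len(premises) == len(hypotheses) == len(gold)
    return premises, hypotheses, gold

def accuracy_by_label(gold, pred):
    """Return accuracy computed separately for each gold label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, pred):
        total[g] += 1
        correct[g] += int(g == p)
    return {id2label[g]: correct[g] / total[g] for g in total}

# Toy data written in the same layout as the cells above (label ids 1 and 2).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "s1.toy"), "w") as f:
        f.write("the boy hits the girl . \nthe boy hits the girl . ")
    with open(os.path.join(tmp, "s2.toy"), "w") as f:
        f.write("the girl hits the boy . \nthe boy hits the girl . ")
    with open(os.path.join(tmp, "labels.toy"), "w") as f:
        f.write("1\n2\n")
    premises, hypotheses, gold = load_dataset("toy", data_dir=tmp)

fake_predictions = [2, 2]  # placeholder: a model that always answers ENTAILMENT
scores = accuracy_by_label(gold, fake_predictions)
print(scores)  # {'NEUTRAL': 0.0, 'ENTAILMENT': 1.0}
```

Reporting accuracy per label matters here because half of several datasets consists of self-repetition pairs labelled ENTAILMENT, which a model can get right without using any relational information.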


