# Generating Sentences with TreeRNNs

This notebook walks through a minimal example of encoding one sentence into a distributed representation using a TreeRNN, and then using that representation to generate another sentence with a different TreeRNN run in reverse. To start, we'll train the encoder and decoder weights on about 25,000 sentence pairs.
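
To make the encoding step concrete, here is a minimal NumPy sketch of a recursive, tree-structured encoder. It is not pysem's implementation: the `Node` class, the single shared weight matrix `W`, the tanh composition, and the toy tree are all assumptions chosen for illustration.

```python
import numpy as np

class Node:
    """Hypothetical tree node: a word plus a list of child nodes."""
    def __init__(self, word, children=()):
        self.word = word
        self.children = list(children)

def encode(node, embeddings, W):
    """Compute a node's embedding bottom-up from its word and its children."""
    # Sum the children's embeddings (this stays a zero vector for a leaf).
    child_sum = np.zeros(W.shape[0])
    for child in node.children:
        child_sum += encode(child, embeddings, W)
    # Combine the node's own word vector with the transformed child sum.
    return np.tanh(embeddings[node.word] + W @ child_sum)

rng = np.random.default_rng(0)
dim = 4  # toy dimensionality; the notebook itself uses dim = 300
vocab = ['kids', 'are', 'drawing']
embeddings = {w: rng.normal(scale=0.1, size=dim) for w in vocab}
W = rng.normal(scale=0.1, size=(dim, dim))

# Assumed toy tree: 'drawing' is the root, with 'kids' and 'are' as children.
tree = Node('drawing', [Node('kids'), Node('are')])
root_embedding = encode(tree, embeddings, W)
print(root_embedding.shape)  # (4,)
```

The vector at the root plays the role of the sentence's distributed representation; a decoder of the kind described here would start from that vector and work back down a tree to predict words.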

In [7]:
import random
import pickle
import numpy as np

from pysem.corpora import SNLI
from pysem.networks import DependencyNetwork
from pysem.generatives import EmbeddingGenerator

snli = SNLI('/Users/peterblouw/corpora/snli_1.0/')
snli.load_xy_pairs()
snli.load_vocab('snli_words.pickle')

train_data = [d for d in snli.train_data if d.label == 'entailment']
dev_data = [d for d in snli.dev_data if d.label == 'entailment']

dim = 300
iters = 30
rate = 0.002
batchsize = 1000
vocab = snli.vocab
vectors = 'pretrained_snli_embeddings.pickle'

with open('depdict', 'rb') as pfile:
    subvocabs = pickle.load(pfile)

encoder = DependencyNetwork(dim=dim, vocab=vocab, pretrained=vectors)
decoder = EmbeddingGenerator(dim=dim, subvocabs=subvocabs, vectors=vectors)

for i in range(iters):
    print('On iteration ', i)
    batch = random.sample(dev_data, batchsize)
    for sample in batch:
        s1 = sample.sentence1
        s2 = sample.sentence2

        encoder.forward_pass(s1)        
        decoder.forward_pass(s2, encoder.get_root_embedding())
        decoder.backward_pass(rate=rate)
        encoder.backward_pass(decoder.pass_grad, rate=rate)


On iteration  0
On iteration  1
On iteration  2
On iteration  3
On iteration  4
On iteration  5
On iteration  6
On iteration  7
On iteration  8
On iteration  9
On iteration  10
On iteration  11
On iteration  12
On iteration  13
On iteration  14
On iteration  15
On iteration  16
On iteration  17
On iteration  18
On iteration  19
On iteration  20
On iteration  21
On iteration  22
On iteration  23
On iteration  24
On iteration  25
On iteration  26
On iteration  27
On iteration  28
On iteration  29


## Simple Entailment Generation Examples

This small amount of data probably isn't enough to generalize outside of the training set, so we'll first check how well the learned decoder is able to generate the entailments it has been trained on.

In [8]:
sample_trees = [d for d in dev_data if 5 < len(d.sentence2.split()) < 10]
batch = random.sample(dev_data, 5)

for sample in batch:
    s1 = sample.sentence1
    s2 = sample.sentence2
    randsen = random.choice(sample_trees)

    encoder.forward_pass(s1)
    decoder.forward_pass(s2, encoder.get_root_embedding())

    predicted = [node.pword for node in decoder.tree]
    true = [node.lower_ for node in decoder.tree]

    print('Sentence: ', s1)
    print('Predicted Entailment: ', ' '.join(predicted))
    print('Actual Entailment: ', ' '.join(true))
    print('')

Sentence:  Two people are sitting next to a wood-stacked campfire at night.
Predicted Entailment:  people are sitting outside at night .
Actual Entailment:  people are sitting outside at night .

Sentence:  kids drawing something on paper
Predicted Entailment:  there drawing kids drawing .
Actual Entailment:  there are kids drawing .

Sentence:  Two men and a woman finishing a meal and drinks.
Predicted Entailment:  some are eating a meal .
Actual Entailment:  some people eating a meal .

Sentence:  A male with brown clothing standing on the side of the street with his thumb out with a big bag on his back.
Predicted Entailment:  the man is is
Actual Entailment:  a man is standing

Sentence:  Many people are in a cafeteria or restaurant, there are two workers wearing white and black who are taking their orders.
Predicted Entailment:  people in a busy restaurant .
Actual Entailment:  people in a busy restaurant .
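
One way to put a number on comparisons like these is per-token accuracy. Both word lists are read off the same `decoder.tree`, so they always have the same length and can be compared position by position. The helper `token_accuracy` below is hypothetical and not part of pysem.

```python
def token_accuracy(predicted, true):
    """Fraction of positions where the predicted word matches the target word."""
    if len(predicted) != len(true):
        raise ValueError('predicted and true must have the same length')
    if not true:
        return 0.0
    matches = sum(p == t for p, t in zip(predicted, true))
    return matches / len(true)

# The second printed example above, split on whitespace.
predicted = 'there drawing kids drawing .'.split()
true = 'there are kids drawing .'.split()
print(token_accuracy(predicted, true))  # 0.8
```

Averaging this score over a batch would give a rough measure of how well the decoder reproduces the entailments it was trained on.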



We can also generate entailments using randomly chosen trees for the decoding network structure. This doesn't work very well: the outputs tend to repeat words or drift away from the source sentence.

In [9]:
batch = random.sample(dev_data, 5)

for sample in batch:
    s1 = sample.sentence1
    s2 = sample.sentence2
    randsen = random.choice(sample_trees)

    encoder.forward_pass(s1)
    decoder.forward_pass(s2, encoder.get_root_embedding())

    predicted = [node.pword for node in decoder.tree]
    true = [node.lower_ for node in decoder.tree]
    
    print('Sentence: ', s1)
    print('Predicted Entailment: ', ' '.join(predicted))
    print('Actual Entailment: ', ' '.join(true))

    decoder.forward_pass(randsen.sentence2, encoder.get_root_embedding())
    alternate = [node.pword for node in decoder.tree]
    print('Random Tree Entailment: ', ' '.join(alternate))
    print('')

Sentence:  Two young girls dressed in pink reach into wishing fountain at mall.
Predicted Entailment:  girls are are in the same color .
Actual Entailment:  girls are dressed in the same color .
Random Tree Entailment:  the girls are are the club .

Sentence:  A man in a white shirt has his mouth open and is adjusting dials.
Predicted Entailment:  a man is wearing white .
Actual Entailment:  a man is wearing white .
Random Tree Entailment:  a man wearing while open his he open

Sentence:  An older gentleman in a blue sports jacket and glasses is talking to a woman in an off-white jacket.
Predicted Entailment:  the older man is wearing a blue jacket .
Actual Entailment:  the older man is wearing a blue jacket .
Random Tree Entailment:  the older older man is wearing a jacket .

Sentence:  These three men dressed in blue, yellow and white, are playing a sport.
Predicted Entailment:  there are three men playing a sport .
Actual Entailment:  there are three men playing a sport .
Random Tre

## Generating Entailment Chains (i.e. Inferential Roles)

We can also generate entailment chains by re-encoding a generated sentence and then generating a new sentence from the resulting encoding. This is kind of neat because it lets us distill what the model has learned into a network of inferential relationships between sentences. Philosophers sometimes argue that the meaning of a sentence is determined by its role or location in such a network.

In [10]:
s1 = 'A man curls up in a blanket on the street.'
s2 = 'A dog chases in a field.'
s3 = 'A frog is thirsty.'

def predict(encoder, decoder, s1, s2, s3):
    encoder.forward_pass(s1)
    decoder.forward_pass(s2, encoder.get_root_embedding())

    true = [node.lower_ for node in decoder.tree]
    predicted = [node.pword for node in decoder.tree]

    print('Sentence: ', s1)
    print('Predicted Entailment: ', ' '.join(predicted))

    encoder.forward_pass(' '.join(predicted))
    decoder.forward_pass(s3, encoder.get_root_embedding())

    predicted = [node.pword for node in decoder.tree]
    print('Next Prediction: ', ' '.join(predicted))
    print('')

predict(encoder, decoder, s1, s2, s3)
    
s1 = 'A group of Asian men pose around a large table after enjoying a meal together.'
s2 = 'A man sleeps in a blanket.'
s3 = 'A man is cold.'

predict(encoder, decoder, s1, s2, s3)

s1 = 'Two police officers are sitting on motorcycles in the road.'
s2 = 'Two red elephants are walking.'
s3 = 'The men are wearing blue.'

predict(encoder, decoder, s1, s2, s3)

s1 = 'Five people are playing in a gymnasium.'
s2 = 'A man walks on the ocean.'
s3 = 'fast people swim quickly.'

predict(encoder, decoder, s1, s2, s3)

s1 = 'A woman, whose face can only be seen in a mirror, is applying eyeliner in a dimly lit room.'
s2 = 'The lady chased mirror.'
s3 = 'A big dog is chasing large squirrels.'

predict(encoder, decoder, s1, s2, s3)

Sentence:  A man curls up in a blanket on the street.
Predicted Entailment:  a man lounging in the street .
Next Prediction:  a man is outside .

Sentence:  A group of Asian men pose around a large table after enjoying a meal together.
Predicted Entailment:  a group pose for a picture .
Next Prediction:  the group pose outside .

Sentence:  Two police officers are sitting on motorcycles in the road.
Predicted Entailment:  two several police are sitting .
Next Prediction:  the they are are dead .

Sentence:  Five people are playing in a gymnasium.
Predicted Entailment:  the people are in a gymnasium .
Next Prediction:  several people are inside .

Sentence:  A woman, whose face can only be seen in a mirror, is applying eyeliner in a dimly lit room.
Predicted Entailment:  a woman applying eyeliner .
Next Prediction:  a blonde woman is applying red eyeliner .
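
The two-step `predict` function can be generalized to chains of any length. The sketch below assumes only the interface used in this notebook (`forward_pass`, `get_root_embedding`, `decoder.tree`, `node.pword`). So that it runs on its own, it uses two hypothetical stand-in classes, `EchoEncoder` and `EchoDecoder`, which merely copy and truncate words and are not models of any kind.

```python
from types import SimpleNamespace

def entailment_chain(encoder, decoder, sentence, templates):
    """Repeatedly encode the latest sentence and decode the next one.

    `templates` are sentences whose trees fix the structure of each
    decoding step, like `s2` and `s3` in `predict`.
    """
    chain = [sentence]
    for template in templates:
        encoder.forward_pass(chain[-1])
        decoder.forward_pass(template, encoder.get_root_embedding())
        chain.append(' '.join(node.pword for node in decoder.tree))
    return chain

class EchoEncoder:
    """Stand-in encoder: the 'embedding' is just the lowercased word list."""
    def forward_pass(self, sentence):
        self.words = sentence.lower().split()

    def get_root_embedding(self):
        return self.words

class EchoDecoder:
    """Stand-in decoder: fills the template's positions with encoded words."""
    def forward_pass(self, template, embedding):
        n = len(template.split())
        self.tree = [SimpleNamespace(pword=w) for w in embedding[:n]]

chain = entailment_chain(EchoEncoder(), EchoDecoder(),
                         'A man is outside', ['x x x', 'x x'])
print(chain)  # ['A man is outside', 'a man is', 'a man']
```

With the trained networks from this notebook, the equivalent call would be `entailment_chain(encoder, decoder, s1, [s2, s3])`, and adding more template sentences would extend the chain.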



In [11]:
def condition(encoder, decoder, s1, s2, cond):
    encoder.forward_pass(s1)
    decoder.forward_pass(s2, encoder.get_root_embedding() + cond)

    true = [node.lower_ for node in decoder.tree]
    predicted = [node.pword for node in decoder.tree]
    print('Predicted Entailment: ', ' '.join(predicted))
    
s1 = 'A woman, whose face can only be seen in a mirror, is applying eyeliner in a dimly lit room.'
s2 = 'a blond woman applying eyeliner'
cond_sen = 'A person in a hooded shirt is photographing a woman.'
encoder.forward_pass(cond_sen)
cond = encoder.get_root_embedding()

print('Sentence: ', s1)
print('Conditioning Context: ', cond_sen)

encoder.forward_pass('')
condition(encoder, decoder, s1, s2, cond)   

s1 = 'A shirtless man sleeps in his blue boat out on the open waters.'
s2 = 'The red man is in the big boat.'
cond_word = 'fishing'
cond = encoder.vectors[cond_word]

print('')
print('Sentence: ', s1)
print('Conditioning Context: ', cond_word)

encoder.forward_pass('')
condition(encoder, decoder, s1, s2, cond)

s1 = 'Seven women stand and sit around a waters edge and one of them women sitting in the middle with her bare feet in the water drinks from a water bottle.'
s2 = 'a big man is on a boat.'
cond_sen = 'What is in the water bottle?'

encoder.forward_pass(cond_sen)
cond = encoder.get_root_embedding()

print('')
print('Sentence: ', s1)
print('Conditioning Context: ', cond_sen)

condition(encoder, decoder, s1, s2, cond)

Sentence:  A woman, whose face can only be seen in a mirror, is applying eyeliner in a dimly lit room.
Conditioning Context:  A person in a hooded shirt is photographing a woman.
Predicted Entailment:  a dressed person wearing eyeliner

Sentence:  A shirtless man sleeps in his blue boat out on the open waters.
Conditioning Context:  fishing
Predicted Entailment:  a shirtless man fisherman on a floating boat .

Sentence:  Seven women stand and sit around a waters edge and one of them women sitting in the middle with her bare feet in the water drinks from a water bottle.
Conditioning Context:  What is in the water bottle?
Predicted Entailment:  a young woman drinking from the bottle .
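
The `condition` function steers generation by adding a second vector to the root embedding before decoding. As a rough illustration of why this can shift the output, the sketch below uses made-up two-dimensional vectors (not embeddings from the model) to show that the sum is closer, in cosine similarity, to the conditioning vector than the original embedding was.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors, returned as a plain float."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors, assumed for illustration only.
root = np.array([1.0, 0.0])   # stands in for encoder.get_root_embedding()
cond = np.array([0.0, 1.0])   # stands in for the conditioning vector

before = cosine(root, cond)
after = cosine(root + cond, cond)
print(before, round(after, 3))  # 0.0 0.707
```

How strongly the decoder responds to such a shift depends on the learned weights, so this only illustrates the geometry of the sum, not the model's behavior.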


In [12]:
s1 = 'Several runners compete in a road race.'
s2 = 'the dog ran quickly to the beach.'

def sub_predict(encoder, decoder, s1, s2):
    
    encoder.forward_pass(s1)
    decoder.forward_pass(s2, encoder.get_root_embedding())

    true = [node.lower_ for node in decoder.tree]
    predicted = [node.pword for node in decoder.tree]

    print('Sentence: ', s1)
    print('Predicted Entailment: ', ' '.join(predicted))
    print('')    

sub_predict(encoder, decoder, s1, s2)
    
s1 = 'Many runners compete in a road race.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'Several runners compete in a talent show.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'Several performers compete in a talent show.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'Several performers perform in a music show.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'Several swimmers compete in a fast race.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'The swimmers compete in a road race.'
s2 = 'a terrible pride comes before the fall.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'The swimmers compete in a swim race.'
s2 = 'a big man is in the pool.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'Several runners paint paintings.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'The swimmers race before eating.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'One very slow cyclist is in the indoor arena.'
sub_predict(encoder, decoder, s1, s2)

s1 = 'a big man swims in the lake.'
sub_predict(encoder, decoder, s1, s2)

Sentence:  Several runners compete in a road race.
Predicted Entailment:  the runners compete together in a race .

Sentence:  Many runners compete in a road race.
Predicted Entailment:  the runners compete together in a race .

Sentence:  Several runners compete in a talent show.
Predicted Entailment:  the runners compete together in a stage .

Sentence:  Several performers compete in a talent show.
Predicted Entailment:  some performers compete outside in a stage .

Sentence:  Several performers perform in a music show.
Predicted Entailment:  the performers performing outside in a concert .

Sentence:  Several swimmers compete in a fast race.
Predicted Entailment:  the people compete fast in a race .

Sentence:  The swimmers compete in a road race.
Predicted Entailment:  the several people compete in a race .

Sentence:  The swimmers compete in a swim race.
Predicted Entailment:  the several people swam in a race .

Sentence:  Several runners paint paintings.
Predicted Entailment:  t