### DLP Lab5
The goal of this lab is to implement a conditional seq2seq VAE for English tense conversion:
1. Tense conversion: ‘access’ to ‘accessing’, or ‘accessed’ to ‘accesses’
2. Generative model: Gaussian noise + tense -> access, accesses, accessing, accessed

#### Requirements
1. Implement a conditional seq2seq VAE.
    * Modify encoder, decoder, and training functions
    * Implement evaluation function, dataloader, and reparameterization trick.
2. Plot and **compare** the curves of cross-entropy loss, KL loss, and BLEU-4 score on the test data during training, under different settings of your model:
    * Teacher forcing ratio
    * KL annealing schedules (two methods)
3. Output the conversion results between tenses (from tense A to tense B)
4. Output the words generated from Gaussian noise under each of the 4 tenses.

#### Implementation details
1. Use LSTM
2. Log variance
3. Condition (tense)
    * Simply concatenate to the hidden_0 and z
    * Embed your condition to high dimensional space (or simply use one-hot)
4. KL loss annealing
    * Monotonic
    * Cyclical
5. Adopt BLEU-4 score function in NLTK (average 10 testing scores)
6. Adopt Gaussian_score() to compute the generation score
    * Randomly sample 100 noise vectors and generate a word in each of the 4 tenses from every vector (400 words in total)
    * A sample scores only when its 4 words exactly match a row of the training data
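
The generation check in item 6 can be sketched as follows. This is a minimal illustration, not the lab's `Gaussian_score()` itself: `fake_generate` is a hypothetical stand-in for decoding a latent vector under a tense condition, and `train_rows` is a tiny made-up stand-in for `train.txt`. It only shows the expected shape of the predictions (100 lists of 4 words) and that a sample scores when all 4 words match one training row.

```python
import random

TENSES = ["sp", "tp", "pg", "p"]  # simple present, third person, present progressive, past

# Hypothetical stand-in for the rows of train.txt.
train_rows = [
    ["abandon", "abandons", "abandoning", "abandoned"],
    ["abet", "abets", "abetting", "abetted"],
]

def fake_generate(z, tense_idx):
    """Hypothetical decoder: maps noise z and a tense index to a word.

    A real model would decode z together with the tense condition; here the
    sign of the first latent component simply picks a training row.
    """
    row = train_rows[0] if z[0] < 0 else train_rows[1]
    return row[tense_idx]

def sample_predictions(n_samples=100, latent_size=32, seed=0):
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_samples):
        # One Gaussian noise vector is shared by all 4 tenses of a sample.
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_size)]
        predictions.append([fake_generate(z, t) for t in range(len(TENSES))])
    return predictions

def gaussian_score(predictions, rows):
    # A sample counts only if its 4 words equal one training row exactly.
    hits = sum(1 for words in predictions if words in rows)
    return hits / len(predictions)

predictions = sample_predictions()
score = gaussian_score(predictions, train_rows)
```

Because the stand-in decoder always returns a full training row, `score` is 1.0 here; a trained model would typically score far lower.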

#### Hyper parameters
* LSTM hidden size: 256 or 512
* Latent size: 32
* Condition embedding size: 8
* Teacher forcing ratio: 0~1 (>0.5)
* KL weight: 0~1
* Learning rate: 0.05
* Optimizer: SGD
* Loss function: torch.nn.CrossEntropyLoss()

Date: 2020/05/

In [1]:
from __future__ import unicode_literals, print_function, division
from io import open
import unicodedata
import string
import re
import random
import time
import math
import torch
import torch.nn as nn
from torch import optim
import torch.nn.functional as F
from torch.utils import data
import matplotlib.pyplot as plt
plt.switch_backend('agg')
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd
from os import system
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

In [2]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
SOS_token = 0
EOS_token = 1
#----------Hyper Parameters----------#
hidden_size = 256
condition_size = 8
#The number of vocabulary
vocab_size = 30
teacher_forcing_ratio = 1.0
empty_input_ratio = 0.1
KLD_weight = 0.0
learning_rate = 0.05

In [3]:
def asMinutes(s):
    m = math.floor(s / 60)
    s -= m * 60
    return '%dm %2ds' % (m, s)


def timeSince(since, percent):
    now = time.time()
    s = now - since
    es = s / (percent)
    rs = es - s
    return '%s (- %s)' % (asMinutes(s), asMinutes(rs))

### Fetch data
* train.txt: Each training row includes 4 words: simple present (sp), third person (tp), present progressive (pg), simple past (p)
* test.txt: Each test pair includes 2 words with a different combination of tenses
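
Because each row of train.txt holds the same verb in all 4 tenses, every (source tense, target tense) combination can serve as a training pair. A minimal sketch of that expansion, using one row from the training data; the helper name `expand_row` is illustrative only:

```python
from itertools import product

TENSE_ORDER = {"sp": 0, "tp": 1, "pg": 2, "p": 3}

def expand_row(row):
    """Turn one 4-word row into ((src_tense, src_word), (tgt_tense, tgt_word)) pairs."""
    assert len(row) == len(TENSE_ORDER)
    return [((i, row[i]), (j, row[j])) for i, j in product(range(4), repeat=2)]

pairs = expand_row(["abandon", "abandons", "abandoning", "abandoned"])
```

Each row yields 16 pairs, 4 of which map a tense to itself.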

In [4]:
def getData(mode):
    assert mode == 'train' or mode == 'test'
    if mode == 'train':
        data = pd.read_csv('./data/'+mode+'.txt', delimiter=' ', header=None)
    else:
        data = []
        with open('./data/test.txt','r') as fp:
            for line in fp:
                word = line.split(' ')
                word[1] = word[1].strip('\n')
                data.extend([word])
    return data

getData("train")

| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | abandon | abandons | abandoning | abandoned |
| 1 | abet | abets | abetting | abetted |
| 2 | abdicate | abdicates | abdicating | abdicated |
| 3 | abduct | abducts | abducting | abducted |
| 4 | abound | abounds | abounding | abounded |
| ... | ... | ... | ... | ... |
| 1222 | exhort | exhorts | exhorting | exhorted |
| 1223 | exhilarate | exhilarates | exhilarating | exhilarated |
| 1224 | exculpate | exculpates | exculpating | exculpated |
| 1225 | exasperate | exasperates | exasperating | exasperated |


#### Vocabulary

In [5]:
class Vocabuary():
    def __init__(self):
        self.word2index = {'SOS': 0, 'EOS': 1, 'PAD': 2, 'UNK': 3}
        self.index2word = {0: 'SOS', 1: 'EOS', 2: 'PAD', 3: 'UNK'}
        self.n_words = 4
        self.max_length = 0
        self.build_vocab(getData('train'))
        

    # input the training data and build vocabulary
    def build_vocab(self, corpus):        
        for idx in range(corpus.shape[0]):
            for word in corpus.iloc[idx,:]:
                if len(word) > self.max_length:
                    self.max_length = len(word)
                    
                for char in word:
                    if char not in self.word2index:
                        self.word2index[char] = self.n_words
                        self.index2word[self.n_words] = char
                        self.n_words += 1                      
                    
    # convert word to indices
    def word2indices(self, word, add_eos=False, add_sos=False):
        indices = [self.word2index[char] if char in self.word2index else 3 for char in word]

        if add_sos:
            indices.insert(0, 0)
        if add_eos:
            indices.append(1)
            
        # padding input of same target into same length
        indices.extend([2]*(self.max_length-len(word)))
            
        return np.array(indices)
    
    # convert indices to word
    def indices2word(self, indices):
        word = [self.index2word[idx] for idx in indices if idx > 2 ]
        return ''.join(word)

In [6]:
v = Vocabuary()
t = "exculpated"
idx = v.word2indices(t)
print(idx)
t = v.indices2word(idx)
print(t)

[12 29 14 15 20 17  4 13 12  7  2  2  2  2  2]
exculpated


#### DataLoader

In [7]:
class TenseLoader(data.Dataset):
    def __init__(self, mode, vocab):
        self.mode = mode   
        self.vocab = vocab
        self.order = {'sp':0, 'tp':1, 'pg':2, 'p':3}
        self.build_pair()
        
    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        if self.mode == 'train':
            input_tense = self.data[index][0][0]
            input_tensor = torch.tensor(self.vocab.word2indices(self.data[index][0][1]))
            target_tense = self.data[index][1][0]
            target_tensor = torch.tensor(self.vocab.word2indices(self.data[index][1][1]))    

        else:
            condition = [["sp", "p"], ["sp", "pg"], ["sp", "tp"], ["sp", "tp"], ["p", "tp"], 
                        ["sp", "pg"], ["p", "sp"], ["pg", "sp"], ["pg", "p"], ["pg", "tp"]]
            input_tense = self.order[condition[index][0]]
            input_tensor = torch.tensor(self.vocab.word2indices(self.data[index][0]))
            target_tense = self.order[condition[index][1]]
            target_tensor = torch.tensor(self.vocab.word2indices(self.data[index][1]))

        return (input_tense, input_tensor), (target_tense, target_tensor)
        
    def build_pair(self):
        if self.mode == 'train':
            pd = getData(self.mode)
            self.data = []
            for index in range(len(pd)):
                for i in range(4):
                    for j in range(4):
                        self.data.append([(i, pd.iloc[index,:][i]), (j, pd.iloc[index,:][j])])
        else:
            self.data = getData(self.mode)

In [8]:
trainset = TenseLoader('train', v)
trainset[1]

((0, tensor([4, 5, 4, 6, 7, 8, 6, 2, 2, 2, 2, 2, 2, 2, 2])),
 (1, tensor([4, 5, 4, 6, 7, 8, 6, 9, 2, 2, 2, 2, 2, 2, 2])))

In [9]:
testset = TenseLoader('test', v)
testset[0]

((0, tensor([4, 5, 4, 6, 7, 8, 6, 2, 2, 2, 2, 2, 2, 2, 2])),
 (3, tensor([ 4,  5,  4,  6,  7,  8,  6, 12,  7,  2,  2,  2,  2,  2,  2])))

#### Show result
* Cross-entropy loss curve
* KL loss curve
* BLEU-4 score curve

In [10]:
def show_result(score, c_loss, kl_loss):
    plt.figure(figsize=(10, 6))

    plt.ylabel("Loss & Score")
    plt.xlabel("Epochs")
    plt.title("Training Curve", fontsize=18)
    plt.plot(score, label="BLEU Score")
    plt.plot(c_loss, label="Crossentropy Loss")
    plt.plot(kl_loss, label="KL Loss")
    # labels passed to plot() only appear once a legend is drawn
    plt.legend()

    plt.show()

#### Condition
Concatenate the condition with the initial hidden state, encoding the tense with either:
* nn.Embedding
* One-hot
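
As an alternative to `nn.Embedding`, the one-hot option can be sketched like this. The `(1, batch, features)` layout mirrors the LSTM hidden-state shape used in this notebook; the function name `cat_one_hot_condition` and the example sizes are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

NUM_TENSES = 4  # sp, tp, pg, p

def cat_one_hot_condition(hidden, tense):
    """Append a one-hot tense vector to a hidden state.

    hidden: float tensor of shape (1, batch, hidden_size)
    tense:  long tensor of shape (batch,) with values in [0, NUM_TENSES)
    returns: tensor of shape (1, batch, hidden_size + NUM_TENSES)
    """
    one_hot = F.one_hot(tense, num_classes=NUM_TENSES).to(hidden.dtype)
    return torch.cat((hidden, one_hot.unsqueeze(0)), dim=2)

hidden = torch.zeros(1, 3, 256)
tense = torch.tensor([0, 2, 3])
conditioned = cat_one_hot_condition(hidden, tense)
```

With one-hot encoding the condition occupies 4 features rather than 8, so any layer sized as `hidden_size + condition_size` would need to change accordingly.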

In [11]:
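# Create the embedding once so each tense maps to the same vector on every call.
# (Building a new nn.Embedding inside the function would re-randomize it each time.)
# Its weights are only trained if they are also passed to an optimizer.
tense_embedding = nn.Embedding(4, condition_size)

def condition_embedding(condition_size, condition):
    # condition_size is kept so existing callers do not need to change
    condition_tensor = torch.as_tensor(condition)
    embedded_tense = tense_embedding(condition_tensor)

    if isinstance(condition, int):
        # a single tense index becomes a (1, condition_size) row
        embedded_tense = embedded_tense.view(1, -1)

    return embedded_tense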
                     
condition_embedding(condition_size, 2)

tensor([[ 1.0307, -0.8062, -1.6080,  1.0490,  0.2623,  0.4760,  1.3073, -0.1252]],
       grad_fn=<ViewBackward>)

### Evaluation

In [12]:
#compute BLEU-4 score
def compute_bleu(output, reference):
    """
    reference = 'accessed'
    output = 'access'
    return BLEU score
    """
    cc = SmoothingFunction()
    if len(reference) == 3:
        weights = (0.33,0.33,0.33)
    else:
        weights = (0.25,0.25,0.25,0.25)
    return sentence_bleu([reference], output,weights=weights,smoothing_function=cc.method1)

In [13]:
# compute generation score
def Gaussian_score(predictions, plot_pred=False):
    """
    the order should be : simple present, third person, present progressive, past
    predictions = [['consult', 'consults', 'consulting', 'consulted'],...]
    return Gaussian_score score
    """
    score = 0
    words_list = []
    with open('./data/train.txt','r') as fp:
        for line in fp:
            word = line.split(' ')
            word[3] = word[3].strip('\n')
            words_list.extend([word])
        for idx, t in enumerate(predictions):
            if plot_pred:
                print (t)
            for idxj, i in enumerate(words_list):
                if t == i:
                    score += 1
    return score/len(predictions)

In [39]:
# print the prediction and return the bleu score
def BLEU_score(prediction, plot_pred=False):
    data = getData('test')
    # each entry of data is an [input word, target word] pair
    inputs = [pair[0] for pair in data]
    targets = [pair[1] for pair in data]

    bleu_total = 0
    for idx in range(len(inputs)):
        bleu_total += compute_bleu(prediction[idx], targets[idx])

        if plot_pred:
            output = "\ninput:  {}\ntarget: {}\npred:   {}".format(inputs[idx], targets[idx], prediction[idx])
            print ("="*30+output)

    return bleu_total/len(inputs)

In [43]:
def predict(encoder, decoder, vocab, dataloader, batch_size=64, plot_pred=False):    
    prediction = []
    inputs = []
    outputs = []

    with torch.no_grad():    
        for input_data, target_data in dataloader:
            input_tense = input_data[0]
            input_tensors = input_data[1].to(device)
            target_tense = target_data[0]
            target_tensors = target_data[1].to(device)
            
            batch_size = input_tensors.size(0)

            # transpose tensor from (batch_size, seq_len) to (seq_len, batch_size)
            input_tensors = input_tensors.transpose(0, 1)
            target_tensors = target_tensors.transpose(0, 1)
        
            # record outputs
            output_tensors = torch.zeros(input_tensors.size())

            # get tense embedding tensor
            input_embedded = condition_embedding(condition_size, input_tense)
            target_embedded = condition_embedding(condition_size, target_tense)
        
            # init hidden state and cat condition
            encoder_hidden = encoder.initHidden(input_embedded, batch_size)

            # calculate number of time step
            input_length = input_tensors.size(0)
            target_length = target_tensors.size(0)

            #----------sequence to sequence part for encoder----------#
            for ei in range(input_length):
                encoder_output, encoder_hidden = encoder(
                    input_tensors[ei], encoder_hidden)

            decoder_input = torch.tensor([[SOS_token] for i in range(batch_size)], device=device)
            output = torch.zeros(input_length, batch_size)

            # reparameterization trick
            mu, logvar = encoder.variational(encoder_hidden)
            reparameterized_state = reparameterize(mu, logvar)

            # init decoder hidden state and cat condition
            decoder_hidden = decoder.initHidden(reparameterized_state, target_embedded, batch_size)

            #----------sequence to sequence part for decoder----------#
            for di in range(target_length):
                decoder_output, decoder_hidden = decoder(
                    decoder_input, decoder_hidden) 
                topv, topi = decoder_output.topk(1)
                decoder_input = topi.squeeze().detach()  # detach from history as input
                output[di] = decoder_input

            # get predict indices
            output = output.transpose(0, 1) 

            # convert predicted indices into strings
            for idx in range(batch_size):
                outputs.append(vocab.indices2word(output[idx].data.numpy()))

    return outputs

In [44]:
def evaluate(encoder, decoder, vocab, batch_size=64, plot_pred=False):
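    # predict train.txt for gaussian score
    # NOTE: Gaussian_score() compares each prediction against a 4-word training row,
    # while predict() returns single words, so this score stays at 0 as written.
    # create dataloader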
    trainset = TenseLoader('train', vocab)
    trainloader = data.DataLoader(trainset, batch_size=batch_size)
    predictions = predict(encoder, decoder, vocab, dataloader=trainloader, batch_size=batch_size, plot_pred=plot_pred)
    
    # compute Gaussian score
    gaussian_score = Gaussian_score(predictions, plot_pred=plot_pred)
    if plot_pred:
        print ("Gaussian score: %.2f"%gaussian_score)

    # predict test.txt for bleu score
    testset = TenseLoader('test', vocab)
    testloader = data.DataLoader(testset, batch_size=batch_size)
    predictions = predict(encoder, decoder, vocab, dataloader=testloader, batch_size=batch_size, plot_pred=plot_pred)
      
    # compute BLEU score
    bleu_score = BLEU_score(predictions, plot_pred=plot_pred)
    if plot_pred:
        print ("BLEU score: %.2f"%bleu_score)
    
    return bleu_score, gaussian_score

#### Reparameterization Trick

In [17]:
def reparameterize(mu, logvar):
    std = torch.exp(0.5*logvar)
    eps = torch.randn_like(std)
    return mu + eps*std
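
In equation form, the function above draws a latent sample as follows, where the encoder predicts the log variance $\log\sigma^{2}$ (`logvar` in the code) rather than $\sigma$ directly:

```latex
z = \mu + \sigma \odot \epsilon, \qquad
\sigma = \exp\!\left(\tfrac{1}{2}\log\sigma^{2}\right), \qquad
\epsilon \sim \mathcal{N}(0, I)
```

Because all of the randomness sits in $\epsilon$, gradients can flow through $\mu$ and $\log\sigma^{2}$ back to the encoder.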

### Encoder

In [18]:
class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size, condition_size):
        super(EncoderRNN, self).__init__()
        self.hidden_size = hidden_size
        self.condition_size = condition_size
        
        self.fc1 = nn.Linear(hidden_size+condition_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size+condition_size, hidden_size)

        self.embedding = nn.Embedding(input_size, hidden_size+condition_size)
        self.lstm = nn.LSTM(hidden_size+condition_size, hidden_size+condition_size)

    def forward(self, input, hidden):
        embedded = self.embedding(input).view(1, -1, self.hidden_size+self.condition_size)
        output = embedded
        output, hidden = self.lstm(output, hidden)
        return output, hidden
    
    def variational(self, hidden):
        return self.fc1(hidden[0]), self.fc2(hidden[0])

    def initHidden(self, embedded_tense, batch_size=64):
        embedded_tense = embedded_tense.to(device).view(1, batch_size, -1)
        zeros = torch.zeros(1, batch_size, self.hidden_size, device=device)
        return (torch.cat((zeros, embedded_tense), 2),
                torch.cat((zeros, embedded_tense), 2))

### Decoder

In [19]:
class DecoderRNN(nn.Module):
    def __init__(self, hidden_size, output_size, condition_size):
        super(DecoderRNN, self).__init__()
        self.hidden_size = hidden_size
        self.condition_size = condition_size

        self.embedding = nn.Embedding(output_size, hidden_size+condition_size)
        self.lstm = nn.LSTM(hidden_size+condition_size, hidden_size+condition_size)
        self.out = nn.Linear(hidden_size+condition_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        output = self.embedding(input).view(1, -1, self.hidden_size+self.condition_size)
        output = F.relu(output)
        output, hidden = self.lstm(output, hidden)
        output = self.out(output[0])
        return output, hidden

    def initHidden(self, hidden_state, embedded_tense, batch_size):
        embedded_tense = embedded_tense.to(device).view(1, batch_size, -1)
        return (torch.cat((hidden_state, embedded_tense), 2),
                torch.cat((hidden_state, embedded_tense), 2))

### Train
* Use teacher forcing
* Use KL loss annealing
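
The two annealing schedules can be sketched as plain functions of the training step. This is an illustrative variant, not the notebook's own `kl_annealing`: the names `monotonic_weight` and `cyclical_weight`, and the choice to ramp linearly over the first half of each cycle, are assumptions.

```python
def monotonic_weight(step, warmup_steps):
    """Linearly ramp the KL weight from 0 to 1, then hold it at 1."""
    return min(1.0, step / warmup_steps)

def cyclical_weight(step, cycle_steps, ramp_fraction=0.5):
    """Restart the ramp every cycle: rise linearly, then hold at 1."""
    position = (step % cycle_steps) / cycle_steps  # position in [0, 1)
    return min(1.0, position / ramp_fraction)

# Sample both schedules every 5 steps.
monotonic = [monotonic_weight(s, warmup_steps=10) for s in range(0, 25, 5)]
cyclical = [cyclical_weight(s, cycle_steps=10) for s in range(0, 25, 5)]
```

The monotonic schedule never lowers the weight once it reaches 1, whereas the cyclical one drops back to 0 at the start of every cycle.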

In [20]:
def kl_annealing(epochs, mode):
    assert mode == "monotonic" or mode == "cyclical"
    if mode == "monotonic":
        if epochs > 1e4:
            KLD_weight = 1
        else:
            KLD_weight = 1e-4 * epochs
    else:
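        # position inside the current 1e4-epoch cycle
        cycle_pos = epochs % 1e4
        if cycle_pos > 5e3:
            KLD_weight = 1
        else:
            # ramp linearly from 0 to 1 over the first half of the cycle
            KLD_weight = 2e-4 * cycle_pos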
    return KLD_weight

In [21]:
def compute_kl_loss(mu, logvar):
    # KL(q || N(0, I)) = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return KLD
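
For reference, the closed form this function computes is the KL divergence between the approximate posterior $\mathcal{N}(\mu, \sigma^{2} I)$ and the standard normal prior, summed over latent dimensions $j$ (and, as the code is written, over every element of the batch as well):

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu, \sigma^{2} I)\,\|\,\mathcal{N}(0, I)\right)
  = -\frac{1}{2}\sum_{j}\left(1 + \log\sigma_{j}^{2} - \mu_{j}^{2} - \sigma_{j}^{2}\right)
```

Here `logvar` plays the role of $\log\sigma^{2}$, so `logvar.exp()` is $\sigma^{2}$.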

In [31]:
# run one optimization step on a single batch
def train(input_tensors, input_tense, target_tensors, target_tense, encoder, decoder, encoder_optimizer, decoder_optimizer, 
          criterion, epochs, teacher_forcing_ratio):
    batch_size = input_tensors.size(0)
    # get tense embedding tensor
    input_embedded = condition_embedding(condition_size, input_tense)
    target_embedded = condition_embedding(condition_size, target_tense)

    # transpose tensor from (batch_size, seq_len) to (seq_len, batch_size)
    input_tensors = input_tensors.transpose(0, 1)
    target_tensors = target_tensors.transpose(0, 1)

    # init encoder hidden state and cat condition
    encoder_hidden = encoder.initHidden(input_embedded, batch_size)

    encoder_optimizer.zero_grad()
    decoder_optimizer.zero_grad()

    # calculate number of time step
    input_length = input_tensors.size(0)
    target_length = target_tensors.size(0)

    loss = 0
    ce_loss = 0

    #----------sequence to sequence part for encoder----------#
    for ei in range(input_length):
        encoder_output, encoder_hidden = encoder(
            input_tensors[ei], encoder_hidden)

    # reparameterization trick
    mu, logvar = encoder.variational(encoder_hidden)
    reparameterized_state = reparameterize(mu, logvar)

    # calculate kl loss
    kl_loss = kl_annealing(epochs, "monotonic") * compute_kl_loss(mu, logvar) / batch_size
    loss += kl_loss

    # init decoder hidden state and cat condition
    decoder_hidden = decoder.initHidden(reparameterized_state, target_embedded, batch_size)

    decoder_input = torch.tensor([[SOS_token] for i in range(batch_size)], device=device)

    use_teacher_forcing = True if random.random() < teacher_forcing_ratio else False

    #----------sequence to sequence part for decoder----------#
    if use_teacher_forcing:
        # Teacher forcing: Feed the target as the next input
        for di in range(target_length):
            decoder_output, decoder_hidden = decoder(
                decoder_input, decoder_hidden) 
            ce_loss += criterion(decoder_output, target_tensors[di])
            decoder_input = target_tensors[di]  # Teacher forcing

    else:
        # Without teacher forcing: use its own predictions as the next input
        for di in range(target_length):
            decoder_output, decoder_hidden = decoder(
                decoder_input, decoder_hidden) 
            topv, topi = decoder_output.topk(1)
            decoder_input = topi.squeeze().detach()  # detach from history as input

            ce_loss += criterion(decoder_output, target_tensors[di])

    loss += ce_loss
    loss.backward()

    encoder_optimizer.step()
    decoder_optimizer.step()

    # return plain floats so the caller does not keep the autograd graph alive
    return ce_loss.item()/target_length, kl_loss.item()

In [23]:
def trainIters(encoder, decoder, vocab, n_iters, print_every=1000, plot_every=100, 
               batch_size=64, learning_rate=0.01, teacher_forcing_ratio=1.0):
    start = time.time()

    # Reset every print_every, for print log
    print_ce_loss_total = 0  
    print_kl_loss_total = 0
    # Reset every plot_every, for plot curve
    crossentropy_losses = []
    kl_losses = []
    plot_ce_loss_total = 0
    plot_kl_loss_total = 0
    # scores
    gaussian_scores = []
    bleu_scores = []
    

    encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate)
    decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate)

    # create dataloader
    trainset = TenseLoader('train', vocab)
    trainloader = data.DataLoader(trainset, batch_size, shuffle = True)

    criterion = nn.CrossEntropyLoss()

    for iter in range(1, n_iters + 1):
        for input_data, target_data in trainloader:
            input_tense = input_data[0]
            input_tensors = input_data[1].to(device)
            target_tense = target_data[0]
            target_tensors = target_data[1].to(device)

            ce_loss, kl_loss = train(input_tensors, input_tense, target_tensors, target_tense, encoder, decoder, 
                                     encoder_optimizer, decoder_optimizer, criterion, (iter-1), teacher_forcing_ratio)
            print_ce_loss_total += ce_loss
            print_kl_loss_total += kl_loss
            plot_ce_loss_total += ce_loss
            plot_kl_loss_total += kl_loss
            
        # evaluate and save model
        _, gaussian_score = evaluate(encoder, decoder, vocab, plot_pred=False)
        gaussian_scores.append(gaussian_score)
#         if avg_bleu > max_score:
#             max_score = avg_bleu
#             print ("Model save...")
#             torch.save(encoder, "./models/encoder_{:.4f}.ckpt".format(avg_bleu))
#             torch.save(decoder, "./models/decoder_{:.4f}.ckpt".format(avg_bleu))

        if iter % print_every == 0:
            print_ce_loss_avg = print_ce_loss_total / print_every
            print_kl_loss_avg = print_kl_loss_total / print_every
            print('%s (%d %2d%%) CE Loss: %.4f, KL Loss: %.4f, Gaussian score: %.2f' % 
                  (timeSince(start, iter / n_iters), iter, iter / n_iters * 100, print_ce_loss_avg, print_kl_loss_avg, gaussian_score))
            print_ce_loss_total = 0
            print_kl_loss_total = 0

        crossentropy_losses.append(plot_ce_loss_total)
        plot_ce_loss_total = 0
        kl_losses.append(plot_kl_loss_total)
        plot_kl_loss_total = 0
            
#     print ("The highest score is %s"%max_score)
            
    return gaussian_scores, crossentropy_losses

In [32]:
encoder1 = EncoderRNN(vocab_size, hidden_size, condition_size).to(device)
decoder1 = DecoderRNN(hidden_size, vocab_size, condition_size).to(device)
vocab = Vocabuary()
gaussian_scores, crossentropy_losses = trainIters(encoder1, decoder1, vocab, 100, print_every=1, plot_every=1, teacher_forcing_ratio=0.8)
# show_result(scores, losses)

  


0m 18s (- 30m 21s) (1  1%) CE Loss: 496.8858, KL Loss: 0.0000, Gaussian score: 0.00
0m 32s (- 26m 44s) (2  2%) CE Loss: 429.5147, KL Loss: 78.4589, Gaussian score: 0.00
0m 46s (- 25m 13s) (3  3%) CE Loss: 402.5837, KL Loss: 231.3072, Gaussian score: 0.00
1m  1s (- 24m 25s) (4  4%) CE Loss: 400.0578, KL Loss: 445.1276, Gaussian score: 0.00
1m 15s (- 23m 51s) (5  5%) CE Loss: 394.6223, KL Loss: 729.0020, Gaussian score: 0.00
1m 29s (- 23m 21s) (6  6%) CE Loss: 391.6205, KL Loss: 1075.7430, Gaussian score: 0.00
1m 43s (- 22m 56s) (7  7%) CE Loss: 382.6861, KL Loss: 1441.0710, Gaussian score: 0.00
1m 57s (- 22m 34s) (8  8%) CE Loss: 374.8272, KL Loss: 1840.8333, Gaussian score: 0.00
2m 11s (- 22m 13s) (9  9%) CE Loss: 369.3593, KL Loss: 2396.2534, Gaussian score: 0.00
2m 25s (- 21m 53s) (10 10%) CE Loss: 359.0015, KL Loss: 3046.0583, Gaussian score: 0.00
2m 40s (- 21m 35s) (11 11%) CE Loss: 348.5397, KL Loss: 3729.3750, Gaussian score: 0.00
2m 54s (- 21m 18s) (12 12%) CE Loss: 341.7385, KL

22m 39s (- 1m 11s) (95 95%) CE Loss: 48.0914, KL Loss: 12440.6172, Gaussian score: 0.00
22m 53s (- 0m 57s) (96 96%) CE Loss: 48.1637, KL Loss: 12424.8047, Gaussian score: 0.00
23m  8s (- 0m 42s) (97 97%) CE Loss: 49.4416, KL Loss: 12404.8643, Gaussian score: 0.00
23m 22s (- 0m 28s) (98 98%) CE Loss: 51.6383, KL Loss: 12339.0947, Gaussian score: 0.00
23m 36s (- 0m 14s) (99 99%) CE Loss: 50.3529, KL Loss: 12247.1309, Gaussian score: 0.00
23m 50s (- 0m  0s) (100 100%) CE Loss: 48.6790, KL Loss: 12234.2744, Gaussian score: 0.00


In [45]:
bleu, gaussian = evaluate(encoder1, decoder1, vocab, batch_size=64, plot_pred=True)

  


abandon
abandon
abandoning
abandons
abandoning
abandons
abandons
abandoning
abandoned
abandons
abandons
abandons
abandoning
abandoned
abandoning
abandoned
abet
abet
abetting
abetting
abets
abetting
abetting
abetting
abetting
abetting
abetting
abetting
abetting
abetting
abetting
abetting
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abdicates
abducts
abducting
abducting
abducting
abducts
abducts
abducting
absturing
abducts
abducting
abducting
abducts
abducts
abducts
abducts
abducts
abounding
abounding
abounds
abounds
abounding
abounds
abounding
abounds
abounding
abounds
abounds
abounding
abounds
abounding
abounds
abounds
absorbs
absorbs
absorbs
absorbs
absorbs
absorbs
absorbs
absorbs
absorbed
absorbs
absorbs
absorbs
absorbs
absorbs
absorbs
absorbs
accept
accepted
accept
accepted
accept
cancering
accepting
accepting
accepted
accept
accept
accept
accept
accepting
accept
accepting
accomp

calling
calls
calling
calling
calls
call
calling
call
calling
call
campaigned
campaigns
campaigns
campaigns
campaigns
acchinates
campaigning
acplaining
campaigns
campaigns
campaigns
campaigning
campaigning
campaigning
campaigning
campaigns
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
capitulates
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
captures
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
cares
caresses
caresses
caresses
caresses
caresses
caresses
caresses
caressed
caresses
caresses
caresses
caressing
caresses
caresses
caresses
caresses
carrying
carries
carry
carries
carries
carries
carries
carries
carrying
carries
carries
carries
carries
carries
carries
carries
broadcasts
broadcasts
broadcasts
broadcasts
broadcastin

dangles
daring
dag
dares
daring
dare
daring
dares
dare
daring
daring
daring
daring
daring
dares
daring
dares
darkens
darkens
darkening
darkens
darkens
darkening
darkens
darkens
darkening
darkening
darkening
darkening
darkening
darkens
darkens
darkening
darts
darts
darts
darts
darts
darts
darts
darts
darts
darts
darts
darts
dart
darts
darts
darts
dashes
dashes
dashes
dashed
dashes
dashes
dashes
dashes
dashes
dashes
dashed
dashing
dashes
dashes
dashes
dashed
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
decides
declaiming
declaiming
declaim
declaiming
declaiming
declaim
declaiming
declaiming
declaim
declaim
declaiming
declaiming
declaiming
declaiming
declaiming
declaim
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
declares
decline
decline
decline
declines
declines
declines
declines
declines
declines
decline
declines
declines
de

esteems
esteems
esteems
esteems
esteems
esteems
esteems
esteem
esteem
esteem
esteeming
esteems
esteems
esteems
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
estimates
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
evoke
devolves
devolves
devolve
devolve
devolves
devolves
devolve
devolve
devolves
devolve
devolves
devolve
devolves
devolves
devolves
devolve
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeds
exceeding
exchanges
exchanges
exchanges
exchanges
exchanges
exchanges
exchange
exchanges
exchanges
exchanges
exchanges
exchanges
exchanges
exchanges
exchanges
exchanges
excite
excite
excites
excite
excites
excite
excites
excites
excites
excites
excites
excite
excites
excite
excites
excite
exclaims
exclaiming
exclaiming
exclaiming
exclaims
exclaiming
exclaims
exclaiming
excl

hurry
hurrying
hurrying
hurries
hurries
hurries
hurry
hurrying
hurrying
inteigining
identify
disientiing
identifying
identify
identifying
identify
identify
identifying
identify
identified
identifies
identify
identified
identified
identified
ingrine
ignore
ignore
ignore
ignore
ingure
ignore
ignore
ingore
ignoring
ignories
ignories
ignore
ignore
ignures
ignore
illumine
illumine
illumined
illumine
illumined
illumine
illumine
illumine
illumine
illumines
illumined
illumines
illumine
illumine
linumizes
illumine
imagine
imagine
imagine
imagine
imagines
imagine
imagine
imagine
imagine
imagine
imagine
imagine
imagine
imagine
imagine
imagine
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
imitates
implying
implies
implying
implies
implies
implying
implies
implies
implying
implies
implying
implied
implies
implies
implying
implies
import
imports
imports
imports
imports
imports
imports
imports
imports
imports
imp

parodies
parodies
parodies
parodies
parodies
parodies
parodies
parodies
parodies
parodies
parodies
approy
approde
parodies
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
participates
bypasses
bypasses
bypassed
bypassed
bypasses
bypasses
bypasses
bypasses
bypassing
bypassing
bypassing
bypasses
bypasses
bypasses
bypasses
bypasses
pauses
pauses
pauses
pauses
pauses
pauses
pauses
pauses
pauses
pausing
pauses
pauses
pauses
pausing
pauses
paused
pecked
pecks
pecks
pecks
pecks
pecks
pecked
peaking
pecks
pecks
pecks
pecks
pecks
pecked
pecks
pecks
peels
peels
peels
peels
peels
peel
peels
peel
peels
peels
peels
peels
peels
peel
peels
peel
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peers
peering
penetrates
penetrates
penetrates
penetrates
penetrates
penetrates
penetrates
penterates
penetrates
penetrates
penetra

preserves
resigns
resign
resigns
resigns
resign
resign
resign
resigns
resigns
resigns
resigns
resigns
resigns
resigns
resigns
resign
resisting
resisting
resisting
resisting
resisting
resisting
resisting
resisting
resisting
resisted
resists
resisting
resisting
resisting
resisting
resists
resolves
resolves
resolves
resolves
resolves
resolves
resolves
resolves
resolves
resolve
resolves
resolves
resolves
resolves
resolves
resolve
respects
respects
respects
respects
respects
respects
respects
respects
respects
respects
respects
respecting
respects
respects
respects
respects
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
corresponds
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
restore
results
results
resulted
resulted
results
results
results
results
results
results
results
resulted
resul

studies
study
study
study
studied
stiude
studied
studies
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
stumbles
submitting
submits
submitting
submit
submitting
submitting
submitting
submitting
submits
submitting
submitted
submits
submitting
submitting
submitting
submits
subsides
subsides
subsides
subsides
subsides
subsides
subsides
subsides
subside
subsides
subside
subside
subside
subside
subsides
subsides
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
substitutes
succeed
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
succeeds
sucks
sucks
sucks
suck
sucks
sucks
sucking
sucking
sucks
sucks
sucked
sucked
sucking
sucking
sucked
sucks
suffers
suffers
suffering
suffers
suffers
suffers
suffers
su

buy
buys
buy
buy
buy
buy
buy
buy
bought
bought
bistled
bought
catches
catches
catches
catches
catches
catches
catches
catches
catches
catching
catches
catches
caught
caught
catches
catches
choose
choose
choose
choose
choose
choose
choose
choose
choose
choose
chost
choose
chose
chose
chose
choose
become
become
become
become
became
become
become
become
become
becomes
become
become
became
became
became
became
crept
creeps
creeping
creeps
creeping
creeps
creeping
creep
creeps
creeping
creeping
creeps
creeps
creeps
crept
crept
cut
cutting
cut
cut
cut
cut
cutting
cut
cuts
cutting
cuts
cutting
cut
cut
cut
cut
dealing
deal
deal
deal
deal
deal
deal
deal
deal
dealing
deal
deal
dealt
dealt
dealt
dealt
dig
digs
digs
dig
digs
digs
digging
digging
digging
digging
digging
digging
dinging
dugging
dugging
digging
drinks
drinking
drinks
drinking
drinking
drink
drink
drinking
drinking
drinking
drink
drinking
drank
drinks
drank
drinks
drive
drive
drive
drive
drive
drive
drive
drive
drive
drive
drive
drive

show
showing
underlies
underlies
underlies
underlies
underlies
underlies
underlie
underlies
underlies
underlies
underlies
underlies
underlie
underlie
underlie
underlies
tread
treads
treading
treads
tread
tread
treading
treading
treads
treads
treading
treading
treads
trod
tread
tread
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
stride
measures
measures
measure
measures
measures
measure
measure
measure
measure
measures
measure
measure
measures
measures
measures
measures
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
flame
fixes
fixing
fix
fix
fixes
fixes
fixes
fixing
fixes
fixing
fixing
fix
fixing
fixes
fixing
fixes
firm
firmed
firming
firms
firm
firms
firmed
firms
firms
firming
firming
firming
firmed
firming
firming
firming
fidgets
fidget
fidgets
fidget
fidgets
fidget
fidgets
fidget
fignted
fidgets
fidget
fidgeting
fidgets
fidgets
fidget
fidget
fictionalize
fictionalize
fictionalize
fi