## Assignment 3: Sequence to Sequence Models
LING 380/780 -- Neural Network Models of Linguistic Structure


Start by loading the necessary PyTorch packages, along with the seq2seq model definition and training code.

In [None]:
import torch.nn as nn
import torch.optim as optim
import torch
import model
import train

Next, load the functions that will create and load the synthetic datasets using PCFGs defined in *grammars.py*.

In [None]:
from generator import create_file
from data_prep import load_and_prepare_dataset

from grammars import pcfg_agreement_pp, pcfg_agreement_pp_ambig, pcfg_agreement_pp_unambig
from grammars import gen_reinflection_example, gen_pres_reinflection_example
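
To make the idea of PCFG-based generation concrete, here is a minimal sketch of sampling a sentence from a probabilistic context-free grammar. The grammar `TOY_PCFG` and the function `sample` are hypothetical illustrations. They are not the contents of *grammars.py*, whose rule format may differ.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to (probability, expansion) pairs.
# These rules are illustrative and are NOT taken from grammars.py.
TOY_PCFG = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["Det", "N"]), (0.3, ["Det", "Adj", "N"])],
    "VP": [(1.0, ["V"])],
    "Det": [(1.0, ["the"])],
    "Adj": [(0.5, ["gentle"]), (0.5, ["humble"])],
    "N": [(0.5, ["badger"]), (0.5, ["dog"])],
    "V": [(0.5, ["coughed"]), (0.5, ["danced"])],
}


def sample(symbol, grammar, rng):
    """Recursively expand `symbol`; symbols without rules are terminal words."""
    if symbol not in grammar:
        return [symbol]
    rules = grammar[symbol]
    weights = [prob for prob, _ in rules]
    # Pick one expansion according to the rule probabilities.
    _, expansion = rng.choices(rules, weights=weights, k=1)[0]
    words = []
    for child in expansion:
        words.extend(sample(child, grammar, rng))
    return words


rng = random.Random(0)  # fixed seed so the sample is reproducible
sentence = sample("S", TOY_PCFG, rng)
print(" ".join(sentence))
```

Each call produces one sentence. Repeating the call many times yields a synthetic corpus whose structure is fully controlled by the grammar.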

Use these functions to generate datasets, which are stored in the *data* and *cache* subdirectories. Then use *load_and_prepare_dataset* to create training, validation, and test sets, along with text objects for the source (input) and target (output), whose vocabulary objects are used below.

In [None]:
create_file('reinflection_pp',pcfg_agreement_pp_ambig,gen_reinflection_example,5000)
create_file('reinflection_pp_test',pcfg_agreement_pp_unambig,gen_pres_reinflection_example,100)

train_iter, val_iter, test_iter, src_text, trg_text = load_and_prepare_dataset('reinflection_pp', 5)

Set some hyperparameters and create the loss, the network and optimizer objects.

In [None]:
EMBEDDING_SIZE = 128
HIDDEN_SIZE = 128
ATTENTION = 'Null'

PAD_IDX = trg_text.vocab.stoi['<pad>']

criterion = nn.CrossEntropyLoss(ignore_index=PAD_IDX)
net = model.Seq2Seq(src_text, EMBEDDING_SIZE, HIDDEN_SIZE, trg_text, attention=ATTENTION)
optimizer = optim.Adam(net.parameters())
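
The `ignore_index=PAD_IDX` argument keeps padding positions from contributing to the loss. The pure-Python sketch below illustrates the idea on a tiny example. It works on probabilities rather than logits, and the names `PAD` and `masked_cross_entropy` are hypothetical, so treat it as an illustration of the masking rather than a reimplementation of `nn.CrossEntropyLoss`.

```python
import math

PAD = 0  # hypothetical padding index, playing the role of PAD_IDX


def masked_cross_entropy(probs, targets, ignore_index):
    """Mean negative log-likelihood over positions whose target is not ignore_index.

    probs[t][k] is the predicted probability of class k at position t.
    """
    losses = [
        -math.log(dist[target])
        for dist, target in zip(probs, targets)
        if target != ignore_index
    ]
    return sum(losses) / len(losses)


# Three positions over a 3-word vocabulary; the last position is padding.
probs = [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.9, 0.05, 0.05]]
targets = [1, 2, PAD]

loss = masked_cross_entropy(probs, targets, ignore_index=PAD)
print(round(loss, 4))
```

The average is taken over the two real tokens only, so padding a batch to a longer length does not change the loss.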

Define the set of words that will be used to compute accuracy (if the value of the *eval_words* argument of *train* is not specified, accuracy will be computed for all words in the target).  Here, since we are interested in assessing accuracy in inflecting verbs, we consider only present tense verbs, in both singular and plural forms.

In [None]:
eval_verbs = ['laughs','dances','hopes','burps', 'coughs', 'dies', 'laugh', 'dance', 'hope', 'burp', 'cough', 'die']
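
The sketch below shows one way accuracy could be restricted to a set of evaluation words. How *train* actually computes this is not shown here, so the function `restricted_accuracy` and its behavior (scoring only positions whose gold token is in the set) are assumptions made for illustration.

```python
def restricted_accuracy(predicted, gold, eval_words=None):
    """Fraction of gold positions (limited to eval_words, if given) predicted correctly."""
    correct = 0
    total = 0
    for pred_tok, gold_tok in zip(predicted, gold):
        if eval_words is not None and gold_tok not in eval_words:
            continue  # skip words we are not evaluating
        total += 1
        correct += pred_tok == gold_tok
    return correct / total if total else 0.0


gold = "the gentle badger coughs".split()
predicted = "the humble badger cough".split()

overall = restricted_accuracy(predicted, gold)
verbs_only = restricted_accuracy(predicted, gold, eval_words={"coughs", "cough"})
print(overall, verbs_only)
```

Restricting the evaluation to verbs separates agreement errors from unrelated mistakes elsewhere in the sentence, which overall accuracy would mix together.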

Train the network.

In [None]:
N_EPOCHS = 10
train.train(net, train_iter, val_iter, test_iter, optimizer, criterion,
            short_train=False, n_epochs=N_EPOCHS, eval_words=eval_verbs, patience=3)
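
The `patience` argument suggests early stopping, although the exact rule used by *train* is not shown here. As a sketch under that assumption, the hypothetical function below stops once the validation loss has failed to improve for `patience` consecutive epochs.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops.

    Training stops after `patience` consecutive epochs without a new best
    validation loss; if that never happens, all epochs are run.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses, one per epoch.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]
stopped_at = stopping_epoch(losses, patience=3)
print(stopped_at)
```

In this made-up run the loss would have improved again at epoch 7, which shows the trade-off: a small patience saves time but can stop before a late improvement.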

Load evaluation functions that provide an interface for translating sentences and batches, and for plotting a heatmap of the attention weights for each word in an output.

In [None]:
from eval import translate_batch, translate, plot_from_batch, plot 
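
To clarify what an attention heatmap displays, the sketch below builds a small attention matrix with one row per output word and one column per source word. It assumes dot-product scoring followed by a softmax. The attention variants in the assignment's model may score differently, and the names `softmax` and `attention_matrix` are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into a probability distribution."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(decoder_states, encoder_states):
    """One row per output position; each row is a distribution over source positions."""
    rows = []
    for dec in decoder_states:
        scores = [sum(d * e for d, e in zip(dec, enc)) for enc in encoder_states]
        rows.append(softmax(scores))
    return rows


# Hypothetical 2-dimensional hidden states: 3 source words, 2 output words.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_states = [[2.0, 0.0], [0.0, 2.0]]

weights = attention_matrix(decoder_states, encoder_states)
for row in weights:
    print([round(w, 2) for w in row])
```

Each row sums to 1, so a bright cell in the heatmap means the corresponding source word received most of the attention when that output word was produced.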

Try translating some sentences...

In [None]:
translate(net, 'the gentle badger coughed', 'past' )

In [None]:
plot(net, 'with the gentle kindly dogs the humble badger danced', 'pres')

Load a test batch with a given target length to use for testing.

In [None]:
desired_target_length = 10

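# Keep the first test batch whose target sequences have the desired length.
sample_test_batch = None
for batch in test_iter:
    if batch.trg.shape[0] == desired_target_length:
        sample_test_batch = batch
        break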

print(sample_test_batch)

In [None]:
translate_batch(net, sample_test_batch)

In [None]:
plot_from_batch(net, sample_test_batch, 0) 