# HW 3: Neural Machine Translation

In this homework you will build a full neural machine translation system that uses an attention-based encoder-decoder network to translate from German to English. The encoder-decoder network with attention forms the backbone of many current text generation systems. See [Neural Machine Translation and Sequence-to-sequence Models: A Tutorial](https://arxiv.org/pdf/1703.01619.pdf) for an excellent tutorial that also covers many recent advances.

## Goals


1. Build a non-attentional baseline model (pure seq2seq as in [ref](https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)).
2. Incorporate attention into the baseline model ([ref](https://arxiv.org/abs/1409.0473), but using dot-product attention as in the class notes).
3. Implement beam search (review/tutorial [here](http://www.phontron.com/slides/nlp-programming-en-13-search.pdf)).
4. Visualize the attention distribution for a few examples.
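
As a rough illustration of goal 2, dot-product attention scores each encoder state by its dot product with the current decoder state, normalizes the scores with a softmax, and returns the weighted average of the encoder states as the context vector. The sketch below uses plain Python lists and made-up numbers; the function names are hypothetical, and the formulation in the class notes may differ in detail (for example, in scaling or batching).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, encoder_states):
    """Return (context, weights) for a single decoder step.

    query: decoder hidden state, a list of d floats.
    encoder_states: one list of d floats per source position.
    """
    # Score each source position by its dot product with the query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    # Normalize the scores into an attention distribution.
    weights = softmax(scores)
    # The context vector is the attention-weighted average of encoder states.
    d = len(query)
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(d)
    ]
    return context, weights


# Toy example (made-up numbers): 3 source positions, hidden size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
context, weights = dot_product_attention(query, encoder_states)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

The `weights` list for each decoder step is also what goal 4 asks you to visualize: stacking one such list per target word gives a target-by-source matrix.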

Consult the papers provided for hyperparameters, and the course notes for formal definitions.
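
Beam search (goal 3) can be prototyped without any neural model. The sketch below is a minimal version under stated assumptions: `next_log_probs` is a hypothetical callback standing in for the decoder, `toy_model` is an invented lookup table, and there is no length normalization or batching. It shows a case where greedy decoding (beam size 1) misses the most probable sequence and a beam of size 2 finds it.

```python
import math


def beam_search(next_log_probs, bos, eos, beam_size=3, max_len=10):
    """Return the best (log_prob, tokens) hypothesis found.

    next_log_probs(prefix) returns a dict mapping each candidate
    next token to its log-probability given the prefix.
    """
    beam = [(0.0, [bos])]  # live (unfinished) hypotheses
    finished = []          # hypotheses that have emitted eos
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, tokens in beam:
            for tok, lp in next_log_probs(tokens).items():
                candidates.append((score + lp, tokens + [tok]))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = []
        for score, tokens in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((score, tokens))
            else:
                beam.append((score, tokens))
        if not beam:
            break
    # Prefer finished hypotheses; fall back to unfinished ones at max_len.
    pool = finished or beam
    return max(pool, key=lambda c: c[0])


# Invented toy distribution: "a" looks best at step 1, but "b z" wins overall.
TABLE = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("<s>", "a"): {"x": 0.5, "y": 0.5},
    ("<s>", "b"): {"z": 1.0},
}


def toy_model(prefix):
    """Hypothetical stand-in for a decoder: look up next-token probabilities."""
    probs = TABLE.get(tuple(prefix), {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_score, greedy_tokens = beam_search(toy_model, "<s>", "</s>", beam_size=1)
beam_score, beam_tokens = beam_search(toy_model, "<s>", "</s>", beam_size=2)
print("greedy:", greedy_tokens, round(math.exp(greedy_score), 2))
print("beam 2:", beam_tokens, round(math.exp(beam_score), 2))
```

In a real system the callback would run one decoder step, and scores are often divided by hypothesis length so that longer translations are not unfairly penalized.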

This will be the most demanding assignment in both difficulty and training time, so we recommend that you get started early!

```python
from load_data import DataLoader
from models import LSTMEncoder, LSTMDecoder, Seq2Seq

loader = DataLoader('cpu')
train_iter, val_iter, DE, EN = loader.get_iters()
```

```
Loading data...
building vocab...
initializing iterators...
[epoch: 1, batch: 1] loss: 9.348546981811523
[epoch: 1, batch: 2] loss: 8.98713207244873
[epoch: 1, batch: 3] loss: 6.1001057624816895
[epoch: 1, batch: 4] loss: 5.152592182159424
[epoch: 1, batch: 5] loss: 4.749356746673584
[epoch: 1, batch: 6] loss: 4.6155009269714355

KeyboardInterrupt:
```

```python
encoder = LSTMEncoder(DE, 300, 200, 4, 0.5)
decoder = LSTMDecoder(EN, 300, 200, 4, 0.5)
model = Seq2Seq(encoder, decoder, 'cpu')
model.fit(train_iter, val_iter=val_iter)
```

```
[epoch: 1, batch: 1] loss: 9.302654266357422
[epoch: 1, batch: 2] loss: 9.319097518920898
[epoch: 1, batch: 3] loss: 9.267510414123535
[epoch: 1, batch: 4] loss: 9.26815128326416
[epoch: 1, batch: 5] loss: 9.23572063446045
[epoch: 1, batch: 6] loss: 9.248668670654297
[epoch: 1, batch: 7] loss: 9.23048210144043
[epoch: 1, batch: 8] loss: 9.207789421081543
[epoch: 1, batch: 9] loss: 9.19106388092041
[epoch: 1, batch: 10] loss: 9.158113479614258
[epoch: 1, batch: 11] loss: 9.140076637268066
[epoch: 1, batch: 12] loss: 9.12300968170166
[epoch: 1, batch: 13] loss: 9.122042655944824
[epoch: 1, batch: 14] loss: 9.090656280517578
[epoch: 1, batch: 15] loss: 9.098532676696777
[epoch: 1, batch: 16] loss: 9.080828666687012
[epoch: 1, batch: 17] loss: 9.058524131774902
[epoch: 1, batch: 18] loss: 9.028560638427734
[epoch: 1, batch: 19] loss: 9.005752563476562
[epoch: 1, batch: 20] loss: 9.014092445373535
[epoch: 1, batch: 21] loss: 8.998960494995117
[epoch: 1, batch: 22] loss: 8.967049598693848
[e

KeyboardInterrupt:
```
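
One way to sanity-check the `model.fit` log above is to convert the loss to perplexity. This sketch assumes the printed loss is the mean per-token cross-entropy in nats, which the log itself does not confirm. Under that assumption, a model that predicts uniformly over `V` target tokens has loss `ln(V)`, so the first-batch loss of about 9.30 corresponds to a perplexity of roughly 11,000: what you would expect from an untrained model if the target vocabulary is about that size. The helper names are hypothetical.

```python
import math

# Loss values copied from the first and last complete lines of the log above.
first_loss = 9.302654266357422
last_loss = 8.967049598693848


def perplexity(loss_nats):
    """Perplexity for a mean per-token cross-entropy given in nats (assumed)."""
    return math.exp(loss_nats)


def uniform_loss(vocab_size):
    """Cross-entropy of a uniform guess over vocab_size tokens, in nats."""
    return math.log(vocab_size)


print("perplexity at batch 1: ", round(perplexity(first_loss)))
print("perplexity at batch 22:", round(perplexity(last_loss)))
print("uniform loss for V=11000:", round(uniform_loss(11000), 4))
```

If the initial loss were far above `ln(V)` for your actual vocabulary size, that would point to a problem such as a poor initialization or a loss that is summed rather than averaged.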

Notes from tax talk:
- You should use WindStar to file taxes
- The software isn't available yet (expected sometime next week)

Forms you may need to file (one, two, or all three of the following):
- Federal income tax return
- State tax return (you may also need to file this)
- Form 8843
