# neuralmt: default program

In [None]:
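from neuralmt import *
import os, sys
import torch  # used below for device selection; made explicit rather than relying on the star import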

## Run the default solution on dev

In [None]:
model = Seq2Seq(build=False)
model.load(os.path.join('data', 'seq2seq_E049.pt'))
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
# Load the dev set (dev.txt) with the test-data loader
test_iter = loadTestData(os.path.join('data', 'input', 'dev.txt'), model.fields['src'],
                            device=device, linesToLoad=sys.maxsize)
results = translate(model, test_iter)  # Warning: can take more than 5 minutes, depending on your machine
print("\n".join(results))

## Evaluate the default output

In [None]:
from bleu_check import bleu
ref_t = []
with open(os.path.join('data','reference','dev.out')) as r:
    ref_t = r.read().strip().splitlines()
print(bleu(ref_t, results))

## Documentation

### Copy Mechanism
Inside the translate(model, test_iter) function, we added a simple copy mechanism: whenever the model emits an \<unk\> token, we find the position in the source sentence that the token corresponds to and insert the source word instead of \<unk\>. It also works for sentences with several unknown words, because each unknown word is assigned its own index into the source sentence.

To recover the source word behind each unknown token, we had to permute the dimensions of the attention matrix.
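
The replacement step can be sketched in plain Python. This is a minimal illustration, not the code inside translate: the function name `copy_unknowns`, the toy sentence, and the attention values are made up, and picking the source position with the largest attention weight is an assumption about how the index is chosen. The sketch also assumes the attention weights are already laid out as [target position][source position], which is the layout the permute is meant to provide.

```python
UNK = "<unk>"


def copy_unknowns(output_tokens, source_tokens, attention):
    """Replace each <unk> in the output with the most-attended source token.

    attention[t][s] is the weight that output position t puts on source
    position s (assumed layout: [target_len][source_len]).
    """
    repaired = []
    for t, token in enumerate(output_tokens):
        if token == UNK:
            weights = attention[t]
            # Source position with the largest attention weight for this output token.
            best_src = max(range(len(source_tokens)), key=lambda s: weights[s])
            token = source_tokens[best_src]
        repaired.append(token)
    return repaired


# Toy data, invented for illustration only.
source = ["die", "psychotherapie-patientin", "lacht"]
output = ["the", "<unk>", "laughs"]
attention = [
    [0.8, 0.1, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
print(copy_unknowns(output, source, attention))
```

Because every \<unk\> is resolved through its own row of the attention matrix, a sentence with several unknown words needs no special handling.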

## Analysis



### Copy Mechanism

Since English and German are both Germanic languages, the copy mechanism that we implemented, while far from perfect, still improves the score because the two languages share many words.

The baseline model scored 17.11 BLEU, while the baseline with the copy mechanism scored 17.40, an increase of 0.29 BLEU points.

The score increases slightly because some OOV words have an identical translation in the target language, so substituting the corresponding source word for the unknown token produces the correct output. A better approach would be to actually translate the OOV words, for example through fine-tuning, and this would become more necessary if the languages were less similar.
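
One way to check this explanation is to count how many OOV source words appear verbatim in the reference translation, since only those can be fixed by copying. The sketch below is hypothetical: `oov_copy_rate` is not part of neuralmt, and the vocabulary and sentences are toy data rather than values from dev.txt.

```python
def oov_copy_rate(source_tokens, reference_tokens, vocab):
    """Fraction of out-of-vocabulary source tokens that also appear in the reference."""
    oov = [tok for tok in source_tokens if tok not in vocab]
    if not oov:
        return 0.0
    reference = set(reference_tokens)
    copied = sum(1 for tok in oov if tok in reference)
    return copied / len(oov)


# Toy data, invented for illustration only.
vocab = {"die", "isst", "einen"}
source = "die psychotherapie-patientin isst einen kebab".split()
reference = "the psychotherapy patient eats a kebab".split()
print(oov_copy_rate(source, reference, vocab))  # 0.5
```

Here kebab can be copied directly, while psychotherapie-patientin cannot, so only half of the OOV words would benefit from the copy mechanism.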

We plotted the attention matrix below to get an idea of how the attention mechanism works and to help us adjust the tensor shapes needed to recover the original word that is considered OOV. In this example, the word psychotherapie-patientin is the \<unk\> word, and the reader can mostly understand it in English because the two languages are so similar. In terms of BLEU, however, only copied words that match the reference translation exactly improve the score. The word kebab, for example, is identical in German and English.

<img src="attentionplot.png">



In [None]:
# "Beamish search"
# Takes the 2k best outputs and creates the k best possible outputs from them.
# Starting with the first words of the k best original outputs, we use a trigram probability
# to select the next word. In the example below, 'K' and 'k' represent the two final outputs
# and the grid represents the 2k original outputs.

# --|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--||--|
# K |  |K |k |K |K |K |K |K |K |  |  |  |  |  |  ||  |
# --|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--||--|
# k |Kk|  |  |k |k |k |k |k |k |Kk|Kk|Kk|Kk|Kk|Kk|
# --|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--||--||--|
#   |  |k |k |  |  |  |  |  |  |  |  |  |  |  |  ||  ||  |
# --|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--||--||--|
#   |  |  |K |  |  |  |  |  |  |  |  |  |  |  |  |
# --|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|
 
# The problem with this method was that extremely common words such as "I", "we", "was" and "the"
# kept getting selected, creating very poor sentences.