<h3><b>Train a Neural Machine Translation model using an attention mechanism to enhance its performance.</b></h3>

<h4><i>0. Setup </i></h4>

In [1]:
import os
import numpy as np
import tensorflow as tf

from model import Translator
from model import masked_loss, masked_acc 
from nmt_dataset import DatasetConfig
from nmt_dataset import load_dataset
from nmt_dataset import decode

<h4><i>1. Load dataset. </i></h4>

In [2]:
# define dataset loading config
ds_config = DatasetConfig(
    vocab_size=25_000,
    max_sequence_len=45,
    batch_size=32,
    inputs_preprocessor=None,
    targets_preprocessor=None,
)

In [3]:
# load dataset

print("loading dataset and tokenizers ...", end=" ")
dataset, inputs_vectorizer, targets_vectorizer = load_dataset(
    config=ds_config
)
print("done.")

loading dataset and tokenizers ... done.


In [4]:
# A quick look at the structure of the dataset's inputs and outputs.

for inputs, outputs in dataset.take(1):
    enc_input, dec_input = inputs["enc_inputs"], inputs["dec_inputs"]
    print(f"encoder input shape:{'':2} {enc_input.shape}")
    print(f"decoder input shape:{'':2} {dec_input.shape}")
    print(f"output target shape:{'':2} {outputs.shape}")

encoder input shape:   (32, 45)
decoder input shape:   (32, 44)
output target shape:   (32, 44)
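
The decoder input and the output target are one token shorter than the encoder input (44 versus 45). That is consistent with teacher forcing, where the target sentence is split into a decoder input (every token except the last) and a label sequence (every token except the first). The sketch below illustrates the shift with NumPy. It assumes that `load_dataset` builds its pairs this way, which this notebook does not show, and the token ids (0 for padding, 2 for start, 3 for end) are hypothetical.

```python
import numpy as np

# Hypothetical padded target batch: 0 = padding, 2 = start token, 3 = end token.
targets = np.array([
    [2, 11, 12, 13, 3, 0],
    [2, 21, 22, 3, 0, 0],
])

# Teacher forcing: the decoder reads tokens 0..n-2 and is trained
# to predict tokens 1..n-1, so both arrays lose one position.
dec_inputs = targets[:, :-1]
dec_labels = targets[:, 1:]

print("decoder input shape:", dec_inputs.shape)
print("label shape:        ", dec_labels.shape)
```

At every position, the label is the token that follows the corresponding decoder input token.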


<h4><i>2. Make a new model. </i></h4>

In [5]:
nmt_model = Translator(
    inputs_vectorizer,
    targets_vectorizer,
    units=512,
    attn_num_heads=6,
)

<h4><i>3. Compile and train the model. </i></h4>

In [7]:
# Compile the Model
nmt_model.compile(optimizer="adam", loss=masked_loss, metrics=[masked_loss, masked_acc])
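
`masked_loss` and `masked_acc` are imported from `model`, and their code is not shown in this notebook. Functions of this kind usually exclude padding positions so that padded tokens do not dilute the loss or inflate the accuracy. The NumPy sketch below illustrates that idea only. It assumes the padding id is 0 and that the model outputs unnormalized logits. The `_sketch` function names are hypothetical, and the real implementations may differ.

```python
import numpy as np

def masked_loss_sketch(y_true, logits, pad_id=0):
    """Mean cross-entropy over non-padding positions (assumed behavior)."""
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of the true token at each position.
    token_nll = -np.take_along_axis(log_probs, y_true[..., None], axis=-1)[..., 0]
    # Zero out padding positions and average over the real tokens only.
    mask = (y_true != pad_id).astype(log_probs.dtype)
    return (token_nll * mask).sum() / mask.sum()

def masked_acc_sketch(y_true, logits, pad_id=0):
    """Fraction of non-padding tokens predicted correctly (assumed behavior)."""
    preds = logits.argmax(axis=-1)
    mask = y_true != pad_id
    return ((preds == y_true) & mask).sum() / mask.sum()

# Tiny example: one sequence of length 3 (last position is padding), vocabulary of 4.
y_true = np.array([[1, 2, 0]])
logits = np.zeros((1, 3, 4))
print(masked_loss_sketch(y_true, logits), masked_acc_sketch(y_true, logits))
```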

In [8]:
# Fit the model.
history = nmt_model.fit(dataset, epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<h4><i>4. Save the model. </i></h4>

In [None]:
nmt_model.save("persian_to_english_nmt_model")

<h4><i>5. Evaluate the model. </i></h4>

In [15]:
texts = ["بگو کجا ", "با تو موافقم", "من داشتم قدم می زدم", " من یک دکتر هستم"]

for t in texts:
    result = nmt_model.translate([t])
    translated = result[0].numpy().decode()
    print(f"{t:10}: {translated}")
    print()

بگو کجا   : say one tell 

با تو موافقم: you can be as you for a bit 

من داشتم قدم می زدم: i love and withered 

 من یک دکتر هستم: i enjoy a doctor 



The translation quality is not satisfactory. Some potential reasons:
    1. Low-quality dataset.
    2. Small number of training samples.
    3. Training for only a few epochs.
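
To track whether changes to the data or the training schedule actually help, it is useful to put a rough number on translation quality. The sketch below computes a clipped unigram precision in plain Python: the share of output tokens that also appear in a reference translation. It is a crude stand-in for a proper metric such as BLEU. The function name is hypothetical, and the reference sentence was written for this illustration rather than taken from the dataset.

```python
from collections import Counter

def unigram_precision(hypothesis: str, reference: str) -> float:
    """Share of hypothesis tokens that also occur in the reference (clipped counts)."""
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    # Each hypothesis token is credited at most as often as it appears in the reference.
    matched = sum(
        min(count, ref_counts[token])
        for token, count in Counter(hyp_tokens).items()
    )
    return matched / len(hyp_tokens)

# Model output from the cell above, paired with an illustrative reference.
model_output = "i enjoy a doctor"
reference = "i am a doctor"
print(f"unigram precision: {unigram_precision(model_output, reference):.2f}")
```

A score like this ignores word order and meaning, so it is only suitable for comparing training runs against each other, not as an absolute measure of quality.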