# Neural Machine Translation (NMT) System: Recurrent Neural Networks (RNN)

## Executive Summary
This project explores the implementation of a Neural Machine Translation (NMT) system designed to translate English sentences into French. It specifically focuses on the application of Recurrent Neural Networks (RNNs) within a Sequence-to-Sequence (Seq2Seq) framework, providing a foundational analysis of deep learning techniques for natural language processing tasks before the advent of Transformer architectures.

## Research Objectives
1.  **RNN Efficacy Analysis**: To evaluate the performance of RNN-based Encoder-Decoder architectures in handling sequence translation.
2.  **Seq2Seq Implementation**: To construct a robust pipeline involving data preprocessing, vocabulary handling, and tensor batching tailored for sequential models.
3.  **Performance Quantification**: To utilize standard BLEU (Bilingual Evaluation Understudy) scores to rigorously assess translation quality.

## Methodology

### 1. Data Engineering
*   **Corpus**: English-French sentence pairs (Anki/ManyThings).
*   **Preprocessing**: Implementation of a Unicode-to-ASCII normalization pipeline, regex cleaning, and sequence padding.
*   **Vectorization**: Custom `DataLoader` and `Vocab_Lang` classes for efficient tensor mapping.

### 2. Model Architecture
*   **Encoder-Decoder Framework**: A classic Seq2Seq architecture where an Encoder RNN compresses the source sentence into a context vector, which is then unfolded by a Decoder RNN to generate the translation.
*   **Recurrent Layers**: Utilization of RNN units (potentially LSTM or GRU variants) to capture temporal dependencies in text.
*   **Teacher Forcing**: Implementation of teacher forcing during training to stabilize convergence by using ground truth targets as next-step inputs.

### 3. Training & Evaluation
*   **Optimization**: Adam optimizer with Cross-Entropy Loss to handle multi-class classification at each time step.
*   **Metrics**: Evaluation performed on unseen test data using BLEU-1 through BLEU-4 scores to measure n-gram overlap between predicted and reference translations.

---
*This notebook serves as a focused study on RNN-based NMT methodologies, establishing a baseline for understanding sequence modeling.*

In [1]:
# run this code when running the code on Google Colab
# from google.colab import drive
# drive.mount('/content/drive')
# import sys
# sys.path.insert(0,'/content/drive/MyDrive/Colab Notebooks/NMT')

## 1. Importing Libraries

In [2]:
# Importing the required libraries
import pandas as pd
import numpy as np
import unicodedata
import re
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import torch
import random
import os
import math

# Importing the required libraries for model training and evalutation
import torch.nn as nn
import torch.nn.functional as F
import time
from tqdm.notebook import tqdm
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction, corpus_bleu

## 2. Downloaing Data

Downloading the dataset and arranging the source and target sentences in a dataframe

In [3]:
# downloading the data
if __name__ == '__main__':
    os.system("wget http://www.manythings.org/anki/fra-eng.zip")
    os.system("unzip -o fra-eng.zip")

# arranding the data in a dataframe
if __name__ == '__main__':
    lines = open('fra.txt', encoding='UTF-8').read().strip().split('\n')
    total_num_examples = 50000 
    original_word_pairs = [[w for w in l.split('\t')][:2] for l in lines[:total_num_examples]]
    random.shuffle(original_word_pairs)

    dat = pd.DataFrame(original_word_pairs, columns=['eng', 'fra'])
    print(dat) # Visualize the data

--2026-01-13 13:10:55--  http://www.manythings.org/anki/fra-eng.zip
Resolving www.manythings.org (www.manythings.org)... 173.254.30.110
Connecting to www.manythings.org (www.manythings.org)|173.254.30.110|:80... 173.254.30.110
Connecting to www.manythings.org (www.manythings.org)|173.254.30.110|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8186368 (7.8M) [application/zip]
Saving to: ‘fra-eng.zip.2’

     0K .......... .......... .......... .......... ..........  0%  752K 11s
    50K .......... .......... .......... .......... ..........  1% 2.05M 7s
   100K .......... .......... .......... .......... ..........  1% 4.38M 5s
   150K .......... .......... .......... .......... ..........  2% 2.59M 5s
   200K .......... .......... .......... .......... ..........  3% 17.5M 4s
   250K .......... .......... .......... .......... ..........  3% 6.99M 3s
   300K .......... .......... .......... .......... ..........  4% 10.1M 3s
   350K ..connected.
HTTP request sen

Archive:  fra-eng.zip
  inflating: _about.txt              
  inflating: fra.txt                 
                        eng                              fra
0       Was the movie good?  Est-ce que le film était bien ?
1             I'm grounded.         Je suis privé de sortie.
2          I hate the heat.             Je déteste le chaud.
3       I panicked and ran.           J'ai paniqué et couru.
4           Tom works late.              Tom travaille tard.
...                     ...                              ...
49995   I'm not hungry now.     Je n'ai pas faim maintenant.
49996     I love this band.                J'aime ce groupe.
49997         I am so sick.    C'est moi qui suis si malade.
49998  Tom made a sandwich.         Tom prépara un sandwich.
49999       Summer is over.               L'été est terminé.

[50000 rows x 2 columns]

                        eng                              fra
0       Was the movie good?  Est-ce que le film était bien ?
1             I'm gro

## 3. Preprocessing the data


In [4]:
# Converts the unicode file to ascii
def unicode_to_ascii(s):
    """Normalizes latin chars with accent to their canonical decomposition"""
    return ''.join(c for c in unicodedata.normalize('NFD', s) if unicodedata.category(c) != 'Mn')


def preprocess_sentence(w):
    '''
    Preprocess the sentence to add the start, end tokens and make them lower-case
    '''
    w = unicode_to_ascii(w.lower().strip())
    w = re.sub(r'([?.!,¿])', r' \1 ', w)
    w = re.sub(r'[" "]+', ' ', w)

    w = re.sub(r'[^a-zA-Z?.!,¿]+', ' ', w)
    
    w = w.rstrip().strip()
    w = '<start> ' + w + ' <end>'
    return w

if __name__ == '__main__':
    data = dat.copy()
    data['eng'] = dat.eng.apply(lambda w: preprocess_sentence(w))
    data['fra'] = dat.fra.apply(lambda w: preprocess_sentence(w))
    print(data) # Visualizing the data

                                       eng  \
0       <start> was the movie good ? <end>   
1             <start> i m grounded . <end>   
2          <start> i hate the heat . <end>   
3       <start> i panicked and ran . <end>   
4           <start> tom works late . <end>   
...                                    ...   
49995   <start> i m not hungry now . <end>   
49996     <start> i love this band . <end>   
49997         <start> i am so sick . <end>   
49998  <start> tom made a sandwich . <end>   
49999       <start> summer is over . <end>   

                                                 fra  
0      <start> est ce que le film etait bien ? <end>  
1            <start> je suis prive de sortie . <end>  
2                <start> je deteste le chaud . <end>  
3              <start> j ai panique et couru . <end>  
4                 <start> tom travaille tard . <end>  
...                                              ...  
49995    <start> je n ai pas faim maintenant . <end>  
49996  

## 4. Building Vocabulary

Arranging the vocabulary of words from the source and target languages in a list

In [5]:
def build_vocabulary(pd_dataframe):
    '''
    Creating a list to store words forming the vocabulary of a chosen language
    '''
    sentences = [sen.split() for sen in pd_dataframe]
    vocabulary = {}
    for sent in sentences:
        for word in sent:
            if word not in vocabulary:
                vocabulary[word] = 1
    return list(vocabulary.keys())

if __name__ == '__main__':
    src_vocab_list = build_vocabulary(data['eng'])
    trg_vocab_list = build_vocabulary(data['fra'])

print("The source vocabulary is: ", src_vocab_list)
print("The target vocabulary is: ", trg_vocab_list)

The target vocabulary is:  ['<start>', 'est', 'ce', 'que', 'le', 'film', 'etait', 'bien', '?', '<end>', 'je', 'suis', 'prive', 'de', 'sortie', '.', 'deteste', 'chaud', 'j', 'ai', 'panique', 'et', 'couru', 'tom', 'travaille', 'tard', 'nous', 'devons', 'echapper', 'ta', 'chance', 'epuisee', 'il', 'faut', 'essaie', 'ou', 'as', 'tu', 'mange', 'ne', 'pas', 'content', 'n', 'aime', 'les', 'epinards', 'interromps', '!', 'tends', 'la', 'corde', 'adore', 'francais', 'qui', 'a', 'dormi', 'ici', 'tout', 'disparu', 'geologue', 'tant', 'faire', 'vis', ',', 'desormais', 'etes', 'vous', 'dingues', 'ressens', 'crois', 'etre', 'dans', 'vrai', 'en', 'connaissions', 'aucun', 'avez', 'tous', 'reussi', 'ferme', 'porte', 'va', 'chercher', 'chanter', 'laisse', 'moi', 'paix', 'sonne', 'ete', 'prudent', 'aimes', 'ca', 'combien', 'recois', 'mes', 'amis', 'me', 'manquent', 'savez', 'c', 'repondez', 'ma', 'question', 'rien', 'pardonne', 'fais', 'encore', 'une', 'fois', 'bonne', 'journee', 'un', 'pret', 'attendez',

## 5. Instantiating the training and target data set

1. **Vocabulary Class** -  A separate class has been created for vocabulary. With this, the vocabulary list for each language can be instantiated into a data structure which stores the words from the vocabulary accompanied with a mapping of these words with numbers serving as their indices. This numbers can be used in the training process by the model.

2. **DataLoader Class** - Each sentence is stored as a list of words. The DataLoader will instantiate this into a long tensor.



In [6]:
# vocabulary class
class Vocab_Lang():
    def __init__(self, vocab):
        self.word2idx = {'<pad>': 0, '<unk>': 1}
        self.idx2word = {0: '<pad>', 1: '<unk>'}
        self.vocab = vocab
        
        for index, word in enumerate(vocab):
            self.word2idx[word] = index + 2 # +2 because of <pad> and <unk> token
            self.idx2word[index + 2] = word
    
    def __len__(self):
        return len(self.word2idx)

# data loader class
class MyData(Dataset):
    def __init__(self, X, y):
        self.length = torch.LongTensor([np.sum(1 - np.equal(x, 0)) for x in X])
        self.data = torch.LongTensor(X)
        self.target = torch.LongTensor(y)
    
    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]
        return x, y

    def __len__(self):
        return len(self.data)

In [8]:
def pad_sequences(x, max_len):
    """ 
    Adding padding to sentences of length smaller than the maximum sentence length
    """
    padded = np.zeros((max_len), dtype=np.int64)
    if len(x) > max_len:
        padded[:] = x[:max_len]
    else:
        padded[:len(x)] = x
    return padded


def preprocess_data_to_tensor(dataframe, src_vocab, trg_vocab):
    # Vectorize the input and target languages
    src_tensor = [[src_vocab.word2idx[s if s in src_vocab.vocab else '<unk>'] for s in eng.split(' ')] for eng in dataframe['eng'].values.tolist()]
    trg_tensor = [[trg_vocab.word2idx[s if s in trg_vocab.vocab else '<unk>'] for s in fra.split(' ')] for fra in dataframe['fra'].values.tolist()]

    # Calculate the max_length of input and output tensor for padding
    max_length_src, max_length_trg = max(len(t) for t in src_tensor), max(len(t) for t in trg_tensor)
    print('max_length_src: {}, max_length_trg: {}'.format(max_length_src, max_length_trg))

    # Pad all the sentences in the dataset with the max_length
    src_tensor = [pad_sequences(x, max_length_src) for x in src_tensor]
    trg_tensor = [pad_sequences(x, max_length_trg) for x in trg_tensor]

    return src_tensor, trg_tensor, max_length_src, max_length_trg


def train_test_split(src_tensor, trg_tensor):
    '''
    Create training and test sets.
    '''
    total_num_examples = len(src_tensor) - int(0.2*len(src_tensor))
    src_tensor_train, src_tensor_test = src_tensor[:int(0.75*total_num_examples)], src_tensor[int(0.75*total_num_examples):total_num_examples]
    trg_tensor_train, trg_tensor_test = trg_tensor[:int(0.75*total_num_examples)], trg_tensor[int(0.75*total_num_examples):total_num_examples]

    return src_tensor_train, src_tensor_test, trg_tensor_train, trg_tensor_test

The sentences from the source and target language are stored in form of tensors. The data is split into training and testing data set separately. By setting the appropriate hyperparamters of embedding and batch size the tensor data in smapled into batched using the DataLoader module from Pytorch

In [9]:
if __name__ == '__main__':

    # HYPERPARAMETERS 
    BATCH_SIZE = 64
    EMBEDDING_DIM = 256

    src_vocab = Vocab_Lang(src_vocab_list)
    trg_vocab = Vocab_Lang(trg_vocab_list)

    src_tensor, trg_tensor, max_length_src, max_length_trg = preprocess_data_to_tensor(data, src_vocab, trg_vocab)
    src_tensor_train, src_tensor_val, trg_tensor_train, trg_tensor_val = train_test_split(src_tensor, trg_tensor)
    # Create train and val datasets
    train_dataset = MyData(src_tensor_train, trg_tensor_train)
    train_dataset = DataLoader(train_dataset, batch_size=BATCH_SIZE, drop_last=True, shuffle=True)
    
    test_dataset = MyData(src_tensor_val, trg_tensor_val)
    test_dataset = DataLoader(test_dataset, batch_size=BATCH_SIZE, drop_last=True, shuffle=False)

max_length_src: 10, max_length_trg: 17


  self.data = torch.LongTensor(X)


In [9]:
if __name__ == '__main__':
    idxes = random.choices(range(len(train_dataset.dataset)), k=5)
    src, trg =  train_dataset.dataset[idxes]
    print('Source:', src)
    print('Source Dimensions: ', src.size())
    print('Target:', trg)
    print('Target Dimensions: ', trg.size())

Source: tensor([[   2,   47,   38,  390, 3940,   11,    7,    0,    0,    0,    0],
        [   2,   38,   54,   16,  327,   28,    7,    0,    0,    0,    0],
        [   2,   34,    9, 2408,   11,    7,    0,    0,    0,    0,    0],
        [   2,   96,  152,   54,   11,    7,    0,    0,    0,    0,    0],
        [   2,   14,  319,   80,   32,  964,   11,    7,    0,    0,    0]])
Source Dimensions:  torch.Size([5, 11])
Target: tensor([[   2,   50,   40, 5958,   11,    7,    0,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0],
        [   2,   95,   40,  112, 1183,   29,    7,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0],
        [   2,   36,  272,   42,   43,   34,   36, 2494,   11,    7,    0,    0,
            0,    0,    0,    0,    0,    0,    0],
        [   2,   41,  106, 6365,   11,    7,    0,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0],
        [   2,   16,  295,   48

## 6. Model training
Now we will train a Transformer-based Encoder and Decoder model for learning the translation from the source to target language. We will import the encoder and decoder models and use them in the training process

### RNN Model Training

In [16]:
import rnnencoder
import rnndecoder

def loss_function(real, pred):
    mask = real.ge(1).float() # Only consider non-zero inputs in the loss
    
    loss_ = F.cross_entropy(pred, real) * mask 
    return torch.mean(loss_)

def train_rnn_model(encoder, decoder, dataset, optimizer, trg_vocab, device, n_epochs):
    batch_size = dataset.batch_size
    for epoch in range(n_epochs):
        start = time.time()
        n_batch = 0
        total_loss = 0
        
        encoder.train()
        decoder.train()
        
        for src, trg in tqdm(dataset):
            n_batch += 1
            loss = 0
            
            enc_output, enc_hidden = encoder(src.transpose(0,1).to(device))
            dec_hidden = enc_hidden
            
            # use teacher forcing - feeding the target as the next input (via dec_input)
            dec_input = torch.tensor([[trg_vocab.word2idx['<start>']]] * batch_size)
        
            # run code below for every timestep in the ys batch
            for t in range(1, trg.size(1)):
                predictions, dec_hidden, _ = decoder(dec_input.to(device), dec_hidden.to(device), enc_output.to(device))
                assert len(predictions.shape) == 2 and predictions.shape[0] == dec_input.shape[0] and predictions.shape[1] == len(trg_vocab.word2idx), "First output of decoder must have shape [batch_size, vocab_size], you returned shape " + str(predictions.shape)
                loss += loss_function(trg[:, t].to(device), predictions.to(device))
                dec_input = trg[:, t].unsqueeze(1)
        
            batch_loss = (loss / int(trg.size(1)))
            total_loss += batch_loss.item()
            
            optimizer.zero_grad()
            
            batch_loss.backward()

            ### update model parameters
            optimizer.step()
        
        print('Epoch:{:2d}/{}\t Loss: {:.4f} \t({:.2f}s)'.format(epoch + 1, n_epochs, total_loss / n_batch, time.time() - start))

    print('Model trained!')

In [18]:
if __name__ == '__main__':
    # HYPERPARAMETERS
    LEARNING_RATE = 0.001
    HIDDEN_UNITS=256
    N_EPOCHS=10

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
  
    rnn_encoder = rnnencoder.RnnEncoder(src_vocab, EMBEDDING_DIM, HIDDEN_UNITS).to(device)
    rnn_decoder = rnndecoder.RnnDecoder(trg_vocab, EMBEDDING_DIM, HIDDEN_UNITS).to(device)

    rnn_model_params = list(rnn_encoder.parameters()) + list(rnn_decoder.parameters())
    optimizer = torch.optim.Adam(rnn_model_params, lr=LEARNING_RATE)

    print('Encoder and Decoder models initialized!')

Encoder and Decoder models initialized!


In [19]:
if __name__ == '__main__':
    train_rnn_model(rnn_encoder, rnn_decoder, train_dataset, optimizer, trg_vocab, device, N_EPOCHS)

  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 1/10	 Loss: 1.2961 	(86.14s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 2/10	 Loss: 0.8533 	(79.34s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 3/10	 Loss: 0.6509 	(78.07s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 4/10	 Loss: 0.5025 	(77.48s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 5/10	 Loss: 0.3877 	(77.34s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 6/10	 Loss: 0.3012 	(87.66s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 7/10	 Loss: 0.2382 	(80.38s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 8/10	 Loss: 0.1937 	(75.95s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch: 9/10	 Loss: 0.1625 	(77.35s)


  0%|          | 0/468 [00:00<?, ?it/s]

Epoch:10/10	 Loss: 0.1411 	(84.62s)
Model trained!


# 7. Decoding output

Decoding the output to predict the sentences for unseen source sentences

### RNN Model decoding

In [20]:
import decodingalgorithm
if __name__ == '__main__':
    rnn_encoder.eval()
    rnn_decoder.eval()
    idxes = random.choices(range(len(test_dataset.dataset)), k=5)
    src, trg =  train_dataset.dataset[idxes]
    curr_output, _ = decodingalgorithm.decode_rnn_model(rnn_encoder, rnn_decoder, src.transpose(0,1).to(device), trg.size(1), device)
    for i in range(len(src)):
        print("Source sentence:", ' '.join([x for x in [src_vocab.idx2word[j.item()] for j in src[i]] if x != '<pad>']))
        print("Target sentence:", ' '.join([x for x in [trg_vocab.idx2word[j.item()] for j in trg[i]] if x != '<pad>']))
        print("Predicted sentence:", ' '.join([x for x in [trg_vocab.idx2word[j.item()] for j in curr_output[i]] if x != '<pad>']))
        print("----------------")

Source sentence: <start> don t you trust me ? <end>
Target sentence: <start> ne me faites vous pas confiance ? <end>
Predicted sentence: <start> ne me faites vous pas confiance en moi ? <end>
----------------
Source sentence: <start> you re productive . <end>
Target sentence: <start> tu es productif . <end>
Predicted sentence: <start> vous etes productive . <end>
----------------
Source sentence: <start> give me a drink . <end>
Target sentence: <start> donne moi quelque chose a boire ! <end>
Predicted sentence: <start> donne moi quelque chose a boire ! <end>
----------------
Source sentence: <start> i lost my keys . <end>
Target sentence: <start> j ai perdu mes cles . <end>
Predicted sentence: <start> j ai perdu mes cles . <end>
----------------
Source sentence: <start> did you hug anybody ? <end>
Target sentence: <start> avez vous enlace qui que ce soit ? <end>
Predicted sentence: <start> as tu enlace qui que ce soit ? <end>
----------------


## Model Evaluation

Evaluation of the model is based on the blue score

### RNN Model Evaluation

In [21]:
def get_reference_candidate(target, pred, trg_vocab):
    def _to_token(sentence):
        lis = []
        for s in sentence[1:]:
            x = trg_vocab.idx2word[s]
            if x == "<end>": break
            lis.append(x)
        return lis
    reference = _to_token(list(target.numpy()))
    candidate = _to_token(list(pred.numpy()))
    return reference, candidate

def compute_bleu_scores(target_tensor_val, target_output, final_output, trg_vocab):
    bleu_1 = 0.0
    bleu_2 = 0.0
    bleu_3 = 0.0
    bleu_4 = 0.0

    smoother = SmoothingFunction()
    save_reference = []
    save_candidate = []
    for i in range(len(target_tensor_val)):
        reference, candidate = get_reference_candidate(target_output[i], final_output[i], trg_vocab)
    
        bleu_1 += sentence_bleu(reference, candidate, weights=(1,), smoothing_function=smoother.method1)
        bleu_2 += sentence_bleu(reference, candidate, weights=(1/2, 1/2), smoothing_function=smoother.method1)
        bleu_3 += sentence_bleu(reference, candidate, weights=(1/3, 1/3, 1/3), smoothing_function=smoother.method1)
        bleu_4 += sentence_bleu(reference, candidate, weights=(1/4, 1/4, 1/4, 1/4), smoothing_function=smoother.method1)

        save_reference.append(reference)
        save_candidate.append(candidate)
    
    bleu_1 = bleu_1/len(target_tensor_val)
    bleu_2 = bleu_2/len(target_tensor_val)
    bleu_3 = bleu_3/len(target_tensor_val)
    bleu_4 = bleu_4/len(target_tensor_val)

    scores = {"bleu_1": bleu_1, "bleu_2": bleu_2, "bleu_3": bleu_3, "bleu_4": bleu_4}
    print('BLEU 1-gram: %f' % (bleu_1))
    print('BLEU 2-gram: %f' % (bleu_2))
    print('BLEU 3-gram: %f' % (bleu_3))
    print('BLEU 4-gram: %f' % (bleu_4))

    return save_candidate, scores

def evaluate_rnn_model(encoder, decoder, test_dataset, target_tensor_val, device):
    trg_vocab = decoder.trg_vocab
    batch_size = test_dataset.batch_size
    n_batch = 0
    total_loss = 0

    encoder.eval()
    decoder.eval()
    
    final_output, target_output = None, None

    with torch.no_grad():
        for batch, (src, trg) in enumerate(test_dataset):
            n_batch += 1
            loss = 0
            curr_output, curr_predictions = decodingalgorithm.decode_rnn_model(encoder, decoder, src.transpose(0,1).to(device), trg.size(1), device)
            for t in range(1, trg.size(1)):
                loss += loss_function(trg[:, t].to(device), curr_predictions[:,t,:].to(device))

            if final_output is None:
                final_output = torch.zeros((len(target_tensor_val), trg.size(1)))
                target_output = torch.zeros((len(target_tensor_val), trg.size(1)))
            final_output[batch*batch_size:(batch+1)*batch_size] = curr_output
            target_output[batch*batch_size:(batch+1)*batch_size] = trg
            batch_loss = (loss / int(trg.size(1)))
            total_loss += batch_loss

        print('Loss {:.4f}'.format(total_loss / n_batch))
    
    # Compute BLEU scores
    return compute_bleu_scores(target_tensor_val, target_output, final_output, trg_vocab)

In [22]:
if __name__ == '__main__':
    rnn_save_candidate, rnn_scores = evaluate_rnn_model(rnn_encoder, rnn_decoder, test_dataset, trg_tensor_val, device)

Loss 1.8288
BLEU 1-gram: 0.236959
BLEU 2-gram: 0.074289
BLEU 3-gram: 0.056699
BLEU 4-gram: 0.054889
BLEU 1-gram: 0.236959
BLEU 2-gram: 0.074289
BLEU 3-gram: 0.056699
BLEU 4-gram: 0.054889


**Saving Transformer Encoder and Decoder Model**

In [24]:
if __name__=='__main__':
    # from google.colab import drive
    # drive.mount('/content/drive')
    if rnn_encoder is not None and rnn_encoder is not None:
        print("Saving RNN model....") 
        torch.save(rnn_encoder, 'rnn_encoder.pt')
        torch.save(rnn_decoder, 'rnn_decoder.pt')
        print("RNN model saved!")

Saving RNN model....
RNN model saved!
