Fine-tuning the best T5 Transformer 🤖
-----------------------------------

In this notebook, we continue fine-tuning the T5 transformer on the newly extracted sentences from the book **Grammaire de Wolof Moderne**, without considering the definitions. Below, we provide the main evaluation figures obtained from the hyperparameter search step. We will evaluate the training on the validation dataset.

- Parallel coordinates from panel:

- Parameter importance chart:
[t5_v3_importance](https://wandb.ai/oumar-kane-team/small-t5-cross-fw-translation-bayes-hpsearch-v3/reports/undefined-23-05-16-10-36-17---Vmlldzo0Mzc4NDY0?accessToken=eyaiyrid0qz1zg2jkq3fc65biw53084dpfitbi0dgonq6mweupw6kgjml9d2nv1w)

We can see in the above chart that the batch size is the most important parameter, with a negative correlation with the BLEU score (meaning that a lower batch size is better). Next, the probability of modifying a character in the French corpus is also important, and a high probability provides a better BLEU score.
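To make the character-modification probability concrete, the sketch below perturbs each letter of a French sentence with probability `char_p`. It is a hypothetical, standard-library stand-in, not the `nlpaug` `KeyboardAug` used later in this notebook: the function name `perturb_chars`, the random-letter substitution, and the per-character reading of the probability are all assumptions made for illustration.

```python
import random
import string


def perturb_chars(sentence: str, char_p: float, seed: int = 0) -> str:
    """Replace each letter by a random lowercase letter with probability char_p.

    Hypothetical helper: a toy stand-in for keyboard-typo augmentation.
    """
    rng = random.Random(seed)
    out = []
    for ch in sentence:
        if ch.isalpha() and rng.random() < char_p:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)
    return "".join(out)


sentence = "Bonjour tout le monde"
clean = perturb_chars(sentence, char_p=0.0)   # probability 0: sentence unchanged
noisy = perturb_chars(sentence, char_p=0.98)  # high probability: most letters replaced
print(clean)
print(noisy)
```

With a probability close to 1, as selected by the search, almost every letter of the source sentence is disturbed, so the model sees a very noisy French input during training.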

In [1]:
# let us import all necessary libraries
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer, T5TokenizerFast, set_seed, AdamW, get_linear_schedule_with_warmup, T5ForConditionalGeneration,\
    get_cosine_schedule_with_warmup, Adafactor
from wolof_translate.utils.sent_transformers import TransformerSequences
from wolof_translate.utils.improvements.end_marks import add_end_mark # added
from torch.nn import TransformerEncoderLayer, TransformerDecoderLayer
from torch.utils.data import Dataset, DataLoader, random_split
from wolof_translate.data.dataset_v3 import SentenceDataset # v2 -> v3
from wolof_translate.utils.sent_corrections import *
from sklearn.model_selection import train_test_split
from torch.optim.lr_scheduler import _LRScheduler
# from custom_rnn.utils.kwargs import Kwargs
from torch.nn.utils.rnn import pad_sequence
from plotly.subplots import make_subplots
from nlpaug.augmenter import char as nac
# from datasets import load_metric  # replaced by the `evaluate` package: pip install evaluate
# (the metric backend must be installed too, e.g. pip install sacrebleu)
from torch.nn import functional as F
import plotly.graph_objects as go
from tokenizers import Tokenizer
import matplotlib.pyplot as plt
from tqdm import tqdm, trange
from functools import partial
from torch.nn import utils
from copy import deepcopy
from torch import optim
from typing import *
from torch import nn
import pandas as pd
import numpy as np
import itertools
import evaluate
import random
import string
import shutil
import wandb
import torch
import json
import copy
import os

os.environ["WANDB_DISABLED"] = "true"



## French to Wolof

### Configure dataset 🔠

In [2]:
# load the tokenizer from a JSON file
tokenizer = T5TokenizerFast(tokenizer_file="wolof-translate/wolof_translate/tokenizers/t5_tokenizers/tokenizer_v3_2.json")


In [3]:
def recuperate_datasets(fr_char_p: float, fr_word_p: float, max_len: int, end_mark_opt: int):
  """Build the augmented training dataset and the validation dataset.

  `end_mark_opt` selects the end-mark handling of the French sentences:
  1 adds no end-mark step, while 2, 3 and 4 each use a different
  `add_end_mark` configuration.
  """

  # Keyboard-typo augmentation applied to the French sentences
  # (note: `max_len` is also passed as the maximum number of augmented words)
  keyboard_aug = nac.KeyboardAug(aug_char_p=fr_char_p, aug_word_p=fr_word_p,
                                 aug_word_max=max_len)

  transformations = [keyboard_aug, remove_mark_space, delete_guillemet_space]

  # Options 2 to 4 append an end-mark function to the transformations
  if end_mark_opt == 2:
    transformations.append(partial(add_end_mark, end_mark_to_remove = '!', replace = True))
  elif end_mark_opt == 3:
    transformations.append(partial(add_end_mark))
  elif end_mark_opt == 4:
    transformations.append(partial(add_end_mark, end_mark_to_remove = '!'))
  elif end_mark_opt != 1:
    # previously an unknown option failed later with an undefined `end_mark_fn`
    raise ValueError(f"end_mark_opt must be 1, 2, 3 or 4, got {end_mark_opt}")

  fr_augmentation = TransformerSequences(*transformations)

  # Load the training dataset (with augmentation)
  train_dataset_aug = SentenceDataset("data/extractions/new_data/train_set.csv",
                                      tokenizer,
                                      truncation = True, max_len = max_len,
                                      cp1_transformer = fr_augmentation)

  # Load the validation dataset (no augmentation)
  valid_dataset = SentenceDataset("data/extractions/new_data/valid_set.csv",
                                  tokenizer, max_len = max_len,
                                  truncation = True)

  # Return the datasets
  return train_dataset_aug, valid_dataset
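
The function above configures its end-mark step with `functools.partial` and hands a list of callables to `TransformerSequences`. The sketch below illustrates that pattern in isolation. It assumes the callables are applied one after the other, which is not confirmed by this notebook, and `toy_end_mark` and `apply_in_order` are hypothetical helpers, not the real `add_end_mark` or `TransformerSequences`.

```python
from functools import partial
from typing import Callable, List


def toy_end_mark(sentence: str, end_mark: str = ".", end_mark_to_remove: str = "") -> str:
    """Hypothetical end-mark step: optionally strip one mark, then ensure a final mark."""
    if end_mark_to_remove and sentence.endswith(end_mark_to_remove):
        sentence = sentence[: -len(end_mark_to_remove)].rstrip()
    if not sentence.endswith((".", "!", "?")):
        sentence += end_mark
    return sentence


def apply_in_order(sentence: str, steps: List[Callable[[str], str]]) -> str:
    """Apply each transformation to the output of the previous one."""
    for step in steps:
        sentence = step(sentence)
    return sentence


# partial() freezes the keyword argument, leaving a one-argument callable
strip_bang = partial(toy_end_mark, end_mark_to_remove="!")
steps = [str.strip, strip_bang]

result = apply_in_order("  Salut tout le monde !  ", steps)
print(result)
```

Freezing the options with `partial` keeps every step a function of the sentence only, so the steps can be stored in a list and swapped according to the selected option.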

### Configure the model and the evaluation function ⚙️

Let us evaluate the predictions with the `bleu` metric.

In [4]:
%%writefile wolof-translate/wolof_translate/utils/evaluation.py
from tokenizers import Tokenizer
from typing import *
import numpy as np
import evaluate

class TranslationEvaluation:
    
    def __init__(self, 
                 tokenizer: Tokenizer,
                 decoder: Union[Callable, None] = None,
                 metric = evaluate.load('sacrebleu'),
                 ):
        
        self.tokenizer = tokenizer
        
        self.decoder = decoder
        
        self.metric = metric
    
    def postprocess_text(self, preds, labels):
        
        preds = [pred.strip() for pred in preds]
        
        labels = [[label.strip()] for label in labels]
        
        return preds, labels

    def compute_metrics(self, eval_preds):

        preds, labels = eval_preds

        if isinstance(preds, tuple):
        
            preds = preds[0]
        
        decoded_preds = self.tokenizer.batch_decode(preds, skip_special_tokens=True)

        labels = np.where(labels != -100, labels, self.tokenizer.pad_token_id)
        
        decoded_labels = self.tokenizer.batch_decode(labels, skip_special_tokens=True)

        decoded_preds, decoded_labels = self.postprocess_text(decoded_preds, decoded_labels)

        result = self.metric.compute(predictions=decoded_preds, references=decoded_labels)
        
        result = {"bleu": result["score"]}

        prediction_lens = [np.count_nonzero(pred != self.tokenizer.pad_token_id) for pred in preds]
        
        result["gen_len"] = np.mean(prediction_lens)
        
        result = {k: round(v, 4) for k, v in result.items()}
        
        return result

Overwriting wolof-translate/wolof_translate/utils/evaluation.py
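
Two steps of `compute_metrics` can be checked by hand without a model: replacing the `-100` label positions by the pad id before decoding, and counting the non-pad tokens to obtain `gen_len`. The sketch below is a minimal illustration with plain Python lists instead of NumPy arrays; the pad id `0` and the token ids are made-up values.

```python
from statistics import mean

PAD_ID = 0           # assumed pad token id, for illustration only
IGNORE_INDEX = -100  # value marking label positions that the loss ignores

# toy label and prediction id sequences (made-up ids)
labels = [[12, 7, 1, -100, -100], [5, 9, 4, 3, 1]]
preds = [[12, 7, 1, 0, 0], [5, 9, 1, 0, 0]]

# step 1: replace the ignore index by the pad id so the labels can be decoded
# (same effect as np.where(labels != -100, labels, pad_token_id))
decodable_labels = [[PAD_ID if tok == IGNORE_INDEX else tok for tok in seq] for seq in labels]

# step 2: generated length = number of non-pad tokens in each prediction
prediction_lens = [sum(tok != PAD_ID for tok in seq) for seq in preds]
gen_len = round(mean(prediction_lens), 4)

print(decodable_labels)
print(prediction_lens, gen_len)
```

Note also that `postprocess_text` wraps each label in its own list, because the metric accepts several reference translations per prediction.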


Let us initialize the evaluation object.

In [5]:
%run wolof-translate/wolof_translate/utils/evaluation.py
evaluation = TranslationEvaluation(tokenizer)


### Searching for the best parameters 🕖

In [6]:
from wolof_translate.models.transformers.optimization import TransformerScheduler
from wolof_translate.trainers.transformer_trainer import ModelRunner
from wolof_translate.utils.evaluation import TranslationEvaluation
from wolof_translate.models.transformers.main import Transformer
from wolof_translate.utils.split_with_valid import split_data


-------------

### --- Wandb v5

In [10]:
# let us initialize the hyperparameter configuration 
config = {
    'random_state': 0,
    'fr_char_p': 0.9833729160799256,
    'fr_word_p': 0.8864155155342781,
    'learning_rate': 0.005124817704092204,
    'weight_decay': 0.011812072968025469,
    'batch_size': 8,
    'warmup_ratio': 0.0,
    'max_epoch': 200,
    'max_len': 95,
    'end_mark': 4,
    'bleu': 1.3977,
    'model_dir': 'data/checkpoints/fw_t5_small_custom_train_v5_checkpoints/',
    'new_model_dir': 'data/checkpoints/t5_small_custom_train_results_fw_v5/'
}

# Initialize the model name
model_name = 't5-small'

# import the model with its pre-trained weights
model = T5ForConditionalGeneration.from_pretrained(model_name)

# resize the token embeddings
model.resize_token_embeddings(len(tokenizer))

# let us initialize the evaluation class
evaluation = TranslationEvaluation(tokenizer)

# let us initialize the trainer
trainer = ModelRunner(model, seed = 0, version = 5, evaluation = evaluation, optimizer=Adafactor)

# split the data
split_data(config['random_state'], csv_file = "ad_sentences.csv")

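# load the train and validation sets
# (note: max_len is hard-coded to 51 here rather than read from config['max_len'], which is 95)
train_dataset, test_dataset = recuperate_datasets(config['fr_char_p'], 
                                                    config['fr_word_p'], 51,
                                                    config['end_mark'])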

# let us calculate the appropriate warmup steps (let us take a max epoch of 100)
# length = len(train_dataset)

# n_steps = length // config['batch_size']

# num_steps = config['max_epoch'] * n_steps

# warmup_steps = (config['max_epoch'] * n_steps) * config['warmup_ratio']

# # Initialize the scheduler parameters
# scheduler_args = {'num_warmup_steps': warmup_steps, 'num_training_steps': num_steps}

# Initialize the optimizer parameters
optimizer_args = {
    'lr': config['learning_rate'],
    'weight_decay': config['weight_decay'],
    # 'betas': (0.9, 0.98),
    'relative_step': False
}

# Initialize the loaders parameters
train_loader_args = {'batch_size': config['batch_size']}

# Add the datasets and hyperparameters to trainer
trainer.compile(train_dataset, test_dataset, tokenizer, train_loader_args,
                optimizer_kwargs = optimizer_args,
                # lr_scheduler=get_linear_schedule_with_warmup,
                # lr_scheduler_kwargs=scheduler_args, 
                predict_with_generate = True,
                hugging_face = True,
                logging_dir="data/logs/t5_small_custom_train_fw"
                )

# We resume training from checkpoints, so let us load the model
# trainer.load(config['model_dir'], load_best=True) # Only for the first loading
trainer.load(config['new_model_dir'], load_best=True)
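
The commented-out scheduler code in the cell above derives the number of warmup steps from the dataset size, the batch size, the number of epochs and the warmup ratio. The sketch below reproduces that arithmetic in a standalone function. The name `warmup_schedule` and the dataset size of 1512 are assumptions: the size is only chosen so that the step count matches the 189 training batches per epoch reported in the training output.

```python
def warmup_schedule(n_examples: int, batch_size: int, max_epoch: int, warmup_ratio: float):
    """Return (total training steps, warmup steps) for a warmup scheduler."""
    # floor division, as in the commented-out code: a final partial batch is not counted
    steps_per_epoch = n_examples // batch_size
    num_training_steps = max_epoch * steps_per_epoch
    num_warmup_steps = round(num_training_steps * warmup_ratio)
    return num_training_steps, num_warmup_steps


# hypothetical dataset size; batch size, epochs and ratio as in `config`
total, warmup = warmup_schedule(n_examples=1512, batch_size=8, max_epoch=200, warmup_ratio=0.0)
print(total, warmup)

# a non-zero ratio, for comparison only (hypothetical value)
total_10, warmup_10 = warmup_schedule(1512, 8, 200, 0.1)
print(total_10, warmup_10)
```

With the selected `warmup_ratio` of 0.0 there is no warmup phase at all, which is consistent with the scheduler being disabled in this run.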

        

### ---

In [8]:
trainer.train(epochs = config['max_epoch'] - trainer.current_epoch, auto_save=True, metric_for_best_model='bleu', metric_objective='maximize', log_step=1,
              saving_directory = config['new_model_dir'])

  0%|          | 0/197 [00:00<?, ?it/s]

For epoch 4: 


Train batch number 189: 100%|██████████| 189/189 [00:49<00:00,  3.82batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.4865215807837784, 'test_loss': 0.7882578237490221, 'bleu': 0.9016, 'gen_len': 7.8512}




  1%|          | 1/197 [00:58<3:11:01, 58.48s/it]

For epoch 5: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.39433138884564556, 'test_loss': 0.8588387397202578, 'bleu': 0.6311, 'gen_len': 6.7381}




  1%|          | 2/197 [01:50<2:57:10, 54.52s/it]

For epoch 6: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.32batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.3199326772853811, 'test_loss': 0.8707032555883581, 'bleu': 0.6809, 'gen_len': 7.6607}




  2%|▏         | 3/197 [02:41<2:51:39, 53.09s/it]

For epoch 7: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.26351767289575445, 'test_loss': 0.8776449317281897, 'bleu': 0.5299, 'gen_len': 7.3274}




  2%|▏         | 4/197 [03:33<2:49:44, 52.77s/it]

For epoch 8: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.10batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.21915919554454308, 'test_loss': 0.9185130216858604, 'bleu': 2.1151, 'gen_len': 11.9048}




  3%|▎         | 5/197 [04:29<2:51:59, 53.75s/it]

For epoch 9: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.15batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.18694558306030495, 'test_loss': 0.9460377449339087, 'bleu': 2.4888, 'gen_len': 8.8512}




  3%|▎         | 6/197 [05:24<2:52:38, 54.23s/it]

For epoch 10: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.12batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.16209907411898256, 'test_loss': 0.9953543056141246, 'bleu': 0.9885, 'gen_len': 7.4762}




  4%|▎         | 7/197 [06:18<2:51:27, 54.15s/it]

For epoch 11: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.13945165083364205, 'test_loss': 1.0150363391095942, 'bleu': 2.2402, 'gen_len': 7.6726}




  4%|▍         | 8/197 [07:13<2:51:48, 54.54s/it]

For epoch 12: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.07batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.12792153906806436, 'test_loss': 0.9709472439505837, 'bleu': 3.5373, 'gen_len': 9.25}




  5%|▍         | 9/197 [08:10<2:52:32, 55.06s/it]

For epoch 13: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.09batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.11206897961163016, 'test_loss': 0.9947704347697172, 'bleu': 2.7082, 'gen_len': 7.5774}




  5%|▌         | 10/197 [09:04<2:50:37, 54.75s/it]

For epoch 14: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.10019452838355271, 'test_loss': 1.0234557010910728, 'bleu': 2.6445, 'gen_len': 8.4464}




  6%|▌         | 11/197 [09:58<2:49:33, 54.70s/it]

For epoch 15: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.10batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.07batches/s]



Metrics: {'train_loss': 0.0918272433299867, 'test_loss': 0.9993590251965956, 'bleu': 3.8211, 'gen_len': 7.8988}




  6%|▌         | 12/197 [10:55<2:50:31, 55.30s/it]

For epoch 16: 


Train batch number 189: 100%|██████████| 189/189 [00:49<00:00,  3.81batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.08363995419214011, 'test_loss': 1.0266891446980564, 'bleu': 2.4332, 'gen_len': 8.5655}




  7%|▋         | 13/197 [11:53<2:51:43, 56.00s/it]

For epoch 17: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.87batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.0778308276914888, 'test_loss': 1.0307117917320945, 'bleu': 3.5087, 'gen_len': 8.3155}




  7%|▋         | 14/197 [12:49<2:51:25, 56.20s/it]

For epoch 18: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.08batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.07338590092129177, 'test_loss': 1.0326201536438682, 'bleu': 2.2492, 'gen_len': 7.6845}




  8%|▊         | 15/197 [13:44<2:48:51, 55.67s/it]

For epoch 19: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.08batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.06746183748716715, 'test_loss': 1.0440157706087285, 'bleu': 3.6125, 'gen_len': 8.5952}




  8%|▊         | 16/197 [14:38<2:46:53, 55.32s/it]

For epoch 20: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.99batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.064381278499409, 'test_loss': 1.022717367519032, 'bleu': 3.9882, 'gen_len': 8.2202}




  9%|▊         | 17/197 [15:35<2:47:34, 55.86s/it]

For epoch 21: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.89batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.06066898434427838, 'test_loss': 1.0671358867125078, 'bleu': 2.6027, 'gen_len': 8.0}




  9%|▉         | 18/197 [16:32<2:47:17, 56.07s/it]

For epoch 22: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.06batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.05614416074579355, 'test_loss': 1.0840876319191672, 'bleu': 2.0204, 'gen_len': 7.9345}




 10%|▉         | 19/197 [17:26<2:44:51, 55.57s/it]

For epoch 23: 


Train batch number 189: 100%|██████████| 189/189 [00:51<00:00,  3.66batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.03batches/s]



Metrics: {'train_loss': 0.052967036404109824, 'test_loss': 1.06580266085538, 'bleu': 3.3651, 'gen_len': 8.3512}




 10%|█         | 20/197 [18:27<2:48:09, 57.00s/it]

For epoch 24: 


Train batch number 189: 100%|██████████| 189/189 [00:56<00:00,  3.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.26batches/s]



Metrics: {'train_loss': 0.05137106681134177, 'test_loss': 1.0852028402415188, 'bleu': 4.0046, 'gen_len': 7.9464}




 11%|█         | 21/197 [19:33<2:55:21, 59.78s/it]

For epoch 25: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.97batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.06batches/s]



Metrics: {'train_loss': 0.048710938129160136, 'test_loss': 1.054525139656934, 'bleu': 4.0749, 'gen_len': 8.25}




 11%|█         | 22/197 [20:31<2:53:05, 59.34s/it]

For epoch 26: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  4.01batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.04627667081162885, 'test_loss': 1.1047622127966448, 'bleu': 1.8308, 'gen_len': 7.6845}




 12%|█▏        | 23/197 [21:26<2:48:29, 58.10s/it]

For epoch 27: 


Train batch number 189: 100%|██████████| 189/189 [00:49<00:00,  3.85batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.04437995573417062, 'test_loss': 1.050233778628436, 'bleu': 4.0838, 'gen_len': 7.8095}




 12%|█▏        | 24/197 [22:25<2:48:13, 58.35s/it]

For epoch 28: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.05batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.04140237169882292, 'test_loss': 1.070192884315144, 'bleu': 4.295, 'gen_len': 7.9524}




 13%|█▎        | 25/197 [23:22<2:45:43, 57.81s/it]

For epoch 29: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.98batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.04121357126644364, 'test_loss': 1.0708809982646594, 'bleu': 4.1214, 'gen_len': 7.8631}




 13%|█▎        | 26/197 [24:18<2:43:00, 57.19s/it]

For epoch 30: 


Train batch number 189: 100%|██████████| 189/189 [00:57<00:00,  3.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:06<00:00,  1.83batches/s]



Metrics: {'train_loss': 0.03893457076182126, 'test_loss': 1.1135996688495984, 'bleu': 3.7075, 'gen_len': 7.8393}




 14%|█▎        | 27/197 [25:24<2:49:46, 59.92s/it]

For epoch 31: 


Train batch number 189: 100%|██████████| 189/189 [00:50<00:00,  3.78batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.08batches/s]



Metrics: {'train_loss': 0.03728323184919578, 'test_loss': 1.084847393361005, 'bleu': 4.2531, 'gen_len': 7.9821}




 14%|█▍        | 28/197 [26:23<2:47:51, 59.59s/it]

For epoch 32: 


Train batch number 189: 100%|██████████| 189/189 [00:51<00:00,  3.70batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.23batches/s]



Metrics: {'train_loss': 0.03719530980935488, 'test_loss': 1.0737222270532087, 'bleu': 2.9761, 'gen_len': 8.0536}




 15%|█▍        | 29/197 [27:22<2:46:39, 59.52s/it]

For epoch 33: 


Train batch number 189: 100%|██████████| 189/189 [00:51<00:00,  3.65batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  1.91batches/s]



Metrics: {'train_loss': 0.03568398482880737, 'test_loss': 1.0778734575618396, 'bleu': 4.1338, 'gen_len': 8.1607}




 15%|█▌        | 30/197 [28:23<2:46:55, 59.97s/it]

For epoch 34: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.87batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.032914650710250334, 'test_loss': 1.08007150888443, 'bleu': 3.5579, 'gen_len': 8.2619}




 16%|█▌        | 31/197 [29:20<2:43:37, 59.14s/it]

For epoch 35: 


Train batch number 189: 100%|██████████| 189/189 [00:49<00:00,  3.79batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.03213488609559598, 'test_loss': 1.0833699215542187, 'bleu': 3.3765, 'gen_len': 8.006}




 16%|█▌        | 32/197 [30:19<2:42:12, 58.98s/it]

For epoch 36: 


Train batch number 189: 100%|██████████| 189/189 [00:53<00:00,  3.56batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.03352817001619509, 'test_loss': 1.0895872061902827, 'bleu': 4.4927, 'gen_len': 7.9881}




 17%|█▋        | 33/197 [31:22<2:44:46, 60.28s/it]

For epoch 37: 


Train batch number 189: 100%|██████████| 189/189 [00:52<00:00,  3.62batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  1.98batches/s]



Metrics: {'train_loss': 0.029258279203777274, 'test_loss': 1.0928312242031097, 'bleu': 3.9524, 'gen_len': 8.1429}




 17%|█▋        | 34/197 [32:24<2:44:46, 60.65s/it]

For epoch 38: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.93batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.029178353659710082, 'test_loss': 1.0975456075234846, 'bleu': 4.6547, 'gen_len': 8.2202}




 18%|█▊        | 35/197 [33:22<2:41:43, 59.90s/it]

For epoch 39: 


Train batch number 189: 100%|██████████| 189/189 [00:53<00:00,  3.51batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  1.97batches/s]



Metrics: {'train_loss': 0.03039971282754941, 'test_loss': 1.0875938209620388, 'bleu': 2.9031, 'gen_len': 8.131}




 18%|█▊        | 36/197 [34:25<2:42:56, 60.73s/it]

For epoch 40: 


Train batch number 189: 100%|██████████| 189/189 [00:56<00:00,  3.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  1.90batches/s]



Metrics: {'train_loss': 0.03054027190855729, 'test_loss': 1.1209385611794211, 'bleu': 4.3653, 'gen_len': 8.4583}




 19%|█▉        | 37/197 [35:30<2:45:43, 62.15s/it]

For epoch 41: 


Train batch number 189: 100%|██████████| 189/189 [00:54<00:00,  3.44batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.029628932589379254, 'test_loss': 1.100514915856448, 'bleu': 4.5645, 'gen_len': 8.4226}




 19%|█▉        | 38/197 [36:33<2:45:33, 62.48s/it]

For epoch 42: 


Train batch number 189: 100%|██████████| 189/189 [00:52<00:00,  3.60batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.026622804714534334, 'test_loss': 1.1147295507517727, 'bleu': 4.874, 'gen_len': 8.2679}




 20%|█▉        | 39/197 [37:35<2:44:17, 62.39s/it]

For epoch 43: 


Train batch number 189: 100%|██████████| 189/189 [00:50<00:00,  3.71batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:06<00:00,  1.61batches/s]



Metrics: {'train_loss': 0.025176344136337913, 'test_loss': 1.1015970002521167, 'bleu': 3.3069, 'gen_len': 8.1012}




 20%|██        | 40/197 [38:37<2:42:22, 62.05s/it]

For epoch 44: 


Train batch number 189: 100%|██████████| 189/189 [00:58<00:00,  3.23batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.026173302318615022, 'test_loss': 1.1020840352231807, 'bleu': 2.9903, 'gen_len': 7.8512}




 21%|██        | 41/197 [39:44<2:45:29, 63.65s/it]

For epoch 45: 


Train batch number 189: 100%|██████████| 189/189 [00:57<00:00,  3.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:07<00:00,  1.54batches/s]



Metrics: {'train_loss': 0.02622906128486628, 'test_loss': 1.0957561352036216, 'bleu': 4.5633, 'gen_len': 8.2083}




 21%|██▏       | 42/197 [40:53<2:48:09, 65.09s/it]

For epoch 46: 


Train batch number 189: 100%|██████████| 189/189 [00:58<00:00,  3.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.09batches/s]



Metrics: {'train_loss': 0.024751166368356694, 'test_loss': 1.0906914418393916, 'bleu': 4.4558, 'gen_len': 8.0298}




 22%|██▏       | 43/197 [42:01<2:49:19, 65.97s/it]

For epoch 47: 


Train batch number 189: 100%|██████████| 189/189 [00:49<00:00,  3.78batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.023978951087753688, 'test_loss': 1.1311543529683894, 'bleu': 3.7091, 'gen_len': 8.1488}




 22%|██▏       | 44/197 [42:59<2:42:31, 63.74s/it]

For epoch 48: 


Train batch number 189: 100%|██████████| 189/189 [00:52<00:00,  3.59batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.07batches/s]



Metrics: {'train_loss': 0.023312074746985836, 'test_loss': 1.0894582163203845, 'bleu': 4.1818, 'gen_len': 8.125}




 23%|██▎       | 45/197 [44:00<2:39:35, 63.00s/it]

For epoch 49: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.89batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.02500311371490911, 'test_loss': 1.0904016494750977, 'bleu': 3.7671, 'gen_len': 8.1488}




 23%|██▎       | 46/197 [44:57<2:33:40, 61.06s/it]

For epoch 50: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.023132516270769492, 'test_loss': 1.1020000143484636, 'bleu': 5.6443, 'gen_len': 8.4048}




 24%|██▍       | 47/197 [45:53<2:29:15, 59.70s/it]

For epoch 51: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.12batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.022710416656327507, 'test_loss': 1.1089196638627485, 'bleu': 5.2884, 'gen_len': 8.7381}




 24%|██▍       | 48/197 [46:47<2:23:59, 57.98s/it]

For epoch 52: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.90batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.021734779116712392, 'test_loss': 1.1245777606964111, 'bleu': 3.7441, 'gen_len': 8.875}




 25%|██▍       | 49/197 [47:44<2:22:00, 57.57s/it]

For epoch 53: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.06batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.02245104450500871, 'test_loss': 1.096148203719746, 'bleu': 4.3601, 'gen_len': 8.4286}




 25%|██▌       | 50/197 [48:38<2:18:43, 56.62s/it]

For epoch 54: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.09batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.020777449763266655, 'test_loss': 1.100434346632524, 'bleu': 5.3137, 'gen_len': 8.1726}




 26%|██▌       | 51/197 [49:33<2:15:56, 55.87s/it]

For epoch 55: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.05batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.020064042175247793, 'test_loss': 1.1260934201153843, 'bleu': 4.6608, 'gen_len': 8.0833}




 26%|██▋       | 52/197 [50:27<2:14:00, 55.45s/it]

For epoch 56: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.97batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.020321137946907174, 'test_loss': 1.0778048499064012, 'bleu': 4.6247, 'gen_len': 8.4107}




 27%|██▋       | 53/197 [51:23<2:13:12, 55.50s/it]

For epoch 57: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.90batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.14batches/s]



Metrics: {'train_loss': 0.019423651383332317, 'test_loss': 1.082686272534457, 'bleu': 3.9432, 'gen_len': 7.756}




 27%|██▋       | 54/197 [52:20<2:13:22, 55.96s/it]

For epoch 58: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.16batches/s]



Metrics: {'train_loss': 0.01885003463990947, 'test_loss': 1.1108581152829258, 'bleu': 5.2247, 'gen_len': 8.3155}




 28%|██▊       | 55/197 [53:15<2:11:53, 55.73s/it]

For epoch 59: 


Train batch number 189: 100%|██████████| 189/189 [00:48<00:00,  3.93batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.12batches/s]



Metrics: {'train_loss': 0.01809993702561038, 'test_loss': 1.0857226144183765, 'bleu': 4.4345, 'gen_len': 7.994}




 28%|██▊       | 56/197 [54:12<2:11:46, 56.07s/it]

For epoch 60: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.14batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.017184592389548444, 'test_loss': 1.0914136929945513, 'bleu': 4.4862, 'gen_len': 7.9345}




 29%|██▉       | 57/197 [55:06<2:09:16, 55.41s/it]

For epoch 61: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  4.00batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.26batches/s]



Metrics: {'train_loss': 0.01741034778391834, 'test_loss': 1.0916032276370309, 'bleu': 3.1792, 'gen_len': 7.9226}




 29%|██▉       | 58/197 [56:01<2:08:16, 55.37s/it]

For epoch 62: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.22batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.016615732208327957, 'test_loss': 1.095419471914118, 'bleu': 6.0793, 'gen_len': 8.5952}




 30%|██▉       | 59/197 [56:55<2:06:33, 55.02s/it]

For epoch 63: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.22batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.016856069635001143, 'test_loss': 1.081476628780365, 'bleu': 7.2895, 'gen_len': 8.6726}




 30%|███       | 60/197 [57:49<2:05:06, 54.79s/it]

For epoch 64: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.33batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.01658658145602901, 'test_loss': 1.09098359671506, 'bleu': 4.6036, 'gen_len': 8.0833}




 31%|███       | 61/197 [58:40<2:01:42, 53.70s/it]

For epoch 65: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.01734132795574922, 'test_loss': 1.1055920991030606, 'bleu': 3.4974, 'gen_len': 8.0714}




 31%|███▏      | 62/197 [59:32<1:59:41, 53.20s/it]

For epoch 66: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.017244715252201313, 'test_loss': 1.1122769875959917, 'bleu': 5.4051, 'gen_len': 8.1607}




 32%|███▏      | 63/197 [1:00:24<1:57:32, 52.63s/it]

For epoch 67: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.17batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.01745748186604706, 'test_loss': 1.0961992794817144, 'bleu': 4.7244, 'gen_len': 8.3452}




 32%|███▏      | 64/197 [1:01:17<1:56:49, 52.70s/it]

For epoch 68: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.12batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.017213458696715336, 'test_loss': 1.1196081204847856, 'bleu': 3.1322, 'gen_len': 7.8333}




 33%|███▎      | 65/197 [1:02:10<1:56:31, 52.97s/it]

For epoch 69: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.017903524879661816, 'test_loss': 1.1062012748284773, 'bleu': 5.2005, 'gen_len': 7.9167}




 34%|███▎      | 66/197 [1:03:02<1:54:47, 52.57s/it]

For epoch 70: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.23batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.017470410489723597, 'test_loss': 1.1109233173457058, 'bleu': 4.5726, 'gen_len': 8.4345}




 34%|███▍      | 67/197 [1:03:54<1:53:43, 52.49s/it]

For epoch 71: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.017619338737204473, 'test_loss': 1.1106605692343279, 'bleu': 6.1146, 'gen_len': 8.1845}




 35%|███▍      | 68/197 [1:04:46<1:52:33, 52.36s/it]

For epoch 72: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.018787335119057426, 'test_loss': 1.0978737527673894, 'bleu': 4.9577, 'gen_len': 7.8274}




 35%|███▌      | 69/197 [1:05:38<1:51:26, 52.24s/it]

For epoch 73: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.017951155520566597, 'test_loss': 1.1160184361717918, 'bleu': 5.0452, 'gen_len': 7.6071}




 36%|███▌      | 70/197 [1:06:31<1:50:52, 52.38s/it]

For epoch 74: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.24batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.01710547921346333, 'test_loss': 1.1086897199804133, 'bleu': 5.6718, 'gen_len': 8.3571}




 36%|███▌      | 71/197 [1:07:23<1:49:57, 52.36s/it]

For epoch 75: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.01679115932774804, 'test_loss': 1.1366045800122349, 'bleu': 6.3618, 'gen_len': 7.7857}




 37%|███▋      | 72/197 [1:08:16<1:49:08, 52.39s/it]

For epoch 76: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.03batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.015993746925174954, 'test_loss': 1.1114322272214023, 'bleu': 4.598, 'gen_len': 8.1845}




 37%|███▋      | 73/197 [1:09:10<1:49:41, 53.08s/it]

For epoch 77: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.015637696238367686, 'test_loss': 1.12073092027144, 'bleu': 5.4114, 'gen_len': 8.4167}




 38%|███▊      | 74/197 [1:10:02<1:48:13, 52.79s/it]

For epoch 78: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.28batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.016182411514424656, 'test_loss': 1.1460975462740117, 'bleu': 5.4561, 'gen_len': 8.25}




 38%|███▊      | 75/197 [1:10:54<1:46:46, 52.51s/it]

For epoch 79: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.02batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.014889138445834674, 'test_loss': 1.1057139743458142, 'bleu': 3.7569, 'gen_len': 8.0952}




 39%|███▊      | 76/197 [1:11:49<1:47:24, 53.26s/it]

For epoch 80: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.015086171443009916, 'test_loss': 1.0782040845264087, 'bleu': 4.3569, 'gen_len': 7.9286}




 39%|███▉      | 77/197 [1:12:42<1:45:56, 52.97s/it]

For epoch 81: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.21batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.014168701031449788, 'test_loss': 1.1118932637301358, 'bleu': 4.6854, 'gen_len': 8.1905}




 40%|███▉      | 78/197 [1:13:34<1:44:54, 52.89s/it]

For epoch 82: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.16batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.015049237631342655, 'test_loss': 1.120302834294059, 'bleu': 4.1979, 'gen_len': 8.4107}




 40%|████      | 79/197 [1:14:28<1:44:15, 53.01s/it]

For epoch 83: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.015341018874163705, 'test_loss': 1.1220990419387817, 'bleu': 4.606, 'gen_len': 8.0774}




 41%|████      | 80/197 [1:15:20<1:42:51, 52.75s/it]

For epoch 84: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.01509010486465524, 'test_loss': 1.100352942943573, 'bleu': 6.2273, 'gen_len': 8.5119}




 41%|████      | 81/197 [1:16:12<1:41:36, 52.56s/it]

For epoch 85: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.01488489586436165, 'test_loss': 1.1063287203962153, 'bleu': 5.0181, 'gen_len': 8.5}




 42%|████▏     | 82/197 [1:17:05<1:40:56, 52.66s/it]

For epoch 86: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.20batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.015173925204503119, 'test_loss': 1.109544190493497, 'bleu': 5.4111, 'gen_len': 8.4702}




 42%|████▏     | 83/197 [1:17:58<1:40:09, 52.72s/it]

For epoch 87: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.013889281777665019, 'test_loss': 1.097086169502952, 'bleu': 4.4662, 'gen_len': 8.1667}




 43%|████▎     | 84/197 [1:18:50<1:38:55, 52.52s/it]

For epoch 88: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.012413388074723088, 'test_loss': 1.0910359241745688, 'bleu': 6.3921, 'gen_len': 8.4643}




 43%|████▎     | 85/197 [1:19:41<1:37:37, 52.30s/it]

For epoch 89: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.24batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.011958136644107215, 'test_loss': 1.0941044200550427, 'bleu': 5.1872, 'gen_len': 8.494}




 44%|████▎     | 86/197 [1:20:33<1:36:34, 52.21s/it]

For epoch 90: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.01190856864174493, 'test_loss': 1.1092033277858386, 'bleu': 4.9437, 'gen_len': 8.4762}




 44%|████▍     | 87/197 [1:21:26<1:36:04, 52.40s/it]

For epoch 91: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.09batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.012541833406969629, 'test_loss': 1.1182849353010005, 'bleu': 3.3944, 'gen_len': 7.9048}




 45%|████▍     | 88/197 [1:22:20<1:35:56, 52.81s/it]

For epoch 92: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.23batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.012816770014564985, 'test_loss': 1.1166085519573905, 'bleu': 4.1936, 'gen_len': 8.2083}




 45%|████▌     | 89/197 [1:23:13<1:34:53, 52.72s/it]

For epoch 93: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.07batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.011543021424643932, 'test_loss': 1.114879391410134, 'bleu': 3.1274, 'gen_len': 8.0893}




 46%|████▌     | 90/197 [1:24:07<1:34:54, 53.22s/it]

For epoch 94: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.96batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.012574935264185702, 'test_loss': 1.0825500217351047, 'bleu': 4.0552, 'gen_len': 8.3214}




 46%|████▌     | 91/197 [1:25:03<1:35:23, 53.99s/it]

For epoch 95: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.012746560746414597, 'test_loss': 1.1065589026971296, 'bleu': 4.2357, 'gen_len': 7.8452}




 47%|████▋     | 92/197 [1:25:57<1:34:51, 54.20s/it]

For epoch 96: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.05batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.016475186026014544, 'test_loss': 1.0940866036848589, 'bleu': 3.8456, 'gen_len': 8.0476}




 47%|████▋     | 93/197 [1:26:52<1:34:06, 54.29s/it]

For epoch 97: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.09batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.0115835652908399, 'test_loss': 1.0838135562159799, 'bleu': 4.6009, 'gen_len': 8.1548}




 48%|████▊     | 94/197 [1:27:46<1:33:07, 54.25s/it]

For epoch 98: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.011350376294647397, 'test_loss': 1.1121427552266554, 'bleu': 4.8723, 'gen_len': 8.2083}




 48%|████▊     | 95/197 [1:28:38<1:31:14, 53.67s/it]

For epoch 99: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.010132218291189968, 'test_loss': 1.104966857216575, 'bleu': 4.0791, 'gen_len': 8.0774}




 49%|████▊     | 96/197 [1:29:30<1:29:29, 53.16s/it]

For epoch 100: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.009769761051136321, 'test_loss': 1.1077959754250266, 'bleu': 5.1988, 'gen_len': 8.0298}




 49%|████▉     | 97/197 [1:30:23<1:28:25, 53.05s/it]

For epoch 101: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.16batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.01013394311811435, 'test_loss': 1.1200078346512534, 'bleu': 4.9631, 'gen_len': 7.9226}




 50%|████▉     | 98/197 [1:31:16<1:27:35, 53.09s/it]

For epoch 102: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.07batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.009777488409622854, 'test_loss': 1.1171191063794224, 'bleu': 5.6303, 'gen_len': 8.2798}




 50%|█████     | 99/197 [1:32:11<1:27:14, 53.42s/it]

For epoch 103: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.009428129560700937, 'test_loss': 1.1147884726524353, 'bleu': 5.5233, 'gen_len': 8.2202}




 51%|█████     | 100/197 [1:33:03<1:25:44, 53.03s/it]

For epoch 104: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.009483898013415032, 'test_loss': 1.1011991175738247, 'bleu': 5.3497, 'gen_len': 8.0476}




 51%|█████▏    | 101/197 [1:33:56<1:24:55, 53.08s/it]

For epoch 105: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.010557355320083579, 'test_loss': 1.1080025434494019, 'bleu': 4.8678, 'gen_len': 8.131}




 52%|█████▏    | 102/197 [1:34:48<1:23:29, 52.74s/it]

For epoch 106: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.22batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.009950684627275372, 'test_loss': 1.083040565252304, 'bleu': 3.1768, 'gen_len': 8.1964}




 52%|█████▏    | 103/197 [1:35:40<1:22:27, 52.64s/it]

For epoch 107: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.15batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.01014897333015811, 'test_loss': 1.0995363051241094, 'bleu': 4.6456, 'gen_len': 8.2321}




 53%|█████▎    | 104/197 [1:36:33<1:21:52, 52.82s/it]

For epoch 108: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.009883167066459825, 'test_loss': 1.0814814973961224, 'bleu': 4.46, 'gen_len': 7.875}




 53%|█████▎    | 105/197 [1:37:26<1:20:49, 52.71s/it]

For epoch 109: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.009738326489539798, 'test_loss': 1.0818634764714674, 'bleu': 4.8738, 'gen_len': 8.2143}




 54%|█████▍    | 106/197 [1:38:18<1:19:41, 52.54s/it]

For epoch 110: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.20batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.00964794449362517, 'test_loss': 1.0859762348911979, 'bleu': 5.113, 'gen_len': 8.3095}




 54%|█████▍    | 107/197 [1:39:11<1:18:52, 52.58s/it]

For epoch 111: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.17batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.010849866368478317, 'test_loss': 1.0842477652159603, 'bleu': 4.6461, 'gen_len': 8.506}




 55%|█████▍    | 108/197 [1:40:04<1:18:11, 52.71s/it]

For epoch 112: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.23batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.00899543264550009, 'test_loss': 1.087445698001168, 'bleu': 4.637, 'gen_len': 7.9702}




 55%|█████▌    | 109/197 [1:40:56<1:17:08, 52.59s/it]

For epoch 113: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.008902738764484797, 'test_loss': 1.0989827378229662, 'bleu': 3.2321, 'gen_len': 8.2857}




 56%|█████▌    | 110/197 [1:41:51<1:17:09, 53.21s/it]

For epoch 114: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.009288180534476562, 'test_loss': 1.081053544174541, 'bleu': 3.4161, 'gen_len': 8.2321}




 56%|█████▋    | 111/197 [1:42:42<1:15:35, 52.73s/it]

For epoch 115: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008435305172550398, 'test_loss': 1.1147497567263516, 'bleu': 3.4686, 'gen_len': 7.9405}




 57%|█████▋    | 112/197 [1:43:34<1:14:19, 52.46s/it]

For epoch 116: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.009226941357603267, 'test_loss': 1.0867530486800454, 'bleu': 4.7533, 'gen_len': 8.2738}




 57%|█████▋    | 113/197 [1:44:25<1:12:54, 52.08s/it]

For epoch 117: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.009442877590781934, 'test_loss': 1.0838626325130463, 'bleu': 3.209, 'gen_len': 7.6905}




 58%|█████▊    | 114/197 [1:45:17<1:11:45, 51.87s/it]

For epoch 118: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.33batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.00944026885092229, 'test_loss': 1.0941214615648442, 'bleu': 5.0761, 'gen_len': 7.9405}




 58%|█████▊    | 115/197 [1:46:08<1:10:43, 51.74s/it]

For epoch 119: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.00975100193391357, 'test_loss': 1.091021944176067, 'bleu': 3.1505, 'gen_len': 8.0298}




 59%|█████▉    | 116/197 [1:47:00<1:09:51, 51.75s/it]

For epoch 120: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.28batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009409198737807977, 'test_loss': 1.0904634269801052, 'bleu': 3.135, 'gen_len': 8.0952}




 59%|█████▉    | 117/197 [1:47:52<1:09:02, 51.78s/it]

For epoch 121: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009158109311736927, 'test_loss': 1.1177941181442954, 'bleu': 3.9092, 'gen_len': 7.7143}




 60%|█████▉    | 118/197 [1:48:44<1:08:19, 51.89s/it]

For epoch 122: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.00964433032244434, 'test_loss': 1.0704427794976668, 'bleu': 6.886, 'gen_len': 8.2321}




 60%|██████    | 119/197 [1:49:35<1:07:11, 51.69s/it]

For epoch 123: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.008604984852287022, 'test_loss': 1.0859644142064182, 'bleu': 5.3035, 'gen_len': 8.1369}




 61%|██████    | 120/197 [1:50:26<1:05:56, 51.38s/it]

For epoch 124: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.008000439224319986, 'test_loss': 1.1275532354008069, 'bleu': 3.5882, 'gen_len': 8.0595}




 61%|██████▏   | 121/197 [1:51:18<1:05:16, 51.54s/it]

For epoch 125: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.15batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.00907555095439964, 'test_loss': 1.100295987996188, 'bleu': 4.3344, 'gen_len': 8.5417}




 62%|██████▏   | 122/197 [1:52:11<1:05:03, 52.05s/it]

For epoch 126: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008869150149169077, 'test_loss': 1.0949677337299695, 'bleu': 3.1648, 'gen_len': 7.9762}




 62%|██████▏   | 123/197 [1:53:03<1:04:01, 51.91s/it]

For epoch 127: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.35batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.009493687287590185, 'test_loss': 1.1044626235961914, 'bleu': 4.8179, 'gen_len': 8.4821}




 63%|██████▎   | 124/197 [1:53:54<1:02:49, 51.64s/it]

For epoch 128: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.009979590750946146, 'test_loss': 1.0965562842108987, 'bleu': 5.8993, 'gen_len': 8.0655}




 63%|██████▎   | 125/197 [1:54:45<1:01:56, 51.61s/it]

For epoch 129: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.24batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.008858120917538476, 'test_loss': 1.102477485483343, 'bleu': 4.5413, 'gen_len': 8.2024}




 64%|██████▍   | 126/197 [1:55:37<1:01:15, 51.77s/it]

For epoch 130: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.04batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.009401722408482489, 'test_loss': 1.1187654679471797, 'bleu': 3.7305, 'gen_len': 8.0774}




 64%|██████▍   | 127/197 [1:56:31<1:01:12, 52.47s/it]

For epoch 131: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.10batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.008991013398001626, 'test_loss': 1.0939083857969805, 'bleu': 5.1083, 'gen_len': 8.125}




 65%|██████▍   | 128/197 [1:57:25<1:00:47, 52.86s/it]

For epoch 132: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.15batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.008011354473801378, 'test_loss': 1.108348391272805, 'bleu': 5.226, 'gen_len': 7.9702}




 65%|██████▌   | 129/197 [1:58:19<1:00:06, 53.03s/it]

For epoch 133: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009607610513706243, 'test_loss': 1.1019155030900782, 'bleu': 5.4631, 'gen_len': 7.9702}




 66%|██████▌   | 130/197 [1:59:09<58:28, 52.36s/it]

For epoch 134: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.35batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.009079897123026766, 'test_loss': 1.1112150847911835, 'bleu': 5.357, 'gen_len': 7.9524}




 66%|██████▋   | 131/197 [2:00:01<57:11, 51.99s/it]

For epoch 135: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.008983160463339161, 'test_loss': 1.114028350873427, 'bleu': 5.2526, 'gen_len': 8.0298}




 67%|██████▋   | 132/197 [2:00:52<56:15, 51.93s/it]

For epoch 136: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.16batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.22batches/s]



Metrics: {'train_loss': 0.008702422747904035, 'test_loss': 1.1064810725775631, 'bleu': 5.1511, 'gen_len': 7.8869}




 68%|██████▊   | 133/197 [2:01:46<55:58, 52.48s/it]

For epoch 137: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009391933943363025, 'test_loss': 1.1090635819868608, 'bleu': 6.0912, 'gen_len': 8.125}




 68%|██████▊   | 134/197 [2:02:38<54:49, 52.21s/it]

For epoch 138: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.28batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.01028403740309403, 'test_loss': 1.1018691875717856, 'bleu': 5.7798, 'gen_len': 8.2619}




 69%|██████▊   | 135/197 [2:03:29<53:43, 51.99s/it]

For epoch 139: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.32batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009289555785784538, 'test_loss': 1.1116306700489738, 'bleu': 5.9719, 'gen_len': 8.4167}




 69%|██████▉   | 136/197 [2:04:20<52:39, 51.80s/it]

For epoch 140: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.34batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.009007064868614727, 'test_loss': 1.1112638061696833, 'bleu': 5.1303, 'gen_len': 8.3393}




 70%|██████▉   | 137/197 [2:05:12<51:35, 51.60s/it]

For epoch 141: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.39batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.31batches/s]



Metrics: {'train_loss': 0.008924173639835322, 'test_loss': 1.1091120080514387, 'bleu': 5.3103, 'gen_len': 8.0298}




 70%|███████   | 138/197 [2:06:03<50:36, 51.46s/it]

For epoch 142: 


Train batch number 189: 100%|██████████| 189/189 [00:47<00:00,  3.99batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.009408779687229077, 'test_loss': 1.1072132424874739, 'bleu': 4.6013, 'gen_len': 8.0238}




 71%|███████   | 139/197 [2:06:58<50:43, 52.48s/it]

For epoch 143: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008634134705293008, 'test_loss': 1.1179079575972124, 'bleu': 5.4615, 'gen_len': 7.9702}




 71%|███████   | 140/197 [2:07:49<49:38, 52.25s/it]

For epoch 144: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.008468191701650592, 'test_loss': 1.097485688599673, 'bleu': 4.1603, 'gen_len': 8.0357}




 72%|███████▏  | 141/197 [2:08:42<48:53, 52.39s/it]

For epoch 145: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.19batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.008348031142352656, 'test_loss': 1.1134050759402188, 'bleu': 4.9708, 'gen_len': 8.1488}




 72%|███████▏  | 142/197 [2:09:35<48:10, 52.55s/it]

For epoch 146: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.34batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.00795322357143045, 'test_loss': 1.0936239361763, 'bleu': 6.2745, 'gen_len': 8.2024}




 73%|███████▎  | 143/197 [2:10:26<47:00, 52.24s/it]

For epoch 147: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.009829451209895076, 'test_loss': 1.1035405749624425, 'bleu': 5.5349, 'gen_len': 8.381}




 73%|███████▎  | 144/197 [2:11:18<45:54, 51.97s/it]

For epoch 148: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.03batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.008939962510743903, 'test_loss': 1.0983082001859492, 'bleu': 4.2729, 'gen_len': 8.4226}




 74%|███████▎  | 145/197 [2:12:12<45:41, 52.72s/it]

For epoch 149: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009340744947265616, 'test_loss': 1.0786126770756461, 'bleu': 4.9084, 'gen_len': 8.0952}




 74%|███████▍  | 146/197 [2:13:04<44:34, 52.44s/it]

For epoch 150: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.00888325632298821, 'test_loss': 1.094839702952992, 'bleu': 5.3281, 'gen_len': 8.4107}




 75%|███████▍  | 147/197 [2:13:56<43:33, 52.27s/it]

For epoch 151: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008175928708205613, 'test_loss': 1.1156262105161494, 'bleu': 5.9914, 'gen_len': 7.9048}




 75%|███████▌  | 148/197 [2:14:48<42:33, 52.12s/it]

For epoch 152: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.28batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.008101459637237466, 'test_loss': 1.1163916479457507, 'bleu': 5.2238, 'gen_len': 7.9464}




 76%|███████▌  | 149/197 [2:15:40<41:39, 52.06s/it]

For epoch 153: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.008493163741756385, 'test_loss': 1.12041128765453, 'bleu': 5.5938, 'gen_len': 8.0179}




 76%|███████▌  | 150/197 [2:16:32<40:45, 52.03s/it]

For epoch 154: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.008270321598737846, 'test_loss': 1.1049348711967468, 'bleu': 5.9196, 'gen_len': 8.1369}




 77%|███████▋  | 151/197 [2:17:23<39:51, 51.99s/it]

For epoch 155: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.008668307666931666, 'test_loss': 1.116010378707539, 'bleu': 5.106, 'gen_len': 8.0893}




 77%|███████▋  | 152/197 [2:18:15<38:53, 51.85s/it]

For epoch 156: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.007918194520888386, 'test_loss': 1.093119670044292, 'bleu': 5.2324, 'gen_len': 8.0595}




 78%|███████▊  | 153/197 [2:19:06<37:55, 51.71s/it]

For epoch 157: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.32batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.007936495401688344, 'test_loss': 1.0896415656263179, 'bleu': 5.0047, 'gen_len': 7.9048}




 78%|███████▊  | 154/197 [2:19:58<37:00, 51.63s/it]

For epoch 158: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.00782380920678387, 'test_loss': 1.0909886278889396, 'bleu': 5.0922, 'gen_len': 8.0357}




 79%|███████▊  | 155/197 [2:20:49<36:06, 51.58s/it]

For epoch 159: 


Train batch number 189: 100%|██████████| 189/189 [00:46<00:00,  4.11batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  2.05batches/s]



Metrics: {'train_loss': 0.007380521372185463, 'test_loss': 1.0919155478477478, 'bleu': 5.0972, 'gen_len': 8.125}




 79%|███████▉  | 156/197 [2:21:44<35:52, 52.51s/it]

For epoch 160: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.28batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.0073415253341577395, 'test_loss': 1.0958021120591597, 'bleu': 5.0783, 'gen_len': 8.25}




 80%|███████▉  | 157/197 [2:22:36<34:53, 52.34s/it]

For epoch 161: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.33batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.007859866918623941, 'test_loss': 1.076573301445354, 'bleu': 5.1803, 'gen_len': 8.1726}




 80%|████████  | 158/197 [2:23:27<33:49, 52.04s/it]

For epoch 162: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.24batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.008072080446817424, 'test_loss': 1.0719018253413113, 'bleu': 4.8812, 'gen_len': 8.131}




 81%|████████  | 159/197 [2:24:19<32:58, 52.06s/it]

For epoch 163: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.007984405946295719, 'test_loss': 1.0876490907235579, 'bleu': 3.7507, 'gen_len': 8.2143}




 81%|████████  | 160/197 [2:25:12<32:11, 52.21s/it]

For epoch 164: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.008286612874750668, 'test_loss': 1.0838718685236843, 'bleu': 5.2583, 'gen_len': 8.0238}




 82%|████████▏ | 161/197 [2:26:04<31:15, 52.11s/it]

For epoch 165: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.008375679603121834, 'test_loss': 1.0714029344645413, 'bleu': 5.3261, 'gen_len': 8.0357}




 82%|████████▏ | 162/197 [2:26:56<30:20, 52.02s/it]

For epoch 166: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.19batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.008998500518765665, 'test_loss': 1.0855977860364048, 'bleu': 5.4705, 'gen_len': 8.1369}




 83%|████████▎ | 163/197 [2:27:48<29:36, 52.26s/it]

For epoch 167: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.008796480323619791, 'test_loss': 1.0740845149213618, 'bleu': 3.9606, 'gen_len': 8.1369}




 83%|████████▎ | 164/197 [2:28:40<28:40, 52.12s/it]

For epoch 168: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.008637408501017104, 'test_loss': 1.0708449347452684, 'bleu': 5.5621, 'gen_len': 8.375}




 84%|████████▍ | 165/197 [2:29:32<27:47, 52.12s/it]

For epoch 169: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.008743385471115332, 'test_loss': 1.0891983183947476, 'bleu': 4.9615, 'gen_len': 8.1369}




 84%|████████▍ | 166/197 [2:30:24<26:52, 52.01s/it]

For epoch 170: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.21batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.007898815804442622, 'test_loss': 1.0955390279943293, 'bleu': 4.7545, 'gen_len': 8.1012}




 85%|████████▍ | 167/197 [2:31:17<26:06, 52.21s/it]

For epoch 171: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.13batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.008512226215248962, 'test_loss': 1.0757131576538086, 'bleu': 4.7203, 'gen_len': 8.0238}




 85%|████████▌ | 168/197 [2:32:10<25:26, 52.64s/it]

For epoch 172: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.30batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.00868986238461855, 'test_loss': 1.0939860506491228, 'bleu': 4.5525, 'gen_len': 8.0}




 86%|████████▌ | 169/197 [2:33:02<24:25, 52.33s/it]

For epoch 173: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.21batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.007862460187718854, 'test_loss': 1.1029833582314579, 'bleu': 4.5211, 'gen_len': 7.869}




 86%|████████▋ | 170/197 [2:33:55<23:33, 52.37s/it]

For epoch 174: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.007622448641562923, 'test_loss': 1.0917569398880005, 'bleu': 5.0259, 'gen_len': 7.9167}




 87%|████████▋ | 171/197 [2:34:46<22:36, 52.18s/it]

For epoch 175: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.008189962985669756, 'test_loss': 1.081823842091994, 'bleu': 4.7534, 'gen_len': 8.1845}




 87%|████████▋ | 172/197 [2:35:38<21:38, 51.94s/it]

For epoch 176: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.007840345202047112, 'test_loss': 1.0935182110829786, 'bleu': 5.2204, 'gen_len': 7.881}




 88%|████████▊ | 173/197 [2:36:30<20:48, 52.00s/it]

For epoch 177: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.007801484166063669, 'test_loss': 1.0842923359437422, 'bleu': 4.2559, 'gen_len': 8.244}




 88%|████████▊ | 174/197 [2:37:21<19:53, 51.87s/it]

For epoch 178: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.00753026764310783, 'test_loss': 1.0880036028948696, 'bleu': 4.879, 'gen_len': 7.9226}




 89%|████████▉ | 175/197 [2:38:13<19:01, 51.89s/it]

For epoch 179: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.36batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.007393899291788263, 'test_loss': 1.0955045169050044, 'bleu': 4.0779, 'gen_len': 7.7381}




 89%|████████▉ | 176/197 [2:39:05<18:06, 51.74s/it]

For epoch 180: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.008435590337732777, 'test_loss': 1.0926837866956538, 'bleu': 4.1132, 'gen_len': 7.8571}




 90%|████████▉ | 177/197 [2:39:57<17:16, 51.81s/it]

For epoch 181: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.008541579381916943, 'test_loss': 1.0721597969532013, 'bleu': 4.4691, 'gen_len': 8.0536}




 90%|█████████ | 178/197 [2:40:49<16:24, 51.84s/it]

For epoch 182: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.18batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:06<00:00,  1.80batches/s]



Metrics: {'train_loss': 0.008119684634688642, 'test_loss': 1.1051941188898953, 'bleu': 4.5684, 'gen_len': 7.6369}




 91%|█████████ | 179/197 [2:41:43<15:49, 52.72s/it]

For epoch 183: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.20batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.009288790994788836, 'test_loss': 1.0835455466400494, 'bleu': 5.1792, 'gen_len': 8.0714}




 91%|█████████▏| 180/197 [2:42:36<14:56, 52.72s/it]

For epoch 184: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.31batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.007949697694226251, 'test_loss': 1.0892677117477765, 'bleu': 5.3154, 'gen_len': 8.1071}




 92%|█████████▏| 181/197 [2:43:28<13:58, 52.40s/it]

For epoch 185: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.34batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.007923778982817013, 'test_loss': 1.0893177769400857, 'bleu': 4.9827, 'gen_len': 7.9048}




 92%|█████████▏| 182/197 [2:44:19<13:02, 52.15s/it]

For epoch 186: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.21batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.007970868930911704, 'test_loss': 1.0889297127723694, 'bleu': 5.4132, 'gen_len': 8.2202}




 93%|█████████▎| 183/197 [2:45:12<12:12, 52.30s/it]

For epoch 187: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.20batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.008376711726351271, 'test_loss': 1.0975332964550366, 'bleu': 5.106, 'gen_len': 8.119}




 93%|█████████▎| 184/197 [2:46:05<11:22, 52.47s/it]

For epoch 188: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.24batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.007654973412552854, 'test_loss': 1.0867230133576826, 'bleu': 5.8752, 'gen_len': 8.2381}




 94%|█████████▍| 185/197 [2:46:57<10:28, 52.40s/it]

For epoch 189: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.32batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.007768305394892142, 'test_loss': 1.0884783132509752, 'bleu': 4.9669, 'gen_len': 8.2262}




 94%|█████████▍| 186/197 [2:47:49<09:33, 52.15s/it]

For epoch 190: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.22batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.008556057916475717, 'test_loss': 1.0932203206149014, 'bleu': 5.3171, 'gen_len': 8.4405}




 95%|█████████▍| 187/197 [2:48:41<08:42, 52.28s/it]

For epoch 191: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.26batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.007583076671436035, 'test_loss': 1.1027203581549905, 'bleu': 5.1714, 'gen_len': 7.994}




 95%|█████████▌| 188/197 [2:49:33<07:49, 52.18s/it]

For epoch 192: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.29batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.00836189186047127, 'test_loss': 1.0843684890053489, 'bleu': 4.1573, 'gen_len': 8.1369}




 96%|█████████▌| 189/197 [2:50:25<06:56, 52.02s/it]

For epoch 193: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.23batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.008659992233675131, 'test_loss': 1.1083502769470215, 'bleu': 4.9173, 'gen_len': 8.1131}




 96%|█████████▋| 190/197 [2:51:17<06:05, 52.18s/it]

For epoch 194: 


Train batch number 189: 100%|██████████| 189/189 [00:45<00:00,  4.13batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.008636262038077115, 'test_loss': 1.0745770281011409, 'bleu': 5.3805, 'gen_len': 8.3095}




 97%|█████████▋| 191/197 [2:52:11<05:15, 52.52s/it]

For epoch 195: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.25batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.009713850700733481, 'test_loss': 1.0823128656907515, 'bleu': 4.4509, 'gen_len': 8.369}




 97%|█████████▋| 192/197 [2:53:03<04:22, 52.46s/it]

For epoch 196: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.27batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.00969830921496309, 'test_loss': 1.0834323980591514, 'bleu': 6.1788, 'gen_len': 8.2857}




 98%|█████████▊| 193/197 [2:53:55<03:29, 52.29s/it]

For epoch 197: 


Train batch number 189: 100%|██████████| 189/189 [00:44<00:00,  4.20batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.008006458084940387, 'test_loss': 1.1015208471905102, 'bleu': 4.1251, 'gen_len': 7.9881}




 98%|█████████▊| 194/197 [2:54:48<02:37, 52.46s/it]

For epoch 198: 


Train batch number 189: 100%|██████████| 189/189 [38:38<00:00, 12.27s/batches]   
Test batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.31batches/s]



Metrics: {'train_loss': 0.007393511888622208, 'test_loss': 1.1017396504228765, 'bleu': 4.8751, 'gen_len': 8.1429}




 99%|█████████▉| 195/197 [3:33:34<24:29, 734.76s/it]

For epoch 199: 


Train batch number 189: 100%|██████████| 189/189 [00:41<00:00,  4.53batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:05<00:00,  1.89batches/s]



Metrics: {'train_loss': 0.007830790079044173, 'test_loss': 1.083995065905831, 'bleu': 4.3594, 'gen_len': 8.494}




 99%|█████████▉| 196/197 [3:34:25<08:49, 529.47s/it]

For epoch 200: 


Train batch number 189: 100%|██████████| 189/189 [00:43<00:00,  4.37batches/s]
Test batch number 11: 100%|██████████| 11/11 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.008034040725164155, 'test_loss': 1.1120836463841526, 'bleu': 4.7986, 'gen_len': 8.0655}




100%|██████████| 197/197 [3:35:15<00:00, 65.56s/it] 
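
In the epochs shown above, the BLEU score fluctuates between roughly 3.8 and 6.2 while the training loss stays near 0.008 and the test loss near 1.09, so the last epoch is not necessarily the best one. The snippet below is a minimal sketch of how the printed log could be parsed to find the epoch with the highest BLEU. It assumes the log keeps the `For epoch N:` / `Metrics: {...}` layout shown above; `parse_metrics` is a hypothetical helper, not part of `wolof_translate`, and the three log lines are copied from the output above.

```python
import ast
import re

# Three entries copied verbatim from the training log above (epochs 188, 196, 200).
log_text = """
For epoch 188:
Metrics: {'train_loss': 0.007654973412552854, 'test_loss': 1.0867230133576826, 'bleu': 5.8752, 'gen_len': 8.2381}
For epoch 196:
Metrics: {'train_loss': 0.00969830921496309, 'test_loss': 1.0834323980591514, 'bleu': 6.1788, 'gen_len': 8.2857}
For epoch 200:
Metrics: {'train_loss': 0.008034040725164155, 'test_loss': 1.1120836463841526, 'bleu': 4.7986, 'gen_len': 8.0655}
"""


def parse_metrics(text):
    """Return a list of (epoch, metrics_dict) pairs found in the log text.

    Hypothetical helper: it relies only on the log layout printed above.
    """
    records = []
    epoch = None
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"For epoch (\d+):", line)
        if match:
            # Remember which epoch the next "Metrics:" line belongs to.
            epoch = int(match.group(1))
        elif line.startswith("Metrics:") and epoch is not None:
            # The metrics are printed as a Python dict literal.
            metrics = ast.literal_eval(line[len("Metrics:"):].strip())
            records.append((epoch, metrics))
    return records


records = parse_metrics(log_text)
best_epoch, best_metrics = max(records, key=lambda record: record[1]["bleu"])
print(best_epoch, best_metrics["bleu"])  # prints: 196 6.1788
```

Running the same function over the full log text would give the best epoch for the whole run; whether the trainer already keeps the corresponding checkpoint is not shown in this output.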


### Predictions and Evaluation

In [11]:
# let us get the test set
test_dataset = SentenceDataset("data/extractions/new_data/test_set.csv",
                               tokenizer,
                               truncation=True)

Let us evaluate the model on the test set and display the predicted sentences.

In [12]:
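# evaluate on the test set; the element at index 1 of the result
# holds the table of original sentences, translations and predictions
df_ft_to_wf = trainer.evaluate(test_dataset)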

Evaluation batch number 12: 100%|██████████| 12/12 [00:04<00:00,  2.60batches/s]


In [13]:
df_ft_to_wf[1].tail(10)

Unnamed: 0,original_sentences,translations,predictions
177,Tout autre mouton que tu vois.,Meneen xar mépp moo gis.,Bi ŋga dee dem
178,Ceci est une tasse et une assiette. On y verse...,Lii nag ëe kaas la ak palaat. Nu ciy xelli kaf...,"Ci biir nataal bi maa ngi ciy gis ay batã, bat..."
179,"Ça c'est un terrain de jeu, c'est-à-dire un tr...",Waaw lii ab fowu la maanaam stade fowu fowu la...,Waaw nataal bii nataal la boob ay nit ñu baree...
180,Quel être!,Moo di loola!,Demleen!
181,C'est moi qui ai été,Maa dem,Maa demoon
182,Il est par-là!,Mi ŋgi foofule!,Dem na ca subë.
183,Un homme n'a été,Góor demul,Demoon ba Ndar.
184,Surtout rentrez chez vous!,Te nag ŋgeen dem ñibbi!,Nataal bii ma gis
185,Surveille-moi ceux-là!,Seetal ma ñenn ñuu!,Kile la.
186,Alors celui-ci s'en alla!,Noona kooku dem!,Demleen foofu!
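
To get a rough per-sentence view of how far the predictions are from the reference translations, one option is a simple token-overlap score. The snippet below is a sketch only: `unigram_f1` is a hypothetical helper, it splits on whitespace (a simplifying assumption that keeps punctuation attached to words), and it is not the BLEU metric reported during training. The three sentence pairs are copied from the rows displayed above (indices 181, 186 and 180).

```python
from collections import Counter


def unigram_f1(reference, prediction):
    """Whitespace-token F1 between a reference and a prediction (not BLEU)."""
    ref_tokens = reference.lower().split()
    pred_tokens = prediction.lower().split()
    if not ref_tokens or not pred_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sentences.
    overlap = sum((Counter(ref_tokens) & Counter(pred_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# (translation, prediction) pairs taken from the table above.
rows = [
    ("Maa dem", "Maa demoon"),
    ("Noona kooku dem!", "Demleen foofu!"),
    ("Moo di loola!", "Demleen!"),
]

for reference, prediction in rows:
    score = unigram_f1(reference, prediction)
    print(f"{score:.2f}  {reference!r} -> {prediction!r}")
```

Only the first pair shares a token (`Maa`), so it scores 0.50 and the other two score 0.00. Applied to the `translations` and `predictions` columns of the full table, such a score could help sort the rows from best to worst before reading them.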


In [14]:
# let us display 100 samples
pd.options.display.max_rows = 100
df_ft_to_wf[1].sample(100)

Unnamed: 0,original_sentences,translations,predictions
168,Il a vu l'autre.,Gis na keneen ki.,Gis na ma!
124,"Tout ce verbiage, c'est pour que tu ne viennes...","Wax ji yépp, bañ-ŋga-ñëw la.",Ŋga dem la ñooñu bëgg.
123,C'est l'autre endroit qui n'a pas de trou.,Feneen fi bëttul la.,Yaa demulwoon
89,Dis-lui qu'il ne vienne pas,Ni ka bu mu ñëw,Demal ndax mu ñëw ndax it mu génn!
177,Tout autre mouton que tu vois.,Meneen xar mépp moo gis.,Bi ŋga dee dem
18,C'est son ami!,Xaritam la!,Sa xarit la!
118,Un seul est arrivé,Menn doon na ñëw,Nit dem na!
59,Il y est.,Mi ŋgi fi.,Dem na.
30,"Te voilà debout, ici.",Yaa ŋgi tawax.,Yaa ka gis moom.
178,Ceci est une tasse et une assiette. On y verse...,Lii nag ëe kaas la ak palaat. Nu ciy xelli kaf...,"Ci biir nataal bi maa ngi ciy gis ay batã, bat..."
