Fine-tuning best T5 Transformer 🤖
-----------------------------------

In this notebook, we continue fine-tuning the T5 transformer on the newly extracted sentences from the book **Grammaire de Wolof Moderne**, leaving out the definitions. Below, we report the main evaluation figures obtained from the hyperparameter search step. Training is evaluated on the validation dataset.

- Parallel coordinates from panel:

- Parameter importance chart:
[t5_v3_importance](https://wandb.ai/oumar-kane-team/small-t5-cross-fw-translation-bayes-hpsearch-v3/reports/undefined-23-05-16-10-36-17---Vmlldzo0Mzc4NDY0?accessToken=eyaiyrid0qz1zg2jkq3fc65biw53084dpfitbi0dgonq6mweupw6kgjml9d2nv1w)

We can see in the above chart that the batch size is the most important parameter, with a negative correlation with the BLEU score (meaning that a smaller batch size is better). Next, the probability of modifying a character in the French corpus is also important, and a higher probability gives a better BLEU score.
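
To make the two augmentation probabilities concrete, here is a minimal, standard-library sketch of word-level and character-level perturbation. The function `perturb_chars` and its behaviour are hypothetical: it is a simplified stand-in for illustration and does not reproduce the keyboard-distance logic of `nac.KeyboardAug`.

```python
import random
import string


def perturb_chars(sentence: str, char_p: float, word_p: float, seed: int = 0) -> str:
    """Randomly replace letters in a sentence.

    Each word is selected with probability `word_p`; inside a selected word,
    each letter is replaced by a random lowercase letter with probability
    `char_p`. Hypothetical helper, not the nlpaug implementation.
    """
    rng = random.Random(seed)  # local generator so the result is reproducible
    words = []
    for word in sentence.split(" "):
        if rng.random() < word_p:
            word = "".join(
                rng.choice(string.ascii_lowercase)
                if ch.isalpha() and rng.random() < char_p
                else ch
                for ch in word
            )
        words.append(word)
    return " ".join(words)


# A probability of 0 leaves the sentence untouched.
unchanged = perturb_chars("bonjour tout le monde", char_p=0.0, word_p=1.0)
# Higher probabilities produce noisier inputs.
noisy = perturb_chars("bonjour tout le monde", char_p=0.3, word_p=0.5)
```

Noisier inputs of this kind are meant to make the model more robust to spelling variation; the sketch only shows how the two probabilities interact.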

In [1]:
# let us import all necessary libraries
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer, T5TokenizerFast, set_seed, AdamW, get_linear_schedule_with_warmup, T5ForConditionalGeneration,\
    get_cosine_schedule_with_warmup, Adafactor
from wolof_translate.utils.sent_transformers import TransformerSequences
from wolof_translate.utils.improvements.end_marks import add_end_mark # added
from torch.nn import TransformerEncoderLayer, TransformerDecoderLayer
from torch.utils.data import Dataset, DataLoader, random_split
from wolof_translate.data.dataset_v3 import SentenceDataset # v2 -> v3
from wolof_translate.utils.sent_corrections import *
from sklearn.model_selection import train_test_split
from torch.optim.lr_scheduler import _LRScheduler
# from custom_rnn.utils.kwargs import Kwargs
from torch.nn.utils.rnn import pad_sequence
from plotly.subplots import make_subplots
from nlpaug.augmenter import char as nac
from torch.utils.data import DataLoader
# from datasets  import load_metric # make pip install evaluate instead
# and pip install sacrebleu for instance
from torch.nn import functional as F
import plotly.graph_objects as go
from tokenizers import Tokenizer
import matplotlib.pyplot as plt
import pytorch_lightning as tl
from tqdm import tqdm, trange
from functools import partial
from torch.nn import utils
from copy import deepcopy
from torch import optim
from typing import *
from torch import nn
import pandas as pd
import numpy as np
import itertools
import evaluate
import random
import string
import shutil
import wandb
import torch
import json
import copy
import os

# set a global seed
tl.seed_everything(0)

os.environ["WANDB_DISABLED"] = "true"

Global seed set to 0


## French to Wolof

### Configure dataset 🔠

In [2]:
# recuperate the tokenizer from a json file
tokenizer = T5TokenizerFast(tokenizer_file=f"wolof-translate/wolof_translate/tokenizers/t5_tokenizers/tokenizer_v3.json")


In [3]:
def recuperate_datasets(wf_char_p: float, wf_word_p: float, max_len: int, end_mark_opt: int):

  # Let us recuperate the end_mark adding option
  if end_mark_opt == 1:
    # Create augmentation to add on French sentences
    fr_augmentation = TransformerSequences(nac.KeyboardAug(aug_char_p=wf_char_p, aug_word_p=wf_word_p, 
                                                          aug_word_max=max_len),
                                          remove_mark_space, delete_guillemet_space)

  else:
    
    if end_mark_opt == 2:

      end_mark_fn = partial(add_end_mark, end_mark_to_remove = '!', replace = True)
    
    elif end_mark_opt == 3:

      end_mark_fn = partial(add_end_mark)
    
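    elif end_mark_opt == 4:

      end_mark_fn = partial(add_end_mark, end_mark_to_remove = '!')

    else:

      # Fail early instead of hitting an undefined `end_mark_fn` below
      raise ValueError("end_mark_opt must be 1, 2, 3 or 4")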

    # Create augmentation to add on French sentences
    fr_augmentation = TransformerSequences(nac.KeyboardAug(aug_char_p=wf_char_p, aug_word_p=wf_word_p, 
                                                          aug_word_max= max_len),
                                          remove_mark_space, delete_guillemet_space, end_mark_fn)

  # Recuperate the train dataset
  train_dataset_aug = SentenceDataset(f"data/extractions/new_data/train_set.csv",
                                        tokenizer,
                                        truncation = True, max_len=max_len,
                                        corpus_1 = 'wolof',
                                        corpus_2='french',
                                        cp1_transformer = fr_augmentation)

  # Recuperate the valid dataset
  valid_dataset = SentenceDataset(f"data/extractions/new_data/valid_set.csv",
                                        tokenizer, max_len=max_len,
                                        corpus_1='wolof',
                                        corpus_2='french',
                                        truncation = True)
  
  # Return the datasets
  return train_dataset_aug, valid_dataset

### Configure the model and the evaluation function ⚙️

Let us evaluate the predictions with the `bleu` metric.

In [4]:
%%writefile wolof-translate/wolof_translate/utils/evaluation.py
from tokenizers import Tokenizer
from typing import *
import numpy as np
import evaluate

class TranslationEvaluation:
    
    def __init__(self, 
                 tokenizer: Tokenizer,
                 decoder: Union[Callable, None] = None,
                 metric = evaluate.load('sacrebleu'),
                 ):
        
        self.tokenizer = tokenizer
        
        self.decoder = decoder
        
        self.metric = metric
    
    def postprocess_text(self, preds, labels):
        
        preds = [pred.strip() for pred in preds]
        
        labels = [[label.strip()] for label in labels]
        
        return preds, labels

    def compute_metrics(self, eval_preds):

        preds, labels = eval_preds

        if isinstance(preds, tuple):
        
            preds = preds[0]
        
        decoded_preds = self.tokenizer.batch_decode(preds, skip_special_tokens=True)

        labels = np.where(labels != -100, labels, self.tokenizer.pad_token_id)
        
        decoded_labels = self.tokenizer.batch_decode(labels, skip_special_tokens=True)

        decoded_preds, decoded_labels = self.postprocess_text(decoded_preds, decoded_labels)

        result = self.metric.compute(predictions=decoded_preds, references=decoded_labels)
        
        result = {"bleu": result["score"]}

        prediction_lens = [np.count_nonzero(pred != self.tokenizer.pad_token_id) for pred in preds]
        
        result["gen_len"] = np.mean(prediction_lens)
        
        result = {k: round(v, 4) for k, v in result.items()}
        
        return result

Overwriting wolof-translate/wolof_translate/utils/evaluation.py
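
Two steps inside `compute_metrics` are easy to overlook: label positions masked with `-100` are replaced by the pad token id before decoding, and `gen_len` is the mean number of non-pad tokens per prediction. The sketch below reproduces both steps with plain Python lists; the helper names `restore_padding` and `mean_generated_length` are hypothetical and not part of `wolof_translate`.

```python
from statistics import mean
from typing import List

IGNORE_INDEX = -100  # label value that marks positions ignored by the loss


def restore_padding(labels: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Replace ignored label positions by the pad id so they can be decoded."""
    return [
        [pad_token_id if token == IGNORE_INDEX else token for token in seq]
        for seq in labels
    ]


def mean_generated_length(preds: List[List[int]], pad_token_id: int) -> float:
    """Average number of non-pad tokens per predicted sequence."""
    return mean(sum(1 for token in seq if token != pad_token_id) for seq in preds)


# Toy token ids, chosen only for illustration (pad id assumed to be 0).
labels = [[5, 6, -100], [7, -100, -100]]
cleaned = restore_padding(labels, pad_token_id=0)       # [[5, 6, 0], [7, 0, 0]]
avg_len = mean_generated_length(cleaned, pad_token_id=0)  # 1.5
```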


Let us initialize the evaluation object.

In [5]:
%run wolof-translate/wolof_translate/utils/evaluation.py
evaluation = TranslationEvaluation(tokenizer)


Using the latest cached version of the module from C:\Users\Oumar Kane\.cache\huggingface\modules\evaluate_modules\metrics\evaluate-metric--sacrebleu\28676bf65b4f88b276df566e48e603732d0b4afd237603ebdf92acaacf5be99b (last modified on Wed Apr 26 19:02:40 2023) since it couldn't be found locally at evaluate-metric--sacrebleu, or remotely on the Hugging Face Hub.


### Searching for the best parameters 🕖

In [6]:
from wolof_translate.models.transformers.optimization import TransformerScheduler
from wolof_translate.trainers.transformer_trainer import ModelRunner
from wolof_translate.utils.evaluation import TranslationEvaluation
from wolof_translate.models.transformers.main import Transformer
from wolof_translate.utils.split_with_valid import split_data




-------------

### --- Wandb V3_2

In [7]:
# let us initialize the hyperparameter configuration 
config = {
    'random_state': 0,
    'fr_char_p': 0.01787608197203816,
    'fr_word_p': 0.12076814421402848,
    'learning_rate': 0.002353312454980651,
    'weight_decay': 0.010386784405097485,
    'batch_size': 32,
    'warmup_ratio': 0.0,
    'max_epoch': 1000,
    'max_len': 51,
    'end_mark': 3,
    'bleu': 1.703,
    'model_dir': 'data/checkpoints/wf_t5_small_custom_train_v3_checkpoints/',
    'new_model_dir': 'data/checkpoints/t5_small_custom_train_results_wf_v3/'
}

# Initialize the model name
model_name = 't5-small'

# import the model with its pre-trained weights
model = T5ForConditionalGeneration.from_pretrained(model_name)

# resize the token embeddings
model.resize_token_embeddings(len(tokenizer))

# let us initialize the evaluation class
evaluation = TranslationEvaluation(tokenizer)

# let us initialize the trainer
trainer = ModelRunner(model, seed = 0, version = 1, evaluation = evaluation, optimizer = Adafactor)

# split the data
split_data(config['random_state'])

# recuperate train and test set
train_dataset, test_dataset = recuperate_datasets(config['fr_char_p'], 
                                                    config['fr_word_p'], config['max_len'],
                                                    config['end_mark'])

# let us calculate the appropriate warmup steps (let us take a max epoch of 100)
# length = len(train_dataset)

# n_steps = length // config['batch_size']

# num_steps = config['max_epoch'] * n_steps

# warmup_steps = (config['max_epoch'] * n_steps) * config['warmup_ratio']

# # Initialize the scheduler parameters
# scheduler_args = {'num_warmup_steps': warmup_steps, 'num_training_steps': num_steps}

# Initialize the optimizer parameters
optimizer_args = {
    'lr': config['learning_rate'],
    'weight_decay': config['weight_decay'],
    # 'betas': (0.9, 0.98),
    'relative_step': False
}

# Initialize the loaders parameters
train_loader_args = {'batch_size': config['batch_size']}

# Add the datasets and hyperparameters to trainer
trainer.compile(train_dataset, test_dataset, tokenizer, train_loader_args,
                optimizer_kwargs = optimizer_args,
                # lr_scheduler=get_linear_schedule_with_warmup,
                # lr_scheduler_kwargs=scheduler_args, 
                predict_with_generate = True,
                hugging_face = True,
                logging_dir="data/logs/t5_small_custom_train_wf_v3"
                )

# We resume training from checkpoints, so let us load the model
# trainer.load(config['model_dir'], load_best=True) # Only for the first loading
trainer.load(config['new_model_dir'])
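
The commented-out scheduler arithmetic in the cell above can be written as a small helper. This is a sketch under assumptions: `schedule_steps` is a hypothetical name, and the `drop_last` switch is added because the commented code uses floor division, which only matches the number of batches when the last partial batch is dropped.

```python
import math
from typing import Dict


def schedule_steps(dataset_len: int, batch_size: int, max_epoch: int,
                   warmup_ratio: float, drop_last: bool = False) -> Dict[str, int]:
    """Compute the total and warmup step counts for a learning-rate scheduler."""
    if drop_last:
        # Same as the commented code: the partial final batch is not counted
        steps_per_epoch = dataset_len // batch_size
    else:
        # The partial final batch is kept, so round up
        steps_per_epoch = math.ceil(dataset_len / batch_size)
    num_training_steps = max_epoch * steps_per_epoch
    num_warmup_steps = round(num_training_steps * warmup_ratio)
    return {
        "num_warmup_steps": num_warmup_steps,
        "num_training_steps": num_training_steps,
    }


# Illustrative numbers only: 100 examples, batches of 32, 10 epochs, 10% warmup.
example = schedule_steps(100, 32, 10, 0.1)
# With a warmup ratio of 0.0, as in `config`, no warmup steps are scheduled.
no_warmup = schedule_steps(100, 32, 10, 0.0)
```

The returned dictionary has the same keys as the commented `scheduler_args`, so it could be passed as `lr_scheduler_kwargs` if the scheduler is re-enabled.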

        

### ---
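
The training call below saves the best model according to `metric_for_best_model='bleu'` with `metric_objective='maximize'`. How `ModelRunner` implements this is not shown in the notebook; the following sketch only illustrates the general idea with a hypothetical `BestMetricTracker` class, fed with the BLEU scores logged for epochs 4 to 8.

```python
from typing import Optional


class BestMetricTracker:
    """Remember the best metric value seen so far (hypothetical helper)."""

    def __init__(self, objective: str = "maximize") -> None:
        if objective not in ("maximize", "minimize"):
            raise ValueError("objective must be 'maximize' or 'minimize'")
        self.objective = objective
        self.best_value: Optional[float] = None
        self.best_epoch: Optional[int] = None

    def update(self, epoch: int, value: float) -> bool:
        """Return True when `value` improves on the best one, i.e. when to save."""
        if self.best_value is None:
            improved = True
        elif self.objective == "maximize":
            improved = value > self.best_value
        else:
            improved = value < self.best_value
        if improved:
            self.best_value, self.best_epoch = value, epoch
        return improved


# BLEU scores copied from the training log for epochs 4 to 8
bleu_by_epoch = {4: 1.0793, 5: 1.5312, 6: 3.7055, 7: 2.3935, 8: 5.3491}
tracker = BestMetricTracker("maximize")
flags = [tracker.update(epoch, bleu) for epoch, bleu in bleu_by_epoch.items()]
# A checkpoint would be saved at epochs 4, 5, 6 and 8, but not at epoch 7
```

Tracking BLEU rather than the validation loss matters here: in the log, `test_loss` stops improving early while BLEU keeps rising for many more epochs.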

In [9]:
trainer.train(epochs = config['max_epoch'] - trainer.current_epoch, auto_save=True, metric_for_best_model='bleu', metric_objective='maximize', log_step=1,
              saving_directory = config['new_model_dir'])

  0%|          | 0/449 [00:00<?, ?it/s]

For epoch 4: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.7691327071771389, 'test_loss': 0.6835767328739166, 'bleu': 1.0793, 'gen_len': 7.8904}




  0%|          | 1/449 [00:24<3:05:03, 24.78s/it]

For epoch 5: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.6166456043720245, 'test_loss': 0.6218239724636078, 'bleu': 1.5312, 'gen_len': 8.1096}




  0%|          | 2/449 [00:47<2:53:47, 23.33s/it]

For epoch 6: 


Train batch number 41: 100%|██████████| 41/41 [00:12<00:00,  3.27batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.5230720515658216, 'test_loss': 0.5652649670839309, 'bleu': 3.7055, 'gen_len': 9.3973}




  1%|          | 3/449 [01:08<2:46:34, 22.41s/it]

For epoch 7: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  3.07batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.4478797461928391, 'test_loss': 0.5352094501256943, 'bleu': 2.3935, 'gen_len': 9.7123}




  1%|          | 4/449 [01:28<2:40:16, 21.61s/it]

For epoch 8: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.99batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.3825557704378919, 'test_loss': 0.5224502235651016, 'bleu': 5.3491, 'gen_len': 8.5548}




  1%|          | 5/449 [01:51<2:41:47, 21.86s/it]

For epoch 9: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.3207814729795223, 'test_loss': 0.5234281927347183, 'bleu': 6.4599, 'gen_len': 8.5685}




  1%|▏         | 6/449 [02:14<2:44:57, 22.34s/it]

For epoch 10: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.87batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.26batches/s]



Metrics: {'train_loss': 0.26829560064687963, 'test_loss': 0.5266535609960556, 'bleu': 7.5056, 'gen_len': 8.9247}




  2%|▏         | 7/449 [02:37<2:47:18, 22.71s/it]

For epoch 11: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.22462953990552484, 'test_loss': 0.5114246025681496, 'bleu': 8.5748, 'gen_len': 9.4521}




  2%|▏         | 8/449 [03:01<2:48:34, 22.93s/it]

For epoch 12: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.18864348421736463, 'test_loss': 0.536265167593956, 'bleu': 9.6049, 'gen_len': 9.2808}




  2%|▏         | 9/449 [03:25<2:50:14, 23.22s/it]

For epoch 13: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.1571519821882248, 'test_loss': 0.5295195862650871, 'bleu': 11.6962, 'gen_len': 9.7397}




  2%|▏         | 10/449 [03:48<2:51:17, 23.41s/it]

For epoch 14: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.54batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.13023781104058754, 'test_loss': 0.5361726403236389, 'bleu': 12.796, 'gen_len': 9.274}




  2%|▏         | 11/449 [04:14<2:54:35, 23.92s/it]

For epoch 15: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.11202308408370833, 'test_loss': 0.5560192510485649, 'bleu': 12.8857, 'gen_len': 9.226}




  3%|▎         | 12/449 [04:37<2:53:48, 23.86s/it]

For epoch 16: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.09791787586561064, 'test_loss': 0.5650171920657158, 'bleu': 12.8742, 'gen_len': 9.5411}




  3%|▎         | 13/449 [04:59<2:48:09, 23.14s/it]

For epoch 17: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.08768500769283713, 'test_loss': 0.5581209570169449, 'bleu': 13.8594, 'gen_len': 9.4726}




  3%|▎         | 14/449 [05:23<2:49:46, 23.42s/it]

For epoch 18: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.07580143195099948, 'test_loss': 0.5655455380678177, 'bleu': 15.27, 'gen_len': 9.7945}




  3%|▎         | 15/449 [05:47<2:51:50, 23.76s/it]

For epoch 19: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.0692955617134164, 'test_loss': 0.5691384941339492, 'bleu': 13.8702, 'gen_len': 9.6164}




  4%|▎         | 16/449 [06:09<2:47:35, 23.22s/it]

For epoch 20: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.06215109330851857, 'test_loss': 0.562890587747097, 'bleu': 12.6652, 'gen_len': 9.6644}




  4%|▍         | 17/449 [06:31<2:44:08, 22.80s/it]

For epoch 21: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.05572789189655606, 'test_loss': 0.571993401646614, 'bleu': 14.5486, 'gen_len': 10.3151}




  4%|▍         | 18/449 [06:53<2:41:33, 22.49s/it]

For epoch 22: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.05244325364871723, 'test_loss': 0.5785838633775711, 'bleu': 13.2398, 'gen_len': 9.8014}




  4%|▍         | 19/449 [07:15<2:39:43, 22.29s/it]

For epoch 23: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.0487370581888571, 'test_loss': 0.5716377273201942, 'bleu': 15.6181, 'gen_len': 9.7397}




  4%|▍         | 20/449 [07:39<2:44:13, 22.97s/it]

For epoch 24: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.04512030199715277, 'test_loss': 0.5817256376147271, 'bleu': 14.2, 'gen_len': 9.3219}




  5%|▍         | 21/449 [08:02<2:42:50, 22.83s/it]

For epoch 25: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.03batches/s]



Metrics: {'train_loss': 0.04235119840539083, 'test_loss': 0.5851858630776405, 'bleu': 16.598, 'gen_len': 9.8562}




  5%|▍         | 22/449 [08:27<2:47:32, 23.54s/it]

For epoch 26: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.03880712667071238, 'test_loss': 0.5801391080021858, 'bleu': 14.9061, 'gen_len': 9.7671}




  5%|▌         | 23/449 [08:49<2:44:25, 23.16s/it]

For epoch 27: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.03690844806047475, 'test_loss': 0.592105419933796, 'bleu': 14.6297, 'gen_len': 9.5616}




  5%|▌         | 24/449 [09:11<2:41:50, 22.85s/it]

For epoch 28: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.03695929023187335, 'test_loss': 0.5959036231040955, 'bleu': 15.3828, 'gen_len': 9.9726}




  6%|▌         | 25/449 [09:34<2:41:26, 22.85s/it]

For epoch 29: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.31batches/s]



Metrics: {'train_loss': 0.03458274755536056, 'test_loss': 0.592909523844719, 'bleu': 14.0316, 'gen_len': 9.8151}




  6%|▌         | 26/449 [09:57<2:40:35, 22.78s/it]

For epoch 30: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.033935136457042, 'test_loss': 0.5773079514503479, 'bleu': 14.4083, 'gen_len': 9.8836}




  6%|▌         | 27/449 [10:19<2:39:39, 22.70s/it]

For epoch 31: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.03082456103548771, 'test_loss': 0.5834457114338875, 'bleu': 16.0268, 'gen_len': 9.589}




  6%|▌         | 28/449 [10:42<2:38:30, 22.59s/it]

For epoch 32: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.031200513697978927, 'test_loss': 0.5843532115221024, 'bleu': 15.824, 'gen_len': 9.8973}




  6%|▋         | 29/449 [11:05<2:38:56, 22.71s/it]

For epoch 33: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.029231061704638527, 'test_loss': 0.5864860966801644, 'bleu': 18.1646, 'gen_len': 9.7055}




  7%|▋         | 30/449 [11:29<2:41:46, 23.17s/it]

For epoch 34: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.02822916159724317, 'test_loss': 0.5778071984648705, 'bleu': 15.8965, 'gen_len': 9.2808}




  7%|▋         | 31/449 [11:51<2:40:00, 22.97s/it]

For epoch 35: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.13batches/s]



Metrics: {'train_loss': 0.026490117309660447, 'test_loss': 0.5736807212233543, 'bleu': 16.5575, 'gen_len': 9.5342}




  7%|▋         | 32/449 [12:15<2:40:08, 23.04s/it]

For epoch 36: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.02705507966258177, 'test_loss': 0.5650861322879791, 'bleu': 16.0575, 'gen_len': 10.0274}




  7%|▋         | 33/449 [12:37<2:38:20, 22.84s/it]

For epoch 37: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.02599118931627855, 'test_loss': 0.5809333071112632, 'bleu': 16.675, 'gen_len': 9.7877}




  8%|▊         | 34/449 [12:59<2:36:54, 22.69s/it]

For epoch 38: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.02417455634056795, 'test_loss': 0.5839390203356742, 'bleu': 15.1546, 'gen_len': 9.0822}




  8%|▊         | 35/449 [13:21<2:34:36, 22.41s/it]

For epoch 39: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.022059341915315243, 'test_loss': 0.5732135325670242, 'bleu': 15.8266, 'gen_len': 9.9247}




  8%|▊         | 36/449 [13:43<2:33:25, 22.29s/it]

For epoch 40: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.02359998582794172, 'test_loss': 0.5852567449212074, 'bleu': 15.2201, 'gen_len': 9.7877}




  8%|▊         | 37/449 [14:05<2:32:45, 22.25s/it]

For epoch 41: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.021735822509338216, 'test_loss': 0.5929096266627312, 'bleu': 14.6547, 'gen_len': 9.637}




  8%|▊         | 38/449 [14:28<2:32:47, 22.31s/it]

For epoch 42: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0214848274183346, 'test_loss': 0.5928893014788628, 'bleu': 16.3678, 'gen_len': 9.863}




  9%|▊         | 39/449 [14:50<2:31:55, 22.23s/it]

For epoch 43: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.02235165429187984, 'test_loss': 0.5918957829475403, 'bleu': 15.2402, 'gen_len': 9.8973}




  9%|▉         | 40/449 [15:12<2:32:13, 22.33s/it]

For epoch 44: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.020881129724041717, 'test_loss': 0.5904570281505584, 'bleu': 19.264, 'gen_len': 9.6781}




  9%|▉         | 41/449 [15:37<2:36:09, 22.96s/it]

For epoch 45: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.0201093828669045, 'test_loss': 0.5931455224752427, 'bleu': 17.4171, 'gen_len': 9.774}




  9%|▉         | 42/449 [16:01<2:37:46, 23.26s/it]

For epoch 46: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.0197199554546032, 'test_loss': 0.579767070710659, 'bleu': 15.3304, 'gen_len': 9.863}




 10%|▉         | 43/449 [16:23<2:35:32, 22.99s/it]

For epoch 47: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.019487071464337955, 'test_loss': 0.6131036907434464, 'bleu': 14.5428, 'gen_len': 9.3356}




 10%|▉         | 44/449 [16:46<2:35:14, 23.00s/it]

For epoch 48: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.019199075495324482, 'test_loss': 0.5811351284384727, 'bleu': 17.6613, 'gen_len': 10.0}




 10%|█         | 45/449 [17:08<2:33:06, 22.74s/it]

For epoch 49: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.018901540602489216, 'test_loss': 0.5825160190463066, 'bleu': 16.4745, 'gen_len': 9.7055}




 10%|█         | 46/449 [17:30<2:31:21, 22.54s/it]

For epoch 50: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.018009568678169715, 'test_loss': 0.5876559719443322, 'bleu': 16.2034, 'gen_len': 9.3425}




 10%|█         | 47/449 [17:53<2:30:35, 22.48s/it]

For epoch 51: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.018014615310764894, 'test_loss': 0.5827091619372368, 'bleu': 15.9411, 'gen_len': 9.5137}




 11%|█         | 48/449 [18:15<2:29:38, 22.39s/it]

For epoch 52: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.017996168808966147, 'test_loss': 0.5911422155797481, 'bleu': 15.5317, 'gen_len': 9.3699}




 11%|█         | 49/449 [18:37<2:29:31, 22.43s/it]

For epoch 53: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.01817429808490887, 'test_loss': 0.578473449498415, 'bleu': 16.7748, 'gen_len': 9.5}




 11%|█         | 50/449 [19:00<2:29:00, 22.41s/it]

For epoch 54: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.017933525042835533, 'test_loss': 0.5846884988248349, 'bleu': 16.5203, 'gen_len': 10.4589}




 11%|█▏        | 51/449 [19:22<2:29:14, 22.50s/it]

For epoch 55: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.015593632578668071, 'test_loss': 0.5833122804760933, 'bleu': 18.2763, 'gen_len': 9.774}




 12%|█▏        | 52/449 [19:45<2:29:20, 22.57s/it]

For epoch 56: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.016194683204336865, 'test_loss': 0.5885629236698151, 'bleu': 17.0551, 'gen_len': 9.8767}




 12%|█▏        | 53/449 [20:07<2:28:07, 22.44s/it]

For epoch 57: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.015364306251995447, 'test_loss': 0.583597993850708, 'bleu': 17.0424, 'gen_len': 9.4384}




 12%|█▏        | 54/449 [20:30<2:28:38, 22.58s/it]

For epoch 58: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.015360199289805278, 'test_loss': 0.5888383343815804, 'bleu': 17.3228, 'gen_len': 9.4384}




 12%|█▏        | 55/449 [20:53<2:29:04, 22.70s/it]

For epoch 59: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.014259723219566228, 'test_loss': 0.5884074524044991, 'bleu': 15.1381, 'gen_len': 9.6096}




 12%|█▏        | 56/449 [21:16<2:29:56, 22.89s/it]

For epoch 60: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.05batches/s]



Metrics: {'train_loss': 0.014600037506259069, 'test_loss': 0.6002086281776429, 'bleu': 14.7419, 'gen_len': 9.6986}




 13%|█▎        | 57/449 [21:40<2:30:38, 23.06s/it]

For epoch 61: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.014807586683096683, 'test_loss': 0.5949426412582397, 'bleu': 15.5779, 'gen_len': 9.7329}




 13%|█▎        | 58/449 [22:02<2:27:50, 22.69s/it]

For epoch 62: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.014642405289611438, 'test_loss': 0.5900489896535873, 'bleu': 16.1581, 'gen_len': 9.8699}




 13%|█▎        | 59/449 [22:24<2:27:36, 22.71s/it]

For epoch 63: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.06batches/s]



Metrics: {'train_loss': 0.014607954338738105, 'test_loss': 0.5896898746490479, 'bleu': 15.0456, 'gen_len': 9.7055}




 13%|█▎        | 60/449 [22:48<2:28:25, 22.89s/it]

For epoch 64: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.014298720841818467, 'test_loss': 0.573787571489811, 'bleu': 14.7153, 'gen_len': 9.2877}




 14%|█▎        | 61/449 [23:10<2:27:37, 22.83s/it]

For epoch 65: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.01409003304363024, 'test_loss': 0.5911540701985359, 'bleu': 17.9965, 'gen_len': 9.863}




 14%|█▍        | 62/449 [23:33<2:26:39, 22.74s/it]

For epoch 66: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.01392884986338819, 'test_loss': 0.6018886968493462, 'bleu': 15.4192, 'gen_len': 9.6986}




 14%|█▍        | 63/449 [23:55<2:25:27, 22.61s/it]

For epoch 67: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.013341993309284856, 'test_loss': 0.6092778325080872, 'bleu': 16.2804, 'gen_len': 9.7192}




 14%|█▍        | 64/449 [24:18<2:25:51, 22.73s/it]

For epoch 68: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.013564165022860213, 'test_loss': 0.601603914797306, 'bleu': 17.4612, 'gen_len': 9.8973}




 14%|█▍        | 65/449 [24:41<2:24:56, 22.65s/it]

For epoch 69: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.013230201742816262, 'test_loss': 0.583492074906826, 'bleu': 15.9734, 'gen_len': 9.3014}




 15%|█▍        | 66/449 [25:03<2:24:01, 22.56s/it]

For epoch 70: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.013210703085017641, 'test_loss': 0.6118864104151726, 'bleu': 15.7112, 'gen_len': 9.4589}




 15%|█▍        | 67/449 [25:26<2:23:52, 22.60s/it]

For epoch 71: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.03batches/s]



Metrics: {'train_loss': 0.013626370192846148, 'test_loss': 0.6097711682319641, 'bleu': 17.626, 'gen_len': 9.3699}




 15%|█▌        | 68/449 [25:49<2:24:46, 22.80s/it]

For epoch 72: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.013205948274400903, 'test_loss': 0.599357396364212, 'bleu': 17.3403, 'gen_len': 9.3151}




 15%|█▌        | 69/449 [26:11<2:22:59, 22.58s/it]

For epoch 73: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.012732337183523469, 'test_loss': 0.5985084518790245, 'bleu': 16.9142, 'gen_len': 9.9247}




 16%|█▌        | 70/449 [26:33<2:21:53, 22.46s/it]

For epoch 74: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.013431445320659294, 'test_loss': 0.5997478641569615, 'bleu': 18.0321, 'gen_len': 9.6301}




 16%|█▌        | 71/449 [26:56<2:21:01, 22.38s/it]

For epoch 75: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.012422935878176515, 'test_loss': 0.5945037245750427, 'bleu': 15.8307, 'gen_len': 9.5616}




 16%|█▌        | 72/449 [27:18<2:20:32, 22.37s/it]

For epoch 76: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.013302127728465854, 'test_loss': 0.6017592385411262, 'bleu': 16.0999, 'gen_len': 9.8904}




 16%|█▋        | 73/449 [27:41<2:20:56, 22.49s/it]

For epoch 77: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.012857949463423432, 'test_loss': 0.6032369673252106, 'bleu': 15.4488, 'gen_len': 9.4452}




 16%|█▋        | 74/449 [28:03<2:20:20, 22.45s/it]

For epoch 78: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.012203501401151099, 'test_loss': 0.5882429108023643, 'bleu': 16.6846, 'gen_len': 9.7877}




 17%|█▋        | 75/449 [28:25<2:19:29, 22.38s/it]

For epoch 79: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.012481016016042814, 'test_loss': 0.5886514574289322, 'bleu': 18.5296, 'gen_len': 9.7055}




 17%|█▋        | 76/449 [28:48<2:19:00, 22.36s/it]

For epoch 80: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.02batches/s]



Metrics: {'train_loss': 0.011459255698932016, 'test_loss': 0.5854745507240295, 'bleu': 17.9556, 'gen_len': 9.4932}




 17%|█▋        | 77/449 [29:11<2:20:50, 22.72s/it]

For epoch 81: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.011035248194253298, 'test_loss': 0.6027146115899086, 'bleu': 17.3761, 'gen_len': 9.5822}




 17%|█▋        | 78/449 [29:34<2:21:11, 22.83s/it]

For epoch 82: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.011459521120186986, 'test_loss': 0.6054589994251728, 'bleu': 17.407, 'gen_len': 9.4041}




 18%|█▊        | 79/449 [29:56<2:19:26, 22.61s/it]

For epoch 83: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.011407368032761463, 'test_loss': 0.5988498196005821, 'bleu': 18.4878, 'gen_len': 9.6507}




 18%|█▊        | 80/449 [30:18<2:17:37, 22.38s/it]

For epoch 84: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.011359405590266717, 'test_loss': 0.6112505130469799, 'bleu': 16.5606, 'gen_len': 9.6781}




 18%|█▊        | 81/449 [30:40<2:16:10, 22.20s/it]

For epoch 85: 


Train batch number 41: 100%|██████████| 41/41 [00:18<00:00,  2.24batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.72batches/s]



Metrics: {'train_loss': 0.011182367330326176, 'test_loss': 0.6037575356662274, 'bleu': 15.976, 'gen_len': 9.7123}




 18%|█▊        | 82/449 [31:07<2:25:23, 23.77s/it]

For epoch 86: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.35batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.78batches/s]



Metrics: {'train_loss': 0.011352464340899775, 'test_loss': 0.612258280813694, 'bleu': 16.2942, 'gen_len': 9.4452}




 18%|█▊        | 83/449 [31:33<2:29:13, 24.46s/it]

For epoch 87: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.011346463897697083, 'test_loss': 0.6048486515879631, 'bleu': 16.5055, 'gen_len': 9.9932}




 19%|█▊        | 84/449 [31:56<2:25:17, 23.88s/it]

For epoch 88: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.011213754621765963, 'test_loss': 0.6066460609436035, 'bleu': 17.0385, 'gen_len': 9.6849}




 19%|█▉        | 85/449 [32:18<2:22:03, 23.42s/it]

For epoch 89: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.012008490336195724, 'test_loss': 0.6124311193823815, 'bleu': 15.9365, 'gen_len': 9.7808}




 19%|█▉        | 86/449 [32:40<2:19:19, 23.03s/it]

For epoch 90: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.83batches/s]



Metrics: {'train_loss': 0.010541825759701612, 'test_loss': 0.6099623635411262, 'bleu': 17.1639, 'gen_len': 10.1849}




 19%|█▉        | 87/449 [33:04<2:20:16, 23.25s/it]

For epoch 91: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.79batches/s]



Metrics: {'train_loss': 0.011015046143722607, 'test_loss': 0.6100817263126374, 'bleu': 17.0149, 'gen_len': 9.4932}




 20%|█▉        | 88/449 [33:29<2:22:28, 23.68s/it]

For epoch 92: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.32batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.53batches/s]



Metrics: {'train_loss': 0.010793680211574567, 'test_loss': 0.606595104932785, 'bleu': 16.996, 'gen_len': 10.1849}




 20%|█▉        | 89/449 [33:56<2:28:39, 24.78s/it]

For epoch 93: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.011143345857129954, 'test_loss': 0.6192466661334037, 'bleu': 16.6117, 'gen_len': 9.6918}




 20%|██        | 90/449 [34:20<2:25:42, 24.35s/it]

For epoch 94: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.12batches/s]



Metrics: {'train_loss': 0.01048348991700062, 'test_loss': 0.6136708721518517, 'bleu': 16.7423, 'gen_len': 9.5822}




 20%|██        | 91/449 [34:44<2:24:51, 24.28s/it]

For epoch 95: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.010999592418623408, 'test_loss': 0.6246623933315277, 'bleu': 16.3304, 'gen_len': 9.7123}




 20%|██        | 92/449 [35:07<2:22:32, 23.96s/it]

For epoch 96: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.010094984678733276, 'test_loss': 0.6151020906865596, 'bleu': 19.2186, 'gen_len': 10.1027}




 21%|██        | 93/449 [35:30<2:21:01, 23.77s/it]

For epoch 97: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.09batches/s]



Metrics: {'train_loss': 0.011080340788949554, 'test_loss': 0.6170106694102288, 'bleu': 18.342, 'gen_len': 9.4247}




 21%|██        | 94/449 [35:54<2:21:01, 23.84s/it]

For epoch 98: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.01090570655083511, 'test_loss': 0.6076108574867248, 'bleu': 15.3635, 'gen_len': 9.6438}




 21%|██        | 95/449 [36:17<2:18:13, 23.43s/it]

For epoch 99: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.011068274340842192, 'test_loss': 0.6068428605794907, 'bleu': 17.694, 'gen_len': 9.4726}




 21%|██▏       | 96/449 [36:40<2:17:28, 23.37s/it]

For epoch 100: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.010657112165240616, 'test_loss': 0.6109324842691422, 'bleu': 17.0627, 'gen_len': 9.589}




 22%|██▏       | 97/449 [37:03<2:16:24, 23.25s/it]

For epoch 101: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.010783551317616933, 'test_loss': 0.6053549349308014, 'bleu': 18.9154, 'gen_len': 9.8562}




 22%|██▏       | 98/449 [37:26<2:15:57, 23.24s/it]

For epoch 102: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.01065521109167759, 'test_loss': 0.614494601637125, 'bleu': 17.1321, 'gen_len': 9.6575}




 22%|██▏       | 99/449 [37:49<2:14:44, 23.10s/it]

For epoch 103: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.23batches/s]



Metrics: {'train_loss': 0.010105817669593706, 'test_loss': 0.6173117026686669, 'bleu': 19.5386, 'gen_len': 9.6507}




 22%|██▏       | 100/449 [38:14<2:18:09, 23.75s/it]

For epoch 104: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.01038483974364836, 'test_loss': 0.611224065721035, 'bleu': 18.8274, 'gen_len': 9.8836}




 22%|██▏       | 101/449 [38:37<2:16:30, 23.54s/it]

For epoch 105: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.009632284730309394, 'test_loss': 0.6179651618003845, 'bleu': 17.7791, 'gen_len': 9.9726}




 23%|██▎       | 102/449 [39:00<2:14:26, 23.25s/it]

For epoch 106: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.95batches/s]



Metrics: {'train_loss': 0.009643330942930245, 'test_loss': 0.6234206676483154, 'bleu': 17.0471, 'gen_len': 9.4932}




 23%|██▎       | 103/449 [39:24<2:15:06, 23.43s/it]

For epoch 107: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.009783459838661478, 'test_loss': 0.6207496359944343, 'bleu': 18.7027, 'gen_len': 9.8356}




 23%|██▎       | 104/449 [39:47<2:14:30, 23.39s/it]

For epoch 108: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.010129109774602622, 'test_loss': 0.624671071767807, 'bleu': 16.7338, 'gen_len': 9.6233}




 23%|██▎       | 105/449 [40:09<2:12:06, 23.04s/it]

For epoch 109: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.01044955309026125, 'test_loss': 0.6132630094885826, 'bleu': 16.7479, 'gen_len': 9.9726}




 24%|██▎       | 106/449 [40:31<2:10:08, 22.77s/it]

For epoch 110: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.008787157615964733, 'test_loss': 0.6142483621835708, 'bleu': 17.2431, 'gen_len': 9.5068}




 24%|██▍       | 107/449 [40:54<2:09:05, 22.65s/it]

For epoch 111: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.009159533500035361, 'test_loss': 0.626823504269123, 'bleu': 16.2123, 'gen_len': 9.6712}




 24%|██▍       | 108/449 [41:16<2:08:20, 22.58s/it]

For epoch 112: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.008597037346079582, 'test_loss': 0.6356504127383232, 'bleu': 19.9803, 'gen_len': 9.6781}




 24%|██▍       | 109/449 [41:40<2:10:42, 23.07s/it]

For epoch 113: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.008452160755263232, 'test_loss': 0.6253478839993477, 'bleu': 18.0979, 'gen_len': 9.3904}




 24%|██▍       | 110/449 [42:03<2:09:33, 22.93s/it]

For epoch 114: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.008853929662486403, 'test_loss': 0.6175889879465103, 'bleu': 17.3175, 'gen_len': 9.4932}




 25%|██▍       | 111/449 [42:25<2:08:38, 22.84s/it]

For epoch 115: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.008730697769262805, 'test_loss': 0.6162848860025406, 'bleu': 17.6458, 'gen_len': 9.9932}




 25%|██▍       | 112/449 [42:49<2:08:53, 22.95s/it]

For epoch 116: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.008595615738957393, 'test_loss': 0.6222616836428643, 'bleu': 19.1868, 'gen_len': 9.9795}




 25%|██▌       | 113/449 [43:12<2:08:20, 22.92s/it]

For epoch 117: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.009152949208439124, 'test_loss': 0.6131578892469406, 'bleu': 16.6684, 'gen_len': 9.589}




 25%|██▌       | 114/449 [43:34<2:07:18, 22.80s/it]

For epoch 118: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.009037779223900743, 'test_loss': 0.6166242152452469, 'bleu': 17.8252, 'gen_len': 9.8288}




 26%|██▌       | 115/449 [43:57<2:07:12, 22.85s/it]

For epoch 119: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.009325493149822804, 'test_loss': 0.6299377724528312, 'bleu': 17.6645, 'gen_len': 9.6849}




 26%|██▌       | 116/449 [44:21<2:08:32, 23.16s/it]

For epoch 120: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.008647436650878772, 'test_loss': 0.6275511637330056, 'bleu': 16.1523, 'gen_len': 9.5753}




 26%|██▌       | 117/449 [44:44<2:07:21, 23.02s/it]

For epoch 121: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.0090482577443032, 'test_loss': 0.6258257091045379, 'bleu': 19.1535, 'gen_len': 9.4247}




 26%|██▋       | 118/449 [45:06<2:06:41, 22.97s/it]

For epoch 122: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.009094132195658437, 'test_loss': 0.6346587046980858, 'bleu': 16.5403, 'gen_len': 9.4795}




 27%|██▋       | 119/449 [45:29<2:05:52, 22.89s/it]

For epoch 123: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.99batches/s]



Metrics: {'train_loss': 0.008833167381675505, 'test_loss': 0.6292709439992905, 'bleu': 18.4846, 'gen_len': 9.6986}




 27%|██▋       | 120/449 [45:53<2:07:02, 23.17s/it]

For epoch 124: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.008805161046727402, 'test_loss': 0.6221602782607079, 'bleu': 18.1112, 'gen_len': 9.6027}




 27%|██▋       | 121/449 [46:16<2:06:10, 23.08s/it]

For epoch 125: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.008586845962648712, 'test_loss': 0.6109922021627426, 'bleu': 19.0663, 'gen_len': 9.7466}




 27%|██▋       | 122/449 [46:39<2:05:04, 22.95s/it]

For epoch 126: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.16batches/s]



Metrics: {'train_loss': 0.007780425125596727, 'test_loss': 0.6236327692866326, 'bleu': 17.245, 'gen_len': 9.8425}




 27%|██▋       | 123/449 [47:01<2:04:40, 22.95s/it]

For epoch 127: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.008298425776202505, 'test_loss': 0.6139879032969475, 'bleu': 19.8559, 'gen_len': 9.7603}




 28%|██▊       | 124/449 [47:25<2:05:23, 23.15s/it]

For epoch 128: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.008273437153548002, 'test_loss': 0.6190574541687965, 'bleu': 18.0612, 'gen_len': 9.7671}




 28%|██▊       | 125/449 [47:48<2:04:57, 23.14s/it]

For epoch 129: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.008804471033239147, 'test_loss': 0.6249847643077373, 'bleu': 19.5113, 'gen_len': 9.2055}




 28%|██▊       | 126/449 [48:11<2:03:42, 22.98s/it]

For epoch 130: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.10batches/s]



Metrics: {'train_loss': 0.0092298771402367, 'test_loss': 0.6019272461533547, 'bleu': 17.5889, 'gen_len': 9.7123}




 28%|██▊       | 127/449 [48:35<2:04:33, 23.21s/it]

For epoch 131: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008274423140214711, 'test_loss': 0.6246467307209969, 'bleu': 17.7938, 'gen_len': 9.7397}




 29%|██▊       | 128/449 [48:58<2:04:14, 23.22s/it]

For epoch 132: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.008461239594375579, 'test_loss': 0.6224609673023224, 'bleu': 18.8728, 'gen_len': 9.6849}




 29%|██▊       | 129/449 [49:20<2:02:37, 22.99s/it]

For epoch 133: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.008323536132371462, 'test_loss': 0.6282427206635475, 'bleu': 17.7135, 'gen_len': 9.4247}




 29%|██▉       | 130/449 [49:43<2:01:12, 22.80s/it]

For epoch 134: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.54batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.66batches/s]



Metrics: {'train_loss': 0.009029186750966601, 'test_loss': 0.646443422138691, 'bleu': 18.5989, 'gen_len': 9.726}




 29%|██▉       | 131/449 [50:07<2:03:52, 23.37s/it]

For epoch 135: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.008764893622932637, 'test_loss': 0.6191009402275085, 'bleu': 17.689, 'gen_len': 9.411}




 29%|██▉       | 132/449 [50:31<2:03:23, 23.35s/it]

For epoch 136: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.49batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.008344209440643104, 'test_loss': 0.6147127896547318, 'bleu': 17.9945, 'gen_len': 9.5822}




 30%|██▉       | 133/449 [50:54<2:03:24, 23.43s/it]

For epoch 137: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.96batches/s]



Metrics: {'train_loss': 0.008372819022752526, 'test_loss': 0.6202224180102348, 'bleu': 18.1076, 'gen_len': 9.7808}




 30%|██▉       | 134/449 [51:18<2:03:30, 23.53s/it]

For epoch 138: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.10batches/s]



Metrics: {'train_loss': 0.008401254930238172, 'test_loss': 0.6231881678104401, 'bleu': 19.4076, 'gen_len': 9.8562}




 30%|███       | 135/449 [51:42<2:03:57, 23.69s/it]

For epoch 139: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.54batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.00765248597599566, 'test_loss': 0.6177630722522736, 'bleu': 16.5501, 'gen_len': 9.7466}




 30%|███       | 136/449 [52:06<2:03:31, 23.68s/it]

For epoch 140: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.008125932046734705, 'test_loss': 0.6243663191795349, 'bleu': 19.2897, 'gen_len': 9.6918}




 31%|███       | 137/449 [52:28<2:00:36, 23.19s/it]

For epoch 141: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.00821235556746038, 'test_loss': 0.6217575073242188, 'bleu': 18.8595, 'gen_len': 9.9041}




 31%|███       | 138/449 [52:49<1:57:06, 22.59s/it]

For epoch 142: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.41batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.008150781493452264, 'test_loss': 0.619055449962616, 'bleu': 18.7824, 'gen_len': 9.774}




 31%|███       | 139/449 [53:12<1:58:10, 22.87s/it]

For epoch 143: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.008067812202725469, 'test_loss': 0.6108049593865872, 'bleu': 18.0513, 'gen_len': 9.4041}




 31%|███       | 140/449 [53:34<1:55:37, 22.45s/it]

For epoch 144: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.16batches/s]



Metrics: {'train_loss': 0.008065657891764692, 'test_loss': 0.6165064215660095, 'bleu': 16.6062, 'gen_len': 9.7055}




 31%|███▏      | 141/449 [53:57<1:55:46, 22.55s/it]

For epoch 145: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0083224148569038, 'test_loss': 0.6071142762899399, 'bleu': 17.9334, 'gen_len': 9.8082}




 32%|███▏      | 142/449 [54:18<1:53:53, 22.26s/it]

For epoch 146: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.007330224894714065, 'test_loss': 0.6000535540282727, 'bleu': 18.1227, 'gen_len': 9.7945}




 32%|███▏      | 143/449 [54:40<1:52:45, 22.11s/it]

For epoch 147: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.007284936830174268, 'test_loss': 0.611670994758606, 'bleu': 17.7201, 'gen_len': 9.8151}




 32%|███▏      | 144/449 [55:03<1:53:09, 22.26s/it]

For epoch 148: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.008022085138846462, 'test_loss': 0.6203892223536969, 'bleu': 17.6779, 'gen_len': 9.5411}




 32%|███▏      | 145/449 [55:25<1:53:15, 22.35s/it]

For epoch 149: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.98batches/s]



Metrics: {'train_loss': 0.008607281701321282, 'test_loss': 0.609234019368887, 'bleu': 16.7906, 'gen_len': 9.3836}




 33%|███▎      | 146/449 [55:49<1:54:32, 22.68s/it]

For epoch 150: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.008084833582227186, 'test_loss': 0.6138432085514068, 'bleu': 18.2292, 'gen_len': 9.6712}




 33%|███▎      | 147/449 [56:13<1:56:16, 23.10s/it]

For epoch 151: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.14batches/s]



Metrics: {'train_loss': 0.007660996929810542, 'test_loss': 0.6218320831656456, 'bleu': 16.196, 'gen_len': 9.4795}




 33%|███▎      | 148/449 [56:36<1:56:44, 23.27s/it]

For epoch 152: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.007845891367008046, 'test_loss': 0.6191999033093453, 'bleu': 18.3175, 'gen_len': 9.4041}




 33%|███▎      | 149/449 [56:58<1:54:23, 22.88s/it]

For epoch 153: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.007638001507829602, 'test_loss': 0.6232738047838211, 'bleu': 16.8289, 'gen_len': 9.9726}




 33%|███▎      | 150/449 [57:21<1:53:54, 22.86s/it]

For epoch 154: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.008312186021812079, 'test_loss': 0.6176113456487655, 'bleu': 19.7453, 'gen_len': 9.8973}




 34%|███▎      | 151/449 [57:44<1:53:46, 22.91s/it]

For epoch 155: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.007525402791343811, 'test_loss': 0.6137640446424484, 'bleu': 18.1839, 'gen_len': 9.911}




 34%|███▍      | 152/449 [58:06<1:52:25, 22.71s/it]

For epoch 156: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.49batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.007692328534974921, 'test_loss': 0.6174110129475594, 'bleu': 15.948, 'gen_len': 10.1096}




 34%|███▍      | 153/449 [58:30<1:53:15, 22.96s/it]

For epoch 157: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.007461365354928847, 'test_loss': 0.623101070523262, 'bleu': 19.6549, 'gen_len': 9.7192}




 34%|███▍      | 154/449 [58:52<1:51:41, 22.72s/it]

For epoch 158: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.37batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.02batches/s]



Metrics: {'train_loss': 0.007127432601253797, 'test_loss': 0.6268843621015548, 'bleu': 20.3594, 'gen_len': 9.8356}




 35%|███▍      | 155/449 [59:19<1:57:27, 23.97s/it]

For epoch 159: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.0074445397036558975, 'test_loss': 0.6315518110990525, 'bleu': 19.6396, 'gen_len': 9.6301}




 35%|███▍      | 156/449 [59:42<1:54:59, 23.55s/it]

For epoch 160: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.009084007481294797, 'test_loss': 0.6316821336746216, 'bleu': 17.6425, 'gen_len': 10.0411}




 35%|███▍      | 157/449 [1:00:04<1:52:53, 23.20s/it]

For epoch 161: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.007834931784423023, 'test_loss': 0.628394266963005, 'bleu': 18.3342, 'gen_len': 9.9247}




 35%|███▌      | 158/449 [1:00:27<1:51:38, 23.02s/it]

For epoch 162: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.008093590780002316, 'test_loss': 0.6271261855959892, 'bleu': 19.8249, 'gen_len': 9.9178}




 35%|███▌      | 159/449 [1:00:50<1:51:10, 23.00s/it]

For epoch 163: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.007646175544345524, 'test_loss': 0.6250883668661118, 'bleu': 19.3612, 'gen_len': 9.8356}




 36%|███▌      | 160/449 [1:01:12<1:50:01, 22.84s/it]

For epoch 164: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.007793156645919491, 'test_loss': 0.6141579777002335, 'bleu': 19.352, 'gen_len': 9.8219}




 36%|███▌      | 161/449 [1:01:33<1:47:36, 22.42s/it]

For epoch 165: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.007357399802791272, 'test_loss': 0.6231250017881393, 'bleu': 16.9228, 'gen_len': 9.9726}




 36%|███▌      | 162/449 [1:01:55<1:45:55, 22.15s/it]

For epoch 166: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.00737219624414404, 'test_loss': 0.638630285859108, 'bleu': 17.7155, 'gen_len': 9.8836}




 36%|███▋      | 163/449 [1:02:16<1:44:36, 21.95s/it]

For epoch 167: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.007527301604745955, 'test_loss': 0.6308416068553925, 'bleu': 16.6474, 'gen_len': 9.5685}




 37%|███▋      | 164/449 [1:02:38<1:43:03, 21.70s/it]

For epoch 168: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.006426151171780941, 'test_loss': 0.6530499175190926, 'bleu': 16.7385, 'gen_len': 9.637}




 37%|███▋      | 165/449 [1:02:59<1:42:27, 21.65s/it]

For epoch 169: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.006598506250600444, 'test_loss': 0.6396682903170585, 'bleu': 17.1362, 'gen_len': 9.7603}




 37%|███▋      | 166/449 [1:03:20<1:41:27, 21.51s/it]

For epoch 170: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.0067011067195136735, 'test_loss': 0.6209463641047478, 'bleu': 20.1277, 'gen_len': 9.9521}




 37%|███▋      | 167/449 [1:03:42<1:41:03, 21.50s/it]

For epoch 171: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.006923262370614017, 'test_loss': 0.6409795790910721, 'bleu': 19.6463, 'gen_len': 9.8356}




 37%|███▋      | 168/449 [1:04:03<1:40:34, 21.48s/it]

For epoch 172: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.90batches/s]



Metrics: {'train_loss': 0.0064189991425359395, 'test_loss': 0.6355336651206016, 'bleu': 19.8774, 'gen_len': 9.9315}




 38%|███▊      | 169/449 [1:04:24<1:39:25, 21.31s/it]

For epoch 173: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.97batches/s]



Metrics: {'train_loss': 0.007198897820738394, 'test_loss': 0.6187283128499985, 'bleu': 20.4561, 'gen_len': 10.0137}




 38%|███▊      | 170/449 [1:04:46<1:40:09, 21.54s/it]

For epoch 174: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.89batches/s]



Metrics: {'train_loss': 0.007161063051260099, 'test_loss': 0.6320637404918671, 'bleu': 19.1634, 'gen_len': 9.8356}




 38%|███▊      | 171/449 [1:05:07<1:38:29, 21.26s/it]

For epoch 175: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.92batches/s]



Metrics: {'train_loss': 0.0070391469237553635, 'test_loss': 0.6336560532450676, 'bleu': 19.8055, 'gen_len': 9.8219}




 38%|███▊      | 172/449 [1:05:27<1:36:31, 20.91s/it]

For epoch 176: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.007086768774770018, 'test_loss': 0.6105449303984642, 'bleu': 20.2425, 'gen_len': 9.8219}




 39%|███▊      | 173/449 [1:05:47<1:35:32, 20.77s/it]

For epoch 177: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.86batches/s]



Metrics: {'train_loss': 0.007449632073275563, 'test_loss': 0.6219689011573791, 'bleu': 17.9517, 'gen_len': 9.9178}




 39%|███▉      | 174/449 [1:06:08<1:34:45, 20.67s/it]

For epoch 178: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.90batches/s]



Metrics: {'train_loss': 0.007351608350645841, 'test_loss': 0.6349267035722732, 'bleu': 17.6162, 'gen_len': 9.4589}




 39%|███▉      | 175/449 [1:06:28<1:33:48, 20.54s/it]

For epoch 179: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.93batches/s]



Metrics: {'train_loss': 0.007190590088323849, 'test_loss': 0.6220032632350921, 'bleu': 20.0821, 'gen_len': 9.6644}




 39%|███▉      | 176/449 [1:06:48<1:32:52, 20.41s/it]

For epoch 180: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.006735605159320119, 'test_loss': 0.6359044373035431, 'bleu': 19.0372, 'gen_len': 10.3356}




 39%|███▉      | 177/449 [1:07:08<1:32:23, 20.38s/it]

For epoch 181: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.92batches/s]



Metrics: {'train_loss': 0.00649965064963553, 'test_loss': 0.6296579971909523, 'bleu': 17.5911, 'gen_len': 9.8014}




 40%|███▉      | 178/449 [1:07:29<1:31:43, 20.31s/it]

For epoch 182: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.94batches/s]



Metrics: {'train_loss': 0.006587234223488628, 'test_loss': 0.618849764764309, 'bleu': 20.1513, 'gen_len': 10.1644}




 40%|███▉      | 179/449 [1:07:49<1:31:04, 20.24s/it]

For epoch 183: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.006798667384584139, 'test_loss': 0.624708503484726, 'bleu': 18.1643, 'gen_len': 9.9726}




 40%|████      | 180/449 [1:08:09<1:30:39, 20.22s/it]

For epoch 184: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.91batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.90batches/s]



Metrics: {'train_loss': 0.007563388565691506, 'test_loss': 0.6196424111723899, 'bleu': 17.5603, 'gen_len': 9.5205}




 40%|████      | 181/449 [1:08:29<1:30:31, 20.27s/it]

For epoch 185: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.04batches/s]



Metrics: {'train_loss': 0.007036581487826458, 'test_loss': 0.6321885377168656, 'bleu': 16.8823, 'gen_len': 10.1918}




 41%|████      | 182/449 [1:08:49<1:29:47, 20.18s/it]

For epoch 186: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.006656442411107625, 'test_loss': 0.6473625928163529, 'bleu': 18.1046, 'gen_len': 9.8082}




 41%|████      | 183/449 [1:09:09<1:29:28, 20.18s/it]

For epoch 187: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.87batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.98batches/s]



Metrics: {'train_loss': 0.007041477814044167, 'test_loss': 0.6198542043566704, 'bleu': 18.9747, 'gen_len': 9.726}




 41%|████      | 184/449 [1:09:30<1:29:24, 20.24s/it]

For epoch 188: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.99batches/s]



Metrics: {'train_loss': 0.007079193357196523, 'test_loss': 0.6192903541028499, 'bleu': 19.4625, 'gen_len': 9.4589}




 41%|████      | 185/449 [1:09:50<1:28:46, 20.17s/it]

For epoch 189: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.92batches/s]



Metrics: {'train_loss': 0.006817023167613803, 'test_loss': 0.6306956082582473, 'bleu': 18.9149, 'gen_len': 9.3836}




 41%|████▏     | 186/449 [1:10:10<1:28:16, 20.14s/it]

For epoch 190: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.006236740456121724, 'test_loss': 0.6279820933938026, 'bleu': 18.3298, 'gen_len': 9.589}




 42%|████▏     | 187/449 [1:10:30<1:27:55, 20.13s/it]

For epoch 191: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.04batches/s]



Metrics: {'train_loss': 0.0064306379488964635, 'test_loss': 0.6258048340678215, 'bleu': 19.1497, 'gen_len': 9.5548}




 42%|████▏     | 188/449 [1:10:50<1:27:17, 20.07s/it]

For epoch 192: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.006957451443801202, 'test_loss': 0.6289947509765625, 'bleu': 19.1434, 'gen_len': 9.5411}




 42%|████▏     | 189/449 [1:11:10<1:26:51, 20.04s/it]

For epoch 193: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.97batches/s]



Metrics: {'train_loss': 0.006673012036693896, 'test_loss': 0.6219037979841232, 'bleu': 20.1, 'gen_len': 9.774}




 42%|████▏     | 190/449 [1:11:30<1:26:29, 20.04s/it]

For epoch 194: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.98batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.006405195863008863, 'test_loss': 0.6310153424739837, 'bleu': 19.1491, 'gen_len': 9.5959}




 43%|████▎     | 191/449 [1:11:50<1:26:01, 20.01s/it]

For epoch 195: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.05batches/s]



Metrics: {'train_loss': 0.006288696046373467, 'test_loss': 0.6422816693782807, 'bleu': 19.1729, 'gen_len': 9.5616}




 43%|████▎     | 192/449 [1:12:10<1:25:33, 19.98s/it]

For epoch 196: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.95batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.006390763770380035, 'test_loss': 0.6280773714184761, 'bleu': 19.5695, 'gen_len': 9.589}




 43%|████▎     | 193/449 [1:12:30<1:25:24, 20.02s/it]

For epoch 197: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.006056273049425061, 'test_loss': 0.6526578038930893, 'bleu': 19.8483, 'gen_len': 9.5411}




 43%|████▎     | 194/449 [1:12:50<1:25:28, 20.11s/it]

For epoch 198: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.06batches/s]



Metrics: {'train_loss': 0.006255105734098612, 'test_loss': 0.6403055965900422, 'bleu': 16.4228, 'gen_len': 9.4932}




 43%|████▎     | 195/449 [1:13:10<1:24:54, 20.06s/it]

For epoch 199: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.00583520013180266, 'test_loss': 0.6352569982409477, 'bleu': 16.9841, 'gen_len': 9.2534}




 44%|████▎     | 196/449 [1:13:30<1:24:28, 20.03s/it]

For epoch 200: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.90batches/s]



Metrics: {'train_loss': 0.006287931373751745, 'test_loss': 0.6203261718153954, 'bleu': 17.3967, 'gen_len': 9.9452}




 44%|████▍     | 197/449 [1:13:50<1:24:30, 20.12s/it]

For epoch 201: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  2.97batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.97batches/s]



Metrics: {'train_loss': 0.005807065856415869, 'test_loss': 0.6445825502276421, 'bleu': 15.9972, 'gen_len': 9.4521}




 44%|████▍     | 198/449 [1:14:10<1:24:00, 20.08s/it]

For epoch 202: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.006413693091173361, 'test_loss': 0.6295246839523315, 'bleu': 15.8637, 'gen_len': 9.6301}




 44%|████▍     | 199/449 [1:14:32<1:25:27, 20.51s/it]

For epoch 203: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.006222178733612343, 'test_loss': 0.6275901600718499, 'bleu': 15.5135, 'gen_len': 9.6849}




 45%|████▍     | 200/449 [1:14:54<1:27:28, 21.08s/it]

For epoch 204: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.0063631210118395886, 'test_loss': 0.620643812417984, 'bleu': 18.822, 'gen_len': 10.0205}




 45%|████▍     | 201/449 [1:15:17<1:28:39, 21.45s/it]

For epoch 205: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.006801167468926529, 'test_loss': 0.6171383157372474, 'bleu': 15.8256, 'gen_len': 9.6164}




 45%|████▍     | 202/449 [1:15:39<1:29:00, 21.62s/it]

For epoch 206: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.42batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.77batches/s]



Metrics: {'train_loss': 0.0062879685244363985, 'test_loss': 0.6249186746776104, 'bleu': 17.8167, 'gen_len': 9.6027}




 45%|████▌     | 203/449 [1:16:04<1:33:31, 22.81s/it]

For epoch 207: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.00602366314723906, 'test_loss': 0.6249963089823722, 'bleu': 18.6648, 'gen_len': 9.6849}




 45%|████▌     | 204/449 [1:16:27<1:32:37, 22.68s/it]

For epoch 208: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.006617263828336102, 'test_loss': 0.6173440784215927, 'bleu': 18.8181, 'gen_len': 9.5205}




 46%|████▌     | 205/449 [1:16:49<1:32:24, 22.72s/it]

For epoch 209: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.006290939931825894, 'test_loss': 0.6154206536710263, 'bleu': 17.2715, 'gen_len': 9.6918}




 46%|████▌     | 206/449 [1:17:12<1:31:46, 22.66s/it]

For epoch 210: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.98batches/s]



Metrics: {'train_loss': 0.006646397551975962, 'test_loss': 0.6131063863635063, 'bleu': 18.5513, 'gen_len': 9.2808}




 46%|████▌     | 207/449 [1:17:35<1:31:56, 22.80s/it]

For epoch 211: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.006084435809653525, 'test_loss': 0.6156911239027977, 'bleu': 17.0143, 'gen_len': 9.5822}




 46%|████▋     | 208/449 [1:17:58<1:31:12, 22.71s/it]

For epoch 212: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.18batches/s]



Metrics: {'train_loss': 0.006277245628397639, 'test_loss': 0.6072412982583046, 'bleu': 20.214, 'gen_len': 9.5685}




 47%|████▋     | 209/449 [1:18:20<1:30:57, 22.74s/it]

For epoch 213: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.15batches/s]



Metrics: {'train_loss': 0.006723177539775285, 'test_loss': 0.611415559053421, 'bleu': 16.9017, 'gen_len': 9.8151}




 47%|████▋     | 210/449 [1:18:44<1:31:47, 23.04s/it]

For epoch 214: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.13batches/s]



Metrics: {'train_loss': 0.005909713061821715, 'test_loss': 0.6300215274095535, 'bleu': 18.9501, 'gen_len': 9.774}




 47%|████▋     | 211/449 [1:19:08<1:32:31, 23.32s/it]

For epoch 215: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.0064808597856360235, 'test_loss': 0.6148046031594276, 'bleu': 16.7016, 'gen_len': 9.5753}




 47%|████▋     | 212/449 [1:19:30<1:30:32, 22.92s/it]

For epoch 216: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.00620210359281883, 'test_loss': 0.6257188424468041, 'bleu': 18.4451, 'gen_len': 9.7808}




 47%|████▋     | 213/449 [1:19:52<1:29:31, 22.76s/it]

For epoch 217: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.006720936056453644, 'test_loss': 0.619702224433422, 'bleu': 18.3959, 'gen_len': 9.7329}




 48%|████▊     | 214/449 [1:20:15<1:28:53, 22.69s/it]

For epoch 218: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.006577950504767459, 'test_loss': 0.6076844021677971, 'bleu': 18.5411, 'gen_len': 9.6096}




 48%|████▊     | 215/449 [1:20:37<1:27:18, 22.38s/it]

For epoch 219: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.006264011344372681, 'test_loss': 0.6118568763136863, 'bleu': 20.2037, 'gen_len': 9.7808}




 48%|████▊     | 216/449 [1:20:59<1:26:34, 22.30s/it]

For epoch 220: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.006619071582241393, 'test_loss': 0.6050727799534797, 'bleu': 18.9829, 'gen_len': 9.5548}




 48%|████▊     | 217/449 [1:21:20<1:24:51, 21.95s/it]

For epoch 221: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.006148489430637621, 'test_loss': 0.61122952029109, 'bleu': 17.6935, 'gen_len': 9.3836}




 49%|████▊     | 218/449 [1:21:42<1:25:15, 22.15s/it]

For epoch 222: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.12batches/s]



Metrics: {'train_loss': 0.005816502144514788, 'test_loss': 0.6152161836624146, 'bleu': 19.5263, 'gen_len': 9.4521}




 49%|████▉     | 219/449 [1:22:06<1:25:56, 22.42s/it]

For epoch 223: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.005748947622345351, 'test_loss': 0.618658323585987, 'bleu': 16.5844, 'gen_len': 9.5}




 49%|████▉     | 220/449 [1:22:28<1:25:32, 22.41s/it]

For epoch 224: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.005869165738681104, 'test_loss': 0.6125935703516007, 'bleu': 19.752, 'gen_len': 9.7671}




 49%|████▉     | 221/449 [1:22:50<1:25:17, 22.44s/it]

For epoch 225: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.0062658970880254015, 'test_loss': 0.615341791510582, 'bleu': 17.584, 'gen_len': 9.8014}




 49%|████▉     | 222/449 [1:23:12<1:23:45, 22.14s/it]

For epoch 226: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.006270143470340749, 'test_loss': 0.6159151867032051, 'bleu': 19.2532, 'gen_len': 9.9247}




 50%|████▉     | 223/449 [1:23:34<1:23:50, 22.26s/it]

For epoch 227: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.007090855900925107, 'test_loss': 0.6251556664705277, 'bleu': 16.8923, 'gen_len': 9.8288}




 50%|████▉     | 224/449 [1:23:56<1:22:53, 22.10s/it]

For epoch 228: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.0061919169369857845, 'test_loss': 0.630264376103878, 'bleu': 17.7243, 'gen_len': 9.6301}




 50%|█████     | 225/449 [1:24:18<1:22:15, 22.04s/it]

For epoch 229: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.005910671111650583, 'test_loss': 0.6321415677666664, 'bleu': 19.5032, 'gen_len': 10.0411}




 50%|█████     | 226/449 [1:24:40<1:22:00, 22.07s/it]

For epoch 230: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.005604820263503891, 'test_loss': 0.6305096954107284, 'bleu': 19.2403, 'gen_len': 9.8699}




 51%|█████     | 227/449 [1:25:02<1:21:35, 22.05s/it]

For epoch 231: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.006768758391688873, 'test_loss': 0.6094621136784554, 'bleu': 18.5604, 'gen_len': 9.8151}




 51%|█████     | 228/449 [1:25:25<1:21:32, 22.14s/it]

For epoch 232: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.15batches/s]



Metrics: {'train_loss': 0.005648529926519387, 'test_loss': 0.616222807765007, 'bleu': 19.3392, 'gen_len': 9.5959}




 51%|█████     | 229/449 [1:25:47<1:21:46, 22.30s/it]

For epoch 233: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.006098478370936724, 'test_loss': 0.6230017840862274, 'bleu': 18.7878, 'gen_len': 9.5411}




 51%|█████     | 230/449 [1:26:09<1:21:19, 22.28s/it]

For epoch 234: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.00601056911812232, 'test_loss': 0.6042679086327553, 'bleu': 18.5478, 'gen_len': 9.5753}




 51%|█████▏    | 231/449 [1:26:32<1:21:42, 22.49s/it]

For epoch 235: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.006211358468934167, 'test_loss': 0.6200787290930748, 'bleu': 17.9973, 'gen_len': 9.5205}




 52%|█████▏    | 232/449 [1:26:54<1:20:23, 22.23s/it]

For epoch 236: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.0058362238881428065, 'test_loss': 0.6258792921900749, 'bleu': 17.7419, 'gen_len': 9.3562}




 52%|█████▏    | 233/449 [1:27:17<1:20:27, 22.35s/it]

For epoch 237: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.005931672038192429, 'test_loss': 0.6228178173303605, 'bleu': 17.937, 'gen_len': 9.5959}




 52%|█████▏    | 234/449 [1:27:39<1:19:46, 22.26s/it]

For epoch 238: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.99batches/s]



Metrics: {'train_loss': 0.005886389896637056, 'test_loss': 0.6017428085207939, 'bleu': 17.7223, 'gen_len': 9.6438}




 52%|█████▏    | 235/449 [1:28:02<1:20:38, 22.61s/it]

For epoch 239: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.006211573718024827, 'test_loss': 0.6118479132652282, 'bleu': 17.8712, 'gen_len': 9.6096}




 53%|█████▎    | 236/449 [1:28:24<1:19:08, 22.29s/it]

For epoch 240: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.005462946097661809, 'test_loss': 0.6123753726482392, 'bleu': 20.4441, 'gen_len': 9.5753}




 53%|█████▎    | 237/449 [1:28:46<1:18:22, 22.18s/it]

For epoch 241: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.005573474055315118, 'test_loss': 0.6113073706626893, 'bleu': 20.1461, 'gen_len': 9.8082}




 53%|█████▎    | 238/449 [1:29:08<1:18:38, 22.36s/it]

For epoch 242: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.005739042606409185, 'test_loss': 0.6191768422722816, 'bleu': 18.8157, 'gen_len': 9.5753}




 53%|█████▎    | 239/449 [1:29:31<1:18:37, 22.46s/it]

For epoch 243: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.005223003289912168, 'test_loss': 0.6238937109708786, 'bleu': 19.0911, 'gen_len': 9.7329}




 53%|█████▎    | 240/449 [1:29:53<1:17:38, 22.29s/it]

For epoch 244: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.005741426173211416, 'test_loss': 0.6153867527842521, 'bleu': 20.2841, 'gen_len': 9.589}




 54%|█████▎    | 241/449 [1:30:15<1:16:38, 22.11s/it]

For epoch 245: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.005201519574833716, 'test_loss': 0.6168542847037315, 'bleu': 21.2418, 'gen_len': 9.5479}




 54%|█████▍    | 242/449 [1:30:40<1:19:04, 22.92s/it]

For epoch 246: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.005563015521435839, 'test_loss': 0.6146748855710029, 'bleu': 21.0018, 'gen_len': 9.8014}




 54%|█████▍    | 243/449 [1:31:03<1:18:55, 22.99s/it]

For epoch 247: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.006269291917816168, 'test_loss': 0.6185217067599297, 'bleu': 20.1469, 'gen_len': 9.6027}




 54%|█████▍    | 244/449 [1:31:25<1:18:21, 22.94s/it]

For epoch 248: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.005900808206827539, 'test_loss': 0.6107195734977722, 'bleu': 18.4802, 'gen_len': 9.4658}




 55%|█████▍    | 245/449 [1:31:48<1:17:34, 22.81s/it]

For epoch 249: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.005577081365783404, 'test_loss': 0.5986434295773506, 'bleu': 19.7786, 'gen_len': 9.6781}




 55%|█████▍    | 246/449 [1:32:10<1:16:42, 22.67s/it]

For epoch 250: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.006441273766274496, 'test_loss': 0.610493828356266, 'bleu': 18.788, 'gen_len': 9.6712}




 55%|█████▌    | 247/449 [1:32:34<1:17:02, 22.88s/it]

For epoch 251: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.006408923305571079, 'test_loss': 0.6060579404234886, 'bleu': 19.7324, 'gen_len': 9.5548}




 55%|█████▌    | 248/449 [1:32:55<1:15:18, 22.48s/it]

For epoch 252: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.00638327934131844, 'test_loss': 0.6102750174701214, 'bleu': 19.2115, 'gen_len': 9.4247}




 55%|█████▌    | 249/449 [1:33:17<1:14:40, 22.40s/it]

For epoch 253: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.006076691609739167, 'test_loss': 0.6096532329916954, 'bleu': 19.5156, 'gen_len': 9.5959}




 56%|█████▌    | 250/449 [1:33:39<1:13:48, 22.25s/it]

For epoch 254: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.005985717326061937, 'test_loss': 0.6036929786205292, 'bleu': 20.1895, 'gen_len': 9.5822}




 56%|█████▌    | 251/449 [1:34:01<1:13:00, 22.12s/it]

For epoch 255: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.006120793617943801, 'test_loss': 0.6061703771352768, 'bleu': 20.5674, 'gen_len': 9.6096}




 56%|█████▌    | 252/449 [1:34:23<1:12:45, 22.16s/it]

For epoch 256: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.00546574419954928, 'test_loss': 0.6078127831220627, 'bleu': 20.8006, 'gen_len': 9.6781}




 56%|█████▋    | 253/449 [1:34:46<1:12:44, 22.27s/it]

For epoch 257: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.005598054304378244, 'test_loss': 0.6096767969429493, 'bleu': 21.6858, 'gen_len': 9.5068}




 57%|█████▋    | 254/449 [1:35:10<1:14:06, 22.80s/it]

For epoch 258: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.0053477780950232975, 'test_loss': 0.6170525580644608, 'bleu': 20.021, 'gen_len': 9.6438}




 57%|█████▋    | 255/449 [1:35:33<1:13:42, 22.80s/it]

For epoch 259: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.49batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.005265989948472962, 'test_loss': 0.6304920293390751, 'bleu': 19.3902, 'gen_len': 9.7603}




 57%|█████▋    | 256/449 [1:35:56<1:13:48, 22.95s/it]

For epoch 260: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.005593861578168665, 'test_loss': 0.631471911072731, 'bleu': 18.4532, 'gen_len': 9.4452}




 57%|█████▋    | 257/449 [1:36:19<1:13:00, 22.82s/it]

For epoch 261: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.15batches/s]



Metrics: {'train_loss': 0.005802378149294272, 'test_loss': 0.6187061265110969, 'bleu': 20.7677, 'gen_len': 9.3904}




 57%|█████▋    | 258/449 [1:36:42<1:12:55, 22.91s/it]

For epoch 262: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.00561089685652405, 'test_loss': 0.6199517823755741, 'bleu': 21.2933, 'gen_len': 9.1301}




 58%|█████▊    | 259/449 [1:37:05<1:13:15, 23.13s/it]

For epoch 263: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.005614021355740544, 'test_loss': 0.6200786739587784, 'bleu': 19.3986, 'gen_len': 9.4178}




 58%|█████▊    | 260/449 [1:37:28<1:12:00, 22.86s/it]

For epoch 264: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.005910205052847542, 'test_loss': 0.6062708847224713, 'bleu': 21.4104, 'gen_len': 9.4041}




 58%|█████▊    | 261/449 [1:37:50<1:11:09, 22.71s/it]

For epoch 265: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.005940632238734241, 'test_loss': 0.6167660258710385, 'bleu': 21.6235, 'gen_len': 9.3904}




 58%|█████▊    | 262/449 [1:38:13<1:11:19, 22.88s/it]

For epoch 266: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.00593504200621349, 'test_loss': 0.6251007162034512, 'bleu': 19.7939, 'gen_len': 9.2603}




 59%|█████▊    | 263/449 [1:38:37<1:12:02, 23.24s/it]

For epoch 267: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.20batches/s]



Metrics: {'train_loss': 0.0061766636989465575, 'test_loss': 0.6233701847493649, 'bleu': 21.4783, 'gen_len': 9.411}




 59%|█████▉    | 264/449 [1:39:01<1:11:51, 23.31s/it]

For epoch 268: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.005880159479225191, 'test_loss': 0.6218682944774627, 'bleu': 20.2454, 'gen_len': 9.4795}




 59%|█████▉    | 265/449 [1:39:23<1:10:34, 23.01s/it]

For epoch 269: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.00542269950085206, 'test_loss': 0.6256695717573166, 'bleu': 20.0682, 'gen_len': 9.3699}




 59%|█████▉    | 266/449 [1:39:45<1:08:51, 22.58s/it]

For epoch 270: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.005892054041529574, 'test_loss': 0.625434336066246, 'bleu': 20.3834, 'gen_len': 9.4589}




 59%|█████▉    | 267/449 [1:40:06<1:07:34, 22.28s/it]

For epoch 271: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.005672131199389696, 'test_loss': 0.608236075937748, 'bleu': 20.9667, 'gen_len': 9.6986}




 60%|█████▉    | 268/449 [1:40:28<1:06:44, 22.12s/it]

For epoch 272: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.005585104232744836, 'test_loss': 0.6371930778026581, 'bleu': 20.3529, 'gen_len': 9.8219}




 60%|█████▉    | 269/449 [1:40:49<1:05:32, 21.84s/it]

For epoch 273: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.005901679930436175, 'test_loss': 0.6206578597426414, 'bleu': 20.1138, 'gen_len': 9.5479}




 60%|██████    | 270/449 [1:41:10<1:04:35, 21.65s/it]

For epoch 274: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.0058140170976247, 'test_loss': 0.6308314383029938, 'bleu': 19.7197, 'gen_len': 9.6507}




 60%|██████    | 271/449 [1:41:32<1:04:07, 21.61s/it]

For epoch 275: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.0056110285307712305, 'test_loss': 0.6204710960388183, 'bleu': 20.3651, 'gen_len': 9.6233}




 61%|██████    | 272/449 [1:41:53<1:03:17, 21.45s/it]

For epoch 276: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.005196372061263679, 'test_loss': 0.6154263019561768, 'bleu': 20.5297, 'gen_len': 9.9384}




 61%|██████    | 273/449 [1:42:14<1:02:49, 21.42s/it]

For epoch 277: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.006081709175416064, 'test_loss': 0.6241504073143005, 'bleu': 19.0063, 'gen_len': 9.3699}




 61%|██████    | 274/449 [1:42:37<1:03:10, 21.66s/it]

For epoch 278: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.31batches/s]



Metrics: {'train_loss': 0.00587000434355038, 'test_loss': 0.6067283391952515, 'bleu': 20.7298, 'gen_len': 9.4521}




 61%|██████    | 275/449 [1:42:59<1:03:35, 21.93s/it]

For epoch 279: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.005436277320812933, 'test_loss': 0.6212701216340065, 'bleu': 20.7936, 'gen_len': 9.8219}




 61%|██████▏   | 276/449 [1:43:22<1:03:39, 22.08s/it]

For epoch 280: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.39batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.69batches/s]



Metrics: {'train_loss': 0.006078947585348676, 'test_loss': 0.6176783531904221, 'bleu': 19.6968, 'gen_len': 10.1164}




 62%|██████▏   | 277/449 [1:43:48<1:06:42, 23.27s/it]

For epoch 281: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.12batches/s]



Metrics: {'train_loss': 0.006067163180332721, 'test_loss': 0.6110443457961082, 'bleu': 18.9621, 'gen_len': 9.637}




 62%|██████▏   | 278/449 [1:44:11<1:06:45, 23.42s/it]

For epoch 282: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.32batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:07<00:00,  1.42batches/s]



Metrics: {'train_loss': 0.005722938156573147, 'test_loss': 0.6152283072471618, 'bleu': 20.6348, 'gen_len': 9.6301}




 62%|██████▏   | 279/449 [1:44:39<1:10:16, 24.80s/it]

For epoch 283: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.99batches/s]



Metrics: {'train_loss': 0.0054624008139731684, 'test_loss': 0.6246601089835166, 'bleu': 19.3483, 'gen_len': 9.6507}




 62%|██████▏   | 280/449 [1:45:04<1:09:22, 24.63s/it]

For epoch 284: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.005357607489270045, 'test_loss': 0.6147802591323852, 'bleu': 20.8434, 'gen_len': 9.4247}




 63%|██████▎   | 281/449 [1:45:27<1:07:56, 24.27s/it]

For epoch 285: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.38batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.71batches/s]



Metrics: {'train_loss': 0.005612790803197862, 'test_loss': 0.6157907620072365, 'bleu': 20.5611, 'gen_len': 9.7192}




 63%|██████▎   | 282/449 [1:45:54<1:09:23, 24.93s/it]

For epoch 286: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.93batches/s]



Metrics: {'train_loss': 0.005577866260598346, 'test_loss': 0.6119830563664437, 'bleu': 20.8057, 'gen_len': 9.7329}




 63%|██████▎   | 283/449 [1:46:18<1:08:19, 24.69s/it]

For epoch 287: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.46batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.86batches/s]



Metrics: {'train_loss': 0.0055847588256455775, 'test_loss': 0.6117132022976876, 'bleu': 19.3252, 'gen_len': 9.5753}




 63%|██████▎   | 284/449 [1:46:43<1:08:44, 24.99s/it]

For epoch 288: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.35batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.005115722459400209, 'test_loss': 0.6111602693796158, 'bleu': 19.4401, 'gen_len': 9.7192}




 63%|██████▎   | 285/449 [1:47:08<1:08:21, 25.01s/it]

For epoch 289: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.33batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.06batches/s]



Metrics: {'train_loss': 0.005424932695970666, 'test_loss': 0.6224771335721015, 'bleu': 19.204, 'gen_len': 9.5342}




 64%|██████▎   | 286/449 [1:47:34<1:08:34, 25.24s/it]

For epoch 290: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.38batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.82batches/s]



Metrics: {'train_loss': 0.005288798543738156, 'test_loss': 0.6079769507050514, 'bleu': 19.7829, 'gen_len': 9.7329}




 64%|██████▍   | 287/449 [1:48:01<1:09:09, 25.62s/it]

For epoch 291: 


Train batch number 41: 100%|██████████| 41/41 [00:18<00:00,  2.25batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.63batches/s]



Metrics: {'train_loss': 0.00541955870349051, 'test_loss': 0.609216658771038, 'bleu': 20.0061, 'gen_len': 9.9178}




 64%|██████▍   | 288/449 [1:48:29<1:10:59, 26.45s/it]

For epoch 292: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.35batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.99batches/s]



Metrics: {'train_loss': 0.005574780573114389, 'test_loss': 0.603496091067791, 'bleu': 19.1421, 'gen_len': 9.5959}




 64%|██████▍   | 289/449 [1:48:55<1:10:23, 26.39s/it]

For epoch 293: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.37batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:07<00:00,  1.38batches/s]



Metrics: {'train_loss': 0.005371977370686647, 'test_loss': 0.6084339492022991, 'bleu': 19.9553, 'gen_len': 9.5753}




 65%|██████▍   | 290/449 [1:49:23<1:10:59, 26.79s/it]

For epoch 294: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.39batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.98batches/s]



Metrics: {'train_loss': 0.005336340489831367, 'test_loss': 0.6049511007964611, 'bleu': 21.2538, 'gen_len': 9.4589}




 65%|██████▍   | 291/449 [1:49:49<1:09:31, 26.40s/it]

For epoch 295: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.88batches/s]



Metrics: {'train_loss': 0.005418325552898572, 'test_loss': 0.6058187253773213, 'bleu': 21.3402, 'gen_len': 9.6164}




 65%|██████▌   | 292/449 [1:50:13<1:07:11, 25.68s/it]

For epoch 296: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.44batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.03batches/s]



Metrics: {'train_loss': 0.0053107766434550285, 'test_loss': 0.617085599899292, 'bleu': 19.7866, 'gen_len': 9.774}




 65%|██████▌   | 293/449 [1:50:37<1:06:06, 25.43s/it]

For epoch 297: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.09batches/s]



Metrics: {'train_loss': 0.005436259546180869, 'test_loss': 0.6154283702373504, 'bleu': 20.5247, 'gen_len': 9.6712}




 65%|██████▌   | 294/449 [1:51:02<1:05:01, 25.17s/it]

For epoch 298: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.45batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.17batches/s]



Metrics: {'train_loss': 0.005271567724554277, 'test_loss': 0.611473998427391, 'bleu': 18.4766, 'gen_len': 9.7192}




 66%|██████▌   | 295/449 [1:51:26<1:04:04, 24.97s/it]

For epoch 299: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.0057518299436196685, 'test_loss': 0.6053631715476513, 'bleu': 21.7111, 'gen_len': 9.9315}




 66%|██████▌   | 296/449 [1:51:53<1:04:41, 25.37s/it]

For epoch 300: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  2.00batches/s]



Metrics: {'train_loss': 0.005083606169536346, 'test_loss': 0.6162916824221611, 'bleu': 21.2416, 'gen_len': 9.6644}




 66%|██████▌   | 297/449 [1:52:18<1:03:47, 25.18s/it]

For epoch 301: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.005064447927175135, 'test_loss': 0.6256087586283684, 'bleu': 20.5749, 'gen_len': 9.4384}




 66%|██████▋   | 298/449 [1:52:40<1:01:13, 24.33s/it]

For epoch 302: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.45batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.18batches/s]



Metrics: {'train_loss': 0.0052582813120197235, 'test_loss': 0.6182479217648507, 'bleu': 20.1607, 'gen_len': 9.4041}




 67%|██████▋   | 299/449 [1:53:05<1:01:04, 24.43s/it]

For epoch 303: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.02batches/s]



Metrics: {'train_loss': 0.005529455539611418, 'test_loss': 0.6117510922253132, 'bleu': 18.9016, 'gen_len': 9.6781}




 67%|██████▋   | 300/449 [1:53:30<1:01:06, 24.60s/it]

For epoch 304: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.84batches/s]



Metrics: {'train_loss': 0.005796560211243426, 'test_loss': 0.6092315807938575, 'bleu': 19.4563, 'gen_len': 9.5068}




 67%|██████▋   | 301/449 [1:53:54<1:00:32, 24.55s/it]

For epoch 305: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.07batches/s]



Metrics: {'train_loss': 0.005197645625008679, 'test_loss': 0.6215727835893631, 'bleu': 19.5206, 'gen_len': 9.8904}




 67%|██████▋   | 302/449 [1:54:17<59:21, 24.23s/it]

For epoch 306: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.54batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.005294773591364302, 'test_loss': 0.6222190171480179, 'bleu': 21.1073, 'gen_len': 9.4521}




 67%|██████▋   | 303/449 [1:54:41<58:45, 24.14s/it]

For epoch 307: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.005503720030324851, 'test_loss': 0.6220301821827888, 'bleu': 20.8115, 'gen_len': 9.6438}




 68%|██████▊   | 304/449 [1:55:05<58:07, 24.05s/it]

For epoch 308: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.44batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.78batches/s]



Metrics: {'train_loss': 0.005515114199824449, 'test_loss': 0.6182497859001159, 'bleu': 20.8793, 'gen_len': 9.8219}




 68%|██████▊   | 305/449 [1:55:31<58:58, 24.57s/it]

For epoch 309: 


Train batch number 41: 100%|██████████| 41/41 [00:18<00:00,  2.24batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.59batches/s]



Metrics: {'train_loss': 0.005402204566408039, 'test_loss': 0.61879104077816, 'bleu': 19.2673, 'gen_len': 9.8562}




 68%|██████▊   | 306/449 [1:55:59<1:00:50, 25.53s/it]

For epoch 310: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.16batches/s]



Metrics: {'train_loss': 0.005497448413246652, 'test_loss': 0.6107971385121346, 'bleu': 18.6029, 'gen_len': 9.8356}




 68%|██████▊   | 307/449 [1:56:23<59:45, 25.25s/it]

For epoch 311: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.39batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.00batches/s]



Metrics: {'train_loss': 0.005757493966418069, 'test_loss': 0.6079165279865265, 'bleu': 20.2, 'gen_len': 9.6712}




 69%|██████▊   | 308/449 [1:56:49<59:23, 25.27s/it]

For epoch 312: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.17batches/s]



Metrics: {'train_loss': 0.005452504018094481, 'test_loss': 0.6121256649494171, 'bleu': 17.9427, 'gen_len': 9.4178}




 69%|██████▉   | 309/449 [1:57:13<58:15, 24.97s/it]

For epoch 313: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.34batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.94batches/s]



Metrics: {'train_loss': 0.005573575464772947, 'test_loss': 0.6064021170139313, 'bleu': 18.6929, 'gen_len': 9.4795}




 69%|██████▉   | 310/449 [1:57:39<58:20, 25.18s/it]

For epoch 314: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.54batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.75batches/s]



Metrics: {'train_loss': 0.005792282933437425, 'test_loss': 0.6259278506040573, 'bleu': 19.4498, 'gen_len': 9.726}




 69%|██████▉   | 311/449 [1:58:04<58:18, 25.35s/it]

For epoch 315: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.00563583569885118, 'test_loss': 0.5952006876468658, 'bleu': 17.4809, 'gen_len': 9.7877}




 69%|██████▉   | 312/449 [1:58:28<57:02, 24.98s/it]

For epoch 316: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.005010281467991994, 'test_loss': 0.6110283672809601, 'bleu': 16.8798, 'gen_len': 9.4726}




 70%|██████▉   | 313/449 [1:58:52<55:25, 24.45s/it]

For epoch 317: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.42batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.64batches/s]



Metrics: {'train_loss': 0.004907714035475581, 'test_loss': 0.6117600709199905, 'bleu': 19.1569, 'gen_len': 9.4863}




 70%|██████▉   | 314/449 [1:59:18<56:15, 25.00s/it]

For epoch 318: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.06batches/s]



Metrics: {'train_loss': 0.005170127575671891, 'test_loss': 0.6302130959928036, 'bleu': 17.4588, 'gen_len': 9.4247}




 70%|███████   | 315/449 [1:59:42<55:21, 24.79s/it]

For epoch 319: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.07batches/s]



Metrics: {'train_loss': 0.005378114590534895, 'test_loss': 0.6093520805239677, 'bleu': 20.5156, 'gen_len': 9.6438}




 70%|███████   | 316/449 [2:00:06<54:16, 24.48s/it]

For epoch 320: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.23batches/s]



Metrics: {'train_loss': 0.005565198845366334, 'test_loss': 0.6056035682559013, 'bleu': 19.1875, 'gen_len': 9.9795}




 71%|███████   | 317/449 [2:00:29<53:02, 24.11s/it]

For epoch 321: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.005222496392046351, 'test_loss': 0.6157307118177414, 'bleu': 18.9822, 'gen_len': 9.7945}




 71%|███████   | 318/449 [2:00:52<51:55, 23.78s/it]

For epoch 322: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.005360883007552929, 'test_loss': 0.6167121574282646, 'bleu': 17.882, 'gen_len': 9.5959}




 71%|███████   | 319/449 [2:01:16<51:38, 23.84s/it]

For epoch 323: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.005084891383331723, 'test_loss': 0.6178382888436318, 'bleu': 18.8711, 'gen_len': 9.7945}




 71%|███████▏  | 320/449 [2:01:39<50:46, 23.62s/it]

For epoch 324: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.005050574713273019, 'test_loss': 0.6136833921074867, 'bleu': 19.7402, 'gen_len': 9.7466}




 71%|███████▏  | 321/449 [2:02:02<49:34, 23.24s/it]

For epoch 325: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.005058371518715852, 'test_loss': 0.6101640969514847, 'bleu': 18.8482, 'gen_len': 9.5959}




 72%|███████▏  | 322/449 [2:02:25<48:56, 23.13s/it]

For epoch 326: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.005292892691724729, 'test_loss': 0.6177493378520011, 'bleu': 18.8231, 'gen_len': 9.589}




 72%|███████▏  | 323/449 [2:02:47<48:17, 22.99s/it]

For epoch 327: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.28batches/s]



Metrics: {'train_loss': 0.004921582888062225, 'test_loss': 0.6231163799762726, 'bleu': 18.8908, 'gen_len': 9.8219}




 72%|███████▏  | 324/449 [2:03:10<47:54, 22.99s/it]

For epoch 328: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.79batches/s]



Metrics: {'train_loss': 0.004794167654259448, 'test_loss': 0.6120824903249741, 'bleu': 19.4561, 'gen_len': 9.6918}




 72%|███████▏  | 325/449 [2:03:35<48:21, 23.40s/it]

For epoch 329: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.38batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.85batches/s]



Metrics: {'train_loss': 0.0050833775610776575, 'test_loss': 0.6054217353463173, 'bleu': 19.843, 'gen_len': 9.5479}




 73%|███████▎  | 326/449 [2:04:01<49:45, 24.27s/it]

For epoch 330: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.42batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.05batches/s]



Metrics: {'train_loss': 0.005488624006918654, 'test_loss': 0.6201694220304489, 'bleu': 18.1892, 'gen_len': 9.5342}




 73%|███████▎  | 327/449 [2:04:26<49:46, 24.48s/it]

For epoch 331: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.005844370362621437, 'test_loss': 0.6176627784967422, 'bleu': 19.4938, 'gen_len': 9.3425}




 73%|███████▎  | 328/449 [2:04:48<48:10, 23.89s/it]

For epoch 332: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.005001616726697582, 'test_loss': 0.6209966257214546, 'bleu': 19.8952, 'gen_len': 10.0411}




 73%|███████▎  | 329/449 [2:05:11<47:01, 23.51s/it]

For epoch 333: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.005436623426404123, 'test_loss': 0.6166266053915024, 'bleu': 19.9174, 'gen_len': 9.7329}




 73%|███████▎  | 330/449 [2:05:34<46:25, 23.41s/it]

For epoch 334: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.0056531468985556825, 'test_loss': 0.6098831206560135, 'bleu': 21.0343, 'gen_len': 9.5479}




 74%|███████▎  | 331/449 [2:05:57<45:46, 23.27s/it]

For epoch 335: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.006004312972924332, 'test_loss': 0.5943101987242698, 'bleu': 18.8904, 'gen_len': 9.8014}




 74%|███████▍  | 332/449 [2:06:19<44:46, 22.97s/it]

For epoch 336: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.005102030745503016, 'test_loss': 0.6148408621549606, 'bleu': 18.8232, 'gen_len': 10.0274}




 74%|███████▍  | 333/449 [2:06:42<44:05, 22.81s/it]

For epoch 337: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.005508925447740206, 'test_loss': 0.618354594707489, 'bleu': 16.9663, 'gen_len': 9.5411}




 74%|███████▍  | 334/449 [2:07:04<43:32, 22.72s/it]

For epoch 338: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.17batches/s]



Metrics: {'train_loss': 0.004938620545833212, 'test_loss': 0.6203341409564018, 'bleu': 17.5936, 'gen_len': 9.8904}




 75%|███████▍  | 335/449 [2:07:27<43:15, 22.77s/it]

For epoch 339: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.0054172021748566225, 'test_loss': 0.6206043392419816, 'bleu': 18.1251, 'gen_len': 9.6781}




 75%|███████▍  | 336/449 [2:07:50<42:49, 22.74s/it]

For epoch 340: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.005080511021177943, 'test_loss': 0.61743673235178, 'bleu': 19.3256, 'gen_len': 9.6575}




 75%|███████▌  | 337/449 [2:08:12<42:19, 22.68s/it]

For epoch 341: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.005395389374958851, 'test_loss': 0.6078358575701713, 'bleu': 18.3499, 'gen_len': 9.8904}




 75%|███████▌  | 338/449 [2:08:35<41:51, 22.63s/it]

For epoch 342: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.005947921226346274, 'test_loss': 0.6075932428240776, 'bleu': 20.1088, 'gen_len': 9.9315}




 76%|███████▌  | 339/449 [2:08:58<41:39, 22.72s/it]

For epoch 343: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.005116194196999437, 'test_loss': 0.614729805290699, 'bleu': 19.7875, 'gen_len': 9.8425}




 76%|███████▌  | 340/449 [2:09:21<41:36, 22.90s/it]

For epoch 344: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.005656416329169055, 'test_loss': 0.6035150617361069, 'bleu': 21.367, 'gen_len': 9.637}




 76%|███████▌  | 341/449 [2:09:43<40:52, 22.71s/it]

For epoch 345: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.005066274415429045, 'test_loss': 0.6108603224158287, 'bleu': 20.7899, 'gen_len': 9.6575}




 76%|███████▌  | 342/449 [2:10:06<40:26, 22.67s/it]

For epoch 346: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.005415162071585655, 'test_loss': 0.6179350852966309, 'bleu': 19.8973, 'gen_len': 9.3767}




 76%|███████▋  | 343/449 [2:10:29<40:06, 22.70s/it]

For epoch 347: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.005544820794744826, 'test_loss': 0.6009419903159141, 'bleu': 20.0093, 'gen_len': 9.7329}




 77%|███████▋  | 344/449 [2:10:51<39:34, 22.62s/it]

For epoch 348: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.0056320370015938105, 'test_loss': 0.6230113476514816, 'bleu': 18.6977, 'gen_len': 9.5274}




 77%|███████▋  | 345/449 [2:11:14<39:18, 22.68s/it]

For epoch 349: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.005232962544030714, 'test_loss': 0.609942427277565, 'bleu': 18.8347, 'gen_len': 9.9589}




 77%|███████▋  | 346/449 [2:11:37<38:54, 22.67s/it]

For epoch 350: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.004913514820722545, 'test_loss': 0.6098272114992142, 'bleu': 19.9183, 'gen_len': 9.4658}




 77%|███████▋  | 347/449 [2:12:00<38:45, 22.80s/it]

For epoch 351: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.005193402564789101, 'test_loss': 0.6066893205046654, 'bleu': 20.1424, 'gen_len': 9.4795}




 78%|███████▊  | 348/449 [2:12:23<38:23, 22.80s/it]

For epoch 352: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:06<00:00,  1.48batches/s]



Metrics: {'train_loss': 0.005126825215794691, 'test_loss': 0.6034798592329025, 'bleu': 19.4732, 'gen_len': 9.6644}




 78%|███████▊  | 349/449 [2:12:49<39:37, 23.78s/it]

For epoch 353: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.44batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.95batches/s]



Metrics: {'train_loss': 0.005429132370187379, 'test_loss': 0.6120451554656029, 'bleu': 19.745, 'gen_len': 9.5822}




 78%|███████▊  | 350/449 [2:13:14<39:55, 24.19s/it]

For epoch 354: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.43batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.005046519294676439, 'test_loss': 0.6019822046160698, 'bleu': 20.3071, 'gen_len': 9.3904}




 78%|███████▊  | 351/449 [2:13:39<39:46, 24.35s/it]

For epoch 355: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.45batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.00480946831734533, 'test_loss': 0.6061712592840195, 'bleu': 20.5876, 'gen_len': 9.6096}




 78%|███████▊  | 352/449 [2:14:03<39:27, 24.41s/it]

For epoch 356: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.004762501893110755, 'test_loss': 0.6072088062763215, 'bleu': 18.8331, 'gen_len': 9.9452}




 79%|███████▊  | 353/449 [2:14:26<38:27, 24.03s/it]

For epoch 357: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.0054873092236315335, 'test_loss': 0.5954445332288743, 'bleu': 19.553, 'gen_len': 9.5616}




 79%|███████▉  | 354/449 [2:14:49<37:15, 23.54s/it]

For epoch 358: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.005297218828473422, 'test_loss': 0.5975654721260071, 'bleu': 19.585, 'gen_len': 9.4589}




 79%|███████▉  | 355/449 [2:15:11<36:07, 23.05s/it]

For epoch 359: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.005606334128348929, 'test_loss': 0.5898280084133148, 'bleu': 19.8399, 'gen_len': 9.6781}




 79%|███████▉  | 356/449 [2:15:33<35:19, 22.79s/it]

For epoch 360: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.005198183123067748, 'test_loss': 0.5957319095730782, 'bleu': 20.9505, 'gen_len': 9.7397}




 80%|███████▉  | 357/449 [2:15:55<34:54, 22.76s/it]

For epoch 361: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.005422340177862746, 'test_loss': 0.6014988943934441, 'bleu': 21.5192, 'gen_len': 9.9658}




 80%|███████▉  | 358/449 [2:16:18<34:22, 22.66s/it]

For epoch 362: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.005021273425394078, 'test_loss': 0.6180232763290405, 'bleu': 21.6236, 'gen_len': 9.8014}




 80%|███████▉  | 359/449 [2:16:40<33:50, 22.57s/it]

For epoch 363: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.005393572855831646, 'test_loss': 0.6062642574310303, 'bleu': 20.4492, 'gen_len': 9.726}




 80%|████████  | 360/449 [2:17:02<33:01, 22.26s/it]

For epoch 364: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.004746444114461177, 'test_loss': 0.6118391841650009, 'bleu': 21.4488, 'gen_len': 9.5753}




 80%|████████  | 361/449 [2:17:24<32:29, 22.15s/it]

For epoch 365: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.005092207243007313, 'test_loss': 0.5959052950143814, 'bleu': 21.2714, 'gen_len': 9.6644}




 81%|████████  | 362/449 [2:17:45<31:57, 22.04s/it]

For epoch 366: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004800862436773392, 'test_loss': 0.6053279057145119, 'bleu': 20.3857, 'gen_len': 9.5411}




 81%|████████  | 363/449 [2:18:07<31:30, 21.99s/it]

For epoch 367: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.005396817887497202, 'test_loss': 0.6058319792151451, 'bleu': 19.0959, 'gen_len': 9.4315}




 81%|████████  | 364/449 [2:18:29<31:10, 22.01s/it]

For epoch 368: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.005250308282173625, 'test_loss': 0.6065140277147293, 'bleu': 20.5568, 'gen_len': 9.4795}




 81%|████████▏ | 365/449 [2:18:52<30:58, 22.12s/it]

For epoch 369: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.005205244501689222, 'test_loss': 0.6003870785236358, 'bleu': 19.3585, 'gen_len': 9.637}




 82%|████████▏ | 366/449 [2:19:14<30:37, 22.14s/it]

For epoch 370: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.005653388457509075, 'test_loss': 0.610932782292366, 'bleu': 18.7203, 'gen_len': 9.5822}




 82%|████████▏ | 367/449 [2:19:36<30:17, 22.17s/it]

For epoch 371: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.005418670582357885, 'test_loss': 0.6006886750459671, 'bleu': 20.1923, 'gen_len': 9.5}




 82%|████████▏ | 368/449 [2:19:58<29:47, 22.07s/it]

For epoch 372: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.0054462406267525585, 'test_loss': 0.594533483684063, 'bleu': 20.6268, 'gen_len': 9.6301}




 82%|████████▏ | 369/449 [2:20:20<29:24, 22.05s/it]

For epoch 373: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.005342695927360981, 'test_loss': 0.6007116168737412, 'bleu': 21.6003, 'gen_len': 9.5548}




 82%|████████▏ | 370/449 [2:20:41<28:39, 21.77s/it]

For epoch 374: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004917803737221331, 'test_loss': 0.6088912934064865, 'bleu': 19.4921, 'gen_len': 9.7329}




 83%|████████▎ | 371/449 [2:21:02<28:03, 21.58s/it]

For epoch 375: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.0052258349137335285, 'test_loss': 0.5994773045182228, 'bleu': 21.802, 'gen_len': 9.6644}




 83%|████████▎ | 372/449 [2:21:26<28:39, 22.33s/it]

For epoch 376: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004777900278341116, 'test_loss': 0.5964746072888374, 'bleu': 21.2775, 'gen_len': 9.6438}




 83%|████████▎ | 373/449 [2:21:47<27:37, 21.81s/it]

For epoch 377: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.0050084549201107244, 'test_loss': 0.583581292629242, 'bleu': 20.6964, 'gen_len': 9.5548}




 83%|████████▎ | 374/449 [2:22:08<26:58, 21.58s/it]

For epoch 378: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004708185394453566, 'test_loss': 0.5962032064795494, 'bleu': 20.9745, 'gen_len': 9.6438}




 84%|████████▎ | 375/449 [2:22:29<26:26, 21.44s/it]

For epoch 379: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004967049871594078, 'test_loss': 0.5957114338874817, 'bleu': 21.5067, 'gen_len': 9.5616}




 84%|████████▎ | 376/449 [2:22:50<25:46, 21.19s/it]

For epoch 380: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.004575590858599398, 'test_loss': 0.6017537400126457, 'bleu': 22.2122, 'gen_len': 9.4178}




 84%|████████▍ | 377/449 [2:23:12<25:59, 21.66s/it]

For epoch 381: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004824728256364058, 'test_loss': 0.6109475180506706, 'bleu': 19.3832, 'gen_len': 9.2055}




 84%|████████▍ | 378/449 [2:23:33<25:19, 21.40s/it]

For epoch 382: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.005003781225418717, 'test_loss': 0.6063303738832474, 'bleu': 20.2083, 'gen_len': 9.5137}




 84%|████████▍ | 379/449 [2:23:54<24:45, 21.22s/it]

For epoch 383: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004822933183210652, 'test_loss': 0.6060671076178551, 'bleu': 19.2554, 'gen_len': 9.4521}




 85%|████████▍ | 380/449 [2:24:15<24:24, 21.23s/it]

For epoch 384: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.005181789142647531, 'test_loss': 0.5938312441110611, 'bleu': 20.858, 'gen_len': 9.4795}




 85%|████████▍ | 381/449 [2:24:38<24:23, 21.53s/it]

For epoch 385: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.005370935234922643, 'test_loss': 0.5912054613232612, 'bleu': 19.8346, 'gen_len': 9.5137}




 85%|████████▌ | 382/449 [2:24:59<24:05, 21.57s/it]

For epoch 386: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004947581300767502, 'test_loss': 0.6005645588040351, 'bleu': 22.0702, 'gen_len': 9.3493}




 85%|████████▌ | 383/449 [2:25:21<23:50, 21.67s/it]

For epoch 387: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.0046695936215678, 'test_loss': 0.5996913850307465, 'bleu': 20.559, 'gen_len': 9.9247}




 86%|████████▌ | 384/449 [2:25:43<23:31, 21.72s/it]

For epoch 388: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.90batches/s]



Metrics: {'train_loss': 0.004828680809246513, 'test_loss': 0.5972690552473068, 'bleu': 20.1976, 'gen_len': 9.6301}




 86%|████████▌ | 385/449 [2:26:06<23:41, 22.21s/it]

For epoch 389: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.004873357963089536, 'test_loss': 0.5979078784584999, 'bleu': 20.5505, 'gen_len': 10.0068}




 86%|████████▌ | 386/449 [2:26:29<23:25, 22.31s/it]

For epoch 390: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.005046561763553721, 'test_loss': 0.6128540515899659, 'bleu': 19.2503, 'gen_len': 9.4521}




 86%|████████▌ | 387/449 [2:26:51<23:05, 22.35s/it]

For epoch 391: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.005105692086877619, 'test_loss': 0.6114754885435104, 'bleu': 20.624, 'gen_len': 9.3562}




 86%|████████▋ | 388/449 [2:27:14<22:48, 22.43s/it]

For epoch 392: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004758272626687114, 'test_loss': 0.6169388011097908, 'bleu': 21.1231, 'gen_len': 9.3219}




 87%|████████▋ | 389/449 [2:27:37<22:34, 22.57s/it]

For epoch 393: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.005512759418840088, 'test_loss': 0.6041030794382095, 'bleu': 20.178, 'gen_len': 9.5959}




 87%|████████▋ | 390/449 [2:27:59<22:04, 22.45s/it]

For epoch 394: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.005177165835914089, 'test_loss': 0.6030833259224891, 'bleu': 21.9046, 'gen_len': 9.8562}




 87%|████████▋ | 391/449 [2:28:21<21:40, 22.43s/it]

For epoch 395: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.004956198583652333, 'test_loss': 0.6088165313005447, 'bleu': 22.0888, 'gen_len': 9.5616}




 87%|████████▋ | 392/449 [2:28:44<21:17, 22.41s/it]

For epoch 396: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.01batches/s]



Metrics: {'train_loss': 0.004841805537935437, 'test_loss': 0.6084229126572609, 'bleu': 21.0234, 'gen_len': 9.6164}




 88%|████████▊ | 393/449 [2:29:07<21:04, 22.58s/it]

For epoch 397: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004665190927548016, 'test_loss': 0.6151680007576943, 'bleu': 20.4348, 'gen_len': 9.589}




 88%|████████▊ | 394/449 [2:29:30<20:54, 22.80s/it]

For epoch 398: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.0046753390158367596, 'test_loss': 0.6161228358745575, 'bleu': 20.6687, 'gen_len': 9.5}




 88%|████████▊ | 395/449 [2:29:53<20:40, 22.97s/it]

For epoch 399: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.0047868920444715315, 'test_loss': 0.6037960931658745, 'bleu': 19.4101, 'gen_len': 9.5068}




 88%|████████▊ | 396/449 [2:30:16<20:18, 23.00s/it]

For epoch 400: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004709360044926587, 'test_loss': 0.606260584294796, 'bleu': 22.5231, 'gen_len': 9.5685}




 88%|████████▊ | 397/449 [2:30:41<20:16, 23.40s/it]

For epoch 401: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.95batches/s]



Metrics: {'train_loss': 0.004673764984146124, 'test_loss': 0.6055139839649201, 'bleu': 20.5639, 'gen_len': 9.7055}




 89%|████████▊ | 398/449 [2:31:04<19:58, 23.50s/it]

For epoch 402: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004894915840406789, 'test_loss': 0.5990793168544769, 'bleu': 20.5065, 'gen_len': 9.5616}




 89%|████████▉ | 399/449 [2:31:27<19:24, 23.29s/it]

For epoch 403: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.26batches/s]



Metrics: {'train_loss': 0.0049523206416335775, 'test_loss': 0.5948103532195091, 'bleu': 19.5997, 'gen_len': 9.8425}




 89%|████████▉ | 400/449 [2:31:50<18:51, 23.09s/it]

For epoch 404: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004720601574063483, 'test_loss': 0.6031230792403222, 'bleu': 21.1224, 'gen_len': 9.7877}




 89%|████████▉ | 401/449 [2:32:12<18:12, 22.76s/it]

For epoch 405: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.00521623724276518, 'test_loss': 0.6041784286499023, 'bleu': 19.3197, 'gen_len': 9.6986}




 90%|████████▉ | 402/449 [2:32:34<17:46, 22.68s/it]

For epoch 406: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.22batches/s]



Metrics: {'train_loss': 0.00475534973467314, 'test_loss': 0.5985971674323082, 'bleu': 20.1864, 'gen_len': 9.7397}




 90%|████████▉ | 403/449 [2:32:57<17:21, 22.65s/it]

For epoch 407: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004617992797071432, 'test_loss': 0.6179586127400398, 'bleu': 20.4004, 'gen_len': 9.6301}




 90%|████████▉ | 404/449 [2:33:19<16:52, 22.49s/it]

For epoch 408: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.005173655896924618, 'test_loss': 0.5990425139665604, 'bleu': 20.7699, 'gen_len': 9.6438}




 90%|█████████ | 405/449 [2:33:41<16:22, 22.33s/it]

For epoch 409: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.004964183672422134, 'test_loss': 0.6081686332821846, 'bleu': 20.4923, 'gen_len': 9.8151}




 90%|█████████ | 406/449 [2:34:04<16:02, 22.39s/it]

For epoch 410: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004947291749597686, 'test_loss': 0.5935128092765808, 'bleu': 20.7423, 'gen_len': 10.0274}




 91%|█████████ | 407/449 [2:34:25<15:30, 22.16s/it]

For epoch 411: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004629309544703219, 'test_loss': 0.5935130804777146, 'bleu': 19.9867, 'gen_len': 10.1233}




 91%|█████████ | 408/449 [2:34:47<15:06, 22.11s/it]

For epoch 412: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.0044961973070734886, 'test_loss': 0.5958468168973923, 'bleu': 20.9379, 'gen_len': 9.9658}




 91%|█████████ | 409/449 [2:35:09<14:45, 22.13s/it]

For epoch 413: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004708884562161274, 'test_loss': 0.5930032953619957, 'bleu': 21.4786, 'gen_len': 9.8356}




 91%|█████████▏| 410/449 [2:35:31<14:23, 22.13s/it]

For epoch 414: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.22batches/s]



Metrics: {'train_loss': 0.004556212545849565, 'test_loss': 0.6001488119363785, 'bleu': 20.1735, 'gen_len': 9.6781}




 92%|█████████▏| 411/449 [2:35:55<14:16, 22.53s/it]

For epoch 415: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.21batches/s]



Metrics: {'train_loss': 0.00455357608805615, 'test_loss': 0.6057587921619415, 'bleu': 20.7171, 'gen_len': 9.6438}




 92%|█████████▏| 412/449 [2:36:17<13:49, 22.43s/it]

For epoch 416: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.005474001504253687, 'test_loss': 0.5951172396540642, 'bleu': 20.5781, 'gen_len': 9.7671}




 92%|█████████▏| 413/449 [2:36:40<13:27, 22.42s/it]

For epoch 417: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.005077799532289912, 'test_loss': 0.59512098133564, 'bleu': 20.9176, 'gen_len': 9.4726}




 92%|█████████▏| 414/449 [2:37:02<13:07, 22.49s/it]

For epoch 418: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.29batches/s]



Metrics: {'train_loss': 0.004946659390087716, 'test_loss': 0.595668613910675, 'bleu': 20.5333, 'gen_len': 9.5548}




 92%|█████████▏| 415/449 [2:37:25<12:44, 22.49s/it]

For epoch 419: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.0049318920241714255, 'test_loss': 0.6024597719311714, 'bleu': 21.0089, 'gen_len': 9.6164}




 93%|█████████▎| 416/449 [2:37:49<12:35, 22.89s/it]

For epoch 420: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.0050074268876370495, 'test_loss': 0.5842848226428032, 'bleu': 21.4185, 'gen_len': 9.7123}




 93%|█████████▎| 417/449 [2:38:11<12:08, 22.75s/it]

For epoch 421: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.0049724659212387915, 'test_loss': 0.5867202460765839, 'bleu': 21.1933, 'gen_len': 9.7671}




 93%|█████████▎| 418/449 [2:38:33<11:42, 22.65s/it]

For epoch 422: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.0047652602462643166, 'test_loss': 0.585623736679554, 'bleu': 21.2569, 'gen_len': 9.5616}




 93%|█████████▎| 419/449 [2:38:56<11:22, 22.74s/it]

For epoch 423: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.16batches/s]



Metrics: {'train_loss': 0.004980293749945193, 'test_loss': 0.5913448579609394, 'bleu': 21.9543, 'gen_len': 9.6575}




 94%|█████████▎| 420/449 [2:39:19<10:59, 22.73s/it]

For epoch 424: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004559279564291057, 'test_loss': 0.5918200552463532, 'bleu': 20.8959, 'gen_len': 9.6096}




 94%|█████████▍| 421/449 [2:39:43<10:47, 23.12s/it]

For epoch 425: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004628856254673404, 'test_loss': 0.5934399262070655, 'bleu': 20.3113, 'gen_len': 9.7534}




 94%|█████████▍| 422/449 [2:40:06<10:20, 22.98s/it]

For epoch 426: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004783141807221421, 'test_loss': 0.5937562093138695, 'bleu': 20.2538, 'gen_len': 9.7534}




 94%|█████████▍| 423/449 [2:40:28<09:52, 22.79s/it]

For epoch 427: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.004710982831921883, 'test_loss': 0.5866639375686645, 'bleu': 19.8146, 'gen_len': 10.0411}




 94%|█████████▍| 424/449 [2:40:50<09:25, 22.61s/it]

For epoch 428: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.00461496266525057, 'test_loss': 0.5859941214323043, 'bleu': 21.338, 'gen_len': 9.589}




 95%|█████████▍| 425/449 [2:41:13<09:02, 22.60s/it]

For epoch 429: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.0050272046895016255, 'test_loss': 0.5859708309173584, 'bleu': 21.3543, 'gen_len': 9.6712}




 95%|█████████▍| 426/449 [2:41:35<08:37, 22.52s/it]

For epoch 430: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.005178038939470198, 'test_loss': 0.5876839920878411, 'bleu': 20.0517, 'gen_len': 9.9315}




 95%|█████████▌| 427/449 [2:41:57<08:13, 22.43s/it]

For epoch 431: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004706344252066096, 'test_loss': 0.5899806022644043, 'bleu': 20.7199, 'gen_len': 9.8562}




 95%|█████████▌| 428/449 [2:42:20<07:49, 22.38s/it]

For epoch 432: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.00481737351113158, 'test_loss': 0.5873937472701073, 'bleu': 21.0048, 'gen_len': 10.0137}




 96%|█████████▌| 429/449 [2:42:42<07:28, 22.40s/it]

For epoch 433: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.004354820696057797, 'test_loss': 0.5955991953611374, 'bleu': 20.7264, 'gen_len': 9.6096}




 96%|█████████▌| 430/449 [2:43:04<07:03, 22.31s/it]

For epoch 434: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004528949477300957, 'test_loss': 0.5908652245998383, 'bleu': 21.0247, 'gen_len': 9.7945}




 96%|█████████▌| 431/449 [2:43:26<06:38, 22.13s/it]

For epoch 435: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.004604216506023232, 'test_loss': 0.5897318750619889, 'bleu': 20.5226, 'gen_len': 9.6712}




 96%|█████████▌| 432/449 [2:43:47<06:12, 21.91s/it]

For epoch 436: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.004920883354072164, 'test_loss': 0.5913947746157646, 'bleu': 20.4279, 'gen_len': 9.7534}




 96%|█████████▋| 433/449 [2:44:10<05:54, 22.15s/it]

For epoch 437: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.004697866155765951, 'test_loss': 0.5945815041661262, 'bleu': 20.9824, 'gen_len': 9.8493}




 97%|█████████▋| 434/449 [2:44:32<05:32, 22.17s/it]

For epoch 438: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.004738963220450209, 'test_loss': 0.5946228638291359, 'bleu': 19.159, 'gen_len': 9.7671}




 97%|█████████▋| 435/449 [2:44:54<05:10, 22.17s/it]

For epoch 439: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004675512042012997, 'test_loss': 0.5942355677485466, 'bleu': 20.4097, 'gen_len': 9.7877}




 97%|█████████▋| 436/449 [2:45:17<04:48, 22.21s/it]

For epoch 440: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004298699560444575, 'test_loss': 0.5965386278927326, 'bleu': 20.736, 'gen_len': 9.5342}




 97%|█████████▋| 437/449 [2:45:39<04:26, 22.17s/it]

For epoch 441: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.50batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.004519855819369961, 'test_loss': 0.5962611898779869, 'bleu': 21.0982, 'gen_len': 9.6781}




 98%|█████████▊| 438/449 [2:46:02<04:07, 22.51s/it]

For epoch 442: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.004382041680467565, 'test_loss': 0.5931522257626056, 'bleu': 21.1698, 'gen_len': 9.7397}




 98%|█████████▊| 439/449 [2:46:24<03:44, 22.42s/it]

For epoch 443: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.004672421505901872, 'test_loss': 0.5895395427942276, 'bleu': 19.7187, 'gen_len': 9.4247}




 98%|█████████▊| 440/449 [2:46:46<03:21, 22.37s/it]

For epoch 444: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.004657424126024835, 'test_loss': 0.5904975444078445, 'bleu': 20.7935, 'gen_len': 9.5274}




 98%|█████████▊| 441/449 [2:47:09<02:58, 22.31s/it]

For epoch 445: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.0046697426260812434, 'test_loss': 0.596749198436737, 'bleu': 20.574, 'gen_len': 9.2808}




 98%|█████████▊| 442/449 [2:47:30<02:34, 22.12s/it]

For epoch 446: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004781796398186465, 'test_loss': 0.5829227961599827, 'bleu': 20.5528, 'gen_len': 9.5822}




 99%|█████████▊| 443/449 [2:47:52<02:11, 21.88s/it]

For epoch 447: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.0047170501202344894, 'test_loss': 0.5851501889526844, 'bleu': 21.0657, 'gen_len': 9.6986}




 99%|█████████▉| 444/449 [2:48:13<01:48, 21.73s/it]

For epoch 448: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004853975487782098, 'test_loss': 0.5812283173203469, 'bleu': 20.8081, 'gen_len': 9.7397}




 99%|█████████▉| 445/449 [2:48:34<01:26, 21.64s/it]

For epoch 449: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004778152127273199, 'test_loss': 0.5911021284759045, 'bleu': 20.5971, 'gen_len': 9.5822}




 99%|█████████▉| 446/449 [2:48:56<01:04, 21.62s/it]

For epoch 450: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004768759506277558, 'test_loss': 0.5974205732345581, 'bleu': 21.1569, 'gen_len': 9.7603}




100%|█████████▉| 447/449 [2:49:17<00:43, 21.55s/it]

For epoch 451: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004404433270371179, 'test_loss': 0.5954935520887374, 'bleu': 21.0385, 'gen_len': 9.8562}




100%|█████████▉| 448/449 [2:49:39<00:21, 21.55s/it]

For epoch 452: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004804632717334643, 'test_loss': 0.592202077805996, 'bleu': 20.254, 'gen_len': 9.5959}




100%|██████████| 449/449 [2:50:01<00:00, 22.72s/it]
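
The per-epoch `Metrics:` lines above are easier to compare once they are collected into a list. The following is a minimal sketch, not part of the original notebook: it assumes the log text has been saved as a string, and the helper name `parse_training_log` and the `sample_log` variable are hypothetical. It relies only on the two line formats visible in the output (`For epoch N:` and `Metrics: {...}`) and picks the epoch with the highest validation BLEU.

```python
import ast
import re


def parse_training_log(log_text):
    """Return a list of (epoch, metrics_dict) pairs found in the log text."""
    records = []
    epoch = None
    for line in log_text.splitlines():
        line = line.strip()
        match = re.match(r"For epoch (\d+):", line)
        if match:
            # Remember the epoch number until its Metrics line shows up.
            epoch = int(match.group(1))
        elif line.startswith("Metrics:"):
            # The metrics are printed as a Python dict literal.
            metrics = ast.literal_eval(line[len("Metrics:"):].strip())
            records.append((epoch, metrics))
    return records


# Two epochs copied from the output above, used as sample input.
sample_log = """
For epoch 451:
Metrics: {'train_loss': 0.004404433270371179, 'test_loss': 0.5954935520887374, 'bleu': 21.0385, 'gen_len': 9.8562}
For epoch 452:
Metrics: {'train_loss': 0.004804632717334643, 'test_loss': 0.592202077805996, 'bleu': 20.254, 'gen_len': 9.5959}
"""

records = parse_training_log(sample_log)
best_epoch, best_metrics = max(records, key=lambda record: record[1]["bleu"])
print(best_epoch, best_metrics["bleu"])
```

Running the same helper over the full output would make it straightforward to plot BLEU against the epoch number or to check how `test_loss` evolves while `train_loss` stays near zero.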


-------------

### ---

In [14]:
# Train for the remaining epochs; BLEU (to be maximized) is the metric used for the best model.
trainer.train(epochs=config['max_epoch'] - trainer.current_epoch, auto_save=True,
              metric_for_best_model='bleu', metric_objective='maximize', log_step=1,
              saving_directory=config['new_model_dir'])

  0%|          | 0/348 [00:00<?, ?it/s]

For epoch 453: 


Train batch number 41: 100%|██████████| 41/41 [00:12<00:00,  3.21batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.0050092799980874835, 'test_loss': 0.5823351517319679, 'bleu': 21.1344, 'gen_len': 9.5685}




  0%|          | 1/348 [00:20<1:57:07, 20.25s/it]

For epoch 454: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  3.15batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.005001379279192628, 'test_loss': 0.5836841449141502, 'bleu': 20.8984, 'gen_len': 9.5137}




  1%|          | 2/348 [00:39<1:54:52, 19.92s/it]

For epoch 455: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  3.05batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0048570631742023114, 'test_loss': 0.6009350255131721, 'bleu': 20.2304, 'gen_len': 9.5068}




  1%|          | 3/348 [00:59<1:54:24, 19.90s/it]

For epoch 456: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.005045269062265572, 'test_loss': 0.6027449488639831, 'bleu': 20.1088, 'gen_len': 9.7808}




  1%|          | 4/348 [01:21<1:57:45, 20.54s/it]

For epoch 457: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.005142967869732075, 'test_loss': 0.5905604302883148, 'bleu': 21.0916, 'gen_len': 9.8699}




  1%|▏         | 5/348 [01:43<2:00:02, 21.00s/it]

For epoch 458: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004783658259661823, 'test_loss': 0.5974761441349983, 'bleu': 23.347, 'gen_len': 9.589}




  2%|▏         | 6/348 [02:07<2:05:53, 22.09s/it]

For epoch 459: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004691473143061668, 'test_loss': 0.5925040110945702, 'bleu': 23.2767, 'gen_len': 9.5137}




  2%|▏         | 7/348 [02:29<2:04:46, 21.96s/it]

For epoch 460: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004503053204133743, 'test_loss': 0.5935363866388798, 'bleu': 21.2021, 'gen_len': 9.6164}




  2%|▏         | 8/348 [02:51<2:04:53, 22.04s/it]

For epoch 461: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.00479021239685031, 'test_loss': 0.60011255890131, 'bleu': 21.6666, 'gen_len': 9.5274}




  3%|▎         | 9/348 [03:12<2:03:24, 21.84s/it]

For epoch 462: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004512826843959529, 'test_loss': 0.6008066043257714, 'bleu': 21.9898, 'gen_len': 9.6507}




  3%|▎         | 10/348 [03:34<2:02:40, 21.78s/it]

For epoch 463: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.90batches/s]



Metrics: {'train_loss': 0.004431433725829531, 'test_loss': 0.6076663166284562, 'bleu': 21.8442, 'gen_len': 9.589}




  3%|▎         | 11/348 [03:57<2:04:17, 22.13s/it]

For epoch 464: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.49batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.00479432512017921, 'test_loss': 0.6064262799918652, 'bleu': 22.226, 'gen_len': 9.6712}




  3%|▎         | 12/348 [04:20<2:06:35, 22.61s/it]

For epoch 465: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004636793036763443, 'test_loss': 0.6029979132115841, 'bleu': 22.7593, 'gen_len': 9.6849}




  4%|▎         | 13/348 [04:43<2:05:54, 22.55s/it]

For epoch 466: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.004860734413718668, 'test_loss': 0.5998526744544506, 'bleu': 20.6103, 'gen_len': 9.7055}




  4%|▍         | 14/348 [05:05<2:05:07, 22.48s/it]

For epoch 467: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.005079134955161773, 'test_loss': 0.5903721615672112, 'bleu': 22.0781, 'gen_len': 9.7329}




  4%|▍         | 15/348 [05:27<2:04:11, 22.38s/it]

For epoch 468: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004668265887785975, 'test_loss': 0.6028381988406182, 'bleu': 20.2498, 'gen_len': 9.7123}




  5%|▍         | 16/348 [05:49<2:03:31, 22.33s/it]

For epoch 469: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004754616623380926, 'test_loss': 0.5941126190125943, 'bleu': 20.5062, 'gen_len': 9.4589}




  5%|▍         | 17/348 [06:12<2:03:49, 22.45s/it]

For epoch 470: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.0047245661830302415, 'test_loss': 0.5919444367289544, 'bleu': 21.3881, 'gen_len': 9.9795}




  5%|▌         | 18/348 [06:36<2:05:21, 22.79s/it]

For epoch 471: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.005015893498571907, 'test_loss': 0.5925855077803135, 'bleu': 20.8428, 'gen_len': 9.863}




  5%|▌         | 19/348 [06:59<2:05:49, 22.95s/it]

For epoch 472: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.00459784276431381, 'test_loss': 0.6007501922547818, 'bleu': 20.3482, 'gen_len': 9.8493}




  6%|▌         | 20/348 [07:21<2:03:50, 22.66s/it]

For epoch 473: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.00485445283177286, 'test_loss': 0.5940219163894653, 'bleu': 21.3072, 'gen_len': 9.9247}




  6%|▌         | 21/348 [07:43<2:01:34, 22.31s/it]

For epoch 474: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.00537408062359091, 'test_loss': 0.5972029604017735, 'bleu': 21.1651, 'gen_len': 9.5753}




  6%|▋         | 22/348 [08:04<1:59:45, 22.04s/it]

For epoch 475: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.005050340352193793, 'test_loss': 0.6040955796837807, 'bleu': 20.4819, 'gen_len': 9.7466}




  7%|▋         | 23/348 [08:25<1:58:18, 21.84s/it]

For epoch 476: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.005145837589757654, 'test_loss': 0.5963833019137382, 'bleu': 20.1397, 'gen_len': 9.6096}




  7%|▋         | 24/348 [08:47<1:57:12, 21.71s/it]

For epoch 477: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.004813191798947207, 'test_loss': 0.5875281617045403, 'bleu': 19.6327, 'gen_len': 9.8836}




  7%|▋         | 25/348 [09:08<1:55:58, 21.54s/it]

For epoch 478: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004913679664818252, 'test_loss': 0.5860365025699139, 'bleu': 20.3954, 'gen_len': 9.4795}




  7%|▋         | 26/348 [09:30<1:55:57, 21.61s/it]

For epoch 479: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.0046150885229907564, 'test_loss': 0.6024035267531872, 'bleu': 21.9821, 'gen_len': 9.5548}




  8%|▊         | 27/348 [09:51<1:55:38, 21.62s/it]

For epoch 480: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004526041533298246, 'test_loss': 0.5973945707082748, 'bleu': 20.1227, 'gen_len': 9.7397}




  8%|▊         | 28/348 [10:13<1:54:44, 21.51s/it]

For epoch 481: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004829307328114604, 'test_loss': 0.5904575072228908, 'bleu': 20.7219, 'gen_len': 9.7671}




  8%|▊         | 29/348 [10:34<1:54:40, 21.57s/it]

For epoch 482: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004547554502695254, 'test_loss': 0.5906829617917537, 'bleu': 19.8901, 'gen_len': 9.6781}




  9%|▊         | 30/348 [10:56<1:53:55, 21.50s/it]

For epoch 483: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.004697002053101797, 'test_loss': 0.5976182386279106, 'bleu': 20.2704, 'gen_len': 9.6164}




  9%|▉         | 31/348 [11:18<1:54:27, 21.66s/it]

For epoch 484: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004516919319540626, 'test_loss': 0.597817575186491, 'bleu': 21.2413, 'gen_len': 9.7192}




  9%|▉         | 32/348 [11:40<1:54:47, 21.79s/it]

For epoch 485: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004628296129450929, 'test_loss': 0.5922577992081642, 'bleu': 20.8636, 'gen_len': 9.6027}




  9%|▉         | 33/348 [12:01<1:53:34, 21.63s/it]

For epoch 486: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004516585664710075, 'test_loss': 0.5996694289147854, 'bleu': 21.9794, 'gen_len': 9.4863}




 10%|▉         | 34/348 [12:22<1:52:54, 21.58s/it]

For epoch 487: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.004591081331188723, 'test_loss': 0.5977569349110127, 'bleu': 19.267, 'gen_len': 9.5753}




 10%|█         | 35/348 [12:44<1:52:11, 21.51s/it]

For epoch 488: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004779155897090166, 'test_loss': 0.5959369741380215, 'bleu': 21.1191, 'gen_len': 9.411}




 10%|█         | 36/348 [13:05<1:51:13, 21.39s/it]

For epoch 489: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.0046628359307694, 'test_loss': 0.5851788066327572, 'bleu': 19.0221, 'gen_len': 9.5548}




 11%|█         | 37/348 [13:26<1:50:39, 21.35s/it]

For epoch 490: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.00461172596437902, 'test_loss': 0.5833331242203712, 'bleu': 20.0263, 'gen_len': 9.637}




 11%|█         | 38/348 [13:48<1:50:34, 21.40s/it]

For epoch 491: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004515932466857499, 'test_loss': 0.5977690711617469, 'bleu': 21.45, 'gen_len': 9.6438}




 11%|█         | 39/348 [14:09<1:50:11, 21.40s/it]

For epoch 492: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004409730204426479, 'test_loss': 0.5987479530274868, 'bleu': 20.2666, 'gen_len': 9.3082}




 11%|█▏        | 40/348 [14:30<1:49:37, 21.36s/it]

For epoch 493: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.004560821995255537, 'test_loss': 0.6038434751331806, 'bleu': 19.2202, 'gen_len': 9.637}




 12%|█▏        | 41/348 [14:52<1:49:59, 21.50s/it]

For epoch 494: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004621691672449432, 'test_loss': 0.5985959336161614, 'bleu': 20.8512, 'gen_len': 9.6507}




 12%|█▏        | 42/348 [15:14<1:49:22, 21.45s/it]

For epoch 495: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004490995580875655, 'test_loss': 0.59308265671134, 'bleu': 20.1306, 'gen_len': 9.774}




 12%|█▏        | 43/348 [15:35<1:49:03, 21.45s/it]

For epoch 496: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004236798493543685, 'test_loss': 0.5987064652144909, 'bleu': 20.2729, 'gen_len': 9.589}




 13%|█▎        | 44/348 [15:57<1:49:15, 21.56s/it]

For epoch 497: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004537638847534432, 'test_loss': 0.6007590666413307, 'bleu': 19.2801, 'gen_len': 9.6438}




 13%|█▎        | 45/348 [16:18<1:48:33, 21.50s/it]

For epoch 498: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004814630274933468, 'test_loss': 0.5887808635830879, 'bleu': 19.0113, 'gen_len': 9.7055}




 13%|█▎        | 46/348 [16:40<1:48:16, 21.51s/it]

For epoch 499: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004608058265592085, 'test_loss': 0.5946430943906307, 'bleu': 19.2472, 'gen_len': 9.6575}




 14%|█▎        | 47/348 [17:01<1:47:48, 21.49s/it]

For epoch 500: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004622618605314595, 'test_loss': 0.5864273920655251, 'bleu': 20.8015, 'gen_len': 9.5616}




 14%|█▍        | 48/348 [17:23<1:47:17, 21.46s/it]

For epoch 501: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.004652558062112004, 'test_loss': 0.5865325048565865, 'bleu': 20.6105, 'gen_len': 9.5342}




 14%|█▍        | 49/348 [17:44<1:46:47, 21.43s/it]

For epoch 502: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.0048196888784310075, 'test_loss': 0.5864301897585392, 'bleu': 21.0137, 'gen_len': 9.2603}




 14%|█▍        | 50/348 [18:06<1:46:55, 21.53s/it]

For epoch 503: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.00439723346257446, 'test_loss': 0.5951973028481007, 'bleu': 20.2578, 'gen_len': 9.6233}




 15%|█▍        | 51/348 [18:27<1:46:48, 21.58s/it]

For epoch 504: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004722181596884095, 'test_loss': 0.5944868206977845, 'bleu': 20.2439, 'gen_len': 9.5753}




 15%|█▍        | 52/348 [18:49<1:46:08, 21.52s/it]

For epoch 505: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0052013730562133036, 'test_loss': 0.5886004880070687, 'bleu': 20.3903, 'gen_len': 9.5822}




 15%|█▌        | 53/348 [19:10<1:45:58, 21.55s/it]

For epoch 506: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.005183243045083634, 'test_loss': 0.5893815524876118, 'bleu': 21.4651, 'gen_len': 9.4726}




 16%|█▌        | 54/348 [19:32<1:45:36, 21.55s/it]

For epoch 507: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004590948910179843, 'test_loss': 0.5907298043370247, 'bleu': 20.1417, 'gen_len': 9.7603}




 16%|█▌        | 55/348 [19:54<1:45:46, 21.66s/it]

For epoch 508: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004856047714583394, 'test_loss': 0.5881545767188072, 'bleu': 20.4772, 'gen_len': 9.6301}




 16%|█▌        | 56/348 [20:15<1:45:11, 21.61s/it]

For epoch 509: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004358629162264306, 'test_loss': 0.5954951204359531, 'bleu': 19.9985, 'gen_len': 9.6301}




 16%|█▋        | 57/348 [20:37<1:44:21, 21.52s/it]

For epoch 510: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.004800371081772737, 'test_loss': 0.5968059718608856, 'bleu': 20.0714, 'gen_len': 9.2055}




 17%|█▋        | 58/348 [20:58<1:43:45, 21.47s/it]

For epoch 511: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.004874208203812198, 'test_loss': 0.5948229253292083, 'bleu': 21.4177, 'gen_len': 9.226}




 17%|█▋        | 59/348 [21:21<1:45:35, 21.92s/it]

For epoch 512: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0048096556807073155, 'test_loss': 0.5987939946353436, 'bleu': 19.593, 'gen_len': 9.3288}




 17%|█▋        | 60/348 [21:42<1:44:40, 21.81s/it]

For epoch 513: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004674147962160953, 'test_loss': 0.5932383954524993, 'bleu': 19.3929, 'gen_len': 9.5}




 18%|█▊        | 61/348 [22:04<1:44:07, 21.77s/it]

For epoch 514: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.004629175945344131, 'test_loss': 0.5902465969324112, 'bleu': 20.2998, 'gen_len': 9.7603}




 18%|█▊        | 62/348 [22:26<1:44:24, 21.91s/it]

For epoch 515: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004690131916460104, 'test_loss': 0.5976621806621552, 'bleu': 19.7206, 'gen_len': 9.411}




 18%|█▊        | 63/348 [22:48<1:43:08, 21.72s/it]

For epoch 516: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004841615565166604, 'test_loss': 0.6003953941166401, 'bleu': 20.2427, 'gen_len': 9.3973}




 18%|█▊        | 64/348 [23:09<1:41:58, 21.54s/it]

For epoch 517: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.0044195167351196085, 'test_loss': 0.5981863856315612, 'bleu': 19.1559, 'gen_len': 9.411}




 19%|█▊        | 65/348 [23:30<1:41:30, 21.52s/it]

For epoch 518: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.004823493987049271, 'test_loss': 0.5970411956310272, 'bleu': 20.0911, 'gen_len': 9.4795}




 19%|█▉        | 66/348 [23:51<1:40:24, 21.36s/it]

For epoch 519: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.004632791177733097, 'test_loss': 0.6007360607385636, 'bleu': 20.0886, 'gen_len': 9.4041}




 19%|█▉        | 67/348 [24:12<1:39:48, 21.31s/it]

For epoch 520: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004401918077591534, 'test_loss': 0.6066162526607514, 'bleu': 19.7921, 'gen_len': 9.2603}




 20%|█▉        | 68/348 [24:34<1:40:16, 21.49s/it]

For epoch 521: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.004381071671093927, 'test_loss': 0.6099967949092389, 'bleu': 20.6599, 'gen_len': 9.3425}




 20%|█▉        | 69/348 [24:56<1:40:31, 21.62s/it]

For epoch 522: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.78batches/s]



Metrics: {'train_loss': 0.004643410200071408, 'test_loss': 0.5990396268665791, 'bleu': 20.3735, 'gen_len': 9.7945}




 20%|██        | 70/348 [25:18<1:39:35, 21.50s/it]

For epoch 523: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.00486465449240513, 'test_loss': 0.6023434683680534, 'bleu': 19.0514, 'gen_len': 9.6644}




 20%|██        | 71/348 [25:39<1:39:05, 21.46s/it]

For epoch 524: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004439629564663713, 'test_loss': 0.6031367778778076, 'bleu': 19.2299, 'gen_len': 9.4932}




 21%|██        | 72/348 [26:00<1:38:33, 21.42s/it]

For epoch 525: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004555440820162979, 'test_loss': 0.6065262407064438, 'bleu': 19.5622, 'gen_len': 9.8151}




 21%|██        | 73/348 [26:22<1:38:33, 21.50s/it]

For epoch 526: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.004425316313622383, 'test_loss': 0.6102984808385372, 'bleu': 19.3284, 'gen_len': 9.5}




 21%|██▏       | 74/348 [26:43<1:38:10, 21.50s/it]

For epoch 527: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.004708974718729534, 'test_loss': 0.5981472320854664, 'bleu': 19.3423, 'gen_len': 9.4726}




 22%|██▏       | 75/348 [27:05<1:37:53, 21.51s/it]

For epoch 528: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.004790969508154909, 'test_loss': 0.6038782216608525, 'bleu': 19.5876, 'gen_len': 9.7603}




 22%|██▏       | 76/348 [27:27<1:37:38, 21.54s/it]

For epoch 529: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004367933705131092, 'test_loss': 0.6020604491233825, 'bleu': 18.9291, 'gen_len': 9.7466}




 22%|██▏       | 77/348 [27:48<1:36:43, 21.41s/it]

For epoch 530: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.0043565125499920145, 'test_loss': 0.6095385737717152, 'bleu': 19.3424, 'gen_len': 9.5411}




 22%|██▏       | 78/348 [28:09<1:36:41, 21.49s/it]

For epoch 531: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004553294240883211, 'test_loss': 0.6061471864581108, 'bleu': 18.813, 'gen_len': 9.9521}




 23%|██▎       | 79/348 [28:31<1:36:06, 21.44s/it]

For epoch 532: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004725551990814871, 'test_loss': 0.5949890121817589, 'bleu': 19.5807, 'gen_len': 9.774}




 23%|██▎       | 80/348 [28:52<1:35:34, 21.40s/it]

For epoch 533: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.004625345520652467, 'test_loss': 0.5920843005180358, 'bleu': 21.378, 'gen_len': 10.1096}




 23%|██▎       | 81/348 [29:13<1:34:56, 21.34s/it]

For epoch 534: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.0048334305799316345, 'test_loss': 0.6024030193686485, 'bleu': 19.8227, 'gen_len': 9.7945}




 24%|██▎       | 82/348 [29:35<1:34:37, 21.34s/it]

For epoch 535: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004787288052484211, 'test_loss': 0.5990680307149887, 'bleu': 19.4964, 'gen_len': 9.5822}




 24%|██▍       | 83/348 [29:56<1:34:07, 21.31s/it]

For epoch 536: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004565645875295669, 'test_loss': 0.600919434428215, 'bleu': 19.4546, 'gen_len': 9.9041}




 24%|██▍       | 84/348 [30:17<1:33:58, 21.36s/it]

For epoch 537: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004558768652936035, 'test_loss': 0.6115125767886639, 'bleu': 18.8937, 'gen_len': 9.5205}




 24%|██▍       | 85/348 [30:39<1:34:10, 21.48s/it]

For epoch 538: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004689254489920397, 'test_loss': 0.6084436036646366, 'bleu': 20.0363, 'gen_len': 9.9795}




 25%|██▍       | 86/348 [31:00<1:33:44, 21.47s/it]

For epoch 539: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004318826897183388, 'test_loss': 0.6074276804924011, 'bleu': 19.7046, 'gen_len': 9.7671}




 25%|██▌       | 87/348 [31:23<1:34:19, 21.69s/it]

For epoch 540: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004527978337846878, 'test_loss': 0.6203071884810925, 'bleu': 20.9142, 'gen_len': 9.637}




 25%|██▌       | 88/348 [31:44<1:33:35, 21.60s/it]

For epoch 541: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004736831323697981, 'test_loss': 0.6146141327917576, 'bleu': 20.4612, 'gen_len': 9.9247}




 26%|██▌       | 89/348 [32:05<1:32:57, 21.53s/it]

For epoch 542: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004748649514711848, 'test_loss': 0.5987221479415894, 'bleu': 18.7584, 'gen_len': 10.0479}




 26%|██▌       | 90/348 [32:27<1:32:37, 21.54s/it]

For epoch 543: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004733991471300946, 'test_loss': 0.5987269431352615, 'bleu': 20.5763, 'gen_len': 10.1233}




 26%|██▌       | 91/348 [32:48<1:32:11, 21.52s/it]

For epoch 544: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004639501200716306, 'test_loss': 0.6048544481396675, 'bleu': 19.0756, 'gen_len': 10.0137}




 26%|██▋       | 92/348 [33:10<1:31:42, 21.49s/it]

For epoch 545: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.005148333020326568, 'test_loss': 0.6048086136579514, 'bleu': 19.2568, 'gen_len': 10.0342}




 27%|██▋       | 93/348 [33:32<1:31:49, 21.61s/it]

For epoch 546: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004984414602470834, 'test_loss': 0.6047069817781449, 'bleu': 19.5691, 'gen_len': 9.8219}




 27%|██▋       | 94/348 [33:53<1:30:57, 21.49s/it]

For epoch 547: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.0044119826165913804, 'test_loss': 0.5925324559211731, 'bleu': 20.3053, 'gen_len': 9.911}




 27%|██▋       | 95/348 [34:14<1:30:27, 21.45s/it]

For epoch 548: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.0044448799624039635, 'test_loss': 0.6136846274137497, 'bleu': 19.9053, 'gen_len': 9.7945}




 28%|██▊       | 96/348 [34:36<1:29:54, 21.41s/it]

For epoch 549: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004633550577592559, 'test_loss': 0.6041219443082809, 'bleu': 17.4159, 'gen_len': 9.6301}




 28%|██▊       | 97/348 [34:57<1:29:42, 21.45s/it]

For epoch 550: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.00459867475054613, 'test_loss': 0.6182482793927193, 'bleu': 18.8505, 'gen_len': 9.5274}




 28%|██▊       | 98/348 [35:18<1:28:56, 21.35s/it]

For epoch 551: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004372985049707406, 'test_loss': 0.6152099132537842, 'bleu': 18.8948, 'gen_len': 9.6644}




 28%|██▊       | 99/348 [35:40<1:28:56, 21.43s/it]

For epoch 552: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004469592419521111, 'test_loss': 0.6088627368211746, 'bleu': 18.2631, 'gen_len': 9.7055}




 29%|██▊       | 100/348 [36:01<1:28:31, 21.42s/it]

For epoch 553: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004536149160164159, 'test_loss': 0.6036221504211425, 'bleu': 19.2046, 'gen_len': 9.8493}




 29%|██▉       | 101/348 [36:24<1:29:36, 21.77s/it]

For epoch 554: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004519404701479688, 'test_loss': 0.608167727291584, 'bleu': 21.0764, 'gen_len': 9.7466}




 29%|██▉       | 102/348 [36:46<1:30:15, 22.02s/it]

For epoch 555: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.00449023710949937, 'test_loss': 0.6029855474829674, 'bleu': 19.3338, 'gen_len': 9.9247}




 30%|██▉       | 103/348 [37:09<1:30:19, 22.12s/it]

For epoch 556: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004526792721618421, 'test_loss': 0.621773587167263, 'bleu': 18.3166, 'gen_len': 9.8562}




 30%|██▉       | 104/348 [37:31<1:29:51, 22.10s/it]

For epoch 557: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.004498375639342135, 'test_loss': 0.6150906413793564, 'bleu': 20.2798, 'gen_len': 9.5274}




 30%|███       | 105/348 [37:53<1:30:03, 22.24s/it]

For epoch 558: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004227558738642894, 'test_loss': 0.608940739929676, 'bleu': 19.3835, 'gen_len': 9.7123}




 30%|███       | 106/348 [38:16<1:30:03, 22.33s/it]

For epoch 559: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.004318117490606155, 'test_loss': 0.6062327764928341, 'bleu': 19.2531, 'gen_len': 9.5616}




 31%|███       | 107/348 [38:38<1:29:23, 22.25s/it]

For epoch 560: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.004409046048616491, 'test_loss': 0.6075238332152366, 'bleu': 19.7174, 'gen_len': 9.7123}




 31%|███       | 108/348 [39:01<1:29:25, 22.36s/it]

For epoch 561: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004573550617617623, 'test_loss': 0.6013362459838391, 'bleu': 17.7321, 'gen_len': 9.5068}




 31%|███▏      | 109/348 [39:22<1:28:02, 22.10s/it]

For epoch 562: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.0044863456682722284, 'test_loss': 0.6042807161808014, 'bleu': 19.754, 'gen_len': 9.3082}




 32%|███▏      | 110/348 [39:44<1:26:56, 21.92s/it]

For epoch 563: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.004610677899951797, 'test_loss': 0.6070572093129158, 'bleu': 20.7753, 'gen_len': 9.5548}




 32%|███▏      | 111/348 [40:05<1:25:45, 21.71s/it]

For epoch 564: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.00422755481459473, 'test_loss': 0.6146357722580433, 'bleu': 19.2491, 'gen_len': 9.6918}




 32%|███▏      | 112/348 [40:26<1:25:11, 21.66s/it]

For epoch 565: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.00433358623612127, 'test_loss': 0.6071761265397072, 'bleu': 18.577, 'gen_len': 9.7397}




 32%|███▏      | 113/348 [40:48<1:24:26, 21.56s/it]

For epoch 566: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.01batches/s]



Metrics: {'train_loss': 0.0042705024882197015, 'test_loss': 0.6069729797542095, 'bleu': 18.7596, 'gen_len': 9.9384}




 33%|███▎      | 114/348 [41:11<1:26:11, 22.10s/it]

For epoch 567: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004318207278052663, 'test_loss': 0.6045818783342838, 'bleu': 18.5404, 'gen_len': 9.8151}




 33%|███▎      | 115/348 [41:33<1:25:28, 22.01s/it]

For epoch 568: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004173585980358284, 'test_loss': 0.6031965754926205, 'bleu': 18.7499, 'gen_len': 9.8151}




 33%|███▎      | 116/348 [41:54<1:24:28, 21.84s/it]

For epoch 569: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004516500123583417, 'test_loss': 0.5945652686059475, 'bleu': 19.4044, 'gen_len': 9.8151}




 34%|███▎      | 117/348 [42:16<1:24:27, 21.94s/it]

For epoch 570: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004447201468482069, 'test_loss': 0.5926573261618614, 'bleu': 18.8006, 'gen_len': 9.8219}




 34%|███▍      | 118/348 [42:38<1:23:51, 21.88s/it]

For epoch 571: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.0043059061536928865, 'test_loss': 0.5963971495628357, 'bleu': 19.8498, 'gen_len': 9.589}




 34%|███▍      | 119/348 [43:01<1:24:08, 22.05s/it]

For epoch 572: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.004395006613510593, 'test_loss': 0.6088782012462616, 'bleu': 18.8775, 'gen_len': 9.6301}




 34%|███▍      | 120/348 [43:23<1:24:15, 22.17s/it]

For epoch 573: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.00433454303093618, 'test_loss': 0.6029337301850319, 'bleu': 20.008, 'gen_len': 9.9795}




 35%|███▍      | 121/348 [43:46<1:24:15, 22.27s/it]

For epoch 574: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004316130513893213, 'test_loss': 0.61248320043087, 'bleu': 18.4728, 'gen_len': 9.7603}




 35%|███▌      | 122/348 [44:08<1:23:46, 22.24s/it]

For epoch 575: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.004551530971827849, 'test_loss': 0.6106909722089767, 'bleu': 19.2733, 'gen_len': 9.8082}




 35%|███▌      | 123/348 [44:30<1:23:32, 22.28s/it]

For epoch 576: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.15batches/s]



Metrics: {'train_loss': 0.004401060071115087, 'test_loss': 0.5947997033596039, 'bleu': 19.6026, 'gen_len': 9.6233}




 36%|███▌      | 124/348 [44:53<1:24:13, 22.56s/it]

For epoch 577: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004589367425069213, 'test_loss': 0.5941929206252098, 'bleu': 19.0375, 'gen_len': 9.8493}




 36%|███▌      | 125/348 [45:17<1:25:23, 22.97s/it]

For epoch 578: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.004572569841246416, 'test_loss': 0.5957402765750885, 'bleu': 19.373, 'gen_len': 9.7534}




 36%|███▌      | 126/348 [45:40<1:24:54, 22.95s/it]

For epoch 579: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004615341182570995, 'test_loss': 0.6058180660009385, 'bleu': 19.2725, 'gen_len': 9.7603}




 36%|███▋      | 127/348 [46:01<1:22:37, 22.43s/it]

For epoch 580: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.70batches/s]



Metrics: {'train_loss': 0.004185802172642292, 'test_loss': 0.6049230083823204, 'bleu': 20.5712, 'gen_len': 9.6438}




 37%|███▋      | 128/348 [46:23<1:20:57, 22.08s/it]

For epoch 581: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.79batches/s]



Metrics: {'train_loss': 0.004439999556691363, 'test_loss': 0.6045995697379112, 'bleu': 20.8673, 'gen_len': 9.8288}




 37%|███▋      | 129/348 [46:44<1:19:53, 21.89s/it]

For epoch 582: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004822831666210621, 'test_loss': 0.603621420264244, 'bleu': 20.2745, 'gen_len': 9.8151}




 37%|███▋      | 130/348 [47:06<1:19:15, 21.81s/it]

For epoch 583: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.0042539234954591205, 'test_loss': 0.6034244775772095, 'bleu': 18.9748, 'gen_len': 9.7123}




 38%|███▊      | 131/348 [47:28<1:19:31, 21.99s/it]

For epoch 584: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004837616917495503, 'test_loss': 0.6046863421797752, 'bleu': 21.0387, 'gen_len': 9.637}




 38%|███▊      | 132/348 [47:51<1:19:32, 22.09s/it]

For epoch 585: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.54batches/s]



Metrics: {'train_loss': 0.004624421359039843, 'test_loss': 0.6042549103498459, 'bleu': 20.1256, 'gen_len': 9.8699}




 38%|███▊      | 133/348 [48:13<1:19:14, 22.11s/it]

For epoch 586: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.004413804820193569, 'test_loss': 0.6068082302808762, 'bleu': 19.4884, 'gen_len': 9.6438}




 39%|███▊      | 134/348 [48:35<1:19:19, 22.24s/it]

For epoch 587: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.004444795927577992, 'test_loss': 0.6076475635170937, 'bleu': 21.1229, 'gen_len': 9.8425}




 39%|███▉      | 135/348 [48:58<1:19:02, 22.27s/it]

For epoch 588: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.004522582760262416, 'test_loss': 0.6097388729453087, 'bleu': 20.6801, 'gen_len': 10.089}




 39%|███▉      | 136/348 [49:20<1:18:48, 22.30s/it]

For epoch 589: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.004526058541292824, 'test_loss': 0.6067160308361054, 'bleu': 20.7663, 'gen_len': 9.7534}




 39%|███▉      | 137/348 [49:43<1:18:46, 22.40s/it]

For epoch 590: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.004322686322067478, 'test_loss': 0.6090325295925141, 'bleu': 19.1819, 'gen_len': 9.7123}




 40%|███▉      | 138/348 [50:05<1:18:49, 22.52s/it]

For epoch 591: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.79batches/s]



Metrics: {'train_loss': 0.004258074317831637, 'test_loss': 0.6079587295651436, 'bleu': 21.0067, 'gen_len': 9.8151}




 40%|███▉      | 139/348 [50:26<1:16:57, 22.09s/it]

For epoch 592: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.004309898926649333, 'test_loss': 0.6151361718773842, 'bleu': 21.1238, 'gen_len': 9.7945}




 40%|████      | 140/348 [50:48<1:15:30, 21.78s/it]

For epoch 593: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.13batches/s]



Metrics: {'train_loss': 0.004379639180558847, 'test_loss': 0.622977951169014, 'bleu': 18.747, 'gen_len': 9.7055}




 41%|████      | 141/348 [51:10<1:15:49, 21.98s/it]

For epoch 594: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.0042227988213686864, 'test_loss': 0.6219908252358437, 'bleu': 19.4872, 'gen_len': 9.726}




 41%|████      | 142/348 [51:32<1:15:37, 22.02s/it]

For epoch 595: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.0042692366043366944, 'test_loss': 0.6237408220767975, 'bleu': 20.7082, 'gen_len': 9.7534}




 41%|████      | 143/348 [51:54<1:15:05, 21.98s/it]

For epoch 596: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.00457486988999313, 'test_loss': 0.6116864576935768, 'bleu': 19.4154, 'gen_len': 9.8836}




 41%|████▏     | 144/348 [52:15<1:14:00, 21.77s/it]

For epoch 597: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.0042518553289970975, 'test_loss': 0.6070823073387146, 'bleu': 19.6769, 'gen_len': 9.9041}




 42%|████▏     | 145/348 [52:37<1:13:19, 21.67s/it]

For epoch 598: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004402594142536655, 'test_loss': 0.5995219051837921, 'bleu': 18.5599, 'gen_len': 9.7877}




 42%|████▏     | 146/348 [52:58<1:12:41, 21.59s/it]

For epoch 599: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004638098944092124, 'test_loss': 0.5988376021385193, 'bleu': 20.5054, 'gen_len': 9.7329}




 42%|████▏     | 147/348 [53:20<1:12:16, 21.57s/it]

For epoch 600: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004275852081752042, 'test_loss': 0.5974430590867996, 'bleu': 20.7934, 'gen_len': 9.9795}




 43%|████▎     | 148/348 [53:41<1:11:43, 21.52s/it]

For epoch 601: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.0040906062406465046, 'test_loss': 0.6010357305407524, 'bleu': 20.8168, 'gen_len': 9.7603}




 43%|████▎     | 149/348 [54:02<1:10:54, 21.38s/it]

For epoch 602: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.004292329040742139, 'test_loss': 0.604044409096241, 'bleu': 20.5, 'gen_len': 10.1027}




 43%|████▎     | 150/348 [54:24<1:10:52, 21.48s/it]

For epoch 603: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004262212767526962, 'test_loss': 0.6039452955126763, 'bleu': 19.7956, 'gen_len': 10.1096}




 43%|████▎     | 151/348 [54:46<1:10:47, 21.56s/it]

For epoch 604: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004807072103863991, 'test_loss': 0.6058111593127251, 'bleu': 19.865, 'gen_len': 9.4315}




 44%|████▎     | 152/348 [55:07<1:10:36, 21.62s/it]

For epoch 605: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004323754493692299, 'test_loss': 0.6111486822366714, 'bleu': 18.5942, 'gen_len': 9.6233}




 44%|████▍     | 153/348 [55:30<1:10:53, 21.81s/it]

For epoch 606: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.0043557437261702815, 'test_loss': 0.6068095400929451, 'bleu': 19.3013, 'gen_len': 9.6027}




 44%|████▍     | 154/348 [55:51<1:10:06, 21.68s/it]

For epoch 607: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004596016420869202, 'test_loss': 0.5931544721126556, 'bleu': 18.8131, 'gen_len': 9.7534}




 45%|████▍     | 155/348 [56:12<1:09:04, 21.47s/it]

For epoch 608: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004782185052158084, 'test_loss': 0.5897821500897408, 'bleu': 19.87, 'gen_len': 10.1164}




 45%|████▍     | 156/348 [56:33<1:08:25, 21.38s/it]

For epoch 609: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.004549195410729181, 'test_loss': 0.5974569499492646, 'bleu': 20.3736, 'gen_len': 9.7877}




 45%|████▌     | 157/348 [56:54<1:07:48, 21.30s/it]

For epoch 610: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004543950025946266, 'test_loss': 0.595411503314972, 'bleu': 20.2269, 'gen_len': 9.7397}




 45%|████▌     | 158/348 [57:15<1:07:27, 21.30s/it]

For epoch 611: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.0046757126607500565, 'test_loss': 0.5965435981750489, 'bleu': 19.6202, 'gen_len': 9.8699}




 46%|████▌     | 159/348 [57:37<1:07:21, 21.38s/it]

For epoch 612: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004215111033195947, 'test_loss': 0.5988623052835464, 'bleu': 21.0624, 'gen_len': 9.8836}




 46%|████▌     | 160/348 [57:58<1:06:49, 21.33s/it]

For epoch 613: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004648333531804383, 'test_loss': 0.601151017844677, 'bleu': 20.5711, 'gen_len': 9.8425}




 46%|████▋     | 161/348 [58:20<1:06:23, 21.30s/it]

For epoch 614: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.004488481810634456, 'test_loss': 0.5938404530286789, 'bleu': 22.5422, 'gen_len': 9.8493}




 47%|████▋     | 162/348 [58:41<1:06:29, 21.45s/it]

For epoch 615: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004581670063865803, 'test_loss': 0.6001193657517433, 'bleu': 20.429, 'gen_len': 9.863}




 47%|████▋     | 163/348 [59:03<1:05:58, 21.40s/it]

For epoch 616: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004621686601284437, 'test_loss': 0.5959686495363712, 'bleu': 20.4346, 'gen_len': 9.8562}




 47%|████▋     | 164/348 [59:24<1:05:16, 21.28s/it]

For epoch 617: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004464920564759068, 'test_loss': 0.5986361235380173, 'bleu': 20.8382, 'gen_len': 9.8356}




 47%|████▋     | 165/348 [59:45<1:04:56, 21.29s/it]

For epoch 618: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004552064463496208, 'test_loss': 0.59926827698946, 'bleu': 21.7875, 'gen_len': 9.9315}




 48%|████▊     | 166/348 [1:00:06<1:04:40, 21.32s/it]

For epoch 619: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.00446354933707725, 'test_loss': 0.6019207671284675, 'bleu': 20.8552, 'gen_len': 10.1233}




 48%|████▊     | 167/348 [1:00:27<1:04:07, 21.26s/it]

For epoch 620: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004290384305754631, 'test_loss': 0.5929939188063145, 'bleu': 20.1542, 'gen_len': 9.8767}




 48%|████▊     | 168/348 [1:00:49<1:03:43, 21.24s/it]

For epoch 621: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.0043023478001265264, 'test_loss': 0.6001567952334881, 'bleu': 18.9053, 'gen_len': 10.2329}




 49%|████▊     | 169/348 [1:01:10<1:03:42, 21.35s/it]

For epoch 622: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004609672880799669, 'test_loss': 0.6001211866736412, 'bleu': 20.2438, 'gen_len': 10.0411}




 49%|████▉     | 170/348 [1:01:32<1:03:34, 21.43s/it]

For epoch 623: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004773928139290613, 'test_loss': 0.6058076441287994, 'bleu': 19.6718, 'gen_len': 10.0274}




 49%|████▉     | 171/348 [1:01:53<1:02:59, 21.36s/it]

For epoch 624: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004427143929117337, 'test_loss': 0.5993852689862251, 'bleu': 20.9844, 'gen_len': 9.9932}




 49%|████▉     | 172/348 [1:02:14<1:02:42, 21.38s/it]

For epoch 625: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004411320562656151, 'test_loss': 0.5976863905787468, 'bleu': 20.129, 'gen_len': 9.6849}




 50%|████▉     | 173/348 [1:02:36<1:02:18, 21.36s/it]

For epoch 626: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.004719800136915249, 'test_loss': 0.5955876663327218, 'bleu': 20.2813, 'gen_len': 9.5616}




 50%|█████     | 174/348 [1:02:57<1:01:28, 21.20s/it]

For epoch 627: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.26batches/s]



Metrics: {'train_loss': 0.004400269463431181, 'test_loss': 0.5927586391568184, 'bleu': 21.0661, 'gen_len': 9.8151}




 50%|█████     | 175/348 [1:03:19<1:02:17, 21.60s/it]

For epoch 628: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.11batches/s]



Metrics: {'train_loss': 0.004450879599217598, 'test_loss': 0.5919282503426075, 'bleu': 20.0396, 'gen_len': 9.8219}




 51%|█████     | 176/348 [1:03:42<1:03:24, 22.12s/it]

For epoch 629: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004230009408940266, 'test_loss': 0.5907405517995358, 'bleu': 20.6654, 'gen_len': 9.7877}




 51%|█████     | 177/348 [1:04:06<1:04:16, 22.55s/it]

For epoch 630: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004344948982002168, 'test_loss': 0.5993749015033245, 'bleu': 20.5259, 'gen_len': 9.6438}




 51%|█████     | 178/348 [1:04:29<1:04:14, 22.67s/it]

For epoch 631: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.004229428564629904, 'test_loss': 0.6008913137018681, 'bleu': 20.8373, 'gen_len': 9.5685}




 51%|█████▏    | 179/348 [1:04:51<1:03:19, 22.48s/it]

For epoch 632: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.004222376030733491, 'test_loss': 0.6015557363629341, 'bleu': 21.2118, 'gen_len': 9.911}




 52%|█████▏    | 180/348 [1:05:13<1:02:48, 22.43s/it]

For epoch 633: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004355178893225767, 'test_loss': 0.60131900832057, 'bleu': 21.4546, 'gen_len': 9.6438}




 52%|█████▏    | 181/348 [1:05:36<1:02:23, 22.41s/it]

For epoch 634: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.004670704746187278, 'test_loss': 0.595467283576727, 'bleu': 22.6206, 'gen_len': 9.8014}




 52%|█████▏    | 182/348 [1:05:58<1:01:47, 22.33s/it]

For epoch 635: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.004237879056301786, 'test_loss': 0.5887102752923965, 'bleu': 21.7488, 'gen_len': 9.8014}




 53%|█████▎    | 183/348 [1:06:20<1:01:06, 22.22s/it]

For epoch 636: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004346536427719246, 'test_loss': 0.588556157052517, 'bleu': 20.8898, 'gen_len': 9.774}




 53%|█████▎    | 184/348 [1:06:42<1:01:02, 22.33s/it]

For epoch 637: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.004328995145198594, 'test_loss': 0.5940995424985885, 'bleu': 21.8942, 'gen_len': 9.4726}




 53%|█████▎    | 185/348 [1:07:05<1:00:39, 22.33s/it]

For epoch 638: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004535195142280583, 'test_loss': 0.5864237017929554, 'bleu': 21.8131, 'gen_len': 9.9247}




 53%|█████▎    | 186/348 [1:07:27<1:00:02, 22.24s/it]

For epoch 639: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.78batches/s]



Metrics: {'train_loss': 0.004408955240406368, 'test_loss': 0.5903363958001137, 'bleu': 22.5391, 'gen_len': 9.774}




 54%|█████▎    | 187/348 [1:07:48<58:58, 21.98s/it]  

For epoch 640: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004319009124074222, 'test_loss': 0.5960529237985611, 'bleu': 21.0921, 'gen_len': 9.6918}




 54%|█████▍    | 188/348 [1:08:10<58:12, 21.83s/it]

For epoch 641: 


Train batch number 41: 100%|██████████| 41/41 [00:46<00:00,  1.13s/batches]
Test batch number 10: 100%|██████████| 10/10 [00:23<00:00,  2.33s/batches]
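
The per-epoch output above is easier to compare once the `Metrics:` lines are collected into a list. The helper below is a minimal sketch and is not part of `wolof_translate`. The names `parse_training_log` and `best_epoch` are hypothetical, and it assumes the log keeps the `For epoch N:` / `Metrics: {...}` layout shown in this notebook. An epoch header with no `Metrics:` line after it (as with epoch 641 above) is skipped.

```python
import ast
import re

# Patterns matching the two kinds of log lines we care about (assumed layout).
EPOCH_RE = re.compile(r"^For epoch (\d+):")
METRICS_RE = re.compile(r"^Metrics:\s*(\{.*\})\s*$")


def parse_training_log(text):
    """Return a list of (epoch, metrics_dict) pairs found in the log text."""
    records = []
    epoch = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        epoch_match = EPOCH_RE.match(line)
        if epoch_match:
            # Remember the most recent epoch header; a later header overrides it.
            epoch = int(epoch_match.group(1))
            continue
        metrics_match = METRICS_RE.match(line)
        if metrics_match:
            # The metrics are printed as a Python dict literal.
            records.append((epoch, ast.literal_eval(metrics_match.group(1))))
            epoch = None
    return records


def best_epoch(records, metric="bleu", maximize=True):
    """Return the (epoch, metrics) pair with the best value of `metric`."""
    pick = max if maximize else min
    return pick(records, key=lambda record: record[1][metric])


# Two entries copied from the log in this notebook, used as sample input.
sample_log = """
For epoch 688: 
Metrics: {'train_loss': 0.004176556922086492, 'test_loss': 0.5951249569654464, 'bleu': 22.62, 'gen_len': 9.8082}
For epoch 689: 
Metrics: {'train_loss': 0.004339729546478427, 'test_loss': 0.6026889711618424, 'bleu': 23.7095, 'gen_len': 9.8562}
"""

records = parse_training_log(sample_log)
print(best_epoch(records))                                     # highest BLEU
print(best_epoch(records, metric="test_loss", maximize=False))  # lowest test loss
```

To scan a whole run, pass the full captured cell output as `text`. Ranking by `bleu` and by `test_loss` may select different epochs, as they do in the two-line sample.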

### ---

In [8]:
# continue training for the remaining epochs, tracking BLEU (to maximize) for the best model
trainer.train(epochs=config['max_epoch'] - trainer.current_epoch, auto_save=True,
              metric_for_best_model='bleu', metric_objective='maximize', log_step=1,
              saving_directory=config['new_model_dir'])

  0%|          | 0/359 [00:00<?, ?it/s]

For epoch 642: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.30batches/s]



Metrics: {'train_loss': 0.005135000158078605, 'test_loss': 0.5896475620567798, 'bleu': 21.9956, 'gen_len': 9.4521}




  0%|          | 1/359 [00:22<2:11:28, 22.03s/it]

For epoch 643: 


Train batch number 41: 100%|██████████| 41/41 [00:12<00:00,  3.35batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.98batches/s]



Metrics: {'train_loss': 0.004544671701562659, 'test_loss': 0.5867383792996407, 'bleu': 21.0989, 'gen_len': 9.774}




  1%|          | 2/359 [00:40<1:57:25, 19.73s/it]

For epoch 644: 


Train batch number 41: 100%|██████████| 41/41 [00:12<00:00,  3.28batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.07batches/s]



Metrics: {'train_loss': 0.004616934116134738, 'test_loss': 0.5931076958775521, 'bleu': 21.722, 'gen_len': 9.7192}




  1%|          | 3/359 [00:58<1:54:01, 19.22s/it]

For epoch 645: 


Train batch number 41: 100%|██████████| 41/41 [00:13<00:00,  3.04batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.004914376490022533, 'test_loss': 0.5879104286432266, 'bleu': 21.7287, 'gen_len': 9.5685}




  1%|          | 4/359 [01:18<1:54:27, 19.35s/it]

For epoch 646: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.00503383327570812, 'test_loss': 0.5991944462060929, 'bleu': 21.2731, 'gen_len': 9.3288}




  1%|▏         | 5/359 [01:39<1:57:05, 19.85s/it]

For epoch 647: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.90batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.004687873726120082, 'test_loss': 0.5892325691878796, 'bleu': 21.5595, 'gen_len': 9.6438}




  2%|▏         | 6/359 [01:59<1:57:38, 20.00s/it]

For epoch 648: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.83batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.004526555027085833, 'test_loss': 0.5961752407252788, 'bleu': 23.1376, 'gen_len': 9.5411}




  2%|▏         | 7/359 [02:20<1:58:42, 20.24s/it]

For epoch 649: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004261045424244934, 'test_loss': 0.5997731812298298, 'bleu': 20.9615, 'gen_len': 9.7123}




  2%|▏         | 8/359 [02:41<2:01:31, 20.77s/it]

For epoch 650: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.004203370224298318, 'test_loss': 0.598857606202364, 'bleu': 22.6151, 'gen_len': 9.4932}




  3%|▎         | 9/359 [03:03<2:02:24, 20.98s/it]

For epoch 651: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.004743623554354488, 'test_loss': 0.6007527828216552, 'bleu': 21.376, 'gen_len': 9.5616}




  3%|▎         | 10/359 [03:24<2:02:12, 21.01s/it]

For epoch 652: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004667035559555743, 'test_loss': 0.5964943572878838, 'bleu': 20.7253, 'gen_len': 9.6644}




  3%|▎         | 11/359 [03:45<2:02:10, 21.07s/it]

For epoch 653: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.85batches/s]



Metrics: {'train_loss': 0.004506277221414011, 'test_loss': 0.6022572726011276, 'bleu': 20.9758, 'gen_len': 9.6986}




  3%|▎         | 12/359 [04:06<2:01:47, 21.06s/it]

For epoch 654: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.004148699979229671, 'test_loss': 0.6031935013830662, 'bleu': 21.6676, 'gen_len': 9.6849}




  4%|▎         | 13/359 [04:28<2:01:59, 21.16s/it]

For epoch 655: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004443215947348352, 'test_loss': 0.6023364618420601, 'bleu': 21.5236, 'gen_len': 9.6164}




  4%|▍         | 14/359 [04:50<2:03:26, 21.47s/it]

For epoch 656: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004280152232329384, 'test_loss': 0.5953427210450173, 'bleu': 21.6477, 'gen_len': 9.8082}




  4%|▍         | 15/359 [05:11<2:03:20, 21.51s/it]

For epoch 657: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004375461244773938, 'test_loss': 0.6007835507392884, 'bleu': 20.2639, 'gen_len': 9.8219}




  4%|▍         | 16/359 [05:33<2:03:08, 21.54s/it]

For epoch 658: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004414806923283855, 'test_loss': 0.6056582480669022, 'bleu': 21.8451, 'gen_len': 9.589}




  5%|▍         | 17/359 [05:54<2:02:36, 21.51s/it]

For epoch 659: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004466332222630338, 'test_loss': 0.5988088592886924, 'bleu': 21.5569, 'gen_len': 9.7671}




  5%|▌         | 18/359 [06:16<2:02:42, 21.59s/it]

For epoch 660: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.79batches/s]



Metrics: {'train_loss': 0.004351054363679595, 'test_loss': 0.6071567565202713, 'bleu': 20.5464, 'gen_len': 9.6644}




  5%|▌         | 19/359 [06:38<2:01:58, 21.52s/it]

For epoch 661: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004532547522813264, 'test_loss': 0.6079017773270607, 'bleu': 21.022, 'gen_len': 9.6027}




  6%|▌         | 20/359 [06:59<2:01:20, 21.48s/it]

For epoch 662: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.00446623851965386, 'test_loss': 0.6016876697540283, 'bleu': 21.7873, 'gen_len': 9.3082}




  6%|▌         | 21/359 [07:21<2:01:15, 21.52s/it]

For epoch 663: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.00432773917436418, 'test_loss': 0.6034417778253556, 'bleu': 19.7849, 'gen_len': 9.5479}




  6%|▌         | 22/359 [07:42<2:00:50, 21.51s/it]

For epoch 664: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.004391599695514831, 'test_loss': 0.6030517436563969, 'bleu': 20.4581, 'gen_len': 9.6712}




  6%|▋         | 23/359 [08:03<2:00:02, 21.44s/it]

For epoch 665: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.00424919279985039, 'test_loss': 0.5982158124446869, 'bleu': 18.785, 'gen_len': 9.6849}




  7%|▋         | 24/359 [08:25<2:00:16, 21.54s/it]

For epoch 666: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004627332676247489, 'test_loss': 0.6090446546673774, 'bleu': 17.809, 'gen_len': 9.7192}




  7%|▋         | 25/359 [08:47<1:59:44, 21.51s/it]

For epoch 667: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004476869033036254, 'test_loss': 0.599316093325615, 'bleu': 21.4581, 'gen_len': 9.8219}




  7%|▋         | 26/359 [09:09<2:00:06, 21.64s/it]

For epoch 668: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004249653935137137, 'test_loss': 0.6072284057736397, 'bleu': 20.9819, 'gen_len': 9.4658}




  8%|▊         | 27/359 [09:30<1:59:58, 21.68s/it]

For epoch 669: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004225348633509584, 'test_loss': 0.5993056908249855, 'bleu': 19.2376, 'gen_len': 9.6164}




  8%|▊         | 28/359 [09:52<1:59:41, 21.70s/it]

For epoch 670: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004520769529158204, 'test_loss': 0.5898060575127602, 'bleu': 19.4527, 'gen_len': 9.6507}




  8%|▊         | 29/359 [10:14<1:59:04, 21.65s/it]

For epoch 671: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.40batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.03batches/s]



Metrics: {'train_loss': 0.004277920811834586, 'test_loss': 0.5936950847506524, 'bleu': 20.3349, 'gen_len': 9.5479}




  8%|▊         | 30/359 [10:38<2:03:32, 22.53s/it]

For epoch 672: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.004321601156645068, 'test_loss': 0.6033155530691147, 'bleu': 20.7163, 'gen_len': 9.8425}




  9%|▊         | 31/359 [11:00<2:02:33, 22.42s/it]

For epoch 673: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.004588496851416804, 'test_loss': 0.5971854701638222, 'bleu': 21.8586, 'gen_len': 9.7808}




  9%|▉         | 32/359 [11:23<2:03:11, 22.60s/it]

For epoch 674: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004321518794224575, 'test_loss': 0.6087628111243248, 'bleu': 20.5666, 'gen_len': 9.4863}




  9%|▉         | 33/359 [11:45<2:01:30, 22.36s/it]

For epoch 675: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.0046385816133740106, 'test_loss': 0.5988063618540764, 'bleu': 20.8799, 'gen_len': 9.8082}




  9%|▉         | 34/359 [12:07<2:00:29, 22.24s/it]

For epoch 676: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.14batches/s]



Metrics: {'train_loss': 0.0042676498244584695, 'test_loss': 0.6039575077593327, 'bleu': 21.4205, 'gen_len': 9.7466}




 10%|▉         | 35/359 [12:30<2:00:33, 22.33s/it]

For epoch 677: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004385238996048163, 'test_loss': 0.6093518480658531, 'bleu': 20.9121, 'gen_len': 9.6918}




 10%|█         | 36/359 [12:52<1:59:41, 22.23s/it]

For epoch 678: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004547233000488543, 'test_loss': 0.5987645938992501, 'bleu': 22.5366, 'gen_len': 9.5068}




 10%|█         | 37/359 [13:13<1:58:06, 22.01s/it]

For epoch 679: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004668089143772859, 'test_loss': 0.606817090511322, 'bleu': 21.1756, 'gen_len': 9.5685}




 11%|█         | 38/359 [13:35<1:56:51, 21.84s/it]

For epoch 680: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.004727660172204374, 'test_loss': 0.6006692051887512, 'bleu': 22.641, 'gen_len': 9.6027}




 11%|█         | 39/359 [13:57<1:56:48, 21.90s/it]

For epoch 681: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004555647488592602, 'test_loss': 0.6041686728596687, 'bleu': 21.1784, 'gen_len': 9.274}




 11%|█         | 40/359 [14:18<1:55:59, 21.82s/it]

For epoch 682: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.43batches/s]



Metrics: {'train_loss': 0.0048468716857137115, 'test_loss': 0.5857984781265259, 'bleu': 22.7426, 'gen_len': 9.6438}




 11%|█▏        | 41/359 [14:40<1:56:05, 21.90s/it]

For epoch 683: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.0044700235878003805, 'test_loss': 0.5878001764416695, 'bleu': 21.7819, 'gen_len': 9.6849}




 12%|█▏        | 42/359 [15:03<1:56:16, 22.01s/it]

For epoch 684: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004664324769159642, 'test_loss': 0.5930204749107361, 'bleu': 21.4893, 'gen_len': 9.7397}




 12%|█▏        | 43/359 [15:24<1:54:51, 21.81s/it]

For epoch 685: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004643202381685558, 'test_loss': 0.581130450963974, 'bleu': 22.9077, 'gen_len': 9.6575}




 12%|█▏        | 44/359 [15:45<1:54:03, 21.72s/it]

For epoch 686: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.0041340497285468365, 'test_loss': 0.5924633510410786, 'bleu': 23.0327, 'gen_len': 9.7534}




 13%|█▎        | 45/359 [16:07<1:53:50, 21.75s/it]

For epoch 687: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004107312685469302, 'test_loss': 0.5925308458507061, 'bleu': 23.4881, 'gen_len': 9.6712}




 13%|█▎        | 46/359 [16:31<1:56:21, 22.31s/it]

For epoch 688: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004176556922086492, 'test_loss': 0.5951249569654464, 'bleu': 22.62, 'gen_len': 9.8082}




 13%|█▎        | 47/359 [16:52<1:54:32, 22.03s/it]

For epoch 689: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004339729546478427, 'test_loss': 0.6026889711618424, 'bleu': 23.7095, 'gen_len': 9.8562}




 13%|█▎        | 48/359 [17:16<1:56:35, 22.49s/it]

For epoch 690: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004112556281431419, 'test_loss': 0.5994828447699547, 'bleu': 22.8985, 'gen_len': 9.7671}




 14%|█▎        | 49/359 [17:37<1:54:31, 22.17s/it]

For epoch 691: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.00419142250562223, 'test_loss': 0.5989953950047493, 'bleu': 22.715, 'gen_len': 9.8904}




 14%|█▍        | 50/359 [17:59<1:53:20, 22.01s/it]

For epoch 692: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.0042638399797241865, 'test_loss': 0.5970909267663955, 'bleu': 21.7636, 'gen_len': 9.8699}




 14%|█▍        | 51/359 [18:21<1:53:23, 22.09s/it]

For epoch 693: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004296401820838361, 'test_loss': 0.5935538552701474, 'bleu': 22.1335, 'gen_len': 9.7466}




 14%|█▍        | 52/359 [18:43<1:52:05, 21.91s/it]

For epoch 694: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004132213611184142, 'test_loss': 0.5977059528231621, 'bleu': 22.3884, 'gen_len': 9.9247}




 15%|█▍        | 53/359 [19:04<1:51:29, 21.86s/it]

For epoch 695: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004178802496980785, 'test_loss': 0.5995888203382492, 'bleu': 21.076, 'gen_len': 9.8904}




 15%|█▌        | 54/359 [19:27<1:51:39, 21.96s/it]

For epoch 696: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.004298140311886261, 'test_loss': 0.5903781965374947, 'bleu': 20.7626, 'gen_len': 9.6233}




 15%|█▌        | 55/359 [19:48<1:50:42, 21.85s/it]

For epoch 697: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.004262053779689822, 'test_loss': 0.6054087713360786, 'bleu': 21.1323, 'gen_len': 9.9247}




 16%|█▌        | 56/359 [20:10<1:50:03, 21.79s/it]

For epoch 698: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.0040810557547956705, 'test_loss': 0.6021751642227173, 'bleu': 21.1874, 'gen_len': 9.6644}




 16%|█▌        | 57/359 [20:32<1:50:13, 21.90s/it]

For epoch 699: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.37batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.27batches/s]



Metrics: {'train_loss': 0.004310594202961955, 'test_loss': 0.5911638453602791, 'bleu': 22.392, 'gen_len': 9.6233}




 16%|█▌        | 58/359 [20:56<1:53:23, 22.60s/it]

For epoch 700: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.00416394632037093, 'test_loss': 0.5987414553761482, 'bleu': 21.1307, 'gen_len': 9.7603}




 16%|█▋        | 59/359 [21:19<1:52:48, 22.56s/it]

For epoch 701: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004238174595052331, 'test_loss': 0.6014227449893952, 'bleu': 21.7983, 'gen_len': 9.7192}




 17%|█▋        | 60/359 [21:41<1:51:18, 22.34s/it]

For epoch 702: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.004763100942221992, 'test_loss': 0.5918902322649956, 'bleu': 22.1121, 'gen_len': 9.9589}




 17%|█▋        | 61/359 [22:03<1:50:37, 22.27s/it]

For epoch 703: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004312530216738218, 'test_loss': 0.5984502896666527, 'bleu': 21.952, 'gen_len': 9.9521}




 17%|█▋        | 62/359 [22:25<1:49:56, 22.21s/it]

For epoch 704: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.42batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004346506783693302, 'test_loss': 0.6013494059443474, 'bleu': 21.4815, 'gen_len': 9.7808}




 18%|█▊        | 63/359 [22:48<1:51:12, 22.54s/it]

For epoch 705: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.0043796288820619625, 'test_loss': 0.5993398517370224, 'bleu': 22.3619, 'gen_len': 9.8219}




 18%|█▊        | 64/359 [23:12<1:53:14, 23.03s/it]

For epoch 706: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004405901826940841, 'test_loss': 0.6033985957503318, 'bleu': 21.8281, 'gen_len': 9.5}




 18%|█▊        | 65/359 [23:34<1:50:58, 22.65s/it]

For epoch 707: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.68batches/s]



Metrics: {'train_loss': 0.004196600830920677, 'test_loss': 0.6086117625236511, 'bleu': 21.1243, 'gen_len': 9.7603}




 18%|█▊        | 66/359 [23:58<1:52:50, 23.11s/it]

For epoch 708: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.97batches/s]



Metrics: {'train_loss': 0.0042694826928398955, 'test_loss': 0.6076828002929687, 'bleu': 21.1062, 'gen_len': 9.8288}




 19%|█▊        | 67/359 [24:22<1:52:54, 23.20s/it]

For epoch 709: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004298429311502997, 'test_loss': 0.6073699489235878, 'bleu': 21.2558, 'gen_len': 9.726}




 19%|█▉        | 68/359 [24:43<1:50:33, 22.79s/it]

For epoch 710: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.004216058849834088, 'test_loss': 0.603525860607624, 'bleu': 21.747, 'gen_len': 9.7466}




 19%|█▉        | 69/359 [25:07<1:50:38, 22.89s/it]

For epoch 711: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.004301304537121479, 'test_loss': 0.6018543064594268, 'bleu': 21.6169, 'gen_len': 9.9863}




 19%|█▉        | 70/359 [25:28<1:48:30, 22.53s/it]

For epoch 712: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004008664553644272, 'test_loss': 0.6109346657991409, 'bleu': 20.6414, 'gen_len': 9.6233}




 20%|█▉        | 71/359 [25:50<1:47:48, 22.46s/it]

For epoch 713: 


Train batch number 41: 100%|██████████| 41/41 [00:17<00:00,  2.38batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.74batches/s]



Metrics: {'train_loss': 0.004486865369693898, 'test_loss': 0.6069539368152619, 'bleu': 21.9148, 'gen_len': 9.5205}




 20%|██        | 72/359 [26:16<1:51:45, 23.37s/it]

For epoch 714: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004344456227178254, 'test_loss': 0.6031808465719223, 'bleu': 21.3329, 'gen_len': 9.6507}




 20%|██        | 73/359 [26:38<1:49:15, 22.92s/it]

For epoch 715: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.0041171213577887635, 'test_loss': 0.6026098847389221, 'bleu': 22.0882, 'gen_len': 9.6986}




 21%|██        | 74/359 [27:00<1:47:40, 22.67s/it]

For epoch 716: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004056094211480785, 'test_loss': 0.6119893297553063, 'bleu': 21.3104, 'gen_len': 9.6918}




 21%|██        | 75/359 [27:22<1:46:48, 22.57s/it]

For epoch 717: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004015157145459387, 'test_loss': 0.6091611474752426, 'bleu': 22.1086, 'gen_len': 9.8493}




 21%|██        | 76/359 [27:44<1:45:00, 22.26s/it]

For epoch 718: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.32batches/s]



Metrics: {'train_loss': 0.00406571538840625, 'test_loss': 0.6006011828780174, 'bleu': 21.9274, 'gen_len': 9.4932}




 21%|██▏       | 77/359 [28:06<1:44:10, 22.17s/it]

For epoch 719: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.004053534806433429, 'test_loss': 0.6044710889458657, 'bleu': 22.1576, 'gen_len': 9.4795}




 22%|██▏       | 78/359 [28:28<1:43:13, 22.04s/it]

For epoch 720: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.004175543773737623, 'test_loss': 0.6031597793102265, 'bleu': 21.3574, 'gen_len': 9.5753}




 22%|██▏       | 79/359 [28:49<1:42:19, 21.93s/it]

For epoch 721: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.004280570224381801, 'test_loss': 0.6078099429607391, 'bleu': 20.6783, 'gen_len': 9.6712}




 22%|██▏       | 80/359 [29:11<1:42:09, 21.97s/it]

For epoch 722: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.0041343196062371135, 'test_loss': 0.607762810587883, 'bleu': 20.3569, 'gen_len': 9.6164}




 23%|██▎       | 81/359 [29:34<1:42:54, 22.21s/it]

For epoch 723: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004358630252593174, 'test_loss': 0.610245268046856, 'bleu': 21.2499, 'gen_len': 9.4932}




 23%|██▎       | 82/359 [29:56<1:42:02, 22.10s/it]

For epoch 724: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.48batches/s]



Metrics: {'train_loss': 0.004088389657426444, 'test_loss': 0.6096836343407631, 'bleu': 21.9822, 'gen_len': 9.5479}




 23%|██▎       | 83/359 [30:18<1:41:20, 22.03s/it]

For epoch 725: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004195589566512442, 'test_loss': 0.6225066125392914, 'bleu': 19.993, 'gen_len': 9.6644}




 23%|██▎       | 84/359 [30:39<1:40:02, 21.83s/it]

For epoch 726: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.0042173887784706385, 'test_loss': 0.6214605420827866, 'bleu': 20.4563, 'gen_len': 9.6712}




 24%|██▎       | 85/359 [31:01<1:39:40, 21.83s/it]

For epoch 727: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004502806065763097, 'test_loss': 0.6043241202831269, 'bleu': 20.9983, 'gen_len': 9.6575}




 24%|██▍       | 86/359 [31:22<1:38:33, 21.66s/it]

For epoch 728: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004334406347990763, 'test_loss': 0.5947187259793282, 'bleu': 23.2824, 'gen_len': 9.8014}




 24%|██▍       | 87/359 [31:44<1:37:45, 21.56s/it]

For epoch 729: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004314361554684072, 'test_loss': 0.6010682806372643, 'bleu': 21.3762, 'gen_len': 9.7192}




 25%|██▍       | 88/359 [32:05<1:36:54, 21.46s/it]

For epoch 730: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.85batches/s]



Metrics: {'train_loss': 0.004731467661657938, 'test_loss': 0.6043962255120278, 'bleu': 21.4486, 'gen_len': 9.7534}




 25%|██▍       | 89/359 [32:26<1:35:59, 21.33s/it]

For epoch 731: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.003994587169982857, 'test_loss': 0.6048440039157867, 'bleu': 21.9051, 'gen_len': 9.911}




 25%|██▌       | 90/359 [32:47<1:36:08, 21.44s/it]

For epoch 732: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.004248524706515416, 'test_loss': 0.605307824909687, 'bleu': 22.3652, 'gen_len': 9.8219}




 25%|██▌       | 91/359 [33:09<1:35:41, 21.42s/it]

For epoch 733: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.004061948320623942, 'test_loss': 0.6084683746099472, 'bleu': 22.3785, 'gen_len': 9.7534}




 26%|██▌       | 92/359 [33:31<1:36:20, 21.65s/it]

For epoch 734: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.004105440857706637, 'test_loss': 0.6006340891122818, 'bleu': 23.5462, 'gen_len': 9.5822}




 26%|██▌       | 93/359 [33:53<1:35:48, 21.61s/it]

For epoch 735: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.00398237377450597, 'test_loss': 0.610350814461708, 'bleu': 23.4416, 'gen_len': 9.7466}




 26%|██▌       | 94/359 [34:14<1:34:49, 21.47s/it]

For epoch 736: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004217160933810036, 'test_loss': 0.6083076044917106, 'bleu': 22.2815, 'gen_len': 9.8082}




 26%|██▋       | 95/359 [34:35<1:34:29, 21.47s/it]

For epoch 737: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.004044181932751998, 'test_loss': 0.6022359371185303, 'bleu': 21.5758, 'gen_len': 9.7603}




 27%|██▋       | 96/359 [34:56<1:33:20, 21.29s/it]

For epoch 738: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.004236510835542548, 'test_loss': 0.602705504000187, 'bleu': 23.3453, 'gen_len': 9.7466}




 27%|██▋       | 97/359 [35:17<1:32:43, 21.23s/it]

For epoch 739: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004654557618778199, 'test_loss': 0.5979950547218322, 'bleu': 22.1766, 'gen_len': 9.7192}




 27%|██▋       | 98/359 [35:38<1:32:22, 21.24s/it]

For epoch 740: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004529517768632348, 'test_loss': 0.5970145910978317, 'bleu': 22.2217, 'gen_len': 9.8014}




 28%|██▊       | 99/359 [36:00<1:32:37, 21.38s/it]

For epoch 741: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.004460833459899466, 'test_loss': 0.6130794271826744, 'bleu': 22.2252, 'gen_len': 9.637}




 28%|██▊       | 100/359 [36:21<1:32:10, 21.35s/it]

For epoch 742: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.004422632568493122, 'test_loss': 0.6057129934430122, 'bleu': 22.1939, 'gen_len': 9.5548}




 28%|██▊       | 101/359 [36:43<1:31:32, 21.29s/it]

For epoch 743: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.00417330113194156, 'test_loss': 0.5919860646128654, 'bleu': 22.9462, 'gen_len': 9.9726}




 28%|██▊       | 102/359 [37:04<1:31:22, 21.33s/it]

For epoch 744: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004606686393366899, 'test_loss': 0.6013699725270272, 'bleu': 21.9547, 'gen_len': 9.8425}




 29%|██▊       | 103/359 [37:25<1:30:40, 21.25s/it]

For epoch 745: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004465067806006296, 'test_loss': 0.6061446815729141, 'bleu': 22.4991, 'gen_len': 9.6164}




 29%|██▉       | 104/359 [37:46<1:30:19, 21.25s/it]

For epoch 746: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.0046626687931498835, 'test_loss': 0.6065061166882515, 'bleu': 22.89, 'gen_len': 9.589}




 29%|██▉       | 105/359 [38:07<1:29:50, 21.22s/it]

For epoch 747: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.89batches/s]



Metrics: {'train_loss': 0.0044477656710802055, 'test_loss': 0.5959480926394463, 'bleu': 23.2962, 'gen_len': 9.6164}




 30%|██▉       | 106/359 [38:28<1:29:07, 21.14s/it]

For epoch 748: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.004339217656950761, 'test_loss': 0.594127707183361, 'bleu': 23.1831, 'gen_len': 9.5205}




 30%|██▉       | 107/359 [38:49<1:28:18, 21.03s/it]

For epoch 749: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.004413288552314043, 'test_loss': 0.5942404143512249, 'bleu': 23.0365, 'gen_len': 9.6575}




 30%|███       | 108/359 [39:11<1:28:30, 21.16s/it]

For epoch 750: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.85batches/s]



Metrics: {'train_loss': 0.004484069116254587, 'test_loss': 0.5906889356672764, 'bleu': 22.7431, 'gen_len': 9.8425}




 30%|███       | 109/359 [39:31<1:27:45, 21.06s/it]

For epoch 751: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.004918142950173649, 'test_loss': 0.5908770069479943, 'bleu': 21.4956, 'gen_len': 9.6575}




 31%|███       | 110/359 [39:53<1:27:39, 21.12s/it]

For epoch 752: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.004065299119868475, 'test_loss': 0.5974784418940544, 'bleu': 21.0407, 'gen_len': 9.6575}




 31%|███       | 111/359 [40:14<1:27:23, 21.14s/it]

For epoch 753: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004148201633630911, 'test_loss': 0.5952803015708923, 'bleu': 20.8674, 'gen_len': 9.5}




 31%|███       | 112/359 [40:35<1:27:32, 21.26s/it]

For epoch 754: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.0045159502045773875, 'test_loss': 0.5929833129048347, 'bleu': 21.5288, 'gen_len': 9.9315}




 31%|███▏      | 113/359 [40:56<1:26:53, 21.19s/it]

For epoch 755: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.78batches/s]



Metrics: {'train_loss': 0.004518260044988427, 'test_loss': 0.593874691426754, 'bleu': 21.85, 'gen_len': 9.8562}




 32%|███▏      | 114/359 [41:17<1:26:08, 21.10s/it]

For epoch 756: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.004299780387976547, 'test_loss': 0.5915726691484451, 'bleu': 22.0657, 'gen_len': 9.8836}




 32%|███▏      | 115/359 [41:38<1:25:42, 21.08s/it]

For epoch 757: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.004498737009537474, 'test_loss': 0.5904885903000832, 'bleu': 22.4404, 'gen_len': 9.4589}




 32%|███▏      | 116/359 [41:59<1:24:57, 20.98s/it]

For epoch 758: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.004308587547800526, 'test_loss': 0.5988358825445175, 'bleu': 21.2823, 'gen_len': 9.5685}




 33%|███▎      | 117/359 [42:21<1:25:35, 21.22s/it]

For epoch 759: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004164163219151882, 'test_loss': 0.5961561486124992, 'bleu': 22.5821, 'gen_len': 9.5411}




 33%|███▎      | 118/359 [42:43<1:26:18, 21.49s/it]

For epoch 760: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.004408584575441371, 'test_loss': 0.5958381474018097, 'bleu': 21.3402, 'gen_len': 9.4247}




 33%|███▎      | 119/359 [43:04<1:25:36, 21.40s/it]

For epoch 761: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004343716251641148, 'test_loss': 0.5988811865448952, 'bleu': 23.0036, 'gen_len': 9.411}




 33%|███▎      | 120/359 [43:25<1:25:02, 21.35s/it]

For epoch 762: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.0045231546547899884, 'test_loss': 0.6048312395811081, 'bleu': 20.7463, 'gen_len': 9.5274}




 34%|███▎      | 121/359 [43:47<1:24:40, 21.35s/it]

For epoch 763: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004531853413493259, 'test_loss': 0.5970023766160011, 'bleu': 22.2479, 'gen_len': 9.4247}




 34%|███▍      | 122/359 [44:08<1:24:10, 21.31s/it]

For epoch 764: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.004161810297973272, 'test_loss': 0.6029136419296265, 'bleu': 21.7502, 'gen_len': 9.3288}




 34%|███▍      | 123/359 [44:29<1:23:36, 21.26s/it]

For epoch 765: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.004225218137612612, 'test_loss': 0.6042122587561607, 'bleu': 22.3331, 'gen_len': 9.4795}




 35%|███▍      | 124/359 [44:50<1:23:09, 21.23s/it]

For epoch 766: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.39batches/s]



Metrics: {'train_loss': 0.004523583932030128, 'test_loss': 0.5915032789111138, 'bleu': 21.5006, 'gen_len': 9.411}




 35%|███▍      | 125/359 [45:12<1:23:43, 21.47s/it]

For epoch 767: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.85batches/s]



Metrics: {'train_loss': 0.00457448299644833, 'test_loss': 0.5833310872316361, 'bleu': 21.9455, 'gen_len': 9.6986}




 35%|███▌      | 126/359 [45:34<1:23:16, 21.44s/it]

For epoch 768: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.0042704889755241756, 'test_loss': 0.5966363370418548, 'bleu': 22.7195, 'gen_len': 9.5205}




 35%|███▌      | 127/359 [45:55<1:22:30, 21.34s/it]

For epoch 769: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.004204962409680664, 'test_loss': 0.6094239741563797, 'bleu': 22.937, 'gen_len': 9.6644}




 36%|███▌      | 128/359 [46:16<1:21:32, 21.18s/it]

For epoch 770: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.96batches/s]



Metrics: {'train_loss': 0.0044792830949740075, 'test_loss': 0.6219380319118499, 'bleu': 22.42, 'gen_len': 9.7603}




 36%|███▌      | 129/359 [46:36<1:20:34, 21.02s/it]

For epoch 771: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.49batches/s]



Metrics: {'train_loss': 0.0048062976069248664, 'test_loss': 0.5975985720753669, 'bleu': 22.6429, 'gen_len': 9.5068}




 36%|███▌      | 130/359 [46:58<1:21:10, 21.27s/it]

For epoch 772: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.73batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:05<00:00,  1.87batches/s]



Metrics: {'train_loss': 0.004209349888785765, 'test_loss': 0.6013129025697708, 'bleu': 22.9086, 'gen_len': 9.7808}




 36%|███▋      | 131/359 [47:21<1:22:42, 21.76s/it]

For epoch 773: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004132723802041899, 'test_loss': 0.5993783637881279, 'bleu': 23.0707, 'gen_len': 9.726}




 37%|███▋      | 132/359 [47:44<1:23:39, 22.11s/it]

For epoch 774: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.86batches/s]



Metrics: {'train_loss': 0.0043190775109773006, 'test_loss': 0.6006828263401985, 'bleu': 23.146, 'gen_len': 9.5342}




 37%|███▋      | 133/359 [48:06<1:23:22, 22.14s/it]

For epoch 775: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004677302445421313, 'test_loss': 0.595185287296772, 'bleu': 22.565, 'gen_len': 9.589}




 37%|███▋      | 134/359 [48:28<1:22:24, 21.98s/it]

For epoch 776: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.00431617939413139, 'test_loss': 0.5933796979486943, 'bleu': 23.5891, 'gen_len': 9.2329}




 38%|███▊      | 135/359 [48:49<1:21:22, 21.80s/it]

For epoch 777: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.004130767269355313, 'test_loss': 0.603956051170826, 'bleu': 23.3537, 'gen_len': 9.7192}




 38%|███▊      | 136/359 [49:11<1:20:43, 21.72s/it]

For epoch 778: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004298102573446203, 'test_loss': 0.5989986106753349, 'bleu': 21.8051, 'gen_len': 9.9589}




 38%|███▊      | 137/359 [49:32<1:20:00, 21.63s/it]

For epoch 779: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.00477645539099396, 'test_loss': 0.5881249204277992, 'bleu': 23.6749, 'gen_len': 9.9452}




 38%|███▊      | 138/359 [49:54<1:19:36, 21.61s/it]

For epoch 780: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.004118178036949802, 'test_loss': 0.603646557033062, 'bleu': 22.4069, 'gen_len': 9.3082}




 39%|███▊      | 139/359 [50:15<1:18:43, 21.47s/it]

For epoch 781: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.45batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.004221356113259567, 'test_loss': 0.6043725132942199, 'bleu': 22.3102, 'gen_len': 9.6644}




 39%|███▉      | 140/359 [50:39<1:21:05, 22.22s/it]

For epoch 782: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004354622856736547, 'test_loss': 0.5979043871164322, 'bleu': 22.4118, 'gen_len': 9.4863}




 39%|███▉      | 141/359 [51:01<1:20:13, 22.08s/it]

For epoch 783: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.00495901969886135, 'test_loss': 0.5867842346429825, 'bleu': 22.5056, 'gen_len': 9.8493}




 40%|███▉      | 142/359 [51:23<1:19:47, 22.06s/it]

For epoch 784: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.004319051134388712, 'test_loss': 0.5982120603322982, 'bleu': 20.788, 'gen_len': 9.6096}




 40%|███▉      | 143/359 [51:44<1:19:03, 21.96s/it]

For epoch 785: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.0045661354102421465, 'test_loss': 0.6018957838416099, 'bleu': 21.8179, 'gen_len': 9.7055}




 40%|████      | 144/359 [52:06<1:17:59, 21.77s/it]

For epoch 786: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.004256487626684602, 'test_loss': 0.6035672545433044, 'bleu': 21.7809, 'gen_len': 9.4863}




 40%|████      | 145/359 [52:27<1:17:46, 21.80s/it]

For epoch 787: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004409166799699206, 'test_loss': 0.5988839238882064, 'bleu': 22.2795, 'gen_len': 9.6918}




 41%|████      | 146/359 [52:49<1:17:18, 21.78s/it]

For epoch 788: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004435872366629177, 'test_loss': 0.5987440913915634, 'bleu': 22.3156, 'gen_len': 9.7055}




 41%|████      | 147/359 [53:10<1:16:25, 21.63s/it]

For epoch 789: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004228763719566348, 'test_loss': 0.6044333562254905, 'bleu': 21.8766, 'gen_len': 9.6507}




 41%|████      | 148/359 [53:32<1:15:42, 21.53s/it]

For epoch 790: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.78batches/s]



Metrics: {'train_loss': 0.004104991988619653, 'test_loss': 0.598707640171051, 'bleu': 21.3418, 'gen_len': 9.9589}




 42%|████▏     | 149/359 [53:53<1:15:00, 21.43s/it]

For epoch 791: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004033642744350179, 'test_loss': 0.614594255387783, 'bleu': 21.1019, 'gen_len': 9.8425}




 42%|████▏     | 150/359 [54:14<1:14:38, 21.43s/it]

For epoch 792: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.38batches/s]



Metrics: {'train_loss': 0.004149627025670758, 'test_loss': 0.6244646489620209, 'bleu': 20.3924, 'gen_len': 9.6233}




 42%|████▏     | 151/359 [54:37<1:15:20, 21.73s/it]

For epoch 793: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.004238164109155172, 'test_loss': 0.6086948409676551, 'bleu': 23.36, 'gen_len': 9.8288}




 42%|████▏     | 152/359 [54:58<1:14:35, 21.62s/it]

For epoch 794: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.004167107391034867, 'test_loss': 0.6069419667124748, 'bleu': 23.163, 'gen_len': 9.6507}




 43%|████▎     | 153/359 [55:20<1:14:07, 21.59s/it]

For epoch 795: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004057965448648646, 'test_loss': 0.6120897248387337, 'bleu': 22.8052, 'gen_len': 9.7397}




 43%|████▎     | 154/359 [55:41<1:13:21, 21.47s/it]

For epoch 796: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.85batches/s]



Metrics: {'train_loss': 0.00425021814341407, 'test_loss': 0.6036760330200195, 'bleu': 22.877, 'gen_len': 9.774}




 43%|████▎     | 155/359 [56:02<1:12:46, 21.40s/it]

For epoch 797: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004453175195078252, 'test_loss': 0.5985578253865242, 'bleu': 21.0029, 'gen_len': 9.5274}




 43%|████▎     | 156/359 [56:23<1:11:59, 21.28s/it]

For epoch 798: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.004497403605477657, 'test_loss': 0.6040803119540215, 'bleu': 21.6008, 'gen_len': 9.4315}




 44%|████▎     | 157/359 [56:44<1:11:00, 21.09s/it]

For epoch 799: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004365216142770539, 'test_loss': 0.5988973453640938, 'bleu': 21.851, 'gen_len': 9.6438}




 44%|████▍     | 158/359 [57:05<1:10:36, 21.08s/it]

For epoch 800: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.004340846540133764, 'test_loss': 0.6025350525975227, 'bleu': 22.4936, 'gen_len': 9.9315}




 44%|████▍     | 159/359 [57:26<1:10:15, 21.08s/it]

For epoch 801: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.004189159275350593, 'test_loss': 0.5969458490610122, 'bleu': 21.3553, 'gen_len': 10.137}




 45%|████▍     | 160/359 [57:48<1:11:01, 21.41s/it]

For epoch 802: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.004259062527747053, 'test_loss': 0.602462200820446, 'bleu': 21.7831, 'gen_len': 9.8904}




 45%|████▍     | 161/359 [58:10<1:10:42, 21.43s/it]

For epoch 803: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004078625880249935, 'test_loss': 0.6046108916401863, 'bleu': 21.6879, 'gen_len': 9.8151}




 45%|████▌     | 162/359 [58:30<1:09:50, 21.27s/it]

For epoch 804: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.00445851639180058, 'test_loss': 0.6043728724122047, 'bleu': 20.6673, 'gen_len': 9.8562}




 45%|████▌     | 163/359 [58:51<1:09:07, 21.16s/it]

For epoch 805: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.004450059643514999, 'test_loss': 0.6008149012923241, 'bleu': 21.2297, 'gen_len': 9.8425}




 46%|████▌     | 164/359 [59:12<1:08:06, 20.95s/it]

For epoch 806: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.90batches/s]



Metrics: {'train_loss': 0.0044123015576610115, 'test_loss': 0.6104689002037048, 'bleu': 21.5233, 'gen_len': 9.6986}




 46%|████▌     | 165/359 [59:33<1:07:44, 20.95s/it]

For epoch 807: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.004216929855075006, 'test_loss': 0.6101917624473572, 'bleu': 20.7838, 'gen_len': 9.863}




 46%|████▌     | 166/359 [59:54<1:07:15, 20.91s/it]

For epoch 808: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.85batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.004246353725448433, 'test_loss': 0.5969490990042686, 'bleu': 21.0979, 'gen_len': 9.5753}




 47%|████▋     | 167/359 [1:00:14<1:06:40, 20.84s/it]

For epoch 809: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004239226856109936, 'test_loss': 0.6011067986488342, 'bleu': 22.5259, 'gen_len': 9.6233}




 47%|████▋     | 168/359 [1:00:36<1:06:44, 20.97s/it]

For epoch 810: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.99batches/s]



Metrics: {'train_loss': 0.004034310565102936, 'test_loss': 0.6031008318066597, 'bleu': 23.4386, 'gen_len': 9.7329}




 47%|████▋     | 169/359 [1:00:56<1:06:19, 20.94s/it]

For epoch 811: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.004296136991597894, 'test_loss': 0.6036695584654808, 'bleu': 22.788, 'gen_len': 9.6301}




 47%|████▋     | 170/359 [1:01:18<1:06:34, 21.14s/it]

For epoch 812: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004265559658340019, 'test_loss': 0.605664224922657, 'bleu': 22.7813, 'gen_len': 9.774}




 48%|████▊     | 171/359 [1:01:39<1:06:08, 21.11s/it]

For epoch 813: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.00408173916384396, 'test_loss': 0.6069266095757484, 'bleu': 22.7867, 'gen_len': 9.5959}




 48%|████▊     | 172/359 [1:02:00<1:05:50, 21.12s/it]

For epoch 814: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.004177575825895296, 'test_loss': 0.6004225626587868, 'bleu': 21.8977, 'gen_len': 9.5068}




 48%|████▊     | 173/359 [1:02:21<1:05:25, 21.11s/it]

For epoch 815: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.78batches/s]



Metrics: {'train_loss': 0.004048305052537017, 'test_loss': 0.6097417563199997, 'bleu': 22.0508, 'gen_len': 9.7123}




 48%|████▊     | 174/359 [1:02:43<1:05:41, 21.31s/it]

For epoch 816: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.004364947212019526, 'test_loss': 0.6111921846866608, 'bleu': 22.5052, 'gen_len': 9.7534}




 49%|████▊     | 175/359 [1:03:04<1:05:26, 21.34s/it]

For epoch 817: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.004287358481300677, 'test_loss': 0.6052656427025795, 'bleu': 21.1552, 'gen_len': 9.6507}




 49%|████▉     | 176/359 [1:03:26<1:05:12, 21.38s/it]

For epoch 818: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.004169712183860744, 'test_loss': 0.6103675827383995, 'bleu': 22.1456, 'gen_len': 9.8356}




 49%|████▉     | 177/359 [1:03:47<1:04:32, 21.28s/it]

For epoch 819: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.76batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.004126952858868896, 'test_loss': 0.6101983532309532, 'bleu': 22.1544, 'gen_len': 9.5137}




 50%|████▉     | 178/359 [1:04:08<1:04:07, 21.26s/it]

For epoch 820: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.004088138071138684, 'test_loss': 0.6124800443649292, 'bleu': 21.9763, 'gen_len': 9.6575}




 50%|████▉     | 179/359 [1:04:30<1:04:03, 21.35s/it]

For epoch 821: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.36batches/s]



Metrics: {'train_loss': 0.004048575655693506, 'test_loss': 0.6081854775547981, 'bleu': 21.2124, 'gen_len': 9.8425}




 50%|█████     | 180/359 [1:04:52<1:04:19, 21.56s/it]

For epoch 822: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.40batches/s]



Metrics: {'train_loss': 0.004280027737500283, 'test_loss': 0.610800701379776, 'bleu': 23.0991, 'gen_len': 9.6918}




 50%|█████     | 181/359 [1:05:13<1:03:53, 21.54s/it]

For epoch 823: 


Train batch number 41: 100%|██████████| 41/41 [00:16<00:00,  2.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.004167745540077548, 'test_loss': 0.6094971641898155, 'bleu': 22.0294, 'gen_len': 9.7466}




 51%|█████     | 182/359 [1:05:36<1:04:25, 21.84s/it]

For epoch 824: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.004131198522816525, 'test_loss': 0.6117557570338249, 'bleu': 22.6017, 'gen_len': 9.6096}




 51%|█████     | 183/359 [1:05:57<1:03:29, 21.65s/it]

For epoch 825: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.004058003542013466, 'test_loss': 0.6142192721366883, 'bleu': 21.7652, 'gen_len': 9.8356}




 51%|█████▏    | 184/359 [1:06:19<1:03:14, 21.68s/it]

For epoch 826: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.004110690623680811, 'test_loss': 0.6131397008895874, 'bleu': 21.1808, 'gen_len': 9.5274}




 52%|█████▏    | 185/359 [1:06:41<1:03:21, 21.85s/it]

For epoch 827: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.004034395914570224, 'test_loss': 0.6084867969155312, 'bleu': 22.3587, 'gen_len': 9.7671}




 52%|█████▏    | 186/359 [1:07:03<1:02:50, 21.79s/it]

For epoch 828: 


Train batch number 41: 100%|██████████| 41/41 [00:15<00:00,  2.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.93batches/s]



Metrics: {'train_loss': 0.004160937478356972, 'test_loss': 0.6150997877120972, 'bleu': 21.2508, 'gen_len': 9.7397}




 52%|█████▏    | 187/359 [1:07:24<1:02:20, 21.75s/it]

For epoch 829: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.87batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004182810159740833, 'test_loss': 0.6154631823301315, 'bleu': 20.9139, 'gen_len': 9.5068}




 52%|█████▏    | 188/359 [1:07:45<1:01:12, 21.48s/it]

For epoch 830: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.004092101064487928, 'test_loss': 0.6154541581869125, 'bleu': 21.3113, 'gen_len': 9.6301}




 53%|█████▎    | 189/359 [1:08:06<1:00:19, 21.29s/it]

For epoch 831: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.94batches/s]



Metrics: {'train_loss': 0.004074798388656501, 'test_loss': 0.6170223444700241, 'bleu': 21.6724, 'gen_len': 9.4384}




 53%|█████▎    | 190/359 [1:08:26<59:12, 21.02s/it]  

For epoch 832: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.58batches/s]



Metrics: {'train_loss': 0.004309886668967765, 'test_loss': 0.6210570454597473, 'bleu': 21.3197, 'gen_len': 9.411}




 53%|█████▎    | 191/359 [1:08:48<59:18, 21.18s/it]

For epoch 833: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.00421020027431773, 'test_loss': 0.6189143031835556, 'bleu': 20.7344, 'gen_len': 9.7877}




 53%|█████▎    | 192/359 [1:09:10<59:23, 21.34s/it]

For epoch 834: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.89batches/s]



Metrics: {'train_loss': 0.0043838068008104835, 'test_loss': 0.6146216839551926, 'bleu': 21.4785, 'gen_len': 9.6644}




 54%|█████▍    | 193/359 [1:09:30<58:11, 21.03s/it]

For epoch 835: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004074337017699712, 'test_loss': 0.6173753902316094, 'bleu': 20.9215, 'gen_len': 9.6575}




 54%|█████▍    | 194/359 [1:09:51<57:44, 21.00s/it]

For epoch 836: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.004332991541227008, 'test_loss': 0.6168133527040481, 'bleu': 19.6174, 'gen_len': 9.7329}




 54%|█████▍    | 195/359 [1:10:12<57:15, 20.95s/it]

For epoch 837: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.004481516415026129, 'test_loss': 0.6156846866011619, 'bleu': 19.7138, 'gen_len': 9.5342}




 55%|█████▍    | 196/359 [1:10:33<56:57, 20.97s/it]

For epoch 838: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.93batches/s]



Metrics: {'train_loss': 0.004473355116050054, 'test_loss': 0.6125123456120491, 'bleu': 20.4514, 'gen_len': 9.7055}




 55%|█████▍    | 197/359 [1:10:54<56:27, 20.91s/it]

For epoch 839: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.78batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.004166108513500814, 'test_loss': 0.6138490691781044, 'bleu': 19.7768, 'gen_len': 9.7329}




 55%|█████▌    | 198/359 [1:11:15<56:13, 20.95s/it]

For epoch 840: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.0043251518154425956, 'test_loss': 0.6170810431241989, 'bleu': 20.9747, 'gen_len': 9.8219}




 55%|█████▌    | 199/359 [1:11:36<56:00, 21.00s/it]

For epoch 841: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.86batches/s]



Metrics: {'train_loss': 0.004160423462120135, 'test_loss': 0.5973883464932441, 'bleu': 20.8466, 'gen_len': 9.7534}




 56%|█████▌    | 200/359 [1:11:56<55:25, 20.91s/it]

For epoch 842: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.004199145157344457, 'test_loss': 0.6084083586931228, 'bleu': 20.2644, 'gen_len': 9.6164}




 56%|█████▌    | 201/359 [1:12:17<54:49, 20.82s/it]

For epoch 843: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.74batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.0042810419492037375, 'test_loss': 0.6044034138321877, 'bleu': 19.9838, 'gen_len': 9.6849}




 56%|█████▋    | 202/359 [1:12:38<54:57, 21.00s/it]

For epoch 844: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.004702024584885959, 'test_loss': 0.613100303709507, 'bleu': 19.8414, 'gen_len': 9.4658}




 57%|█████▋    | 203/359 [1:12:59<54:22, 20.92s/it]

For epoch 845: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.004534927378522187, 'test_loss': 0.6005889430642128, 'bleu': 20.0491, 'gen_len': 9.1781}




 57%|█████▋    | 204/359 [1:13:20<54:04, 20.94s/it]

For epoch 846: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.004232669535956186, 'test_loss': 0.6029304891824723, 'bleu': 20.5041, 'gen_len': 9.4932}




 57%|█████▋    | 205/359 [1:13:41<53:47, 20.96s/it]

For epoch 847: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.86batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.86batches/s]



Metrics: {'train_loss': 0.004153776709444639, 'test_loss': 0.6123334899544716, 'bleu': 20.5241, 'gen_len': 9.7603}




 57%|█████▋    | 206/359 [1:14:02<53:08, 20.84s/it]

For epoch 848: 


Train batch number 41: 100%|██████████| 41/41 [00:14<00:00,  2.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.004108519523359108, 'test_loss': 0.6031994193792343, 'bleu': 22.0755, 'gen_len': 9.8151}




 58%|█████▊    | 207/359 [1:14:23<52:51, 20.87s/it]

For epoch 849: 


Train batch number 37:  88%|████████▊ | 36/41 [03:32<00:09,  1.86s/batches]
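
The output above is long, so it can help to read the `Metrics:` lines programmatically when comparing epochs by BLEU. The following is a minimal sketch, not part of the original notebook: `parse_metrics` is a hypothetical helper, and `log_text` holds only three epochs copied from the output above. It assumes each `Metrics:` line is a valid Python dict literal preceded by a `For epoch N:` line.

```python
import ast
import re

# Three epochs copied verbatim from the training output above.
log_text = """
For epoch 809:
Metrics: {'train_loss': 0.004239226856109936, 'test_loss': 0.6011067986488342, 'bleu': 22.5259, 'gen_len': 9.6233}
For epoch 810:
Metrics: {'train_loss': 0.004034310565102936, 'test_loss': 0.6031008318066597, 'bleu': 23.4386, 'gen_len': 9.7329}
For epoch 811:
Metrics: {'train_loss': 0.004296136991597894, 'test_loss': 0.6036695584654808, 'bleu': 22.788, 'gen_len': 9.6301}
"""


def parse_metrics(text):
    """Hypothetical helper: pair every 'Metrics:' dict with the last epoch header seen."""
    records = []
    epoch = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"For epoch (\d+):", line)
        if header:
            epoch = int(header.group(1))
        elif line.startswith("Metrics:") and epoch is not None:
            # The dict is printed with Python's repr, so literal_eval can read it.
            metrics = ast.literal_eval(line[len("Metrics:"):].strip())
            records.append({"epoch": epoch, **metrics})
    return records


records = parse_metrics(log_text)

# Pick the epoch with the highest BLEU score among the parsed lines.
best = max(records, key=lambda record: record["bleu"])
print(best["epoch"], best["bleu"])
```

The same regular expression also matches headers of the form `For epoch 6: {Learning rate: [...]}`, since only the prefix is checked.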

### --- Wandb v3

In [12]:
# let us initialize the hyperparameter configuration 
config = {
    'random_state': 0,
    'fr_char_p': 0.7263195964594187,
    'fr_word_p': 0.04869562586034352,
    'learning_rate': 0.0054801115631762445,
    'weight_decay': 0.16862012664364023,
    'batch_size': 8,
    'warmup_ratio': 0.0,
    'max_epoch': 125,
    'max_len': 51,
    'bleu': 3.1436,
    'model_dir': 'data/checkpoints/wf_t5_small_custom_train_v3_2_checkpoints/',
    'new_model_dir': 'data/checkpoints/t5_small_custom_train_results_wf_v3_2/'
}

# Initialize the model name
model_name = 't5-small'

# import the model with its pre-trained weights
model = T5ForConditionalGeneration.from_pretrained(model_name)

# resize the token embeddings
model.resize_token_embeddings(len(tokenizer))

# let us initialize the evaluation class
evaluation = TranslationEvaluation(tokenizer)

# let us initialize the trainer
trainer = ModelRunner(model, seed = 0, version = 1, evaluation = evaluation, optimizer = Adafactor)

# split the data
split_data(config['random_state'])

# retrieve the train and test sets
train_dataset, test_dataset = recuperate_datasets(config['fr_char_p'], 
                                                    config['fr_word_p'], config['max_len'])

# let us calculate the number of training and warmup steps from the configured max epoch
length = len(train_dataset)

# number of full batches per epoch (floor division)
n_steps = length // config['batch_size']

num_steps = config['max_epoch'] * n_steps

warmup_steps = int(num_steps * config['warmup_ratio'])

# Initialize the scheduler parameters
scheduler_args = {'num_warmup_steps': warmup_steps, 'num_training_steps': num_steps}

# Initialize the optimizer parameters
optimizer_args = {
    'lr': config['learning_rate'],
    'weight_decay': config['weight_decay'],
    # 'betas': (0.9, 0.98),
    'relative_step': False
}

# Initialize the loaders parameters
train_loader_args = {'batch_size': config['batch_size']}

# Add the datasets and hyperparameters to trainer
trainer.compile(train_dataset, test_dataset, tokenizer, train_loader_args,
                optimizer_kwargs = optimizer_args,
                lr_scheduler=get_linear_schedule_with_warmup,
                lr_scheduler_kwargs=scheduler_args, 
                predict_with_generate = True,
                hugging_face = True,
                logging_dir="data/logs/t5_small_custom_train_wf_v3_2"
                )

# We resume training from checkpoints, so let us load the model
# trainer.load(config['model_dir'], load_best=True) # Only for the first loading
trainer.load(config['new_model_dir'], load_best=True)
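
The step arithmetic in the cell above determines how fast the linear schedule decays. The sketch below is a plain-Python illustration, not the notebook's own code: `linear_lr` mirrors the piecewise-linear rule of a warmup/decay schedule, and `lr_at_epoch_start` is a hypothetical helper. It assumes that the floor division `length // batch_size` gives 163 while the loader actually runs 164 batches per epoch (the count shown in the progress bars), and that the logged learning rate is read at the start of each epoch.

```python
def linear_lr(step, base_lr, num_training_steps, num_warmup_steps=0):
    """Learning rate of a linear warmup / linear decay schedule at a given step."""
    if step < num_warmup_steps:
        return base_lr * step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return base_lr * max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 0.0054801115631762445  # config['learning_rate']
max_epoch = 125                  # config['max_epoch']
batches_per_epoch = 164          # taken from the progress bars in the output
floor_steps_per_epoch = 163      # assumption: len(train_dataset) // batch_size

# Total steps as computed in the cell above (floor division per epoch).
num_steps = max_epoch * floor_steps_per_epoch


def lr_at_epoch_start(epoch):
    """Hypothetical helper: rate after (epoch - 1) full epochs of optimizer steps."""
    step = (epoch - 1) * batches_per_epoch
    return linear_lr(step, base_lr, num_steps)


print(lr_at_epoch_start(6))
print(lr_at_epoch_start(7))

# Steps actually run over the whole training versus steps the schedule was told about.
actual_total_steps = max_epoch * batches_per_epoch
zero_lr_steps = actual_total_steps - num_steps
print(zero_lr_steps)
```

Under these assumptions the two printed rates agree closely with the values logged for epochs 6 and 7, and the schedule would reach zero `zero_lr_steps` optimizer steps before the last epoch finishes. If that gap matters, computing the per-epoch step count with `math.ceil(length / batch_size)` would likely line the schedule up with the number of batches the loader yields.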

        

### --- 

In [10]:
trainer.train(epochs = config['max_epoch'] - trainer.current_epoch, auto_save=True, metric_for_best_model='bleu', metric_objective='maximize', log_step=1,
              saving_directory = config['new_model_dir'])



For epoch 6: {Learning rate: [0.005259562287995655]}


Train batch number 164: 100%|██████████| 164/164 [02:32<00:00,  1.08batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.37706227591488417, 'test_loss': 0.5732732146978379, 'bleu': 5.1447, 'gen_len': 8.5959}




  1%|          | 1/120 [02:38<5:13:44, 158.19s/it]

For epoch 7: {Learning rate: [0.005215452432959537]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.3167460635304451, 'test_loss': 0.5646547913551331, 'bleu': 2.4355, 'gen_len': 8.2534}




  2%|▏         | 2/120 [03:16<2:52:22, 87.65s/it] 

For epoch 8: {Learning rate: [0.005171342577923418]}


Train batch number 164: 100%|██████████| 164/164 [00:39<00:00,  4.15batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.24batches/s]



Metrics: {'train_loss': 0.26711766921528957, 'test_loss': 0.5743936955928802, 'bleu': 6.4414, 'gen_len': 8.9452}




  2%|▎         | 3/120 [04:02<2:13:37, 68.52s/it]

For epoch 9: {Learning rate: [0.0051272327228873]}


Train batch number 164: 100%|██████████| 164/164 [00:41<00:00,  3.93batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.22601714362276765, 'test_loss': 0.5467088893055916, 'bleu': 5.8022, 'gen_len': 8.9932}




  3%|▎         | 4/120 [04:49<1:56:10, 60.09s/it]

For epoch 10: {Learning rate: [0.005083122867851182]}


Train batch number 164: 100%|██████████| 164/164 [00:37<00:00,  4.37batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.55batches/s]



Metrics: {'train_loss': 0.18737927542590513, 'test_loss': 0.5673720240592957, 'bleu': 7.6023, 'gen_len': 9.0411}




  4%|▍         | 5/120 [05:33<1:43:53, 54.21s/it]

For epoch 11: {Learning rate: [0.005039013012815064]}


Train batch number 164: 100%|██████████| 164/164 [00:39<00:00,  4.10batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.15727639116528558, 'test_loss': 0.5675997957587242, 'bleu': 8.1002, 'gen_len': 8.5137}




  5%|▌         | 6/120 [06:17<1:36:56, 51.02s/it]

For epoch 12: {Learning rate: [0.004994903157778946]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.58batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.84batches/s]



Metrics: {'train_loss': 0.13699832590433156, 'test_loss': 0.5505748108029366, 'bleu': 10.4809, 'gen_len': 9.6849}




  6%|▌         | 7/120 [06:58<1:29:36, 47.58s/it]

For epoch 13: {Learning rate: [0.004950793302742828]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.07batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.05batches/s]



Metrics: {'train_loss': 0.11938012291381998, 'test_loss': 0.5598742499947548, 'bleu': 10.3136, 'gen_len': 9.3425}




  7%|▋         | 8/120 [07:34<1:22:12, 44.04s/it]

For epoch 14: {Learning rate: [0.0049066834477067105]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.93batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.94batches/s]



Metrics: {'train_loss': 0.10662350165316971, 'test_loss': 0.5703690692782402, 'bleu': 9.4286, 'gen_len': 9.8836}




  8%|▊         | 9/120 [08:12<1:17:42, 42.00s/it]

For epoch 15: {Learning rate: [0.0048625735926705925]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.87batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.06batches/s]



Metrics: {'train_loss': 0.09547583979168316, 'test_loss': 0.5540228724479676, 'bleu': 11.5011, 'gen_len': 8.863}




  8%|▊         | 10/120 [08:51<1:15:12, 41.02s/it]

For epoch 16: {Learning rate: [0.004818463737634475]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.08583713497784806, 'test_loss': 0.553189893066883, 'bleu': 10.545, 'gen_len': 9.5}




  9%|▉         | 11/120 [09:32<1:14:26, 40.97s/it]

For epoch 17: {Learning rate: [0.004774353882598357]}


Train batch number 164: 100%|██████████| 164/164 [00:37<00:00,  4.41batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.37batches/s]



Metrics: {'train_loss': 0.0770307999422274, 'test_loss': 0.5603786438703537, 'bleu': 13.8116, 'gen_len': 9.2671}




 10%|█         | 12/120 [10:16<1:15:45, 42.09s/it]

For epoch 18: {Learning rate: [0.004730244027562239]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.01batches/s]



Metrics: {'train_loss': 0.07053642851732127, 'test_loss': 0.5501765787601471, 'bleu': 13.6064, 'gen_len': 9.4932}




 11%|█         | 13/120 [10:55<1:13:27, 41.19s/it]

For epoch 19: {Learning rate: [0.004686134172526121]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.03batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.0671849556312692, 'test_loss': 0.5557519607245922, 'bleu': 13.9548, 'gen_len': 9.1712}




 12%|█▏        | 14/120 [11:32<1:10:32, 39.93s/it]

For epoch 20: {Learning rate: [0.004642024317490003]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.80batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.05970688155148088, 'test_loss': 0.553097614645958, 'bleu': 13.294, 'gen_len': 10.0342}




 12%|█▎        | 15/120 [12:11<1:09:02, 39.46s/it]

For epoch 21: {Learning rate: [0.004597914462453884]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.01batches/s]



Metrics: {'train_loss': 0.05666683144049674, 'test_loss': 0.5481992065906525, 'bleu': 13.444, 'gen_len': 9.5548}




 13%|█▎        | 16/120 [12:48<1:07:16, 38.81s/it]

For epoch 22: {Learning rate: [0.004553804607417766]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  4.99batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.05212589450998277, 'test_loss': 0.5728618398308754, 'bleu': 14.3437, 'gen_len': 9.411}




 14%|█▍        | 17/120 [13:26<1:06:06, 38.51s/it]

For epoch 23: {Learning rate: [0.004509694752381648]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.94batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.88batches/s]



Metrics: {'train_loss': 0.05024891486391425, 'test_loss': 0.5593014046549797, 'bleu': 14.6324, 'gen_len': 9.3836}




 15%|█▌        | 18/120 [14:04<1:05:06, 38.30s/it]

For epoch 24: {Learning rate: [0.00446558489734553]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.045970691061328825, 'test_loss': 0.5612509787082672, 'bleu': 16.7039, 'gen_len': 9.8151}




 16%|█▌        | 19/120 [14:43<1:04:58, 38.60s/it]

For epoch 25: {Learning rate: [0.004421475042309413]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.88batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.0458613348352473, 'test_loss': 0.5678857356309891, 'bleu': 16.3611, 'gen_len': 9.5}




 17%|█▋        | 20/120 [15:21<1:04:12, 38.52s/it]

For epoch 26: {Learning rate: [0.004377365187273295]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.040839978005373624, 'test_loss': 0.5650599420070648, 'bleu': 14.1028, 'gen_len': 9.0685}




 18%|█▊        | 21/120 [16:00<1:03:33, 38.52s/it]

For epoch 27: {Learning rate: [0.004333255332237177]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.08batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.01batches/s]



Metrics: {'train_loss': 0.04037886012981578, 'test_loss': 0.543212553113699, 'bleu': 18.156, 'gen_len': 9.5274}




 18%|█▊        | 22/120 [16:37<1:02:03, 37.99s/it]

For epoch 28: {Learning rate: [0.004289145477201059]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.77batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.03909661826427754, 'test_loss': 0.5781049966812134, 'bleu': 13.3383, 'gen_len': 9.137}




 19%|█▉        | 23/120 [17:16<1:01:53, 38.29s/it]

For epoch 29: {Learning rate: [0.004245035622164941]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.80batches/s]



Metrics: {'train_loss': 0.03715374148677944, 'test_loss': 0.5761025741696357, 'bleu': 13.9914, 'gen_len': 9.4589}




 20%|██        | 24/120 [17:56<1:02:09, 38.85s/it]

For epoch 30: {Learning rate: [0.004200925767128823]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.79batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.89batches/s]



Metrics: {'train_loss': 0.03586908825673163, 'test_loss': 0.5622772470116615, 'bleu': 15.9993, 'gen_len': 9.274}




 21%|██        | 25/120 [18:34<1:01:24, 38.79s/it]

For epoch 31: {Learning rate: [0.004156815912092705]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.97batches/s]



Metrics: {'train_loss': 0.03315493942056669, 'test_loss': 0.5636720344424248, 'bleu': 19.8885, 'gen_len': 9.0548}




 22%|██▏       | 26/120 [19:16<1:02:06, 39.64s/it]

For epoch 32: {Learning rate: [0.004112706057056587]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.90batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.02batches/s]



Metrics: {'train_loss': 0.03250534194748758, 'test_loss': 0.5731255769729614, 'bleu': 16.2114, 'gen_len': 9.3493}




 22%|██▎       | 27/120 [19:54<1:00:31, 39.04s/it]

For epoch 33: {Learning rate: [0.004068596202020469]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  4.99batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.97batches/s]



Metrics: {'train_loss': 0.0320014423838385, 'test_loss': 0.5796553611755371, 'bleu': 17.0567, 'gen_len': 9.411}




 23%|██▎       | 28/120 [20:31<58:56, 38.44s/it]  

For epoch 34: {Learning rate: [0.004024486346984351]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.03batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.031337419740583114, 'test_loss': 0.5650656059384346, 'bleu': 17.1438, 'gen_len': 9.4863}




 24%|██▍       | 29/120 [21:08<57:53, 38.17s/it]

For epoch 35: {Learning rate: [0.003980376491948233]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  4.99batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.01batches/s]



Metrics: {'train_loss': 0.030074618257037024, 'test_loss': 0.5634165436029435, 'bleu': 17.4606, 'gen_len': 9.5}




 25%|██▌       | 30/120 [21:47<57:33, 38.37s/it]

For epoch 36: {Learning rate: [0.003936266636912115]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.99batches/s]



Metrics: {'train_loss': 0.027560292907831508, 'test_loss': 0.5924248322844505, 'bleu': 16.8616, 'gen_len': 9.137}




 26%|██▌       | 31/120 [22:26<57:17, 38.62s/it]

For epoch 37: {Learning rate: [0.003892156781875997]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.01batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.07batches/s]



Metrics: {'train_loss': 0.028389066258990545, 'test_loss': 0.5761129647493363, 'bleu': 15.0433, 'gen_len': 9.6712}




 27%|██▋       | 32/120 [23:03<55:52, 38.09s/it]

For epoch 38: {Learning rate: [0.003848046926839879]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.03batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.027585540000894446, 'test_loss': 0.5840333625674248, 'bleu': 17.9824, 'gen_len': 9.363}




 28%|██▊       | 33/120 [23:40<54:40, 37.71s/it]

For epoch 39: {Learning rate: [0.0038039370718037607]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.05batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.83batches/s]



Metrics: {'train_loss': 0.025610021343909023, 'test_loss': 0.5774774014949798, 'bleu': 19.1789, 'gen_len': 9.2877}




 28%|██▊       | 34/120 [24:17<53:42, 37.47s/it]

For epoch 40: {Learning rate: [0.0037598272167676428]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.91batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.02586492991483793, 'test_loss': 0.5863970667123795, 'bleu': 19.7076, 'gen_len': 9.2603}




 29%|██▉       | 35/120 [24:55<53:08, 37.51s/it]

For epoch 41: {Learning rate: [0.003715717361731525]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.08batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.02515048533110175, 'test_loss': 0.5699920743703842, 'bleu': 17.5807, 'gen_len': 9.4726}




 30%|███       | 36/120 [25:31<52:11, 37.28s/it]

For epoch 42: {Learning rate: [0.003671607506695407]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.02batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.07batches/s]



Metrics: {'train_loss': 0.023853920349033504, 'test_loss': 0.5732041791081428, 'bleu': 18.2291, 'gen_len': 9.3151}




 31%|███       | 37/120 [26:08<51:23, 37.14s/it]

For epoch 43: {Learning rate: [0.003627497651659289]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.00batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.02batches/s]



Metrics: {'train_loss': 0.023719656008591012, 'test_loss': 0.5696721896529198, 'bleu': 19.6681, 'gen_len': 9.6027}




 32%|███▏      | 38/120 [26:45<50:41, 37.10s/it]

For epoch 44: {Learning rate: [0.003583387796623171]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.89batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.98batches/s]



Metrics: {'train_loss': 0.02238731716032617, 'test_loss': 0.5802316710352897, 'bleu': 19.6284, 'gen_len': 9.2466}




 32%|███▎      | 39/120 [27:23<50:21, 37.31s/it]

For epoch 45: {Learning rate: [0.003539277941587053]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.87batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.022585594177632253, 'test_loss': 0.5892699480056762, 'bleu': 17.3987, 'gen_len': 9.0411}




 33%|███▎      | 40/120 [28:01<50:02, 37.53s/it]

For epoch 46: {Learning rate: [0.0034951680865509347]}


Train batch number 164: 100%|██████████| 164/164 [00:31<00:00,  5.14batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.00batches/s]



Metrics: {'train_loss': 0.022673993783139782, 'test_loss': 0.5772102333605289, 'bleu': 17.5378, 'gen_len': 9.3836}




 34%|███▍      | 41/120 [28:37<48:51, 37.10s/it]

For epoch 47: {Learning rate: [0.0034510582315148168]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.09batches/s]



Metrics: {'train_loss': 0.022004925072329436, 'test_loss': 0.5858615651726723, 'bleu': 17.4537, 'gen_len': 9.5411}




 35%|███▌      | 42/120 [29:15<48:34, 37.37s/it]

For epoch 48: {Learning rate: [0.003406948376478699]}


Train batch number 164: 100%|██████████| 164/164 [00:31<00:00,  5.13batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.021724270071788895, 'test_loss': 0.6011502236127854, 'bleu': 18.941, 'gen_len': 9.363}




 36%|███▌      | 43/120 [29:51<47:28, 37.00s/it]

For epoch 49: {Learning rate: [0.003362838521442581]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.90batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.04batches/s]



Metrics: {'train_loss': 0.020042033320883425, 'test_loss': 0.5876112207770348, 'bleu': 18.8009, 'gen_len': 9.411}




 37%|███▋      | 44/120 [30:29<47:07, 37.20s/it]

For epoch 50: {Learning rate: [0.003318728666406463]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  4.97batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.020191768354668124, 'test_loss': 0.5987733855843544, 'bleu': 19.2169, 'gen_len': 9.774}




 38%|███▊      | 45/120 [31:06<46:36, 37.29s/it]

For epoch 51: {Learning rate: [0.003274618811370345]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.10batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.06batches/s]



Metrics: {'train_loss': 0.020239781202157824, 'test_loss': 0.5843050062656403, 'bleu': 18.7246, 'gen_len': 9.3836}




 38%|███▊      | 46/120 [31:43<45:35, 36.97s/it]

For epoch 52: {Learning rate: [0.0032305089563342266]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.88batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.03batches/s]



Metrics: {'train_loss': 0.020143617799210294, 'test_loss': 0.5878256380558013, 'bleu': 17.5686, 'gen_len': 9.5068}




 39%|███▉      | 47/120 [32:20<45:17, 37.22s/it]

For epoch 53: {Learning rate: [0.0031863991012981087]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.01935496423734216, 'test_loss': 0.5823399871587753, 'bleu': 20.0654, 'gen_len': 9.1438}




 40%|████      | 48/120 [33:00<45:36, 38.01s/it]

For epoch 54: {Learning rate: [0.0031422892462619908]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.92batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.76batches/s]



Metrics: {'train_loss': 0.017993268389368385, 'test_loss': 0.5871823564171791, 'bleu': 17.2944, 'gen_len': 9.5685}




 41%|████      | 49/120 [33:38<44:56, 37.98s/it]

For epoch 55: {Learning rate: [0.003098179391225873]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.96batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.018295095449813257, 'test_loss': 0.5875287994742393, 'bleu': 20.3482, 'gen_len': 9.6575}




 42%|████▏     | 50/120 [34:16<44:11, 37.88s/it]

For epoch 56: {Learning rate: [0.003054069536189755]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.04batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.04batches/s]



Metrics: {'train_loss': 0.017632208386913123, 'test_loss': 0.5866471037268639, 'bleu': 18.136, 'gen_len': 9.4932}




 42%|████▎     | 51/120 [34:52<43:09, 37.53s/it]

For epoch 57: {Learning rate: [0.003009959681153637]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.11batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.93batches/s]



Metrics: {'train_loss': 0.017314196953254684, 'test_loss': 0.591906763613224, 'bleu': 19.366, 'gen_len': 9.2808}




 43%|████▎     | 52/120 [35:29<42:08, 37.18s/it]

For epoch 58: {Learning rate: [0.002965849826117519]}


Train batch number 164: 100%|██████████| 164/164 [00:32<00:00,  5.06batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  3.02batches/s]



Metrics: {'train_loss': 0.017070128179223435, 'test_loss': 0.5853349059820175, 'bleu': 19.8957, 'gen_len': 9.3767}




 44%|████▍     | 53/120 [36:05<41:19, 37.00s/it]

For epoch 59: {Learning rate: [0.0029217399710814006]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.91batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.95batches/s]



Metrics: {'train_loss': 0.01820072606660244, 'test_loss': 0.587802705168724, 'bleu': 18.389, 'gen_len': 9.2877}




 45%|████▌     | 54/120 [36:43<40:55, 37.20s/it]

For epoch 60: {Learning rate: [0.0028776301160452827]}


Train batch number 164: 100%|██████████| 164/164 [00:44<00:00,  3.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.016506916329423648, 'test_loss': 0.582241901755333, 'bleu': 20.6056, 'gen_len': 9.4452}




 46%|████▌     | 55/120 [37:32<44:08, 40.75s/it]

For epoch 61: {Learning rate: [0.0028335202610091648]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.46batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.016060819721598996, 'test_loss': 0.5771393671631813, 'bleu': 20.058, 'gen_len': 9.6918}




 47%|████▋     | 56/120 [38:14<43:50, 41.10s/it]

For epoch 62: {Learning rate: [0.002789410405973047]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.51batches/s]



Metrics: {'train_loss': 0.015800825467832933, 'test_loss': 0.5994600296020508, 'bleu': 17.9253, 'gen_len': 9.1986}




 48%|████▊     | 57/120 [38:56<43:16, 41.22s/it]

For epoch 63: {Learning rate: [0.002745300550936929]}


Train batch number 164: 100%|██████████| 164/164 [00:37<00:00,  4.35batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.42batches/s]



Metrics: {'train_loss': 0.014985333330270538, 'test_loss': 0.6041627794504165, 'bleu': 19.1289, 'gen_len': 9.274}




 48%|████▊     | 58/120 [39:39<43:08, 41.75s/it]

For epoch 64: {Learning rate: [0.002701190695900811]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.014840405905710124, 'test_loss': 0.6077981546521187, 'bleu': 18.3561, 'gen_len': 8.9795}




 49%|████▉     | 59/120 [40:18<41:52, 41.19s/it]

For epoch 65: {Learning rate: [0.002657080840864693]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.75batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.50batches/s]



Metrics: {'train_loss': 0.014822579595180818, 'test_loss': 0.5921641036868095, 'bleu': 21.0977, 'gen_len': 9.137}




 50%|█████     | 60/120 [40:58<40:50, 40.83s/it]

For epoch 66: {Learning rate: [0.002612970985828575]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.79batches/s]



Metrics: {'train_loss': 0.015407313552980379, 'test_loss': 0.5765330225229264, 'bleu': 19.3539, 'gen_len': 9.1027}




 51%|█████     | 61/120 [41:38<39:44, 40.42s/it]

For epoch 67: {Learning rate: [0.002568861130792457]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.015092022464292624, 'test_loss': 0.5911264240741729, 'bleu': 20.5139, 'gen_len': 9.5411}




 52%|█████▏    | 62/120 [42:19<39:08, 40.49s/it]

For epoch 68: {Learning rate: [0.002524751275756339]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.81batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.015089938365967899, 'test_loss': 0.5972484946250916, 'bleu': 18.4305, 'gen_len': 9.089}




 52%|█████▎    | 63/120 [42:57<37:55, 39.91s/it]

For epoch 69: {Learning rate: [0.0024806414207202213]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.014340972699645179, 'test_loss': 0.5958916179835796, 'bleu': 19.9758, 'gen_len': 9.3014}




 53%|█████▎    | 64/120 [43:39<37:44, 40.44s/it]

For epoch 70: {Learning rate: [0.002436531565684103]}


Train batch number 164: 100%|██████████| 164/164 [00:39<00:00,  4.20batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.013667562428475699, 'test_loss': 0.5978620707988739, 'bleu': 20.453, 'gen_len': 9.3288}




 54%|█████▍    | 65/120 [44:22<37:56, 41.39s/it]

For epoch 71: {Learning rate: [0.002392421710647985]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.48batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.30batches/s]



Metrics: {'train_loss': 0.013465950705633476, 'test_loss': 0.6031666696071625, 'bleu': 20.3632, 'gen_len': 9.137}




 55%|█████▌    | 66/120 [45:04<37:25, 41.58s/it]

For epoch 72: {Learning rate: [0.002348311855611867]}


Train batch number 164: 100%|██████████| 164/164 [00:39<00:00,  4.15batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.45batches/s]



Metrics: {'train_loss': 0.014328845783716022, 'test_loss': 0.5813095800578594, 'bleu': 21.7312, 'gen_len': 9.4863}




 56%|█████▌    | 67/120 [45:49<37:36, 42.58s/it]

For epoch 73: {Learning rate: [0.002304202000575749]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.013796324077880055, 'test_loss': 0.607198677957058, 'bleu': 21.2849, 'gen_len': 9.1986}




 57%|█████▋    | 68/120 [46:30<36:22, 41.97s/it]

For epoch 74: {Learning rate: [0.002260092145539631]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.013779326980910831, 'test_loss': 0.6008716762065888, 'bleu': 22.1785, 'gen_len': 9.7466}




 57%|█████▊    | 69/120 [47:10<35:12, 41.42s/it]

For epoch 75: {Learning rate: [0.002215982290503513]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.53batches/s]



Metrics: {'train_loss': 0.01360741252556625, 'test_loss': 0.5918160781264306, 'bleu': 21.1255, 'gen_len': 9.4589}




 58%|█████▊    | 70/120 [47:51<34:18, 41.16s/it]

For epoch 76: {Learning rate: [0.002171872435467395]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.57batches/s]



Metrics: {'train_loss': 0.013159858744318893, 'test_loss': 0.6008285209536552, 'bleu': 20.5378, 'gen_len': 9.5068}




 59%|█████▉    | 71/120 [48:32<33:36, 41.15s/it]

For epoch 77: {Learning rate: [0.002127762580431277]}


Train batch number 164: 100%|██████████| 164/164 [00:37<00:00,  4.40batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.73batches/s]



Metrics: {'train_loss': 0.012734522592991864, 'test_loss': 0.5950274258852005, 'bleu': 20.5287, 'gen_len': 9.2534}




 60%|██████    | 72/120 [49:14<33:07, 41.40s/it]

For epoch 78: {Learning rate: [0.002083652725395159]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.012617188351081185, 'test_loss': 0.604338426887989, 'bleu': 20.3863, 'gen_len': 9.2534}




 61%|██████    | 73/120 [49:54<32:16, 41.19s/it]

For epoch 79: {Learning rate: [0.002039542870359041]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.41batches/s]



Metrics: {'train_loss': 0.012763473308906413, 'test_loss': 0.5980563327670098, 'bleu': 22.2998, 'gen_len': 9.2877}




 62%|██████▏   | 74/120 [50:36<31:39, 41.29s/it]

For epoch 80: {Learning rate: [0.001995433015322923]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.45batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.013041243781960319, 'test_loss': 0.6118626549839974, 'bleu': 19.6842, 'gen_len': 9.1438}




 62%|██████▎   | 75/120 [51:18<31:03, 41.41s/it]

For epoch 81: {Learning rate: [0.001951323160286805]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.43batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.47batches/s]



Metrics: {'train_loss': 0.012086475473862686, 'test_loss': 0.5813816800713539, 'bleu': 20.8458, 'gen_len': 9.3836}




 63%|██████▎   | 76/120 [52:00<30:31, 41.62s/it]

For epoch 82: {Learning rate: [0.001907213305250687]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.011506833327741066, 'test_loss': 0.5923467621207237, 'bleu': 21.3509, 'gen_len': 9.1986}




 64%|██████▍   | 77/120 [52:41<29:39, 41.39s/it]

For epoch 83: {Learning rate: [0.0018631034502145693]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.56batches/s]



Metrics: {'train_loss': 0.01151422687192879, 'test_loss': 0.5950937300920487, 'bleu': 20.2715, 'gen_len': 9.2466}




 65%|██████▌   | 78/120 [53:22<28:55, 41.33s/it]

For epoch 84: {Learning rate: [0.0018189935951784513]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.56batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.011894571038371906, 'test_loss': 0.5898650661110878, 'bleu': 19.6999, 'gen_len': 9.363}




 66%|██████▌   | 79/120 [54:03<28:08, 41.17s/it]

For epoch 85: {Learning rate: [0.0017748837401423332]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.47batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.011629014601789006, 'test_loss': 0.6078643664717674, 'bleu': 20.5685, 'gen_len': 9.137}




 67%|██████▋   | 80/120 [54:44<27:31, 41.29s/it]

For epoch 86: {Learning rate: [0.0017307738851062152]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.35batches/s]



Metrics: {'train_loss': 0.011705552656544237, 'test_loss': 0.5963006526231766, 'bleu': 21.7152, 'gen_len': 9.1849}




 68%|██████▊   | 81/120 [55:25<26:47, 41.22s/it]

For epoch 87: {Learning rate: [0.0016866640300700973]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.52batches/s]



Metrics: {'train_loss': 0.01086892100567816, 'test_loss': 0.5973270237445831, 'bleu': 20.8299, 'gen_len': 9.2055}




 68%|██████▊   | 82/120 [56:05<25:55, 40.93s/it]

For epoch 88: {Learning rate: [0.0016425541750339793]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.33batches/s]



Metrics: {'train_loss': 0.012227002411325485, 'test_loss': 0.5978757470846177, 'bleu': 21.643, 'gen_len': 9.1233}




 69%|██████▉   | 83/120 [56:47<25:21, 41.13s/it]

For epoch 89: {Learning rate: [0.0015984443199978612]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.59batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.011305816042827578, 'test_loss': 0.5920078083872795, 'bleu': 20.9242, 'gen_len': 9.3288}




 70%|███████   | 84/120 [57:28<24:35, 40.98s/it]

For epoch 90: {Learning rate: [0.0015543344649617432]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.010782118888838762, 'test_loss': 0.5927596122026444, 'bleu': 20.8864, 'gen_len': 9.2808}




 71%|███████   | 85/120 [58:08<23:47, 40.78s/it]

For epoch 91: {Learning rate: [0.0015102246099256253]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.68batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.010467291430558828, 'test_loss': 0.5945031598210335, 'bleu': 22.8314, 'gen_len': 9.1233}




 72%|███████▏  | 86/120 [58:48<22:59, 40.57s/it]

For epoch 92: {Learning rate: [0.0014661147548895072]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.81batches/s]



Metrics: {'train_loss': 0.010993188278317996, 'test_loss': 0.5932282269001007, 'bleu': 22.5047, 'gen_len': 9.6301}




 72%|███████▎  | 87/120 [59:29<22:21, 40.66s/it]

For epoch 93: {Learning rate: [0.0014220048998533892]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.67batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.010352298853033019, 'test_loss': 0.5878886595368386, 'bleu': 22.9625, 'gen_len': 9.3562}




 73%|███████▎  | 88/120 [1:00:09<21:34, 40.47s/it]

For epoch 94: {Learning rate: [0.0013778950448172713]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.010509867987754504, 'test_loss': 0.5931658878922462, 'bleu': 22.7943, 'gen_len': 9.3836}




 74%|███████▍  | 89/120 [1:00:49<20:51, 40.39s/it]

For epoch 95: {Learning rate: [0.0013337851897811531]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.52batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.010161467481449974, 'test_loss': 0.5882873453199864, 'bleu': 22.0486, 'gen_len': 9.5068}




 75%|███████▌  | 90/120 [1:01:30<20:19, 40.63s/it]

For epoch 96: {Learning rate: [0.0012896753347450352]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.47batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.61batches/s]



Metrics: {'train_loss': 0.01002924528438598, 'test_loss': 0.5898765407502651, 'bleu': 22.2012, 'gen_len': 9.226}




 76%|███████▌  | 91/120 [1:02:12<19:45, 40.87s/it]

For epoch 97: {Learning rate: [0.0012455654797089172]}


Train batch number 164: 100%|██████████| 164/164 [00:37<00:00,  4.39batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.46batches/s]



Metrics: {'train_loss': 0.009887668169273927, 'test_loss': 0.5920429788529873, 'bleu': 21.3139, 'gen_len': 9.4247}




 77%|███████▋  | 92/120 [1:02:54<19:18, 41.36s/it]

For epoch 98: {Learning rate: [0.001201455624672799]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.59batches/s]



Metrics: {'train_loss': 0.00959879833148656, 'test_loss': 0.5958058923482895, 'bleu': 22.3025, 'gen_len': 9.1712}




 78%|███████▊  | 93/120 [1:03:36<18:36, 41.36s/it]

For epoch 99: {Learning rate: [0.0011573457696366812]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.009725391377819838, 'test_loss': 0.5947215057909488, 'bleu': 19.9633, 'gen_len': 9.2603}




 78%|███████▊  | 94/120 [1:04:16<17:51, 41.21s/it]

For epoch 100: {Learning rate: [0.0011132359146005632]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.65batches/s]



Metrics: {'train_loss': 0.010001302940952705, 'test_loss': 0.5955226019024848, 'bleu': 21.7712, 'gen_len': 9.3014}




 79%|███████▉  | 95/120 [1:04:57<17:04, 40.97s/it]

For epoch 101: {Learning rate: [0.0010691260595644453]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.53batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.63batches/s]



Metrics: {'train_loss': 0.009486189698402928, 'test_loss': 0.6036926001310349, 'bleu': 22.5972, 'gen_len': 9.1986}




 80%|████████  | 96/120 [1:05:38<16:23, 40.96s/it]

For epoch 102: {Learning rate: [0.0010250162045283273]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.60batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.009440895138739995, 'test_loss': 0.5997613281011581, 'bleu': 22.3423, 'gen_len': 9.2808}




 81%|████████  | 97/120 [1:06:18<15:39, 40.83s/it]

For epoch 103: {Learning rate: [0.0009809063494922094]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.46batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.009417708673537141, 'test_loss': 0.5987559989094734, 'bleu': 22.6438, 'gen_len': 9.1233}




 82%|████████▏ | 98/120 [1:07:00<15:03, 41.06s/it]

For epoch 104: {Learning rate: [0.0009367964944560914]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.00933618698320238, 'test_loss': 0.599660475552082, 'bleu': 22.4704, 'gen_len': 9.1918}




 82%|████████▎ | 99/120 [1:07:40<14:17, 40.82s/it]

For epoch 105: {Learning rate: [0.0008926866394199733]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.64batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.72batches/s]



Metrics: {'train_loss': 0.009442916267733203, 'test_loss': 0.5985696293413639, 'bleu': 21.6206, 'gen_len': 9.0753}




 83%|████████▎ | 100/120 [1:08:20<13:32, 40.62s/it]

For epoch 106: {Learning rate: [0.0008485767843838553]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.34batches/s]



Metrics: {'train_loss': 0.009004225485912728, 'test_loss': 0.6020602740347385, 'bleu': 21.7585, 'gen_len': 9.2466}




 84%|████████▍ | 101/120 [1:09:01<12:53, 40.72s/it]

For epoch 107: {Learning rate: [0.0008044669293477373]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.009256876496847992, 'test_loss': 0.6033275052905083, 'bleu': 21.8461, 'gen_len': 9.2877}




 85%|████████▌ | 102/120 [1:09:42<12:11, 40.66s/it]

For epoch 108: {Learning rate: [0.0007603570743116193]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.70batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.009361003240195625, 'test_loss': 0.6036890566349029, 'bleu': 21.8169, 'gen_len': 9.363}




 86%|████████▌ | 103/120 [1:10:22<11:26, 40.37s/it]

For epoch 109: {Learning rate: [0.0007162472192755012]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.57batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.60batches/s]



Metrics: {'train_loss': 0.008908306499772773, 'test_loss': 0.6045443780720234, 'bleu': 20.7738, 'gen_len': 9.2808}




 87%|████████▋ | 104/120 [1:11:02<10:47, 40.47s/it]

For epoch 110: {Learning rate: [0.0006721373642393833]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.69batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.008882536650095816, 'test_loss': 0.6018308885395527, 'bleu': 20.5355, 'gen_len': 9.3219}




 88%|████████▊ | 105/120 [1:11:42<10:02, 40.18s/it]

For epoch 111: {Learning rate: [0.0006280275092032654]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.51batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.68batches/s]



Metrics: {'train_loss': 0.008667017764798024, 'test_loss': 0.6055180683732033, 'bleu': 21.6727, 'gen_len': 9.3014}




 88%|████████▊ | 106/120 [1:12:23<09:26, 40.47s/it]

For epoch 112: {Learning rate: [0.0005839176541671473]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.74batches/s]



Metrics: {'train_loss': 0.008753346832946124, 'test_loss': 0.6066048666834831, 'bleu': 21.932, 'gen_len': 9.1849}




 89%|████████▉ | 107/120 [1:13:03<08:44, 40.35s/it]

For epoch 113: {Learning rate: [0.0005398077991310294]}


Train batch number 164: 100%|██████████| 164/164 [00:36<00:00,  4.55batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.44batches/s]



Metrics: {'train_loss': 0.008365048850863808, 'test_loss': 0.6039181880652904, 'bleu': 22.1212, 'gen_len': 9.1849}




 90%|█████████ | 108/120 [1:13:44<08:06, 40.58s/it]

For epoch 114: {Learning rate: [0.0004956979440949113]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.75batches/s]



Metrics: {'train_loss': 0.008329180417803866, 'test_loss': 0.6050814360380172, 'bleu': 21.8542, 'gen_len': 9.2397}




 91%|█████████ | 109/120 [1:14:24<07:25, 40.50s/it]

For epoch 115: {Learning rate: [0.0004515880890587934]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.82batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.77batches/s]



Metrics: {'train_loss': 0.008466812705325826, 'test_loss': 0.60279146656394, 'bleu': 21.6681, 'gen_len': 9.3973}




 92%|█████████▏| 110/120 [1:15:03<06:39, 39.92s/it]

For epoch 116: {Learning rate: [0.0004074782340226754]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.71batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.008204694515060089, 'test_loss': 0.6044035725295543, 'bleu': 21.8162, 'gen_len': 9.3288}




 92%|█████████▎| 111/120 [1:15:42<05:58, 39.80s/it]

For epoch 117: {Learning rate: [0.0003633683789865574]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.66batches/s]



Metrics: {'train_loss': 0.008056425698181024, 'test_loss': 0.605882766097784, 'bleu': 21.7869, 'gen_len': 9.3425}




 93%|█████████▎| 112/120 [1:16:23<05:19, 39.94s/it]

For epoch 118: {Learning rate: [0.00031925852395043935]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.64batches/s]



Metrics: {'train_loss': 0.008053348723390117, 'test_loss': 0.6048972301185132, 'bleu': 21.9661, 'gen_len': 9.2671}




 94%|█████████▍| 113/120 [1:17:03<04:40, 40.05s/it]

For epoch 119: {Learning rate: [0.00027514866891432136]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.65batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.71batches/s]



Metrics: {'train_loss': 0.008130701285752854, 'test_loss': 0.6063158608973026, 'bleu': 21.9597, 'gen_len': 9.2603}




 95%|█████████▌| 114/120 [1:17:43<03:59, 39.99s/it]

For epoch 120: {Learning rate: [0.0002310388138782034]}


Train batch number 164: 100%|██████████| 164/164 [00:33<00:00,  4.84batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.82batches/s]



Metrics: {'train_loss': 0.007949110595424228, 'test_loss': 0.6046860657632351, 'bleu': 21.8411, 'gen_len': 9.3082}




 96%|█████████▌| 115/120 [1:18:21<03:17, 39.50s/it]

For epoch 121: {Learning rate: [0.0001869289588420854]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.63batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:04<00:00,  2.25batches/s]



Metrics: {'train_loss': 0.007751608539298858, 'test_loss': 0.6056878186762333, 'bleu': 21.9363, 'gen_len': 9.2945}




 97%|█████████▋| 116/120 [1:19:02<02:39, 39.94s/it]

For epoch 122: {Learning rate: [0.0001428191038059674]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.66batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.67batches/s]



Metrics: {'train_loss': 0.007866974288268333, 'test_loss': 0.6064018420875072, 'bleu': 21.9455, 'gen_len': 9.274}




 98%|█████████▊| 117/120 [1:19:42<01:59, 39.95s/it]

For epoch 123: {Learning rate: [9.870924876984941e-05]}


Train batch number 164: 100%|██████████| 164/164 [00:34<00:00,  4.72batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.87batches/s]



Metrics: {'train_loss': 0.007681851462665491, 'test_loss': 0.6061825342476368, 'bleu': 21.909, 'gen_len': 9.3151}




 98%|█████████▊| 118/120 [1:20:21<01:19, 39.73s/it]

For epoch 124: {Learning rate: [5.459939373373142e-05]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.61batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.69batches/s]



Metrics: {'train_loss': 0.007834111607247373, 'test_loss': 0.6062217697501182, 'bleu': 21.8766, 'gen_len': 9.3082}




 99%|█████████▉| 119/120 [1:21:02<00:39, 39.90s/it]

For epoch 125: {Learning rate: [1.0489538697613426e-05]}


Train batch number 164: 100%|██████████| 164/164 [00:35<00:00,  4.62batches/s]
Test batch number 10: 100%|██████████| 10/10 [00:03<00:00,  2.62batches/s]



Metrics: {'train_loss': 0.007942471042752447, 'test_loss': 0.6062527023255825, 'bleu': 21.9039, 'gen_len': 9.2877}




100%|██████████| 120/120 [1:21:42<00:00, 40.85s/it]
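
The per-epoch `Metrics:` lines printed above are easier to compare once they are parsed. The following is a minimal sketch, not part of the original training code: `parse_metrics` is a hypothetical helper, and it assumes the log keeps the exact `For epoch N:` / `Metrics: {...}` layout shown above. The three sample entries are copied verbatim from that log; pasting the full output into `LOG` would rank every epoch.

```python
import ast
import re

# Sample lines copied verbatim from the training log above (epochs 91, 93 and 125).
LOG = """
For epoch 91: {Learning rate: [0.0015102246099256253]}
Metrics: {'train_loss': 0.010467291430558828, 'test_loss': 0.5945031598210335, 'bleu': 22.8314, 'gen_len': 9.1233}
For epoch 93: {Learning rate: [0.0014220048998533892]}
Metrics: {'train_loss': 0.010352298853033019, 'test_loss': 0.5878886595368386, 'bleu': 22.9625, 'gen_len': 9.3562}
For epoch 125: {Learning rate: [1.0489538697613426e-05]}
Metrics: {'train_loss': 0.007942471042752447, 'test_loss': 0.6062527023255825, 'bleu': 21.9039, 'gen_len': 9.2877}
"""


def parse_metrics(log_text):
    """Hypothetical helper: return (epoch, metrics_dict) pairs found in the log text."""
    records = []
    epoch = None
    for line in log_text.splitlines():
        line = line.strip()
        header = re.match(r"For epoch (\d+):", line)
        if header:
            # Remember the epoch announced by the most recent header line.
            epoch = int(header.group(1))
        elif line.startswith("Metrics:"):
            # The metrics are printed as a Python dict literal, so literal_eval can read them.
            metrics = ast.literal_eval(line[len("Metrics:"):].strip())
            records.append((epoch, metrics))
    return records


records = parse_metrics(LOG)

# Pick the entry with the highest validation BLEU among the parsed lines.
best_epoch, best = max(records, key=lambda record: record[1]["bleu"])
print(f"best BLEU among parsed epochs: {best['bleu']} (epoch {best_epoch})")
```

Ranking by `bleu` rather than `test_loss` is a choice made for this sketch: in the log above the validation loss stays close to 0.6 while BLEU still fluctuates by several points from one epoch to the next.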


### Predictions and Evaluation

In [15]:
# let us load the test set (Wolof as corpus_1, French as corpus_2)
test_dataset = SentenceDataset("data/extractions/new_data/test_set.csv",
                               corpus_1='wolof',
                               corpus_2='french',
                               tokenizer=tokenizer,
                               truncation=True)

Let us evaluate the model on the test set and display the predicted sentences.

In [16]:
# evaluation on the test set; index 1 of the returned value holds the predictions DataFrame
df_ft_to_wf = trainer.evaluate(test_dataset)

Evaluation batch number 11: 100%|██████████| 11/11 [00:04<00:00,  2.68batches/s]


In [17]:
# let us display the last 10 rows of the predictions table
df_ft_to_wf[1].tail(10)

Unnamed: 0,original_sentences,translations,predictions
152,"Yaa ŋgi, dem ŋga","Te voila, tu as été","Toi, tu as été"
153,Bëgg na ŋga dem,Il veut que tu viennes,Il veut que tu as été
154,Liggéeykat yi man ag yaw la.,Les travailleurs c'est toi et moi.,Il a vu les dames.
155,Foofee fan?,Où?,Là-bas où?
156,"Yaa ŋgi, mi ŋgi","Te voilà, le voilà","Toi, tu n'as pas été"
157,Gis ŋga kooku?,Tu as vu celui-ci?,Tu as vu celui-là?
158,Dem naa ba ci moom.,J'ai été jusqu'à lui.,J'ai été jusqu'à Saint-Louis.
159,Yéen ñan la wax?,Il parle de vous?,Il parle desquelles de vous?
160,Moo doon ganam.,C'était son hôte habituellement.,C'est le Laobe.
161,Nil waa ji na ñëw,Dis à la personne qu'elle vienne,L'homme est venu
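
Reading the predictions row by row shows that some outputs reproduce the reference translation exactly while others diverge. A rough way to quantify this is an exact-match rate. The sketch below is an illustration only: `exact_match_rate` is a hypothetical helper that is not part of `wolof_translate`, and the four (reference, prediction) pairs are copied from rows 157, 158, 141 and 4 of the tables displayed in this section.

```python
def exact_match_rate(references, predictions):
    """Hypothetical helper: share of predictions identical to their reference.

    Surrounding whitespace is ignored; no other normalization is applied.
    """
    pairs = list(zip(references, predictions))
    if not pairs:
        return 0.0
    hits = sum(ref.strip() == pred.strip() for ref, pred in pairs)
    return hits / len(pairs)


# (reference translation, model prediction) pairs copied from the displayed tables.
rows = [
    ("Tu as vu celui-ci?", "Tu as vu celui-là?"),
    ("J'ai été jusqu'à lui.", "J'ai été jusqu'à Saint-Louis."),
    ("J'ai vu un mouton.", "J'ai vu un mouton."),
    ("J'ai vu cet enfant-là?", "J'ai vu cet enfant-là?"),
]

references = [ref for ref, _ in rows]
predictions = [pred for _, pred in rows]

rate = exact_match_rate(references, predictions)
print(f"exact-match rate on the sample rows: {rate:.2f}")
```

On the full test set the same function could be applied to the `translations` and `predictions` columns of `df_ft_to_wf[1]`. Exact match is much stricter than BLEU, so it is best read as a complement to the BLEU scores logged during training rather than a replacement.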


In [18]:
# let us display 100 samples
pd.options.display.max_rows = 100
df_ft_to_wf[1].sample(100)

Unnamed: 0,original_sentences,translations,predictions
135,Góor gi demul,L'homme ne part pas,L'homme n'a pas voulu
106,Seetal ma ñenn ñuu!,Surveille-moi les-uns que voilà!,Surveille-moi ceux-là!
5,Naka ŋgeen bëggé góor gi dimëlé leen?,Comment voulez-vous que l'homme vous aide?,Vous êtes des enfants seulement?
96,Di tel-teli doŋŋ taxul sotal dara.,S'agiter simplement ne suffit à rien résoudre.,Sois quelqu'un de studieux.
145,Kooku dem ku më bëgg la!,"Celui qui est parti, c'est quelqu'un que j'app...","C'est quelqu'un que j'apprécie, celui qui est ..."
4,Gis naa xale booba?,J'ai vu cet enfant-là?,J'ai vu cet enfant-là?
141,Gis naa am xar.,J'ai vu un mouton.,J'ai vu un mouton.
66,Yaa daan ganu Mustaf,Tu étais d'habitude l'hôte de Mustapha,C'est toi qui eusses été élu
51,Xam naa xale bi.,Je connais l'enfant.,Je vois les gens.
13,Ma japp nag yee yan?,Que j'attrape quelles vaches?,Quelles personnes se sont égarées?
