<a href="https://colab.research.google.com/github/davifcs/ia376/blob/main/Aula5_Basico.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this Colab we will fine-tune a T5 model to translate from English to Portuguese. We will train it on the ParaCrawl dataset.

In [None]:
# General settings
model_name = "t5-small"
batch_size = 32
accumulate_grad_batches = 2
source_max_length = 128
target_max_length = 128
learning_rate = 5e-3
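
With gradient accumulation, the optimizer takes a step only every `accumulate_grad_batches` mini-batches, so the effective batch size is larger than `batch_size`. A minimal sketch of that arithmetic, assuming the 100k-pair training split created later in this notebook (`num_train_pairs` is a name introduced here for illustration), and assuming a trailing partial group of batches still triggers an update:

```python
import math

batch_size = 32
accumulate_grad_batches = 2
num_train_pairs = 100_000  # assumption: size of the training split used later

# Examples contributing to each optimizer update.
effective_batch_size = batch_size * accumulate_grad_batches

# Mini-batches per epoch, and optimizer updates per epoch
# (assumption: the last partial group is rounded up to one update).
batches_per_epoch = math.ceil(num_train_pairs / batch_size)
updates_per_epoch = math.ceil(batches_per_epoch / accumulate_grad_batches)

print(effective_batch_size, batches_per_epoch, updates_per_epoch)
```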

In [None]:
! pip install sacrebleu
! pip install pytorch-lightning
! pip install transformers

Collecting sacrebleu
Collecting portalocker
Installing collected packages: portalocker, sacrebleu
Successfully installed portalocker-2.0.0 sacrebleu-1.4.14
Collecting pytorch-lightning

In [None]:
# Import all packages at once to avoid duplicate imports throughout the notebook.
import gzip
import nvidia_smi
import os
import pytorch_lightning as pl
import random
import sacrebleu
import torch
import torch.nn.functional as F

from google.colab import drive

from pytorch_lightning.callbacks import ModelCheckpoint

from transformers import T5ForConditionalGeneration
from transformers import T5Tokenizer
from torch.utils.data import DataLoader
from torch.utils.data import Dataset

from typing import Dict
from typing import List
from typing import Tuple

In [None]:
# Important: Fix seeds so we can replicate results
seed = 123
random.seed(seed)
# np.random.seed(seed)
torch.random.manual_seed(seed)
torch.cuda.manual_seed(seed)
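
To see why fixing the seed matters for the rest of the notebook, here is a minimal sketch using only Python's `random` module: seeding the generator before a shuffle reproduces the same order, so any train/val split derived from that order is reproducible as well. The helper `shuffled_copy` is hypothetical and not part of the notebook.

```python
import random

def shuffled_copy(items, seed):
    """Return a shuffled copy of `items` using a dedicated, seeded generator."""
    rng = random.Random(seed)
    out = list(items)
    rng.shuffle(out)
    return out

data = list(range(10))
first = shuffled_copy(data, seed=123)
second = shuffled_copy(data, seed=123)

print(first == second)        # same seed, same order
print(sorted(first) == data)  # shuffling only permutes the items
```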

TIP for real models: a well-optimized model should keep GPU utilization close to 100% during training.
We will use the library below to monitor this. Note that with the simple model used here, utilization will not reach 100%.

In [None]:
print(f"Pytorch Lightning Version: {pl.__version__}")
nvidia_smi.nvmlInit()
handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
print(f"Device name: {nvidia_smi.nvmlDeviceGetName(handle)}")

def gpu_usage():
    global handle
    return str(nvidia_smi.nvmlDeviceGetUtilizationRates(handle).gpu) + '%'

Pytorch Lightning Version: 1.0.3
Device name: b'Tesla K80'


We will save the checkpoints (model weights) to Google Drive so that we can resume training from where we left off.

In [None]:
drive.mount('/content/drive')

Mounted at /content/drive


## Preparing the Data

First, we download the dataset:

In [None]:
! wget -nc https://storage.googleapis.com/neuralresearcher_data/unicamp/ia376e_2020s1/paracrawl_enpt_train.tsv.gz
! wget -nc https://storage.googleapis.com/neuralresearcher_data/unicamp/ia376e_2020s1/paracrawl_enpt_test.tsv.gz

--2020-10-21 12:37:49--  https://storage.googleapis.com/neuralresearcher_data/unicamp/ia376e_2020s1/paracrawl_enpt_train.tsv.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.203.128, 74.125.204.128, 64.233.189.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.203.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 106548256 (102M) [text/tab-separated-values]
Saving to: ‘paracrawl_enpt_train.tsv.gz’


2020-10-21 12:37:50 (80.7 MB/s) - ‘paracrawl_enpt_train.tsv.gz’ saved [106548256/106548256]

--2020-10-21 12:37:50--  https://storage.googleapis.com/neuralresearcher_data/unicamp/ia376e_2020s1/paracrawl_enpt_test.tsv.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 64.233.189.128, 108.177.97.128, 74.125.203.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|64.233.189.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2139168 (2.0M) [text/tab-separated-v

## Loading the dataset

We will artificially create a training split (100k pairs) and a validation split (5k pairs).

Note: avoid looking at the test dataset as much as possible, so that you do not become biased toward what will be tested. In real applications, the test dataset only becomes available in the future, that is, when users start testing your product.

In [None]:
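def load_text_pairs(path):
    text_pairs = []
    # Open in text mode with an explicit encoding and close the file when done.
    with gzip.open(path, mode='rt', encoding='utf-8') as f:
        for line in f:
            text_pairs.append(line.strip().split('\t'))
    return text_pairs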

x_train = load_text_pairs('paracrawl_enpt_train.tsv.gz')
x_test = load_text_pairs('paracrawl_enpt_test.tsv.gz')

# Shuffle the training data before splitting it into train/val.
random.shuffle(x_train)

# Truncate the dataset to 100k training pairs and 5k validation pairs.
x_val = x_train[100000:105000]
x_train = x_train[:100000]

for set_name, x in [('train', x_train), ('validation', x_val), ('test', x_test)]:
    print(f'\n{len(x)} {set_name} samples')
    print(f'First 3 {set_name} samples:')
    for i, (source, target) in enumerate(x[:3]):
        print(f'{i}: source: {source}\n   target: {target}')


100000 train samples
First 3 train samples:
0: source: More Croatian words and phrases
   target: Mais palavras e frases em croata
1: source: Jerseys and pullovers, containing at least 50Â % by weight of wool and weighing 600Â g or more per article 6110 11 10 (PCE)
   target: Camisolas e pulôveres, com pelo menos 50 %, em peso, de lã e pesando 600g ou mais por unidade 6110 11 10 (PCE)
2: source: Atex Colombia SAS makes available its lead product, 100% natural liquid latex, excellent quality and price. ... Welding manizales caldas Colombia a DuckDuckGo
   target: Atex Colômbia SAS torna principal produto está disponível, látex líquido 100% natural, excelente qualidade e preço. ...

5000 validation samples
First 3 validation samples:
0: source: «You have hidden these things from the wise and the learned you have revealed them to the childlike»
   target: «Escondeste estas coisas aos sábios e entendidos e as revelaste aos pequenos»
1: source: Repair of computers, applic
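
Web-crawled TSV files sometimes contain lines with a missing or extra tab, which would break the `(source, target)` unpacking used above. A hedged sketch of a stricter loader follows; `load_text_pairs_strict` and the tiny demo file are illustrative assumptions, not part of the original notebook.

```python
import gzip
import os
import tempfile
from typing import List, Tuple

def load_text_pairs_strict(path: str) -> List[Tuple[str, str]]:
    """Read a gzipped TSV and keep only lines with exactly two non-empty fields."""
    pairs = []
    with gzip.open(path, mode='rt', encoding='utf-8') as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            if len(fields) == 2 and all(field.strip() for field in fields):
                pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs

# Tiny self-contained demo file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.tsv.gz')
    with gzip.open(demo_path, mode='wt', encoding='utf-8') as f:
        f.write('we like pizza\tnós gostamos de pizza\n')
        f.write('line without a tab\n')
        f.write('too\tmany\tfields\n')
    demo_pairs = load_text_pairs_strict(demo_path)

print(demo_pairs)
```

Only the first demo line survives the filter; the other two are dropped instead of raising an error later during unpacking.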

## Creating the Dataset


In [None]:
tokenizer = T5Tokenizer.from_pretrained(model_name)
 
class MyDataset(Dataset):
    def __init__(self, text_pairs: List[Tuple[str, str]], tokenizer,
                 source_max_length: int = 32, target_max_length: int = 32):
        self.tokenizer = tokenizer
        self.text_pairs = text_pairs
        self.source_max_length = source_max_length
        self.target_max_length = target_max_length
 
    def __len__(self):
        return len(self.text_pairs)
 
    def __getitem__(self, idx):
        source, target = self.text_pairs[idx]
 
        original_source = source
        original_target = target
        
        source_ = "translate English to Portuguese %s" % (source)
 
        tokenized_source = self.tokenizer(source_,
                                          max_length=self.source_max_length,
                                          padding='max_length',
                                          truncation=True,
                                          return_tensors="pt")
        
        source_token_ids = tokenized_source['input_ids'].squeeze(0)
        source_mask = tokenized_source['attention_mask'].squeeze(0)
        
        tokenized_target = self.tokenizer(target, 
                                          max_length=self.target_max_length, 
                                          padding='max_length', 
                                          truncation=True,
                                          return_tensors="pt")  
        
        target_token_ids = tokenized_target['input_ids'].squeeze(0)
        target_mask = tokenized_target['attention_mask'].squeeze(0)
                
        return (source_token_ids, source_mask, target_token_ids, target_mask, original_source, original_target)
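
The tokenizer calls above pad or truncate every sequence to a fixed length and return an attention mask that marks the real tokens. The pure-Python sketch below mimics that behaviour on a list of token ids. `pad_or_truncate` is a hypothetical helper for illustration, not the tokenizer's implementation (for instance, it does not preserve the end-of-sequence token when truncating). The ids are the ones shown in the debug output of the next section, and `0` is assumed as the padding id, as that output suggests.

```python
from typing import List, Tuple

def pad_or_truncate(token_ids: List[int], max_length: int,
                    pad_token_id: int = 0) -> Tuple[List[int], List[int]]:
    """Return (ids, mask), both of length `max_length`; mask is 1 on real tokens."""
    ids = token_ids[:max_length]
    mask = [1] * len(ids)
    n_pad = max_length - len(ids)
    return ids + [pad_token_id] * n_pad, mask + [0] * n_pad

ids, mask = pad_or_truncate([13959, 1566, 12, 21076, 62, 114, 6871, 1], max_length=12)
print(ids)
print(mask)
```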

## Testing the DataLoader

In [None]:
text_pairs = [('we like pizza', 'nós gostamos de pizza')]
dataset_debug = MyDataset(
    text_pairs=text_pairs,
    tokenizer=tokenizer,
    source_max_length=source_max_length,
    target_max_length=target_max_length)

dataloader_debug = DataLoader(dataset_debug, batch_size=10, shuffle=True, 
                              num_workers=0)

source_token_ids, source_mask, target_token_ids, target_mask, _, _= next(iter(dataloader_debug))
print('source_token_ids:\n', source_token_ids)
print('source_mask:\n', source_mask)
print('target_token_ids:\n', target_token_ids)
print('target_mask:\n', target_mask)

print('source_token_ids.shape:', source_token_ids.shape)
print('source_mask.shape:', source_mask.shape)
print('target_token_ids.shape:', target_token_ids.shape)
print('target_mask.shape:', target_mask.shape)

source_token_ids:
 tensor([[13959,  1566,    12, 21076,    62,   114,  6871,     1,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,    

## Creating the Train/Val/Test DataLoaders

In [None]:
dataset_train = MyDataset(text_pairs=x_train,
                          tokenizer=tokenizer,
                          source_max_length=source_max_length,
                          target_max_length=target_max_length)
 
dataset_val = MyDataset(text_pairs=x_val,
                        tokenizer=tokenizer,
                        source_max_length=source_max_length,
                        target_max_length=target_max_length)

dataset_test = MyDataset(text_pairs=x_test,
                         tokenizer=tokenizer,
                         source_max_length=source_max_length,
                         target_max_length=target_max_length)
 
train_dataloader = DataLoader(dataset_train, batch_size=batch_size,
                              shuffle=True, num_workers=4)
 
val_dataloader = DataLoader(dataset_val, batch_size=batch_size, 
                            shuffle=False, num_workers=4)
 
test_dataloader = DataLoader(dataset_test, batch_size=batch_size,
                             shuffle=False, num_workers=4)

## Building the T5 model with PyTorch Lightning

In [None]:
class T5Finetuner(pl.LightningModule):

    def __init__(self, tokenizer, train_dataloader, val_dataloader,
                 test_dataloader, learning_rate, target_max_length=32):
        super(T5Finetuner, self).__init__()
        
        self._train_dataloader = train_dataloader
        self._val_dataloader = val_dataloader
        self._test_dataloader = test_dataloader

        self.model = T5ForConditionalGeneration.from_pretrained(model_name)
        
        self.tokenizer = tokenizer
        self.learning_rate = learning_rate
        self.target_max_length = target_max_length

        self.count = 0 

    def forward(self, source_token_ids, source_mask, target_token_ids=None,
                target_mask=None):

        if self.training:
            target_token_ids[target_token_ids[:, :] == self.tokenizer.pad_token_id] = -100

            outputs = self.model(input_ids=source_token_ids, 
                              attention_mask=source_mask,
                              labels=target_token_ids,
                              decoder_attention_mask=target_mask)
            return outputs[0]
        else:
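            # Pass the attention mask so padding tokens are ignored during generation.
            predicted_token_ids = self.model.generate(
                input_ids=source_token_ids,
                attention_mask=source_mask,
                max_length=self.target_max_length)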
            return predicted_token_ids

    def training_step(self, batch, batch_nb):
        # batch
        source_token_ids, source_mask, target_token_ids, target_mask, _,_ = batch

        # fwd
        loss = self(
            source_token_ids, source_mask, target_token_ids, target_mask)

        # logs
        tensorboard_logs = {'train_loss': loss}
        progress_bar = {'gpu_usage': gpu_usage()}
        return {'loss': loss, 'log': tensorboard_logs,
                'progress_bar': progress_bar}

    def validation_step(self, batch, batch_nb):
        source_token_ids, source_mask, target_token_ids, target_mask, original_source, original_target = batch
        predicted_token_ids = self(source_token_ids, source_mask)

        decoded_pred = [self.tokenizer.decode(token, skip_special_tokens=True) for token in predicted_token_ids]

        avg_bleu = sacrebleu.corpus_bleu(decoded_pred, [original_target]).score
        
        progress_bar = {'gpu_usage': gpu_usage()}
        return {'val_bleu': avg_bleu, 'progress_bar': progress_bar}

    def test_step(self, batch, batch_nb):
        source_token_ids, source_mask, target_token_ids, target_mask, original_source, original_target = batch
        predicted_token_ids = self(source_token_ids, source_mask)

        decoded_pred = [self.tokenizer.decode(token, skip_special_tokens=True) for token in predicted_token_ids]

        avg_bleu = sacrebleu.corpus_bleu(decoded_pred, [original_target]).score

        # Print one example every 100 batches; the counter advances on every batch.
        if self.count % 100 == 0:
            print(f' source: {original_source}\n target: {original_target}\n predicted: {decoded_pred[0:1]}')
        self.count += 1

        progress_bar = {'gpu_usage': gpu_usage()}
        return {'test_bleu': avg_bleu, 'progress_bar': progress_bar}

    def validation_epoch_end(self, outputs):
        avg_bleu = sum([x['val_bleu'] for x in outputs]) / len(outputs)

        tensorboard_logs = {'avg_val_bleu': avg_bleu}
        
        return {'avg_val_bleu': avg_bleu, 'progress_bar': tensorboard_logs}

    def test_epoch_end(self, outputs):
        avg_bleu = sum([x['test_bleu'] for x in outputs]) / len(outputs)

        tensorboard_logs = {'avg_test_bleu': avg_bleu}
        
        return {'avg_test_bleu': avg_bleu, 'progress_bar': tensorboard_logs}
    
    def configure_optimizers(self):
        return torch.optim.Adam(
            [p for p in self.parameters() if p.requires_grad],
            lr=self.learning_rate, eps=1e-08)

    def train_dataloader(self):
        return self._train_dataloader

    def val_dataloader(self):
        return self._val_dataloader

    def test_dataloader(self):
        return self._test_dataloader
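
In `forward`, padded target positions are replaced with `-100` so that they do not contribute to the loss. The sketch below illustrates the idea in plain Python: per-token losses are averaged only over positions whose label is not `-100`. The function name and the numbers are made up for illustration; the actual loss is computed inside `T5ForConditionalGeneration`.

```python
from typing import List

IGNORE_INDEX = -100  # label value skipped by the loss

def masked_mean_loss(token_losses: List[float], labels: List[int]) -> float:
    """Average per-token losses over positions whose label is not IGNORE_INDEX."""
    kept = [loss for loss, label in zip(token_losses, labels) if label != IGNORE_INDEX]
    if not kept:
        raise ValueError('all positions are masked')
    return sum(kept) / len(kept)

# Three real target tokens followed by two padded positions (hypothetical values).
labels = [2826, 7, 1, IGNORE_INDEX, IGNORE_INDEX]
token_losses = [0.5, 1.5, 1.0, 9.0, 9.0]

print(masked_mean_loss(token_losses, labels))
```

Without the masking, the two padded positions would dominate the average even though they carry no translation signal.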

In [None]:
model = T5Finetuner(tokenizer=tokenizer,
                    train_dataloader=train_dataloader,
                    val_dataloader=val_dataloader,
                    test_dataloader=test_dataloader,
                    learning_rate=learning_rate)

## Number of model parameters

In [None]:
sum([torch.tensor(x.size()).prod() for x in model.parameters() if x.requires_grad]) # trainable parameters

tensor(60506880)
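
The count is simply the sum, over all trainable tensors, of the product of each tensor's dimensions. A small sketch with hypothetical shapes (not T5's real layers):

```python
import math

# Hypothetical parameter shapes: an embedding matrix, a weight matrix, and a bias.
shapes = [(1000, 64), (64, 64), (64,)]
num_params = sum(math.prod(shape) for shape in shapes)
print(num_params)
```

On real PyTorch parameters, `p.numel()` gives the same per-tensor product directly.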

## Quickly testing the model on train, validation, and test with a single batch

In [None]:
trainer = pl.Trainer(gpus=1, 
                     checkpoint_callback=False,  # Disable checkpoint saving.
                     fast_dev_run=True)
trainer.fit(model)
trainer.test(model)

del model  # Free the model to avoid running out of GPU memory

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Running in fast_dev_run mode: will run a full train, val and test loop using a single batch

  | Name  | Type                       | Params
-----------------------------------------------------
0 | model | T5ForConditionalGeneration | 60 M  


Please use self.log(...) inside the lightningModule instead.

# log on a step or aggregate epoch metric to the logger and/or progress bar
# (inside LightningModule)
self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)

 source: ('In this way, the civil life of a nation matures, making it possible for all citizens to enjoy the fruits of genuine tolerance and mutual respect.', '1999 XIII. Winnipeg, Canada July 23 to August 8', "In the mystery of Christmas, Christ's light shines on the earth, spreading, as it were, in concentric circles.", 'making it viable to drill two new boreholes in the west of that peninsula.', 'His eyes were shining and his voice was cheerful.', 'Injuries, accidents, bereavement, abuse, separation, shock, rape, bullying, harassment, stress, depression, anxiety, eating.', 'Whiteness HP Maxx is a 35% hydrogen peroxide whitening gel for the whitening of vital and non-vital teeth.', 'Lines: with indication of Line Number, From and To ends, insulation, the P&ID where they are drawn.', 'The cruises depart from Manaus, capital of the State of Amazonas, a city in the jungle that prospered during the rubber boom last century and where you will find a smaller copy of the Opera House in Pari



## Overfitting on a few samples

Before training the model on the full dataset, we will overfit it on a few training samples to check that the loss goes close to 0. This helps verify that the model implementation is correct.

We can also measure whether the accuracy on this minibatch gets close to 100%. This helps verify that the function that computes the metric is correct.

Note: if we train for many epochs (e.g., 500), the loss may go to zero even with bugs in the implementation. Ideally, the loss should get close to zero in fewer than 100 epochs.
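
As a rough sketch of how much data this sanity check touches: the cell below passes `overfit_batches=0.005`, which Lightning interprets as a fraction of the training batches. Assuming the fraction is truncated to a whole number of batches (an assumption about the rounding, not something confirmed here), the arithmetic is:

```python
num_train_pairs = 100_000   # training pairs in this notebook
batch_size = 32
overfit_fraction = 0.005    # value passed as overfit_batches

batches_total = -(-num_train_pairs // batch_size)      # ceiling division
batches_used = int(batches_total * overfit_fraction)   # assumption: truncation
samples_used = batches_used * batch_size

print(batches_total, batches_used, samples_used)
```

Under these assumptions the model is asked to memorize a few hundred sentence pairs, which should be easy for a correct implementation.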

In [None]:
trainer = pl.Trainer(gpus=1,
                     max_epochs=30,
                     check_val_every_n_epoch=10,
                     checkpoint_callback=False,  # Disable checkpoint saving
                     overfit_batches=0.005)

# Debug dataset built from the training pairs; overfit_batches above restricts
# training to a small fraction (0.5%) of its batches.
dataset_debug = MyDataset(text_pairs=x_train,
                          tokenizer=tokenizer,
                          source_max_length=source_max_length,
                          target_max_length=target_max_length)

debug_dataloader = DataLoader(dataset_debug, batch_size=batch_size,
                              shuffle=False, num_workers=4)

model = T5Finetuner(tokenizer=tokenizer,
                    train_dataloader=debug_dataloader,
                    val_dataloader=debug_dataloader,
                    test_dataloader=None,
                    learning_rate=learning_rate)

trainer.fit(model)
del model  # Free the model to avoid running out of GPU memory

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type                       | Params
-----------------------------------------------------
0 | model | T5ForConditionalGeneration | 60 M  



## Training and validation on the full dataset

In [None]:
max_epochs = 3
 
checkpoint_path = '/content/drive/My Drive/Colab Notebooks/epoch=10.ckpt'
checkpoint_dir = os.path.dirname(os.path.abspath(checkpoint_path))
print(f'Files in {checkpoint_dir}: {os.listdir(checkpoint_dir)}')
print(f'Saving checkpoints to {checkpoint_dir}')
checkpoint_callback = ModelCheckpoint(filepath=checkpoint_dir,
                                      save_top_k=-1)  # Keeps all checkpoints.
 
resume_from_checkpoint = None
if os.path.exists(checkpoint_path):
    print(f'Restoring checkpoint: {checkpoint_path}')
    resume_from_checkpoint = checkpoint_path
 
trainer = pl.Trainer(gpus=1,
                     max_epochs=max_epochs,
                     check_val_every_n_epoch=1,
                     profiler=True,
                     accumulate_grad_batches=accumulate_grad_batches,
                     checkpoint_callback=checkpoint_callback,
                     progress_bar_refresh_rate=50,
                     resume_from_checkpoint=resume_from_checkpoint)
 
model = T5Finetuner(tokenizer=tokenizer,
                    train_dataloader=train_dataloader,
                    val_dataloader=val_dataloader,
                    test_dataloader=test_dataloader,
                    learning_rate=learning_rate,
                    target_max_length=target_max_length)

trainer.fit(model)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Files in /content/drive/My Drive/Colab Notebooks: ['Cópia de Copy of Explorando-Convolucao-no-PyTorch (1).ipynb', 'Cópia de Copy of Explorando-Convolucao-no-PyTorch.ipynb', 'Cópia de Copy of cifar10-CNN-features (1).ipynb', 'Cópia de Copy of Introducao-CNN-PyTorch.ipynb', 'Cópia de Copy of cifar10-CNN-features.ipynb', 'Cópia de Aula 1 - Classificação de Imagens.ipynb', 'Untitled', 'Davi Santos - Atividade Aula2 - Notebook 2.ipynb', 'Davi Santos - Atividade Aula2 - Notebook 1.ipynb', 'Cópia de Aula3 - Básico - Auto-Atenção - Template.ipynb', '.ipynb_checkpoints', 'Aula3-Basico-Auto-Atencao.pdf', 'Aula3-Básico-Auto-Atencao.ipynb', 'Cópia de Aula3-Básico-Auto-Atencao.ipynb', 'Aula 4 - Basico - Auto Atencao Completa.pdf', 'Aula 4 - Basico - Auto Atencao Completa.ipynb', 'epoch=0.ckpt', 'epoch=0-v0.ckpt', 'epoch=0-v1.ckpt', 'epoch=0-v2.ckpt', 'epoch=0-v3.ckpt', 'epoch=0-v4.ckpt', 'epoch=0-v5.ckpt', 'epoch=1.ckpt', 'epoch=2.ckpt', 'Cópia de Professor_Aula5.ipynb']
Saving chec


  | Name  | Type                       | Params
-----------------------------------------------------
0 | model | T5ForConditionalGeneration | 60 M  



Profiler Report

Action              	|  Mean duration (s)	|  Total time (s) 
-----------------------------------------------------------------
on_fit_start        	|  3.0853e-05     	|  3.0853e-05     
on_validation_start 	|  0.022217       	|  0.088867       
on_validation_epoch_start	|  3.1684e-05     	|  0.00012674     
on_validation_batch_start	|  2.1289e-05     	|  0.01007        
validation_step_end 	|  3.0424e-05     	|  0.01439        
on_validation_batch_end	|  0.00010651     	|  0.050378       
on_validation_epoch_end	|  2.7344e-05     	|  0.00010938     
on_validation_end   	|  3.2472         	|  12.989         
on_train_start      	|  0.048722       	|  0.048722       
on_epoch_start      	|  0.0030595      	|  0.0091784      
on_train_epoch_start	|  2.9039e-05     	|  8.7117e-05     
get_train_batch     	|  0.0035247      	|  33.044         
on_batch_start      	|  3.2088e-05     	|  0.30083        
on_train_batch_start	|  1.4064e-05     	|  0.13185        
training_ste




1
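
The training cell looks for one hard-coded file (`epoch=10.ckpt`) when deciding whether to resume. An alternative, sketched here as an assumption rather than the notebook's method, is to pick the checkpoint with the highest epoch number among files named like those in the directory listing above. `latest_checkpoint` is a hypothetical helper, and the `epoch=N.ckpt` / `epoch=N-vK.ckpt` naming pattern is inferred only from that listing.

```python
import re
from typing import List, Optional

_CKPT_RE = re.compile(r'^epoch=(\d+)(?:-v(\d+))?\.ckpt$')

def latest_checkpoint(filenames: List[str]) -> Optional[str]:
    """Return the name with the highest (epoch, version), or None if no match."""
    best_key, best_name = None, None
    for name in filenames:
        match = _CKPT_RE.match(name)
        if match is None:
            continue
        # Files without a "-vK" suffix sort before any versioned file of the same epoch.
        key = (int(match.group(1)), int(match.group(2) or -1))
        if best_key is None or key > best_key:
            best_key, best_name = key, name
    return best_name

files = ['epoch=0.ckpt', 'epoch=0-v5.ckpt', 'epoch=1.ckpt', 'epoch=2.ckpt', 'Untitled']
print(latest_checkpoint(files))
```

The returned name could then be joined with `checkpoint_dir` and passed as `resume_from_checkpoint`.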

## After training, we evaluate the model on the test dataset

It is important to run this evaluation only a few times, to avoid "manually overfitting" to the test dataset.

In [None]:
trainer.test(model)

 source: ('In this way, the civil life of a nation matures, making it possible for all citizens to enjoy the fruits of genuine tolerance and mutual respect.', '1999 XIII. Winnipeg, Canada July 23 to August 8', "In the mystery of Christmas, Christ's light shines on the earth, spreading, as it were, in concentric circles.", 'making it viable to drill two new boreholes in the west of that peninsula.', 'His eyes were shining and his voice was cheerful.', 'Injuries, accidents, bereavement, abuse, separation, shock, rape, bullying, harassment, stress, depression, anxiety, eating.', 'Whiteness HP Maxx is a 35% hydrogen peroxide whitening gel for the whitening of vital and non-vital teeth.', 'Lines: with indication of Line Number, From and To ends, insulation, the P&ID where they are drawn.', 'The cruises depart from Manaus, capital of the State of Amazonas, a city in the jungle that prospered during the rubber boom last century and where you will find a smaller copy of the Opera House in Pari


[{'avg_test_bleu': 30.112290238305274}]
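
Note that `avg_test_bleu` is the mean of per-batch scores, as computed in `test_epoch_end`. For ratio-style metrics, a mean of per-batch values generally differs from the value computed once over all examples. The toy sketch below shows the effect with a simple matched/total ratio standing in for BLEU (it is not BLEU, and the counts are invented):

```python
from typing import List, Tuple

# Each batch is summarized as (matched_tokens, total_tokens); values are invented.
batches: List[Tuple[int, int]] = [(9, 10), (10, 100)]

# Mean of per-batch ratios: every batch weighs the same regardless of size.
mean_of_batches = sum(m / t for m, t in batches) / len(batches)

# Pooled ratio: counts are accumulated first, then divided once.
pooled = sum(m for m, _ in batches) / sum(t for _, t in batches)

print(round(mean_of_batches, 3), round(pooled, 3))
```

To report a corpus-level score instead, one option is to collect the decoded predictions and references from every step and call `sacrebleu.corpus_bleu` once in the epoch-end hook.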