# Lesson 11: Sentiment Analysis with RoBERTa
Name: **Orlem Lima dos Santos**

In this notebook we train a sentiment analysis model on the IMDB dataset.

# RoBERTa (pretrained)

## RoBERTa: A Robustly Optimized BERT Pretraining Approach

https://arxiv.org/abs/1907.11692

RoBERTa iterates on BERT's pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data. See the associated paper for more details.
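
Of the changes listed above, dynamic masking is the easiest to illustrate in a few lines. The sketch below is a simplified, dependency-free illustration and not fairseq's implementation: `MASK_ID`, `dynamic_mask`, and the 15% masking rate are assumptions made for the example, and BERT's 80/10/10 replacement rule is left out. The only point is that a new mask is drawn every time a sequence is served, instead of once during preprocessing.

```python
import random

MASK_ID = -1  # hypothetical id standing in for the <mask> token


def dynamic_mask(token_ids, rng, mask_prob=0.15):
    """Return a copy of token_ids with a freshly sampled set of positions masked."""
    return [MASK_ID if rng.random() < mask_prob else tok for tok in token_ids]


rng = random.Random(0)
sequence = list(range(100, 120))  # toy token ids

# Static masking would compute one masked copy and reuse it every epoch.
# Dynamic masking draws a new pattern each time the sequence is visited.
epoch_views = [dynamic_mask(sequence, rng) for _ in range(3)]
for epoch, view in enumerate(epoch_views):
    masked_positions = [i for i, tok in enumerate(view) if tok == MASK_ID]
    print(f"epoch {epoch}: masked positions {masked_positions}")
```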

In [None]:
# !pip install pytorch_lightning==1.2.10
# !pip install neptune-client
# !pip install fairseq
# !pip install git+https://github.com/pytorch/fairseq --upgrade

In [None]:
version = "roberta_large_pretrain_imbd_test8" #@param {type: "string"}
lr = 5e-6#@param {type: "number"}
w_decay = 0#@param {type: "number"}
bs = 4#@param {type: "integer"}
accum_grads = 8#@param {type: "integer"}
patience = 10#@param {type: "integer"}
max_epochs = 100#@param {type: "integer"}
warm_up_epochs =  2#@param {type: "integer"}
reduction = "sum"#@param {type: "string"}

# Define hyperparameters
hparams = {"version": version,
          "lr": lr,
          "w_decay": w_decay,
          "bs": bs, 
          "patience": patience,
          "accum_grads": accum_grads,
          "warm_up_epochs":warm_up_epochs,
          "reduction":reduction,
          "max_epochs": max_epochs}
hparams

{'accum_grads': 8,
 'bs': 4,
 'lr': 5e-06,
 'max_epochs': 100,
 'patience': 10,
 'reduction': 'sum',
 'version': 'roberta_large_pretrain_imbd_test8',
 'w_decay': 0,
 'warm_up_epochs': 2}
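
With `bs = 4` and `accum_grads = 8`, the optimizer only updates once every 8 minibatches, so the effective batch size is larger than what fits on the GPU at once. The arithmetic below is a small sketch of that relationship. It uses the 680 training samples reported later in this notebook, and it assumes the last, incomplete group of minibatches in an epoch also triggers an update, which is consistent with the step counter advancing by 22 per epoch in the training log.

```python
import math

bs = 4            # per-step batch size from hparams
accum_grads = 8   # gradient accumulation steps from hparams
num_train = 680   # training samples reported later in the notebook

effective_batch = bs * accum_grads
batches_per_epoch = math.ceil(num_train / bs)
optimizer_steps_per_epoch = math.ceil(batches_per_epoch / accum_grads)

print("effective batch size:", effective_batch)
print("minibatches per epoch:", batches_per_epoch)
print("optimizer steps per epoch:", optimizer_steps_per_epoch)
```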

# Preparing the Data

In [None]:
from pytorch_lightning.loggers.neptune import NeptuneLogger

In [None]:
neptune_logger = NeptuneLogger(
    api_key=None,  # token is read from the NEPTUNE_API_TOKEN environment variable
    project_name='orllem/Aula11imdb')

NeptuneLogger will work in online mode


First, we download the dataset:

In [None]:
!wget -nc http://files.fast.ai/data/examples/imdb_sample.tgz
!tar -xzf imdb_sample.tgz

File ‘imdb_sample.tgz’ already there; not retrieving.



In [None]:
import nvidia_smi
import pytorch_lightning as pl
from torch.utils.data import DataLoader
import torch
from torch import nn
import torch.nn.functional as F
from sklearn.model_selection import train_test_split
from torchmetrics.functional import f1
from torchmetrics.functional import accuracy
import os
from google.colab import drive
import numpy as np
import itertools

In [None]:
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
print(f"Pytorch Lightning Version: {pl.__version__}")
nvidia_smi.nvmlInit()
handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
print(f"Device name: {nvidia_smi.nvmlDeviceGetName(handle)}")

Pytorch Lightning Version: 1.2.10
Device name: b'Tesla P100-PCIE-16GB'


We load the .csv dataset with pandas:

In [None]:
import pandas as pd
df = pd.read_csv('imdb_sample/texts.csv')
df.shape
df.head()

Unnamed: 0,label,text,is_valid
0,negative,Un-bleeping-believable! Meg Ryan doesn't even ...,False
1,positive,This is a extremely well-made film. The acting...,False
2,negative,Every once in a long while a movie will come a...,False
3,positive,Name just says it all. I watched this movie wi...,False
4,negative,This movie succeeds at being one of the most u...,False


Next we split the dataset into training and test sets:

In [None]:
treino = df[df['is_valid'] == False]
test = df[df['is_valid'] == True]

print('treino.shape:', treino.shape)
print('test.shape:', test.shape)

treino.shape: (800, 3)
test.shape: (200, 3)


We then split both sets into the model input (X) and the desired output (Y, the ground truth):

In [None]:
X_treino = treino['text']
Y_treino = treino['label']
X_test = test['text']
Y_teste = test['label']

print('X_treino.head():', X_treino.head())
print('Y_treino.head():', Y_treino.head())

print('X_test.head():', X_test.head())

X_treino.head(): 0    Un-bleeping-believable! Meg Ryan doesn't even ...
1    This is a extremely well-made film. The acting...
2    Every once in a long while a movie will come a...
3    Name just says it all. I watched this movie wi...
4    This movie succeeds at being one of the most u...
Name: text, dtype: object
Y_treino.head(): 0    negative
1    positive
2    negative
3    positive
4    negative
Name: label, dtype: object
X_test.head(): 800    This very funny British comedy shows what migh...
801    I saw this movie once as a kid on the late-lat...
802    This is, in my opinion, a very good film, espe...
803    In Iran, women are not permitted to attend men...
804    "In April 1946, the University of Chicago agre...
Name: text, dtype: object


We still need to convert the ground-truth strings "positive" and "negative" into booleans:

In [None]:
mapeamento = {'positive': True, 'negative': False}
Y_treino_bool = Y_treino.map(mapeamento)
Y_test = Y_teste.map(mapeamento)
print(Y_treino_bool.head())

0    False
1     True
2    False
3     True
4    False
Name: label, dtype: bool


In [None]:
df['text']

0      Un-bleeping-believable! Meg Ryan doesn't even ...
1      This is a extremely well-made film. The acting...
2      Every once in a long while a movie will come a...
3      Name just says it all. I watched this movie wi...
4      This movie succeeds at being one of the most u...
                             ...                        
995    There are many different versions of this one ...
996    Once upon a time Hollywood produced live-actio...
997    Wenders was great with Million $ Hotel.I don't...
998    Although a film with Bruce Willis is always wo...
999    A compelling, honest, daring, and unforgettabl...
Name: text, Length: 1000, dtype: object

In [None]:
X_train, X_valid, Y_train, Y_valid = train_test_split(X_treino.values, Y_treino_bool.values,
                                                      test_size=0.15, stratify=Y_treino_bool.values,
                                                      random_state=12)
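
The `stratify` argument asks for a split in which each class keeps roughly the same share in the training and validation parts. The sketch below illustrates that idea in plain Python. It is not scikit-learn's algorithm: `stratified_split` is a hypothetical helper, and the 380/420 class counts are made up for the example.

```python
import random
from collections import Counter


def stratified_split(labels, test_size, seed):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, valid_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_valid = round(len(indices) * test_size)
        valid_idx.extend(indices[:n_valid])
        train_idx.extend(indices[n_valid:])
    return train_idx, valid_idx


# Toy labels: 800 samples with a mild imbalance (made-up proportions).
toy_labels = [True] * 380 + [False] * 420
train_idx, valid_idx = stratified_split(toy_labels, test_size=0.15, seed=12)

print(len(train_idx), len(valid_idx))
print(Counter(toy_labels[i] for i in valid_idx))
```

With 800 samples and `test_size=0.15`, the validation part holds 120 samples and the training part 680, the same sizes the notebook reports for its own split.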

In [None]:
print(X_train[0:5])
print(Y_train[0:5])

['As horror fans we all know that blind rentals are a crap-shoot. Sometimes we find a real gem, but many times we find that the film we\'ve just spent our hard earned money on is nothing more than a putrid steamer made worse by the completely undeserved rave reviews and film fest awards listed on the box. Such is the case with Five Across the Eyes ( a title I\'m sure is a double entendre referring to both the films budget and the compulsion anyone watching it might have to using all five fingers to stab their eyes out ).<br /><br />The story, or, at least what the *ahem* writers think passes for one, centers on a group of teen girls who unwisely decide to go on a backwoods joyride late at night after leaving a football game and run afoul of a crazy woman who plays cat and mouse with them as punishment for what she thinks the girls found in her car after a fender-bender in a gas station parking lot.<br /><br />In fairness, it\'s an interesting idea. Some of the best horrors have very si

In [None]:
print(X_valid[0:5])
print(Y_valid[0:5])

['Frank Sinatra was far from the ideal actor for westerns. He was a great actor, From Here to Eternity and The Man with The Golden arm are a proof of that, but he did not have the physique of a western hero, you identified him as an urban guy. But he tried to do his job well in Johnny Concho, the fact that the film was a failure at the box office was not his fault. I blame it on two factors: a) the story was too unusual, specially in the fact that Sinatra behaves more like a villain than as a hero throughout the movie. In a genre where people kind of expected a certain pattern, to break away from it the film has to be very good. b) the story is not convincing, it is hard to believe that a whole town will allow Sinatra to do anything he wants just because they are afraid of his brother. Also when a man shows him a special holster that will open sideways so he has not to draw the gun you wonder that if that will make him invincible, why all the gunfighters have not adopted it? I think th

In [None]:
print(X_test.values[0:5])
print(Y_test.values[0:5])

["This very funny British comedy shows what might happen if a section of London, in this case Pimlico, were to declare itself independent from the rest of the UK and its laws, taxes & post-war restrictions. Merry mayhem is what would happen.<br /><br />The explosion of a wartime bomb leads to the discovery of ancient documents which show that Pimlico was ceded to the Duchy of Burgundy centuries ago, a small historical footnote long since forgotten. To the new Burgundians, however, this is an unexpected opportunity to live as they please, free from any interference from Whitehall.<br /><br />Stanley Holloway is excellent as the minor city politician who suddenly finds himself leading one of the world's tiniest nations. Dame Margaret Rutherford is a delight as the history professor who sides with Pimlico. Others in the stand-out cast include Hermione Baddeley, Paul Duplis, Naughton Wayne, Basil Radford & Sir Michael Hordern.<br /><br />Welcome to Burgundy!"
 "I saw this movie once as a k

In [None]:
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
# roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
# roberta.eval()  # disable dropout (or leave in train mode to finetune)
print(roberta)

Using cache found in /root/.cache/torch/hub/pytorch_fairseq_master


RobertaHubInterface(
  (model): RobertaModel(
    (encoder): RobertaEncoder(
      (sentence_encoder): TransformerEncoder(
        (dropout_module): FairseqDropout()
        (embed_tokens): Embedding(50265, 1024, padding_idx=1)
        (embed_positions): LearnedPositionalEmbedding(514, 1024, padding_idx=1)
        (layernorm_embedding): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (layers): ModuleList(
          (0): TransformerEncoderLayer(
            (self_attn): MultiheadAttention(
              (dropout_module): FairseqDropout()
              (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
              (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
              (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
              (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
            )
            (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            (dropou

In [None]:
tokens = roberta.encode('Hello world!')
assert tokens.tolist() == [0, 31414, 232, 328, 2]
roberta.decode(tokens)  # 'Hello world!'

'Hello world!'

In [None]:
for sentence in X_train[0:5]:
  print('sentence:', sentence)
  tokens = roberta.encode(sentence)
  print('tokens shape', tokens.shape)
  print('sentence decode:', roberta.decode(tokens))

sentence: As horror fans we all know that blind rentals are a crap-shoot. Sometimes we find a real gem, but many times we find that the film we've just spent our hard earned money on is nothing more than a putrid steamer made worse by the completely undeserved rave reviews and film fest awards listed on the box. Such is the case with Five Across the Eyes ( a title I'm sure is a double entendre referring to both the films budget and the compulsion anyone watching it might have to using all five fingers to stab their eyes out ).<br /><br />The story, or, at least what the *ahem* writers think passes for one, centers on a group of teen girls who unwisely decide to go on a backwoods joyride late at night after leaving a football game and run afoul of a crazy woman who plays cat and mouse with them as punishment for what she thinks the girls found in her car after a fender-bender in a gas station parking lot.<br /><br />In fairness, it's an interesting idea. Some of the best horrors have ve

# Defining the Dataset and DataLoaders

In [None]:
NUM_LABELS = 2
labels = {False: 0, True: 1}

In [None]:
def make_target(label, labels):
    return torch.LongTensor([labels[label]])

In [None]:
class ImdbDataset(torch.utils.data.Dataset):

  def __init__(self, X, y, labels):
    self.X = X
    self.y = y
    self.labels = labels

  def __len__(self):
    return len(self.X)

  def __getitem__(self, idx):
    
    vec = self.X[idx]
    target = make_target(self.y[idx], self.labels)[0]
    return vec, target

In [None]:
train_dataset = ImdbDataset(X_train, Y_train, 
                               labels=labels)
val_dataset = ImdbDataset(X_valid, Y_valid, 
                             labels=labels)
test_dataset = ImdbDataset(X_test.values, Y_test.values,
                              labels=labels)

In [None]:
print('Number of training samples:', len(train_dataset))
print('Number of validation samples:', len(val_dataset))
print('Number of test samples:', len(test_dataset))

Number of training samples: 680
Number of validation samples: 120
Number of test samples: 200


In [None]:
data = train_dataset[0]
print(data)
x = data[0]
y = data[1]
print('x:', x)
print('y:', y)

('As horror fans we all know that blind rentals are a crap-shoot. Sometimes we find a real gem, but many times we find that the film we\'ve just spent our hard earned money on is nothing more than a putrid steamer made worse by the completely undeserved rave reviews and film fest awards listed on the box. Such is the case with Five Across the Eyes ( a title I\'m sure is a double entendre referring to both the films budget and the compulsion anyone watching it might have to using all five fingers to stab their eyes out ).<br /><br />The story, or, at least what the *ahem* writers think passes for one, centers on a group of teen girls who unwisely decide to go on a backwoods joyride late at night after leaving a football game and run afoul of a crazy woman who plays cat and mouse with them as punishment for what she thinks the girls found in her car after a fender-bender in a gas station parking lot.<br /><br />In fairness, it\'s an interesting idea. Some of the best horrors have very si

## Turning word ids into batches

In [None]:
from fairseq.data.data_utils import collate_tokens

In [None]:
seq_len = 512

def collate_fn(batch):
    # Each item is (raw review text, label); split into two tuples.
    sentences, labels_list = zip(*batch)
    # Encode with RoBERTa's BPE, keep at most seq_len tokens per review, and
    # right-pad with pad_idx=1 so every row has length seq_len.
    batch_word_ids = collate_tokens(
        [roberta.encode(sentence)[:seq_len] for sentence in sentences],
        pad_to_length=seq_len, pad_idx=1)

    return torch.LongTensor(batch_word_ids), torch.LongTensor(labels_list)
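
One side effect of slicing the encoded sequence to `seq_len` is that reviews longer than 512 tokens lose their closing `</s>` token (id 2). This shows up in the batch printed further down, where some rows end in ordinary token ids rather than 2 or the padding id 1. The sketch below reproduces the behaviour on plain Python lists. `pad_or_truncate` and its `keep_eos` option are hypothetical, and whether keeping the end token changes the results here was not tested.

```python
BOS, PAD, EOS = 0, 1, 2  # ids seen in the notebook's outputs for RoBERTa


def pad_or_truncate(ids, seq_len, keep_eos=False):
    """Cut ids to seq_len, then right-pad with PAD up to seq_len."""
    if len(ids) > seq_len:
        ids = ids[:seq_len]
        if keep_eos:
            ids[-1] = EOS  # hypothetical variant: overwrite the last slot with EOS
    return ids + [PAD] * (seq_len - len(ids))


short = [BOS, 31414, 232, 328, EOS]            # 'Hello world!' from above
long = [BOS] + list(range(100, 110)) + [EOS]   # toy sequence longer than seq_len

print(pad_or_truncate(short, seq_len=8))
print(pad_or_truncate(long, seq_len=8))
print(pad_or_truncate(long, seq_len=8, keep_eos=True))
```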

In [None]:
batch_size = hparams["bs"]

train_dataloader = DataLoader(train_dataset, batch_size=batch_size,
                              collate_fn=collate_fn,
                              shuffle=True, num_workers=4)
val_dataloader = DataLoader(val_dataset, batch_size=batch_size,
                            collate_fn=collate_fn,
                            shuffle=False,  num_workers=4)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size,
                             collate_fn=collate_fn,
                             shuffle=False,  num_workers=4)

print('Number of training minibatches:', len(train_dataloader))
print('Number of validation minibatches:', len(val_dataloader))
print('Number of test minibatches:', len(test_dataloader))


x_train, y_train = next(iter(train_dataloader))
x_valid,  y_valid = next(iter(val_dataloader))
x_test, y_test = next(iter(test_dataloader))
print("\nDimensions of one minibatch:", x_train.size())
print("\nDimensions of one minibatch:", x_valid.size())
print("\nDimensions of one minibatch:", x_test.size())
print("\nDimensions of one minibatch:", y_train.size())
print("Minimum and maximum values of x: ", torch.min(x_train), torch.max(x_train))
print("Minimum and maximum values of y: ", torch.min(y_train), torch.max(y_train))
print("Data type of the sentences:        ", type(x_train))
print("Data type of the sentence classes: ", type(y_train))

print(x_train)

Number of training minibatches: 170
Number of validation minibatches: 30
Number of test minibatches: 50

Dimensions of one minibatch: torch.Size([4, 512])

Dimensions of one minibatch: torch.Size([4, 512])

Dimensions of one minibatch: torch.Size([4, 512])

Dimensions of one minibatch: torch.Size([4])
Minimum and maximum values of x:  tensor(0) tensor(49069)
Minimum and maximum values of y:  tensor(1) tensor(1)
Data type of the sentences:         <class 'torch.Tensor'>
Data type of the sentence classes:  <class 'torch.Tensor'>
tensor([[    0,   713,  1569,  ...,    38,    21,  2908],
        [    0,   713,   822,  ...,     1,     1,     1],
        [    0, 15243,     5,  ...,     1,     1,     1],
        [    0,   113, 11475,  ...,  7850,     8,   117]])


# Defining the RoBERTa Classifier

In [None]:
class LabelSmoothingCrossEntropy(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, logprobs, target, smoothing=0.1, reduction="sum"):
        # logprobs: (batch, num_classes) log-probabilities; target: (batch,) class ids
        confidence = 1. - smoothing
        # Negative log-likelihood of the true class for each sample
        nll_loss = -logprobs.gather(dim=-1, index=target.unsqueeze(1))
        nll_loss = nll_loss.squeeze(1)
        # Uniform-target term: mean negative log-probability over all classes
        smooth_loss = -logprobs.mean(dim=-1)
        loss = confidence * nll_loss + smoothing * smooth_loss
        if reduction == "sum":
            loss = loss.sum()
        elif reduction == "mean":
            loss = loss.mean()
        else:
            raise ValueError(f"unsupported reduction: {reduction!r}")
        return loss
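
To make the formula concrete, here is a worked example for a single sample, written in plain Python so the numbers can be checked by hand. `label_smoothing_loss` is a hypothetical scalar version of the class above, and the 0.9/0.1 probabilities are made up. Note that the notebook uses this smoothed loss only in training; validation and test use plain `F.nll_loss`.

```python
import math


def label_smoothing_loss(logprobs, target, smoothing=0.1):
    """Per-sample loss: (1 - s) * NLL(target) + s * mean over classes of -logprob."""
    nll = -logprobs[target]
    smooth = -sum(logprobs) / len(logprobs)
    return (1.0 - smoothing) * nll + smoothing * smooth


# Toy two-class prediction: 90% on class 1, 10% on class 0.
logprobs = [math.log(0.1), math.log(0.9)]

plain_nll = -logprobs[1]
smoothed = label_smoothing_loss(logprobs, target=1)
print(f"plain NLL: {plain_nll:.4f}")
print(f"smoothed:  {smoothed:.4f}")
```

The smoothed loss is higher than the plain negative log-likelihood for this confident, correct prediction, because the uniform term penalizes the low probability assigned to the other class.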

## Defining the loss function

In [None]:
loss_function = LabelSmoothingCrossEntropy()

In [None]:
class robertaNetClassifier(pl.LightningModule):
    def __init__(self, *args, **kwargs):
        super().__init__()

        self.hparams = hparams

        # Note how the architecture depends on the saved hyperparameters.
        self.model = roberta

        self.model.register_classification_head('imdb_head',
                                                # inner_dim=768,
                                                # inner_dim=2048,
                                                # activation_fn='relu',
                                                # pooler_dropout=0.01,
                                                num_classes=NUM_LABELS)
 
    def forward(self, x):        
        logprobs = self.model.predict('imdb_head', x)
        return logprobs

    def predict_step(self, batch, batch_idx, dataloader_idx=None):
        x, y = batch
        # print('x', x.shape)
        logprobs = self(x)
        return logprobs    

    def training_step(self, train_batch, batch_idx):
        
        x, y = train_batch

        # compute the label-smoothed cross-entropy loss
        logprobs = self.forward(x)
        # loss = F.nll_loss(logprobs, y, reduction=self.hparams["reduction"])
        loss = loss_function(logprobs, y, reduction=self.hparams["reduction"])

        self.log('cross_loss_step', loss, prog_bar=True)
        
        return loss

    def training_epoch_end(self, outputs):
        loss = torch.stack([x['loss'] for x in outputs]).mean()       

        self.log("train_loss", loss, prog_bar=True)
  
    def validation_step(self, val_batch, batch_idx):
        
        x, y = val_batch

        logprobs = self.forward(x)
      
        val_loss = F.nll_loss(logprobs, y, reduction=self.hparams["reduction"])
        preds = logprobs.argmax(dim=1)

        val_f1 = f1(preds, y, num_classes=2, average='weighted')
        val_acc = accuracy(preds, y)

        self.log('val_loss_step', val_loss, prog_bar=True)
        self.log('val_f1_step', val_f1, prog_bar=True)
        self.log('val_acc_step', val_acc, prog_bar=True)

        return {"val_loss_step": val_loss, "val_f1_step": val_f1,
                "val_acc_step": val_acc}

    def validation_epoch_end(self, outputs):
        val_loss = torch.stack([x['val_loss_step'] for x in outputs]).mean()
        val_f1 = torch.stack([x['val_f1_step'] for x in outputs]).mean()
        val_acc = torch.stack([x['val_acc_step'] for x in outputs]).mean()

        self.log("val_loss", val_loss, prog_bar=True)
        self.log("val_f1", val_f1, prog_bar=True)
        self.log("val_acc", val_acc, prog_bar=True)
  
    def test_step(self, test_batch, batch_idx):
        
        x, y = test_batch

        logprobs = self.forward(x)
        test_loss = F.nll_loss(logprobs, y, reduction=self.hparams["reduction"])

        preds = logprobs.argmax(dim=1)

        test_f1 = f1(preds, y, num_classes=2, average='weighted')
        test_acc = accuracy(preds, y)

        self.log('test_loss_step', test_loss, prog_bar=True)
        self.log('test_f1_step', test_f1, prog_bar=True)
        self.log('test_acc_step', test_acc, prog_bar=True)

        return {"test_loss_step": test_loss, "test_f1_step": test_f1,
                "test_acc_step": test_acc}

    def test_epoch_end(self, outputs):
        loss = torch.stack([x['test_loss_step'] for x in outputs]).mean()
        f1_test = torch.stack([x['test_f1_step'] for x in outputs]).mean()
        acc = torch.stack([x['test_acc_step'] for x in outputs]).mean()

        self.log("test_loss", loss, prog_bar=True)
        self.log("test_f1", f1_test, prog_bar=True)
        self.log("test_acc", acc, prog_bar=True)

    def configure_optimizers(self):

        # optimizer = torch.optim.RMSprop(self.parameters(),
        #                  lr=self.hparams["lr"],
        #                  weight_decay=self.hparams["w_decay"])

        def lr_foo(epoch):
            if epoch < self.hparams["warm_up_epochs"]:
                # warm up lr
                lr_scale = 0.1 ** (self.hparams["warm_up_epochs"] - epoch)
            else:
                lr_scale = 1
                # lr_scale = 0.95 ** epoch

            return lr_scale
        
        optimizer = torch.optim.Adam(self.parameters(),
                         lr=self.hparams["lr"],
                         betas=(0.9, 0.98),
                         eps=1e-06,
                         weight_decay=self.hparams["w_decay"])
        
        scheduler = torch.optim.lr_scheduler.LambdaLR(
            optimizer,
            lr_lambda=lr_foo
        )
        
        return {'optimizer': optimizer, 'lr_scheduler': scheduler}
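
The `lr_foo` rule above multiplies the base learning rate by 0.1 for every warm-up epoch still ahead. The sketch below tabulates the resulting values for the hyperparameters used in this run, assuming the scheduler is stepped once per epoch. The rate only reaches its full value at epoch 2, which may help explain why the log shows `val_f1` at 0.51 after epoch 0 and 0.94 by epoch 3.

```python
base_lr = 5e-6        # hparams["lr"]
warm_up_epochs = 2    # hparams["warm_up_epochs"]


def lr_scale(epoch):
    """Same rule as lr_foo: multiply by 0.1 for each warm-up epoch still ahead."""
    if epoch < warm_up_epochs:
        return 0.1 ** (warm_up_epochs - epoch)
    return 1.0


schedule = [base_lr * lr_scale(epoch) for epoch in range(4)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.1e}")
```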

In [None]:
hparams["lr"]

5e-06

# Creating the classifier and training

In [None]:
pl_model =  robertaNetClassifier(hparams=hparams)
print(pl_model)

checkpoint_path = '/content/drive/MyDrive/aula11_checkpoints_AP/'
print(f'Files in {checkpoint_path}: {os.listdir(checkpoint_path)}')
print(f'Saving checkpoints to {checkpoint_path}')
checkpoint_callback = pl.callbacks.ModelCheckpoint(filename=hparams["version"]+'-{epoch:02d}-{val_f1:.2f}',
                                                    dirpath=checkpoint_path,
                                                    save_top_k=1, 
                                                    verbose=True,
                                                    monitor="val_f1", mode="max")
early_stop_callback = pl.callbacks.EarlyStopping(monitor='val_f1', patience=hparams["patience"], mode='max')
lr_monitor = pl.callbacks.LearningRateMonitor(logging_interval='epoch')

trainer = pl.Trainer(gpus=1, 
                     # precision=16,
                     logger=neptune_logger,
                     num_sanity_val_steps=0,
                     accumulate_grad_batches=hparams["accum_grads"],
                     checkpoint_callback=checkpoint_callback, 
                     callbacks=[early_stop_callback, lr_monitor],
                     max_epochs=hparams["max_epochs"])

GPU available: True, used: True
TPU available: False, using: 0 TPU cores


robertaNetClassifier(
  (model): RobertaHubInterface(
    (model): RobertaModel(
      (encoder): RobertaEncoder(
        (sentence_encoder): TransformerEncoder(
          (dropout_module): FairseqDropout()
          (embed_tokens): Embedding(50265, 1024, padding_idx=1)
          (embed_positions): LearnedPositionalEmbedding(514, 1024, padding_idx=1)
          (layernorm_embedding): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (layers): ModuleList(
            (0): TransformerEncoderLayer(
              (self_attn): MultiheadAttention(
                (dropout_module): FairseqDropout()
                (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
                (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
              )
              (self_attn_layer_norm): LayerNo

In [None]:
trainer.fit(pl_model, train_dataloader, val_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


https://app.neptune.ai/orllem/Aula11imdb/e/AUL7-65



  | Name  | Type                | Params
----------------------------------------------
0 | model | RobertaHubInterface | 356 M 
----------------------------------------------
356 M     Trainable params
0         Non-trainable params
356 M     Total params
1,425.851 Total estimated model params size (MB)


Epoch 0, global step 21: val_f1 reached 0.51032 (best 0.51032), saving model to "/content/drive/MyDrive/aula11_checkpoints_AP/roberta_large_pretrain_imbd_test8-epoch=00-val_f1=0.51.ckpt" as top 1
Epoch 1, step 43: val_f1 was not in top 1
Epoch 2, step 65: val_f1 was not in top 1
Epoch 3, global step 87: val_f1 reached 0.94444 (best 0.94444), saving model to "/content/drive/MyDrive/aula11_checkpoints_AP/roberta_large_pretrain_imbd_test8-epoch=03-val_f1=0.94.ckpt" as top 1
Epoch 4, step 109: val_f1 was not in top 1
Epoch 5, global step 131: val_f1 reached 0.95222 (best 0.95222), saving model to "/content/drive/MyDrive/aula11_checkpoints_AP/roberta_large_pretrain_imbd_test8-epoch=05-val_f1=0.95.ckpt" as top 1
Epoch 6, step 153: val_f1 was not in top 1
Epoch 7, step 175: val_f1 was not in top 1
Epoch 8, step 197: val_f1 was not in top 1
Epoch 9, step 219: val_f1 was not in top 1
Epoch 10, step 241: val_f1 was not in top 1
Epoch 11, step 263: val_f1 was not in top 1
Epoch 12, step 285: val_f1 was not in top 1
Epoch 13, step 307: val_f1 was not in top 1
Epoch 14, step 329: val_f1 was not in top 1
Epoch 15, step 351: val_f1 was not in top 1





1
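
Training ended at epoch 15 even though `max_epochs` is 100. This matches early stopping with `patience = 10` counted from the last improvement at epoch 5. The loop below is a simplified patience counter, not Lightning's `EarlyStopping` code. It replays the epochs at which the log reports a new best `val_f1`.

```python
patience = 10                   # hparams["patience"]
improved_epochs = {0, 3, 5}     # epochs where the log reports a new best val_f1

wait = 0
stopped_at = None
for epoch in range(100):        # max_epochs = 100
    if epoch in improved_epochs:
        wait = 0
    else:
        wait += 1
    if wait >= patience:
        stopped_at = epoch
        break

print("training stops after epoch", stopped_at)
```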

## Test: Evaluating the classifier


In [None]:
from sklearn import metrics

In [None]:
best_model = checkpoint_callback.best_model_path
# best_model = '/content/drive/MyDrive/aula11_checkpoints_AP/roberta_large_pretrain_imbd_test6-epoch=06-val_f1=0.94.ckpt'
print(best_model)
test_model = robertaNetClassifier.load_from_checkpoint(best_model, hparams=hparams).cuda().eval()

/content/drive/MyDrive/aula11_checkpoints_AP/roberta_large_pretrain_imbd_test8-epoch=05-val_f1=0.95.ckpt


re-registering head "imdb_head" with num_classes 2 (prev: 2) and inner_dim None (prev: 1024)


In [None]:
trainer.test(test_model, test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]




--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_acc': 0.9449999928474426,
 'test_acc_step': 0.9449999928474426,
 'test_f1': 0.9379047155380249,
 'test_f1_step': 0.9379047155380249,
 'test_loss': 0.7352401614189148,
 'test_loss_step': 0.73524010181427}
--------------------------------------------------------------------------------


[{'test_acc': 0.9449999928474426,
  'test_acc_step': 0.9449999928474426,
  'test_f1': 0.9379047155380249,
  'test_f1_step': 0.9379047155380249,
  'test_loss': 0.7352401614189148,
  'test_loss_step': 0.73524010181427}]

In [None]:
y_true = list()
y_pred = list()

for i, batch in enumerate(test_dataloader):
  x, y = batch
  with torch.no_grad():
    logprobs = test_model.predict_step(batch, i)
  preds = logprobs.argmax(dim=1)   
  
  y_true.append(y.cpu().numpy())
  y_pred.append(preds.cpu().numpy())

y_true = np.concatenate(y_true)
y_pred = np.concatenate(y_pred)

In [None]:
# weighted F1 score
print('f1:', metrics.f1_score(y_true, y_pred, average='weighted'))

# accuracy
print('acc:', metrics.accuracy_score(y_true, y_pred))

# balanced accuracy
print('balanced acc:', metrics.balanced_accuracy_score(y_true, y_pred))

f1: 0.9448681853729913
acc: 0.945
balanced acc: 0.9429705557230429
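
The weighted F1 here (0.9449) differs from the `test_f1` reported by `trainer.test` (0.9379), although both come from the same predictions. A plausible explanation is that `test_epoch_end` averages F1 scores computed on 4-sample minibatches, while scikit-learn computes F1 once over all 200 samples. Accuracy agrees (0.945) because all batches have the same size. The toy sketch below shows that the two ways of aggregating F1 generally give different numbers. `weighted_f1` is a hypothetical pure-Python helper and the labels are made up.

```python
def weighted_f1(y_true, y_pred, classes=(0, 1)):
    """Support-weighted average of per-class F1 scores."""
    total = 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        support = tp + fn
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += support * f1
    return total / len(y_true)


# Toy labels and predictions, split into two minibatches of 4.
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 1, 1, 1]

batch_scores = [weighted_f1(y_true[i:i + 4], y_pred[i:i + 4]) for i in (0, 4)]
mean_of_batches = sum(batch_scores) / len(batch_scores)
global_score = weighted_f1(y_true, y_pred)

print(f"mean of per-batch F1: {mean_of_batches:.4f}")
print(f"F1 over all samples:  {global_score:.4f}")
```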


In [None]:
print('classification report:', metrics.classification_report(y_true, y_pred))

classification report:               precision    recall  f1-score   support

           0       0.93      0.97      0.95       107
           1       0.97      0.91      0.94        93

    accuracy                           0.94       200
   macro avg       0.95      0.94      0.94       200
weighted avg       0.95      0.94      0.94       200
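
The report can be cross-checked against the scores printed earlier. The confusion matrix below is not printed by the notebook. It was inferred here as the set of counts consistent with the supports (107 and 93) and the reported accuracy, so treat it as a derived assumption. From it, the sketch recomputes precision, recall, F1, balanced accuracy and weighted F1 in plain Python.

```python
# Confusion matrix inferred from the report (rows: true class, cols: predicted).
# It is not printed by the notebook; these counts are the ones that reproduce
# the supports (107, 93) and the scores shown above.
cm = [[104, 3],
      [8, 85]]

recalls, f1s = [], []
for c in (0, 1):
    tp = cm[c][c]
    fn = sum(cm[c]) - tp
    fp = sum(row[c] for row in cm) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    recalls.append(recall)
    f1s.append(f1)
    print(f"class {c}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

supports = [sum(row) for row in cm]
total = sum(supports)
accuracy = (cm[0][0] + cm[1][1]) / total
balanced_acc = sum(recalls) / len(recalls)
weighted_f1_score = sum(s * f for s, f in zip(supports, f1s)) / total

print(f"accuracy={accuracy:.3f} balanced_acc={balanced_acc:.4f} weighted_f1={weighted_f1_score:.4f}")
```

Balanced accuracy is the mean of the two per-class recalls, which is why it sits slightly below plain accuracy: the positive class, with lower recall, counts as much as the larger negative class.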



# Final remarks

 
1. This experiment used RoBERTa (large) from fairseq on the reduced dataset (800 training and 200 test samples).
2. Both the base and the large models were tested.
3. I found RoBERTa's tokenizer interesting: it uses byte-level BPE, the same encoding used by GPT-2.
4. The large model improved the result from 91% to 94.5%.
5. Warm-up proved to be quite effective.



