## BERT with custom classifier

References
  - [Multi Class Text Classification With Deep Learning Using BERT](https://towardsdatascience.com/multi-class-text-classification-with-deep-learning-using-bert-b59ca2f5c613)
  - [Sentiment Analysis with BERT and Transformers by Hugging Face using PyTorch and Python](https://curiousily.com/posts/sentiment-analysis-with-bert-and-hugging-face-using-pytorch-and-python/)


What is used?
  * BertTokenizer
  * BertModel
 
BertModel returns the last hidden state and a pooled output. I add a custom MLP with one hidden layer on top of the pooled output as the classifier.


### Requirements
  * transformers == 4.6.0
  * pytorch == 1.8.0
  * numpy == 1.1.92
  * pandas == 1.2.3
  * tqdm == 4.60.0
  * scikit-learn == 0.24.1

<a id="Table"></a>
### Table
  - [Import library](#Import)
  - [Check train set](#Trainset)
  - [Split train/validation set](#Split)
  - [Tokenizer, Encoding data](#Encode)
  - [Check output of BertModel](#Output)
  - [Define custom classifier model](#Classifier)
  - [Call custom classifier model, Data loaders](#Model)
    - batch_size can be changed in this block
  - [Optimizer & Scheduler](#Optim)
    - epochs, lr, etc. can be changed in this block
  - [Load pre-trained weight](#Load)
    - Loads previously saved weights; skip this block if you don't want to use them
  - [Running model](#Run)
    - Can start from the weights loaded in the previous block
  - [Test & Submission](#Sub)

<a id="Import"></a>
### Import library
  - [Return to table](#Table)

In [64]:
import numpy as np
import pandas as pd
import torch
from tqdm.notebook import tqdm

from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
from torch import nn, optim
import torch.nn.functional as F

from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import BertForMaskedLM, BertModel
from transformers import AdamW, get_linear_schedule_with_warmup

In [65]:
USE_CUDA = torch.cuda.is_available()
print('Use', 'cuda' if USE_CUDA else 'cpu')
device = torch.device('cuda' if USE_CUDA else 'cpu')
#device = torch.device('cpu')

Use cuda


<a id="Trainset"></a>
### Check train set
  - [Return to table](#Table)

In [67]:
train_df = pd.read_csv('./data/train_final.csv')
#train_df = pd.read_csv('./data/aug_train_final.csv')

aug_de_train_df = pd.read_csv("../data/augmented/train_x_de2en.csv")
aug_ru_train_df = pd.read_csv("../data/augmented/train_x_ru2en.csv")

# Drop augmented rows that duplicate the row at the same index (de2en vs. ru2en, then ru2en vs. the original)
aug_de_train_df = aug_de_train_df[aug_de_train_df['Sentence'] != aug_ru_train_df['Sentence']]
aug_ru_train_df = aug_ru_train_df[aug_ru_train_df['Sentence'] != train_df['Sentence']]

new_train_df = pd.concat([train_df,aug_ru_train_df,aug_de_train_df], ignore_index=True)
train_df = new_train_df
#train_df = train_df.drop_duplicates(['Sentence'])

print("len of train_df :", len(train_df))

len of train_df : 34311

Dataset size and label distribution

In [68]:
train_df.tail()

| | Id | Category | Sentence | Unnamed: 0 |
|---|---|---|---|---|
| 34306 | NaN | 3 | Although Frailty fits into a classic genre, it... | 11539.0 |
| 34307 | NaN | 1 | Mediocre fable from Burkina Faso. | 11540.0 |
| 34308 | NaN | 4 | Like all great films about a life you never kn... | 11541.0 |
| 34309 | NaN | 4 | Those who are not deterred by the film's auste... | 11542.0 |
| 34310 | NaN | 4 | An ambitious film that, like Shiner's organisa... | 11543.0 |


In [69]:
train_df['Category'].value_counts()

3    9258
1    8960
2    6547
4    5157
0    4389
Name: Category, dtype: int64

<a id="Split"></a>
### Split train/validation set
  - [Return to table](#Table)

In [70]:
X_train, X_val, y_train, y_val = train_test_split(train_df.index.values,
                                                  train_df.Category.values,
                                                  test_size=0.05, 
                                                  random_state=42, 
                                                  stratify=train_df.Category.values)

train_df.loc[X_train, 'data_type'] = 'train'
train_df.loc[X_val, 'data_type'] = 'val'

train_df.groupby(['Category', 'data_type']).count()

| Category | data_type | Id | Sentence | Unnamed: 0 |
|---|---|---|---|---|
| 0 | train | 1413 | 4169 | 2756 |
| 0 | val | 63 | 220 | 157 |
| 1 | train | 2854 | 8512 | 5658 |
| 1 | val | 161 | 448 | 287 |
| 2 | train | 2097 | 6220 | 4123 |
| 2 | val | 113 | 327 | 214 |
| 3 | train | 2954 | 8795 | 5841 |
| 3 | val | 155 | 463 | 308 |
| 4 | train | 1653 | 4899 | 3246 |
| 4 | val | 81 | 258 | 177 |


In [71]:
possible_labels = train_df.Category.unique()

label_dict = {x:x for x in sorted(train_df['Category'].unique())}
label_dict

{0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
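
As a rough illustration of what `stratify` does in the split above, the sketch below splits each class separately so that about 5% of every label ends up in the validation part. It is a simplified stand-in written with the standard library only, not scikit-learn's implementation, and the helper name `stratified_split` is hypothetical. The class counts are copied from the `value_counts()` output.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size, seed=42):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * test_size)  # per-class validation size
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Class counts taken from the value_counts() output
counts = {3: 9258, 1: 8960, 2: 6547, 4: 5157, 0: 4389}
labels = [label for label, n in counts.items() for _ in range(n)]

train_idx, val_idx = stratified_split(labels, test_size=0.05)
val_counts = Counter(labels[i] for i in val_idx)
for label in sorted(counts):
    print(label, val_counts[label], round(val_counts[label] / counts[label], 4))
```

Per-class counts may differ by a row or so from what `train_test_split` produces, since scikit-learn distributes the rounding remainder differently.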

<a id="Encode"></a>
### Tokenizer, Encoding data
  - [Return to table](#Table)

In [72]:
# Load the tokenizer that matches the pre-trained model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', 
                                          do_lower_case=True)

# Encode the training sentences
encoded_data_train = tokenizer.batch_encode_plus(
    train_df[train_df.data_type=='train'].Sentence.values, # Sentence data
    add_special_tokens=True,    # Add the model's special tokens ([CLS], [SEP])
    return_attention_mask=True, # Return a mask that is 1 for real tokens and 0 for padding
    padding='longest',          # Pad every sentence to the longest one in this call
    #truncation=True,            
    return_tensors='pt'         # Return PyTorch tensors
)

#Encode validation sentence
encoded_data_val = tokenizer.batch_encode_plus(
    train_df[train_df.data_type=='val'].Sentence.values, 
    add_special_tokens=True, 
    return_attention_mask=True, 
    padding='longest', 
    #truncation=True, 
    return_tensors='pt'
)
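
To make the `padding='longest'` and `return_attention_mask=True` options concrete, here is a minimal pure-Python sketch of what they produce for a batch. The helper `pad_to_longest` is hypothetical and the token ids are illustrative; 101, 102, and 0 are assumed to be the `[CLS]`, `[SEP]`, and `[PAD]` ids of `bert-base-uncased`.

```python
PAD_ID = 0  # assumed [PAD] id for bert-base-uncased

def pad_to_longest(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the longest one and build the matching mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding the model should ignore
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask

# Illustrative token ids: 101 = [CLS], 102 = [SEP]
batch = [[101, 7592, 102], [101, 2023, 2003, 2307, 102]]
input_ids, attention_mask = pad_to_longest(batch)
print(input_ids)
print(attention_mask)
```

Because the training and validation sets are encoded in separate calls, each is padded to its own longest sentence, so the two tensors can have different widths.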

In [73]:
"""
encoded_data_train has 3 keys:
  *input_ids
  *token_type_ids
  *attention_mask
"""

#Input
input_ids_train = encoded_data_train['input_ids']
attention_masks_train = encoded_data_train['attention_mask']
labels_train = torch.tensor(train_df[train_df.data_type=='train'].Category.values)

#Validation
input_ids_val = encoded_data_val['input_ids']
attention_masks_val = encoded_data_val['attention_mask']
labels_val = torch.tensor(train_df[train_df.data_type=='val'].Category.values)

dataset_train = TensorDataset(input_ids_train, attention_masks_train, labels_train)
dataset_val = TensorDataset(input_ids_val, attention_masks_val, labels_val)

<a id="Output"></a>
### Check output of BertModel
  - [Return to table](#Table)

In [74]:
model = BertModel.from_pretrained('bert-base-uncased')

In [75]:
print(train_df['Sentence'][0])

-LRB- The film -RRB- tackles the topic of relationships in such a straightforward , emotionally honest manner that by the end , it 's impossible to ascertain whether the film is , at its core , deeply pessimistic or quietly hopeful .


In [76]:
encoding = tokenizer.encode_plus(
                train_df['Sentence'][0],
                add_special_tokens=True,
                return_attention_mask=True,
                padding='longest',        
                return_tensors='pt')

outputs = model(
            input_ids=encoding['input_ids'],
            attention_mask=encoding['attention_mask'])

In [52]:
# Integer indices work whether the model returns a tuple or a ModelOutput
outputs[0].shape  # last hidden state: (batch_size, sequence_length, hidden_size)

In [None]:
outputs[1].shape  # pooled output: (batch_size, hidden_size)

<a id="Classifier"></a>
### Define custom classifier model
  - [Return to table](#Table)

In [77]:
class Classifier(nn.Module):
    def __init__(self, n_classes):
        super(Classifier, self).__init__()
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.drop1 = nn.Dropout(p=0.3)
        self.fc1 = nn.Linear(self.bert.config.hidden_size, self.bert.config.hidden_size//2)
        self.output = nn.Linear(self.bert.config.hidden_size//2,  n_classes)
        
    def forward(self, input_ids, attention_mask):
        x = self.bert(
                    input_ids=input_ids,
                    attention_mask=attention_mask)
        x = self.drop1(x[1])
        x = F.relu(self.fc1(x))
        x = F.softmax(self.output(x), dim=1)
        return x
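
One thing worth checking in the classifier above: `forward` applies `F.softmax`, while `nn.CrossEntropyLoss` (used later as `loss_fn`) applies its own log-softmax to whatever it receives. The sketch below, using only the standard library and hypothetical scores, shows that feeding probabilities into cross-entropy puts a floor of roughly 0.905 on the loss for 5 classes, even for a perfectly confident correct prediction.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, target):
    """Per-sample cross-entropy: log-softmax of the scores, then negative log-likelihood."""
    log_sum_exp = math.log(sum(math.exp(s) for s in scores))
    return log_sum_exp - scores[target]

# Hypothetical logits for a very confident, correct prediction of class 0
confident_logits = [20.0, 0.0, 0.0, 0.0, 0.0]

loss_on_logits = cross_entropy(confident_logits, target=0)
loss_on_probs = cross_entropy(softmax(confident_logits), target=0)
floor = math.log(1 + 4 / math.e)  # best possible value when inputs are probabilities

print(f'loss on raw logits:    {loss_on_logits:.6f}')
print(f'loss on probabilities: {loss_on_probs:.6f}')
print(f'floor for 5 classes:   {floor:.6f}')
```

The training loss in this notebook levels off near 0.96, which would be consistent with this floor. A possible adjustment, offered here as a suggestion rather than as the notebook's method, is to return `self.output(x)` directly and apply softmax only when probabilities are needed; the argmax predictions are the same either way.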

<a id="Model"></a>
### Call custom classifier model, Data loaders
  - [Return to table](#Table)

In [78]:
model = Classifier(len(train_df.Category.unique()))
model.to(device)

Classifier(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
 

In [79]:
batch_size = 8

dataloader_train = DataLoader(dataset_train, 
                              sampler=RandomSampler(dataset_train), 
                              batch_size=batch_size)

dataloader_validation = DataLoader(dataset_val, 
                                   sampler=SequentialSampler(dataset_val), 
                                   batch_size=batch_size)

<a id="Optim"></a>
### Optimizer & Scheduler
  - [Return to table](#Table)

In [80]:
optimizer = AdamW(model.parameters(),
                  lr=2e-5, 
                  eps=1e-8)
                  
epochs = 60

scheduler = get_linear_schedule_with_warmup(optimizer, 
                                            num_warmup_steps=0,
                                            num_training_steps=len(dataloader_train)*epochs)

loss_fn = nn.CrossEntropyLoss().to(device)
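
For intuition about the scheduler, the sketch below reproduces the shape of a linear warmup followed by linear decay: the base learning rate is multiplied by a factor that rises from 0 to 1 during warmup and then falls linearly to 0 at the last step. The function name `linear_schedule_factor` is hypothetical, and the formula reflects my understanding of `get_linear_schedule_with_warmup` rather than a copy of the library code. The step count assumes 32,595 training rows, batch_size 8, and 60 epochs.

```python
import math

def linear_schedule_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))

base_lr = 2e-5
steps_per_epoch = math.ceil(32595 / 8)  # training rows / batch_size, rounded up
total_steps = steps_per_epoch * 60      # 60 epochs

for step in (0, total_steps // 2, total_steps):
    factor = linear_schedule_factor(step, 0, total_steps)
    print(f'step {step:>6}: lr = {base_lr * factor:.2e}')
```

With `num_warmup_steps=0` the learning rate starts at its full value and decays from the first step onward.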

<a id="Load"></a>
### Load pre-trained weight
  - [Return to table](#Table)

In [81]:
# model.load_state_dict(torch.load('./model_save/SST_5_fine_tuning.model'))

<a id="Run"></a>
### Running model
  - [Return to table](#Table)

In [82]:
import random

seed_val = 42
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)

def evaluate(dataloader_val):

    model.eval()
    predictions = []
    
    correct_predictions = 0
    losses = []
    
    for batch in dataloader_val:
        
        batch = tuple(b.to(device) for b in batch)
        
        inputs = {'input_ids':      batch[0],
                  'attention_mask': batch[1],
                 }

        with torch.no_grad():        
            outputs = model(**inputs)
            
        _, preds = torch.max(outputs, dim=1)
        predictions.append(preds)
        
        loss = loss_fn(outputs, batch[2])
        
        correct_predictions += torch.sum(preds == batch[2])
        losses.append(loss.item())
        
    return correct_predictions.double() / len(dataloader_val.dataset), np.mean(losses), predictions
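
`evaluate` reports accuracy only, although `f1_score` is imported at the top of the notebook. Since the label distribution is uneven, a macro-averaged F1 could complement accuracy. The sketch below is a plain-Python illustration of that metric with made-up labels; `macro_f1` is a hypothetical helper, and in the notebook itself one would more likely call `f1_score(y_true, y_pred, average='macro')` from scikit-learn.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f'accuracy: {accuracy:.4f}')
print(f'macro F1: {macro_f1(y_true, y_pred):.4f}')
```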

In [83]:
epochs = 60

for epoch in tqdm(range(1, epochs+1)):
    
    model.train()
    
    correct_predictions = 0
    losses = []


    progress_bar = tqdm(dataloader_train, desc='Epoch {:1d}'.format(epoch), leave=False, disable=False)
    for batch in progress_bar:


        model.zero_grad()
        
        batch = tuple(b.to(device) for b in batch)
        
        inputs = {'input_ids':      batch[0],
                  'attention_mask': batch[1],
                 }       

        outputs = model(**inputs)
        
        _, preds = torch.max(outputs, dim=1)
        loss = loss_fn(outputs, batch[2])
        
        correct_predictions += torch.sum(preds == batch[2])
        
        losses.append(loss.item())
        
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), 1.0)

        optimizer.step()
        scheduler.step()
        
        # CrossEntropyLoss already averages over the batch; len(batch) is the tuple length (3), not the batch size
        progress_bar.set_postfix({'training_loss': '{:.3f}'.format(loss.item())})
        
        optimizer.zero_grad()
         
    # To save the model parameters after every epoch, uncomment the line below.
    #torch.save(model.state_dict(), f'data_volume/finetuned_BERT_epoch_{epoch}.model')
        
    tqdm.write(f'\nEpoch {epoch}')
    
    loss_train_avg = np.mean(losses)            
    tqdm.write(f'Training loss: {loss_train_avg}')
    
    val_acc, val_loss, _ = evaluate(dataloader_validation)
    tqdm.write(f'Validation loss: {val_loss}')
    tqdm.write(f'Validation accuracy: {val_acc}')

| Epoch | Training loss | Validation loss | Validation accuracy |
|---|---|---|---|
| 1 | 1.3995310572173698 | 1.301764938720437 | 0.6002331002331003 |
| 2 | 1.28339090243439 | 1.2571220908054086 | 0.6456876456876457 |
| 3 | 1.2210308394110276 | 1.1984367439913195 | 0.7051282051282052 |
| 4 | 1.167189468474476 | 1.2348911914714547 | 0.6655011655011654 |
| 5 | 1.1335064834319741 | 1.1605064885560856 | 0.7406759906759907 |
| 6 | 1.1109823449403962 | 1.1503051181172215 | 0.7534965034965035 |
| 7 | 1.0917314385343915 | 1.1391311540160067 | 0.7651515151515151 |
| 8 | 1.0777266907545686 | 1.1155270437861597 | 0.7884615384615384 |
| 9 | 1.0748482527615835 | 1.120284086881682 | 0.782051282051282 |
| 10 | 1.0604623031323672 | 1.119757919810539 | 0.7837995337995338 |
| 11 | 1.0560709384760243 | 1.1141852093297382 | 0.7902097902097902 |
| 12 | 1.0481723447226308 | 1.0957206154978552 | 0.8082750582750583 |
| 13 | 1.0417083756177703 | 1.0830047191575516 | 0.8210955710955711 |
| 14 | 1.0361927900285077 | 1.0836311085279597 | 0.8199300699300699 |
| 15 | 1.034005517447653 | 1.0961807974549227 | 0.8088578088578089 |
| 16 | 1.0305885634831855 | 1.077976063240406 | 0.8245920745920746 |
| 17 | 1.025636733309623 | 1.0721703684607218 | 0.833916083916084 |
| 18 | 1.014456954865368 | 1.0644416318383327 | 0.8397435897435898 |
| 19 | 1.012535260411128 | 1.0653641484504521 | 0.8397435897435898 |
| 20 | 1.0108019427872874 | 1.0703695552293644 | 0.833916083916084 |
| 21 | 1.0100599035602407 | 1.0648669780686844 | 0.8397435897435898 |
| 22 | 1.0111349557075033 | 1.0623274972272474 | 0.8420745920745921 |
| 23 | 1.0035236997107055 | 1.0707950625308724 | 0.833916083916084 |
| 24 | 0.999907493825339 | 1.0696806048238001 | 0.8356643356643356 |
| 25 | 1.0026470622548296 | 1.0645733167958813 | 0.8409090909090909 |
| 26 | 0.995438851315551 | 1.0621311975079915 | 0.8409090909090909 |
| 27 | 0.9941167753166947 | 1.0618591419486112 | 0.8420745920745921 |
| 28 | 0.9929192385498 | 1.0545177268427472 | 0.8502331002331003 |
| 29 | 0.9912832790210935 | 1.0529110811477485 | 0.8519813519813519 |
| 30 | 0.9884480674427711 | 1.055931567868521 | 0.8479020979020979 |
| 31 | 0.9884850508157461 | 1.0736748304477959 | 0.8304195804195804 |
| 32 | 0.9877861681745096 | 1.036252805244091 | 0.8677156177156177 |
| 33 | 0.9822962973454247 | 1.0423751625903817 | 0.8618881118881119 |
| 34 | 0.9800187796756534 | 1.04613962201185 | 0.8583916083916084 |
| 35 | 0.9759613502537546 | 1.0526007466538008 | 0.8525641025641025 |
| 36 | 0.9755535488479707 | 1.0432740194852963 | 0.8613053613053613 |
| 37 | 0.9762645457712419 | 1.0398325931194217 | 0.8636363636363636 |
| 38 | 0.974932176745011 | 1.0378269409024439 | 0.8671328671328671 |
| 39 | 0.9751944753289954 | 1.0406097656072573 | 0.8642191142191142 |
| 40 | 0.9720527753800702 | 1.0328101682108501 | 0.8712121212121212 |
| 41 | 0.9719115902456038 | 1.0358671834302502 | 0.8677156177156177 |
| 42 | 0.9702987069440034 | 1.030691809709682 | 0.8735431235431236 |
| 43 | 0.9694513260806265 | 1.031729298414186 | 0.872960372960373 |
| 44 | 0.9676647763018228 | 1.0314684249633967 | 0.872960372960373 |
| 45 | 0.9665607948244715 | 1.0341705577318059 | 0.87004662004662 |
| 46 | 0.9671793953187626 | 1.0260930854220722 | 0.8776223776223776 |
| 47 | 0.9659180397343782 | 1.022293519696524 | 0.8822843822843823 |
| 48 | 0.9644106255572267 | 1.0220631444176962 | 0.8822843822843823 |
| 49 | 0.9644006553164289 | 1.0214083338892737 | 0.8828671328671329 |
| 50 | 0.9635057238859633 | 1.0265339005825131 | 0.8776223776223776 |
| 51 | 0.9625204997998805 | 1.0253319518510686 | 0.8793706293706294 |
| 52 | 0.9625144568250223 | 1.0239758053491281 | 0.8805361305361306 |
| 53 | 0.9613182574547141 | 1.022796571809192 | 0.8811188811188811 |
| 54 | 0.9611736802089433 | 1.0253071926360906 | 0.8787878787878788 |
| 55 | 0.9605420550978256 | 1.019024083503457 | 0.8851981351981352 |
| 56 | 0.9602242974269609 | 1.0189304252003515 | 0.8857808857808858 |
| 57 | 0.9596622216482104 | 1.0162017026612924 | 0.8886946386946387 |
| 58 | 0.9595074037832716 | 1.017732637427574 | 0.8869463869463869 |
| 59 | 0.9589327369584628 | 1.0131637955820838 | 0.8916083916083916 |
| 60 | 0.9588940210722707 | 1.0147852856059407 | 0.8898601398601399 |
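
The training loop calls `nn.utils.clip_grad_norm_(model.parameters(), 1.0)` before each optimizer step. As a rough sketch of the idea, the pure-Python function below rescales a flat list of gradient values so that their combined L2 norm does not exceed `max_norm`. The helper `clip_by_global_norm` and the example gradients are hypothetical, and the small epsilon mirrors what I understand PyTorch to use.

```python
import math

def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale all gradients by one factor so their joint L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm

# Hypothetical gradient values with an L2 norm of 5.0
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(f'norm before: {norm_before:.4f}, norm after: {norm_after:.4f}')

# Gradients already inside the limit are left untouched
unchanged, _ = clip_by_global_norm([0.3, 0.4], max_norm=1.0)
print(unchanged)
```

Because every value is multiplied by the same factor, the direction of the update is preserved and only its length is capped.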



<a id="Sub"></a>
### Test & Submission
  - [Return to table](#Table)

In [84]:
test_df = pd.read_csv('./data/eval_final_open.csv')
test_df["Category"] = np.zeros(test_df.shape[0])  # Placeholder labels so evaluate() can be reused

In [85]:
# Encode the test sentences
encoded_data_test = tokenizer.batch_encode_plus(
    test_df.Sentence.values,    # Sentence data
    add_special_tokens=True,    # Add the model's special tokens ([CLS], [SEP])
    return_attention_mask=True, # Return a mask that is 1 for real tokens and 0 for padding
    padding='longest',          # Pad every sentence to the longest one in this call
    #truncation=True,            
    return_tensors='pt'         # Return PyTorch tensors
)

input_ids_test = encoded_data_test['input_ids']
attention_masks_test = encoded_data_test['attention_mask']
labels_test = torch.tensor(test_df.Category.values, dtype=torch.long)  # Placeholder labels

dataset_test = TensorDataset(input_ids_test, attention_masks_test, labels_test)

# Keep the default sequential order so predictions line up with the submission rows
dataloader_test = DataLoader(dataset_test,
                             batch_size=batch_size)

In [86]:
_, _, predictions = evaluate(dataloader_test)
pred = torch.cat(predictions)

In [87]:
sub = pd.read_csv("./data/sample_sub.csv")
sub['Category'] = pred.cpu().numpy()

sub.to_csv("./data/210602_all_bs8.csv", index=False)