<a href="https://colab.research.google.com/github/ryuqae/EntityRelation/blob/main/%EB%8F%99%ED%98%95%EC%9D%B4%EC%9D%98%EC%96%B4_%ED%8C%90%EB%B3%84_koelectra.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Final Project: 2021 National Institute of Korean Language (NIKL) AI Language Proficiency Evaluation

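- The [2021 NIKL AI Language Proficiency Evaluation](https://corpus.korean.go.kr/task/taskList.do?taskId=1&clCd=END_TASK&subMenuId=sub01) is a language proficiency competition covering [four tasks](https://corpus.korean.go.kr/task/taskDownload.do?taskId=1&clCd=END_TASK&subMenuId=sub02); it opened on September 1 and closed on November 1.
- Carry out the tasks exactly as they were posed there, so that your results can be compared with the [final selected results](https://corpus.korean.go.kr/task/taskLeaderBoard.do?taskId=4&clCd=END_TASK&subMenuId=sub04).
- The test-set answers have not been officially released yet, so performance will be compared on the evaluation dataset included in the materials for the four tasks.
- If the answer set is released before the final presentations, performance will be verified against it.
- You may implement any approach you choose, such as Transformers-based methods or other neural networks.
- The competition period has ended and the data can no longer be downloaded, so refer to the attached materials.
- Work individually or in groups of at most two.
- Rename this notebook file, work in it, and submit it. Name the submitted file FinalProject_[DS or CL]_department_name.ipynb
- Deadline: Monday, December 6, 23:59.
- Final presentations are scheduled for December 7 and 9.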

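## Leaderboard

- Until the final presentation, each team must publish its per-task performance on the [leaderboard](https://docs.google.com/spreadsheets/d/1-uenfp5GolpY2Gf0TsFbODvj585IIiFKp9fvYxcfgkY/edit#gid=0) under its team name (including member names), **continuously adding the results of each method it tries**.
- On the final deadline, this ranking will be compared with the actual output of the submitted program to verify performance.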

In [None]:
!pip install transformers

from functools import partial
from tqdm.notebook import trange, tqdm

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from transformers import AutoTokenizer, ElectraPreTrainedModel, AutoConfig, AutoModel, RobertaPreTrainedModel
from transformers import ElectraModel, ElectraTokenizer, ElectraForSequenceClassification, ElectraConfig
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler, Dataset

from transformers import AdamW, get_linear_schedule_with_warmup

import torchtext
from torchtext.legacy.data import Field, TabularDataset, BucketIterator

import os
import numpy as np
import random
import math
import time
import re
import logging

print("\n-- Packages Version --")
# print(f"{'NLTK':<10} | {nltk.__version__:>7}")
print(f"{'PyTorch':<10} | {torch.__version__:>7}")
print(f"{'TorchText':<10} | {torchtext.__version__:>7}")

print("\n-- CUDA Availability --")
# Set and check available cuda device

GPU_NUM = 0
device = torch.device(f'cuda:{GPU_NUM}' if torch.cuda.is_available() else 'cpu')

if device.type=='cuda':
    # set_device only accepts CUDA devices, so call it inside the CUDA branch
    torch.cuda.set_device(device)
    print(f"{torch.cuda.get_device_properties(device).name}\nDevice Num: {torch.cuda.current_device()}")
else:
    print('Warning: cuda is not available')
    
    
def set_seed(SEED=2934):
    random.seed(SEED)
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


-- Packages Version --
PyTorch    | 1.10.0+cu111
TorchText  |  0.11.0

-- CUDA Availability --
Tesla P100-PCIE-16GB
Device Num: 0


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
from glob import glob

data_dir = glob('*.tsv')
print(data_dir)

['NIKL_SKT_WiC_Train.tsv', 'NIKL_SKT_WiC_Dev.tsv']


In [None]:
import pandas as pd

def load_data(dataset_dir, mode = 'train'):
    dataset = pd.read_csv(dataset_dir, delimiter='\t')
    if mode == 'test':
        dataset["ANSWER"] = [0] * len(dataset)
    dataset["ANSWER"] = dataset["ANSWER"].astype(int)
    return dataset

In [None]:
set_seed()

In [None]:
class HomonymDataset(Dataset):
    def __init__(self, tokenized_dataset, labels):
        self.tokenized_dataset = tokenized_dataset
        self.labels = labels
    
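    def __getitem__(self, idx):
        # The stored values are already tensors, so index them directly;
        # wrapping a tensor in torch.tensor() copies it and raises a UserWarning.
        item = {key: val[idx].clone().detach() for key, val in self.tokenized_dataset.items()}
        item['labels'] = torch.tensor(self.labels[idx], dtype=torch.long)
        return item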

    def __len__(self):
        return len(self.labels)

In [None]:
def convert_sentence_to_features(dataset, tokenizer, max_len):

    max_seq_len=max_len
    pad_token=tokenizer.pad_token_id
    add_sep_token=False
    mask_padding_with_zero=True
    
    all_input_ids = []
    all_attention_mask = []
    all_e1_mask=[]
    all_e2_mask=[]
    all_label=[]
    m_len=0
        
    err = 0

    for idx in tqdm(range(len(dataset))):
        sentence = '[CLS]' + dataset['SENTENCE1'][idx][:dataset['start_s1'][idx]] \
        + '<e1>' + dataset['SENTENCE1'][idx][dataset['start_s1'][idx]:dataset['end_s1'][idx]] \
        + '</e1>' + dataset['SENTENCE1'][idx][dataset['end_s1'][idx]:] + '[SEP]' + dataset['SENTENCE2'][idx][:dataset['start_s2'][idx]] \
        + ' <e2> ' + dataset['SENTENCE2'][idx][dataset['start_s2'][idx]:dataset['end_s2'][idx]] \
        + ' </e2> ' + dataset['SENTENCE2'][idx][dataset['end_s2'][idx]:] + '[SEP]'
        
        token = tokenizer.tokenize(sentence)
        m_len = max(m_len, len(token))
        e11_p = token.index("<e1>")  # the start position of entity1
        e12_p = token.index("</e1>")  # the end position of entity1
        e21_p = token.index("<e2>")  # the start position of entity2
        e22_p = token.index("</e2>")  # the end position of entity2

        token[e11_p] = "$"
        token[e12_p] = "$"
        token[e21_p] = "#"
        token[e22_p] = "#"

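        special_tokens_count = 1

        # Build features only for samples that fit within max_seq_len.
        # Longer samples are skipped (not truncated), so the returned dataset
        # can contain fewer rows than the input DataFrame.
        if len(token) < max_seq_len - special_tokens_count: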
            input_ids = tokenizer.convert_tokens_to_ids(token)
            attention_mask = [1 if mask_padding_with_zero else 0] * len(input_ids)

            padding_length = max_seq_len - len(input_ids)
            input_ids = input_ids + ([pad_token] * padding_length)
            attention_mask = attention_mask + ([0 if mask_padding_with_zero else 1] * padding_length)

            e1_mask = [0] * len(attention_mask)
            e2_mask = [0] * len(attention_mask)

            for i in range(e11_p, e12_p + 1):
                e1_mask[i] = 1
            
            for j in range(e21_p, e22_p + 1):
                e2_mask[j] = 1

            assert len(input_ids) == max_seq_len, "Error with input length {} vs {}".format(len(input_ids), max_seq_len)
            assert len(attention_mask) == max_seq_len, "Error with attention mask length {} vs {}".format(
                len(attention_mask), max_seq_len
            )

            all_input_ids.append(input_ids)
            all_attention_mask.append(attention_mask)
            all_e1_mask.append(e1_mask)
            all_e2_mask.append(e2_mask)
            all_label.append(dataset['ANSWER'][idx])

    all_features = {
        'input_ids' : torch.tensor(all_input_ids),
        'attention_mask' : torch.tensor(all_attention_mask),
        'e1_mask' : torch.tensor(all_e1_mask),
        'e2_mask' : torch.tensor(all_e2_mask)
    }

    return HomonymDataset(all_features, all_label)
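
The marker step in `convert_sentence_to_features` can be hard to read through the nested slicing. The following is a minimal sketch of the same idea in isolation, assuming that `start_*`/`end_*` are character offsets into each sentence (which is how the function slices them). The helper names `mark_span` and `build_pair` and the example sentences are hypothetical, and the sketch puts spaces around all four tags, whereas the original only does so for `<e2>`/`</e2>`.

```python
def mark_span(sentence, start, end, open_tag, close_tag):
    """Wrap sentence[start:end] (character offsets) in a pair of marker tags."""
    return (
        sentence[:start]
        + f" {open_tag} " + sentence[start:end] + f" {close_tag} "
        + sentence[end:]
    )


def build_pair(sentence1, span1, sentence2, span2):
    """Join two marked sentences in the [CLS] ... [SEP] ... [SEP] layout."""
    first = mark_span(sentence1, span1[0], span1[1], "<e1>", "</e1>")
    second = mark_span(sentence2, span2[0], span2[1], "<e2>", "</e2>")
    return "[CLS]" + first + "[SEP]" + second + "[SEP]"


# Hypothetical sentence pair (not taken from the NIKL data); the target word
# occupies characters 0..1 in both sentences.
text = build_pair("배가 아프다", (0, 1), "배를 타고 갔다", (0, 1))
print(text)
```

After tokenization, the original function replaces the four tags with `$` and `#`, so the tags only serve to locate the target word among the tokens.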


In [None]:
train_raw = load_data(data_dir[0])
print(train_raw.shape)

(7748, 9)


# Model

In [None]:
class FCLayer(nn.Module):
    def __init__(self, input_dim, output_dim, dropout=0.0, use_activation=True):
        super().__init__()
        self.use_activation = use_activation
        self.input_dim = input_dim
        self.output_dim = output_dim
        
        self.dropout = nn.Dropout(dropout)
        self.fc1 = nn.Linear(self.input_dim, self.output_dim)
        self.tanh = nn.Tanh()
        
    def forward(self, x):
        x = self.dropout(x)
        if self.use_activation:
            x = self.tanh(x)
        x = self.fc1(x)
        return x

In [None]:
class PoolingHead(nn.Module):

    def __init__(
            self,
            input_dim: int,
            inner_dim: int,
            pooler_dropout: float,
    ):
        super().__init__()
        self.dense = nn.Linear(input_dim, inner_dim)
        self.dropout = nn.Dropout(p=pooler_dropout)

    def forward(self, hidden_states: torch.Tensor):
        hidden_states = self.dropout(hidden_states)
        hidden_states = self.dense(hidden_states)
        hidden_states = torch.tanh(hidden_states)
        return hidden_states

class HomonymNet(ElectraPreTrainedModel):
    def __init__(self, config, dropout_rate):
        super().__init__(config)
        self.model = ElectraModel.from_pretrained("monologg/koelectra-base-v3-discriminator")

        self.num_labels = config.num_labels

        self.pooling = PoolingHead(input_dim=config.hidden_size,
                            inner_dim=config.hidden_size,
                            pooler_dropout=0.1)

        self.cls_fclayer = FCLayer(config.hidden_size, config.hidden_size, dropout_rate)
        self.e1_fclayer = FCLayer(config.hidden_size, config.hidden_size, dropout_rate)
        self.e2_fclayer = FCLayer(config.hidden_size, config.hidden_size, dropout_rate)
        self.label_classifier = FCLayer(
            config.hidden_size * 3, # concat cls, e1, e2 output
            config.num_labels,
            dropout_rate,
            use_activation=False,
        )
    
    @staticmethod
    def entity_hidden_average(hidden_output, entity_mask):

        unsq_entity_mask = entity_mask.unsqueeze(1)
        length_tensor = (entity_mask != 0).sum(dim=1).unsqueeze(1)

        entity_sum = torch.bmm(unsq_entity_mask.float(), hidden_output).squeeze(1)
        entity_average = entity_sum.float() / length_tensor.float()  # broadcasting

        return entity_average


    def forward(self, input_ids, attention_mask, e1_mask, e2_mask, labels):
        outputs = self.model(input_ids, attention_mask=attention_mask)
        sequence_output = outputs[0]
        pooled_output =  self.pooling(outputs[0][:, 0, :])

        sentence_representation = self.cls_fclayer(pooled_output)

        e1_hidden = self.entity_hidden_average(sequence_output, e1_mask)
        e2_hidden = self.entity_hidden_average(sequence_output, e2_mask)

        e1_hidden = self.e1_fclayer(e1_hidden)
        e2_hidden = self.e2_fclayer(e2_hidden)

        concat_hidden = torch.cat([sentence_representation, e1_hidden, e2_hidden], dim=-1)
        logits = self.label_classifier(concat_hidden)
        outputs = (logits,) + outputs[2:]

        loss_func = nn.CrossEntropyLoss()
        loss = loss_func(logits.view(-1, self.num_labels), labels.view(-1))
        outputs = (loss,) + outputs

        return outputs
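
`entity_hidden_average` computes, for each sequence, the mean of the hidden vectors at the positions where the entity mask is 1. The sketch below reproduces that arithmetic for a single sequence with plain Python lists, purely as an illustration; `masked_average` is a hypothetical name and the numbers are made up.

```python
def masked_average(hidden, mask):
    """Average the token vectors whose mask entry is non-zero.

    hidden: list of token vectors (seq_len x dim)
    mask:   list of 0/1 flags (seq_len)
    """
    count = sum(1 for m in mask if m != 0)  # number of entity tokens
    dim = len(hidden[0])
    # Sum only the masked positions, one dimension at a time
    summed = [sum(m * vec[d] for vec, m in zip(hidden, mask)) for d in range(dim)]
    return [s / count for s in summed]


# Four tokens with 2-dimensional hidden states; tokens 1 and 2 form the entity
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
mask = [0, 1, 1, 0]
print(masked_average(hidden, mask))
```

As in the batched version, an all-zero mask would lead to a division by zero, so every sample needs at least one marked token.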

# Train

In [None]:
num_train_epochs = 10
adam_learning_rate = 1e-5
adam_epsilon = 1e-8
weight_decay = 1e-2

# Defined here but never read by the training loop in this notebook
gradient_accumulation_steps = 2

max_len = 200
batch_size = 16
eval_batch_size = 16
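
`gradient_accumulation_steps` is set above, but the training loop in this notebook calls `optimizer.step()` after every batch, so the setting has no effect. For reference, here is a framework-free sketch of what accumulation would do: gradients from several micro-batches, each scaled by `1 / accumulation_steps`, add up to the gradient of one larger batch. The toy loss, data, and function names are assumptions made for illustration only.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size, accumulation_steps):
    """Sum micro-batch gradients, each scaled by 1 / accumulation_steps."""
    total = 0.0
    for k in range(accumulation_steps):
        lo = k * micro_batch_size
        hi = lo + micro_batch_size
        # Analogous to calling (loss / accumulation_steps).backward() per micro-batch
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accumulation_steps
    return total  # the optimizer would step once, using this gradient


xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, micro_batch_size=2, accumulation_steps=2)
print(full, accum)
```

With equally sized micro-batches the two values agree, which is why accumulation is commonly used to emulate a larger batch when GPU memory is limited.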

In [None]:
def compute_metrics(preds, labels):
    assert len(preds) == len(labels)
    return acc_and_f1(preds, labels)

def simple_accuracy(preds, labels):
    return (preds == labels).mean()

def acc_and_f1(preds, labels, average="macro"):
    acc = simple_accuracy(preds, labels)
    return {
        "acc": acc,
    }

def init_logger():
    logging.basicConfig(
        format="%(asctime)s - %(levelname)s - %(name)s -   %(message)s",
        datefmt="%m/%d/%Y %H:%M:%S",
        level=logging.INFO,
    )
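
Despite its name and its `average` parameter, `acc_and_f1` above returns accuracy only. If an F1 score were wanted as well, one way to compute it without extra dependencies is sketched below; `accuracy`, `f1_for_class`, and `acc_and_macro_f1` are hypothetical helpers, not part of the notebook, and they work on plain Python lists rather than NumPy arrays.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def f1_for_class(preds, labels, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(p == cls and y == cls for p, y in zip(preds, labels))
    fp = sum(p == cls and y != cls for p, y in zip(preds, labels))
    fn = sum(p != cls and y == cls for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def acc_and_macro_f1(preds, labels):
    """Accuracy plus the unweighted mean of the per-class F1 scores."""
    classes = sorted(set(preds) | set(labels))
    macro_f1 = sum(f1_for_class(preds, labels, c) for c in classes) / len(classes)
    return {"acc": accuracy(preds, labels), "f1": macro_f1}


scores = acc_and_macro_f1([1, 0, 1, 1], [1, 0, 0, 1])
print(scores)
```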

In [None]:
from transformers import ElectraTokenizer 

tokenizer = ElectraTokenizer.from_pretrained("monologg/koelectra-base-v3-discriminator")

ADDITIONAL_SPECIAL_TOKENS = ["<e1>", "</e1>", "<e2>", "</e2>"]
tokenizer.add_special_tokens({"additional_special_tokens": ADDITIONAL_SPECIAL_TOKENS})


config = ElectraConfig.from_pretrained(
    "monologg/koelectra-base-v3-discriminator",
    num_labels=2,
    summary='first'
)

model = HomonymNet(
    config=config,
    dropout_rate=0.1
)

model.to(device)

# Prepare optimizer and schedule (linear warmup and decay)
no_decay = ["bias", "LayerNorm.weight"]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
        "weight_decay": weight_decay,
    },
    {
        "params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = AdamW(
    optimizer_grouped_parameters,
    lr=adam_learning_rate,
    eps=adam_epsilon,
)


Some weights of the model checkpoint at monologg/koelectra-base-v3-discriminator were not used when initializing ElectraModel: ['discriminator_predictions.dense_prediction.bias', 'discriminator_predictions.dense.bias', 'discriminator_predictions.dense_prediction.weight', 'discriminator_predictions.dense.weight']
- This IS expected if you are initializing ElectraModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ElectraModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
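
The comment in the cell above mentions a schedule with linear warmup and decay. As a rough illustration of the multiplier that such a schedule applies to the base learning rate, here is a small sketch; `linear_schedule_lr` is a hypothetical helper, and the step counts assume 387 batches per epoch over 10 epochs, the fold-1 figures from the original run.

```python
def linear_schedule_lr(step, base_lr, num_training_steps, num_warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < num_warmup_steps:
        # Ramp up from 0 to base_lr during warmup
        return base_lr * step / max(1, num_warmup_steps)
    # Then decay linearly to 0 at num_training_steps
    remaining = num_training_steps - step
    return base_lr * max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


total_steps = 387 * 10  # batches per epoch * epochs (assumed, see above)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, base_lr=1e-5, num_training_steps=total_steps))
```

With zero warmup steps, as used later in this notebook, the rate starts at the base value, reaches half of it midway, and ends at zero.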


In [None]:
model.zero_grad()

In [None]:
test_raw = load_data("/content/NIKL_SKT_WiC_Dev.tsv")
test_dataset = convert_sentence_to_features(test_raw, tokenizer, max_len=max_len)

base_dir = '/content/drive/MyDrive/NLP_FinalProject/'

def save_checkpoint(model, model_dir):
    if not os.path.exists(model_dir):
        os.makedirs(model_dir)

    model_to_save = model.module if hasattr(model, 'module') else model
    model_to_save.save_pretrained(model_dir)


def evaluate(model, fold_id, best_score, mode='valid'):
    if mode=='valid':
        test_dataloader = DataLoader(valid_dataset, batch_size=eval_batch_size)
    elif mode=='test':
        test_dataloader = DataLoader(test_dataset, batch_size=eval_batch_size)

    predict = None
    eval_loss = 0.0
    nb_eval_steps = 0

    model.eval()

    for batch in tqdm(test_dataloader, desc="Evaluating"):
        batch = tuple(batch[t].to(device) for t in batch)
        with torch.no_grad():
            inputs = {
                "input_ids":batch[0],
                "attention_mask":batch[1],
                "e1_mask":batch[2],
                "e2_mask":batch[3],
                "labels":batch[4]
            }
            outputs = model(**inputs)
            tmp_eval_loss, logits = outputs[:2]
            eval_loss += tmp_eval_loss.mean().item()
        nb_eval_steps+=1

        if predict is None:
            predict = logits.detach().cpu().numpy()
            out_label_ids = inputs["labels"].detach().cpu().numpy()
        else:
            predict = np.append(predict, logits.detach().cpu().numpy(), axis=0)
            out_label_ids = np.append(out_label_ids, inputs["labels"].detach().cpu().numpy(), axis=0)
        # print(list(zip(predict_label, out_label_ids)))


    predict_label = np.argmax(predict, axis=1)
    result = compute_metrics(predict_label, out_label_ids)

    eval_loss = eval_loss / nb_eval_steps
    results = {"loss": eval_loss}
    results.update(result)

    if mode =='valid':
        if result['acc'] > best_score:
            save_checkpoint(model, f"{base_dir}koelectra_model_fold_{fold_id}")
            best_score = result['acc']
            print(f"Saved new best model - acc : {best_score}")

    return results, best_score

  0%|          | 0/1166 [00:00<?, ?it/s]

In [28]:
from sklearn.model_selection import KFold
kfold = KFold(n_splits=5, random_state=2934, shuffle=True)

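best_score=0

# NOTE: `model` and `optimizer` are created once, outside this loop, so their state
# carries over from fold to fold. From fold 2 on, the validation split therefore
# contains examples the model was already trained on in earlier folds.
for fold_id, (train_ids, valid_ids) in tqdm(enumerate(kfold.split(train_raw))):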

    train_dataset = convert_sentence_to_features(train_raw.iloc[train_ids].reset_index(), tokenizer, max_len=max_len)
    valid_dataset = convert_sentence_to_features(train_raw.iloc[valid_ids].reset_index(), tokenizer, max_len=max_len)

    train_dataloader = DataLoader(train_dataset, batch_size=batch_size)
    total_steps = len(train_dataloader) * num_train_epochs
    scheduler = get_linear_schedule_with_warmup(optimizer, 
                                                num_warmup_steps = 0,
                                                num_training_steps = total_steps)

    print(f"FOLD {fold_id+1} : Split train dataset to train vs valid")

    train_loss = 0.0
    fold_best_score = 0

    for epoch_step in tqdm(range(num_train_epochs), desc="Epoch"):
        for step, batch in enumerate(tqdm(train_dataloader, desc="Iteration")):
            model.train()
            batch = tuple(batch[t].to(device) for t in batch)
            inputs = {
                "input_ids":batch[0],
                "attention_mask":batch[1],
                "e1_mask":batch[2],
                "e2_mask":batch[3],
                "labels":batch[4]
            }

            outputs = model(**inputs)
            loss = outputs[0]
            loss.backward()

            train_loss += loss.item()

            optimizer.step()
            scheduler.step()
            model.zero_grad()

        valid_loss, fold_best_score = evaluate(model, fold_id+1, fold_best_score, 'valid')
        test_loss, _ = evaluate(model, fold_id+1, fold_best_score, 'test')
            
        print(f"============================= Epoch #{epoch_step+1} =============================")
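        # train_loss is the running sum of batch losses since the start of the fold,
        # not a per-epoch mean
        print(f" - train: {train_loss}")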
        print(f" - valid: {valid_loss}")
        print(f" - test : {test_loss}")
    
    print(f"FOLD {fold_id+1} Best Validation Score : {fold_best_score}")
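
Because `train_loss` is reset only at the start of each fold and is never divided by the number of batches, the `train` figure printed after each epoch is a cumulative sum. A per-epoch mean can be recovered from consecutive values, as in this sketch; `per_epoch_mean_loss` is a hypothetical helper, and the inputs are the first three fold-1 values from the original run together with its 387 batches per epoch.

```python
def per_epoch_mean_loss(cumulative_sums, batches_per_epoch):
    """Convert running sums of batch losses into per-epoch mean losses."""
    means = []
    previous = 0.0
    for total in cumulative_sums:
        # Loss accumulated during this epoch only, averaged over its batches
        means.append((total - previous) / batches_per_epoch)
        previous = total
    return means


fold1_cumulative = [190.1050538495183, 293.1095589734614, 355.8044137917459]
means = per_epoch_mean_loss(fold1_cumulative, batches_per_epoch=387)
print([round(m, 4) for m in means])
```

Read this way, the mean training loss in fold 1 falls from roughly 0.49 in the first epoch to roughly 0.16 in the third, which is easier to compare with the validation and test losses than the raw sums.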

0it [00:00, ?it/s]

  0%|          | 0/6198 [00:00<?, ?it/s]

  0%|          | 0/1550 [00:00<?, ?it/s]

FOLD 1 : Split train dataset to train vs valid


Epoch:   0%|          | 0/10 [00:00<?, ?it/s]

Iteration:   0%|          | 0/387 [00:00<?, ?it/s]


Evaluating:   0%|          | 0/97 [00:00<?, ?it/s]

Saved new best model - acc : 0.8341935483870968


Evaluating:   0%|          | 0/73 [00:00<?, ?it/s]

 - train: 190.1050538495183
 - valid: {'loss': 0.3997994090632065, 'acc': 0.8341935483870968}
 - test : {'loss': 0.4900895043799322, 'acc': 0.7881646655231561}


Saved new best model - acc : 0.8651612903225806


 - train: 293.1095589734614
 - valid: {'loss': 0.3542350521599202, 'acc': 0.8651612903225806}
 - test : {'loss': 0.4151490144686748, 'acc': 0.8293310463121784}


 - train: 355.8044137917459
 - valid: {'loss': 0.44158397352841405, 'acc': 0.8554838709677419}
 - test : {'loss': 0.5198714743967946, 'acc': 0.8284734133790738}


 - train: 394.0921055015642
 - valid: {'loss': 0.4530651018061896, 'acc': 0.8651612903225806}
 - test : {'loss': 0.5286105176217037, 'acc': 0.8344768439108061}


Saved new best model - acc : 0.8812903225806452


 - train: 421.84075508778915
 - valid: {'loss': 0.42564516398364427, 'acc': 0.8812903225806452}
 - test : {'loss': 0.4507631501469965, 'acc': 0.8679245283018868}


Saved new best model - acc : 0.8864516129032258


 - train: 441.5958075337112
 - valid: {'loss': 0.42550082938246353, 'acc': 0.8864516129032258}
 - test : {'loss': 0.41151682332426004, 'acc': 0.8919382504288165}


 - train: 453.982566683786
 - valid: {'loss': 0.49776422613679633, 'acc': 0.8787096774193548}
 - test : {'loss': 0.5278256257536681, 'acc': 0.8713550600343053}


 - train: 464.3557918301085
 - valid: {'loss': 0.5452974736618512, 'acc': 0.8722580645161291}
 - test : {'loss': 0.5950771703429469, 'acc': 0.8610634648370498}


 - train: 473.0281360488152
 - valid: {'loss': 0.5034848389098152, 'acc': 0.8819354838709678}
 - test : {'loss': 0.5328997501876996, 'acc': 0.8782161234991424}


 - train: 480.23655552748824
 - valid: {'loss': 0.5598387717396578, 'acc': 0.8716129032258064}
 - test : {'loss': 0.6147442587088372, 'acc': 0.8627787307032591}
FOLD 1 Best Validation Score : 0.8864516129032258


  0%|          | 0/6198 [00:00<?, ?it/s]

  0%|          | 0/1550 [00:00<?, ?it/s]

FOLD 2 : Split train dataset to train vs valid


Epoch:   0%|          | 0/10 [00:00<?, ?it/s]

Iteration:   0%|          | 0/388 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/97 [00:00<?, ?it/s]

Saved new best model - acc : 0.9935400516795866


Evaluating:   0%|          | 0/73 [00:00<?, ?it/s]

 - train: 54.18724581063725
 - valid: {'loss': 0.037554850465781296, 'acc': 0.9935400516795866}
 - test : {'loss': 0.35422402344746134, 'acc': 0.8679245283018868}


Saved new best model - acc : 0.9954780361757106


 - train: 79.6819996816339
 - valid: {'loss': 0.022847088972543433, 'acc': 0.9954780361757106}
 - test : {'loss': 0.32240538831085785, 'acc': 0.8953687821612349}


 - train: 90.07443934306502
 - valid: {'loss': 0.027332916995330435, 'acc': 0.9909560723514211}
 - test : {'loss': 0.49189350399753234, 'acc': 0.8730703259005146}


 - train: 98.15233149504638
 - valid: {'loss': 0.02155083080603064, 'acc': 0.9948320413436692}
 - test : {'loss': 0.4452369383070618, 'acc': 0.8842195540308748}


 - train: 105.40619463851908
 - valid: {'loss': 0.044338687248023936, 'acc': 0.9844961240310077}
 - test : {'loss': 0.6247581952536153, 'acc': 0.8610634648370498}


 - train: 109.47015169775113
 - valid: {'loss': 0.036963592816147826, 'acc': 0.9890180878552972}
 - test : {'loss': 0.6144578431000174, 'acc': 0.8730703259005146}


 - train: 112.53606226829288
 - valid: {'loss': 0.061890124779067975, 'acc': 0.9799741602067183}
 - test : {'loss': 0.7655558636735356, 'acc': 0.8533447684391081}


 - train: 114.66565205455117
 - valid: {'loss': 0.053356668527823746, 'acc': 0.9832041343669251}
 - test : {'loss': 0.7635610139984972, 'acc': 0.8524871355060034}


 - train: 115.90100768830598
 - valid: {'loss': 0.038683591114978794, 'acc': 0.9864341085271318}
 - test : {'loss': 0.6937107243001551, 'acc': 0.8662092624356775}


 - train: 117.86493162233091
 - valid: {'loss': 0.03253164124034061, 'acc': 0.9909560723514211}
 - test : {'loss': 0.656791435232787, 'acc': 0.8730703259005146}
FOLD 2 Best Validation Score : 0.9954780361757106


  0%|          | 0/6198 [00:00<?, ?it/s]

  0%|          | 0/1550 [00:00<?, ?it/s]

FOLD 3 : Split train dataset to train vs valid


Epoch:   0%|          | 0/10 [00:00<?, ?it/s]

Iteration:   0%|          | 0/387 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/97 [00:00<?, ?it/s]

Saved new best model - acc : 0.9980632666236281


Evaluating:   0%|          | 0/73 [00:00<?, ?it/s]

 - train: 11.94972162549675
 - valid: {'loss': 0.00801820322641978, 'acc': 0.9980632666236281}
 - test : {'loss': 0.6102490614212999, 'acc': 0.8593481989708405}


 - train: 21.069926056516124
 - valid: {'loss': 0.006707876287616751, 'acc': 0.9967721110393802}
 - test : {'loss': 0.6047495281192624, 'acc': 0.8704974271012007}


Saved new best model - acc : 1.0


 - train: 26.46004373296455
 - valid: {'loss': 0.0009283980220549058, 'acc': 1.0}
 - test : {'loss': 0.5442466481178337, 'acc': 0.8825042881646655}


 - train: 29.152209428582864
 - valid: {'loss': 0.01360929214633964, 'acc': 0.9948353776630084}
 - test : {'loss': 0.8128709166939351, 'acc': 0.8542024013722127}


 - train: 32.82773540441849
 - valid: {'loss': 0.008480466092636954, 'acc': 0.9967721110393802}
 - test : {'loss': 0.6995483996699028, 'acc': 0.8636363636363636}


 - train: 35.74064215863473
 - valid: {'loss': 0.009328013983436323, 'acc': 0.9974176888315042}
 - test : {'loss': 0.6920105581799137, 'acc': 0.8687821612349914}


 - train: 37.09094847610686
 - valid: {'loss': 0.00624939369939141, 'acc': 0.9980632666236281}
 - test : {'loss': 0.6379100056685822, 'acc': 0.8867924528301887}


 - train: 37.52278055588613
 - valid: {'loss': 0.006007703852484371, 'acc': 0.9974176888315042}
 - test : {'loss': 0.663341563133917, 'acc': 0.8842195540308748}


 - train: 40.21992619670709
 - valid: {'loss': 0.005848116295290067, 'acc': 0.9987088444157521}
 - test : {'loss': 0.6477450121784722, 'acc': 0.888507718696398}


 - train: 41.115743485039275
 - valid: {'loss': 0.006855724382083131, 'acc': 0.9980632666236281}
 - test : {'loss': 0.6810019901396329, 'acc': 0.8790737564322469}
FOLD 3 Best Validation Score : 1.0


  0%|          | 0/6199 [00:00<?, ?it/s]

  0%|          | 0/1549 [00:00<?, ?it/s]

FOLD 4 : Split train dataset to train vs valid


Epoch:   0%|          | 0/10 [00:00<?, ?it/s]

Iteration:   0%|          | 0/388 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/97 [00:00<?, ?it/s]

Saved new best model - acc : 0.9980607627666451


Evaluating:   0%|          | 0/73 [00:00<?, ?it/s]

 - train: 7.333159751739004
 - valid: {'loss': 0.004500361582199496, 'acc': 0.9980607627666451}
 - test : {'loss': 0.4665362536703868, 'acc': 0.8893653516295026}


Saved new best model - acc : 1.0


 - train: 11.220775224326644
 - valid: {'loss': 0.000617753906873386, 'acc': 1.0}
 - test : {'loss': 0.5608971688175998, 'acc': 0.9030874785591767}


 - train: 15.614096068384242
 - valid: {'loss': 0.01766694032859547, 'acc': 0.9922430510665805}
 - test : {'loss': 0.69783630176124, 'acc': 0.8516295025728988}


 - train: 19.400528147518344
 - valid: {'loss': 0.0039048546427271577, 'acc': 0.9987071751777634}
 - test : {'loss': 0.542392175623805, 'acc': 0.8893653516295026}


 - train: 20.805751351632352
 - valid: {'loss': 0.0004925315102603414, 'acc': 1.0}
 - test : {'loss': 0.5568243556503746, 'acc': 0.8962264150943396}


 - train: 23.19069979204869
 - valid: {'loss': 0.002392899424166055, 'acc': 0.9980607627666451}
 - test : {'loss': 0.5983037863183672, 'acc': 0.885934819897084}


 - train: 24.27836155417026
 - valid: {'loss': 0.0020641186700831706, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6185919802165663, 'acc': 0.8953687821612349}


 - train: 25.208274552820512
 - valid: {'loss': 0.0022624256114458026, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6332386363895127, 'acc': 0.8867924528301887}


 - train: 25.901749210123853
 - valid: {'loss': 0.0009559399136331607, 'acc': 1.0}
 - test : {'loss': 0.614395887960698, 'acc': 0.8919382504288165}


 - train: 26.609372508007255
 - valid: {'loss': 0.0018963492322148611, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6511621782834008, 'acc': 0.8842195540308748}
FOLD 4 Best Validation Score : 1.0


  0%|          | 0/6199 [00:00<?, ?it/s]

  0%|          | 0/1549 [00:00<?, ?it/s]

FOLD 5 : Split train dataset to train vs valid


Epoch:   0%|          | 0/10 [00:00<?, ?it/s]

Iteration:   0%|          | 0/388 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/97 [00:00<?, ?it/s]

Saved new best model - acc : 0.9987071751777634


Evaluating:   0%|          | 0/73 [00:00<?, ?it/s]

 - train: 5.716600590036251
 - valid: {'loss': 0.005598711542484828, 'acc': 0.9987071751777634}
 - test : {'loss': 0.608517388130952, 'acc': 0.8833619210977701}


 - train: 9.159120770807931
 - valid: {'loss': 0.010697953435344071, 'acc': 0.9980607627666451}
 - test : {'loss': 0.5716136725389594, 'acc': 0.8876500857632933}


 - train: 12.05953185231374
 - valid: {'loss': 0.005735242127933085, 'acc': 0.9987071751777634}
 - test : {'loss': 0.5972467812548842, 'acc': 0.8910806174957119}


 - train: 15.205012277545393
 - valid: {'loss': 0.020936541958493312, 'acc': 0.9941822882999354}
 - test : {'loss': 0.7643863307712807, 'acc': 0.8679245283018868}


 - train: 16.53230634934698
 - valid: {'loss': 0.0036208158064780327, 'acc': 0.9987071751777634}
 - test : {'loss': 0.5864069817349471, 'acc': 0.8996569468267581}


 - train: 17.689875358697464
 - valid: {'loss': 0.006740198299671788, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6681683011498685, 'acc': 0.8927958833619211}


 - train: 18.99836822117686
 - valid: {'loss': 0.0054401790963292465, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6286663676815706, 'acc': 0.8953687821612349}


 - train: 20.036514260223157
 - valid: {'loss': 0.004766767672166323, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6505899570295585, 'acc': 0.8962264150943396}


 - train: 20.41840541779584
 - valid: {'loss': 0.005889267928373603, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6666612590604628, 'acc': 0.8945111492281304}


 - train: 20.761922729238904
 - valid: {'loss': 0.006557132080836043, 'acc': 0.9987071751777634}
 - test : {'loss': 0.6855888951079216, 'acc': 0.8945111492281304}
FOLD 5 Best Validation Score : 0.9987071751777634
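
In the logs above, validation accuracy stays between 0.83 and 0.89 in fold 1 but is 0.98 or higher in every epoch of folds 2 to 5, while test accuracy remains between roughly 0.85 and 0.90 throughout. A likely explanation is that the same `model` object is trained across all folds, so each later validation split has already been seen during earlier folds. The toy sketch below illustrates that effect without any training; `ToyModel`, `kfold_indices`, and `run_cv` are hypothetical stand-ins, not the notebook's code.

```python
import random


def kfold_indices(n, k, seed):
    """Yield (train_ids, valid_ids) for a shuffled k-fold split of range(n)."""
    ids = list(range(n))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


class ToyModel:
    """Stand-in for a model: it only remembers which example ids it trained on."""

    def __init__(self):
        self.seen = set()

    def fit(self, ids):
        self.seen.update(ids)


def run_cv(n, k, seed, fresh_model_per_fold):
    """Return, per fold, the fraction of validation ids the model has trained on."""
    model = ToyModel()
    leaked = []
    for train_ids, valid_ids in kfold_indices(n, k, seed):
        if fresh_model_per_fold:
            model = ToyModel()  # re-create the model (and its optimizer) every fold
        model.fit(train_ids)
        leaked.append(sum(i in model.seen for i in valid_ids) / len(valid_ids))
    return leaked


print(run_cv(100, 5, seed=0, fresh_model_per_fold=False))
print(run_cv(100, 5, seed=0, fresh_model_per_fold=True))
```

With a shared model, every validation example from fold 2 onward has already been used for training; with a fresh model per fold, none has. In the notebook this would correspond to constructing `HomonymNet` and the optimizer inside the fold loop, which is a suggestion rather than something the original run did.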
