# Trainer API hyperparameter search
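Variables that are determined inside the model are called parameters; the model's weights are one example.
The model determines its parameters itself as it learns from the data.

Values that the designer sets by hand when defining the model are called hyperparameters; the learning rate is one example.
There is no fixed rule for choosing hyperparameters, so the best values vary with the model, the data, and many other factors.
Several methods exist for finding hyperparameters that optimize model performance in this setting, including grid search, random search, and Bayesian optimization.
Implementing these methods yourself, however, takes time and effort.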
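
As a rough illustration of what grid search and random search do, the sketch below tunes a single hyperparameter against a made-up objective. `toy_objective`, the candidate values, and the search range are all assumptions chosen for illustration; in practice the objective would be a full training run followed by evaluation.

```python
import math
import random


def toy_objective(learning_rate):
    """Hypothetical stand-in for 'train, then return validation accuracy'.

    The score peaks at learning_rate = 3e-5 and falls off on a log scale.
    """
    return 1.0 - abs(math.log10(learning_rate) - math.log10(3e-5))


def grid_search(candidates):
    """Evaluate every candidate and return the best (value, score) pair."""
    scored = [(lr, toy_objective(lr)) for lr in candidates]
    return max(scored, key=lambda pair: pair[1])


def random_search(n_trials, low, high, seed=0):
    """Sample learning rates log-uniformly in [low, high] and keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(math.log10(low), math.log10(high))
        score = toy_objective(lr)
        if best is None or score > best[1]:
            best = (lr, score)
    return best


grid_best = grid_search([1e-5, 3e-5, 1e-4])
random_best = random_search(n_trials=20, low=1e-6, high=1e-4)
print(grid_best, random_best)
```

Grid search only ever tries the values it is given, while random search can land anywhere in the range; both need one full evaluation per candidate, which is why the number of trials matters so much when each evaluation is a training run.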

With the Trainer from the transformers library, you can search for hyperparameters easily.
Calling the `hyperparameter_search()` method of `Trainer` is enough to search for the hyperparameters that optimize model performance.

In [1]:
import os
import warnings

import numpy as np
import pandas as pd
import datasets
from datasets import load_metric
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer

from sklearn.model_selection import train_test_split

import torch
from torch.utils.data import Dataset, DataLoader

warnings.filterwarnings(action='ignore')

This notebook uses the klue/bert-base model.

In [2]:
model_checkpoint = "klue/bert-base"
batch_size = 32
task = "nli"
RANDOM_SEED = 17

In [3]:
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)

In [4]:
dataset = pd.read_csv("data/train_data.csv")
test = pd.read_csv("data/test_data.csv")

In [5]:
dataset_train, dataset_val = train_test_split(dataset, test_size=0.2, random_state=RANDOM_SEED)

In [6]:
class TrainDataset(Dataset):
    def __init__(self, dataset, sent_key, label_key, bert_tokenizer):
        self.sentences = [ bert_tokenizer(i,truncation=True,return_token_type_ids=False) for i in dataset[sent_key] ]
        self.labels = [np.int64(i) for i in dataset[label_key]]


    def __getitem__(self, i):
        self.sentences[i]["label"] = self.labels[i]
        return self.sentences[i]


    def __len__(self):
        return len(self.sentences)
    
class TestDataset(Dataset):
    def __init__(self, dataset, sent_key, bert_tokenizer):
        self.sentences = [ bert_tokenizer(i,truncation=True,return_token_type_ids=False) for i in dataset[sent_key] ]
        
    def __getitem__(self, i):
        return self.sentences[i]
    
    def __len__(self):
        return len(self.sentences)


In [7]:
data_train = TrainDataset(dataset_train, "title", "topic_idx", tokenizer)
data_validation = TrainDataset(dataset_val, "title", "topic_idx", tokenizer)
data_test = TestDataset(test, "title", tokenizer)

Load the metric used to measure model performance.
Because the task is text classification, we borrow the metric of a similar task, GLUE QNLI, which reports accuracy.

In [8]:
metric = load_metric("glue", "qnli")

In [9]:
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return metric.compute(predictions=predictions, references=labels)
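
To make explicit what `compute_metrics` does with the model output, here is a minimal plain-Python sketch of the same two steps: take the argmax of each row of logits, then compare against the labels. The helper names and the sample logits are made up for illustration; the notebook itself relies on `np.argmax` and the loaded metric.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy_from_logits(logits, labels):
    """Turn per-class scores into predicted labels, then compute accuracy."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Four hypothetical examples with three classes each.
logits = [
    [2.0, 0.1, -1.0],
    [0.2, 0.3, 1.5],
    [0.9, 1.1, 0.0],
    [3.0, -2.0, 0.5],
]
labels = [0, 2, 0, 0]
result = accuracy_from_logits(logits, labels)
print(result)
```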

In [10]:
metric_name = "accuracy"

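args = TrainingArguments(
    "test-nli",                       # output directory for checkpoints
    evaluation_strategy="epoch",      # evaluate at the end of every epoch
    # Newer transformers releases require the save strategy to match the
    # evaluation strategy when load_best_model_at_end=True.
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=3,
    weight_decay=0.01,
    load_best_model_at_end=True,      # reload the best checkpoint after training
    metric_for_best_model=metric_name,
)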

The Trainer's `hyperparameter_search()` repeats training several times to find the best hyperparameters.
Because the model's parameters must be reinitialized for every training run, the Trainer needs a function that creates a new model.
We define a `model_init()` function and pass it to the Trainer.


In [11]:
num_labels = 7

def model_init():
    return AutoModelForSequenceClassification.from_pretrained(model_checkpoint, num_labels=num_labels)
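
The reason for passing a factory rather than a model instance can be seen in a toy example. The sketch below is not the Trainer's implementation; `ToyModel`, `toy_model_init`, and both loops are hypothetical and only illustrate how reusing one instance lets state leak from one trial into the next.

```python
class ToyModel:
    """Hypothetical model with a single weight that starts at zero."""

    def __init__(self):
        self.weight = 0.0

    def train_step(self, learning_rate):
        self.weight += learning_rate


def toy_model_init():
    """Factory: every call returns a freshly initialized model."""
    return ToyModel()


def run_trials_fresh(model_factory, learning_rates, steps=3):
    """Each trial starts from a new model, so trials are independent."""
    final_weights = []
    for lr in learning_rates:
        model = model_factory()
        for _ in range(steps):
            model.train_step(lr)
        final_weights.append(model.weight)
    return final_weights


def run_trials_shared(model, learning_rates, steps=3):
    """Reusing one model: the second trial continues from the first."""
    final_weights = []
    for lr in learning_rates:
        for _ in range(steps):
            model.train_step(lr)
        final_weights.append(model.weight)
    return final_weights


fresh = run_trials_fresh(toy_model_init, [0.1, 0.2])
shared = run_trials_shared(ToyModel(), [0.1, 0.2])
print(fresh, shared)
```

With fresh models the two trials end at roughly 0.3 and 0.6; with a shared model the second trial ends at roughly 0.9, because it inherited the first trial's weight.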

In [13]:
trainer = Trainer(
    model_init=model_init,
    args=args,
    train_dataset=data_train,
    eval_dataset=data_validation,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

loading configuration file https://huggingface.co/klue/bert-base/resolve/main/config.json from cache at C:\Users\or7l0/.cache\huggingface\transformers\fbd0b2ef898c4653902683fea8cc0dd99bf43f0e082645b913cda3b92429d1bb.7cee10e8ea7ffa278f8be4b141000263f2b18795e5ef5e025352b2af6851f8fb
Model config BertConfig {
  "architectures": [
    "BertForPretraining"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5",
    "6": "LABEL_6"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3,
    "LABEL_4": 4,
    "LABEL_5": 5,
    "LABEL_6": 6
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_

To run the hyperparameter search, transformers relies on a backend library such as `optuna` or Ray Tune (`ray[tune]`); one of them must be installed.
Install it with pip.

In [14]:
# !pip install optuna
# !pip install ray[tune]

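The `n_trials` argument of `hyperparameter_search()` controls how many trials the search runs. Each trial samples hyperparameter values from a search space, so this is not an exhaustive grid search.
`hyperparameter_search()` trains `n_trials` times while varying the hyperparameters, then returns the hyperparameters that performed best.
Because training takes a long time, the search here ran only 2 trials.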
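
The search space can also be narrowed by passing an `hp_space` function to `hyperparameter_search()`. The sketch below assumes the `optuna` backend, where the function receives an optuna trial object; the name `my_hp_space` and the specific ranges are hypothetical choices, not values from this notebook.

```python
def my_hp_space(trial):
    """Hypothetical search space; `trial` is the optuna trial passed in by the Trainer."""
    return {
        # Log-uniform sampling suits learning rates, which span orders of magnitude.
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 2, 4),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [16, 32]
        ),
    }


# Assumed usage with the `trainer` defined earlier in this notebook:
# best_run = trainer.hyperparameter_search(
#     hp_space=my_hp_space, n_trials=2, direction="maximize", backend="optuna"
# )
```

The dictionary keys must be names of `TrainingArguments` fields, since the sampled values are applied to the trainer's arguments for each trial.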

In [15]:
best_run = trainer.hyperparameter_search(n_trials=2, direction="maximize")

[I 2021-08-15 22:21:41,466] A new study created in memory with name: no-name-23d8a293-af53-4323-b728-069f3431ca44
Trial:

| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | 0.3577 | 0.336874 | 0.885664 |
| 2 | 0.2378 | 0.3558 | 0.884679 |
| 3 | 0.1347 | 0.439973 | 0.87975 |
| 4 | 0.0673 | 0.53374 | 0.87986 |


***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-0\checkpoint-1142
Configuration saved in test-nli\run-0\checkpoint-1142\config.json
Model weights saved in test-nli\run-0\checkpoint-1142\pytorch_model.bin
tokenizer config file saved in test-nli\run-0\checkpoint-1142\tokenizer_config.json
Special tokens file saved in test-nli\run-0\checkpoint-1142\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-0\checkpoint-2284
Configuration saved in test-nli\run-0\checkpoint-2284\config.json
Model weights saved in test-nli\run-0\checkpoint-2284\pytorch_model.bin
tokenizer config file saved in test-nli\run-0\checkpoint-2284\tokenizer_config.json
Special tokens file saved in test-nli\run-0\checkpoint-2284\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-0\checkpoint-3426
C

| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | 0.7723 | 0.671673 | 0.847552 |
| 2 | 0.6234 | 0.750209 | 0.86168 |
| 3 | 0.4874 | 0.698994 | 0.871098 |

***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-1\checkpoint-9131
Configuration saved in test-nli\run-1\checkpoint-9131\config.json
Model weights saved in test-nli\run-1\checkpoint-9131\pytorch_model.bin
tokenizer config file saved in test-nli\run-1\checkpoint-9131\tokenizer_config.json
Special tokens file saved in test-nli\run-1\checkpoint-9131\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-1\checkpoint-18262
Configuration saved in test-nli\run-1\checkpoint-18262\config.json
Model weights saved in test-nli\run-1\checkpoint-18262\pytorch_model.bin
tokenizer config file saved in test-nli\run-1\checkpoint-18262\tokenizer_config.json
Special tokens file saved in test-nli\run-1\checkpoint-18262\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\run-1\checkpoint-2

Check the hyperparameters that performed best.

In [16]:
best_run

BestRun(run_id='0', objective=0.8798598182017303, hyperparameters={'learning_rate': 4.5352366623370486e-05, 'num_train_epochs': 4, 'seed': 35, 'per_device_train_batch_size': 32})
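
Note that the reported `objective` (0.8798...) matches the epoch-4 accuracy of run 0 in the training log, that is, the last evaluation of that trial rather than its best epoch (0.885664 at epoch 1). `hyperparameter_search()` also accepts a `compute_objective` callable that maps the evaluation metrics to the number being optimized. A minimal sketch follows; `accuracy_objective` is a hypothetical name, and the sample metrics are copied from the epoch-4 row of the log.

```python
def accuracy_objective(metrics):
    """Return the value the search should maximize, given the evaluation metrics dict."""
    return metrics["eval_accuracy"]


# Metrics shaped like the Trainer's evaluation output (values from epoch 4 of run 0).
example_metrics = {"eval_loss": 0.53374, "eval_accuracy": 0.87986, "epoch": 4.0}
print(accuracy_objective(example_metrics))

# Assumed usage with the `trainer` defined earlier in this notebook:
# best_run = trainer.hyperparameter_search(
#     compute_objective=accuracy_objective, n_trials=2, direction="maximize"
# )
```

Stating the objective explicitly keeps the search tied to one metric even if `compute_metrics` later returns several.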

Set the trainer to the best-performing hyperparameters, then train.

In [17]:
for n, v in best_run.hyperparameters.items():
    setattr(trainer.args, n, v)

trainer.train()

| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | 0.3577 | 0.336874 | 0.885664 |
| 2 | 0.2378 | 0.3558 | 0.884679 |
| 3 | 0.1347 | 0.439973 | 0.87975 |
| 4 | 0.0673 | 0.53374 | 0.87986 |


***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\checkpoint-1142
Configuration saved in test-nli\checkpoint-1142\config.json
Model weights saved in test-nli\checkpoint-1142\pytorch_model.bin
tokenizer config file saved in test-nli\checkpoint-1142\tokenizer_config.json
Special tokens file saved in test-nli\checkpoint-1142\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\checkpoint-2284
Configuration saved in test-nli\checkpoint-2284\config.json
Model weights saved in test-nli\checkpoint-2284\pytorch_model.bin
tokenizer config file saved in test-nli\checkpoint-2284\tokenizer_config.json
Special tokens file saved in test-nli\checkpoint-2284\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32
Saving model checkpoint to test-nli\checkpoint-3426
Configuration saved in test-nli\checkpoint-3426\config.json
Model w

TrainOutput(global_step=4568, training_loss=0.2147540268463763, metrics={'train_runtime': 486.3823, 'train_samples_per_second': 300.365, 'train_steps_per_second': 9.392, 'total_flos': 2123068942884006.0, 'train_loss': 0.2147540268463763, 'epoch': 4.0})
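
The `setattr` loop used before `trainer.train()` silently creates a new attribute if a hyperparameter name is misspelled. A slightly more defensive variant is sketched below. `ToyArgs` is a hypothetical stand-in for `TrainingArguments` so that the example runs on its own, and `apply_hyperparameters` is an assumed helper name.

```python
from dataclasses import dataclass


@dataclass
class ToyArgs:
    """Hypothetical stand-in for TrainingArguments with only the tuned fields."""

    learning_rate: float = 2e-5
    num_train_epochs: int = 3
    seed: int = 42
    per_device_train_batch_size: int = 32


def apply_hyperparameters(args, hyperparameters):
    """Copy each hyperparameter onto args, rejecting names args does not have."""
    for name, value in hyperparameters.items():
        if not hasattr(args, name):
            raise AttributeError(f"unknown hyperparameter: {name}")
        setattr(args, name, value)
    return args


# Values taken from the best_run output shown earlier.
best_hyperparameters = {
    "learning_rate": 4.5352366623370486e-05,
    "num_train_epochs": 4,
    "seed": 35,
    "per_device_train_batch_size": 32,
}
toy_args = apply_hyperparameters(ToyArgs(), best_hyperparameters)
print(toy_args)
```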

In [18]:
trainer.evaluate()

***** Running Evaluation *****
  Num examples = 9131
  Batch size = 32


{'eval_loss': 0.3368736207485199,
 'eval_accuracy': 0.8856642207863322,
 'eval_runtime': 7.5415,
 'eval_samples_per_second': 1210.761,
 'eval_steps_per_second': 37.923,
 'epoch': 4.0}

In [19]:
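# predict() returns the raw logits in its `predictions` field
logits = trainer.predict(data_test).predictions
# Pick the highest-scoring class for each example
pred = np.argmax(logits, axis=1)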

***** Running Prediction *****
  Num examples = 9131
  Batch size = 32


In [20]:
submission = pd.read_csv('data/sample_submission.csv')
submission['topic_idx'] = pred
os.makedirs("results", exist_ok=True)  # to_csv does not create missing directories
submission.to_csv("results/klue-bert-base-0810.csv", index=False)

References
transformers official documentation: How to fine-tune a model on text classification
https://github.com/huggingface/notebooks/blob/master/examples/text_classification.ipynb

