### 2. Fine-tuning with the Trainer API

In [1]:
# !pip install -q transformers datasets

Transformers provides a `Trainer` class that handles fine-tuning a pretrained model on your dataset.

In [2]:
# Setup: load the dataset, the tokenizer, and the data collator
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

raw_datasets = load_dataset('glue', 'mrpc')
ckpt = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(ckpt)

def tokenize_function(example):
    return tokenizer(
        example['sentence1'],
        example['sentence2'],
        truncation=True
    )

tokenized_datasets = raw_datasets.map(
    tokenize_function,
    batched=True
)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

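
The tokenizer call above does not pad, so padding is left to `DataCollatorWithPadding`, which pads each batch to the length of its longest sequence (dynamic padding). The following is a minimal pure-Python sketch of that idea, not the library's implementation. The helper name `pad_batch`, the token ids, and the pad id of 0 are assumptions made for illustration.

```python
# Illustrative sketch of dynamic padding; not the library's implementation.
def pad_batch(features, pad_token_id=0):
    """Pad every `input_ids` list to the longest sequence in this batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": []}
    for f in features:
        n_pad = max_len - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding positions
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
    return batch


# Hypothetical token ids, made up for this example
features = [
    {"input_ids": [101, 7592, 102]},
    {"input_ids": [101, 7592, 2088, 999, 102]},
]
batch = pad_batch(features)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Padding per batch rather than to a global maximum length keeps short batches small, which is why the tokenizer is called with `truncation=True` but without a `padding` argument.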

### Training

Before defining the `Trainer`, create a `TrainingArguments` instance, which holds all the hyperparameters the `Trainer` uses for training and evaluation.

In [3]:
# Specify only the directory where the trained model will be saved
from transformers import TrainingArguments

training_args = TrainingArguments("test-trainer")

In [4]:
# Define the model
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=2)


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [5]:
# Define the Trainer
from transformers import Trainer

trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    data_collator=data_collator,
    tokenizer=tokenizer,
)

In [6]:
trainer.train()

You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


| Step | Training Loss |
|------|---------------|
| 500  | 0.6282        |
| 1000 | 0.5389        |


TrainOutput(global_step=1377, training_loss=0.528146382250817, metrics={'train_runtime': 211.1727, 'train_samples_per_second': 52.109, 'train_steps_per_second': 6.521, 'total_flos': 405324636337200.0, 'train_loss': 0.528146382250817, 'epoch': 3.0})
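
The `global_step=1377` in the output above is consistent with the `TrainingArguments` defaults, since only the output directory was specified. The arithmetic below is a sketch under assumptions not stated in the notebook: 3,668 MRPC training examples, the default batch size of 8 per device, the default of 3 epochs, a single device, and no gradient accumulation.

```python
import math

# Assumptions (not stated in the notebook): 3,668 MRPC training examples,
# batch size 8 per device, 3 epochs, one device, no gradient accumulation.
num_train_examples = 3668
per_device_batch_size = 8
num_epochs = 3

# The last batch of each epoch is smaller, so round up.
steps_per_epoch = math.ceil(num_train_examples / per_device_batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 459 1377
```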

### Evaluation

Implement a `compute_metrics()` function.\
-> It takes an `EvalPrediction` object (a named tuple with `predictions` and `label_ids` fields).\
It returns a dictionary that maps strings to floats.

> String: the name of the returned metric\
> Float: the evaluation result for that metric
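
To make that contract concrete, here is a minimal NumPy-only sketch of a function with the same shape: it takes a `(logits, labels)` pair and returns a `{name: float}` dictionary. The function name `compute_metrics_sketch` and the toy arrays are made up for illustration, and the accuracy and F1 values are computed by hand rather than with the GLUE metric. A plain tuple stands in for the `EvalPrediction` object here.

```python
import numpy as np


def compute_metrics_sketch(eval_preds):
    """Return accuracy and F1 (positive class = 1) from (logits, labels)."""
    logits, labels = eval_preds
    preds = np.argmax(logits, axis=-1)  # logits -> predicted class ids
    labels = np.asarray(labels)

    tp = int(np.sum((preds == 1) & (labels == 1)))
    fp = int(np.sum((preds == 1) & (labels == 0)))
    fn = int(np.sum((preds == 0) & (labels == 1)))

    accuracy = float(np.mean(preds == labels))
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1}


# Toy logits and labels, made up for this example
logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]])
labels = np.array([0, 1, 0, 1])
result = compute_metrics_sketch((logits, labels))
print(result)  # {'accuracy': 0.5, 'f1': 0.5}
```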

Use `Trainer.predict()` to get the model's predictions.

In [8]:
predictions = trainer.predict(tokenized_datasets["validation"])
print(predictions.predictions.shape, predictions.label_ids.shape)

(408, 2) (408,)


Output of `predict()`: a named tuple with `predictions`, `label_ids`, and `metrics` fields.



In [10]:
# Convert the logits into predictions that can be compared with the labels
import numpy as np

preds = np.argmax(predictions.predictions, axis=-1)

In [11]:
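# Load the metric
# Note: `load_metric` is deprecated in later `datasets` releases in favor of `evaluate.load` from the `evaluate` library.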
from datasets import load_metric

metric = load_metric('glue', 'mrpc')
metric.compute(predictions=preds, references=predictions.label_ids)


{'accuracy': 0.8137254901960784, 'f1': 0.8597785977859778}

In [12]:
def compute_metrics(eval_preds):
    metric = load_metric('glue', 'mrpc')
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(
        predictions=predictions,
        references=labels
    )

In [13]:
# Report the metrics at the end of each epoch
training_args = TrainingArguments(
    'test-trainer',
    evaluation_strategy='epoch'
)

model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=2)

trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [14]:
trainer.train()



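| Epoch | Training Loss | Validation Loss | Accuracy | F1       |
|-------|---------------|-----------------|----------|----------|
| 1     | No log        | 0.7057          | 0.316176 | 0.0      |
| 2     | 0.648700      | 0.597894        | 0.740196 | 0.825658 |
| 3     | 0.580500      | 0.501086        | 0.791667 | 0.857143 |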


TrainOutput(global_step=1377, training_loss=0.5768346405583356, metrics={'train_runtime': 203.0997, 'train_samples_per_second': 54.18, 'train_steps_per_second': 6.78, 'total_flos': 405540469624800.0, 'train_loss': 0.5768346405583356, 'epoch': 3.0})
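
The epoch 1 row (accuracy 0.316176, F1 0.0) is what a model that predicts class 0 for every validation example would produce. The check below is a sketch that assumes the MRPC validation split contains 279 positive and 129 negative examples. That split is not stated in the notebook, although the total of 408 matches the shapes printed earlier.

```python
# Assumption (not stated in the notebook): the MRPC validation split has
# 279 positive (label 1) and 129 negative (label 0) examples.
labels = [1] * 279 + [0] * 129
preds = [0] * len(labels)  # a model that always predicts class 0

pairs = list(zip(preds, labels))
correct = sum(p == y for p, y in pairs)
tp = sum(p == 1 and y == 1 for p, y in pairs)
fp = sum(p == 1 and y == 0 for p, y in pairs)
fn = sum(p == 0 and y == 1 for p, y in pairs)

accuracy = correct / len(labels)
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0
print(round(accuracy, 6), f1)  # 0.316176 0.0
```

Under that assumption, the low accuracy reflects a model that had not yet learned to predict the positive class, which the later epochs correct.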