# [N21] Sentence Classification 모델 학습 (w/KLUE YNAT 데이터셋)


## N21 실습
> 실제로 N21 Problem 에 해당하는 Topicl Classification 을 해결해보면서 N21 문제에 대한 이해도를 높이는 실습입니다.
## 데이터 설명
```
- KLUE: 한국어 NLU(Natural Language Understanding) 데이터셋 벤치마크
 - https://github.com/KLUE-benchmark/KLUE
 * 본 데이터는 Creative Commons Attribution-ShareAlike 4.0 International License(CC BY-SA 4.0)를 따릅니다.
```
> 이번 실습에는 Korean Language Understanding Evaluation(KLUE)의 Yonhap News Agency Topic Classificaiton(YNAT) 데이터셋을 사용합니다. 이 데이터셋에는 2가지 데이터가 존재합니다.
- title : 기사 제목
- label : 기사 주제
## 모델링 방법
> 모델의 목적은 
1. 기사 제목을 통해
2. 기사의 주제를 유추하려고 합니다.
>
> 따라서 모델의 input/output은 다음과 같습니다.
- input : 기사 제목 ( = N tokens )
- output : 기사의 주제 ( = 1 single class )
## 평가 방법
> 모델이 "문제를 얼마나 잘 맞추는지" Accuracy 지표를 통해 성능을 확인해 봅니다.

# 기존 Huggingface 에 배포된 모델 및 Tokenizer 확인해보기

- 라이브러리 설치

- transformers.AutoTokenizer의 from_pretrained() 함수 실습

- 호출된 토크나이저의 형태 확인

In [1]:
# # 라이브러리 설치
# !pip install pytorch_lightning
# !pip install transformers
# !pip install datasets





In [1]:
# https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoTokenizer
from transformers import AutoTokenizer

# transformers.AutoTokenizer의 from_pretrained() 함수 실습
# 많은 Tokenizer 가 있으나 이번 실습에서는 3가지 정도를 확인해 보겠습니다.
klue_tokenizer   = AutoTokenizer.from_pretrained('klue/bert-base')
kykim_tokenizer  = AutoTokenizer.from_pretrained('kykim/bert-kor-base')
snunlp_tokenizer = AutoTokenizer.from_pretrained('snunlp/KR-BERT-char16424')

In [2]:
# 호출된 토크나이저의 형태 확인
# 아래 3개의 tokenizer 를 번갈아 주석을 제거해가면서 학인해 보시기 바랍니다.

klue_tokenizer
# kykim_tokenizer
# snunlp_tokenizer

BertTokenizerFast(name_or_path='klue/bert-base', vocab_size=32000, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'})

# Tokenizer 실습
- 세 tokenizer의 다른 출력 확인

In [3]:
print(klue_tokenizer.tokenize("다른 종류의 tokenizer는 다른 결과를 출력합니다."))
print(kykim_tokenizer.tokenize("다른 종류의 tokenizer는 다른 결과를 출력합니다."))
print(snunlp_tokenizer.tokenize("다른 종류의 tokenizer는 다른 결과를 출력합니다."))

['다른', '종류', '##의', 'to', '##ke', '##n', '##ize', '##r', '##는', '다른', '결과', '##를', '출력', '##합니다', '.']
['다른', '종류의', 'to', '##k', '##en', '##ize', '##r', '##는', '다른', '결과를', '출력', '##합니다', '.']
['다른', '종류', '##의', 't', '##o', '##k', '##en', '##iz', '##er', '##는', '다른', '결과', '##를', '출', '##력', '##합니다', '.']


In [4]:
klue_tokenizer("다른 종류의 tokenizer는 다른 결과를 출력합니다.")

{'input_ids': [2, 3656, 5285, 2079, 7052, 8010, 2012, 30900, 2008, 2259, 3656, 3731, 2138, 11232, 11800, 18, 3], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

In [5]:
kykim_tokenizer("다른 종류의 tokenizer는 다른 결과를 출력합니다.")

{'input_ids': [2, 14044, 19500, 17528, 8274, 14470, 27108, 8085, 8034, 14044, 18324, 18800, 14015, 2016, 3], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

In [6]:
snunlp_tokenizer("다른 종류의 tokenizer는 다른 결과를 출력합니다.")

{'input_ids': [2, 264, 3469, 7, 3365, 458, 927, 1520, 7803, 848, 13, 264, 409, 14, 1197, 424, 2384, 5, 3], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

# Data 확인 및 Loading

- Huggingface의 datasets 라이브러리를 통해 KLUE-YNAT 데이터셋 호출 및 형태 확인

- 데이터셋을 train, validation, test으로 분할

- 데이터 확인

In [7]:
# https://huggingface.co/docs/datasets/index
from datasets import load_dataset

# https://huggingface.co/datasets/klue
# Huggingface의 datasets 라이브러리를 통해 데이터셋 호출
dataset = load_dataset('klue', 'ynat')

Found cached dataset klue (/home/kingstar/.cache/huggingface/datasets/klue/ynat/1.0.0/e0fc3bc3de3eb03be2c92d72fd04a60ecc71903f821619cb28ca0e1e29e4233e)


  0%|          | 0/2 [00:00<?, ?it/s]

In [8]:
# 데이터셋 형태 확인
dataset

DatasetDict({
    train: Dataset({
        features: ['guid', 'title', 'label', 'url', 'date'],
        num_rows: 45678
    })
    validation: Dataset({
        features: ['guid', 'title', 'label', 'url', 'date'],
        num_rows: 9107
    })
})

In [9]:
# dataset 내부에 어떤 종류의 데이터셋이 제공되는지 확인해 봅니다.
dataset.keys()

dict_keys(['train', 'validation'])

In [10]:
# dataset이 train과 validation으로 이루어져 있음.
train_test_dataset = dataset['train']
valid_dataset = dataset['validation']

# 대부분의 경우에는 test셋이 외부에서 제공되거나 숨겨져 있습니다.
# 본 실습에서는, 최종 평가를 해보기 위해서 trainset의 일부를 test로 사용해 보겠습니다.
#
# test 데이터셋 생성을 위해 train을 train과 test로 분할(train 95%, test 5%)
train_test = train_test_dataset.train_test_split(train_size=0.95, shuffle=False)
train_dataset, test_dataset = train_test['train'], train_test['test']

In [13]:
train_dataset['title'][:10]

['유튜브 내달 2일까지 크리에이터 지원 공간 운영',
 '어버이날 맑다가 흐려져…남부지방 옅은 황사',
 '내년부터 국가RD 평가 때 논문건수는 반영 않는다',
 '김명자 신임 과총 회장 원로와 젊은 과학자 지혜 모을 것',
 '회색인간 작가 김동식 양심고백 등 새 소설집 2권 출간',
 '야외서 생방송 하세요…액션캠 전용 요금제 잇따라',
 '월드컵 태극전사 16강 전초기지 레오강 입성종합',
 '미세먼지 속 출근길',
 '왓츠앱稅 230원에 성난 레바논 민심…총리사퇴로 이어져종합2보',
 '베트남 경제 고성장 지속…2분기 GDP 6.71% 성장']

In [11]:
a = klue_tokenizer(
            train_dataset['title'],
            padding='max_length',
            truncation=True,
            return_tensors='pt',
            max_length=128
        )

In [14]:
train_dataset[0]

{'guid': 'ynat-v1_train_00000',
 'title': '유튜브 내달 2일까지 크리에이터 지원 공간 운영',
 'label': 3,
 'url': 'https://news.naver.com/main/read.nhn?mode=LS2D&mid=shm&sid1=105&sid2=227&oid=001&aid=0008508947',
 'date': '2016.06.30. 오전 10:36'}

# 모델 호출 및 확인

- BertForSequenceClassification 모델 호출 및 확인

- 모델의 내부 실습

In [15]:
import torch
from transformers import BertForSequenceClassification

# https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForSequenceClassification
klue   = BertForSequenceClassification.from_pretrained('klue/bert-base', num_labels=7)
kykim  = BertForSequenceClassification.from_pretrained('kykim/bert-kor-base', num_labels=7)
snunlp = BertForSequenceClassification.from_pretrained('snunlp/KR-BERT-char16424', num_labels=7)

Downloading pytorch_model.bin:   0%|          | 0.00/445M [00:00<?, ?B/s]

Some weights of the model checkpoint at klue/bert-base were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized

Downloading pytorch_model.bin:   0%|          | 0.00/476M [00:00<?, ?B/s]

Some weights of the model checkpoint at kykim/bert-kor-base were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initia

Downloading pytorch_model.bin:   0%|          | 0.00/397M [00:00<?, ?B/s]

Some weights of the model checkpoint at snunlp/KR-BERT-char16424 were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkp

In [17]:
# BertForSequenceClassification 모델의 구조 확인
# 각 모델의 구조에는 큰 차이가 없음

klue

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(32000, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12,

In [18]:
kykim

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(42000, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12,

In [19]:
snunlp

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(16424, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12,

# DataModule

In [20]:
import pytorch_lightning as pl
from torch.utils.data import TensorDataset, DataLoader, Dataset
from datasets import load_dataset
from transformers import AutoTokenizer



In [21]:
class YNATDataset(Dataset):
    def __init__(self, input_data, labels):
        self.input_data = input_data
        # label만 있는 데이터 생성
        self.labels = labels

    def __len__(self):
        return len(self.input_data)

    def __getitem__(self, idx):
        input_ids, attention_mask, token_type_ids = self.input_data['input_ids'][idx], self.input_data['attention_mask'][idx], self.input_data['token_type_ids'][idx]
        # item = {'input_ids':input_ids, 'attention_mask':attention_mask, 'token_type_ids':token_type_ids, 'labels':self.label[idx]}

        return input_ids, attention_mask, token_type_ids, self.labels[idx]

In [22]:
# https://pytorch-lightning.readthedocs.io/en/stable/extensions/datamodules.html

class YNATDataModule(pl.LightningDataModule):
    def __init__(self, model_name, max_len=128, truncation=True):
        super().__init__()
        self.max_len = max_len
        self.truncation = truncation

        self.tokenizer = AutoTokenizer.from_pretrained(model_name)

        self.train_df = None
        self.valid_df = None
        self.test_df  = None

        self.train_labels = None
        self.valid_labels = None
        self.test_labels = None

        self.train_dataset = None
        self.valid_dataset = None
        self.test_dataset  = None

    def prepare_data(self):
        klue_df = load_dataset('klue', 'ynat')

        # dataset이 train과 validation으로 이루어져 있음.
        train_test_df = klue_df['train']
        valid_df = klue_df['validation']

        # 대부분의 경우에는 test셋이 외부에서 제공되거나 숨겨져 있습니다.
        # 본 실습에서는, 최종 평가를 해보기 위해서 trainset의 일부를 test로 사용해 보겠습니다.
        #
        # test 데이터셋 생성을 위해 train을 train과 test로 분할(train 95%, test 5%)
        train_test = train_test_df.train_test_split(train_size=0.95, shuffle=False)
        train_df, test_df = train_test['train'], train_test['test']

        # small data 만 활용해서 훈련 원할 시에 
        train_df = train_df[:1000] # 빠른 학습을 위해 학습 데이터셋 일부만 사용


        self.train_df = self.tokenizing(train_df) 
        self.valid_df = self.tokenizing(valid_df)
        self.test_df  = self.tokenizing(test_df)

        self.train_labels = train_df['label']
        self.valid_labels = train_df['label']
        self.test_labels  = train_df['label']

    def tokenizing(self, df):
        data = self.tokenizer(
            df['title'],
            padding='max_length',
            truncation=self.truncation,
            return_tensors='pt',
            max_length=self.max_len
        )

        return data

    def setup(self, stage='fit'):
        if stage == 'fit':
            self.train_dataset = YNATDataset(self.train_df, self.train_labels)
            self.valid_dataset = YNATDataset(self.valid_df, self.valid_labels)
        else:
            self.test_dataset = YNATDataset(self.test_df, self.test_labels)

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=32, shuffle=True, num_workers=8)

    def val_dataloader(self):
        return DataLoader(self.valid_dataset, batch_size=32, shuffle=False, num_workers=8)

    def test_dataloader(self):
        return DataLoader(self.test_dataset, batch_size=32, shuffle=False, num_workers=8)

# LightningModule

In [23]:
import torch
from transformers import AdamW
from torchmetrics import Accuracy
from transformers import BertForSequenceClassification

2023-04-11 11:22:01.876480: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [24]:
# https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html
class YNATModel(pl.LightningModule):
    def __init__(self, model_name):
        super().__init__()

        self.model_name = model_name

        # https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForSequenceClassification
        self.text_reader = BertForSequenceClassification.from_pretrained(model_name, num_labels=7) # transformer

        self.optimizer = AdamW(self.parameters(), lr=5e-5)
        self.accuracy = Accuracy(task="multiclass", num_classes=7)

    def forward(self, input_ids, attention_mask, token_type_ids, labels):
        outputs = self.text_reader(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, labels=labels)

        return outputs

    def configure_optimizers(self):
        return self.optimizer

    def training_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch

        # loss 측정이게 얜 지ㅌ
        outputs = self(input_ids, attention_mask, token_type_ids, labels)

        # loss 기록
        self.log('train/loss', outputs.loss)

        return outputs.loss

    def validation_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch

        # loss 측정
        outputs = self(input_ids, attention_mask, token_type_ids, labels)

        # loss 기록
        self.log('valid/loss', outputs.loss)

        return outputs.loss

    def test_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch

        # input 모델에 입력 후 결과 확인
        outputs = self(input_ids, attention_mask, token_type_ids, labels)

        # one-hot encoding
        preds = outputs.logits.argmax(1)
        
        # accuracy 측정
        accuracy = self.accuracy(preds, labels)
        self.log('test/acc', accuracy, on_epoch=True) # on_epoch=True 옵션을 넣으면 각 step 결과의 평균값을 계산해줌
        
        return accuracy

# Train

- pl.Trainer 를 통해 실제 모델 훈련



In [27]:
gpus = torch.cuda.device_count()
gpus

1

In [43]:
# https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html

# 모델과 동일한 DataModule 생성
klue_datamodule  = YNATDataModule(model_name='klue/bert-base')

# 모델 생성
klue_model = YNATModel(model_name='klue/bert-base')

# gpu 활용을 위해 가용 gpu 확인
gpus = torch.cuda.device_count()

# trainer 생성 및 활용
klue_trainer = pl.Trainer(
    devices=gpus,
    accelerator='gpu',
    max_epochs=10
)

klue_trainer.fit(klue_model, datamodule=klue_datamodule)

Some weights of the model checkpoint at klue/bert-base were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized

  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name        | Type                          | Params
--------------------------------------------------------------
0 | text_reader | BertForSequenceClassification | 110 M 
1 | accuracy    | MulticlassAccuracy            | 0     
--------------------------------------------------------------
110 M     Trainable params
0         Non-trainable params
110 M     Total params
442.491   Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Training: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=10` reached.


In [32]:
# 모델 생성
kykim_model = YNATModel(model_name='kykim/bert-kor-base')

# 모델과 동일한 DataModule 생성
kykim_datamodule = YNATDataModule(model_name='kykim/bert-kor-base')

# gpu 활용을 위해 가용 gpu 확인
gpus = torch.cuda.device_count()

# trainer 생성 및 활용
kykim_trainer = pl.Trainer(
    devices=gpus,
    accelerator='gpu',
    max_epochs=1
)

kykim_trainer.fit(kykim_model, datamodule=kykim_datamodule)

Some weights of the model checkpoint at kykim/bert-kor-base were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initia

  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name        | Type                          | Params
--------------------------------------------------------------
0 | text_reader | BertForSequenceClassification | 118 M 
1 | accuracy    | MulticlassAccuracy            | 0     
--------------------------------------------------------------
118 M     Trainable params
0         Non-trainable params
118 M     Total params
473.211   Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Training: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [31]:
# 모델 생성
snunlp_model = YNATModel(model_name='snunlp/KR-BERT-char16424')

# 모델과 동일한 DataModule 생성

snunlp_datamodule = YNATDataModule(model_name='snunlp/KR-BERT-char16424')

# gpu 활용을 위해 가용 gpu 확인
gpus = torch.cuda.device_count()

# trainer 생성 및 활용
snunlp_trainer = pl.Trainer(
    devices=gpus,
    accelerator='gpu',
    max_epochs=1
)

snunlp_model.train()
snunlp_trainer.fit(snunlp_model, datamodule=snunlp_datamodule)

Some weights of the model checkpoint at snunlp/KR-BERT-char16424 were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkp

  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name        | Type                          | Params
--------------------------------------------------------------
0 | text_reader | BertForSequenceClassification | 98.7 M
1 | accuracy    | MulticlassAccuracy            | 0     
--------------------------------------------------------------
98.7 M    Trainable params
0         Non-trainable params
98.7 M    Total params
394.641   Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


  rank_zero_warn(


Training: 0it [00:00, ?it/s]

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1` reached.


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


# Evaluation

- test 데이터셋과 Trainer.test() 함수를 통해 evaluation 실행

In [44]:
klue_trainer.test(klue_model, datamodule=klue_datamodule)

Found cached dataset klue (/home/kingstar/.cache/huggingface/datasets/klue/ynat/1.0.0/e0fc3bc3de3eb03be2c92d72fd04a60ecc71903f821619cb28ca0e1e29e4233e)


  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Testing: 0it [00:00, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test/acc            0.6666666865348816
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


[{'test/acc': 0.6666666865348816}]

In [34]:
kykim_trainer.test(kykim_model, datamodule=kykim_datamodule)

Found cached dataset klue (/home/kingstar/.cache/huggingface/datasets/klue/ynat/1.0.0/e0fc3bc3de3eb03be2c92d72fd04a60ecc71903f821619cb28ca0e1e29e4233e)


  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Testing: 0it [00:00, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test/acc            0.6666666865348816
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test/acc': 0.6666666865348816}]

In [35]:
snunlp_trainer.test(snunlp_model, datamodule=snunlp_datamodule)

Found cached dataset klue (/home/kingstar/.cache/huggingface/datasets/klue/ynat/1.0.0/e0fc3bc3de3eb03be2c92d72fd04a60ecc71903f821619cb28ca0e1e29e4233e)


  0%|          | 0/2 [00:00<?, ?it/s]

You are using a CUDA device ('NVIDIA GeForce RTX 3060 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

Testing: 0it [00:00, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test/acc            0.6666666865348816
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test/acc': 0.6666666865348816}]

실제로 모델이 문제를 잘 푸는지 확인

In [45]:
# category = ["정치", "경제", "사회", "생활문화", "세계", "IT과학", "스포츠"]
category_dict = {
                  0 : '정치', 
                  1 : '경제',
                  2 : '사회',
                  3 : '생활문화',
                  4 : '세계',
                  5 : 'IT과학',
                  6 : '스포츠'
                }
news_title = '손흥민 "콘테 왜 나에게 메짤라 역할 부여하는지 이해 못해"'

model  = klue_model.cpu()
tokens = klue_datamodule.tokenizer(
            news_title,
            padding='max_length',
            truncation=True,
            return_tensors='pt',
            max_length=128
        )

outputs = model.text_reader(input_ids=tokens['input_ids'], token_type_ids=tokens['token_type_ids'], attention_mask=tokens['attention_mask'])
best_pred_class_label_id = outputs['logits'].argmax(1).item()
print("Predicted : ", category_dict[best_pred_class_label_id])

Predicted :  생활문화


###**콘텐츠 라이선스**

<font color='red'><b>**WARNING**</b></font> : **본 교육 콘텐츠의 지식재산권은 재단법인 네이버커넥트에 귀속됩니다. 본 콘텐츠를 어떠한 경로로든 외부로 유출 및 수정하는 행위를 엄격히 금합니다.** 다만, 비영리적 교육 및 연구활동에 한정되어 사용할 수 있으나 재단의 허락을 받아야 합니다. 이를 위반하는 경우, 관련 법률에 따라 책임을 질 수 있습니다.

