Install the required libraries

In [None]:
!pip install -U peft transformers datasets accelerate evaluate -qqq

# Fine-tuning with PEFT and LoRA

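PEFT (Parameter-Efficient Fine-Tuning) is a library that makes fine-tuning lightweight: instead of updating every weight of a pretrained model, it trains only a small number of additional parameters.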

# 1. Quick start

## 1.1. PeftConfig

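Each 🤗 PEFT method is defined by a PeftConfig class, which stores all the important parameters needed to build a PeftModel.

Since we will use LoRA, we need to import the LoraConfig class and create an instance of it. In LoraConfig, specify the following parameters: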

```
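- task_type : the task type; here, causal language modeling (TaskType.CAUSAL_LM)
- inference_mode : whether the model will be used for inference (False for training)
- r : the rank (inner dimension) of the low-rank matrices
- lora_alpha : the scaling factor applied to the low-rank update
- lora_dropout : the dropout probability of the LoRA layers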
```
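
To make `r` and `lora_alpha` concrete, here is a minimal NumPy sketch of the LoRA update as it is usually formulated: the frozen weight `W` is left untouched, and a low-rank product `B @ A`, scaled by `lora_alpha / r`, is added to its output. The function name `lora_forward` and the toy shapes are illustrative assumptions, not PEFT's internal implementation.

```python
import numpy as np

def lora_forward(x, W, A, B, r, lora_alpha):
    """Return W x + (lora_alpha / r) * B A x for a single input vector x."""
    scaling = lora_alpha / r
    return W @ x + scaling * (B @ (A @ x))

rng = np.random.default_rng(0)
d_in, d_out, r, lora_alpha = 6, 4, 2, 8   # toy sizes, chosen for illustration

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in))            # trainable, randomly initialized
B = np.zeros((d_out, r))                  # trainable, initialized to zero
x = rng.normal(size=d_in)

# With B = 0 the adapter contributes nothing, so the output equals the base layer.
base_out = W @ x
lora_out = lora_forward(x, W, A, B, r, lora_alpha)
print(np.allclose(base_out, lora_out))
```

Because `B` starts at zero (PEFT's default when `init_lora_weights=True`), a freshly wrapped model behaves like the base model until training updates the adapter.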

In [2]:
from peft import LoraConfig, TaskType
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1
)

peft_config

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path=None, revision=None, task_type=<TaskType.CAUSAL_LM: 'CAUSAL_LM'>, inference_mode=False, r=8, target_modules=None, lora_alpha=32, lora_dropout=0.1, fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None)

## 1.2. PeftModel

### Loading the base model

In [4]:
from transformers import AutoModelForCausalLM

model_name_or_path = 'gpt2'
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)

### get_peft_model
- Wraps the base model with LoRA adapter layers configured by peft_config

In [5]:
from peft import get_peft_model

model = get_peft_model(model, peft_config) 



In [7]:
model.print_trainable_parameters()

trainable params: 294,912 || all params: 124,734,720 || trainable%: 0.23643136409814364
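
The trainable-parameter count can be reproduced by hand. The sketch below assumes the setup this notebook reports: LoRA is attached only to GPT-2's `c_attn` layer (768 → 2304) in each of the 12 blocks, with `r=8`. The variable names are illustrative.

```python
# Shapes for GPT-2 (small): 12 blocks, hidden size 768, and a fused
# query/key/value projection c_attn that maps 768 -> 2304.
n_layers, d_in, d_out, r = 12, 768, 2304, 8

lora_A = r * d_in      # 8 x 768 matrix per block
lora_B = d_out * r     # 2304 x 8 matrix per block
trainable = n_layers * (lora_A + lora_B)

all_params = 124_734_720   # "all params" as reported above
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / all_params:.4f}")
```

Each block adds 6,144 + 18,432 = 24,576 parameters, so the 12 blocks together give the 294,912 shown in the output.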


In [8]:
model

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): GPT2LMHeadModel(
      (transformer): GPT2Model(
        (wte): Embedding(50257, 768)
        (wpe): Embedding(1024, 768)
        (drop): Dropout(p=0.1, inplace=False)
        (h): ModuleList(
          (0-11): 12 x GPT2Block(
            (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
            (attn): GPT2Attention(
              (c_attn): lora.Linear(
                (base_layer): Conv1D()
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=768, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=2304, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
          

## 1.3. Save and load a model

In [9]:
model.save_pretrained('./../output_dir')


In [11]:
ls -al './../output_dir'

total 1168
drwxrwxrwx 1 cslee cslee    4096 Apr 14 01:39 ./
drwxrwxrwx 1 cslee cslee    4096 Apr 14 01:39 ../
drwxrwxrwx 1 cslee cslee    4096 Apr 14 01:39 .ipynb_checkpoints/
-rwxrwxrwx 1 cslee cslee    5078 Apr 14 01:39 README.md*
-rwxrwxrwx 1 cslee cslee     613 Apr 14 01:39 adapter_config.json*
-rwxrwxrwx 1 cslee cslee 1182680 Apr 14 01:39 adapter_model.safetensors*
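
Only the adapter is saved, which is why the weights file is about 1.2 MB rather than the size of the full model. As a rough sanity check, and assuming the adapter weights are stored as float32 (4 bytes each), the file size lines up with the trainable-parameter count:

```python
trainable_params = 294_912   # from print_trainable_parameters()
bytes_per_param = 4          # assumption: weights stored as float32
file_size = 1_182_680        # adapter_model.safetensors, from the listing

tensor_bytes = trainable_params * bytes_per_param
overhead = file_size - tensor_bytes   # remainder: file header / metadata

print(tensor_bytes)
print(overhead)
```

The few kilobytes left over are presumably the safetensors header, which records tensor names, shapes, and dtypes.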


In [14]:
# if pushing to Hub
from huggingface_hub import notebook_login

notebook_login()

### Saving to the Hub

In [15]:
# Push the untrained (freshly initialized) adapter
model.push_to_hub('my_awesome_peft_model')

CommitInfo(commit_url='https://huggingface.co/minerba/my_awesome_peft_model/commit/833c9d61cabb0c43ded68f37bbda442e053f2add', commit_message='Upload model', commit_description='', oid='833c9d61cabb0c43ded68f37bbda442e053f2add', pr_url=None, pr_revision=None, pr_num=None)

### Loading the PEFT model

In [16]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
import torch

peft_model_id = "minerba/my_awesome_peft_model"
config = PeftConfig.from_pretrained(peft_model_id)
config

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='gpt2', revision=None, task_type='CAUSAL_LM', inference_mode=True, r=8, target_modules={'c_attn'}, lora_alpha=32, lora_dropout=0.1, fan_in_fan_out=True, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None)

In [18]:
# Load the base model named in the adapter config
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
)

# Attach the PEFT adapter to the base model
model = PeftModel.from_pretrained(model, peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(
    config.base_model_name_or_path
)

In [20]:
model = model.to('cuda')
model.eval()


In [21]:
inputs = tokenizer("what is lanugage model?", return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"), max_new_tokens=100)
    print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


what is lanugage model?

The lanugage model is a way to define a set of rules for a given type of data. It is a way to define a set of rules for a given type of data. It is a way to define a set of rules for a given type of data. It is a way to define a set of rules for a given type of data. It is a way to define a set of rules for a given type of data. It is a way to define a set of


# 2. Training a LoRA model

## 2.1. Setup

In [25]:
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    PreTrainedTokenizerFast,
    DataCollatorForLanguageModeling,  # collator for causal LM (used in 2.3)
    TrainingArguments,
    Trainer,
)
from peft import get_peft_config, PeftModel, PeftConfig, get_peft_model, LoraConfig, TaskType
import evaluate
import torch
import numpy as np


lr = 1e-3
batch_size = 16
num_epochs = 3

## 2.2. Loading the data

In [27]:
# dataset = load_dataset("timdettmers/openassistant-guanaco", split="train[:100]")
dataset = load_dataset("nlpai-lab/openassistant-guanaco-ko", split="train[:100]")
dataset


Dataset({
    features: ['text', 'id'],
    num_rows: 100
})

## 2.3. Preprocessing the data

In [42]:
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# model = AutoModelForCausalLM.from_pretrained('gpt2')

from transformers import PreTrainedTokenizerFast

model_id = 'skt/kogpt2-base-v2'

tokenizer = PreTrainedTokenizerFast.from_pretrained(
    model_id,
    bos_token='</s>', eos_token='</s>', unk_token='<unk>',
    pad_token='<pad>', mask_token='<mask>'
)

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'GPT2Tokenizer'. 
The class this function is called from is 'PreTrainedTokenizerFast'.


In [43]:
model = AutoModelForCausalLM.from_pretrained(model_id)

In [44]:
tokenizer.tokenize("안녕하세요. 한국어 GPT-2 입니다.😤:)l^o")

['▁안녕',
 '하',
 '세',
 '요.',
 '▁한국어',
 '▁G',
 'P',
 'T',
 '-2',
 '▁입',
 '니다.',
 '😤',
 ':)',
 'l^o']

In [45]:
# tokenizer.pad_token = tokenizer.eos_token

def tokenize_function(examples):
    # Pad or truncate every example to exactly 200 tokens
    output = tokenizer(
        examples["text"],
        padding="max_length",
        truncation=True,
        max_length=200,
    )
    return output


tokenized_datasets = dataset.map(tokenize_function, batched=True, remove_columns=["text", "id"])
tokenized_datasets

Dataset({
    features: ['input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 100
})

In [46]:
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
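
With `mlm=False`, the collator builds causal-LM labels by copying `input_ids` and replacing padding positions with -100, the index that the cross-entropy loss ignores; the model shifts the labels by one position internally. Below is a minimal pure-Python sketch of that labeling step. The helper `make_causal_lm_labels` and the token ids are made up for illustration and are not part of the library.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss

def make_causal_lm_labels(input_ids, pad_token_id):
    """Copy input_ids and mask out every padding position."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]

pad_id = 3                               # made-up pad id for illustration
input_ids = [9021, 7055, 8137, 1, 3, 3]  # made-up ids, padded to length 6

labels = make_causal_lm_labels(input_ids, pad_id)
print(labels)  # [9021, 7055, 8137, 1, -100, -100]
```

This is one reason a dedicated `<pad>` token is convenient: if the pad token were the same as the EOS token, EOS positions would be masked out of the loss as well.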

## 2.4. Training

### LoraConfig

In [47]:
from peft import LoraConfig, TaskType

peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1
)

In [48]:
model = get_peft_model(model, peft_config)

model.print_trainable_parameters()

trainable params: 294,912 || all params: 125,458,944 || trainable%: 0.2350665409713635




In [49]:
training_args = TrainingArguments(
    output_dir="kogpt2-lora",
    learning_rate=lr,
    per_device_train_batch_size=32,  # hard-coded; the batch_size variable (16) defined earlier is not used
    num_train_epochs=num_epochs,
    weight_decay=0.01,
    save_strategy="epoch",
)

In [50]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()

TrainOutput(global_step=12, training_loss=3.8848276138305664, metrics={'train_runtime': 6.3787, 'train_samples_per_second': 47.031, 'train_steps_per_second': 1.881, 'total_flos': 30726328320000.0, 'train_loss': 3.8848276138305664, 'epoch': 3.0})
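
The reported `global_step=12` follows directly from the dataset size and the batch size. A short check, assuming single-device training with no gradient accumulation (the Trainer defaults):

```python
import math

num_samples = 100        # split="train[:100]"
per_device_batch = 32    # per_device_train_batch_size in TrainingArguments
num_epochs = 3
num_devices = 1          # assumption: single-device training

# The last, smaller batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(num_samples / (per_device_batch * num_devices))
global_step = steps_per_epoch * num_epochs
print(steps_per_epoch, global_step)  # 4 12
```

With only 12 steps, fewer than the default `logging_steps` of 500, no intermediate training loss is logged during the run. Twelve optimizer steps on 100 examples is a smoke test of the pipeline rather than a meaningful fine-tune.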

## 2.5. Sharing the model (Hub)

In [38]:
from huggingface_hub import notebook_login

notebook_login()

In [51]:
model.push_to_hub("kogpt2-lora")

CommitInfo(commit_url='https://huggingface.co/minerba/kogpt2-lora/commit/d36092024f65c75d22f37b76a13f91532da7d47a', commit_message='Upload model', commit_description='', oid='d36092024f65c75d22f37b76a13f91532da7d47a', pr_url=None, pr_revision=None, pr_num=None)

## 2.6. Inference

### PeftConfig

In [52]:
peft_model_id = 'minerba/kogpt2-lora'

config = PeftConfig.from_pretrained(peft_model_id)
config

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='skt/kogpt2-base-v2', revision=None, task_type='CAUSAL_LM', inference_mode=True, r=8, target_modules={'c_attn'}, lora_alpha=32, lora_dropout=0.1, fan_in_fan_out=True, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None)

### Loading the model and tokenizer

In [54]:
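# The base model is public, so no auth token is needed
# (use_auth_token is deprecated in recent transformers releases).
inference_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path
)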

tokenizer = PreTrainedTokenizerFast.from_pretrained(config.base_model_name_or_path)

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'GPT2Tokenizer'. 
The class this function is called from is 'PreTrainedTokenizerFast'.


In [55]:
# Attach the trained PEFT adapter to the base model
model = PeftModel.from_pretrained(inference_model, peft_model_id)

In [57]:
from transformers import pipeline

# 1) Prompt (Korean: "What kind of work does a nurse do?")
input_text = "간호사는 어떤 업무를 하나요?"

# 2) Build the text-generation pipeline
generator = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
)

# 3) Generate text with sampling options
output = generator(
    input_text,
    max_length=200,
    do_sample=True,
    temperature=0.7
)

# 4) Print the generated text
print(output[0]['generated_text'])


The model 'PeftModelForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CohereForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GemmaForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MambaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MusicgenMelodyForCausalLM', 'MvpForCausalLM', 'OpenLlam

간호사는 어떤 업무를 하나요?" "저는 어떤 일을 하고 싶고 어떤 업무를 하고 싶죠?" "저는 어떤 일을 하고 싶을까요?" "저는 어떤 업무를 하고 싶죠?" "저는 어떤 일을 하고 싶은지요?" "저는 어떤 일을 하고 싶으세요?" "저는 어떤 일을 하고 싶지요?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶어요.
저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶은지요?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶고 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶은지요?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶죠?" "저는 어떤 일을 하고 싶
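
The pipeline warning above appears because the model is still wrapped in `PeftModelForCausalLM`. One common option, not used in this notebook, is to fold the adapter into the base weights before inference (PEFT exposes this for LoRA models as `merge_and_unload()`). The NumPy sketch below illustrates the idea on a toy layer with made-up shapes and random stand-in weights: merging the scaled low-rank product into `W` gives the same output as applying the adapter as a separate branch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, lora_alpha = 6, 4, 2, 8   # toy sizes, chosen for illustration
scaling = lora_alpha / r

W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # stand-in for a trained lora_A
B = rng.normal(size=(d_out, r))      # stand-in for a trained lora_B
x = rng.normal(size=d_in)

# Adapter applied as a separate branch next to the base layer.
unmerged = W @ x + scaling * (B @ (A @ x))

# Adapter folded into a single weight matrix of the original shape.
W_merged = W + scaling * (B @ A)
merged = W_merged @ x

print(np.allclose(unmerged, merged))
```

Merging does not change what the model generates. The repetitive output above more likely reflects how little training was done (100 examples, 12 steps) than how the adapter is loaded.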
