# Parameter-Efficient Fine-Tuning (PEFT)

https://github.com/huggingface/peft

Supported methods:

1. LoRA: [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
2. Prefix Tuning: [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://aclanthology.org/2021.acl-long.353/), [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)
3. P-Tuning: [GPT Understands, Too](https://arxiv.org/abs/2103.10385)
4. Prompt Tuning: [The Power of Scale for Parameter-Efficient Prompt Tuning](https://arxiv.org/abs/2104.08691)
5. AdaLoRA: [Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning](https://arxiv.org/abs/2303.10512)  
6. $(IA)^3$: [Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning](https://arxiv.org/abs/2205.05638)
7. MultiTask Prompt Tuning: [Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning](https://arxiv.org/abs/2303.02861)
8. LoHa: [FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning](https://arxiv.org/abs/2108.06098)

<div><img src="https://ar5iv.labs.arxiv.org/html/2106.09685/assets/x1.png" width="20%"/></div>
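
The figure above shows the LoRA reparametrization: the pretrained weight `W` stays frozen, and a low-rank product `B @ A` is trained next to it. Below is a minimal pure-Python sketch of that forward pass, not the PEFT implementation. The toy sizes, the helper names `matvec` and `lora_forward`, and the random values are assumptions for illustration; the `lora_alpha / r` scaling and the zero initialization of `B` follow the LoRA paper.

```python
import random

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, lora_alpha, r):
    """h = W x + (lora_alpha / r) * B (A x); W is frozen, only A and B are trained."""
    scaling = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))  # goes through the rank-r bottleneck
    return [b + scaling * u for b, u in zip(base, update)]

random.seed(0)
d_in, d_out, r, lora_alpha = 6, 4, 2, 4  # toy sizes (hypothetical)

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weight
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # r x d_in, random init
B = [[0.0] * r for _ in range(d_out)]                                  # d_out x r, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

h0 = lora_forward(W, A, B, x, lora_alpha, r)
# With B initialized to zero, the adapted layer reproduces the base layer exactly.
assert h0 == matvec(W, x)
```

Because `B` starts at zero, training begins from the behavior of the pretrained model, and only the `r * (d_in + d_out)` entries of `A` and `B` receive gradients.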


In [1]:
%pip install --quiet transformers peft bitsandbytes accelerate

Note: you may need to restart the kernel to use updated packages.


In [2]:
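from peft import get_peft_model, LoraConfig, TaskType
from transformers import AutoModelForCausalLM
import torch

# Load the base model with 8-bit weights (requires bitsandbytes) to reduce memory use
model_name = "ai-forever/ruGPT-3.5-13B"
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True, torch_dtype=torch.float16)

# LoRA settings; the rank r is left at its default (8, see the config dump below)
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["c_attn"],  # GPT-2 fused query/key/value projection
    task_type=TaskType.CAUSAL_LM,
)

# Wrap the base model: base weights are frozen, only the LoRA matrices are trainable
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# Total parameter count (base + adapters), in millions
model_size = sum(t.numel() for t in model.parameters())
print(f"model_size: {model_size/1000**2:.1f}M")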

trainable params: 6,553,600 || all params: 12,860,016,640 || trainable%: 0.05096105381089149
model_size: 12860.0M
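
The trainable-parameter count printed above can be checked by hand. The sketch below assumes the shapes shown in the module printout further down: 40 `GPT2Block` layers, each with one adapted `c_attn` layer whose `lora_A` maps 5120 to 8 and whose `lora_B` maps 8 to 15360, both without bias. The variable names are illustrative only.

```python
n_layers = 40               # GPT2Block count from the module printout
d_in, d_out = 5120, 15360   # c_attn in_features / out_features
r = 8                       # LoRA rank (the LoraConfig default)

# Each adapted layer adds A (r x d_in) and B (d_out x r), with no bias terms.
per_layer = r * d_in + d_out * r
trainable = n_layers * per_layer

all_params = 12_860_016_640  # "all params" as reported by print_trainable_parameters()
percent = 100 * trainable / all_params

print(f"trainable params: {trainable:,}")  # trainable params: 6,553,600
print(f"trainable%: {percent:.4f}")        # trainable%: 0.0510
```

The result matches the reported 6,553,600 trainable parameters, about 0.05% of the model.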


# Let's look at the result

In [3]:
model

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): GPT2LMHeadModel(
      (transformer): GPT2Model(
        (wte): Embedding(50272, 5120)
        (wpe): Embedding(2048, 5120)
        (drop): Dropout(p=0.1, inplace=False)
        (h): ModuleList(
          (0-39): 40 x GPT2Block(
            (ln_1): LayerNorm((5120,), eps=1e-05, elementwise_affine=True)
            (attn): GPT2Attention(
              (c_attn): Linear8bitLt(
                in_features=5120, out_features=15360, bias=True
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=5120, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=15360, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDi

In [4]:
peft_config.to_dict()

{'peft_type': <PeftType.LORA: 'LORA'>,
 'auto_mapping': None,
 'base_model_name_or_path': 'ai-forever/ruGPT-3.5-13B',
 'revision': None,
 'task_type': <TaskType.CAUSAL_LM: 'CAUSAL_LM'>,
 'inference_mode': False,
 'r': 8,
 'target_modules': ['c_attn'],
 'lora_alpha': 16,
 'lora_dropout': 0.1,
 'fan_in_fan_out': False,
 'bias': 'none',
 'modules_to_save': None,
 'init_lora_weights': True,
 'layers_to_transform': None,
 'layers_pattern': None}