
In [None]:
!pip install transformers peft torch

This quicktour shows you PEFT’s main features and helps you train large pretrained models that would typically be out of reach on consumer devices. You’ll see how to set up the 1.2B-parameter **bigscience/mt0-large** model with LoRA to generate a classification label and use it for inference.

# PeftConfig

In [3]:
from peft import LoraConfig, TaskType

In [4]:
peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
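To make the **r** and **lora_alpha** arguments more concrete, here is a minimal pure-Python sketch of the low-rank update that LoRA adds to a frozen weight matrix, scaled by `lora_alpha / r`. The helper names (`matmul`, `lora_update`) and the toy matrices are hypothetical and chosen only for illustration; this is not PEFT's implementation, and it ignores **lora_dropout**.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_update(B, A, r, lora_alpha):
    """Return the scaled low-rank update (lora_alpha / r) * (B @ A)."""
    scale = lora_alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


# Toy sizes (hypothetical): a 3x3 weight adapted with rank r=1.
r, lora_alpha = 1, 4
B = [[1.0], [0.0], [2.0]]  # shape: d_out x r
A = [[0.5, 0.0, 1.0]]      # shape: r x d_in
delta = lora_update(B, A, r, lora_alpha)
print(delta)

# With the values used in this notebook (r=8, lora_alpha=32) the scale is 4.
config_scale = 32 / 8
print(config_scale)
```

Only `A` and `B` would be trained, so an adapted `d_out x d_in` matrix adds `r * (d_in + d_out)` trainable values instead of `d_in * d_out`.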

# PeftModel

In [5]:
from transformers import AutoModelForSeq2SeqLM

In [6]:
model_name_or_path = "bigscience/mt0-large"
tokenizer_name_or_path = "bigscience/mt0-large"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)


Wrap your base model and **peft_config** with the **get_peft_model** function to create a PeftModel.

To get a sense of the number of trainable parameters in your model, use the **print_trainable_parameters** method.

In this case, you’re only training 0.19% of the model’s parameters!

In [7]:
from peft import get_peft_model

In [8]:
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

trainable params: 2,359,296 || all params: 1,231,940,608 || trainable%: 0.19151053100118282
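The trainable-parameter count in the output above can be reproduced by hand. The sketch below assumes architecture values that are not stated in this notebook: a hidden size of 1024, 24 encoder and 24 decoder layers, and LoRA adapters on the query and value projections of every attention block. Treat these as assumptions; the helper `lora_param_count` is hypothetical.

```python
def lora_param_count(d_model: int, r: int, n_adapted: int) -> int:
    """Count LoRA parameters for adapters on square d_model x d_model projections."""
    # Each adapted matrix gains A (r x d_model) and B (d_model x r).
    return n_adapted * r * (d_model + d_model)


# Assumed architecture values for bigscience/mt0-large (not given in this notebook).
D_MODEL = 1024
ENCODER_LAYERS = 24
DECODER_LAYERS = 24
# Assumption: LoRA targets the q and v projections in each attention block.
TARGETS_PER_ATTENTION = 2

# Encoder layers have self-attention; decoder layers have self- and cross-attention.
attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS
n_adapted = attention_blocks * TARGETS_PER_ATTENTION

trainable = lora_param_count(D_MODEL, r=8, n_adapted=n_adapted)
all_params = 1_231_940_608  # "all params" value from the output above
percent = 100 * trainable / all_params
print(f"trainable params: {trainable:,} || trainable%: {percent:.4f}")
```

Under these assumptions the count comes to 144 adapted matrices with 16,384 parameters each, which agrees with the 2,359,296 reported by **print_trainable_parameters**.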


# Save and load a model

In [9]:
!pwd

/content


In [10]:
model.save_pretrained("my_model")

In [11]:
!ls -l

total 8
drwxr-xr-x 2 root root 4096 Aug  7 01:05 my_model
drwxr-xr-x 1 root root 4096 Aug  3 13:45 sample_data


**save_pretrained** writes only the incremental PEFT (adapter) weights, not the base model, so the result is cheap to store, transfer, and load.

For example, a **bigscience/T0_3B** model trained with LoRA on the twitter_complaints subset of the RAFT dataset contains only two files: **adapter_config.json** and **adapter_model.bin**. The latter file is just 19MB!
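To check how small the saved adapter is, you can list the files in the output directory with their sizes. This is a minimal sketch using only the standard library; `list_file_sizes` is a hypothetical helper, and the file names you see may differ from the ones mentioned above depending on the PEFT version.

```python
import os


def list_file_sizes(directory):
    """Return {filename: size in bytes} for regular files directly inside directory."""
    sizes = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


# "my_model" is the directory passed to save_pretrained earlier in this notebook.
if os.path.isdir("my_model"):
    for name, size in list_file_sizes("my_model").items():
        print(f"{name}: {size / 1e6:.2f} MB")
```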

In [12]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel, PeftConfig

In [13]:
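peft_model_id = "my_model/"
# Read the adapter config to find which base model the adapter belongs to.
config = PeftConfig.from_pretrained(peft_model_id)
# Load the base model, then attach the saved adapter weights to it.
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)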
