<a href="https://colab.research.google.com/github/rahiakela/transformers-research-and-practice/blob/main/peft_huggingface_guide/01_peft_basic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [None]:
!pip install peft

In [10]:
from peft import LoraConfig, TaskType
from peft import get_peft_model
from peft import AutoPeftModelForCausalLM

import torch

from transformers import AutoTokenizer
from transformers import AutoModelForSeq2SeqLM
from transformers import TrainingArguments, Trainer

from huggingface_hub import notebook_login

## Step-1: PEFT Config

Each PEFT method is defined by a `PeftConfig` class that stores the parameters needed to build a `PeftModel`. This notebook uses LoRA, which is configured through `LoraConfig`.

In [3]:
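# Create a LoraConfig for a sequence-to-sequence model
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # the task the adapted model will be trained for
    inference_mode=False,             # False because the adapter will be trained
    r=8,                              # rank of the low-rank update matrices
    lora_alpha=32,                    # scaling factor applied to the low-rank update
    lora_dropout=0.1,                 # dropout probability inside the LoRA layers
)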
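
To make the roles of `r` and `lora_alpha` concrete, here is a minimal pure-Python sketch of a single LoRA-adapted linear layer. It is not PEFT's implementation: the layer sizes, the Gaussian initialisation of `A`, and the helper names `matmul` and `lora_forward` are assumptions made for illustration, and dropout is left out. It relies on two properties of standard LoRA: the update is scaled by `lora_alpha / r`, and `B` starts at zero, so the adapted layer initially behaves exactly like the frozen one.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Toy layer sizes (assumed for illustration); r and lora_alpha match the config above.
d_in, d_out = 64, 64
r, lora_alpha = 8, 32
scaling = lora_alpha / r  # 32 / 8 = 4.0

random.seed(0)
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weight, d_out x d_in
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, d_out x r, starts at zero

def lora_forward(x):
    """Return W x + scaling * B (A x) for a single input vector x."""
    col = [[v] for v in x]
    base = matmul(W, col)
    update = matmul(B, matmul(A, col))
    return [b[0] + scaling * u[0] for b, u in zip(base, update)]

x = [1.0] * d_in
base_out = [row[0] for row in matmul(W, [[v] for v in x])]
lora_out = lora_forward(x)

full_params = d_in * d_out        # weights in the frozen matrix W
lora_params = r * (d_in + d_out)  # weights in the trainable matrices A and B
print(scaling, lora_out == base_out, lora_params, full_params)
# prints: 4.0 True 1024 4096
```

Only `A` and `B` receive gradients during fine-tuning, which is why the trainable parameter count stays small even for a large base model.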

In [None]:
# Load the base model you want to fine-tune
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")

In [5]:
# Wrap the base model and peft_config with the get_peft_model() function to create a PeftModel.
peft_model = get_peft_model(model, peft_config)

peft_model.print_trainable_parameters()

trainable params: 2,359,296 || all params: 1,231,940,608 || trainable%: 0.19151053100118282
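
The trainable count printed above can be reproduced by hand. The sketch below rests on assumptions that do not appear in this notebook: that `mt0-large` has a hidden size of 1024 with 24 encoder and 24 decoder layers, and that PEFT's default LoRA targets for this architecture are the `q` and `v` attention projections. Under those assumptions the arithmetic matches the printed figure.

```python
# Assumed architecture values for bigscience/mt0-large (not stated in this notebook).
d_model = 1024
encoder_layers = 24
decoder_layers = 24
r = 8  # LoRA rank from the LoraConfig

# Each adapted projection gains A (r x d_model) and B (d_model x r).
params_per_module = r * (d_model + d_model)

# Assumed targets: q and v in encoder self-attention (2 per layer),
# plus q and v in decoder self-attention and cross-attention (4 per layer).
adapted_modules = encoder_layers * 2 + decoder_layers * 4

trainable = params_per_module * adapted_modules
total = 1_231_940_608  # "all params" as printed above
pct = 100 * trainable / total

print(f"trainable params: {trainable:,} || trainable%: {pct:.4f}")
# prints: trainable params: 2,359,296 || trainable%: 0.1915
```

Raising `r` or targeting more projection layers scales the trainable count linearly.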


## Step-2: Fine-tuning

In [8]:
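training_args = TrainingArguments(
    output_dir="rahiakela/bigscience/mt0-large-lora",  # local directory for checkpoints
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=2,
    weight_decay=0.01,
    evaluation_strategy="epoch",  # named `eval_strategy` in newer transformers releases
    save_strategy="epoch",        # must match the evaluation strategy for the option below
    load_best_model_at_end=True,  # reload the best checkpoint when training finishes
)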

In [None]:
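# NOTE: tokenized_datasets, tokenizer, data_collator, and compute_metrics are not
# defined anywhere in this notebook; prepare them for your own dataset before running this cell.
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)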

trainer.train()

## Step-3: Inference

In [None]:
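# Load a published LoRA adapter together with its base model from the Hub.
# This is a causal LM, separate from the mt0 model configured in the steps above.
model = AutoPeftModelForCausalLM.from_pretrained("ybelkada/opt-350m-lora")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()
inputs = tokenizer("Preheat the oven to 350 degrees and place the cookie dough", return_tensors="pt")

# Gradients are not needed for generation
with torch.no_grad():
    outputs = model.generate(input_ids=inputs["input_ids"].to(device), max_new_tokens=50)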

In [12]:
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])

Preheat the oven to 350 degrees and place the cookie dough in the center of the oven.

In a large bowl, combine the flour, baking powder, baking soda, salt, and cinnamon.

In a separate bowl, combine the egg yolks, sugar, and vanilla.


