# LoRA with PEFT

# 🔧 What You’ll Learn
✅ How to load a base model and dataset

✅ How to apply LoRA with PEFT

✅ How to fine-tune the model

✅ How to evaluate with Gemini (e.g., BLEU, answer quality)

✅ Full comments and explanations in the code
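
For orientation, LoRA freezes each targeted weight matrix W and learns a low-rank update, so the adapted layer computes W x + (lora_alpha / r) · B (A x). The following is a minimal plain-Python sketch of that idea with toy 2×3 matrices. The helper names `matvec` and `lora_forward` and all numeric values are illustrative assumptions, not part of PEFT. B starts at zero, the usual LoRA initialization, so the adapted layer initially matches the base layer.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, r, lora_alpha):
    """Compute y = W x + (lora_alpha / r) * B (A x).

    W is the frozen base weight (d_out x d_in).
    A (r x d_in) and B (d_out x r) are the small trainable matrices.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = lora_alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy values, chosen only for illustration
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]      # frozen weight, 2 x 3
A = [[0.5, -0.5, 1.0]]     # r x d_in = 1 x 3
B_zero = [[0.0], [0.0]]    # d_out x r = 2 x 1, zero-initialized
x = [1.0, 2.0, 3.0]

# With B at zero the LoRA branch contributes nothing
y0 = lora_forward(W, A, B_zero, x, r=1, lora_alpha=4)
print(y0)  # [7.0, 2.0]

# After training, B is non-zero and shifts the output
B_trained = [[0.5], [-1.0]]
y1 = lora_forward(W, A, B_trained, x, r=1, lora_alpha=4)
print(y1)  # [12.0, -8.0]
```

Only A and B receive gradients, which is why the trainable parameter count stays small compared with the full model.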

## Step 0: Install and Import Dependencies

In [2]:
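# The later cells import datasets, transformers, and peft; device_map="auto" also needs accelerate
!pip install transformers datasets peft accelerate bitsandbytes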



In [1]:
# Step 1: Create a small synthetic dataset for fine-tuning
from datasets import Dataset

# Five short general-knowledge statements, each stored under the "text" key
samples = [
    {"text": "What is AI? AI stands for Artificial Intelligence."},
    {"text": "Python is a popular programming language."},
    {"text": "The capital of France is Paris."},
    {"text": "Machine learning is a subset of AI."},
    {"text": "Water freezes at 0 degrees Celsius."}
]

# Convert the list of dictionaries into a Hugging Face Dataset object
dataset = Dataset.from_list(samples)



In [2]:

# Step 2: Load tokenizer and tokenize dataset
from transformers import AutoTokenizer

base_model = "tiiuae/falcon-rw-1b"  # small LLM suitable for quick experimentation

# Load tokenizer for the base model
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Reuse the EOS token as the pad token so that padding="max_length" works
tokenizer.pad_token = tokenizer.eos_token

# Define tokenization function for the dataset
def tokenize(example):
    return tokenizer(example["text"], padding="max_length", truncation=True, max_length=128)

# Apply tokenization to the entire dataset
tokenized_dataset = dataset.map(tokenize, batched=True)
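
To make the effect of `padding="max_length"` concrete, here is a small plain-Python sketch of what padding and label masking amount to. It assumes right-side padding and uses made-up token ids (11, 12, 13). The pad id 2 matches the `eos_token_id` reported in the generation log of this notebook. The helper names are hypothetical. With `mlm=False`, `DataCollatorForLanguageModeling` copies `input_ids` into `labels` and replaces pad-token positions with -100, which the loss ignores.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss

def pad_to_max_length(token_ids, max_length, pad_id):
    """Truncate to max_length, then right-pad with pad_id (hypothetical helper)."""
    ids = token_ids[:max_length]
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": [1] * len(ids) + [0] * n_pad,
    }

def causal_lm_labels(input_ids, pad_id):
    """Copy input_ids into labels and mask every pad position."""
    return [IGNORE_INDEX if t == pad_id else t for t in input_ids]

PAD_ID = 2  # pad token == EOS token in this notebook
encoded = pad_to_max_length([11, 12, 13], max_length=6, pad_id=PAD_ID)
labels = causal_lm_labels(encoded["input_ids"], PAD_ID)

print(encoded["input_ids"])       # [11, 12, 13, 2, 2, 2]
print(encoded["attention_mask"])  # [1, 1, 1, 0, 0, 0]
print(labels)                     # [11, 12, 13, -100, -100, -100]
```

One consequence of reusing EOS as the pad token is that any genuine EOS token in a sequence is masked out of the loss as well, so the model gets no training signal for when to stop.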




In [3]:

# Step 3: Load model with disk offloading support for low-resource environments
import torch
from transformers import AutoModelForCausalLM

# Load the base model in float32 (standard precision), with offload folder set
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",  # Let HF auto-assign device or offload
    torch_dtype=torch.float32,
    offload_folder="./offload"  # Needed if CPU can't hold full weights in RAM
)
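
As a rough guide to why offloading may be needed, the memory taken by the weights alone can be estimated from the parameter count. The sketch below derives the base parameter count from the figures printed in Step 4 (all params minus trainable params) and ignores activations, gradients, and optimizer state. The helper `weight_memory_gib` is a hypothetical name.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Base model size = "all params" minus LoRA "trainable params" from Step 4
base_params = 1_313_198_080 - 1_572_864

fp32_gib = weight_memory_gib(base_params, 4)  # float32: 4 bytes per parameter
fp16_gib = weight_memory_gib(base_params, 2)  # float16: 2 bytes per parameter

print(f"float32: {fp32_gib:.2f} GiB")  # float32: 4.89 GiB
print(f"float16: {fp16_gib:.2f} GiB")  # float16: 2.44 GiB
```

At roughly 4.9 GiB in float32, the weights fit in the RAM of many machines, but `offload_folder` gives `device_map="auto"` somewhere to spill layers when they do not.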



In [4]:

# Step 4: Prepare the model for LoRA fine-tuning
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, TaskType

# Prepare for k-bit (quantized) training. The model here is plain float32, so this
# mainly freezes the base weights; the call matters when loading in 8-bit or 4-bit
model = prepare_model_for_kbit_training(model)

# Define configuration for LoRA adapters
lora_config = LoraConfig(
    r=8,  # Rank of the low-rank update matrices
    lora_alpha=32,  # Scaling numerator; the update is scaled by lora_alpha / r = 4
    target_modules=["query_key_value"],  # Adapt only the fused query/key/value projection in each attention block
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM
)

# Wrap the model with LoRA adapters
model = get_peft_model(model, lora_config)

# Print the number of trainable parameters (should be small due to LoRA)
model.print_trainable_parameters()

trainable params: 1,572,864 || all params: 1,313,198,080 || trainable%: 0.1198
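
The trainable count above can be reproduced by hand. Each adapted linear layer gains a matrix A of shape r × d_in and a matrix B of shape d_out × r, so it adds r · (d_in + d_out) parameters. The sketch below assumes a hidden size of 2048, 24 transformer layers, and a fused `query_key_value` projection with 3 × 2048 outputs for `tiiuae/falcon-rw-1b`. These shapes are not stated in this notebook, but they are consistent with the printed total.

```python
def lora_param_count(d_in, d_out, r, n_layers):
    """Parameters added by LoRA: A is (r x d_in), B is (d_out x r), per layer."""
    return n_layers * r * (d_in + d_out)

# Assumed model shapes (not stated in the notebook)
hidden_size = 2048
n_layers = 24
qkv_out = 3 * hidden_size  # fused query/key/value projection

r, lora_alpha = 8, 32  # values from lora_config

trainable = lora_param_count(hidden_size, qkv_out, r, n_layers)
total = 1_313_198_080  # "all params" reported by print_trainable_parameters()

print(trainable)                          # 1572864
print(round(100 * trainable / total, 4))  # 0.1198
print(lora_alpha / r)                     # 4.0
```

Doubling `r` doubles the trainable parameters. Unless `lora_alpha` is raised with it, the effective scale `lora_alpha / r` is halved.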


In [5]:
# Step 5: Set up training loop using Hugging Face Trainer
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

# Define training arguments
training_args = TrainingArguments(
    output_dir="./outputs",  # Where to save model checkpoints
    num_train_epochs=1,  # For demo; increase for better training
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    logging_steps=1,
    learning_rate=2e-4,
    fp16=False,  # Avoid fp16 on CPU
    save_total_limit=1,
    report_to="none"
)

# Prepare data collator for Causal Language Modeling (not MLM)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# Fix the 'meta tensor' error by explicitly moving model to CPU
model = model.to(torch.device("cpu"))

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator
)

# Start training
trainer.train()

No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


| Step | Training Loss |
|------|---------------|
| 1    | 2.2765        |
| 2    | 1.7214        |


TrainOutput(global_step=2, training_loss=1.9989190101623535, metrics={'train_runtime': 46.5386, 'train_samples_per_second': 0.107, 'train_steps_per_second': 0.043, 'total_flos': 4647073873920.0, 'train_loss': 1.9989190101623535, 'epoch': 1.0})
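
The two logged steps follow from the batch settings. The sketch below reproduces the arithmetic, assuming that a trailing partial accumulation group still triggers an optimizer step, which is consistent with the `global_step=2` reported above. The helper name is hypothetical.

```python
import math

def optimizer_steps_per_epoch(n_samples, batch_size, grad_accum):
    """Optimizer steps in one epoch, counting a trailing partial group as a step."""
    n_batches = math.ceil(n_samples / batch_size)
    return math.ceil(n_batches / grad_accum)

n_samples = 5    # rows in the synthetic dataset
batch_size = 2   # per_device_train_batch_size
grad_accum = 2   # gradient_accumulation_steps

steps = optimizer_steps_per_epoch(n_samples, batch_size, grad_accum)
effective_batch = batch_size * grad_accum
mean_loss = (2.2765 + 1.7214) / 2  # average of the two logged losses

print(steps)            # 2
print(effective_batch)  # 4
print(f"{mean_loss:.4f}")
```

The average of the two logged losses agrees with the reported `training_loss` of about 1.9989 up to the rounding of the logged values. With only two updates on five samples, the run demonstrates the pipeline rather than producing a meaningfully adapted model.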

In [6]:

# Step 6: Run inference with the fine-tuned model
prompt = "What is AI?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)

# Print the generated output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


What is AI?
Artificial Intelligence (AI) is a branch of computer science that deals with the creation of computer systems that can perform tasks that are normally performed by humans.
AI is a broad term that can be used to describe a variety of different technologies.
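
The decoded output repeats the prompt before the continuation. If only the answer is wanted, one simple option is to strip the prompt as a string prefix. This is a minimal sketch with a hypothetical helper `strip_prompt`. It falls back to the full text when the decoded output does not start with the prompt exactly, which can happen if the tokenizer normalizes whitespace.

```python
def strip_prompt(generated: str, prompt: str) -> str:
    """Return only the continuation if the decoded text starts with the prompt."""
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated

prompt = "What is AI?"
# Shortened copy of the decoded output shown above
generated = "What is AI?\nArtificial Intelligence (AI) is a branch of computer science"

answer = strip_prompt(generated, prompt)
print(answer)  # Artificial Intelligence (AI) is a branch of computer science
```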
