# Day 26: LoRA Implementation - Part 1

In this notebook, we'll implement Low-Rank Adaptation (LoRA) to fine-tune a pre-trained language model on a specific task. We'll use the Hugging Face PEFT library to apply LoRA to a base model and train it efficiently.

## Overview

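1. Setup and dependencies
2. Loading a pre-trained model
3. Configuring LoRA adapters
4. Preparing a dataset for fine-tuning
5. Training the model with LoRA
6. Evaluating the LoRA-adapted model
7. Saving the LoRA adapter
8. Loading and using the LoRA adapter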

## 1. Setup and Dependencies

First, let's install the necessary libraries:

In [None]:
!pip3 install -q transformers datasets peft evaluate accelerate bitsandbytes

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
    Trainer,
    DataCollatorWithPadding
)
from peft import (
    get_peft_model,
    LoraConfig,
    TaskType,
    PeftModel,
    PeftConfig
)
import evaluate
import numpy as np

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## 2. Loading a Pre-trained Model

For this example, we'll use a RoBERTa model for sentiment analysis. We'll fine-tune it on the SST-2 (Stanford Sentiment Treebank) dataset.

In [None]:
# Define the model name
model_name = "roberta-base"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=2,  # Binary classification for sentiment
    return_dict=True
)

# Move model to the appropriate device
model = model.to(device)

# Print model size
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"Model has {count_parameters(model):,} trainable parameters")

## 3. Configuring LoRA Adapters

Now, let's configure LoRA to adapt only specific layers of the model. We'll target the attention layers in the transformer blocks.
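
Before configuring the adapters, it may help to see what a LoRA layer computes. The sketch below is a pure-Python toy, not PEFT's implementation: it assumes the standard formulation `h = W x + (alpha / r) * B (A x)`, with `W` frozen, `A` and `B` trainable, and `B` initialised to zero so the adapted layer starts out identical to the base layer. The helper names (`matvec`, `lora_forward`) and the matrix values are made up for illustration.

```python
# Toy LoRA-adapted linear layer in pure Python (illustrative only).

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x)."""
    base = matvec(W, x)                # frozen pre-trained path
    update = matvec(B, matvec(A, x))   # trainable low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy sizes: d_in = d_out = 4, rank r = 2, alpha = 4 (same alpha / r ratio as r=8, alpha=16)
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
A = [[0.5, -0.5, 0.0, 0.0],            # shape r x d_in
     [0.0, 0.0, 0.5, -0.5]]
B_zero = [[0.0, 0.0] for _ in range(4)]  # shape d_out x r, zero at initialisation
B_trained = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]  # hypothetical values after training
x = [1.0, 2.0, 3.0, 4.0]

y_init = lora_forward(W, A, B_zero, x, alpha=4, r=2)
y_trained = lora_forward(W, A, B_trained, x, alpha=4, r=2)
print("At initialisation:", y_init)
print("After training:   ", y_trained)
```

With `B` at zero the output equals `W x`, which is why adding adapters does not change the model's predictions until training begins.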

In [None]:
# Define LoRA configuration
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # Sequence classification task
    r=8,                         # Rank of the update matrices
    lora_alpha=16,               # Alpha parameter for scaling
    lora_dropout=0.1,            # Dropout probability for LoRA layers
    target_modules=["query", "key", "value"],  # Apply LoRA to attention layers
    bias="none",                 # Don't train bias parameters
)

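# Create the PEFT model (this freezes the base weights in place)
peft_model = get_peft_model(model, lora_config)

# Compare trainable parameters against all parameters.
# count_parameters(model) no longer gives the full size at this point,
# because get_peft_model sets requires_grad=False on the base weights.
total_params = sum(p.numel() for p in peft_model.parameters())
trainable_params = count_parameters(peft_model)
print(f"Total parameters: {total_params:,}")
print(f"LoRA model trainable parameters: {trainable_params:,}")
print(f"Trainable fraction: {trainable_params / total_params * 100:.2f}%")

# PEFT's built-in summary of the same quantities
peft_model.print_trainable_parameters()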

# Print the model architecture with LoRA adapters
print(peft_model)
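
As a sanity check on the printed counts, the trainable-parameter total can be estimated by hand. This sketch assumes the usual `roberta-base` dimensions (hidden size 768, 12 layers) and assumes that, for `TaskType.SEQ_CLS`, PEFT keeps the classification head trainable alongside the LoRA matrices. Treat the figures as an estimate to compare against the output above, not as a guaranteed match.

```python
# Back-of-the-envelope trainable-parameter count (assumed roberta-base dimensions).
hidden_size = 768        # assumption: roberta-base hidden width
num_layers = 12          # assumption: roberta-base encoder depth
num_labels = 2
r = 8
targets_per_layer = 3    # query, key, value

# Each adapted Linear(768, 768) gains A (r x in) and B (out x r).
lora_params_per_module = r * hidden_size + hidden_size * r
lora_params = num_layers * targets_per_layer * lora_params_per_module

# Classification head: dense (768 -> 768) plus out_proj (768 -> num_labels), both with biases.
head_params = (hidden_size * hidden_size + hidden_size) + (hidden_size * num_labels + num_labels)

trainable = lora_params + head_params
print(f"LoRA matrices:       {lora_params:,}")
print(f"Classification head: {head_params:,}")
print(f"Estimated trainable: {trainable:,}")
```

Against a base model of roughly 125 million parameters, this estimate comes out below 1%.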

## 4. Preparing a Dataset for Fine-tuning

We'll use the SST-2 dataset from the GLUE benchmark, which contains sentences from movie reviews labeled as positive (1) or negative (0).

In [None]:
# Load the SST-2 dataset
dataset = load_dataset("glue", "sst2")
print(dataset)

# Look at a few examples
for i in range(3):
    print(f"Example {i+1}:")
    print(f"Text: {dataset['train'][i]['sentence']}")
    print(f"Label: {dataset['train'][i]['label']}")
    print()

In [None]:
# Tokenize the dataset; padding is left to the data collator,
# which pads each batch to its longest sequence
def tokenize_function(examples):
    return tokenizer(examples["sentence"], truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Prepare the datasets for training
train_dataset = tokenized_datasets["train"]
eval_dataset = tokenized_datasets["validation"]

# Create a data collator
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
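
`DataCollatorWithPadding` pads each batch to the length of its longest sequence. To illustrate why that matters for short inputs such as SST-2 sentences, the sketch below compares the number of token positions processed under fixed-length padding and under per-batch padding. The sequence lengths are hypothetical and the helper functions are illustrative, not part of any library.

```python
# Compare token positions processed under fixed vs. per-batch (dynamic) padding.

def padded_tokens_fixed(lengths, max_length):
    """Every sequence is padded to max_length."""
    return len(lengths) * max_length

def padded_tokens_dynamic(lengths, batch_size):
    """Each batch is padded only to its own longest sequence."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += len(batch) * max(batch)
    return total

lengths = [7, 12, 9, 30, 5, 18, 11, 8]   # hypothetical token counts, all <= 128

fixed = padded_tokens_fixed(lengths, max_length=128)
dynamic = padded_tokens_dynamic(lengths, batch_size=4)
print(f"Fixed padding:   {fixed} token positions")
print(f"Dynamic padding: {dynamic} token positions")
```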

## 5. Training the Model with LoRA

Now, let's set up the training arguments and train our model with LoRA adapters.

In [None]:
# Define metrics for evaluation
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy_metric.compute(predictions=predictions, references=labels)
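
To make the metric concrete, here is a dependency-free sketch of the argmax-then-compare logic that `compute_metrics` relies on. The logits and labels are invented for illustration, and the `argmax` and `accuracy` helpers are stand-ins, not the `evaluate` library's implementation.

```python
# Pure-Python mirror of the argmax + accuracy logic (illustrative data).

def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    preds = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return {"accuracy": correct / len(labels)}

logits = [[2.1, -0.3], [-1.0, 0.7], [0.2, 0.1], [-0.5, 1.5]]  # hypothetical model outputs
labels = [0, 1, 1, 1]

result = accuracy(logits, labels)
print(result)
```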

In [None]:
# Define training arguments
training_args = TrainingArguments(
    output_dir="./results/roberta-sst2-lora",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,
    evaluation_strategy="epoch",  # named eval_strategy in recent transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=False,
    report_to="none",  # Disable wandb, tensorboard, etc.
)

# Create the trainer
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,  # recent transformers releases prefer processing_class=tokenizer
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
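
To get a feel for how long training will run, the number of optimizer steps implied by these arguments can be estimated ahead of time. The sketch assumes a single device, no gradient accumulation (the default), and the 67,349-example SST-2 training split; adjust the constants if your setup differs.

```python
import math

# Rough estimate of optimizer steps for the training arguments above.
train_examples = 67_349   # assumption: size of the GLUE SST-2 training split
per_device_batch = 16
epochs = 3
num_devices = 1           # assumption: single GPU or CPU
grad_accum_steps = 1      # assumption: default gradient accumulation

effective_batch = per_device_batch * num_devices * grad_accum_steps
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(f"Steps per epoch: {steps_per_epoch:,}")
print(f"Total steps:     {total_steps:,}")
```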

In [None]:
# Train the model
trainer.train()

## 6. Evaluating the LoRA-adapted Model

Let's evaluate our fine-tuned model on the validation set.

In [None]:
# Evaluate the model
eval_results = trainer.evaluate()
print(f"Evaluation results: {eval_results}")

## 7. Saving the LoRA Adapter

One of the key benefits of LoRA is that we only need to save the adapter weights (plus, for sequence classification, the small task head), not the entire model.

In [None]:
# Save the LoRA adapter weights
peft_model_path = "./lora-roberta-sst2"
peft_model.save_pretrained(peft_model_path)

print(f"LoRA adapter saved to {peft_model_path}")

# Check the size of the saved adapter
!du -sh {peft_model_path}
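
For comparison with the `du` output, the expected size can be approximated from parameter counts. This sketch assumes 32-bit floats, roughly 1.03 million saved parameters (the LoRA matrices plus the classification head for `roberta-base` with `r=8`), and a base model of about 125 million parameters. All three figures are assumptions, so the result is an order-of-magnitude estimate only.

```python
# Approximate on-disk size from parameter counts (float32 assumed).

def size_mb(num_params, bytes_per_param=4):
    """Size in megabytes (10^6 bytes) for a given parameter count."""
    return num_params * bytes_per_param / 1e6

adapter_params = 1_034_498    # assumption: LoRA matrices + classification head
base_params = 125_000_000     # assumption: approximate roberta-base size

adapter_mb = size_mb(adapter_params)
base_mb = size_mb(base_params)
print(f"Adapter:    ~{adapter_mb:.1f} MB")
print(f"Full model: ~{base_mb:.0f} MB")
```

The saved directory also holds a small config file, so the figure reported by `du` will not match exactly.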

## 8. Loading and Using the LoRA Adapter

Let's see how to load and use our trained LoRA adapter with the base model.

In [None]:
# Load the base model again
base_model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=2,
    return_dict=True
)

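# Load the LoRA adapter, then move the model to the same device as the inputs
peft_model_loaded = PeftModel.from_pretrained(base_model, peft_model_path)
peft_model_loaded = peft_model_loaded.to(device)
peft_model_loaded.eval()  # disable dropout for inference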

# Test the model on a sample input
test_texts = [
    "This movie was fantastic! I really enjoyed it.",
    "What a terrible waste of time. I hated every minute."
]

# Tokenize the inputs
inputs = tokenizer(test_texts, return_tensors="pt", padding=True, truncation=True).to(device)

# Get predictions
with torch.no_grad():
    outputs = peft_model_loaded(**inputs)
    predictions = torch.softmax(outputs.logits, dim=-1)

# Print results
for i, text in enumerate(test_texts):
    sentiment = "positive" if predictions[i][1] > predictions[i][0] else "negative"
    confidence = predictions[i][1] if sentiment == "positive" else predictions[i][0]
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment} (confidence: {confidence:.4f})")
    print()

## Conclusion

In this notebook, we've successfully implemented LoRA fine-tuning on a pre-trained RoBERTa model for sentiment analysis. We've seen how LoRA allows us to adapt a large model with only a small number of trainable parameters, making fine-tuning more efficient.

Key takeaways:

1. LoRA significantly reduces the number of trainable parameters (typically <1% of the full model)
2. The adapter weights are small and easy to store/distribute
3. We can achieve competitive performance with far fewer computational resources
4. The base model remains unchanged, allowing for multiple task adaptations

In Part 2, we'll explore more advanced LoRA techniques, including adapter merging and multi-task adaptation.