# 🎯 LoRA Fine-Tuning Guide

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Gaurav14cs17/LLMs_Model/blob/main/Fine-Tuning-LLMs-Guide/notebooks/02_lora_fine_tuning.ipynb)

**Parameter-Efficient Fine-Tuning using Low-Rank Adaptation (LoRA)**

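### 🔑 Key Benefits of LoRA
- Train only **0.1-1%** of the model's parameters
- Roughly **10x less memory** than full fine-tuning, since gradients and optimizer state are kept only for the adapter weights
- Quality is often close to full fine-tuning, though results vary by task and rank
- Easy to merge adapters back into the base model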

**⚠️ Requirements**: GPU with 16GB+ VRAM
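
To make the "low-rank" idea behind these benefits concrete, here is a minimal pure-Python sketch of a LoRA-adapted linear layer. It assumes the standard LoRA formulation, in which a frozen weight `W` is augmented by a trainable product `B @ A` scaled by `alpha / r`. The toy matrices and the helper names (`matvec`, `lora_forward`) are illustrative and are not part of PEFT.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x) for a frozen W and trainable A, B."""
    base = matvec(W, x)                  # frozen path: W x
    low_rank = matvec(B, matvec(A, x))   # adapter path: B (A x), rank at most r
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# Toy 3x3 frozen weight with a rank-1 adapter (r=1, alpha=2).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
A = [[1.0, 1.0, 1.0]]                # shape (r, d_in)
B_zero = [[0.0], [0.0], [0.0]]       # shape (d_out, r); LoRA initializes B to zero
B_trained = [[0.5], [0.0], [-0.5]]   # hypothetical values after some training
x = [1.0, 2.0, 3.0]

y_init = lora_forward(W, A, B_zero, x, alpha=2, r=1)
y_tuned = lora_forward(W, A, B_trained, x, alpha=2, r=1)
print(y_init)   # identical to W x, because B starts at zero
print(y_tuned)  # shifted by the scaled low-rank update
```

Because `B` starts at zero, the adapted model reproduces the base model exactly at step 0, and training only has to learn the small matrices `A` and `B`.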


In [None]:
# Install and import
!pip install -q transformers datasets accelerate peft bitsandbytes trl

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
from trl import SFTTrainer

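# Report the GPU in use, or say so clearly if none is visible (training below needs one)
print(f"GPU: {torch.cuda.get_device_name(0)}" if torch.cuda.is_available() else "No CUDA GPU detected")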


In [None]:
# Configuration
MODEL_NAME = "microsoft/phi-2"
OUTPUT_DIR = "./lora-fine-tuned"

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16,
    device_map="auto"
)

# LoRA Configuration - these are the key hyperparameters!
lora_config = LoraConfig(
    r=16,              # Rank: higher = more capacity, more memory
    lora_alpha=32,     # Alpha: scaling factor (often 2x rank)
    lora_dropout=0.05, # Dropout for regularization
    bias="none",       # Don't train bias terms
    task_type=TaskType.CAUSAL_LM,
    # Attention projections; the Transformers phi-2 implementation names its output projection "dense", not "o_proj"
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)

# Apply LoRA to model
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # Should report well under 1% of parameters as trainable
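
The trainable fraction reported above can be sanity-checked by hand. The sketch below is a rough estimate, not a reading of the actual checkpoint: the hidden size (2560), layer count (32), and total parameter count (about 2.78 billion) are assumed figures for a phi-2-sized model, and it assumes four square attention projections per layer receive an adapter.

```python
def lora_params(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    return r * (d_in + d_out)  # A is (r, d_in), B is (d_out, r)

# Assumed architecture figures for a phi-2-sized model (not read from the checkpoint).
HIDDEN_SIZE = 2560
NUM_LAYERS = 32
TOTAL_PARAMS = 2.78e9
ADAPTED_PER_LAYER = 4   # four square attention projections per layer
RANK = 16

per_matrix = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
trainable = per_matrix * ADAPTED_PER_LAYER * NUM_LAYERS
fraction = trainable / TOTAL_PARAMS

print(f"per matrix: {per_matrix:,}")
print(f"trainable:  {trainable:,}")
print(f"fraction:   {fraction:.2%}")
```

Under these assumptions the estimate lands at a few tenths of a percent. The exact figure printed by `print_trainable_parameters()` depends on which module names actually match `target_modules`.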


In [None]:
# Load and prepare dataset
from datasets import load_dataset
from trl import SFTTrainer, SFTConfig

dataset = load_dataset("tatsu-lab/alpaca", split="train[:2000]")

def format_alpaca(sample):
    if sample.get("input", ""):
        return {"text": f"### Instruction:\n{sample['instruction']}\n\n### Input:\n{sample['input']}\n\n### Response:\n{sample['output']}"}
    return {"text": f"### Instruction:\n{sample['instruction']}\n\n### Response:\n{sample['output']}"}

dataset = dataset.map(format_alpaca)
print(f"Dataset: {len(dataset)} samples")
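
To see what the formatter produces, the sketch below reruns the same template on two hypothetical records shaped like Alpaca rows (they are made up for illustration, not taken from the dataset). One has an `input` field and one does not.

```python
def format_alpaca(sample):
    """Same template as above: the Input section appears only when it is non-empty."""
    if sample.get("input", ""):
        return {"text": f"### Instruction:\n{sample['instruction']}\n\n### Input:\n{sample['input']}\n\n### Response:\n{sample['output']}"}
    return {"text": f"### Instruction:\n{sample['instruction']}\n\n### Response:\n{sample['output']}"}

# Hypothetical records (illustrative only).
with_input = {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"}
without_input = {"instruction": "Name a primary color.", "input": "", "output": "Red"}

text_a = format_alpaca(with_input)["text"]
text_b = format_alpaca(without_input)["text"]

print(text_a)
print("---")
print(text_b)
```

The empty-string check matters because many Alpaca rows carry an empty `input`, and an empty `### Input:` section would add noise to the prompt.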


In [None]:
# Training configuration
training_args = SFTConfig(
    output_dir=OUTPUT_DIR,
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,  # LoRA typically uses higher LR
    warmup_ratio=0.03,
    logging_steps=25,
    save_steps=100,
    fp16=True,           # Mixed-precision training
    max_seq_length=512,  # Truncate long samples (newer TRL releases rename this to max_length)
)

# Train with SFTTrainer
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # Newer TRL releases expect processing_class=tokenizer instead
)

print("🚀 Starting LoRA training...")
trainer.train()
print("✅ LoRA training complete!")
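
The training arguments above imply a fairly short run. The arithmetic below is a sketch that assumes a single GPU and no sequence packing, so each of the 2,000 samples counts as one training example. The warmup rounding (ceiling) is an assumption about how the ratio is converted to steps.

```python
import math

NUM_SAMPLES = 2000        # train[:2000]
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
NUM_GPUS = 1              # assumption: single-GPU run
EPOCHS = 1
WARMUP_RATIO = 0.03
LOGGING_STEPS = 25
SAVE_STEPS = 100

# Samples consumed per optimizer update
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS
total_steps = math.ceil(NUM_SAMPLES / effective_batch) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
log_events = total_steps // LOGGING_STEPS
periodic_checkpoints = total_steps // SAVE_STEPS

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps:      {total_steps}")
print(f"warmup steps:         {warmup_steps}")
print(f"log events:           {log_events}")
print(f"periodic checkpoints: {periodic_checkpoints}")
```

With so few optimizer steps, raising `num_train_epochs` or the dataset slice is the simplest lever if the loss is still falling at the end of the run.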


In [None]:
# Save LoRA adapter (very small file!)
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)

# Check adapter size
import os
adapter_size = sum(os.path.getsize(os.path.join(OUTPUT_DIR, f)) for f in os.listdir(OUTPUT_DIR) if f.endswith('.safetensors'))
print(f"💾 LoRA adapter size: {adapter_size / 1e6:.2f} MB (vs ~5GB for full model!)")
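
As a cross-check on the number printed above, adapter size can be estimated from the parameter count alone. This is a back-of-the-envelope sketch: the adapter parameter count assumes rank 16 on four 2560x2560 projections in each of 32 layers, the full-model count of about 2.78 billion is likewise assumed, and the dtype actually written to disk depends on the library versions in use.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

def size_mb(num_params, dtype):
    """Approximate on-disk size in megabytes, ignoring file metadata."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6

# Assumed counts (illustrative, not measured from the saved files).
ADAPTER_PARAMS = 16 * (2560 + 2560) * 4 * 32
FULL_MODEL_PARAMS = 2.78e9

adapter_fp32 = size_mb(ADAPTER_PARAMS, "float32")
adapter_fp16 = size_mb(ADAPTER_PARAMS, "float16")
full_fp16 = size_mb(FULL_MODEL_PARAMS, "float16")

print(f"adapter (float32): {adapter_fp32:.1f} MB")
print(f"adapter (float16): {adapter_fp16:.1f} MB")
print(f"full model (float16): {full_fp16 / 1e3:.2f} GB")
```

If the measured size differs by a large factor from these estimates, check whether some `target_modules` names failed to match.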


In [None]:
# Test the LoRA fine-tuned model
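def generate(prompt, max_tokens=128):
    model.eval()  # Disable LoRA dropout, which stays active after trainer.train()
    inputs = tokenizer(f"### Instruction:\n{prompt}\n\n### Response:\n", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_tokens, temperature=0.7, do_sample=True, pad_token_id=tokenizer.eos_token_id)
    # Keep only the text after the last "### Response:" marker
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:")[-1].strip()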

# Test
print("🤖 Testing LoRA model:")
print(generate("What are the benefits of exercise?"))
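
The response extraction used in `generate` can be tried without a model. The sketch below applies the same prompt template and split to hypothetical decoded strings (the completions are made up). The helper names `build_prompt` and `extract_response` are illustrative.

```python
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"

def build_prompt(instruction):
    """Wrap an instruction in the same template used for training."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

def extract_response(decoded):
    """Keep only the text after the last '### Response:' marker."""
    return decoded.split("### Response:")[-1].strip()

# Hypothetical decoded output: the prompt followed by a made-up completion.
decoded = build_prompt("What are the benefits of exercise?") + "It improves heart health.\n"
answer = extract_response(decoded)

# Caveat: if the model emits the marker again, only the final segment is kept.
rambling = decoded + "\n### Response:\nSecond answer"
last_segment = extract_response(rambling)

print(answer)
print(last_segment)
```

Using the same template at inference as at training time matters, since the model has only learned to respond after that exact `### Response:` header.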


## 🔀 Optional: Merge LoRA into Base Model

```python
# Merge LoRA weights into base model for faster inference
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged-model")
```
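
Merging folds the scaled adapter product into the base weight, so the merged layer should give the same outputs as the base layer plus the adapter. The pure-Python sketch below checks this on toy matrices, assuming the standard formulation `W_merged = W + (alpha / r) * B @ A`. The helper names are hypothetical, and this is not a description of how PEFT implements `merge_and_unload()` internally.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def matvec(M, x):
    """Multiply matrix M by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), a single dense weight."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy 2x2 layer with a rank-1 adapter (values are illustrative).
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.0, -1.0]]      # shape (r, d_in)
B = [[0.5], [2.0]]     # shape (d_out, r)
alpha, r = 2, 1
x = [2.0, 1.0]

# Unmerged path: base output plus the scaled adapter output.
adapter_part = matvec(B, matvec(A, x))
adapter_out = [b + (alpha / r) * a for b, a in zip(matvec(W, x), adapter_part)]

# Merged path: one matrix, one multiplication.
W_merged = merge_lora(W, A, B, alpha, r)
merged_out = matvec(W_merged, x)

print(adapter_out)
print(merged_out)
```

The merged layer needs one matrix multiplication instead of three, which is where the inference speedup comes from. The trade-off is that the adapter can no longer be swapped out without reloading the base weights.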

## 📚 Next Steps
- Try [QLoRA](./03_qlora_fine_tuning.ipynb) for 4-bit quantization (even less memory!)
- Try [DPO Training](./04_dpo_training.ipynb) for preference alignment

📖 Reference: [A Comprehensive Guide to Fine-Tuning LLMs](https://arxiv.org/html/2408.13296v1)
