# Parameter-Efficient Fine-Tuning of GPT-2 with LoRA

This notebook demonstrates how to use **LoRA (Low-Rank Adaptation)** to fine-tune a pretrained GPT-2 model efficiently. LoRA reduces memory usage and computational cost by freezing the pretrained weights and training only small low-rank matrices that are added to selected layers.
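
As a brief sketch of the idea, following the standard LoRA formulation (the symbols are notation introduced here, not names used in this notebook): for a frozen pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ and an input $x \in \mathbb{R}^{k}$, LoRA learns two small matrices $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ with rank $r \ll \min(d, k)$, and the adapted layer computes

$$
h = W_0 x + \frac{\alpha}{r} B A x
$$

where $\alpha$ is a scaling factor. Only $A$ and $B$ are trained, so the number of trainable parameters for that layer drops from $dk$ to $r(d + k)$.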

We will:
- Load a pretrained GPT-2 model.
- Apply LoRA configuration to the model.
- Tokenize input text.
- Run a forward pass to verify that the setup works correctly.

## Step 1: Install Required Libraries

We will use Hugging Face's `transformers` library for loading GPT-2 and `peft` for applying LoRA.

In [None]:
# !pip install transformers peft

## Step 2: Load Pretrained GPT-2 Model and Tokenizer

Here, we load the pretrained GPT-2 model and its tokenizer. This gives us a base model that has already learned general language patterns.

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load GPT-2 model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 has no dedicated padding token, so reuse its existing
# end-of-sequence token (eos_token) as the padding token
tokenizer.pad_token = tokenizer.eos_token

# Print confirmation
print("GPT-2 model and tokenizer loaded successfully!")

GPT-2 model and tokenizer loaded successfully!


**Troubleshooting Tip:**
- If you encounter an error here, ensure that the `transformers` library is installed correctly. Use `pip list` to verify.

## Step 3: Apply LoRA Configuration

**LoRA Configuration:**
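- `r`: Rank of the low-rank update matrices (e.g., `8`). Smaller values mean fewer trainable parameters.
- `lora_alpha`: Scaling factor (e.g., `32`). The LoRA update is scaled by `lora_alpha / r`.
- `target_modules`: Names of the layers in GPT-2 where LoRA will be applied (e.g., `c_attn`, `c_proj`).
- `lora_dropout`: Dropout probability applied to the LoRA branch during training (e.g., `0.1`).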
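
To make these settings concrete, here is a small arithmetic sketch that needs no libraries. It assumes the standard 124M-parameter GPT-2 layout (hidden size 768, 12 transformer blocks); the layer shapes and the helper `lora_param_count` are illustrative assumptions, not values read from the loaded model.

```python
# Illustrative arithmetic only: counts LoRA parameters for GPT-2 (small).
# Shapes are (in_features, out_features) of the layers matched by
# target_modules=["c_attn", "c_proj"]. They are assumptions based on the
# standard 124M-parameter configuration (hidden size 768, 12 blocks).

def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Parameters in the two LoRA matrices: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)

r = 8
lora_alpha = 32
n_blocks = 12

matched_layers = {
    "attn.c_attn": (768, 2304),  # fused query/key/value projection
    "attn.c_proj": (768, 768),   # attention output projection
    "mlp.c_proj": (3072, 768),   # MLP output projection (also named c_proj)
}

per_block = sum(lora_param_count(i, o, r) for i, o in matched_layers.values())
total_lora_params = per_block * n_blocks
full_params_matched = n_blocks * sum(i * o for i, o in matched_layers.values())
scaling = lora_alpha / r

print(f"LoRA parameters per block: {per_block:,}")
print(f"LoRA parameters in total:  {total_lora_params:,}")
print(f"Frozen parameters in the matched layers: {full_params_matched:,}")
print(f"Scaling applied to the update: {scaling}")
```

Note that the name `c_proj` appears in both the attention and the MLP sub-blocks, so listing it once targets both.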

In [2]:
from peft import LoraConfig, get_peft_model

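# Define LoRA configuration
lora_config = LoraConfig(
    r=8,            # Rank of the low-rank update matrices
    lora_alpha=32,  # Scaling factor (the update is scaled by lora_alpha / r)
    target_modules=["c_attn", "c_proj"],  # Matches attention c_attn/c_proj and MLP c_proj in GPT-2
    lora_dropout=0.1,  # Dropout applied to the LoRA branch during training
    fan_in_fan_out=True,  # GPT-2 uses Conv1D layers, which store weights as (fan_in, fan_out)
)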

# Apply LoRA to the model
model = get_peft_model(model, lora_config)

# Print confirmation
print("LoRA applied successfully!")

LoRA applied successfully!




**Common Mistake:** Specifying target modules that do not exist in the model. The names `c_attn` and `c_proj` are specific to GPT-2; other architectures name their layers differently.
- *Solution:* If you encounter an error, double-check the layer names in your model's architecture.
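
One way to check the layer names is to list the model's modules. The sketch below assumes a tiny, randomly initialised GPT-2 (built from a made-up `GPT2Config`) as a stand-in so that nothing has to be downloaded; the layer names follow the same scheme as the pretrained model.

```python
from collections import Counter

from transformers import GPT2Config, GPT2LMHeadModel

# Assumption: a tiny, randomly initialised GPT-2 stands in for the real model
# so that nothing needs to be downloaded. The sizes below are made up.
tiny_config = GPT2Config(vocab_size=100, n_positions=32, n_embd=32, n_layer=2, n_head=2)
tiny_model = GPT2LMHeadModel(tiny_config)

# Count leaf modules (those without children) by the last part of their name.
# An entry in target_modules matches modules whose full name ends with it.
leaf_names = Counter(
    name.split(".")[-1]
    for name, module in tiny_model.named_modules()
    if name and not list(module.children())
)

for leaf_name, count in sorted(leaf_names.items()):
    print(f"{leaf_name}: {count}")
```

In this notebook, the same loop can be run over `model.named_modules()` before LoRA is applied.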

## Step 4: Tokenize Input Text

**Explanation:** Tokenization converts plain text into numerical IDs that the model can process.

In [5]:
# Tokenize a single input text

input_text = "Once upon a time in a galaxy far away"
inputs = tokenizer(input_text, return_tensors="pt").input_ids

# Print confirmation
print("Input text tokenized successfully!")

Input text tokenized successfully!


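**Troubleshooting Tip:** With `return_tensors="pt"`, the tokenizer already returns tensors of shape `(batch_size, sequence_length)`. If you build the input tensor manually and encounter a shape mismatch error later, add the batch dimension yourself (e.g., with `.unsqueeze(0)`).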
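
The padding token set in Step 2 matters once several texts of different lengths are tokenized together. Below is a rough, library-free sketch of what padding a batch involves. The helper `pad_batch` and the token IDs are hypothetical; in practice the tokenizer handles this when called with `padding=True`.

```python
# Minimal sketch of right-padding a batch of token-ID sequences.
# The ID values are made up for illustration; 50256 is GPT-2's eos_token ID,
# which this notebook reuses as the padding token.
PAD_TOKEN_ID = 50256

def pad_batch(sequences, pad_id):
    """Right-pad each sequence to the longest length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask

batch = [[101, 102, 103, 104], [201, 202]]
input_ids, attention_mask = pad_batch(batch, PAD_TOKEN_ID)

print(input_ids)       # 2 sequences, each padded to 4 tokens
print(attention_mask)  # 1 marks real tokens, 0 marks padding
```

The attention mask lets the model ignore the padded positions, which is why it should be passed along with the input IDs for batched inputs.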

## Step 5: Run a Forward Pass


In [4]:
# Run a forward pass through the model, passing the token IDs by keyword
outputs = model(input_ids=inputs)
print("Forward pass completed!")

Forward pass completed!


**Common Mistake:** Forgetting to move the inputs and the model to the same device (e.g., GPU or CPU).
- *Solution:* Move both to the same device using `.to(device)`.
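
A minimal sketch of the device-handling pattern is shown below. It assumes a small `nn.Linear` layer as a stand-in for the LoRA-wrapped GPT-2 model, so the names `toy_model` and `toy_inputs` are hypothetical.

```python
import torch
from torch import nn

# Pick a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumption: a small linear layer stands in for the LoRA-wrapped GPT-2 model.
toy_model = nn.Linear(4, 2).to(device)
toy_inputs = torch.randn(1, 4).to(device)

with torch.no_grad():
    toy_outputs = toy_model(toy_inputs)

print(f"Model device:  {next(toy_model.parameters()).device}")
print(f"Output device: {toy_outputs.device}")
```

Applied to this notebook, that would mean calling `model.to(device)` and reassigning `inputs = inputs.to(device)` before the forward pass, since `.to()` on a tensor returns a new tensor rather than moving it in place.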