# 🦥 Fine-tuning Qwen 2.5 7B with Unsloth: A Step-by-Step Guide

Hey there, ML enthusiasts! 👋 Ready to dive into some seriously fun model fine-tuning? We're going to take Qwen 2.5 7B for a spin and teach it some new tricks using Unsloth.

## 📚 What We're Building

We're setting up a fine-tuning pipeline for the Qwen 2.5 7B model using the `O1-OPEN/OpenO1-SFT` Chain of Thought (CoT) dataset. This will help the model become better at explaining its reasoning process - kind of like teaching it to show its work instead of just blurting out answers!

## 🎯 Prerequisites

Before we jump in, make sure you have:
- Python 3.8+
- A GPU with at least 16GB VRAM (we're using an NVIDIA L4, which has 24GB)
- Basic understanding of transformers and PyTorch
- A cup of coffee ☕ (optional but recommended)
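
Curious where a VRAM number like that comes from? Here's a back-of-the-envelope sketch of how much memory the weights alone take up. It assumes a round 7 billion parameters (the real count differs a little) and ignores gradients, optimizer state, and activations, so treat the result as a lower bound. `weight_memory_gb` is a helper made up for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights only, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Assumption: a round 7e9 parameters, used here only for illustration.
NUM_PARAMS = 7e9

for bits in (16, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

At 16-bit the weights alone come to roughly 14 GB, while 4-bit brings that down to about 3.5 GB. That gap is why the `load_in_4bit` option in the model setup is worth knowing about if memory gets tight.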

## 🛠️ Setup

First, let's set up our environment. Unsloth offers different installation methods depending on your setup:

### Option 1: Automatic Installation (Recommended)

Run this in your terminal to get the optimal installation command for your system:


In [None]:
wget -qO- https://raw.githubusercontent.com/unslothai/unsloth/main/unsloth/_auto_install.py | python -

### Option 2: Manual Installation
Choose the appropriate command based on your PyTorch and CUDA versions:

First, upgrade pip:

In [None]:
pip install --upgrade pip

Then install Unsloth based on your setup. For PyTorch 2.5 + CUDA 12.1:

In [None]:
pip install "unsloth[cu121-torch250] @ git+https://github.com/unslothai/unsloth.git"

Install the other required packages (if you plan to write your own training script, you can skip these):

In [None]:
pip install wandb scikit-learn evaluate

> 💡 **Pro Tip**: Not sure which version to install? Use Option 1 (automatic installation) - it'll detect your setup and give you the right command!

> ⚠️ **Note**: Make sure you have the CUDA toolkit installed on your system. The commands above assume you're using a CUDA-capable GPU.
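
If you want a quick sanity check before installing, here's a minimal sketch that looks for the usual NVIDIA command-line tools on your `PATH`: `nvidia-smi` ships with the driver and `nvcc` ships with the CUDA toolkit. The `find_gpu_tools` helper is made up for this guide, and a missing tool is only a hint that something might be off, not proof.

```python
import shutil


def find_gpu_tools() -> dict:
    """Report which NVIDIA command-line tools are visible on PATH."""
    tools = ("nvidia-smi", "nvcc")
    return {tool: shutil.which(tool) is not None for tool in tools}


for tool, found in find_gpu_tools().items():
    status = "found" if found else "not found on PATH"
    print(f"{tool}: {status}")
```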

Now that we have our environment set up, let's dive into the fun part - getting our model ready! 🚀

## 🎬 Step 1: Model Setup (`model_setup.py`)

Let's start with the basics: loading the model and tokenizer. Create `model_setup.py`:

In [None]:
from unsloth import FastLanguageModel
import torch

# Configuration
max_seq_length = 2048  # Flexible length - Unsloth handles RoPE scaling internally!
dtype = None  # Auto-detection (Float16 for T4/V100, Bfloat16 for Ampere+)
load_in_4bit = False  # Set to True to load the model in 4-bit and save memory

# Initialize model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-7B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

print("🦥 Model loaded successfully! Ready to learn new tricks!")
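
Wondering what `dtype = None` ends up choosing? The comment in the config gives the rule of thumb: float16 on older cards like the T4 and V100, bfloat16 on Ampere and newer. Here's a tiny sketch of that rule keyed on the GPU's compute capability. This is not Unsloth's actual detection code, and `pick_dtype` is a name made up for illustration.

```python
def pick_dtype(compute_capability_major: int) -> str:
    """Rule of thumb only: bfloat16 on Ampere (8.x) or newer, else float16."""
    return "bfloat16" if compute_capability_major >= 8 else "float16"


# Compute capability major versions for a few well-known GPUs.
examples = {"V100": 7, "T4": 7, "A100": 8, "L4": 8}

for gpu, major in examples.items():
    print(f"{gpu}: {pick_dtype(major)}")
```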

## 🎭 Step 2: Dataset Preparation (`dataset.py`)

Now that we have our model, let's prepare our dataset! Create `dataset.py`:

In [None]:
from datasets import load_dataset
from unsloth import FastLanguageModel
import torch

# Model setup (repeated from model_setup.py so this file runs on its own)
max_seq_length = 2048
dtype = None 
load_in_4bit = False

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-7B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

# Load our Chain of Thought dataset
dataset = load_dataset("O1-OPEN/OpenO1-SFT", split="train")
print("📚 Dataset columns:", dataset.column_names)

# Define our awesome prompt template
instruction_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
{}"""

# Function to format our prompts
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    outputs = examples["output"]
    texts = []
    
    for instruction, output in zip(instructions, outputs):
        formatted_text = instruction_template.format(
            instruction.strip(),
            output.strip()
        ) + tokenizer.eos_token
        texts.append(formatted_text)
    
    return {"text": texts}

# Process the dataset
dataset = dataset.map(
    formatting_prompts_func, 
    batched=True,
    remove_columns=dataset.column_names
)

print("🎉 Dataset processed and ready for training!")
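
To see what the formatting step actually produces, here's a toy version you can run without downloading anything. With `batched=True`, `map` hands the function a dict of columns (each a list), which is why the function loops over `zip(...)`. The shortened template, the `<eos>` stand-in for `tokenizer.eos_token`, and the example batch are all made up for illustration.

```python
# Stand-in for tokenizer.eos_token, used here only for illustration.
EOS_TOKEN = "<eos>"

# Shortened version of the prompt template from dataset.py.
TEMPLATE = """### Instruction:
{}

### Response:
{}"""


def format_batch(examples: dict) -> dict:
    """Turn a column-oriented batch into a list of prompt strings."""
    texts = []
    for instruction, output in zip(examples["instruction"], examples["output"]):
        texts.append(TEMPLATE.format(instruction.strip(), output.strip()) + EOS_TOKEN)
    return {"text": texts}


# A made-up batch with one example, shaped like what map(batched=True) passes in.
toy_batch = {
    "instruction": ["  What is 2 + 3?  "],
    "output": ["2 + 3 = 5, so the answer is 5.\n"],
}
print(format_batch(toy_batch)["text"][0])
```

Notice that the stray whitespace is stripped and the EOS marker lands right at the end of the response, which is what tells the model where an answer stops.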

## 🚀 Step 3: Training Setup (`qwen-2.5-7b-SFT-training-wandb.py`)

Finally, the main event! Let's set up our training pipeline with all the bells and whistles: experiment tracking with Weights & Biases, a train/eval split, checkpointing, and early stopping. It builds directly on `dataset.py`, so you'll recognize the model and prompt code. Create `qwen-2.5-7b-SFT-training-wandb.py`:

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments, EarlyStoppingCallback
from unsloth import is_bfloat16_supported
from datasets import load_dataset
import evaluate
from unsloth import FastLanguageModel
import wandb

# Initialize wandb for experiment tracking
wandb.init(
    project="qwen_cot_finetune",
    config={
        "learning_rate": 3e-5,
        "architecture": "Qwen2.5-7B",
        "dataset": "O1-OPEN/OpenO1-SFT",
        "epochs": 3,
    }
)

# Set up model and tokenizer (from model_setup.py)
max_seq_length = 2048
dtype = None 
load_in_4bit = False

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-7B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

# Load and prepare datasets (from dataset.py)
train_dataset = load_dataset("O1-OPEN/OpenO1-SFT", split="train[:80%]")
eval_dataset = load_dataset("O1-OPEN/OpenO1-SFT", split="train[80%:]")
print("📊 Dataset splits ready!")

# Define our prompt template (from dataset.py)
instruction_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
{}"""

def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    outputs = examples["output"]
    texts = []
    
    for instruction, output in zip(instructions, outputs):
        formatted_text = instruction_template.format(
            instruction.strip(),
            output.strip()
        ) + tokenizer.eos_token
        texts.append(formatted_text)
    
    return {"text": texts}

# Process datasets
train_dataset = train_dataset.map(
    formatting_prompts_func, 
    batched=True,
    remove_columns=train_dataset.column_names
)

eval_dataset = eval_dataset.map(
    formatting_prompts_func, 
    batched=True,
    remove_columns=eval_dataset.column_names
)

# Set up training arguments
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    warmup_steps=6,
    num_train_epochs=3,
    learning_rate=3e-5,
    fp16=not is_bfloat16_supported(),
    bf16=is_bfloat16_supported(),
    logging_steps=10,
    logging_dir="logs",
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
    output_dir="outputs",
    report_to="wandb",
    max_grad_norm=1.0,
    load_best_model_at_end=True,
    eval_strategy="epoch",
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    save_strategy="epoch",
    save_total_limit=2,
    run_name="qwen_cot_finetune"
)

# Initialize our trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=4,
    packing=True,
    args=training_args,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

# Let's get this party started! 🎉
print("🚀 Starting training...")
trainer.train()

# Clean up
wandb.finish()
print("✨ Training complete! Time to test our newly trained model!")
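
A few of those training arguments interact, so here's a small sketch of the arithmetic behind them: the effective batch size, a rough count of optimizer steps, and the shape of a linear schedule with warmup. The 10,000-sequence figure is a placeholder (with `packing=True` the real number depends on token counts), the helper names are made up for this guide, and the step count is an approximation rather than the trainer's exact bookkeeping.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Sequences that contribute to each optimizer step."""
    return per_device * grad_accum * num_gpus


def optimizer_steps(num_sequences: int, batch_size: int, epochs: int) -> int:
    """Rough number of optimizer steps over the whole run."""
    return math.ceil(num_sequences / batch_size) * epochs


def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


batch = effective_batch_size(per_device=4, grad_accum=2)

# Assumption: 10,000 packed training sequences, a placeholder for illustration.
total = optimizer_steps(num_sequences=10_000, batch_size=batch, epochs=3)

print(f"effective batch size: {batch}")
print(f"approximate optimizer steps: {total}")
for step in (0, 3, 6, total // 2, total):
    lr = linear_lr(step, total, peak_lr=3e-5, warmup_steps=6)
    print(f"step {step:>5}: lr = {lr:.2e}")
```

With `per_device_train_batch_size=4` and `gradient_accumulation_steps=2`, each optimizer step sees 8 sequences on a single GPU. The 6 warmup steps are a very short ramp compared to the total, so the learning rate spends almost the whole run decaying from its peak.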

## 📈 Monitoring Training

During training, you can:
1. Watch the training progress in your terminal
2. Monitor metrics in real-time on Weights & Biases
3. Check the `outputs` directory for saved checkpoints
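
One thing to keep an eye on while monitoring is how early stopping interacts with the evaluation schedule. Here's a simplified model of patience-based stopping, not the actual `EarlyStoppingCallback` code, with made-up loss values. `simulate_early_stopping` is a name invented for this sketch.

```python
def simulate_early_stopping(eval_losses, patience):
    """Return (evaluations_run, stopped_early) for a patience-based rule.

    Simplified model: an evaluation counts as "bad" when its loss does not
    beat the best loss seen so far; training stops once `patience` bad
    evaluations happen in a row.
    """
    best = None
    bad_evals = 0
    for count, loss in enumerate(eval_losses, start=1):
        if best is None or loss < best:
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
        if bad_evals >= patience:
            return count, True
    return len(eval_losses), False


# Made-up eval losses, one per epoch, that get worse after the first epoch.
losses = [1.0, 1.2, 1.3]

for patience in (3, 1):
    evals, stopped = simulate_early_stopping(losses, patience)
    print(f"patience={patience}: ran {evals} evaluations, stopped early: {stopped}")
```

In this simplified model, the first evaluation always sets the baseline, so a run with one evaluation per epoch and 3 epochs can rack up at most 2 bad evaluations. That suggests `early_stopping_patience=3` won't kick in with the settings above. If you want early stopping to matter, consider a lower patience, more epochs, or more frequent evaluation.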

## 🎉 What's Next?

After training, you can:
- Evaluate your model on specific tasks
- Share your fine-tuned model on Hugging Face Hub
- Experiment with different hyperparameters
- Try different datasets

## 🤝 Contributing

Found a bug? Have a suggestion? PRs are welcome! Just remember to:
1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Open a Pull Request

## 📝 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

---

Happy fine-tuning! Remember, if your model isn't learning, try turning it off and on again (just kidding, but sometimes it feels like that would help, right? 😅).

For questions or issues, feel free to open an issue in the repository. And don't forget to ⭐ the repo if you found it helpful!