![Thinkube AI Lab](../icons/tk_full_logo.svg)

# Unsloth Basics 🚀

This notebook introduces memory-efficient fine-tuning with Unsloth:
- What Unsloth is
- Setup and installation
- Loading a base model and adding LoRA adapters
- Preparing data and running a quick fine-tune
- Testing, benchmarking, and saving the result

## What is Unsloth?

Unsloth optimizes LLM fine-tuning:

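- **About 2x Faster**: Optimized kernels; this is the speedup the Unsloth project reports, and it varies with model and hardware
- **Up to 70% Less Memory**: A more memory-efficient training implementation; this is also a project-reported figure
- **QLoRA Support**: Loads base weights in 4-bit and trains LoRA adapters on top
- **Easy to Use**: A small API built around `FastLanguageModel`
- **Free**: Open source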

These savings make it practical to fine-tune many smaller models on a single consumer GPU.
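To see why 4-bit loading matters on consumer hardware, here is a rough back-of-the-envelope sketch. It counts weight storage only and assumes a 7B-parameter model for illustration; real usage is higher because of quantization constants, activations, optimizer state, and the LoRA adapters themselves.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 7e9  # assumed 7B-parameter model, for illustration only

fp16_gb = weight_memory_gb(params, 16)
int4_gb = weight_memory_gb(params, 4)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```

The 4x reduction applies to the frozen base weights only, which is why the overall saving during training is smaller than 4x.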

## Installation

In [None]:
# Install Unsloth (if not pre-installed)
# !pip install unsloth

# TODO: Verify Unsloth installation
# TODO: Check CUDA availability
# TODO: Display versions
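
One possible way to cover the verification steps above is sketched here. It reads package versions from installed metadata instead of importing each library, so it still runs on a machine where Unsloth or a GPU is missing. The package list is an assumption about what this notebook needs.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def cuda_summary() -> str:
    """Describe CUDA availability without failing when torch is absent."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "CUDA not available"
    return f"CUDA available: {torch.cuda.get_device_name(0)}"


# Assumed list of packages relevant to this notebook
for pkg in ("unsloth", "torch", "transformers", "trl", "peft", "datasets"):
    print(f"{pkg:>12}: {installed_version(pkg) or 'not installed'}")

print(cuda_summary())
```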

## Load Base Model with Unsloth

Optimized model loading:

In [None]:
# Load model with Unsloth
from unsloth import FastLanguageModel
import torch

# TODO: Choose base model (Llama-2, Mistral, etc.)
# TODO: Set max sequence length
# TODO: Load model and tokenizer with FastLanguageModel.from_pretrained()
# TODO: Configure 4-bit quantization
# TODO: Display model info and memory usage
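
A minimal sketch of the loading step follows. The checkpoint name and the sequence length are illustrative assumptions, not choices made by this notebook. The Unsloth import sits inside the function because it needs a CUDA GPU, and the actual call is left commented out.

```python
def build_load_kwargs(model_name: str, max_seq_length: int = 2048) -> dict:
    """Collect the arguments for FastLanguageModel.from_pretrained()."""
    return {
        "model_name": model_name,
        "max_seq_length": max_seq_length,
        "dtype": None,         # None lets Unsloth pick a dtype for the GPU
        "load_in_4bit": True,  # 4-bit quantized base weights (QLoRA-style)
    }


def load_base_model(kwargs: dict):
    """Load model and tokenizer; requires Unsloth and a CUDA GPU."""
    from unsloth import FastLanguageModel  # imported lazily on purpose

    return FastLanguageModel.from_pretrained(**kwargs)


# Example checkpoint name is an assumption; substitute the model you want
load_kwargs = build_load_kwargs("unsloth/mistral-7b-bnb-4bit")
print(load_kwargs)

# Uncomment on a GPU machine:
# model, tokenizer = load_base_model(load_kwargs)
```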

## Configure LoRA Adapters

Parameter-efficient fine-tuning:

In [None]:
# Add LoRA adapters

# TODO: Get PEFT model with FastLanguageModel.get_peft_model()
# TODO: Set LoRA rank and alpha
# TODO: Choose target modules (q_proj, v_proj, etc.)
# TODO: Display trainable parameters
# TODO: Show parameter reduction
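
The parameter reduction can be estimated before any model is loaded. LoRA replaces the update to a `d_out x d_in` weight matrix with two low-rank factors of shapes `d_out x r` and `r x d_in`, so it trains `r * (d_in + d_out)` values per adapted matrix. The sketch below assumes a single 4096 x 4096 projection and rank 16 purely as an example.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one weight matrix."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the original (frozen) weight matrix."""
    return d_in * d_out


d_model = 4096  # assumed hidden size, for illustration
rank = 16       # assumed LoRA rank

added = lora_params(d_model, d_model, rank)
frozen = full_params(d_model, d_model)
fraction = added / frozen

print(f"LoRA parameters:   {added:,}")
print(f"Frozen parameters: {frozen:,}")
print(f"Trainable share:   {fraction:.2%}")
```

Doubling the rank doubles the trainable parameters, which is the main trade-off when picking `r`.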

## Prepare Training Data

Format data for instruction tuning:

In [None]:
# Prepare dataset
from datasets import load_dataset

# TODO: Load instruction dataset
# TODO: Format with chat template
# TODO: Tokenize with padding (SFTTrainer can tokenize a text field itself)
# TODO: Create DataLoader (only needed for a custom training loop)
# TODO: Display sample formatted example
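
As a sketch of the formatting step, the function below turns one instruction record into a single training string. The Alpaca-style template, the field names `instruction`, `input`, and `output`, and the `</s>` end-of-sequence token are all assumptions for illustration. With a real tokenizer you would typically pass `tokenizer.eos_token`, or use the model's own chat template instead.

```python
# Hypothetical Alpaca-style template; adapt to the dataset and model you use
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)


def format_example(example: dict, eos_token: str = "</s>") -> str:
    """Render one record as a training string that ends with the EOS token."""
    text = PROMPT_TEMPLATE.format(
        instruction=example["instruction"],
        input=example.get("input", ""),
        output=example["output"],
    )
    return text + eos_token


# Made-up record, not taken from any dataset
sample = {
    "instruction": "Translate to French.",
    "input": "Hello",
    "output": "Bonjour",
}

formatted = format_example(sample)
print(formatted)
```

Appending the EOS token matters: without it the model has no signal for where a response should stop.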

## Quick Fine-Tuning

Train with SFTTrainer:

In [None]:
# Fine-tune with SFTTrainer
from trl import SFTTrainer
from transformers import TrainingArguments

# TODO: Define training arguments
#       - Learning rate, epochs, batch size
#       - Gradient accumulation
#       - Mixed precision (bf16 where supported, otherwise fp16)
# TODO: Create SFTTrainer
# TODO: Start training
# TODO: Monitor GPU memory
# TODO: Display training progress
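
Batch size and gradient accumulation interact, so it helps to work out the effective batch size and the approximate number of optimizer steps before training. The numbers below are placeholders, and the exact step count depends on how the trainer handles a final partial batch.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def optimizer_steps(num_examples: int, eff_batch: int, epochs: int = 1) -> int:
    """Approximate optimizer steps, rounding a partial last batch up."""
    return math.ceil(num_examples / eff_batch) * epochs


# Placeholder settings for illustration
per_device_batch = 2
grad_accum_steps = 4
dataset_size = 1000

eff = effective_batch_size(per_device_batch, grad_accum_steps)
steps = optimizer_steps(dataset_size, eff, epochs=1)

print(f"Effective batch size: {eff}")
print(f"Optimizer steps per epoch: {steps}")
```

Gradient accumulation raises the effective batch size without raising peak activation memory, which is why it pairs well with small per-device batches on a single GPU.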

## Test Fine-Tuned Model

Generate with fine-tuned model:

In [None]:
# Test generation

# TODO: Prepare test prompt
# TODO: Generate response
# TODO: Compare with base model
# TODO: Display both outputs
# TODO: Show improvement
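
Decoded output from a causal language model usually contains the prompt followed by the completion, so a comparison is easier to read once the prompt is stripped. The helper below is a small sketch that assumes prompts end with a `### Response:` marker. The marker and the example strings are hypothetical, not real model output.

```python
RESPONSE_MARKER = "### Response:\n"  # assumed to match the training template


def extract_response(decoded: str, marker: str = RESPONSE_MARKER) -> str:
    """Return the text after the response marker, or the whole text if absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


def side_by_side(prompt: str, base_text: str, tuned_text: str) -> str:
    """Build a short report comparing base and fine-tuned completions."""
    return "\n".join(
        [
            f"Prompt: {prompt}",
            f"Base:   {extract_response(base_text)}",
            f"Tuned:  {extract_response(tuned_text)}",
        ]
    )


# Placeholder strings standing in for decoded generations
base_decoded = "### Instruction:\nGreet the user.\n\n### Response:\nplaceholder base output"
tuned_decoded = "### Instruction:\nGreet the user.\n\n### Response:\nplaceholder tuned output\n"

print(side_by_side("Greet the user.", base_decoded, tuned_decoded))
```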

## Performance Comparison

Unsloth vs standard fine-tuning:

In [None]:
# Benchmark Unsloth benefits

# TODO: Measure training speed (tokens/sec)
# TODO: Compare memory usage
# TODO: Calculate speedup
# TODO: Display comparison chart
# TODO: Show memory savings
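
The benchmark boils down to three ratios. The sketch below computes them from timings and peak memory that you measure yourself. The input values are placeholders chosen to make the arithmetic easy to follow; they are not measurements. On a CUDA machine, `torch.cuda.max_memory_allocated()` is one way to obtain the peak memory figure.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Training throughput."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """How many times faster the optimized run is."""
    return baseline_seconds / optimized_seconds


def memory_saving(baseline_gb: float, optimized_gb: float) -> float:
    """Fraction of peak memory saved relative to the baseline."""
    return 1 - optimized_gb / baseline_gb


# Placeholder inputs; replace with your own measurements
tokens_processed = 100_000
baseline_time_s, optimized_time_s = 200.0, 100.0
baseline_mem_gb, optimized_mem_gb = 20.0, 8.0

throughput = tokens_per_second(tokens_processed, optimized_time_s)
ratio = speedup(baseline_time_s, optimized_time_s)
saved = memory_saving(baseline_mem_gb, optimized_mem_gb)

print(f"Throughput:    {throughput:.0f} tokens/sec")
print(f"Speedup:       {ratio:.2f}x")
print(f"Memory saving: {saved:.0%}")
```

For a fair comparison, keep the model, sequence length, batch size, and dataset identical across both runs.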

## Save Model

Save fine-tuned weights:

In [None]:
# Save fine-tuned model

# TODO: Save LoRA adapters
# TODO: Save full model (merged)
# TODO: Save in different formats (GGUF, safetensors)
# TODO: Display save locations
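
A sketch of the saving step follows. The directory layout is hypothetical. The `save_pretrained_merged` and `save_pretrained_gguf` methods and their `save_method` and `quantization_method` arguments follow the Unsloth documentation as best understood here, so check them against your installed version. Running `save_all` requires a model loaded through Unsloth.

```python
from pathlib import Path


def save_locations(out_dir: str) -> dict:
    """Hypothetical directory layout for the three export formats."""
    root = Path(out_dir)
    return {
        "adapters": root / "lora_adapters",
        "merged": root / "merged_16bit",
        "gguf": root / "gguf",
    }


def save_all(model, tokenizer, out_dir: str) -> dict:
    """Save adapters, a merged model, and a GGUF export; needs an Unsloth model."""
    paths = save_locations(out_dir)

    # LoRA adapters only (small, reusable on top of the base model)
    model.save_pretrained(str(paths["adapters"]))
    tokenizer.save_pretrained(str(paths["adapters"]))

    # Base weights merged with the adapters, stored as safetensors
    model.save_pretrained_merged(
        str(paths["merged"]), tokenizer, save_method="merged_16bit"
    )

    # GGUF export for llama.cpp-style runtimes
    model.save_pretrained_gguf(
        str(paths["gguf"]), tokenizer, quantization_method="q4_k_m"
    )
    return paths


paths = save_locations("outputs")
for name, path in paths.items():
    print(f"{name:>8}: {path}")

# After training on a GPU machine:
# save_all(model, tokenizer, "outputs")
```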

## Supported Models

Unsloth supports:
- Llama 2, Llama 3
- Mistral, Mixtral
- Phi-2, Phi-3
- Gemma
- Other architectures; check the Unsloth documentation for the current list

## Best Practices

- ✅ Start with 4-bit quantization for memory savings
- ✅ Choose a LoRA rank that fits the task (8-64 is a common range); higher ranks add trainable parameters and memory
- ✅ Monitor GPU memory usage
- ✅ Use gradient checkpointing for large models
- ✅ Test on small dataset first
- ✅ Save adapters separately for flexibility
- ✅ Compare with base model regularly

## Next Steps

Continue with:
- **02-qlora-tuning.ipynb** - Deep dive into QLoRA
- **03-dataset-preparation.ipynb** - Prepare quality datasets
- **04-evaluation-deployment.ipynb** - Evaluate and deploy