# **MPESA LLM Fine‑Tuning on Mac M1 (16GB): Step-by-Step Guide**

This notebook is a practical, beginner-friendly guide to fine-tuning a Large Language Model (LLM) on MPESA SMS transaction data. Each step comes with code and the rationale behind it, so you can follow the process from start to finish.

**What you'll accomplish:**
- Prepare and load your MPESA SMS dataset
- Select and configure a base LLM
- Apply LoRA (PEFT) for efficient fine-tuning
- Set up and run supervised fine-tuning (SFT) with TRL
- Save, push, and optionally merge your trained model
- Run a quick inference to check your results

**Workflow Overview:**
1. Log in to Hugging Face and Weights & Biases
2. Load and Prepare the MPESA SMS Data
3. Choose a Base Model
4. Load Tokenizer and Model
5. Configure LoRA (PEFT)
6. Training Configuration (TRL SFT)
7. Train-on-Answer (ToA)
8. Fine-Tune (SFT Trainer)
9. Save, Push to Hub, and Optionally Merge LoRA Weights
10. Quick Sanity Check Inference

1. **Log in to Hugging Face and Weights & Biases**

In [None]:
import os
from huggingface_hub import login
import wandb
from dotenv import load_dotenv

# Check for .env file and load environment variables
if not os.path.exists('.env'):
    print("Warning: .env file not found in the current directory.")
load_dotenv()

hf_token = os.getenv("HF_TOKEN")
wandb_api_key = os.getenv("WANDB_API_KEY")

# 1. Login to Hugging Face (run this once per session)
if hf_token:
    login(token=hf_token)
    print("Logged in to Hugging Face Hub.")
else:
    raise ValueError("HF_TOKEN not set in .env file.")

# 2. Login to Weights & Biases (run this once per session)
if wandb_api_key:
    wandb.login(key=wandb_api_key)
    print("Logged in to Weights & Biases.")
else:
    raise ValueError("WANDB_API_KEY not set in .env file.")

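# 3. Set your WandB project details and initialize the run.
# wandb.init() does not accept log_model or watch arguments; the Hugging Face
# Trainer's WandB callback reads these settings from environment variables.
wandb_project = "mpesa-llm-finetuning"
wandb_log_model = "checkpoint"  # options: "end", "checkpoint", or "false"
wandb_watch = "all"  # options: "all", "gradients", "parameters", or "false"

os.environ["WANDB_PROJECT"] = wandb_project
os.environ["WANDB_LOG_MODEL"] = wandb_log_model
os.environ["WANDB_WATCH"] = wandb_watch

wandb.init(project=wandb_project)
print(f"WandB run initialized: project={wandb_project}, log_model={wandb_log_model}, watch={wandb_watch}")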
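The login cell above raises on the first missing token, so if both `HF_TOKEN` and `WANDB_API_KEY` are absent it takes two runs to find out. As an optional variation, the sketch below checks all required variables up front and reports every missing one in a single error. The helper name `require_env` and its `environ` parameter are hypothetical, introduced only for illustration. It uses only the standard library and assumes the variables were already loaded (for example by `load_dotenv()`).

```python
import os


def require_env(names, environ=None):
    """Return the named variables as a dict, or raise listing every missing one.

    `require_env` is a hypothetical helper, not part of any library.
    An unset variable and an empty string are both treated as missing.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


# Demo with a stand-in mapping rather than the real environment,
# so the example runs without any tokens configured.
demo_env = {"HF_TOKEN": "hf_dummy", "WANDB_API_KEY": ""}
try:
    require_env(["HF_TOKEN", "WANDB_API_KEY"], environ=demo_env)
except ValueError as err:
    print(err)
```

In the notebook itself, a call such as `tokens = require_env(["HF_TOKEN", "WANDB_API_KEY"])` after `load_dotenv()` could replace the two separate `if`/`else` checks, with `tokens["HF_TOKEN"]` passed to `login()` and `tokens["WANDB_API_KEY"]` passed to `wandb.login()`.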