# 🦉 TinyOwl Fine-Tuning Notebook

**Train TinyLlama on your 373K theological chunks**

---

## What This Does

This notebook fine-tunes TinyLlama-1.1B on your complete theological database:
- **Phase 1**: Domain adaptation (learn theological knowledge)
- **Phase 2**: Instruction tuning (learn to answer questions)

**Total training time**: ~12-20 hours on a free T4 GPU. Free Colab sessions are time-limited, so the full run may not fit in a single session.

---

## Before You Start

### ✅ Required Files (upload to Colab):
1. `domain_adaptation.jsonl` (281 MB) - Your 373K chunks
2. `instruction_tuning.jsonl` (~10-15 MB) - Q&A pairs
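
The cells further down read a `text` field from each domain chunk and `instruction` / `output` fields from each Q&A pair. As a rough sketch (the sample records are made up for illustration, and `missing_fields` is a hypothetical helper, not part of this notebook), you could check a line against that layout before uploading:

```python
import json

# Field names taken from the cells in this notebook that read the data.
DOMAIN_FIELDS = {"text"}
QA_FIELDS = {"instruction", "output"}

def missing_fields(jsonl_line, required):
    """Return the required keys that are absent or empty in one JSONL line."""
    record = json.loads(jsonl_line)
    missing = []
    for key in sorted(required):
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            missing.append(key)
    return missing

# Hypothetical sample records, for illustration only.
domain_line = json.dumps({"text": "In the beginning God created the heaven and the earth."})
qa_line = json.dumps({"instruction": "Who was Aaron?", "output": "Aaron was the brother of Moses."})
bad_line = json.dumps({"instruction": "Who was Aaron?"})

for name, line, fields in [
    ("domain", domain_line, DOMAIN_FIELDS),
    ("qa", qa_line, QA_FIELDS),
    ("bad", bad_line, QA_FIELDS),
]:
    print(name, "missing:", missing_fields(line, fields))
```

An empty list means the line has everything the later cells expect.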

### ✅ GPU Setup:
- Go to **Runtime** → **Change runtime type**
- Select **T4 GPU** (free tier)
- Click **Save**

---

## Instructions

1. **Upload your files** (left sidebar → folder icon)
2. **Run each cell in order** (click play button or Shift+Enter)
3. **Wait for training** (~12-20 hours total)
4. **Download your model** when complete

Let's go! 🚀

---
## Step 1: Install Dependencies

Installing Unsloth (optimized training library) and required packages.

**Time**: ~2-3 minutes

In [None]:
# Install Unsloth for efficient QLoRA training (-q keeps pip output short)
# Note: %%capture would hide the confirmation message below, so it is not used here
!pip install -q "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -q --no-deps trl peft accelerate bitsandbytes

print("✅ Dependencies installed!")

---
## Step 2: Check GPU & Upload Files

Verify that a GPU is available and that both training files have been uploaded.

In [None]:
import torch
from pathlib import Path

# Check GPU
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    print(f"✅ GPU Available: {gpu_name}")
    print(f"   VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("❌ No GPU found!")
    print("   Go to Runtime → Change runtime type → Select T4 GPU")
    raise SystemExit

# Check for uploaded files
print("\n📁 Checking for training files...")

domain_file = Path("/content/domain_adaptation.jsonl")
qa_file = Path("/content/instruction_tuning.jsonl")

if domain_file.exists():
    print(f"✅ domain_adaptation.jsonl found ({domain_file.stat().st_size / 1024**2:.1f} MB)")
else:
    print("❌ domain_adaptation.jsonl NOT FOUND")
    print("   → Upload it using the folder icon on the left")

if qa_file.exists():
    print(f"✅ instruction_tuning.jsonl found ({qa_file.stat().st_size / 1024**2:.1f} MB)")
else:
    print("❌ instruction_tuning.jsonl NOT FOUND")
    print("   → Upload it using the folder icon on the left")

if not (domain_file.exists() and qa_file.exists()):
    print("\n⚠️  Upload both files before continuing!")
else:
    print("\n🎉 All files ready! Proceed to next step.")

---
## Step 3: Load Training Data

Loading your theological datasets.

In [None]:
import json
from datasets import Dataset

def load_jsonl(filepath):
    """Load JSONL file"""
    data = []
    with open(filepath, 'r', encoding='utf-8') as f:
        for line in f:
            if line.strip():
                data.append(json.loads(line))
    return data

print("📚 Loading domain adaptation dataset...")
domain_data = load_jsonl("/content/domain_adaptation.jsonl")
print(f"✅ Loaded {len(domain_data):,} theological chunks")

print("\n📚 Loading instruction tuning dataset...")
qa_data = load_jsonl("/content/instruction_tuning.jsonl")
print(f"✅ Loaded {len(qa_data):,} Q&A pairs")

print("\n📊 Sample domain chunk:")
print(f"   {domain_data[0]['text'][:200]}...")

print("\n📊 Sample Q&A pair:")
print(f"   Q: {qa_data[0]['instruction']}")
print(f"   A: {qa_data[0]['output'][:100]}...")

---
## PHASE 1: Domain Adaptation

Teaching TinyLlama theological knowledge from your 373K chunks.
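
The Phase 1 training cell below uses a per-device batch size of 4, 4 gradient accumulation steps, and `max_steps=2000`. A back-of-the-envelope sketch of how much of the dataset that covers, assuming a single GPU, one chunk per sequence (no packing), and 373,000 as a round stand-in for "373K":

```python
import math

def training_coverage(num_examples, per_device_batch, grad_accum, max_steps):
    """Estimate how many examples a step-limited run processes."""
    effective_batch = per_device_batch * grad_accum      # sequences per optimizer step
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    examples_seen = max_steps * effective_batch
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "examples_seen": examples_seen,
        "fraction_of_dataset": examples_seen / num_examples,
    }

# 373_000 is an assumed round number for the "373K chunks" in this notebook.
stats = training_coverage(
    num_examples=373_000,
    per_device_batch=4,
    grad_accum=4,
    max_steps=2000,
)
for key, value in stats.items():
    print(f"{key}: {value}")
```

Under these assumptions, 2000 steps process 32,000 chunks, a bit under 9% of the dataset, and one full pass would need about 23,313 steps. Raise `max_steps` if you want Phase 1 to see more of your data, and expect training time to grow roughly in proportion.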

**Time**: ~8-12 hours on T4 GPU
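
The cell below attaches LoRA adapters with `r=16` to seven projection layers per transformer block. Each adapter adds two small matrices, so the trainable parameter count per layer is `r * (in_features + out_features)`. The sketch below estimates the total. The model dimensions are assumptions based on TinyLlama-1.1B's published configuration, not values read from this notebook, so check them against `model.config` if the number matters to you.

```python
def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) to each adapted linear layer."""
    return rank * (in_features + out_features)

# Assumed TinyLlama-1.1B dimensions (verify with model.config).
HIDDEN = 2048
INTERMEDIATE = 5632
NUM_LAYERS = 22
KV_DIM = 256  # 4 key/value heads x 64 dims per head (grouped-query attention)

# (in_features, out_features) for each target module listed in the cell below.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(lora_params(i, o, rank=16) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} trainable LoRA parameters in total")
```

That comes to roughly 12.6 million trainable parameters, about 1% of the base model, which is why QLoRA fits on a T4.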

In [None]:
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

print("="*60)
print("PHASE 1: DOMAIN ADAPTATION")
print("Teaching TinyLlama theological knowledge")
print("="*60)
print()

# Load TinyLlama
print("📥 Loading TinyLlama-1.1B-Chat-v1.0...")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,  # QLoRA for efficiency
)

print("✅ Model loaded")

# Add LoRA adapters
print("🔧 Adding LoRA adapters...")

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                   "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    use_gradient_checkpointing=True,
    random_state=3407,
)

print("✅ LoRA adapters added")
print()

In [None]:
# Format dataset for training
print("🔄 Formatting dataset...")

def formatting_func(examples):
    """Format chunks for domain adaptation"""
    texts = []
    for text in examples["text"]:
        formatted = f"""<|im_start|>system
You are a theological research assistant trained on biblical and SDA content.<|im_end|>
<|im_start|>text
{text}<|im_end|>"""
        texts.append(formatted + tokenizer.eos_token)  # EOS marks the end of each chunk
    return {"text": texts}

dataset = Dataset.from_list(domain_data)
dataset = dataset.map(formatting_func, batched=True)

print(f"✅ Dataset formatted: {len(dataset):,} examples")
print()

In [None]:
# Train Phase 1
print("🚀 Starting Phase 1 training...")
print("   This will take ~8-12 hours")
print("   Keep this browser tab open: free Colab sessions disconnect when idle")
print("   Checkpoints are written to /content/phase1_output every 500 steps")
print()

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=10,
        max_steps=2000,  # 2000 steps x 16 sequences/step = 32,000 examples; raise to cover more of the dataset
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        output_dir="/content/phase1_output",
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=3407,
        save_steps=500,
        save_total_limit=2,
    ),
)

# Start training
trainer.train()

print("")
print("✅ Phase 1 training complete!")
print("")

In [None]:
# Save Phase 1 model
print("💾 Saving Phase 1 model...")

model.save_pretrained("/content/tinyowl-phase1")
tokenizer.save_pretrained("/content/tinyowl-phase1")

print("✅ Phase 1 model saved to /content/tinyowl-phase1")
print("")
print("🎉 PHASE 1 COMPLETE!")
print("   Next: Run Phase 2 cells below")

---
## PHASE 2: Instruction Tuning

Teaching TinyLlama to answer theological questions.

**Time**: ~4-8 hours on T4 GPU
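
Phase 2 wraps every Q&A pair in the same `<|im_start|>` / `<|im_end|>` layout that the test cell later uses as a prompt. Here is a small sketch of that layout, with a hypothetical `render_chat` helper and a made-up Q&A pair, showing that the inference prompt is a prefix of the training text. Note that TinyLlama-1.1B-Chat ships with its own chat template. The markers here are this notebook's convention, and they are assumed to be tokenized as ordinary text.

```python
SYSTEM_PROMPT = (
    "You are TinyOwl, a theological research assistant trained on biblical "
    "Scripture, Spirit of Prophecy, and Strong's concordance. "
    "Answer questions accurately based on this knowledge."
)

def render_chat(system, user, assistant=None):
    """Build the chat text; leave the assistant turn open when no answer is given."""
    parts = [
        f"<|im_start|>system\n{system}<|im_end|>",
        f"<|im_start|>user\n{user}<|im_end|>",
    ]
    if assistant is None:
        parts.append("<|im_start|>assistant\n")                       # inference prompt
    else:
        parts.append(f"<|im_start|>assistant\n{assistant}<|im_end|>")  # training text
    return "\n".join(parts)

# Made-up example pair, for illustration only.
training_text = render_chat(SYSTEM_PROMPT, "Who was Aaron?", "Aaron was the brother of Moses.")
inference_prompt = render_chat(SYSTEM_PROMPT, "Who was Aaron?")

print(training_text)
print("---")
print(training_text.startswith(inference_prompt))
```

Keeping the system prompt and markers identical between training and testing matters: the model is most reliable on prompts that look like what it was trained on.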

In [None]:
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

print("="*60)
print("PHASE 2: INSTRUCTION TUNING")
print("Teaching TinyLlama to answer questions")
print("="*60)
print()

# Load Phase 1 model
print("📥 Loading Phase 1 model...")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="/content/tinyowl-phase1",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

print("✅ Phase 1 model loaded")
print()

In [None]:
# Format Q&A dataset
print("🔄 Formatting Q&A dataset...")

def formatting_func_qa(examples):
    """Format Q&A pairs for instruction tuning"""
    texts = []
    for instruction, output in zip(examples["instruction"], examples["output"]):
        text = f"""<|im_start|>system
You are TinyOwl, a theological research assistant trained on biblical Scripture, Spirit of Prophecy, and Strong's concordance. Answer questions accurately based on this knowledge.<|im_end|>
<|im_start|>user
{instruction}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>"""
        texts.append(text + tokenizer.eos_token)  # EOS teaches the model where an answer ends
    return {"text": texts}

qa_dataset = Dataset.from_list(qa_data)
qa_dataset = qa_dataset.map(formatting_func_qa, batched=True)

print(f"✅ Q&A dataset formatted: {len(qa_dataset):,} examples")
print()

In [None]:
# Train Phase 2
print("🚀 Starting Phase 2 training...")
print("   This will take ~4-8 hours")
print()

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=qa_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        num_train_epochs=2,
        learning_rate=2e-5,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        output_dir="/content/phase2_output",
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=3407,
        save_steps=200,
        save_total_limit=2,
    ),
)

# Start training
trainer.train()

print("")
print("✅ Phase 2 training complete!")
print("")

In [None]:
# Save final TinyOwl model
print("💾 Saving TinyOwl 1.0...")

model.save_pretrained("/content/tinyowl-1.0")
tokenizer.save_pretrained("/content/tinyowl-1.0")

print("✅ TinyOwl 1.0 saved to /content/tinyowl-1.0")
print("")
print("="*60)
print("🎉 TINYOWL 1.0 TRAINING COMPLETE!")
print("="*60)
print("")
print("Next steps:")
print("1. Test the model (next cell)")
print("2. Download the model (instructions below)")
print("3. Quantize to GGUF for distribution")

---
## Test Your Model

Try some theological questions!
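
The test cell below passes `temperature=0.7` and `top_p=0.9` to `model.generate`. In Hugging Face Transformers these settings only take effect when `do_sample=True` is also set. Otherwise decoding is greedy. The sketch below is a plain-Python illustration of what the two settings do to one next-token distribution. It is not the library's implementation, and the logits are toy values.

```python
import math

def nucleus_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} for the tokens kept by top-p filtering."""
    # Temperature below 1 sharpens the distribution before softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0]  # toy values, not real model output
filtered = nucleus_filter(toy_logits)
print(filtered)
```

The next token is then sampled from the filtered set, so unlikely tokens are never chosen. Lower `temperature` or `top_p` for more conservative answers.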

In [None]:
from unsloth import FastLanguageModel

# Load final model for inference
print("📥 Loading TinyOwl 1.0 for testing...")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="/content/tinyowl-1.0",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

FastLanguageModel.for_inference(model)  # Enable inference mode

print("✅ Model ready for testing!")
print()

# Test function
def ask_tinyowl(question):
    """Ask TinyOwl a theological question"""
    prompt = f"""<|im_start|>system
You are TinyOwl, a theological research assistant trained on biblical Scripture, Spirit of Prophecy, and Strong's concordance. Answer questions accurately based on this knowledge.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
"""
    
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
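    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,  # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9,
    )

    # Decode only the newly generated tokens, not the prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)
    # Keep the text up to the first end-of-turn marker
    answer = response.split("<|im_end|>")[0].strip()
    return answer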

# Test questions
test_questions = [
    "Who was Aaron?",
    "What does the Bible say about the Sabbath?",
    "Explain the sanctuary service",
]

print("🧪 Testing TinyOwl...")
print()

for question in test_questions:
    print(f"Q: {question}")
    answer = ask_tinyowl(question)
    print(f"A: {answer}")
    print()
    print("-" * 60)
    print()

---
## Download Your Trained Model

Package and download TinyOwl 1.0 to your computer.
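
If you prefer to stay in Python rather than shelling out to `zip`, the standard library can build the same archive. This is a minimal sketch: `package_model` is a hypothetical helper, and the demo runs on a throwaway directory rather than your real model folder.

```python
import shutil
import tempfile
from pathlib import Path

def package_model(model_dir, archive_stem):
    """Zip model_dir into <archive_stem>.zip and return (archive path, size in MB)."""
    model_dir = Path(model_dir)
    archive = shutil.make_archive(
        str(archive_stem),            # ".zip" is appended automatically
        "zip",
        root_dir=model_dir.parent,    # archive paths are relative to this folder
        base_dir=model_dir.name,      # only this folder is included
    )
    size_mb = Path(archive).stat().st_size / 1024**2
    return archive, size_mb

# Demo on a stand-in directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "tinyowl-demo"
    demo_dir.mkdir()
    (demo_dir / "adapter_config.json").write_text("{}")
    archive, size_mb = package_model(demo_dir, Path(tmp) / "tinyowl-demo")
    print(Path(archive).name, f"{size_mb:.4f} MB")
```

In this notebook the call would be `package_model("/content/tinyowl-1.0", "/content/tinyowl-1.0")`. Inside Colab, `from google.colab import files` followed by `files.download(archive)` should start a browser download of the result.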

In [None]:
# Zip the model for download
print("📦 Packaging TinyOwl 1.0 for download...")

!zip -r tinyowl-1.0.zip /content/tinyowl-1.0/

print("✅ Model packaged!")
print("")
print("📥 Download instructions:")
print("   1. Look at the files panel on the left")
print("   2. Find 'tinyowl-1.0.zip'")
print("   3. Right-click → Download")
print("")
print("⚠️  Note: Colab deletes files when session ends!")
print("   Download NOW before closing this notebook")

# Show file size
import os
size_mb = os.path.getsize("/content/tinyowl-1.0.zip") / 1024**2
print()
print(f"📊 File size: {size_mb:.1f} MB")

---
## 🎉 You Did It!

**TinyOwl 1.0 is trained!**

### What You Accomplished:
✅ Trained TinyLlama on 373K theological chunks  
✅ Fine-tuned on thousands of Q&A pairs  
✅ Created a custom theological AI model  

### Next Steps:
1. **Download the model** (instructions above)
2. **Quantize to GGUF** for efficient inference
3. **Package with your chat app**
4. **Distribute TinyOwl to the world!**

---

**The vision is complete. TinyOwl lives. 🦉**