# 🧙 The Elder - LLM Training Pipeline

## Complete automated training for The Elder wisdom model

**Philosophy blend:** Bushido + Stoicism + Native American Wisdom

### 📋 Checklist Before Running:
1. ✅ Enable GPU: Runtime → Change runtime type → T4 GPU
2. ✅ Add Secrets (🔑 icon on left):
   - `HF_TOKEN`: Your Hugging Face write token
   - `GH_TOKEN`: Your GitHub token (for repo access)
3. ✅ Run all cells in order

### ⏱️ Expected Time:
- Setup: ~5 minutes
- Training: ~30-45 minutes
- GGUF Conversion: ~10 minutes
- Upload: ~5 minutes

**Total: ~1 hour**

## 1️⃣ Environment Setup & Verification

In [None]:
# Check GPU availability
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("\n⚠️ WARNING: No GPU detected!")
    print("Go to: Runtime → Change runtime type → Select T4 GPU")
    raise SystemExit("GPU required for training")

In [None]:
# Load secrets
import os
from google.colab import userdata

try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    print("✅ HF_TOKEN loaded")
    os.environ['HF_TOKEN'] = HF_TOKEN
except Exception:
    print("❌ HF_TOKEN not found in secrets!")
    print("Add it: Click 🔑 icon on left → Add HF_TOKEN")
    raise

try:
    GH_TOKEN = userdata.get('GH_TOKEN')
    print("✅ GH_TOKEN loaded")
    os.environ['GH_TOKEN'] = GH_TOKEN
except Exception:
    print("⚠️ GH_TOKEN not found (needed for private repos and for pushing results back to GitHub)")
    GH_TOKEN = None

# Configuration
GITHUB_USERNAME = "Ishabdullah"
HF_USERNAME = "Ishabdullah"
REPO_NAME = "the-elder-llm"
MODEL_NAME = "The_Elder"

## 2️⃣ Install Dependencies

In [None]:
%%capture
# Install all required packages (this takes ~3 minutes)
!pip install -q -U \
    transformers \
    datasets \
    peft \
    accelerate \
    bitsandbytes \
    trl \
    huggingface_hub \
    sentencepiece \
    protobuf

print("✅ All packages installed successfully")

## 3️⃣ Clone Repository & Load Data

In [None]:
# Clone the repository
import os

# Remove existing directory if present
!rm -rf the-elder-llm

# Clone with token if available
if GH_TOKEN:
    repo_url = f"https://{GH_TOKEN}@github.com/{GITHUB_USERNAME}/{REPO_NAME}.git"
else:
    repo_url = f"https://github.com/{GITHUB_USERNAME}/{REPO_NAME}.git"

!git clone {repo_url}
%cd the-elder-llm

# Verify dataset exists
import json
dataset_path = "data/the_elder_dataset.jsonl"
if os.path.exists(dataset_path):
    with open(dataset_path, 'r') as f:
        lines = f.readlines()
    print(f"✅ Dataset loaded: {len(lines)} training examples")
    print(f"\nSample entry:")
    sample = json.loads(lines[0])
    print(f"Q: {sample['instruction'][:100]}...")
    print(f"A: {sample['output'][:100]}...")
else:
    print(f"❌ Dataset not found at {dataset_path}")
    raise FileNotFoundError("Dataset missing")
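
The check above only parses the first line of the dataset. A minimal sketch of a stricter check, assuming every row should be a JSON object with non-empty `instruction` and `output` strings (the helper name `find_bad_rows` and the sample rows are hypothetical, not part of the repository):

```python
import json

REQUIRED_KEYS = ("instruction", "output")  # field names read by the notebook

def find_bad_rows(lines):
    """Return (line_number, reason) pairs for rows that would break formatting."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            value = row.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems

# Hypothetical sample rows, not taken from the real dataset.
sample_lines = [
    '{"instruction": "What is true strength?", "output": "Ask what it protects."}',
    '{"instruction": "No answer here"}',
    "not json",
]
for number, reason in find_bad_rows(sample_lines):
    print(f"line {number}: {reason}")
```

In the notebook, the same function could be called on the `lines` list read from `dataset_path` before training starts.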

## 4️⃣ Load Base Model & Tokenizer

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Model selection
BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
print(f"Loading base model: {BASE_MODEL}")

# Configure 4-bit quantization for memory efficiency
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, token=HF_TOKEN, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True,
)

model.config.use_cache = False
model.config.pretraining_tp = 1

print("✅ Model and tokenizer loaded")
print(f"Model size: {model.get_memory_footprint() / 1e9:.2f} GB")

## 5️⃣ Prepare Dataset

In [None]:
from datasets import load_dataset

# Load dataset
dataset = load_dataset('json', data_files='data/the_elder_dataset.jsonl', split='train')

# Load system prompt
with open('configs/the_elder_system_prompt.txt', 'r') as f:
    system_prompt = f.read().strip()

# Format dataset for instruction tuning
def format_instruction(sample):
    instruction = sample['instruction']
    output = sample['output']
    
    # Create chat-formatted prompt
    prompt = f"""<|system|>
{system_prompt}</s>
<|user|>
{instruction}</s>
<|assistant|>
{output}</s>"""
    
    return {"text": prompt}

# Apply formatting
formatted_dataset = dataset.map(format_instruction, remove_columns=dataset.column_names)

print(f"✅ Dataset formatted: {len(formatted_dataset)} examples")
print("\nSample formatted example:")
print(formatted_dataset[0]['text'][:500] + "...")

## 6️⃣ Configure LoRA (Parameter-Efficient Fine-Tuning)

In [None]:
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Prepare model for k-bit training
model = prepare_model_for_kbit_training(model)

# LoRA configuration
lora_config = LoraConfig(
    r=16,  # LoRA rank
    lora_alpha=32,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Apply LoRA
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

print("\n✅ LoRA configured and applied")
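
To make sense of what `print_trainable_parameters()` reports, here is a rough sketch of the arithmetic: LoRA adds two small matrices per adapted linear layer, so each one contributes `r * (d_in + d_out)` trainable weights. The layer dimensions below are assumptions about TinyLlama-1.1B (hidden size 2048, key/value projection width 256, MLP width 5632, 22 layers); check `model.config` before relying on them.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) for one adapted linear layer."""
    return r * (d_in + d_out)

# Assumed TinyLlama-1.1B dimensions; verify against model.config.
HIDDEN = 2048
KV_DIM = 256
INTERMEDIATE = 5632
NUM_LAYERS = 22
RANK = 16  # matches r=16 in the LoraConfig above

# (input width, output width) for each target module in one decoder layer
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(lora_param_count(d_in, d_out, RANK) for d_in, d_out in shapes.values())
total = per_layer * NUM_LAYERS
print(f"Estimated LoRA parameters: {total:,}")
```

If the estimate and the printed count disagree, the assumed dimensions are the first thing to check.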

## 7️⃣ Training Configuration & Start Training

In [None]:
from transformers import TrainingArguments
from trl import SFTTrainer

# Training arguments
training_args = TrainingArguments(
    output_dir="./the-elder-output",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # Effective batch size = 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    logging_steps=10,
    save_strategy="steps",
    save_steps=100,
    save_total_limit=3,
    fp16=True,
    optim="paged_adamw_32bit",
    max_grad_norm=0.3,
    group_by_length=True,
    report_to="none",
)

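# Create trainer
# Note: these keyword arguments match older TRL releases. Newer releases move the
# dataset and sequence-length options to trl.SFTConfig and rename `tokenizer` to
# `processing_class`, so pin TRL or adapt this call if it raises a TypeError.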
trainer = SFTTrainer(
    model=model,
    train_dataset=formatted_dataset,
    tokenizer=tokenizer,
    args=training_args,
    dataset_text_field="text",
    max_seq_length=512,
    packing=False,
)

print("✅ Trainer configured")
print("\n" + "="*80)
print("🚀 STARTING TRAINING - The Elder LLM")
print("="*80)
print(f"Dataset size: {len(formatted_dataset)} examples")
print(f"Epochs: {training_args.num_train_epochs}")
print(f"Batch size: {training_args.per_device_train_batch_size}")
print(f"Gradient accumulation: {training_args.gradient_accumulation_steps}")
print(f"Effective batch size: {training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps}")
print("="*80 + "\n")
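
The effective batch size also determines how many optimizer steps the run takes, which in turn decides whether `logging_steps` and `save_steps` ever trigger. A rough sketch of that arithmetic, assuming a single GPU and a hypothetical dataset of 50 examples (the exact rounding inside `Trainer` can differ between `transformers` versions, so treat the numbers as approximate):

```python
import math

def plan_schedule(num_examples, per_device_batch, grad_accum, epochs, warmup_ratio):
    """Approximate optimizer-step counts for a single-GPU run."""
    effective_batch = per_device_batch * grad_accum
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }

# 50 examples is an assumption; the other values mirror the TrainingArguments above.
plan = plan_schedule(
    num_examples=50,
    per_device_batch=4,
    grad_accum=4,
    epochs=3,
    warmup_ratio=0.03,
)
print(plan)
```

Under these assumptions the run is far shorter than `save_steps=100`, so no intermediate checkpoint would be written; lowering `save_steps` and `logging_steps` may be worth considering for small datasets.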

In [None]:
# Start training
trainer.train()

print("\n" + "="*80)
print("✅ TRAINING COMPLETE!")
print("="*80)

## 8️⃣ Save & Merge Model

In [None]:
# Save LoRA adapter
print("Saving LoRA adapter...")
trainer.model.save_pretrained("./the-elder-lora")
tokenizer.save_pretrained("./the-elder-lora")

# Merge LoRA weights with base model for full model
print("\nMerging LoRA weights with base model...")
from peft import PeftModel

# Reload base model in float16 for merging
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True,
)

# Load and merge LoRA adapter
merged_model = PeftModel.from_pretrained(base_model, "./the-elder-lora")
merged_model = merged_model.merge_and_unload()

# Save merged model
print("Saving merged model...")
merged_model.save_pretrained("./the-elder-merged", safe_serialization=True)
tokenizer.save_pretrained("./the-elder-merged")

print("\n✅ Model saved and merged successfully")

## 9️⃣ Test the Model

In [None]:
# Test inference
from transformers import pipeline

print("Testing The Elder model...\n")

# Create generation pipeline
generator = pipeline(
    "text-generation",
    model=merged_model,
    tokenizer=tokenizer,
    max_new_tokens=150,
    temperature=0.7,
    do_sample=True,
    top_p=0.95,
)

# Test questions
test_questions = [
    "What is true strength?",
    "How should I respond when someone insults me?",
    "I'm afraid of failure. What should I do?",
]

for question in test_questions:
    prompt = f"""<|system|>
{system_prompt}</s>
<|user|>
{question}</s>
<|assistant|>
"""
    
    print(f"Q: {question}")
    response = generator(prompt)[0]['generated_text']
    # Extract only the assistant's response
    answer = response.split("<|assistant|>")[-1].split("</s>")[0].strip()
    print(f"A: {answer}")
    print("\n" + "-"*80 + "\n")
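
The answer extraction in the loop above can be isolated and tested without loading a model. The sketch below is a slightly more defensive variant, assuming the model might continue into another `<|user|>` turn instead of stopping cleanly; `extract_answer` and the sample text are hypothetical.

```python
STOP_MARKERS = ("</s>", "<|user|>", "<|system|>")

def extract_answer(generated_text: str, assistant_tag: str = "<|assistant|>") -> str:
    """Return the first assistant turn, cut at the next end-of-turn or role marker."""
    # Keep everything after the first assistant tag (the one that ends the prompt).
    answer = generated_text.split(assistant_tag, 1)[-1]
    # Truncate at whichever stop marker follows.
    for marker in STOP_MARKERS:
        answer = answer.split(marker)[0]
    return answer.strip()

# Hypothetical pipeline output where the model kept going after its answer.
raw = (
    "<|system|>\nBe wise.</s>\n"
    "<|user|>\nWhat is true strength?</s>\n"
    "<|assistant|>\nStrength is patience.\n"
    "<|user|>\nAnd courage?\n<|assistant|>\nCourage is action."
)
print(extract_answer(raw))
```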

## 🔟 Push to Hugging Face Hub

In [None]:
from huggingface_hub import HfApi, create_repo

# Configuration
repo_id = f"{HF_USERNAME}/{MODEL_NAME}"

print(f"Creating/accessing repository: {repo_id}")

# Create repository (or get existing)
try:
    create_repo(repo_id, token=HF_TOKEN, private=False, exist_ok=True)
    print(f"✅ Repository ready: https://huggingface.co/{repo_id}")
except Exception as e:
    print(f"Note: {e}")

# Push model
print("\nPushing model to Hugging Face Hub...")
merged_model.push_to_hub(repo_id, token=HF_TOKEN)
tokenizer.push_to_hub(repo_id, token=HF_TOKEN)

print(f"\n✅ Model pushed successfully!")
print(f"\n🔗 View your model: https://huggingface.co/{repo_id}")

## 1️⃣1️⃣ Create Model Card

In [None]:
model_card = f"""---
license: apache-2.0
language:
- en
tags:
- philosophy
- wisdom
- coaching
- stoicism
- bushido
- native-american-wisdom
- conversational-ai
base_model: {BASE_MODEL}
---

# 🧙 The Elder - Wisdom Guide LLM

## Model Description

**The Elder** is a conversational AI model fine-tuned to provide wise guidance through Socratic dialogue, drawing from three philosophical traditions:

- **Bushido** (The Way of the Warrior): Principles of honor, discipline, courage, and integrity
- **Stoicism**: Teachings of Marcus Aurelius, Seneca, and Epictetus on inner peace, self-mastery, and rational thinking
- **Native American Wisdom**: Understanding of interconnection, balance with nature, and cyclical perspectives on life

The Elder does not preach religious doctrine but expresses universal spiritual awareness, respect for nature, and timeless principles of character development.

## Persona

The Elder communicates as:
- A calm, patient guide who teaches through questions rather than lectures
- Someone who values integrity, humility, courage, and deep reflection
- A teacher who encourages seekers to find truth through inner balance and questioning
- One who uses metaphors from nature, warrior traditions, and everyday life

## Training Details

- **Base Model**: {BASE_MODEL}
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: 50+ carefully curated Q&A pairs embodying the philosophical principles
- **Training**: 3 epochs on custom wisdom dataset
- **Optimization**: 4-bit quantization for efficient training

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_name = "{repo_id}"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

question = "What is true strength?"
response = generator(f"<|user|>\\n{{question}}</s>\\n<|assistant|>\\n", max_new_tokens=150)
print(response[0]['generated_text'])
```

## Mobile Usage (GGUF Format)

A quantized GGUF version optimized for mobile devices is available in the repository files.

**For Android (via SmolChat or a similar GGUF-capable app):**
1. Download `The_Elder.gguf` from this repository
2. Place in your LLM app's model directory
3. Load and chat

## Example Interactions

**Q:** How should I respond when someone insults me?

**The Elder:** Consider: does this insult change who you are? The Stoics remind us that we cannot control the actions of others, only our response. Like a mountain that does not move when the wind howls, you remain unchanged by words. Respond with silence or with compassion, for anger is a fire that burns the one who carries it.

---

**Q:** I'm afraid of failure. What should I do?

**The Elder:** Fear is natural, young one. But ask yourself: what is failure? Is it not simply another teacher? The warrior trains not to avoid falling, but to rise each time with greater wisdom. When you face what you fear, you discover it has no power over you - only the power you gave it in your mind.

## Limitations

- This model is designed for philosophical guidance and reflective dialogue, not factual information retrieval
- Does not provide medical, legal, or financial advice
- Responses should be taken as perspective and wisdom, not absolute truth
- Best used for personal growth, ethical reflection, and character development

## License

Apache 2.0 - Free to use, modify, and distribute with attribution.

## Citation

```bibtex
@misc{{the_elder_2025,
  author = {{Ishabdullah}},
  title = {{The Elder: A Wisdom Guide LLM}},
  year = {{2025}},
  publisher = {{Hugging Face}},
  url = {{https://huggingface.co/{repo_id}}}
}}
```

## Acknowledgments

Created with inspiration from:
- The Bushido code and samurai philosophy
- Stoic philosophers: Marcus Aurelius, Seneca, Epictetus
- Native American wisdom traditions
- Universal principles of character, courage, and compassion

---

*"The warrior trains not to avoid falling, but to rise each time with greater wisdom."* - The Elder
"""

# Save model card
with open("./the-elder-merged/README.md", "w") as f:
    f.write(model_card)

# Upload model card
api = HfApi()
api.upload_file(
    path_or_fileobj="./the-elder-merged/README.md",
    path_in_repo="README.md",
    repo_id=repo_id,
    token=HF_TOKEN,
)

print("✅ Model card created and uploaded")

## 1️⃣2️⃣ Convert to GGUF Format (For Mobile)

In [None]:
# Install llama.cpp for GGUF conversion
print("Installing llama.cpp for GGUF conversion...")
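!git clone https://github.com/ggerganov/llama.cpp
# Recent llama.cpp checkouts build with CMake (the Makefile build was removed)
!cmake -S llama.cpp -B llama.cpp/build -DCMAKE_BUILD_TYPE=Release -DLLAMA_CURL=OFF
!cmake --build llama.cpp/build --target llama-quantize -j

# Install required Python package
!pip install -q gguf

print("✅ GGUF tools installed")

In [None]:
# Convert to GGUF format
print("Converting model to GGUF format...")

# Convert to FP16 GGUF first (convert.py was replaced by convert_hf_to_gguf.py)
!python llama.cpp/convert_hf_to_gguf.py ./the-elder-merged --outtype f16 --outfile ./the-elder-f16.gguf

# Quantize to Q4_K_M (4-bit, medium quality - good balance for mobile)
!./llama.cpp/build/bin/llama-quantize ./the-elder-f16.gguf ./The_Elder.gguf Q4_K_M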

# Check file size
import os
gguf_size = os.path.getsize("./The_Elder.gguf") / (1024 * 1024)  # MB
print(f"\n✅ GGUF model created: The_Elder.gguf ({gguf_size:.2f} MB)")
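
Before uploading, it can help to confirm that the output really is a GGUF file. A minimal sketch: GGUF files begin with the four magic bytes `GGUF` followed by a little-endian 32-bit version number. The helper name `read_gguf_version` is hypothetical, and the demo writes a tiny stand-in file rather than reading the real model.

```python
import os
import struct
import tempfile

GGUF_MAGIC = b"GGUF"

def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError on a bad header."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not start with the GGUF magic bytes")
    (version,) = struct.unpack("<I", header[4:])
    return version

# Demonstration on a stand-in file containing only a header.
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(GGUF_MAGIC + struct.pack("<I", 3))
    demo_path = tmp.name

version = read_gguf_version(demo_path)
print(f"GGUF version: {version}")
os.remove(demo_path)
```

In the notebook, calling `read_gguf_version("./The_Elder.gguf")` after quantization would catch a truncated or failed conversion early.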

## 1️⃣3️⃣ Upload GGUF to Hugging Face

In [None]:
# Upload GGUF file
print(f"Uploading GGUF file to {repo_id}...")

api = HfApi()
api.upload_file(
    path_or_fileobj="./The_Elder.gguf",
    path_in_repo="The_Elder.gguf",
    repo_id=repo_id,
    token=HF_TOKEN,
)

print(f"\n✅ GGUF model uploaded!")
print(f"\n📥 Download link:")
print(f"https://huggingface.co/{repo_id}/resolve/main/The_Elder.gguf")

## 1️⃣4️⃣ Commit Back to GitHub

In [None]:
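# Copy GGUF to releases folder
# Note: GitHub rejects pushes containing files over 100 MB unless Git LFS is set up,
# so this step needs LFS (or should skip the GGUF) if the quantized file is larger.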
!mkdir -p releases
!cp The_Elder.gguf releases/

# Create training summary
summary = f"""# The Elder - Training Complete

## Training Summary
- Base Model: {BASE_MODEL}
- Training Date: {os.popen('date').read().strip()}
- Dataset: 50+ wisdom examples
- Method: LoRA fine-tuning
- Epochs: 3

## Model Links
- Hugging Face: https://huggingface.co/{repo_id}
- GGUF Download: https://huggingface.co/{repo_id}/resolve/main/The_Elder.gguf

## File Sizes
- Full Model: ~2.2 GB
- GGUF (Q4_K_M): ~{gguf_size:.2f} MB

## Status
✅ Training complete
✅ Model pushed to Hugging Face
✅ GGUF created and uploaded
✅ Ready for mobile deployment
"""

with open("TRAINING_COMPLETE.md", "w") as f:
    f.write(summary)

# Configure git
!git config user.email "colab@training.ai"
!git config user.name "Colab Training Bot"

# Commit and push
!git add releases/The_Elder.gguf TRAINING_COMPLETE.md
!git commit -m "Training complete: The Elder v1.0 - GGUF model added"
!git push

print("✅ Results committed to GitHub")
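
Because GitHub blocks regular (non-LFS) files larger than 100 MiB, a size check before `git add` can avoid a failed push. A minimal sketch, with `fits_github_limit` as a hypothetical helper and a small temporary file standing in for the model:

```python
import os
import tempfile

GITHUB_MAX_FILE_BYTES = 100 * 1024 * 1024  # GitHub's per-file limit without Git LFS

def fits_github_limit(path, limit=GITHUB_MAX_FILE_BYTES):
    """Return True if the file is small enough to push as a regular Git object."""
    return os.path.getsize(path) <= limit

# Stand-in file of 2 KiB; the real check would target releases/The_Elder.gguf.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2048)
    demo_path = tmp.name

small_ok = fits_github_limit(demo_path)
tiny_limit_ok = fits_github_limit(demo_path, limit=1024)
os.remove(demo_path)
print(small_ok, tiny_limit_ok)
```

If the check fails for the real GGUF, one option is to commit only `TRAINING_COMPLETE.md` and rely on the Hugging Face download link for the model file.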

## 🎉 COMPLETE! Final Summary

In [None]:
print("="*80)
print("🎉 THE ELDER - TRAINING PIPELINE COMPLETE")
print("="*80)
print("\n📊 Training Summary:")
print(f"  • Base Model: {BASE_MODEL}")
print(f"  • Training Examples: {len(formatted_dataset)}")
print(f"  • Epochs: {training_args.num_train_epochs}")
print(f"  • Method: LoRA fine-tuning")
print("\n📦 Outputs Created:")
print(f"  • Full Model (Hugging Face): https://huggingface.co/{repo_id}")
print(f"  • GGUF Model Size: {gguf_size:.2f} MB")
print(f"  • GitHub Repository: https://github.com/{GITHUB_USERNAME}/{REPO_NAME}")
print("\n📥 Direct Download Links:")
print(f"  • GGUF (Mobile): https://huggingface.co/{repo_id}/resolve/main/The_Elder.gguf")
print("\n🚀 Installation Instructions for Android:")
print("  1. Download The_Elder.gguf from link above")
print("  2. Install a GGUF-capable LLM app (SmolChat or similar)")
print("  3. Place The_Elder.gguf in the app's model directory")
print("  4. Load the model and start chatting with The Elder!")
print("\n💡 Next Steps:")
print("  • Test the model on your device")
print("  • Share feedback for v2 improvements")
print("  • Train new models using this same pipeline")
print("\n" + "="*80)
print("✨ May The Elder guide you on your path ✨")
print("="*80)