# 🧙 The Elder - Fine-Tuning

**Base Model:** `Ishymoto/The_Elder`  
**Output:** `Ishymoto/The_Elder-FineTuned`  
**Dataset:** 90 wisdom examples (Bushido + Stoicism + Native American + Socratic Method)

---

## 📋 Instructions:
1. **Enable GPU**: Runtime → Change runtime type → T4 GPU
2. **Add Secrets**: Click the 🔑 icon and add `HF_TOKEN` with your Hugging Face token (it needs write access to push the model)
3. **Run All Cells** in order
4. **After Cell 3**: Click the "RESTART RUNTIME" button
5. **After restart**: Continue from Cell 4

**⏱️ Total time:** ~1 hour

---

## CELL 1: Check GPU

In [None]:
import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("\n⚠️ NO GPU! Go to: Runtime → Change runtime type → T4 GPU")
    raise SystemExit("GPU required")

## CELL 2: Load Secrets

In [None]:
import os
from google.colab import userdata

try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    os.environ['HF_TOKEN'] = HF_TOKEN
    os.environ['HUGGING_FACE_HUB_TOKEN'] = HF_TOKEN
    print("✅ HF_TOKEN loaded")
except Exception:
    # Catch Exception instead of a bare except so interrupts still propagate
    print("❌ HF_TOKEN not found! Add it in Secrets (🔑 icon)")
    raise

# Configuration
HF_USERNAME = "Ishymoto"
GITHUB_USERNAME = "Ishabdullah"
REPO_NAME = "the-elder-llm"
BASE_MODEL = "Ishymoto/The_Elder"
OUTPUT_MODEL = "The_Elder-FineTuned"

print(f"✅ Base model: {BASE_MODEL}")
print(f"✅ Output model: {HF_USERNAME}/{OUTPUT_MODEL}")

## CELL 3: Install Packages

**⚠️ IMPORTANT:** After this cell completes, click the "RESTART RUNTIME" button at the top!

In [None]:
!pip install -q -U \
    transformers \
    datasets \
    accelerate \
    peft \
    trl \
    bitsandbytes \
    huggingface_hub \
    sentencepiece

print("\n✅ Packages installed!")
print("\n⚠️ IMPORTANT: Click 'RESTART RUNTIME' button above, then continue from CELL 4")

## CELL 4: Verify Install (Run AFTER Restart)

In [None]:
# Re-import after restart
import os
import torch
from google.colab import userdata

# Re-load secrets
HF_TOKEN = userdata.get('HF_TOKEN')
os.environ['HF_TOKEN'] = HF_TOKEN
os.environ['HUGGING_FACE_HUB_TOKEN'] = HF_TOKEN

# Re-set configuration
HF_USERNAME = "Ishymoto"
GITHUB_USERNAME = "Ishabdullah"
REPO_NAME = "the-elder-llm"
BASE_MODEL = "Ishymoto/The_Elder"
OUTPUT_MODEL = "The_Elder-FineTuned"

# Verify imports
import transformers
import datasets
import peft
import trl
import bitsandbytes

print("✅ All packages ready!")
print(f"transformers: {transformers.__version__}")
print(f"torch: {torch.__version__}")
print(f"CUDA: {torch.cuda.is_available()}")

## CELL 5: Clone Repository & Load Dataset

In [None]:
!rm -rf the-elder-llm

repo_url = f"https://github.com/{GITHUB_USERNAME}/{REPO_NAME}.git"
!git clone {repo_url}
%cd the-elder-llm

import json
dataset_path = "data/the_elder_complete_dataset.jsonl"

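with open(dataset_path, 'r', encoding='utf-8') as f:
    # Skip blank lines so the count matches the number of examples
    lines = [line for line in f if line.strip()]

print(f"✅ Dataset: {len(lines)} examples")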

# Show sample
sample = json.loads(lines[0])
if 'instruction' in sample:
    print(f"\nSample instruction: {sample['instruction'][:80]}...")
elif 'prompt' in sample:
    print(f"\nSample prompt: {sample['prompt'][:80]}...")
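
Before training, it can help to confirm that every record uses one of the two schemas the notebook handles later (`instruction`/`output` or `prompt`/`completion`). The following is a minimal sketch of such a check; the helper name `load_jsonl_records` and the sample records are hypothetical and stand in for the real dataset file.

```python
import json


def load_jsonl_records(lines):
    """Parse JSONL lines, skip blanks, and check each record's fields."""
    records = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines such as a trailing newline
        record = json.loads(line)
        has_instruction = "instruction" in record and "output" in record
        has_prompt = "prompt" in record and "completion" in record
        if not (has_instruction or has_prompt):
            raise ValueError(f"line {number}: unrecognized fields {sorted(record)}")
        records.append(record)
    return records


# Hypothetical in-memory sample; in the notebook you would pass the opened file.
sample_lines = [
    '{"instruction": "What is courage?", "output": "Acting rightly despite fear."}\n',
    "\n",
    '{"prompt": "What is patience?", "completion": "Waiting without resentment."}\n',
]
records = load_jsonl_records(sample_lines)
print(f"Valid records: {len(records)}")
```

Failing early here gives a line number to fix, rather than an error partway through the formatting step.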

## CELL 6: Load Base Model & Tokenizer

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

print(f"Loading base model: {BASE_MODEL}")

# 4-bit quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, token=HF_TOKEN, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True,
)

model.config.use_cache = False
model.config.pretraining_tp = 1

print(f"✅ Model loaded: {model.get_memory_footprint() / 1e9:.2f} GB")
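
To see why 4-bit loading matters on a T4, here is a rough back-of-the-envelope estimate of weight memory at different precisions. The parameter count below is a hypothetical placeholder, not the actual size of `Ishymoto/The_Elder`. The footprint printed above will usually be somewhat higher than this estimate, because some layers stay in higher precision and the quantization constants take space too.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical parameter count, used only for illustration.
num_params = 1_100_000_000

for label, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("nf4", 4)]:
    print(f"{label}: ~{weight_memory_gb(num_params, bits):.2f} GB")
```

Training also needs memory for activations, gradients of the LoRA weights, and optimizer state, so treat these numbers as a lower bound.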

## CELL 7: Prepare & Format Dataset

In [None]:
from datasets import load_dataset

dataset = load_dataset('json', data_files='data/the_elder_complete_dataset.jsonl', split='train')

# Load system prompt
with open('configs/the_elder_system_prompt.txt', 'r') as f:
    system_prompt = f.read().strip()

def format_instruction(sample):
    # Handle both formats: instruction/output and prompt/completion
    if 'instruction' in sample:
        user_message = sample['instruction']
        if sample.get('input', ''):
            user_message = f"{user_message}\n{sample['input']}"
        output = sample['output']
    elif 'prompt' in sample:
        user_message = sample['prompt']
        output = sample['completion']
    else:
        raise ValueError("Dataset must have either 'instruction'/'output' or 'prompt'/'completion' fields")
    
    prompt = f"""<|system|>
{system_prompt}</s>
<|user|>
{user_message}</s>
<|assistant|>
{output}</s>"""
    
    return {"text": prompt}

formatted_dataset = dataset.map(format_instruction, remove_columns=dataset.column_names)
print(f"✅ Dataset formatted: {len(formatted_dataset)} examples")
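
The template above is also rebuilt by hand at inference time in CELL 11, so the two must stay in sync. One way to do that, sketched here with a hypothetical helper `build_prompt` and placeholder strings, is a single function that produces both the training text and the inference prompt.

```python
def build_prompt(system_prompt, user_message, output=None):
    """Build the chat-style text used in this notebook.

    With `output`, returns a full training example; without it, returns the
    prompt that ends right where the model should start answering.
    """
    prompt = (
        f"<|system|>\n{system_prompt}</s>\n"
        f"<|user|>\n{user_message}</s>\n"
        f"<|assistant|>\n"
    )
    if output is not None:
        prompt += f"{output}</s>"
    return prompt


# Placeholder strings for illustration only.
train_text = build_prompt("Be wise.", "What is honor?", "Keeping your word.")
infer_text = build_prompt("Be wise.", "What is honor?")
print(train_text)
```

Because the training text starts with exactly the inference prompt, the model sees the same prefix at test time that it saw during fine-tuning.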

## CELL 8: Configure LoRA

In [None]:
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
print("\n✅ LoRA configured")
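
To make the `r=16` and `lora_alpha=32` settings concrete: LoRA adds two small matrices per targeted layer, of shape `r x d_in` and `d_out x r`, and scales their product by `lora_alpha / r`. The sketch below counts the added parameters for a single projection. The layer size of 2048 is an assumption for illustration, not a value read from the base model.

```python
def lora_param_count(d_in, d_out, r):
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Default factor applied to the low-rank update."""
    return lora_alpha / r


# Hypothetical square projection, e.g. a q_proj with hidden size 2048.
d_in = d_out = 2048
added = lora_param_count(d_in, d_out, r=16)
full = d_in * d_out

print(f"LoRA params: {added:,} vs full matrix: {full:,}")
print(f"Fraction trained: {added / full:.2%}")
print(f"Scaling: {lora_scaling(32, 16)}")
```

The figures printed by `model.print_trainable_parameters()` are the sum of this count over every targeted module in every layer.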

## CELL 9: Fine-Tune! (~30-45 minutes)

In [None]:
from trl import SFTConfig, SFTTrainer

# Recent TRL releases take the SFT-specific options (dataset_text_field,
# sequence length, packing) on SFTConfig rather than as SFTTrainer kwargs.
# SFTConfig extends TrainingArguments, so the other arguments are unchanged.
training_args = SFTConfig(
    output_dir="./the-elder-finetuned-output",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    logging_steps=10,
    save_strategy="steps",
    save_steps=100,
    save_total_limit=3,
    fp16=True,
    optim="paged_adamw_32bit",
    max_grad_norm=0.3,
    group_by_length=True,
    report_to="none",
    dataset_text_field="text",
    max_length=512,  # named max_seq_length in older TRL releases
    packing=False,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=formatted_dataset,
    args=training_args,
    processing_class=tokenizer,
)

print("="*80)
print("🚀 FINE-TUNING THE ELDER")
print("="*80)
print(f"Dataset: {len(formatted_dataset)} examples")
print(f"Epochs: 3")
print(f"Effective batch size: 16")
print("="*80)

trainer.train()

print("\n✅ FINE-TUNING COMPLETE!")
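
The effective batch size of 16 comes from `per_device_train_batch_size` times `gradient_accumulation_steps`. The sketch below estimates how many optimizer steps that gives for a 90-example dataset. It assumes a single GPU and that a partial final batch still counts as a step; the exact number can differ slightly between library versions.

```python
import math


def training_steps(num_examples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Return (effective batch size, approximate total optimizer steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    batches_per_epoch = math.ceil(num_examples / (per_device_batch * num_devices))
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return effective_batch, steps_per_epoch * epochs


effective_batch, total_steps = training_steps(90, 4, 4, 3)
print(f"Effective batch size: {effective_batch}")
print(f"Approximate optimizer steps: {total_steps}")
```

With only a handful of steps in total, `save_steps=100` never triggers an intermediate checkpoint and `logging_steps=10` reports the loss rarely. Lowering both is worth considering if you want to watch training progress.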

## CELL 10: Save & Merge LoRA

In [None]:
from peft import PeftModel

print("Saving LoRA adapter...")
trainer.model.save_pretrained("./the-elder-lora-finetuned")
tokenizer.save_pretrained("./the-elder-lora-finetuned")

print("Merging LoRA with base model...")
base_model_reload = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True,
)

merged_model = PeftModel.from_pretrained(base_model_reload, "./the-elder-lora-finetuned")
merged_model = merged_model.merge_and_unload()

merged_model.save_pretrained("./the-elder-merged-finetuned", safe_serialization=True)
tokenizer.save_pretrained("./the-elder-merged-finetuned")

print("✅ Model merged and saved locally")
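
Conceptually, merging folds each adapter into its base weight as `W + (lora_alpha / r) * B @ A`, after which the adapter matrices are no longer needed. The toy example below illustrates that arithmetic with plain Python lists. The helper names and the tiny matrices are made up for illustration and are not how PEFT implements `merge_and_unload()` internally.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return weight + (alpha / r) * (lora_b @ lora_a)."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (d_out x r) @ (r x d_in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy values: a 2x2 base weight and a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[0.5, -0.5]]   # shape (r=1, d_in=2)
lora_b = [[1.0], [2.0]]  # shape (d_out=2, r=1)

merged = merge_lora(weight, lora_a, lora_b, alpha=2, r=1)
print(merged)
```

This is also why the cell reloads the base model in float16 first: the update is added to full-precision weights rather than to the 4-bit quantized ones.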

## CELL 11: Test The Fine-Tuned Elder

In [None]:
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model=merged_model,
    tokenizer=tokenizer,
    max_new_tokens=150,
    temperature=0.7,
    do_sample=True,
    top_p=0.95,
)

test_questions = [
    "What is true strength?",
    "Should I trust someone who has wronged me before?",
    "How do I know if I'm making the right decision?",
]

print("="*80)
print("🧙 TESTING THE FINE-TUNED ELDER")
print("="*80 + "\n")

for q in test_questions:
    prompt = f"<|system|>\n{system_prompt}</s>\n<|user|>\n{q}</s>\n<|assistant|>\n"
    response = generator(prompt)[0]['generated_text']
    answer = response.split("<|assistant|>")[-1].split("</s>")[0].strip()
    print(f"Q: {q}")
    print(f"A: {answer}\n")
    print("-"*80 + "\n")
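
The answer extraction in the loop relies on the pipeline returning the prompt together with the completion. As a small standalone illustration, the hypothetical helper `extract_answer` below applies the same two splits to a made-up generated string.

```python
def extract_answer(generated_text):
    """Keep the text after the last assistant marker, up to the end-of-turn tag."""
    after_marker = generated_text.split("<|assistant|>")[-1]
    return after_marker.split("</s>")[0].strip()


# Made-up output in the notebook's prompt format.
sample = (
    "<|system|>\nBe wise.</s>\n"
    "<|user|>\nWhat is honor?</s>\n"
    "<|assistant|>\nKeeping your word.</s>"
)
print(extract_answer(sample))
```

The function behaves the same if the tokenizer has already stripped the `</s>` tag from the decoded text, since the second split then leaves the string unchanged.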

## CELL 12: Push to Hugging Face

In [None]:
from huggingface_hub import HfApi, login

login(token=HF_TOKEN)

repo_id = f"{HF_USERNAME}/{OUTPUT_MODEL}"

print(f"Pushing to {repo_id}...")
merged_model.push_to_hub(repo_id, token=HF_TOKEN)
tokenizer.push_to_hub(repo_id, token=HF_TOKEN)

print(f"\n✅ Model live at: https://huggingface.co/{repo_id}")

## CELL 13: Create Model Card

In [None]:
model_card = f"""---
license: apache-2.0
language: [en]
tags: [philosophy, wisdom, stoicism, bushido, native-american-wisdom, socratic-method]
base_model: {BASE_MODEL}
---

# 🧙 The Elder - Fine-Tuned Wisdom Guide

A philosophical AI guide fine-tuned on Bushido, Stoicism, Native American wisdom, and the Socratic Method.

## About

This model is a fine-tuned version of `{BASE_MODEL}`, trained on 90 additional wisdom examples that emphasize:

- **Bushido**: The way of the warrior - honor, discipline, courage
- **Stoicism**: Marcus Aurelius, Seneca, Epictetus - control, virtue, acceptance
- **Native American Wisdom**: Connection to nature, balance, ancestral knowledge
- **Socratic Method**: Teaching through questions rather than direct answers

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model = AutoModelForCausalLM.from_pretrained("{repo_id}")
tokenizer = AutoTokenizer.from_pretrained("{repo_id}")
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

response = generator("What is true strength?", max_new_tokens=150)
print(response[0]['generated_text'])
```

## Training Details

- **Base Model**: {BASE_MODEL}
- **Method**: LoRA fine-tuning (r=16, α=32)
- **Dataset**: 90 wisdom Q&A pairs
- **Epochs**: 3
- **Batch Size**: 16 (effective)

## Examples

**Direct Wisdom:**
> Q: What is true strength?
> A: True strength is not in never falling, but in rising each time you fall...

**Socratic Guidance:**
> Q: Should I trust this person who has wronged me before?
> A: Seeker, ask yourself: What patterns have you observed? When trust was broken, was it from circumstance or character?

---

*"The warrior trains not to avoid falling, but to rise each time with greater wisdom."*

## License

Apache 2.0
"""

with open("./the-elder-merged-finetuned/README.md", "w") as f:
    f.write(model_card)

api = HfApi()
api.upload_file(
    path_or_fileobj="./the-elder-merged-finetuned/README.md",
    path_in_repo="README.md",
    repo_id=repo_id,
    token=HF_TOKEN,
)

print("✅ Model card uploaded")

## CELL 14: Convert to GGUF (Optional)

⚠️ This may fail in Colab due to compilation issues. If it fails, use the online converter:
https://huggingface.co/spaces/ggml-org/gguf-my-repo

In [None]:
try:
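    print("Installing llama.cpp...")
    !git clone https://github.com/ggerganov/llama.cpp 2>/dev/null || true
    # Current llama.cpp builds with CMake; the old Makefile build is no longer supported
    !cmake -S llama.cpp -B llama.cpp/build -DLLAMA_CURL=OFF
    !cmake --build llama.cpp/build --config Release -j --target llama-quantize
    !pip install -q gguf
    
    print("\nConverting to FP16 GGUF...")
    # convert.py has been replaced by convert_hf_to_gguf.py
    !python llama.cpp/convert_hf_to_gguf.py ./the-elder-merged-finetuned --outtype f16 --outfile ./the-elder-finetuned-f16.gguf
    
    print("\nQuantizing to Q4_K_M...")
    # The quantize binary is now llama-quantize, built under build/bin
    !./llama.cpp/build/bin/llama-quantize ./the-elder-finetuned-f16.gguf ./The_Elder_FineTuned.gguf Q4_K_M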
    
    import os
    gguf_size = os.path.getsize("./The_Elder_FineTuned.gguf") / (1024 * 1024)
    print(f"\n✅ GGUF created: {gguf_size:.2f} MB")
    
    # Upload GGUF
    print("\nUploading GGUF...")
    api.upload_file(
        path_or_fileobj="./The_Elder_FineTuned.gguf",
        path_in_repo="The_Elder_FineTuned.gguf",
        repo_id=repo_id,
        token=HF_TOKEN,
    )
    print(f"✅ GGUF uploaded!")
    print(f"📥 Download: https://huggingface.co/{repo_id}/resolve/main/The_Elder_FineTuned.gguf")
    
except Exception as e:
    print(f"\n⚠️ GGUF conversion failed: {e}")
    print("\nAlternative: Use online converter")
    print(f"1. Go to: https://huggingface.co/spaces/ggml-org/gguf-my-repo")
    print(f"2. Enter model ID: {repo_id}")
    print(f"3. Select quantization: Q4_K_M")
    print(f"4. Click 'Convert'")

## CELL 15: Final Summary

In [None]:
print("="*80)
print("🎉 THE ELDER - FINE-TUNING COMPLETE")
print("="*80)
print(f"\n✅ Model: https://huggingface.co/{repo_id}")
print(f"✅ Base: {BASE_MODEL}")
print(f"✅ Dataset: {len(formatted_dataset)} examples")
print(f"   - Original wisdom examples")
print(f"   - Socratic method teaching")
print(f"   - Bushido + Stoicism + Native American wisdom")
print("\n📱 Next Steps:")
print("  1. Test on Hugging Face: https://huggingface.co/chat")
print("  2. Download GGUF for mobile (if converted)")
print("  3. Install in SmolChat or LM Studio")
print("\n✨ May The Elder guide you on your path ✨")
print("="*80)