README.md — GhostRouter
Telemetry-Driven LLM Meta-Router with Real-Time Entropy Slope Tracking, Token Replay Engine & Hybrid Routing
A new class of AI infrastructure where routing decisions are made by token-level signals — not heuristics.
GhostRouter is an experimental meta-router that decides which model should think next — locally, in the cloud, or via a specialized route.
It uses live token metrics to predict hallucinations before they happen, and switches models mid-thought while preserving reasoning continuity through the Token Replay Engine.
GhostRouter does not guess.
GhostRouter observes → routes → learns → adapts.
Most routers use static rules or task-based chaining.
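GhostRouter introduces three innovations that we have not found in any existing framework: token-level signal routing, the Token Replay Engine, and telemetry-driven routing decisions.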
GhostRouter reads:
- token entropy
- entropy slope (Δentropy/Δt)
- draft acceptance rate
- local confidence
- latency & GPU load
This allows it to predict a hallucination before it manifests, and then switch to a better model mid-generation.
To our knowledge, no other open-source router does this.
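As a rough illustration of the first two signals, the sketch below computes per-token Shannon entropy and a least-squares entropy slope over a sliding window, then flags when a switch might be warranted. The names `token_entropy`, `entropy_slope`, `SlopeMonitor`, and both threshold values are hypothetical and are not taken from the GhostRouter codebase; the slope here is measured per generated token, which is an assumption about what `t` means.

```python
import math
from collections import deque


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_slope(entropies):
    """Least-squares slope of entropy versus token index (0, 1, 2, ...)."""
    n = len(entropies)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(entropies) / n
    num = sum((i - mean_x) * (h - mean_y) for i, h in enumerate(entropies))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


class SlopeMonitor:
    """Tracks recent entropies and suggests a route switch (hypothetical policy)."""

    def __init__(self, window=8, slope_threshold=0.15, entropy_threshold=2.0):
        # Thresholds are illustrative placeholders, not tuned values.
        self.history = deque(maxlen=window)
        self.slope_threshold = slope_threshold
        self.entropy_threshold = entropy_threshold

    def observe(self, probs):
        """Record one token's distribution; return True if a switch is suggested."""
        self.history.append(token_entropy(probs))
        slope = entropy_slope(list(self.history))
        rising = slope > self.slope_threshold
        too_uncertain = self.history[-1] > self.entropy_threshold
        return rising or too_uncertain
```

A caller would feed `observe` the softmax output for each generated token and treat a `True` result as a hint to consult the other signals (draft acceptance, latency, GPU load) before switching.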
When switching from Local → Cloud or Cloud → Local, GhostRouter:
- replays the last N tokens
- rebuilds the KV-cache on the target model
- preserves the chain of thought
- avoids context resets
This creates true reasoning continuity, not just pipelining or chaining.
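The replay step can be sketched with a toy backend. Everything here is a stand-in: `ToyBackend`, `prefill`, `generate`, and `switch_with_replay` are hypothetical names, the "KV-cache" is just a list of tokens, and prefilling the target with the prompt plus the replayed tail is an assumption about how the engine might work.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBackend:
    """Stand-in for a model backend; `cache` mimics a KV-cache as a token list."""
    name: str
    cache: list = field(default_factory=list)

    def prefill(self, tokens):
        # A real backend would run a forward pass here to populate its KV-cache.
        self.cache = list(tokens)

    def generate(self, n):
        # Emit n placeholder tokens, extending the cache as decoding would.
        out = []
        for _ in range(n):
            tok = f"{self.name}:{len(self.cache)}"
            self.cache.append(tok)
            out.append(tok)
        return out


def switch_with_replay(prompt, generated, target, replay_n):
    """Prefill `target` with the prompt plus the last `replay_n` generated tokens.

    Returns the replayed tail so the caller can log what was carried over.
    """
    # Guard replay_n == 0: generated[-0:] would return the whole list.
    tail = list(generated[-replay_n:]) if replay_n > 0 else []
    target.prefill(list(prompt) + tail)
    return tail


# Usage: start locally, then hand off to a cloud backend mid-generation.
prompt = ["explain", "entropy"]
local = ToyBackend("local")
local.prefill(prompt)
partial = local.generate(5)

cloud = ToyBackend("cloud")
replayed = switch_with_replay(prompt, partial, cloud, replay_n=3)
continuation = cloud.generate(2)
```

The choice of N trades replay cost against how much of the earlier reasoning the target model sees; the sketch leaves that policy to the caller.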
Routing is driven by data, not heuristics.
GhostRouter logs:
- entropy
- entropy slope
- draft acceptance
- route_switch events
- model latency
- GPU/RAM telemetry
- confidence signals
This powers the Ghost Telemetry Dashboard (W&B/Grafana).
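One way to picture a log record for the signals listed above is a flat event serialized as a JSON line, which a W&B or Grafana pipeline could ingest. The `TelemetryEvent` class, its field names, and `to_jsonl` are hypothetical; the real GhostRouter log schema is not specified here.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class TelemetryEvent:
    """One per-token telemetry record (hypothetical schema)."""
    step: int
    model: str
    entropy: float
    entropy_slope: float
    draft_acceptance: float
    confidence: float
    latency_ms: float
    gpu_util: float
    ram_mb: float
    route_switch: bool = False
    timestamp: float = field(default_factory=time.time)


def to_jsonl(event):
    """Serialize an event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


# Usage: record a step on which the router decided to switch models.
event = TelemetryEvent(
    step=42,
    model="local",
    entropy=1.2,
    entropy_slope=0.3,
    draft_acceptance=0.55,
    confidence=0.4,
    latency_ms=18.5,
    gpu_util=0.92,
    ram_mb=10240.0,
    route_switch=True,
)
line = to_jsonl(event)
```

Keeping each record flat and self-describing makes it straightforward to append to a file or forward to a metrics backend without a schema migration.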
🔭 Roadmap
- RouteRL (reinforcement-trained routing brain)
- Token Replay Engine
- Dynamic KV-mode switching
- Telemetry → model tuning feedback loop
🦊 Authors
Gary & Juulia, GPT-OSS.fi Labs
Part of the Never Ending Token Story 💜
Active research project. Highly experimental. Expect ghosts.