Wayfinder Router v0.1.6 - Prove it: an honest, offline routing benchmark
v0.1.6 is a big one — it caps four releases of work since v0.1.2 and lands the piece the project most needed: proof.
Wayfinder claims to route with no model call, deterministically, offline. This release backs that claim with a reproducible benchmark - one that's deliberately honest about where structural routing loses. Along the way it also makes the chat-UI path actually work (streaming), makes routing visible, and lets clients discover the routing modes.
What's new
An honest, offline benchmark (the headline).
A new benchmarks/ harness (make benchmark) scores routers against per-model correctness labels — no model is called, nothing hits the network, and it reproduces byte-for-byte. Metrics are the ones the field already uses (RouteLLM's PGR / call-fraction, RouterArena's cost / latency), not a flattering invention, and it reports the full cost-quality curve with a cost-aware knee.
The point is credibility, so the harness is built to not flatter us:
- it ships honest baselines — always-local, always-cloud, a stable-random, a tuned length-threshold, and an oracle upper bound;
- the shipped dataset includes Wayfinder's failure mode: short-but-hard prompts ("prove √2 is irrational") whose difficulty is invisible to a structural score;
- on that illustrative set, the length baseline slightly beats Wayfinder (PGR 0.67 vs 0.60), and we published that on purpose;
- routers that need a model call to decide (RouteLLM, NotDiamond, …) get a pluggable adapter and a comparison citing their own published numbers with provenance - never presented as ours.
    make benchmark   # reproduce benchmarks/results.md, offline, no keys
The README gains a "How it compares" section with the precise positioning: the only offline, zero-model-call, calibrate-on-your-data, self-hosted structural router - and a link to the benchmark.
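To make the PGR and call-fraction figures above concrete, here is a minimal sketch of how such metrics can be computed from per-model correctness labels with no model call. It assumes a two-model setup (a weak "local" model and a strong "cloud" model) and uses the usual RouteLLM-style definition, PGR = (router accuracy - weak accuracy) / (strong accuracy - weak accuracy). The row fields, the router functions, and the 50-character threshold are hypothetical illustrations, not the contents of the benchmarks/ harness.

```python
from typing import Callable, Dict, List

# Hypothetical labelled rows: for each prompt, whether each model answers correctly.
ROWS: List[dict] = [
    {"prompt": "hi", "local_ok": True, "cloud_ok": True},
    {"prompt": "prove sqrt(2) is irrational", "local_ok": False, "cloud_ok": True},
    {"prompt": "x" * 200, "local_ok": False, "cloud_ok": True},
    {"prompt": "what is 2+2", "local_ok": True, "cloud_ok": True},
]


def evaluate(rows: List[dict], route: Callable[[str], str]) -> Dict[str, float]:
    """Score a router offline; route(prompt) returns "local" or "cloud"."""
    n = len(rows)
    cloud_calls = 0
    correct = 0
    for row in rows:
        if route(row["prompt"]) == "cloud":
            cloud_calls += 1
            correct += row["cloud_ok"]
        else:
            correct += row["local_ok"]
    weak = sum(r["local_ok"] for r in rows) / n    # always-local accuracy
    strong = sum(r["cloud_ok"] for r in rows) / n  # always-cloud accuracy
    accuracy = correct / n
    # Performance gap recovered: 0.0 matches the weak model, 1.0 the strong one.
    pgr = (accuracy - weak) / (strong - weak) if strong != weak else 0.0
    return {"accuracy": accuracy, "pgr": pgr, "call_fraction": cloud_calls / n}


def always_local(prompt: str) -> str:
    return "local"


def always_cloud(prompt: str) -> str:
    return "cloud"


def length_threshold(prompt: str) -> str:
    # Illustrative threshold only; a tuned baseline would fit this to data.
    return "cloud" if len(prompt) > 50 else "local"


if __name__ == "__main__":
    for router in (always_local, always_cloud, length_threshold):
        print(router.__name__, evaluate(ROWS, router))
```

On this toy data the length baseline sends only the long prompt to the cloud model, so it misses the short-but-hard prompt and recovers half of the gap while calling the strong model a quarter of the time.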
The chat path is real now: streaming + async (v0.1.4).
A request with stream: true is relayed back as Server-Sent-Events, so chat clients (LibreChat, Open WebUI, …) render tokens progressively. The gateway forwards asynchronously (httpx.AsyncClient), so concurrent requests no longer block one another. Upstream timeouts and connection failures now return an OpenAI-shaped wayfinder_router_upstream_error (a 502, or a terminal SSE event) instead of a bare 500.
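For readers wiring up their own client, the sketch below shows one way to consume such a stream. It assumes the OpenAI chat-completions chunk format (data: lines carrying JSON, ending with data: [DONE]) and, as a guess, that the terminal error event looks like {"error": {"type": "wayfinder_router_upstream_error", "message": ...}}. The release notes give only the error name, so the exact field layout is an assumption, and UpstreamError and iter_deltas are hypothetical names.

```python
import json
from typing import Iterable, Iterator


class UpstreamError(RuntimeError):
    """Raised when the stream carries an error event instead of content."""


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # normal end-of-stream sentinel
        event = json.loads(payload)
        if "error" in event:
            # Assumed shape: {"error": {"type": ..., "message": ...}}
            err = event["error"]
            raise UpstreamError(f"{err.get('type')}: {err.get('message')}")
        for choice in event.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "lo"}}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_deltas(sample)))
```

With httpx, the lines yielded by a streamed response's iter_lines() could be passed to a function like this; the parsing is kept separate from the transport so it can be checked offline.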
Production hardening (v0.1.4).
A configurable upstream timeout (WAYFINDER_ROUTER_TIMEOUT / serve --timeout); an optional WAYFINDER_ROUTER_FEEDBACK_TOKEN that gates the /v1/feedback write behind a bearer token; an x-wayfinder-router-request-id header on every response, with routing decisions and config-reload failures logged; and GET /healthz reporting degraded + missing_keys. The new serve --dry-run returns the routing decision with no backends configured, so you can try the router in 30 seconds.
Routing made visible (v0.1.5).
A read-only dashboard answers "is it actually routing, and where?" without digging through headers: GET /router serves a tiny self-contained page (no CDN, no build) and GET /router/recent is the JSON behind it - recent decisions, a per-model count, and scores at a glance, metadata only, never prompt text. For clients that hide headers, opt-in X-Wayfinder-Debug: true surfaces the decision in the response body; the default response stays byte-clean.
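One way to picture a metadata-only "recent decisions" view is a bounded buffer that never receives the prompt at all. The sketch below is an illustration under that assumption, not the gateway's actual implementation; the RecentDecisions class and the field names (request_id, model, score, recent, per_model) are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Dict


class RecentDecisions:
    """Keep the last N routing decisions as metadata only (no prompt text)."""

    def __init__(self, maxlen: int = 100) -> None:
        # A deque with maxlen drops the oldest entry when it is full.
        self._items: Deque[dict] = deque(maxlen=maxlen)

    def record(self, request_id: str, model: str, score: float) -> None:
        # The prompt is never passed in, so it cannot leak into the view.
        self._items.append({"request_id": request_id, "model": model, "score": score})

    def snapshot(self) -> Dict[str, object]:
        items = list(self._items)
        per_model = Counter(item["model"] for item in items)
        return {"recent": items, "per_model": dict(per_model)}


if __name__ == "__main__":
    log = RecentDecisions(maxlen=2)
    log.record("req-1", "local", 0.12)
    log.record("req-2", "hosted", 0.91)
    log.record("req-3", "hosted", 0.77)
    print(log.snapshot())
```

Keeping the buffer bounded means the view stays cheap to serve, and the per-model count always describes the same window as the list of recent decisions.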
Model discovery + no-fork chat UIs (v0.1.3).
GET /v1/models advertises the selectable routing options (auto, prefer-local / prefer-hosted, and each configured endpoint), so any OpenAI-compatible client auto-populates its model dropdown as a routing-mode picker, with no hand-written list. New examples/ recipes put a chat UI in front with no fork: a LibreChat custom-endpoint config + a Compose sidecar override, and Open WebUI notes. The high-end directive is now prefer-hosted (with prefer-cloud kept as a silent back-compat alias).
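As a rough illustration of what a client sees, the sketch below builds an OpenAI-style model list containing the routing modes plus the configured endpoints, and resolves the prefer-cloud alias. The list_models and normalize_mode functions, the owned_by value, and the example endpoint names are hypothetical; only the mode names and the alias come from this release.

```python
from typing import Dict, List

# Routing modes named in this release; prefer-cloud remains as an alias.
ROUTING_MODES = ["auto", "prefer-local", "prefer-hosted"]
ALIASES = {"prefer-cloud": "prefer-hosted"}


def list_models(endpoints: List[str]) -> Dict[str, object]:
    """Build an OpenAI-shaped model list: routing modes first, then endpoints."""
    ids = [*ROUTING_MODES, *endpoints]
    return {
        "object": "list",
        "data": [
            # owned_by is a placeholder value chosen for this sketch.
            {"id": model_id, "object": "model", "owned_by": "wayfinder-router"}
            for model_id in ids
        ],
    }


def normalize_mode(model: str) -> str:
    """Map the legacy alias onto the current directive; pass others through."""
    return ALIASES.get(model, model)


if __name__ == "__main__":
    # Endpoint names below are made up for the example.
    payload = list_models(["local-llama", "hosted-gpt"])
    print([m["id"] for m in payload["data"]])
    print(normalize_mode("prefer-cloud"))
```

Because the payload follows the shape OpenAI-compatible clients already expect from a model list, the routing modes show up in the model dropdown without any client-side change.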
The boundary still holds
None of this touches the deterministic core. The score is still computed offline with no model call, no key read, and no network; only the optional gateway/UI layers reach the network or keys, and only from the environment.
Install / upgrade
    pip install -U "wayfinder-router[gateway]"
Fully backward-compatible with v0.1.2: existing clients and wayfinder-router.toml configs are unaffected, and prefer-cloud keeps working.
Looking ahead → v0.2.0
This release finishes hardening the router so it's safe and pleasant to put behind any chat client. Next is wayfinder-chat — the turnkey path: a LibreChat fork with the gateway inside, where the per-request override and the visibility surface shipped here become a built-in per-conversation routing-mode picker and threshold slider. wayfinder-router stays the lean, deterministic, bring-your-own-client router; wayfinder-chat will be the one-thing-to-install experience for everyone else.
Full changelog: https://github.com/itsthelore/wayfinder-router/blob/main/CHANGELOG.md