Cost-quality routing for LLM APIs with reproducible Pareto frontiers per task class.
routerlab is an open-source library + CLI that routes each LLM task to the cheapest model that meets a quality threshold. Cost is grounded in real token economics (via tokenometer) and quality is predicted before the call, not measured after. Per-task Pareto frontiers are published openly so anyone can pick a model rationally.
Where existing routers tend to hand-wave cost or hide their methodology, routerlab is cost-first, reproducible, and open end-to-end.
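The routing rule described above (cheapest model that meets a quality threshold) can be sketched in a few lines. This is only an illustration under assumptions: the `Candidate` shape, the `pickCheapest` helper, and the model names and numbers are hypothetical and are not routerlab's actual API or data.

```typescript
// Hypothetical candidate shape; field names are illustrative, not routerlab's API.
interface Candidate {
  model: string;
  expectedCost: number;    // predicted cost of this call, in USD
  expectedQuality: number; // predicted quality score in [0, 1]
}

// Return the cheapest candidate whose predicted quality meets the bar,
// or undefined when no candidate qualifies.
function pickCheapest(
  candidates: Candidate[],
  qualityBar: number,
): Candidate | undefined {
  return candidates
    .filter((c) => c.expectedQuality >= qualityBar)
    .reduce<Candidate | undefined>(
      (best, c) =>
        best === undefined || c.expectedCost < best.expectedCost ? c : best,
      undefined,
    );
}

// Made-up predictions for one prompt (not measured results).
const candidates: Candidate[] = [
  { model: "large", expectedCost: 0.03, expectedQuality: 0.95 },
  { model: "medium", expectedCost: 0.006, expectedQuality: 0.88 },
  { model: "small", expectedCost: 0.001, expectedQuality: 0.71 },
];

const choice = pickCheapest(candidates, 0.85);
console.log(choice?.model); // prints "medium"
```

Raising the bar above every predicted quality yields `undefined`, which is the case where a real router would presumably need a fallback policy.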
Early / pre-release. Engine, eval harness, and per-task frontiers are under active development. Expect breaking changes until v0.1.0.
    bun add @routerlab/core @routerlab/cli

    # Route a single prompt at a quality bar of 0.85 for QA tasks:
    route --task=qa --quality-bar=0.85 --input=prompt.txt

Programmatic:

    import { route } from "@routerlab/core";
    const decision = await route({ task: "qa", qualityBar: 0.85, prompt });
    // => { model, expectedCost, expectedQuality, fallback }

- Production routing examples show support chatbot, extraction, summarization, and code-review routing patterns.
- OpenAI-compatible gateway design shows how to put routerlab in front of existing SDK clients without changing application prompts.
- The CLI package docs in packages/cli cover route route, route frontier, route models, and eval commands.
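To make the gateway idea above concrete, here is a minimal sketch of how a proxy might apply a routing decision to an OpenAI-compatible chat request while leaving the application's prompts untouched. It assumes the decision has the shape shown in the programmatic example; the field types, the `applyDecision` and `lastUserPrompt` helpers, and the model names are hypothetical and not taken from routerlab's gateway design.

```typescript
// Minimal subset of an OpenAI-compatible chat completion request body.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  [extra: string]: unknown; // temperature, max_tokens, etc. pass through as-is
}

// Decision fields as listed in the programmatic example; types are assumed.
interface Decision {
  model: string;
  expectedCost: number;
  expectedQuality: number;
  fallback?: string;
}

// Hypothetical helper: pick the text a router would score (last user turn).
function lastUserPrompt(request: ChatRequest): string | undefined {
  for (const m of [...request.messages].reverse()) {
    if (m.role === "user") return m.content;
  }
  return undefined;
}

// Hypothetical helper: swap in the routed model and change nothing else.
function applyDecision(request: ChatRequest, decision: Decision): ChatRequest {
  return { ...request, model: decision.model };
}

const incoming: ChatRequest = {
  model: "auto",
  messages: [
    { role: "system", content: "You are a support assistant." },
    { role: "user", content: "Summarize this ticket." },
  ],
  temperature: 0.2,
};

// Made-up decision, standing in for the result of a routing call.
const decision: Decision = {
  model: "small-model",
  expectedCost: 0.0004,
  expectedQuality: 0.87,
  fallback: "large-model",
};

const outgoing = applyDecision(incoming, decision);
console.log(lastUserPrompt(incoming)); // prints "Summarize this ticket."
console.log(outgoing.model);           // prints "small-model"
```

Because only the `model` field is rewritten, existing SDK clients could keep sending the same messages and sampling parameters.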
    bun install
    bun run eval:all   # regenerates eval/results/frontier.json + plots

Cached judge outputs and provider responses keep this affordable (default judge is the cheapest competent model in the candidate pool).
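For readers unfamiliar with the term, a cost-quality Pareto frontier keeps only the models that no other model beats on both cost and quality. The sketch below shows one common way to compute it; the `Point` shape and the sample numbers are illustrative assumptions and do not reflect the schema of eval/results/frontier.json or any real eval results.

```typescript
// Hypothetical eval summary for one model on one task class.
interface Point {
  model: string;
  cost: number;    // e.g. USD per 1,000 tasks
  quality: number; // mean judged score in [0, 1]
}

// Keep the points that are not dominated: a point is dropped when some
// other point is no more expensive and at least as good.
function paretoFrontier(points: Point[]): Point[] {
  // Sort by cost ascending; on equal cost, higher quality comes first.
  const sorted = [...points].sort(
    (a, b) => a.cost - b.cost || b.quality - a.quality,
  );
  const frontier: Point[] = [];
  let bestQuality = -Infinity;
  for (const p of sorted) {
    // A more expensive point earns its place only with strictly higher quality.
    if (p.quality > bestQuality) {
      frontier.push(p);
      bestQuality = p.quality;
    }
  }
  return frontier;
}

// Made-up numbers for illustration only.
const points: Point[] = [
  { model: "A", cost: 1, quality: 0.7 },
  { model: "B", cost: 5, quality: 0.85 },
  { model: "C", cost: 8, quality: 0.8 }, // dominated by B: pricier and worse
  { model: "D", cost: 30, quality: 0.95 },
];

const frontier = paretoFrontier(points);
console.log(frontier.map((p) => p.model).join(",")); // prints "A,B,D"
```

Given a frontier like this, choosing a model for a quality bar reduces to taking the first frontier point whose quality meets the bar.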
- Anthropic: Opus 4.7, Sonnet 4.6, Haiku 4.5.
- Free-tier: Groq (Llama 3.3 70B, Llama 3.1 8B, Mixtral 8x7B), Together, HuggingFace Inference, OpenRouter.
@misc{routerlab-2026,
author = {Faraazuddin Mohammed},
title = {{routerlab}: Practical Cost-Quality Routing for LLM APIs},
year = {2026},
howpublished = {\url{https://github.com/faraa2m/routerlab}}
}