One API. Every model. 0% markup on inference — your providers bill you directly.
Drop-in replacement for the OpenAI SDK.
```bash
npm install -g @hiway2llm/cli
hw signup
hw chat "explain quantum entanglement in 3 bullets"
```

No credit card. 1,000,000 free tokens per month. Works on Windows / macOS / Linux, Node.js ≥ 18.17.
```diff
- base_url: "https://api.openai.com/v1"
+ base_url: "https://app.hiway2llm.com/v1"
```

Your existing OpenAI-compatible code works unchanged. Your providers get the calls directly. HiWay2LLM just routes.
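What the swap looks like at the wire level: a sketch of an OpenAI-style chat-completions request pointed at the HiWay2LLM base URL, built with the standard library only. The model name and API key are placeholders; the endpoint path follows the OpenAI convention, not a documented HiWay2LLM spec.

```python
import json
import urllib.request

# Same request you would send to api.openai.com, re-pointed at HiWay2LLM.
BASE_URL = "https://app.hiway2llm.com/v1"

payload = {
    "model": "gpt-4o-mini",  # placeholder: any model your provider keys cover
    "messages": [
        {"role": "user", "content": "explain quantum entanglement in 3 bullets"}
    ],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer hw_YOUR_KEY",  # placeholder key
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would actually send it; here we only show the shape.
print(req.full_url)  # https://app.hiway2llm.com/v1/chat/completions
```

If you already use the official OpenAI SDK, you only change the base URL and key; the rest of the call site stays identical.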
Every request is scored in real time across capability × price × latency. HiWay2LLM picks the cheapest model that can handle it — and routes straight to your provider key. You never see a markup. Not 1%.
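In spirit, the routing decision reduces to "cheapest model that clears the bar." The sketch below is illustrative only: the model names, scores, prices, and thresholds are made up, and the real scorer is part of the hosted router, not this repo.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    capability: float    # 0..1, task-specific score (illustrative)
    price_per_1m: float  # $ per 1M tokens
    p50_latency_ms: int

def route(candidates, min_capability=0.7, max_latency_ms=2000):
    """Pick the cheapest candidate that clears the capability and latency bars."""
    ok = [c for c in candidates
          if c.capability >= min_capability and c.p50_latency_ms <= max_latency_ms]
    if not ok:
        raise ValueError("no model clears the capability/latency bar")
    return min(ok, key=lambda c: c.price_per_1m)

models = [
    Candidate("big-frontier", 0.98, 10.00, 1200),
    Candidate("mid-tier",     0.85,  0.60,  500),
    Candidate("tiny-fast",    0.55,  0.10,  200),
]

print(route(models).name)  # mid-tier: cheapest model that still clears the bar
```

Note that "tiny-fast" is cheapest overall but loses on capability, and "big-frontier" clears the bar but costs 16× more; the per-request scoring is what keeps easy tasks off expensive models.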
| Task | Without HiWay2LLM | With HiWay2LLM | Savings | Quality Δ |
|---|---|---|---|---|
| Classification | $0.082 | $0.004 | 95% | 0.0 |
| Summarization | $0.341 | $0.048 | 86% | +0.2 |
| Q&A factual | $0.156 | $0.012 | 92% | 0.0 |
| RAG | $0.234 | $0.067 | 71% | +0.1 |
| Code review | $0.891 | $0.234 | 74% | −0.3 |
Flat monthly fee. Token pool included. No per-call tax.
| Plan | €/month | Tokens included | Overage / 1M tokens |
|---|---|---|---|
| Free | €0 | 1,000,000 | — |
| Starter | €9 | 10,000,000 | €0.45 |
| Growth ⭐ | €29 | 50,000,000 | €0.40 |
| Business | €99 | 200,000,000 | €0.35 |
| Enterprise | Custom | Unlimited | Negotiated |
Annual billing: 20% off. Full pricing at hiway2llm.com.
What's a token? ~0.75 words. A typical GPT-4o call uses ~500 tokens. 1M tokens ≈ 2,000 average API calls.
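The plan arithmetic above, back-of-envelope style. The fees, included tokens, and overage rates are the published ones from the table; the usage figure is made up for illustration.

```python
PLANS = {  # name: (monthly_fee_eur, included_tokens, overage_eur_per_1m)
    "Starter": (9, 10_000_000, 0.45),
    "Growth":  (29, 50_000_000, 0.40),
}

def monthly_cost(plan: str, tokens_used: int) -> float:
    """Flat fee plus per-1M-token overage beyond the included pool."""
    fee, included, overage = PLANS[plan]
    extra = max(0, tokens_used - included)
    return fee + (extra / 1_000_000) * overage

# ~500 tokens per average call, so 1M tokens covers about 2,000 calls:
print(1_000_000 // 500)                    # 2000

# Growth plan at 60M tokens/month: 29 + (10M over * 0.40/1M) = 33.0
print(monthly_cost("Growth", 60_000_000))  # 33.0
```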
| | OpenRouter | Portkey | LiteLLM OSS | HiWay2LLM |
|---|---|---|---|---|
| Pricing model | 5% per-token markup | Per-seat + markup on managed | Self-host (your DevOps) | Flat fee |
| Markup on inference | +5% on every call, forever | Yes on managed tier | 0% (you run it) | 0% |
| Who holds your keys | OpenRouter | Portkey | You | You (BYOK) |
| Routing intelligence | Manual or basic | Add-on | You configure it | Capability × price × latency, per-request |
| GDPR / data residency | US-hosted | US-hosted | Wherever you deploy | FR 🇫🇷 · Enterprise: any region |
| Self-host option | No | No | Yes (DIY) | Enterprise: dedicated deploy |
| Setup time | Minutes | Minutes | Days (infra + ops) | Minutes |
OpenRouter takes 5% of every token you ever send. At $1,000/mo of inference spend that's $600/year in pure markup — for routing. LiteLLM is free to run but you're on the hook for infra, updates, incidents, and scaling. HiWay2LLM is a flat fee, nothing else.
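The $600/year figure above, spelled out: a 5% per-token markup at $1,000/month of spend, next to a flat fee (Growth at €29/month, currencies left unconverted for simplicity).

```python
monthly_spend = 1_000   # $ of inference per month
markup_rate = 0.05      # 5% per-token markup

markup_per_year = monthly_spend * markup_rate * 12  # $50/mo * 12
flat_per_year = 29 * 12                             # Growth plan, €/year

print(markup_per_year)  # 600.0
print(flat_per_year)    # 348
```

And the markup scales with your spend, while the flat fee does not.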
| Folder | Package | Status |
|---|---|---|
| cli-npm/ | @hiway2llm/cli | beta 0.1.1 |
| sdk-ts/ | @hiway2llm/client | beta 0.1.1 |
| sdk-python/ | hiway2llm | beta 0.1.1 |
| examples/ | Drop-in OpenAI replacement (TS + Py), streaming | ✅ |
The HiWay2LLM router (smart routing, semantic cache, billing, BYOK vault) is a hosted service. This repo is the client layer: CLI, SDKs, examples.
Dedicated deployment in your region, VPC peering, custom SLAs, SSO, audit logs.
→ hiway2llm.com or email enterprise@hiway2llm.com
MIT — see LICENSE.