ProofRoute

ProofRoute is a zero-dependency, CLI-first OpenAI-compatible LLM router and proxy that routes coding-agent prompts to the cheapest fast-enough local or cloud model, then prints prompt-free terminal proof for speed, savings, privacy, and model choice.

It is built for developers who do not want to pause a Vibe Coding session to compare model menus. ProofRoute inspects the request locally, predicts intent, estimates context and output pressure, checks cost and latency tradeoffs, and forwards the call to the best configured local or cloud model. The result is infrastructure that stays invisible while work is happening, then becomes loud only when it can prove speed, savings, model choice, and privacy in a frame worth sharing.

The first command appears this early because the project earns attention only when it proves itself before any setup work begins. Run the demo from a fresh clone to see the local routing proof, local/cloud route split, estimated savings, p95 Controller latency, speed lift, intent accuracy, and route mix without a network call, API key, hosted dashboard, npm publish state, or provider configuration. The cloud rows in that receipt are selected targets inside the zero-network decision, not live provider calls.

node ./bin/proofroute.js demo

The second checkpoint proves the proxy path instead of only the offline router. connect prints the drop-in OpenAI-compatible base URL and smoke --proxy sends a local fake-provider request through the same forwarding layer that coding agents use in a real session.

node ./bin/proofroute.js connect --port 8787
node ./bin/proofroute.js smoke --proxy

The architecture follows Agent-View-Controller because routing has to be fast, explainable, and easy to harden. The Controller owns local intent recognition, token estimation, context fit checks, cost modeling, latency modeling, and numerically stable Softmax ranking. The Agent owns asynchronous provider execution, OpenAI-compatible forwarding, Ollama adaptation, fallback execution, proxy lifecycle, and benchmark orchestration. The View owns terminal-native proof cards, route traces, Markdown receipts, connection cards, model maps, privacy reports, launch checks, and classifier receipts.
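
To make the "numerically stable Softmax ranking" concrete, here is a minimal sketch of the usual max-subtraction technique. It is not taken from the ProofRoute source: the function name, the `temperature` parameter, and the candidate names are assumptions for illustration only.

```javascript
// Numerically stable softmax: subtract the largest score before
// exponentiating so Math.exp never overflows on large raw scores.
// `temperature` is a hypothetical knob, not a documented ProofRoute option.
function stableSoftmax(scores, temperature = 1) {
  if (scores.length === 0) return [];
  const scaled = scores.map((s) => s / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Placeholder candidates with deliberately huge raw scores.
// A naive Math.exp(1000) would return Infinity and produce NaN.
const candidateNames = ["local-small", "cloud-large"];
const probabilities = stableSoftmax([1000, 1001]);
candidateNames.forEach((name, i) => {
  console.log(`${name}: ${probabilities[i].toFixed(3)}`);
});
```

Because only score differences matter after the shift, the ranking is unchanged while the intermediate values stay within floating-point range.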

The default experience is intentionally zero configuration. A developer can clone the repository, run the demo, and see a local proof card with provider calls fixed at zero before any API key, provider setup, or hosted dashboard exists. When the proxy is started, existing SDKs, editor extensions, CLIs, and coding agents can point their OpenAI base URL at ProofRoute and continue using familiar endpoints while the router quietly swaps in the model that fits the task.

Quick Start

The fresh-clone path is intentionally plain because ProofRoute should earn trust before asking for installation ceremony. Until node ./bin/proofroute.js publish --check-public --check-actions proves public npm visibility and anonymous GitHub visibility, the honest copy-paste path is direct Node execution from this repository. The shorter proofroute demo command works only after the checkout is linked locally or the npm package is publicly available. The second proof path is the single-prompt Markdown trace receipt that shows why the selected model beat the runner-up without printing the original prompt.

git clone https://github.com/Lling0000/proofroute.git
cd proofroute
node ./bin/proofroute.js demo
node ./bin/proofroute.js route --trace --markdown --prompt "Refactor this webhook, explain the bug, and write a regression test."

Proof Packs

The strongest first run for a maintainer is the core release proof pack. It needs no API key, calls no external provider, and makes no CUDA, TensorRT, or multi-GPU claim. It proves the repository face, zero-network router value, proxy smoke path, proxy matrix path, prompt-free privacy boundary, SVG receipt, git provenance, and launch copy that any contributor can reproduce on an ordinary laptop.

node ./bin/proofroute.js release --core --out proofroute-release-pack

When a maintainer wants stronger ordinary-laptop evidence, npm run classifier:artifact:evidence and npm run classifier:artifact:verify generate and verify classifier-linear-evidence.json, then the local artifact proof can be folded into the same no-hardware-claim archive. The release pack verifies classifier evidence before it copies the raw evidence file, so passing evidence can be archived with its verifier output while failed or prompt-tainted evidence is recorded only as a verifier failure and blocker summary. That path proves a hashed local classifier artifact and benchmark gate while still saying plainly that no CUDA, TensorRT, or multi-GPU hardware claim is being made.

node ./bin/proofroute.js release --core --require-artifact-evidence --artifact-evidence classifier-linear-evidence.json --max-evidence-age-hours 24 --out proofroute-release-pack

Any public hardware acceleration claim has to pass stricter evidence. node ./bin/proofroute.js doctor --strict-hardware checks that the classifier is a warmed HTTP sidecar with at least two devices, at least two lanes, and nvidia-smi sourced profiles. node ./bin/proofroute.js release --preflight --check-public --require-evidence --evidence classifier-evidence.json --max-evidence-age-hours 24 explains strict release blockers, including public face and clean git provenance blockers, without writing a proof pack or running proxy smoke and proxy matrix checks unless --smoke is explicitly added.

Daily Workflow

ProofRoute is shaped around terminal proof instead of a hosted dashboard. node ./bin/proofroute.js share or npm run share turns the zero-network proof into a launch screenshot. node ./bin/proofroute.js share --markdown or npm run share:markdown creates copy that can live in README sections, PR comments, launch posts, and discussions. node ./bin/proofroute.js share --svg --out docs/proofroute-terminal.svg or npm run share:svg refreshes the repository hero asset from the same prompt-free proof loop.

Once the proxy is running, node ./bin/proofroute.js stats --since 1h, node ./bin/proofroute.js stats --watch --since 1h, node ./bin/proofroute.js privacy --file .proofroute/events.jsonl, node ./bin/proofroute.js share --ledger --since 1h --markdown, node ./bin/proofroute.js prove --ledger --since 1h --min-requests 20 --max-p95-ms 5 --max-router-overhead-pct 1 --max-classifier-circuit-open 0, and node ./bin/proofroute.js tune --since 1h --export tuned-router.json turn a real coding session into a private proof loop. The ledger stores routing evidence rather than prompt or completion text, so teams can review savings, speed, policy, and privacy without exporting proprietary work.
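
The threshold flags passed to prove above can be read as a simple pass/fail gate over ledger records. The sketch below shows one way such a gate could work, assuming a nearest-rank p95 and a mean router overhead. The record fields `decisionMs` and `overheadPct` are hypothetical names, and the real command may aggregate differently.

```javascript
// Nearest-rank percentile over a list of numbers.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Compare aggregated records against limits and collect every failure.
// `decisionMs` and `overheadPct` are assumed field names, not the real schema.
function evaluateGate(records, limits) {
  const failures = [];
  const p95 = percentile(records.map((r) => r.decisionMs), 95);
  const meanOverheadPct =
    records.reduce((sum, r) => sum + r.overheadPct, 0) / records.length;
  if (records.length < limits.minRequests) {
    failures.push(`requests ${records.length} < ${limits.minRequests}`);
  }
  // Negated comparisons so a NaN from an empty ledger also fails the gate.
  if (!(p95 <= limits.maxP95Ms)) {
    failures.push(`p95 ${p95}ms > ${limits.maxP95Ms}ms`);
  }
  if (!(meanOverheadPct <= limits.maxRouterOverheadPct)) {
    failures.push(`overhead ${meanOverheadPct}% > ${limits.maxRouterOverheadPct}%`);
  }
  return { passed: failures.length === 0, p95, meanOverheadPct, failures };
}

// Limits mirror the flag values in the prose; the records are synthetic.
const gateLimits = { minRequests: 20, maxP95Ms: 5, maxRouterOverheadPct: 1 };
const sampleRecords = Array.from({ length: 20 }, (_, i) => ({
  decisionMs: i < 19 ? 2 : 9, // one slow outlier among twenty decisions
  overheadPct: 0.5,
}));
console.log(evaluateGate(sampleRecords, gateLimits));
```

With twenty records, the nearest-rank p95 is the nineteenth sorted value, so a single slow outlier does not fail the gate on its own.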

If a local ledger is malformed or contains unsafe fields, node ./bin/proofroute.js repair --file .proofroute/events.jsonl --out .proofroute/events.repaired.jsonl writes a new prompt-free artifact instead of mutating the original ledger. Repair drops malformed records and records that cannot be rebuilt into trusted route evidence, refuses in-place overwrite, projects usable records onto the allowed routing-evidence schema, and uses conservative value checks so a sensitive string hidden inside an allowed field does not become a shareable proof value. Pass the repaired file with --file to privacy, ledger-backed prove, share, and tune so a bad local JSONL record does not kill the proof loop while the data loss stays visible.

OpenAI Proxy

The day-one integration flow is compact. node ./bin/proofroute.js connect --port 8787 prints the OpenAI-compatible base URL and verification commands, while eval "$(node ./bin/proofroute.js connect --shell sh)" injects client variables into the current tool session. The placeholder OPENAI_API_KEY=proofroute-local is only for SDKs and CLIs that insist on a client key while talking to the local ProofRoute proxy; it is not an upstream OpenAI provider key. When the first executable model already lives behind LM Studio, vLLM, or another local OpenAI-compatible gateway, node ./bin/proofroute.js connect --local-openai http://127.0.0.1:1234/v1 --local-openai-model local-qwen-coder emits the client-side OpenAI variables and the proxy-side PROOFROUTE_LOCAL_OPENAI_* exports so the first routed request does not depend on router.json. Use the model id that your local gateway actually exposes; local-qwen-coder is the checked-in example name, not a required model.

node ./bin/proofroute.js connect --port 8787
eval "$(node ./bin/proofroute.js connect --shell sh)"

The transparent proxy serves /v1/chat/completions, /v1/completions, /v1/responses, /v1/models, and /ready as an OpenAI-compatible Base URL proxy rather than a network CONNECT proxy. For a local-only OpenAI-compatible server, start with examples/local-openai-router.json or the PROOFROUTE_LOCAL_OPENAI_* environment path so no upstream cloud key is needed. For the mixed catalog in examples/router.json, ProofRoute expects local Ollama at 127.0.0.1:11434, and cloud OpenAI routes need a real upstream OPENAI_API_KEY rather than the client placeholder printed by connect. The proxy exposes browser-readable proof headers for final model, requested model, model swap state, routing policy, intent, runner-up model, context window, context use, estimated route cost, baseline cost, savings percentage, speedup, estimated latency, baseline latency, decision latency, router overhead, cache state, classifier backend, classifier circuit state, fallback, actual token count, and actual savings whenever the upstream returns usage metadata.

node ./bin/proofroute.js proxy --port 8787 --config examples/local-openai-router.json
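
Once the proxy is listening, any OpenAI-style client can talk to it. The following sketch builds a chat completion request with the built-in fetch in Node 18 or later. The base URL http://127.0.0.1:8787/v1 is an assumption derived from the port used above, the helper names are hypothetical, and the proof header names are not listed in this README, so the sketch simply prints whatever headers come back.

```javascript
// Build an OpenAI-style chat completion request aimed at the local proxy.
// "proofroute-local" is the client-side placeholder key described above.
function buildChatRequest(baseUrl, model, prompt, apiKey = "proofroute-local") {
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Send the request and dump every response header, since the exact
// proof header names are not documented here.
async function sendThroughProxy(request) {
  const response = await fetch(request.url, request.init);
  for (const [name, value] of response.headers) {
    console.log(`${name}: ${value}`);
  }
  return response.json();
}

// Assumed base URL; use the one printed by `connect`.
const chatRequest = buildChatRequest(
  "http://127.0.0.1:8787/v1/",
  "local-qwen-coder", // example model id; use what your gateway exposes
  "Explain this stack trace."
);
console.log(chatRequest.url);
// Uncomment once the proxy is running:
// sendThroughProxy(chatRequest).then((body) => console.log(body));
```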

Routing Proof

The fastest way to understand one prompt is route. It prints detected intent, resolved policy, selected model, a decision receipt against the runner-up, candidate tradeoff bars for probability, cost fit, speed fit, and context use, estimated latency, estimated cost, and savings against the most expensive viable baseline. Trace mode keeps that receipt first, then adds the weighted quality, context, local, requested-model, cost, and latency contributions, plus the models filtered out by context, executability, budget, or latency ceilings. When the same command uses --markdown, it emits a prompt-free single-prompt receipt that can be pasted into a pull request, launch thread, or debugging note without carrying the original prompt text.

node ./bin/proofroute.js route --trace --markdown --prompt "Refactor this webhook, explain the bug, and write a regression test."
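
The receipt described above combines hard filters with weighted contributions. A rough sketch of that shape follows, assuming a tiny made-up catalog and a simplified subset of the contributions (quality, local, cost, latency). Every field name, weight, and number is hypothetical; the real Controller also ranks with Softmax and considers more signals.

```javascript
// Hypothetical catalog entries; field names and numbers are illustrative only.
const demoCatalog = [
  { id: "local-small", contextWindow: 8000, costPer1k: 0, latencyMs: 400, quality: 0.6, local: true },
  { id: "cloud-mid", contextWindow: 128000, costPer1k: 0.5, latencyMs: 900, quality: 0.8, local: false },
  { id: "cloud-large", contextWindow: 200000, costPer1k: 3, latencyMs: 2000, quality: 0.95, local: false },
];

function routeRequest(models, request, weights) {
  // Step 1: drop models that cannot serve the request at all.
  const rejected = [];
  const viable = models.filter((m) => {
    if (m.contextWindow < request.tokens) {
      rejected.push({ id: m.id, reason: "context" });
      return false;
    }
    if (m.latencyMs > request.maxLatencyMs) {
      rejected.push({ id: m.id, reason: "latency" });
      return false;
    }
    return true;
  });
  if (viable.length === 0) return { selected: null, runnerUp: null, savingsPct: 0, rejected };

  // Step 2: score the survivors; cost and latency fit are relative to the worst viable model.
  const cost = (m) => (request.tokens / 1000) * m.costPer1k;
  const maxCost = Math.max(...viable.map(cost));
  const maxLatency = Math.max(...viable.map((m) => m.latencyMs));
  const scored = viable
    .map((m) => ({
      id: m.id,
      cost: cost(m),
      score:
        weights.quality * m.quality +
        weights.local * (m.local ? 1 : 0) +
        weights.cost * (maxCost === 0 ? 1 : 1 - cost(m) / maxCost) +
        weights.latency * (1 - m.latencyMs / maxLatency),
    }))
    .sort((a, b) => b.score - a.score);

  // Step 3: savings are measured against the most expensive viable baseline.
  const [selected, runnerUp] = scored;
  const savingsPct = maxCost === 0 ? 0 : (1 - selected.cost / maxCost) * 100;
  return { selected, runnerUp: runnerUp ?? null, savingsPct, rejected };
}

const demoWeights = { quality: 1, local: 0.2, cost: 0.5, latency: 0.3 };
const decision = routeRequest(demoCatalog, { tokens: 2000, maxLatencyMs: 1500 }, demoWeights);
console.log(decision.selected.id, "beat", decision.runnerUp.id, "saving", decision.savingsPct, "%");
```

In this toy run the largest model is filtered out by the latency ceiling, so the baseline for savings is the most expensive model that remained viable, not the most expensive model in the catalog.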

Broader local evidence comes from bench, calibrate, plan, and fanout. The benchmark reports p50 and p95 Controller latency plus money saved and speed gained. Calibration reports labeled intent accuracy, savings, average speedup, intent mix, and model mix. Planning turns one complex request into multi-agent role routing before provider calls happen. Fanout batches executable role decisions in one Controller pass and dispatches them in parallel through configured providers.

Command Surface

The everyday command surface is intentionally headless. demo creates the zero-network first impression. share turns that proof into screenshots, Markdown, SVG, or ledger-backed receipts. prove converts latency, overhead, savings, speed, request volume, accuracy, and classifier-circuit thresholds into a failing gate. connect prints drop-in proxy setup. models renders executable local and cloud coverage. smoke verifies runtime or transparent proxy compatibility with a fake provider, while smoke --proxy --matrix proves through one local proxy origin that intent, context, and cost constraints can route to different models without leaking prompt text into headers or the ledger. stats and privacy turn local telemetry into prompt-free evidence. repair writes a separate prompt-free repaired ledger when malformed records block proof, sharing, or tuning, or when forbidden fields make the privacy audit fail before a ledger can be shared. doctor checks runtime, providers, telemetry, classifier, sidecar, and routing policy. tune recommends a reviewed policy patch from observed traffic. publish checks the clean source tree, package, npm, public GitHub, public npm, authenticated GitHub, and current-HEAD Actions surface before release copy tells people to install anything.

The detailed operator story lives in the executable commands instead of a static dashboard. Run node ./bin/proofroute.js --help for the current CLI map, node ./bin/proofroute.js profile for the repository face, node ./bin/proofroute.js launch for a prompt-free launch readiness card, and node ./bin/proofroute.js publish --check-public --check-actions for the public release gate. The publish gate emits machine-readable blockers, nextActions, npm.evidence, and summary.localEvidence so npm CLI version, npm pack dry-run coverage, npm auth failures, GitHub account visibility restrictions, anonymous public 404s, and disabled Actions runs become a release checklist instead of a vague red terminal. Adding --probe-actions-dispatch to publish deliberately attempts a workflow dispatch and records account-level Actions blockers when GitHub refuses the run. Adding --support-note renders the same evidence as plain, redacted text for external support or release logs, while --support-pack proofroute-publish-support-pack writes a small redacted attachment folder with status.json, the full JSON report, support note, next-action prose, and redaction policy without changing the failing gate. The --json flag keeps machine-readable output as the highest priority.

Configuration And Privacy

Configuration stays plain JSON because routing policy should be inspectable and copyable. Providers define execution endpoints, credentials, and protocol shape. Models define context window, price, median latency, throughput, locality, provider, endpoint, and per-intent quality. The default catalog is useful for demonstration rather than universal truth, and production teams should replace prices and latencies with their own observed telemetry.

node ./bin/proofroute.js init > router.json

For local OpenAI-compatible gateways, node ./bin/proofroute.js init --preset local-openai > router.json emits the same executable shape as examples/local-openai-router.json, including requiresApiKey: false, local policy bias, and a command-backed classifier example. The faster environment-only path is PROOFROUTE_LOCAL_OPENAI_BASE_URL, with optional model, context, latency, and throughput variables that let a private LM Studio, vLLM, or model server become a routable zero-price candidate without a JSON file.

The proxy writes .proofroute/events.jsonl as a privacy-preserving routing ledger by default. It records timestamp, requested model, final model, model swap state, provider, locality, resolved policy, intent, classifier backend, confidence, token estimates, context window, context use, runner-up model, baseline model, candidate count, rejected count, rejected reasons, cost estimates, baseline cost, savings, speedup, baseline latency, decision latency, router overhead, end-to-end latency, streaming mode, and status code. When a non-streaming upstream returns usage metadata, it can also record actual input tokens, output tokens, total tokens, routed cost, baseline cost, and actual savings, but it does not store prompt or completion text.

Repair is deliberately more conservative than privacy audit. The audit fails when forbidden keys or malformed JSONL are present, while repair creates a new JSONL file by keeping only trusted top-level route evidence, validating model and policy-like values against the current catalog and known routing vocabulary, dropping records without enough route evidence, and recording only counts plus line numbers for discarded data. This makes the repaired artifact useful for local proof recovery without pretending it is a lossless copy of the original telemetry.
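
The repair behavior described above amounts to an allow-list projection with value checks. Here is a minimal sketch of that idea, assuming a three-field schema and small sets of known models and intents. The field names, the model ids, and the function name are hypothetical, and the real ledger schema is much larger.

```javascript
// Hypothetical allow-list and vocabularies; the real schema has many more fields.
const KNOWN_MODELS = new Set(["local-small", "cloud-large"]);
const KNOWN_INTENTS = new Set(["code", "reasoning", "writing", "extraction", "long-context", "chat"]);

function repairLedger(jsonlText) {
  const kept = [];
  const droppedLines = [];
  jsonlText.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // ignore blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      droppedLines.push(index + 1); // malformed JSON: record only the line number
      return;
    }
    const trusted =
      record !== null &&
      typeof record === "object" &&
      !Array.isArray(record) &&
      KNOWN_MODELS.has(record.finalModel) &&
      KNOWN_INTENTS.has(record.intent) &&
      Number.isInteger(record.statusCode);
    if (!trusted) {
      droppedLines.push(index + 1); // not enough trusted route evidence
      return;
    }
    // Project onto the allow-list so extra keys such as a prompt never survive.
    kept.push(JSON.stringify({
      finalModel: record.finalModel,
      intent: record.intent,
      statusCode: record.statusCode,
    }));
  });
  return { repaired: kept.join("\n"), keptCount: kept.length, droppedLines };
}

const sampleLedger = [
  '{"finalModel":"local-small","intent":"code","statusCode":200,"prompt":"secret"}',
  "not json",
  '{"finalModel":"unknown-model","intent":"code","statusCode":200}',
].join("\n");
console.log(repairLedger(sampleLedger));
```

The function returns a new string instead of touching its input, which mirrors the refusal to overwrite the original ledger in place.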

Classifier Acceleration

The built-in classifier is compact, deterministic, and zero dependency. It uses weighted lexical features, prompt shape signals, token pressure, and stable probability mass to choose among code, reasoning, writing, extraction, long-context, and chat intents. The interface is intentionally shaped so ONNX, TensorRT, vLLM, private embedding code, or another warm local classifier can replace scoring internals without changing the proxy, evidence, or terminal views.
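
As a rough illustration of weighted lexical features feeding a stable probability distribution, the sketch below scores a prompt against per-intent keyword weights and normalizes with a max-shifted softmax. The keyword table, weights, and function name are invented for the example. It omits the prompt shape and token pressure signals, and therefore the long-context intent, that the built-in classifier also uses.

```javascript
// Hypothetical keyword weights; real features and weights will differ.
const INTENT_KEYWORDS = {
  code: { refactor: 2, function: 1.5, test: 1.5, bug: 1.5 },
  reasoning: { why: 1.5, prove: 2, compare: 1.5 },
  writing: { draft: 2, rewrite: 1.5, email: 1.5 },
  extraction: { extract: 2, parse: 1.5, json: 1 },
  chat: { hello: 2, thanks: 1.5 },
};

function classifyIntent(prompt) {
  const tokens = prompt.toLowerCase().match(/[a-z]+/g) ?? [];
  const intents = Object.keys(INTENT_KEYWORDS);

  // Sum keyword weights per intent. Object.hasOwn guards against inherited
  // keys such as "constructor" being mistaken for features.
  const scores = intents.map((intent) => {
    const weights = INTENT_KEYWORDS[intent];
    return tokens.reduce(
      (sum, token) => sum + (Object.hasOwn(weights, token) ? weights[token] : 0),
      0
    );
  });

  // Max-shifted softmax keeps the probability mass finite and summing to 1.
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  const ranked = intents
    .map((intent, i) => ({ intent, probability: exps[i] / total }))
    .sort((a, b) => b.probability - a.probability);

  return { intent: ranked[0].intent, confidence: ranked[0].probability, ranked };
}

const classified = classifyIntent("Refactor this function and add a regression test.");
console.log(classified.intent, classified.confidence.toFixed(3));
```

Because the function is pure and the table is static, the same prompt always yields the same intent, which is the property that makes a deterministic classifier easy to test and to swap for another backend.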

Persistent local accelerators attach through PROOFROUTE_CLASSIFIER_URL or classifier.url. The npm package exposes two public binaries: proofroute is the main router, proxy, proof, release, and publish CLI, while proofroute-classifier is the reference classifier sidecar for warm local intent classification. The bundled sidecar exposes GET /health, GET /ready, GET /metrics, POST /warmup, POST /classify, and POST /classify/batch, reports backend, scheduler, warmup, batch, lane, device profile, request, error, uptime, inflight, and decision-time metadata, and can use device profiles from environment fields or a bounded nvidia-smi startup probe. Public CUDA, TensorRT, or multi-GPU copy must come from npm run classifier:hardware:evidence and npm run classifier:hardware:verify, while ordinary laptops can use the artifact-only verifier without making a hardware claim.

PROOFROUTE_CLASSIFIER_URL=http://127.0.0.1:8788/classify node ./bin/proofroute.js route --prompt "Refactor this function and add a regression test."

The example accelerator path is practical rather than decorative. examples/accelerator-worker.js defines the hot NDJSON worker protocol, examples/linear-accelerator-module.js proves a local model artifact with stable Softmax, examples/onnx-accelerator-module.js shows the optional ONNX Runtime shape, and examples/tensorrt-accelerator-module.js defines the TensorRT engine contract for machines that build their own serialized engine. proofroute learn, examples/train-linear-intent-model.js, and examples/export-linear-intent-onnx.js close the loop from labeled samples to reviewed classifier artifacts.

Development And Release

Tests focus on the pieces where correctness matters most: numerical stability, intent recognition, route economics, proxy compatibility, privacy boundaries, public repository metadata, publish readiness, release proof packs, and classifier evidence. Run the suite with the built-in Node test runner.

node --test

Repository copy is generated and checked by node ./bin/proofroute.js profile or npm run profile, while npm run release:npm:dry-run previews the package tarball without publishing. Release status is deliberately honest: direct checkout commands are the source of truth until node ./bin/proofroute.js profile --check-public proves the public face and node ./bin/proofroute.js publish --check-public --check-actions proves the final install-copy gate. Release readiness is checked by node ./bin/proofroute.js launch or npm run launch, while node ./bin/proofroute.js launch --check-public adds no-credential GitHub and npm visibility.

Maintainers can run node ./bin/proofroute.js publish --local-only or npm run publish:local to prove local source, package surface, npm CLI, npm pack dry-run, and npm publish dry-run without touching npm auth, GitHub, anonymous public visibility, or Actions. That local-only pass is useful release evidence, but it is not the final public install-copy gate.

The final publish-facing gate remains node ./bin/proofroute.js publish --check-public --check-actions or npm run publish:preflight, which checks package metadata, a resolvable git HEAD, a clean source tree, npm CLI availability, npm pack dry-run contents, npm publish dry-run stability, npm registry identity, authenticated GitHub visibility, anonymous GitHub visibility, public npm visibility, and recent GitHub Actions evidence for the current local HEAD before public install copy is trusted. The same report now separates local source and package evidence from operator or platform blockers, so a clean checkout, clean npm CLI, clean tarball surface, and clean publish dry-run remain visible even when npm login, public npm visibility, GitHub account visibility, or account-level Actions access still need outside action.

When an account-level platform blocker needs a human explanation, npm run publish:support-note produces a copyable redacted note and npm run publish:support-pack writes a redacted attachment directory with a compact status.json, without bypassing any blocker or turning an external account restriction into a code success.

The shareable core archive is node ./bin/proofroute.js release --core --out proofroute-release-pack, npm run release:core, or the intentionally unqualified npm run release:pack. It writes repository metadata, launch readiness JSON, git provenance, paragraph-only release prose, prompt-free launch copy, copied SVG proof assets, and an explicit no-accelerator-claim classifier state into one folder for release notes, pull requests, and launch posts. The authenticated GitHub drift-checking maintainer archive is npm run release:pack:github, so the ordinary-laptop first pack stays separate from the remote repository gate. When classifier evidence is supplied, the pack consumes verifier output first and copies the original evidence only after verification passes, which keeps failed evidence useful for debugging without preserving a raw submitted bundle inside a public archive. Strict hardware release paths require fresh classifier evidence with the hardware probe gate before accelerator claims can pass.

ProofRoute is meant to be open source infrastructure with a shareable heartbeat. The strongest demo is not a slide or a landing page; it is a terminal capture where a real prompt is routed in milliseconds, a local model wins when it should, a premium model wins when the task deserves it, and exact savings appear in the same frame as the engineering decision.

License

MIT. See LICENSE.
