Releases: zoefix/neko-route
0.2.2
Highlights
Model sharing — securely share your local models with friends over a free tunnel. Works with any OpenAI Responses API client (Codex, opencode, …), not only Neko Route. Each token scopes its allowed models with a spend quota and concurrency / RPM limits; keys (sk-…) and downstream model IDs are customizable. Image generation is supported and metered.
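As a rough illustration of how a share token could scope access, the sketch below models the limits named above (allowed models, spend quota, concurrency, RPM). The class name ShareToken, its fields, and the rejection reasons are all hypothetical; this is not Neko Route's actual implementation.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ShareToken:
    """Hypothetical per-token limits; field names are illustrative only."""
    allowed_models: set
    spend_quota_usd: float
    max_concurrency: int
    rpm_limit: int
    spent_usd: float = 0.0
    in_flight: int = 0
    recent: deque = field(default_factory=deque)  # timestamps of admitted requests

    def admit(self, model: str, now: Optional[float] = None) -> tuple:
        """Return (allowed, reason) for a new request against this token."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have left the 60-second RPM window.
        while self.recent and now - self.recent[0] >= 60.0:
            self.recent.popleft()
        if model not in self.allowed_models:
            return False, "model_not_allowed"
        if self.spent_usd >= self.spend_quota_usd:
            return False, "quota_exhausted"
        if self.in_flight >= self.max_concurrency:
            return False, "too_many_concurrent_requests"
        if len(self.recent) >= self.rpm_limit:
            return False, "rate_limited"
        self.recent.append(now)
        self.in_flight += 1
        return True, "ok"

    def finish(self, cost_usd: float) -> None:
        """Record the metered cost once a request completes."""
        self.in_flight -= 1
        self.spent_usd += cost_usd
```

Checking the model scope first means a request for a model outside the token's scope is rejected without consuming any rate-limit budget.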
Fixes
- Streaming requests are now billed, and OpenAI cached input tokens are no longer double-counted.
- Shared requests route to the requested model regardless of Codex default / fallback / auxiliary / memory settings.
- Standard clients sending max_output_tokens no longer fail against official accounts.
- Request logs separate shared from local traffic and can be filtered by token.
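To illustrate the cached-token fix: in OpenAI usage payloads, input_tokens already includes the cached portion reported under input_tokens_details.cached_tokens, so a cost estimate has to subtract it before applying the cached rate. The sketch below shows one way to do that. The function name and the per-million-token prices passed in are placeholders, not Neko Route's pricing code.

```python
def estimate_cost_usd(usage: dict, price_in: float, price_cached: float, price_out: float) -> float:
    """Estimate cost from an OpenAI-style usage dict; prices are USD per 1M tokens."""
    input_tokens = usage.get("input_tokens", 0)
    cached = usage.get("input_tokens_details", {}).get("cached_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    # input_tokens already contains the cached tokens, so only the uncached
    # remainder is billed at the full input rate.
    uncached = max(input_tokens - cached, 0)
    total = uncached * price_in + cached * price_cached + output_tokens * price_out
    return total / 1_000_000
```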
Neko Route 0.2.1
Highlights
- Direct Provider mode — pass Codex requests straight through to an upstream OpenAI provider while still recording logs, tokens, and cost.
- Auxiliary & memory models — route Codex's internal auxiliary and memory-writing agents to dedicated 1M-context models to prevent context-length failures.
- Health page — a dedicated view of each enabled model's recent request health, drawn from the full request log.
- Dashboard — multi-model token & cost trend chart with per-model cost.
Changes
- Reasoning levels aligned to Codex's four-tier catalog (low / medium / high / xhigh); Anthropic requests map to Claude's full range automatically.
- Request logs now label request type (main / auxiliary / memory) on both OpenAI and Anthropic routes.
Fixes
- Memory and auxiliary agents now route to the configured model instead of the fallback provider.
- Codex image editing now works on official OpenAI accounts.
v0.2.0
Improvements
- Logs: 2xx requests always show a green status now — including client-disconnected streams that actually completed, where the stream column shows final latency instead of "disconnected".
- Logs: rebuilt the table layout with adaptive column widths and no horizontal scrolling.
- Dashboard: replaced the "Active models" card with "Total cost" (estimated spend summed across providers).
Fixes
- Cost is now estimated at query time, so requests logged with an empty cost (e.g. under a random model ID) show pricing on refresh, matched on the request's real model name.
- Expanded the pricing table: added gpt-4o / gpt-4o-mini / gpt-4-turbo / o1 / o3 / o4-mini and Claude 3.x models.
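The query-time estimate described above can be pictured as a lookup performed when a log row is read rather than when it is written. In this sketch the row fields (upstream_model, input_tokens, output_tokens) and the prices are assumptions made for illustration; the rates are placeholders, not real pricing.

```python
# Placeholder prices in USD per 1M tokens as (input, output); not real rates.
PRICING = {
    "gpt-4o": (1.0, 2.0),
    "gpt-4o-mini": (0.1, 0.2),
}


def cost_at_query_time(log_row: dict, pricing: dict = PRICING):
    """Price a stored log row when it is read, keyed on the real upstream model."""
    rates = pricing.get(log_row["upstream_model"])
    if rates is None:
        return None  # unknown model: leave the cost blank rather than guess
    price_in, price_out = rates
    total = log_row["input_tokens"] * price_in + log_row["output_tokens"] * price_out
    return total / 1_000_000
```

Because nothing is frozen into the row, adding a model to the pricing table later makes older rows show a cost on the next refresh.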
v0.1.9
Features
- Image generation — OpenAI Images protocol (/v1/images/generations, /v1/images/edits) plus a /neko-image Codex skill for context-aware create/edit.
- Image generation test in the model tester — verify whether a model can produce images.
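For readers unfamiliar with the OpenAI Images protocol, the sketch below builds (without sending) a generation request against the /v1/images/generations path. The base URL, API key, and model name used in the usage note are placeholders, and the helper name is hypothetical.

```python
import json
import urllib.request


def build_image_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI Images-style generation request."""
    body = {"model": model, "prompt": prompt, "n": 1, "size": "1024x1024"}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A caller would pass the result to urllib.request.urlopen, for example with build_image_request("http://127.0.0.1:8080", "sk-test", "my-image-model", "a cat"), where every argument is an assumed value.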
Improvements
- Model IDs are now auto-generated and hidden; set a display name and upstream model instead.
- No models are preset on first run. Stale presets on signed-out built-in clients are cleaned up automatically, and sidebar/dashboard counts reflect only usable providers.
- Logs: thumbnail preview for image requests, display names instead of internal IDs, image quality in the reasoning column, and tighter column widths.
- Image models are hidden from the Codex model list (image generation runs through /neko-image).
Fixes
- Image edits now inject the upstream model into the multipart request (fixes 400 "model name cannot be empty").
- Windows: prompt to start Codex manually when no process is detected, instead of failing to restart.
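The image-edit fix above amounts to making sure the multipart form carries a model field before it is forwarded. A minimal sketch of that idea follows, covering text fields only (file parts are omitted for brevity). Both helper names are hypothetical, and the real request handling may differ.

```python
def inject_model(fields: dict, upstream_model: str) -> dict:
    """Return a copy of the form fields with `model` set to the upstream model."""
    patched = dict(fields)
    patched["model"] = upstream_model
    return patched


def encode_text_fields(fields: dict, boundary: str) -> bytes:
    """Encode text-only form fields as a multipart/form-data body."""
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    return "".join(parts).encode("utf-8")
```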
v0.1.8
Fixed
- Context usage reported to Codex now reflects the real pre-edit request size, fixing long agent sessions that under-counted context occupancy and never compacted. Billing and quota accounting still use the post-edit usage.
- Dropped the local context-window pre-compression pass; its byte-based token estimate overshot the real size and could evict cached prefixes, breaking Anthropic prompt caching.
Added
- Per-request estimated cost column in the request log, priced by the upstream model (Claude and OpenAI catalogs).
v0.1.7
Fixed
- Official Anthropic (Claude) requests no longer end after the thinking phase with an empty reply; the output-token budget now has a floor and reasoning effort follows the Codex request tier.
- Chat Completions streaming now reports token usage on strict OpenAI-compatible providers.
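On the streaming-usage fix: the OpenAI Chat Completions API only appends a usage chunk to a stream when the request sets stream_options.include_usage. Whether Neko Route uses exactly this mechanism is an assumption; the sketch below shows how a proxy could request and collect that chunk, with hypothetical helper names.

```python
def with_usage_reporting(body: dict) -> dict:
    """Ask a Chat Completions upstream to append a usage chunk to the stream."""
    patched = dict(body)
    if patched.get("stream"):
        options = dict(patched.get("stream_options") or {})
        options["include_usage"] = True
        patched["stream_options"] = options
    return patched


def usage_from_chunks(chunks: list):
    """Return the last non-empty usage object seen in the stream, if any."""
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```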
Changed
- Context management now relies solely on Anthropic's official Context Editing and Compaction; the in-house compressor and its user-facing settings were removed, and tool-result governance is fixed internally.
- Images are now sent inline as base64 instead of through the Files API, fixing uploads on official OAuth channels.
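For reference, an inline image in the Anthropic Messages API is a content block with a base64 source. The sketch below builds one from raw bytes; the helper name is hypothetical, and the caller is assumed to know the correct media type.

```python
import base64


def inline_image_block(data: bytes, media_type: str = "image/png") -> dict:
    """Build an Anthropic Messages image content block with inline base64 data."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```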
Added
- Homebrew Cask installation and release automation for macOS.
Neko Route v0.1.6
What's Changed
- Fixed model switching from DeepSeek to GPT by removing Neko Route local reasoning markers before OpenAI Responses verification.
- Preserved DeepSeek reasoning continuity for same-provider conversations.
Neko Route v0.1.5
What's Changed
- Fixed Claude Desktop Official detection when credentials are stored in the encrypted Desktop token cache.
- Removed the duplicate top-bar server status badge; the sidebar status remains the single service indicator.
Neko Route v0.1.4
What's Changed
- Added per-provider HTTP proxy settings for built-in official clients, user-added official accounts, and custom API providers. Proxy passwords are stored in the local secret store instead of JSON config.
- Moved provider credential viewing, copying, and editing into the provider edit dialog. The provider list no longer shows the key-management button.
- Added automatic Codex slot mapping for third-party API and LAN modes so Codex sees compatible gpt-* model IDs while Neko Route keeps routing to the real model.
- Improved Anthropic Messages mirroring, context-pressure handling, and stream completion reporting to avoid silent stops and repeated context-full retries.
- Improved OpenAI Chat Completions conversion, including tool-call pairing, response_format downgrades, streaming output conversion, and usage handling.
- Fixed manual and automatic Codex config application so both paths share the same slot mapping and local route checks.
- Polished the provider UI: API keys are hidden behind inline reveal/copy controls, HTTP proxy controls sit at the bottom of the edit dialog, and a compact green proxy-route indicator appears next to provider names.
- Changed the top-right Restart Codex action to always ask for confirmation.
- Fixed Windows packaging script cleanup so stale NSIS installers are not copied into release output.
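The Codex slot mapping mentioned above can be thought of as an alias table: Codex is shown gpt-* IDs, and the router translates them back to the real model on each request. The slot IDs and model names in this sketch are invented for illustration and do not reflect the IDs Neko Route actually generates.

```python
def build_slot_map(real_models: list, slot_ids: list) -> dict:
    """Assign each real model to one Codex-visible slot ID, in order."""
    if len(real_models) > len(slot_ids):
        raise ValueError("not enough slots for the configured models")
    return dict(zip(slot_ids, real_models))


def resolve(slot_map: dict, requested_model: str) -> str:
    """Translate a slot ID back to the real model; pass unknown IDs through."""
    return slot_map.get(requested_model, requested_model)
```

With hypothetical inputs, build_slot_map(["provider-model-1"], ["gpt-slot-a", "gpt-slot-b"]) exposes gpt-slot-a to Codex while requests for it are routed to provider-model-1.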
Neko Route v0.1.3
[0.1.3] - 2026-06-21
Added
- Added LAN sharing mode with remote model discovery and Codex configuration support.
- Added Claude context-pressure tracking, context bridge diagnostics, and archived tool-result recall.
- Added request-log stream state tracking for converted Anthropic and Chat Completions streams.
- Added richer Codex catalog metadata and unified auto-compact limits for all model protocols.
Changed
- Mirrored Claude Desktop / Claude Code Anthropic Messages requests more closely, including separate messages and count-tokens profiles.
- Improved Anthropic Messages conversion for system placement, prompt cache positioning, thinking restoration, and large tool-result compression.
- Improved OpenAI Chat Completions bridging from Responses input, including tool-call pairing, multimodal content, reasoning, response formats, and stream conversion.
- Downgraded unsupported Chat Completions json_schema response formats to json_object for non-allowlisted providers.
- Raised generated model auto-compact limits to 90% for every protocol.
- Refined provider, model, log, Codex setup, and LAN sharing UI surfaces.
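The json_schema downgrade listed above can be sketched as a small request rewrite. The function name, the provider identifiers, and the shape of the allowlist are assumptions; only the response_format type values come from the Chat Completions API.

```python
def downgrade_response_format(body: dict, provider: str, allowlist: set) -> dict:
    """Replace a json_schema response_format with json_object for providers
    that are not known to support structured outputs."""
    fmt = body.get("response_format")
    if not isinstance(fmt, dict) or fmt.get("type") != "json_schema":
        return body  # nothing to downgrade
    if provider in allowlist:
        return body  # provider accepts json_schema as-is
    patched = dict(body)
    patched["response_format"] = {"type": "json_object"}
    return patched
```

The downgrade drops the schema itself, so the upstream only guarantees valid JSON and the caller can no longer rely on the requested shape.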
Fixed
- Fixed Claude context-full requests repeatedly failing once before succeeding after compression.
- Fixed Anthropic mid-conversation system messages so they are only placed where Claude accepts them.
- Fixed converted Anthropic streams incorrectly reporting success after upstream interruption or missing message_stop.
- Fixed Anthropic max_tokens stop reasons so they surface as incomplete Responses results.
- Fixed Chat Completions model tests that could miss returned content or reasoning-only output.
- Fixed provider compatibility issues caused by sending json_schema to upstreams that only support json_object.
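The two stream-completion fixes above can be summarized as a mapping from Anthropic stream events to a final Responses status. This is a simplified sketch with a hypothetical function name: it inspects only message_delta and message_stop events, and treats a stream without message_stop as failed.

```python
def final_status(events: list) -> dict:
    """Derive a Responses-style final status from Anthropic stream events."""
    stop_reason = None
    saw_message_stop = False
    for event in events:
        kind = event.get("type")
        if kind == "message_delta":
            stop_reason = event.get("delta", {}).get("stop_reason") or stop_reason
        elif kind == "message_stop":
            saw_message_stop = True
    if not saw_message_stop:
        # The upstream was interrupted; do not report success.
        return {"status": "failed"}
    if stop_reason == "max_tokens":
        return {"status": "incomplete", "incomplete_details": {"reason": "max_output_tokens"}}
    return {"status": "completed"}
```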