[3.8.37] — 2026-06-26
✨ New Features
- feat(providers): add the DGrid AI gateway provider, an OpenAI-compatible gateway at `api.dgrid.ai/v1` (alias `dgrid`, API-key auth, passthrough models). Free router tier (10 RPM / 100 RPD); a $5 lifetime top-up raises limits to 20 RPM / 1,000 RPD. (#4931, thanks @dgridOP)
- feat(providers): add the Pioneer AI (Fastino Labs) provider, with OpenAI-compatible chat completions at `api.pioneer.ai/v1`. Registered with alias `pn`, `X-API-Key` auth, and a catalog of 10 open-tier serverless models (Qwen3, Llama 3.1/3.2, Gemma 3, SmolLM3). Free $75 credits, no credit card required. Gated enterprise models (Claude/GPT/Gemini) require prior fine-tuning on the Pioneer platform and are intentionally excluded from the catalog. (#4909, thanks @HikiNarou)
- feat(providers): add xAI Grok inbound translators and a thinking patcher. Grok requests are now translated on the inbound path and reasoning is normalized so Grok modes behave consistently across clients. (#4910, thanks @mugnimaestra)
- feat(oauth): Codex bulk-import endpoint. `POST /api/oauth/codex/import` accepts multiple Codex OAuth credentials in one call for fast multi-account onboarding. (#4914, thanks @beaaan)
- feat(embeddings): add a `dimensions` override field to embedding combos so an embedding combo can pin the output vector size per target. (#4913, thanks @wenzetan)
- feat(sse): auto-promote the successful combo model. A new opt-in `comboAutoPromoteEnabled` setting reorders a combo's persisted model list so that, when a combo model responds successfully, it is moved to position #1 for future requests. (#4852, thanks @arssnndr)
- feat(sse): add toggleable tool-source diagnostics. An opt-in switch surfaces where each tool definition originated when debugging tool-routing issues. (#4856, thanks @DuyPrX)
- feat(headroom): proxy lifecycle management plus dashboard UI. Start, stop, and monitor a Headroom compression proxy from the dashboard, with Docker sidecar support. (#4649, thanks @diegosouzapw / @carmelogunsroses)
- feat(sse): add the `x-omniroute-strip-reasoning` request header to drop `reasoning_content` from upstream responses (opt-in, preserving reasoning-aware clients). (#4678, thanks @anuragg-saxenaa / @diegosouzapw)
- feat(cli): multi-model support for the Factory Droid CLI integration. (#4682, thanks @anuragg-saxenaa / @diegosouzapw)
- feat(sse): parse the Gemini CLI 429 `retryDelay` from the structured `RetryInfo` payload so cooldowns honor the upstream-provided backoff. (#4738, thanks @NoxzRCW)
- feat(sse): add GPT-4 and GPT-4o mini to the GitHub Copilot provider catalog. (#4798, #4797, thanks @decolua)
- feat(api): add the `MiniMax-M3` pricing row (canonical + lowercase alias) so the new MiniMax default model gets accurate per-request cost accounting instead of falling back to a zero/default rate. (#4814, thanks @octo-patch)
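
The auto-promote behavior listed above (the opt-in `comboAutoPromoteEnabled` setting) can be pictured as a small list reorder. The following is a minimal sketch under stated assumptions, not the project's implementation: the function name `promoteModel` and the model names are hypothetical, and persistence of the reordered list is left out.

```typescript
// Hypothetical helper (not OmniRoute's actual code): move the model that
// just responded successfully to position #1, keeping the relative order
// of the remaining models unchanged.
function promoteModel(
  models: readonly string[],
  succeeded: string,
  enabled: boolean = true, // stands in for the comboAutoPromoteEnabled setting
): string[] {
  // When the feature is off, or the model is not part of the combo,
  // return a copy of the list as-is.
  if (!enabled || models.indexOf(succeeded) === -1) {
    return models.slice();
  }
  return [succeeded].concat(models.filter((m) => m !== succeeded));
}

// Illustrative model names only.
const combo = ["model-a", "model-b", "model-c"];
console.log(promoteModel(combo, "model-b"));
```

Returning a new array instead of mutating the input keeps the sketch easy to test; a real implementation would also need to write the new order back to storage.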
🔧 Bug Fixes
- fix(sse): dense, deterministic `response.output` ordering in `response.completed`. Items are now sorted by their actual `output_index` (via a recorded-as-emitted accumulator plus a stable sort) instead of being rebuilt from unordered state dicts; `normalizeOutputIndex` replaces fragile `parseInt` calls for robust index coercion; superseded tool calls (replaced at the same index mid-stream) are excluded from the final output array. (#4906, thanks @Marco9113)
- fix(sse): normalize Codex custom/freeform tools (`apply_patch`, `type:"custom"` with no `parameters`) to a `{ input: string }` function schema instead of an empty schema. The empty schema made models invoke `apply_patch` with `{}`, breaking the Codex runtime, which expects `{ input: string }`. Also maps `custom_tool_call` / `custom_tool_call_output` input items and streams `apply_patch` tool calls via `custom_tool_call_input.delta` / `.done` events. (#4862, thanks @nstung463)
- fix(sse): preserve the `required` array when translating Draft 2020-12 antigravity tool schemas (e.g. from OpenCode), stripping unsupported JSON Schema meta keywords while keeping mandatory arguments required so the model no longer calls tools without them. (#4843, thanks @anuragg-saxenaa)
- fix(sse): Kiro tool-schema sanitizer. Strip unsupported JSON Schema keywords (`anyOf`/`$ref`/`if-then`, etc.) and hash-truncate tool names longer than 64 characters before dispatch, mapping the streamed tool-call name back for the client, so Kiro no longer rejects tool calls with `400 "Improperly formed request"`. (#4847, thanks @smarthomeblack)
- fix(sse): make the `anthropic-version` default-guard case-insensitive for `anthropic-compatible-*` providers, so a caller/operator-supplied `Anthropic-Version` (any casing) is no longer clobbered by a second lowercase `anthropic-version: 2023-06-01` header. (#4823, thanks @zakirkun)
- fix(db): validate HuggingFace API tokens via the `whoami-v2` endpoint as a pure auth probe so fine-grained Inference-Provider tokens (valid even when model/task endpoints reject them) are no longer falsely marked invalid; only 401/403 means an invalid key, and other non-OK statuses surface as transient upstream errors. (#4819, thanks @Delcado19)
- fix(sse): reject the Anthropic-only `[1m]` context-1m suffix in `buildKiroPayload` before it reaches AWS Bedrock. Kiro is Bedrock-backed and cannot honor the beta, so a forwarded `kr/*[1m]` model id was malformed upstream; callers now get a clear error pointing them at a direct-Anthropic provider for 1M-context routing. (#4816, thanks @Delcado19)
- fix(dashboard): align the Engine Combos editor engines with the API schema. The named-combos pipeline dropdown offered four engines (`headroom`, `session-dedup`, `ccr`, `llmlingua`) that `PUT /api/context/combos/[id]` rejects, so selecting one made the save return 400 while the UI swallowed the error. The dropdown is now sourced from a single canonical engine map shared with `stackedPipelineStepSchema` (parity guarded by a unit test), and the editor surfaces save errors plus empty-name/empty-pipeline validation instead of failing quietly. (#5062, closes #4955)
- fix(sse): surface malformed HTTP-200 upstream responses instead of treating them as success, so combo fallback can trigger. (#4942, thanks @haipham22)
- fix(antigravity): retry transient upstream failures rather than failing the request outright. (#4941, thanks @Jordannst)
- fix(sse): exclude WS-bridge controller-closed errors from the provider circuit breaker so a client disconnect no longer trips the whole provider. (#4870, closes #4602, thanks @huohua-dev)
- fix(sse): resolve custom combos by id and case-insensitive name. (#4869, closes #4446, thanks @herjarsa)
- fix(sse): forward AI SDK image parts in the Responses translator. (#4859, thanks @mugnimaestra)
- fix(sse): emit valid concatenable Kiro `tool_calls.arguments` deltas. (#4855, thanks @wahyuzero)
- fix(sse): strip `temperature` for Claude models with extended thinking enabled (the upstream rejects it). (#4853, thanks @noestelar)
- fix(sse): unwrap the Qoder HTTP-200 SSE error envelope so combo fallback can trigger. (#4850, thanks @vianlearns)
- fix(sse): strip reasoning blobs from agentic context to prevent O(n²) token growth across multi-turn agent loops. (#4849, thanks @GodrezJr2)
- fix(sse): close the reasoning block before message content in the Responses stream so clients render reasoning and answer in the right order. (#4848, thanks @kwanLeeFrmVi)
- fix(config): sync the full SiliconFlow model list into the registry. (#4844, thanks @letanphuc)
- fix(sse): strip Composer `<|final|>` sentinel markers that leaked after Composer reasoning. (#4842, thanks @noestelar)
- fix(build): trace-include `sql.js`'s `sql-wasm.wasm` in the standalone bundle so SQLite-WASM works in the packaged build. (#4839, thanks @Delcado19)
- fix(cli): persist lazily-installed native runtime deps (`better-sqlite3`, `systray2`) to the shared runtime `package.json` with `--save-exact` instead of `--no-save`, so installing one no longer prunes the other as "extraneous". This fixes a "No SQLite driver available" failure after a `--tray` install. (#4841, thanks @omartuhintvs)
- fix(sse): resolve bare model names to a connection's `defaultModel` before upstream calls. (#4825, thanks @anuragg-saxenaa)
- fix(api): surface a Docker-localhost hint on provider-node validation connection errors. (#4822, thanks @anuragg-saxenaa)
- fix(sse): strip Gemini built-in tools when `functionDeclarations` are present in the Antigravity envelope (the two are mutually exclusive upstream). (#4821, thanks @Vanszs)
- fix(sse): strip `X-Stainless-*` headers and normalize the SDK `User-Agent` for OpenAI-compatible endpoints. (#4820, thanks @anuragg-saxenaa)
- fix(oauth): allow a per-connection refresh lead-time override via `providerSpecificData.refreshLeadMs`. (#4818, thanks @anuragg-saxenaa)
- fix(dashboard): resolve passthrough model aliases by `providerId` in `ModelSelectModal`. (#4815, thanks @anuragg-saxenaa)
- fix(sse): strip `enumDescriptions` from Antigravity tool schemas. (#4813, #4740, thanks @anuragg-saxenaa)
- fix(dashboard): keep the desktop sidebar visible via an explicit CSS class. (#4812, thanks @Delcado19)
- fix(sse): filter nameless hosted tools when converting Responses API to Chat format. (#4789, upstream, thanks Владимир Акимов)
- fix(sse): the stream-writer mock `abort()` now returns a Promise (test-stability fix). (#4788, thanks @decolua)
- fix(sse): use the WorkOS auth-token shape for Cline. (#4787, thanks @apeltekci)
- fix(api): fall back to the existing access token for any OAuth provider when a refresh fails. (#4786, thanks @decolua)
- fix(sse): read Antigravity usage from the `response.usageMetadata` envelope. (#4785, thanks @decolua)
- fix(oauth): verify Cursor installation on Linux before auto-import. (#4770, upstream, thanks Ibrahim Ryan)
- fix(cli): fall back to the default data dir when `DATA_DIR` is not writable. (#4767, upstream, thanks Thiên Toán)
- fix(sse): add a `json_schema` fallback for OpenAI-compatible providers that don't support structured outputs. (#4766, thanks @mustafabozkaya)
- fix(cli): verify launchd registration and skip self-SIGTERM in macOS autostart. (#4765, thanks @ntdung6868)
- fix(sse): finalize the `tool_calls` `finish_reason` on early stream end in the OpenAI Responses translator. (#4764, thanks @decolua)
- fix(sse): gate Kiro image attachments behind a Claude-capability check. (#4763, thanks @decolua)
- fix(sse): track Ollama streaming usage from raw NDJSON chunks. (#4754, thanks @fresent)
- fix(sse): include low-level cause details in `formatProviderError`. (#4741, thanks @decolua)
- fix(executors): `anthropic-compatible-*` gateways now get a `Bearer` token alongside `x-api-key`. (#4729, thanks @hodtien)
- fix(translator): strip the `x-anthropic-billing-header` in the claude-to-openai path. (#4728, thanks @weimaozhen)
- fix(translator): preserve `reasoning_effort` for non-Copilot Responses clients. (#4688, thanks @ryanngit / @diegosouzapw)
- fix(codex): treat an OAuth 401 as an unrecoverable refresh failure (stop retrying a dead token). (#4686, thanks @sacwooky / @diegosouzapw)
- fix(translator): coerce tool descriptions to strings in OpenAI normalization. (#4675, thanks @East-rayyy / @diegosouzapw)
- fix(dashboard): stop double-masking an already-masked API key in the list view (E2E 3/9 regression). (#4671, thanks @diegosouzapw)
- fix(combo): flatten Anthropic tool messages + tool history to prevent an upstream 503. (#4648, thanks @warelik / @diegosouzapw)
- fix(providers): require a Default Model in the compatible-provider API-key setup flow. (#4641, thanks @arden1601)
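
The `response.output` ordering fix at the top of this list (#4906) combines three ideas: coerce each `output_index` strictly, let a later item at the same index supersede an earlier one, and sort the survivors by index. The sketch below illustrates that combination under stated assumptions. It is not the project's code: the `OutputItem` shape and `buildOutput` are hypothetical, and this `normalizeOutputIndex` is only a stand-in sharing the name mentioned in the entry.

```typescript
// Hypothetical item shape: the index may arrive as a number or a string.
interface OutputItem {
  output_index: unknown;
  id: string;
}

// Stand-in for the normalizeOutputIndex named in the changelog entry.
// Returns a non-negative integer, or null when the value is unusable.
// Unlike parseInt, Number() rejects trailing junk such as "2abc".
function normalizeOutputIndex(value: unknown): number | null {
  let n = NaN;
  if (typeof value === "number") {
    n = value;
  } else if (typeof value === "string" && value.trim() !== "") {
    n = Number(value);
  }
  return Number.isInteger(n) && n >= 0 ? n : null;
}

// Build the final output array from items in the order they were emitted.
function buildOutput(emitted: OutputItem[]): OutputItem[] {
  const byIndex = new Map<number, OutputItem>();
  for (const item of emitted) {
    const idx = normalizeOutputIndex(item.output_index);
    // A later item at the same index replaces (supersedes) the earlier one.
    if (idx !== null) byIndex.set(idx, item);
  }
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}

console.log(
  buildOutput([
    { output_index: "1", id: "message" },
    { output_index: 0, id: "tool_call_old" },
    { output_index: 0, id: "tool_call_new" },
  ]).map((item) => item.id),
);
```

Because each index maps to at most one surviving item, the sort has no ties to break, so the result does not depend on the iteration order of the intermediate state.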
🔒 Security
- fix(auth): only trust forwarding headers (`X-Forwarded-For` / `X-Real-IP`) from loopback TCP peers, so a non-loopback client can't spoof its origin to bypass local-only route guards. (#4689, thanks @Jordannst / @diegosouzapw)
- fix(sse): redact the API key from the AUTH debug log in the chat handler. (#4858, thanks @sacwooky)
- fix(oauth): classify `/api/oauth/cursor/auto-import` as a local-only route in the route guard, so the loopback-enforced process-spawning endpoint can't be reached through a tunneled/leaked JWT (Hard Rule #17). (#5070, thanks @diegosouzapw)
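
The forwarding-header rule in the first Security entry (#4689) can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: `isLoopback` and `resolveClientIp` are hypothetical names, and header keys are assumed to be lowercased already, as Node.js does for incoming requests.

```typescript
// Hypothetical helper: true for IPv4 127.0.0.0/8, IPv6 ::1, and
// IPv4-mapped IPv6 loopback such as "::ffff:127.0.0.1".
function isLoopback(addr: string): boolean {
  const lower = addr.toLowerCase();
  const a = lower.indexOf("::ffff:") === 0 ? lower.slice(7) : lower;
  return a === "::1" || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(a);
}

// Hypothetical helper: decide which address to treat as the client origin.
// Forwarding headers are honored only when the TCP peer itself is loopback
// (for example, a reverse proxy on the same host).
function resolveClientIp(
  peerAddr: string,
  headers: Record<string, string | undefined>,
): string {
  if (!isLoopback(peerAddr)) {
    return peerAddr; // remote peer: its headers are spoofable, so ignore them
  }
  const forwarded = headers["x-forwarded-for"] || "";
  // X-Forwarded-For may hold a comma-separated chain; take the first hop.
  const firstHop = (forwarded.split(",")[0] || "").trim();
  return firstHop || headers["x-real-ip"] || peerAddr;
}

console.log(resolveClientIp("203.0.113.9", { "x-forwarded-for": "127.0.0.1" }));
```

With this rule, a remote client that sends `X-Forwarded-For: 127.0.0.1` is still identified by its real peer address, so a local-only route guard keyed on the resolved address would not be fooled.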
📝 Maintenance
- chore(ci): harden the release flow by decoupling the Quality Ratchet from coverage-shard flakes (`if: !cancelled()` + `--allow-missing`), adding fast-path drift gates (`check:complexity`, `check:cognitive-complexity`, `check:pack-policy`, `check:build-scope`), and raising the default build heap to 8 GB. (#5054, thanks @diegosouzapw)
- docs(routing): sync the combo strategy docs for Fusion (17 strategies). (#5067, thanks @diegosouzapw)
- test(sse): golden-lock the `provider.ts` translate-path across all providers. (#4734, thanks @diegosouzapw / @decolua)
- docs(env): document `HEADROOM_URL` in `.env.example` + `ENVIRONMENT.md`. (thanks @diegosouzapw)
- chore(quality): rebaseline the file-size ratchet across the rc17 PR batches (leva2/leva3/leva4) to absorb cycle drift. (thanks @diegosouzapw)
What's Changed
- Release v3.8.37 by @diegosouzapw in #5053
Full Changelog: v3.8.36...v3.8.37