v0.51.539 — Release SX (gateway session continuity + local slash-model routing)
Batch of two independent, gated breaking-flow fixes. Cherry-picked onto current master with original author attribution preserved.
Fixed
- The gateway Runs API no longer spawns a fresh session for every message (#4535). The `/v1/runs` request body omitted `session_id`, so each turn created a new `run_<uuid>` session instead of reusing the stable WebUI session, flooding the sidebar with duplicate `Api_Server` sessions and breaking conversation continuity. The body now carries `session_id`. Thanks @rodboev (#4548).
- HuggingFace-style slash model IDs now route to the configured local provider instead of the default backend. A selected local model such as `unsloth/gemma-4-12b-it-GGUF:UD-Q4_K_XL` was being sent to the default provider (e.g. `openai-codex`) because the slash in the ID suppressed provider qualification. When the selected provider is explicitly configured in `config.yaml` (llama.cpp, Ollama, LM Studio, vLLM, or any OpenAI-compatible endpoint), the model ID is now qualified as `@provider:model`; OpenRouter and removed-provider slow paths are preserved. Thanks @MinhoJJang (#4547).
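
For illustration only, the two fixes can be sketched roughly as below. This is not the project's code: the helper names `build_runs_body` and `qualify_model_id`, the `input` field, the `"openrouter"` and `"ollama"` provider keys, and the set of configured providers are all assumptions made for the example. Only `session_id` in the `/v1/runs` body and the `@provider:model` form come from the notes above.

```python
from typing import Dict, Optional, Set


def build_runs_body(session_id: str, message: str) -> Dict[str, str]:
    """Build a /v1/runs request body (hypothetical helper).

    The "input" field name is an assumption; the point is only that the
    stable session ID travels with every turn.
    """
    return {"session_id": session_id, "input": message}


def qualify_model_id(
    model_id: str,
    selected_provider: Optional[str],
    configured_providers: Set[str],
) -> str:
    """Return the model ID to send downstream (hypothetical helper)."""
    # Already-qualified IDs are passed through unchanged.
    if model_id.startswith("@"):
        return model_id
    # None-safe: with no selected provider there is nothing to qualify with.
    if not selected_provider:
        return model_id
    # Assumption: OpenRouter keeps its native "vendor/model" slash IDs.
    if selected_provider == "openrouter":
        return model_id
    # Only providers explicitly listed in config.yaml get the prefix,
    # whether or not the model ID contains a slash.
    if selected_provider in configured_providers:
        return f"@{selected_provider}:{model_id}"
    return model_id


if __name__ == "__main__":
    configured = {"ollama"}  # stand-in for providers found in config.yaml
    print(build_runs_body("webui-session-1", "hello"))
    print(qualify_model_id("unsloth/gemma-4-12b-it-GGUF:UD-Q4_K_XL", "ollama", configured))
```

In this sketch a slash in the model ID no longer affects the decision; qualification depends only on whether the selected provider is explicitly configured, which is the behaviour the second fix describes.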
Gate
- Full pytest suite: 9736 passed, 0 failed
- Codex regression gate: SAFE TO SHIP
- Opus advisor: SAFE (verified None-safety, ordering, mutual independence, no OpenRouter/custom-path/gateway-consumer regression)
Closes #4535.