v0.15.3 — failover trusts live quota; /model dashboard; root agent docker CLI (#2265, #2263, #2262, #2264)
Failover: live quota is authoritative over stale exhaustion marks (#2265)
The 2026-06-10 fleet failover outage was a stale-data-over-live-truth
inversion. The auth-broker judged an account eligible-or-not purely by
the persisted exhausted_until mark, never consulting the live quota
probe already cached beside it. One misfired +7d weekly mark on the
healthy primary account stranded the whole fleet — every failover
returned no-eligible-target — and it survived auth use/refresh and
a broker recreate because nothing cleared a mark on a healthy probe. A
separate path routed a consumer onto an account whose mark had expired
(so it looked eligible) while its live 5h quota was 100% walled.
Fix (most-recent-signal-wins): a new pure account-eligibility module
makes a fresh (≤24h) quota snapshot that is newer than the mark
authoritative — walled→blocked, healthy→not — and only falls back to the
mark when there is no usable live data. A clearly-healthy probe now
self-heals (clears) a stale mark off disk, so a misfire can't outlast one
probe cycle. Marks carry marked_at for recency comparison. Wired into
the broker's three decision points (serving, failover, and the
list-state exhausted field, which also stops the all-exhausted alert
and cron preflight from false-alarming on a stale-mark-but-healthy
account). 289 broker + 67 server tests, both incident shapes reproduced
as integration tests.
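
The most-recent-signal-wins rule can be sketched roughly as below. This is a minimal illustration, not the broker's actual module: the names `Mark`, `Snapshot`, `Decision`, and `decide` are hypothetical, as is the `walled` flag, and the sketch assumes that any fresh, non-walled snapshot counts as "clearly healthy" (the real threshold may be stricter).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

FRESHNESS = timedelta(hours=24)  # "fresh (<=24h)" per the notes above


@dataclass
class Mark:  # hypothetical shape of a persisted exhaustion mark
    exhausted_until: datetime
    marked_at: datetime


@dataclass
class Snapshot:  # hypothetical shape of a cached live quota probe
    taken_at: datetime
    walled: bool  # assumption: True when a live quota window is fully used


@dataclass
class Decision:
    eligible: bool
    clear_mark: bool  # True when a healthy probe should clear a stale mark
    source: str       # "live", "mark", or "none"


def decide(mark: Optional[Mark], snap: Optional[Snapshot], now: datetime) -> Decision:
    # Live data is authoritative only when fresh and newer than the mark.
    live_usable = (
        snap is not None
        and now - snap.taken_at <= FRESHNESS
        and (mark is None or snap.taken_at > mark.marked_at)
    )
    if live_usable:
        if snap.walled:
            return Decision(False, False, "live")  # walled -> blocked
        # Healthy -> eligible, and any existing mark is stale: clear it.
        return Decision(True, mark is not None, "live")
    # No usable live data: fall back to the persisted mark.
    if mark is not None and mark.exhausted_until > now:
        return Decision(False, False, "mark")
    return Decision(True, False, "mark" if mark is not None else "none")


now = datetime(2026, 6, 10, 12, 0, tzinfo=timezone.utc)

# Incident shape 1: misfired +7d mark, but a newer healthy probe exists.
misfire = decide(
    Mark(now + timedelta(days=7), now - timedelta(hours=2)),
    Snapshot(now - timedelta(hours=1), walled=False),
    now,
)

# Incident shape 2: mark already expired, but the live quota is walled.
expired = decide(
    Mark(now - timedelta(hours=1), now - timedelta(hours=6)),
    Snapshot(now - timedelta(minutes=10), walled=True),
    now,
)
```

In this sketch the first case comes out eligible with the mark flagged for clearing, and the second comes out blocked even though the mark alone would have looked eligible.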
/model dashboard + root agent docker CLI (#2263, #2262, #2264)
- /model is now a live picker-driven menu with a quota brief (#2263),
  plus a UAT DM scenario covering show / switch / bad-name (#2262).
- A root debugging agent (SWITCHROOM_AGENT_ROOT) auto-provisions a
  version-pinned static docker client into $HOME/.local/bin on first
  boot. Provisioning is idempotent and non-fatal, so
  docker ps/logs/exec/inspect work out of the box (#2264).
  See docs/root-agent.md.
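
As a rough illustration of what "idempotent and non-fatal" provisioning means here, the sketch below installs a pinned binary only when it is missing or at a different version, and swallows any failure instead of aborting boot. Everything in it is an assumption for illustration: the `ensure_cli` helper, the `.docker-version` stamp file, the injected `fetch` callable, and the version string are hypothetical and not taken from the actual implementation.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cli(bin_dir: Path, version: str, fetch: Callable[[str], bytes]) -> bool:
    """Install the client into bin_dir if needed; never raise."""
    target = bin_dir / "docker"
    stamp = bin_dir / ".docker-version"  # hypothetical version marker
    try:
        # Idempotent: skip all work when the pinned version is already present.
        if target.exists() and stamp.exists() and stamp.read_text().strip() == version:
            return True
        bin_dir.mkdir(parents=True, exist_ok=True)
        data = fetch(version)  # hypothetical downloader, injected by the caller
        # Write to a temp file first so a partial download never replaces
        # a working binary, then move it into place.
        fd, tmp = tempfile.mkstemp(dir=bin_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.chmod(tmp, 0o755)
        os.replace(tmp, target)
        stamp.write_text(version)
        return True
    except Exception as exc:
        # Non-fatal: report and carry on; boot must not depend on this step.
        print(f"docker CLI provisioning skipped: {exc}")
        return False
```

A caller would run this once during agent start-up and ignore a `False` result; a second call with the same version does no download.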