feat(dash): slot indicator dot states + warming pulse #314
Merged
Surface slot lifecycle on the dashboard via colour-coded status dots. The user's ask was "green for recently live, yellow for idle, red for errors, grey for offline, plus a distinct warming/loading indicator".

Backend: `Slot.as_dict()` gains `last_used_at` (epoch seconds), sourced from `SlotManager._last_used` (already bumped by `serving()` enter/exit on every dispatched request). Process-local: on hal0-api restart the field reads null, which the dashboard renders as "stale" (yellow). No persistence, no new endpoint, no schema changes.

Frontend: new `slotIndicator(slot)` helper in `ui/src/dash/slots.jsx` is the single source of truth for the dot mapping; `RECENTLY_LIVE_MS = 60 * 60 * 1000` is the one constant gating "recent" vs "stale". CSS adds four new dot classes (`recent` / `stale` / `warming` / `offline`) reusing the existing palette vars (`--ok` / `--warn` / `--fg-4`) and the existing `pulse` keyframe for warming. Tooltip on each dot surfaces "Loaded, last used 12 min ago" / "Warming up Qwen3.5-0.8B…" / "Error: <message>" via the `title` attribute.

State → dot.cls mapping:

- ready + last_used_at within 1h → recent (green)
- ready + >1h ago / null → stale (yellow)
- idle → stale (yellow)
- warming / starting / pulling / unloading → warming (amber, pulses)
- serving → serving (cyan, pulses; pre-existing)
- error → error (red)
- offline / unknown → offline (grey)

Tests: extends `tests/slots/test_manager.py` to assert `status().last_used_at` round-trips through `as_dict()`. Adds 10 Playwright cases in `slot-indicator.spec.ts` pinning the mapping table (state, label, tooltip, 1h boundary). Adds an opt-in `slot-indicator-live-screenshot.spec.ts` (gated by `HAL0_LIVE_LXC=1`) for visual capture against a live hal0 LXC.

Live-verified on hal0 LXC (10.0.1.142): all five ready slots with null `last_used_at` rendered yellow stale dots; `POST /api/slots/primary/restart` cascaded primary through `starting` (amber warming pulse) while embed/rerank/stt/tts simultaneously turned grey (`.dot.offline`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request on May 28, 2026
…rough + gut installer auth section (#390)

- docs/operate/lemonade.md (new, .md canonical): operator reference for the v0.2 Lemonade runtime, covering what it is, where state lives, the /v1/* proxy + dispatcher fallthrough (PRs #248/#277), slot ↔ Lemonade model mapping (PRs #281/#282), max_loaded_models = 8 LRU cap (PR #283), per-type LRU eviction per ADR-0008 (supersedes nuclear-evict ADR-0007), OFFLINE-on-eviction (PR #276), and the three known v0.3 caveats (Vulkan KV gauge missing, whisper RUNPATH workaround, GPU cleanup unload hang).
- docs/dashboard/v3.md (new, .md canonical, new docs/dashboard/ dir): page-by-page tour of the v3 React dashboard shipped in v0.3.0-alpha.1 (PR #235). Covers the shell + Mock-badge convention, /dashboard (system overview after #356), /chat (real surface per #309/#314/#315/#351), /slots (sidebar mirror per #357 + #344 UX sweep), /models (#313/#319/#353), /mcp (#304/#300), /agents (Peers per #299), /memory (graph #297, throughput #308), Settings (no Auth tab post-ADR-0012), and the footer journal (Epic #322: PRs #321/#328/#329/#330/#332). Mock-fallback issues linked via the dashboard-v3 label, not enumerated.
- installer/README.md: gut ~95 lines of stale auth prose (Caddy, Bearer-token mint/use/revoke, first-run OTP claim wizard, HAL0_AUTH_ENABLED/HAL0_AUTH_DISABLED, password recovery, basic_auth upgrade path, the TLS recipe). Replace with one paragraph pointing at docs/operate/auth.mdx for the reverse-proxy recipe and docs/agents/identity.md for the X-hal0-Agent identity model. Auth was removed in v0.3.0-alpha.1 per ADR-0012; the README hadn't caught up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Investigation summary
`/api/slots` on hal0 LXC exposes `state` + `metadata.updated_at` (last state-change wall clock) but no `last_used_at` (last-serve wall clock).

`SlotManager` already maintains `_last_used: dict[str, float]` in memory, bumped on every dispatched request via the `serving()` async context manager (`manager.py:1421/1440/1443`), which is wired through `Dispatcher.forward → _forward_with_serving` for every slot-routed call. So the "recently live within 1h" signal already exists server-side; it just isn't surfaced.

Minimum diff: thread `_last_used.get(slot_name)` onto the `Slot` snapshot returned from `SlotManager.status()`, surface it on `as_dict()`, and consume it in the frontend. No persistence: on `hal0-api` restart the field resets to null, which the dashboard treats as "stale" (yellow). That matches operator intuition, since we don't actually know if the slot was hit during downtime.

State → dot.cls mapping
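| State | `dot.cls` | Colour |
|---|---|---|
| `ready`, `last_used_at` within 1h | `recent` | green (`--ok`) |
| `ready`, `last_used_at` more than 1h ago or null | `stale` | yellow (`--warn`) |
| `idle` | `stale` | yellow (`--warn`) |
| `warming` / `starting` / `pulling` / `unloading` | `warming` | amber, pulses (`--warn`) |
| `serving` | `serving` | cyan, pulses; pre-existing (`--accent`) |
| `error` | `error` | red (`--err`) |
| `offline` / unknown | `offline` | grey (`--fg-4`) |

The 1h threshold lives as a single named const (`RECENTLY_LIVE_MS = 60 * 60 * 1000`) in `slots.jsx`. The mapping itself is the single function `slotIndicator(slot)`, exported on `window` for unit testing.

Backend changes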
- `src/hal0/slots/manager.py`: `Slot.__init__` + `Slot.as_dict()` gain `last_used_at: float | None`. `SlotManager.status()` populates it from `self._last_used.get(slot_name)`.

That's it. No new endpoint, no schema migration, no new bump-site (the existing `serving()` context already covers every dispatched request). `_last_used` is process-local; a `hal0-api` restart resets it to null and the UI degrades to yellow, which is honest.

Known limitation observed live (NOT a regression; this PR's scope is the indicator, not the bump path): on a hal0 install with no `upstreams.toml` configured, chat completions fall through to `lemonade_proxy._proxy` (`api/routes/v1.py:241`), which bypasses `SlotManager.serving()` and therefore doesn't bump `last_used_at`. On a normal install with `primary` registered as a slot upstream, the dispatcher's `_forward_with_serving` path bumps as expected. A follow-up could add the bump to the proxy fallback; flagged in the PR conversation for triage rather than expanded here.

Frontend changes
- `ui/src/dash/slots.jsx`: new `slotIndicator(slot, now?)` helper returning `{cls, label, tooltip}`. New `IndicatorDot` component. `RECENTLY_LIVE_MS` const. Replaces the two `<span className={"dot " + state}/>` sites in `SlotCard` + `SlotListRow`. All three exposed on `window` for tests.
- `ui/src/dashboard.css`: adds `.dot.recent`, `.dot.stale`, `.dot.warming`, `.dot.offline`. Reuses existing palette + `pulse` keyframe (now `0.4 ↔ 1.0` opacity as the brief suggested, was `0.35`).
- `ui/src/api/hooks/useSlots.ts`: `Slot` interface gains `last_used_at?: number | null` so TS keeps its grip.
- `chrome.jsx` (PR #306's territory) is not touched; its dots are lemond-status / coresident / follow-tail indicators, not per-slot state dots.

Tests
- `tests/slots/test_manager.py`: new `test_status_surfaces_last_used_at` asserts `Slot.last_used_at` round-trips through `status()` + `as_dict()`, both before and after a `bump_last_used()` call. All 117 existing slot tests still pass.
- `ui/tests/e2e/specs/slot-indicator.spec.ts`: 10 new Playwright cases pinning every cell of the mapping table, including the `≤1h is still recent` / `1h + 1s is stale` boundary cases and the `null last_used_at` "no requests since hal0-api started" tooltip.
- `ui/tests/e2e/specs/slot-indicator-live-screenshot.spec.ts`: opt-in capture against the LXC, gated by `HAL0_LIVE_LXC=1` (skipped in CI, useful for follow-up visual reviews).

Full local suite: 444 pytest tests pass, 44 Playwright tests pass (9 environmentally skipped).
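For readers who want the mapping table and the 1h boundary in executable form, here is a minimal TypeScript sketch of what a helper like `slotIndicator(slot, now?)` might look like. It is not the code from `slots.jsx`. The `name` and `metadata.message` fields, the `label` values, and the tooltip wording for the `serving`, `idle`, and `offline` cases are assumptions made for illustration; only `state`, `last_used_at` (epoch seconds), `RECENTLY_LIVE_MS`, and the state → class mapping come from the description above.

```typescript
// Sketch only. `name`, `metadata.message`, the label text, and most tooltip
// wording are assumptions; `state` and `last_used_at` come from the PR text.
const RECENTLY_LIVE_MS = 60 * 60 * 1000; // single "recent" vs "stale" threshold (1h)

interface Slot {
  name?: string; // assumed: name shown in the warming tooltip
  state: string;
  last_used_at?: number | null; // epoch seconds; null after a hal0-api restart
  metadata?: { message?: string }; // assumed carrier for the error message
}

type DotCls = "recent" | "stale" | "warming" | "serving" | "error" | "offline";

interface Indicator {
  cls: DotCls;
  label: string;
  tooltip: string;
}

function slotIndicator(slot: Slot, now: number = Date.now()): Indicator {
  switch (slot.state) {
    case "error":
      return { cls: "error", label: "error", tooltip: `Error: ${slot.metadata?.message ?? "unknown"}` };
    case "serving":
      return { cls: "serving", label: "serving", tooltip: "Serving a request" };
    case "warming":
    case "starting":
    case "pulling":
    case "unloading":
      // All transitional states share the pulsing "warming" dot.
      return { cls: "warming", label: slot.state, tooltip: `Warming up ${slot.name ?? "slot"}…` };
    case "idle":
      return { cls: "stale", label: "idle", tooltip: "Idle" };
    case "ready": {
      const lastUsed = slot.last_used_at;
      if (lastUsed == null) {
        // Covers both null and undefined: nothing recorded since the API started.
        return { cls: "stale", label: "ready", tooltip: "Loaded, no requests since hal0-api started" };
      }
      const ageMs = now - lastUsed * 1000; // backend reports seconds, `now` is in ms
      const minutes = Math.floor(ageMs / 60000);
      return {
        cls: ageMs <= RECENTLY_LIVE_MS ? "recent" : "stale", // exactly 1h still counts as recent
        label: "ready",
        tooltip: `Loaded, last used ${minutes} min ago`,
      };
    }
    default:
      // "offline" and any unknown state fall through to the grey dot.
      return { cls: "offline", label: "offline", tooltip: "Offline" };
  }
}
```

Passing `now` explicitly keeps the boundary cases deterministic: with a fixed clock, a `ready` slot last used exactly 3600 seconds ago maps to `recent`, and one last used 3601 seconds ago maps to `stale`.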
Live verification on hal0 LXC (10.0.1.142)
After clean rebuild (`cd /opt/hal0/ui && rm -rf dist node_modules/.vite && npm run build && systemctl restart hal0-api`):

1. `/api/slots` exposes `last_used_at`: `curl /api/slots` returns `"last_used_at": null` for all 5 slots after restart.
2. […] `last_used_at` render yellow (stale): […] `--warn` dot.
3. […] `last_used_at` → primary turns green (recent): […] `lemonade_proxy` (no upstreams registered) and bypasses `serving()`. Documented as known limitation; not introduced by this PR.
4. […] `last_used_at > 1h` ago) → yellow: […]
5. `POST /api/slots/primary/restart` → […]: screenshot captures primary's `starting` chip with amber dot, plus 4 other slots simultaneously in `offline` grey during the lemond cascade.
6. […]: […] `error → error (red), tooltip surfaces metadata.message` covers it.
7. […]: `.dot.offline` grey during scenario 5's restart cascade.

Commands used:
Files touched
Owned-by-sibling-PR files NOT touched: `chrome.jsx` (#306), `dashboard.jsx` (#308/#309), `chat.jsx` (#309), `main.tsx` (#309), `idle.py` (#307), `models.jsx` / `settings.jsx`, `dispatcher/router.py`.

Out of scope

- Bumping `last_used_at` from the lemonade proxy fallback path (separate PR; see the limitation note above)

🤖 Generated with Claude Code