feat(dash): real chat surface against primary slot #309
Merged
This was referenced May 25, 2026
9cfe618 to
2cab099
Compare
Replace the prototype's scripted demo bubbles + no-op send button with a live round-trip against /v1/chat/completions. The chat surface now talks to the persona slot's actual model via the hal0-api Lemonade proxy and streams the response token-by-token.

Why: the prior `ChatActive` rendered ~250 lines of hardcoded JSX (fake "refactor my code" exchange, fake tool blocks, fake image attachment) and `onSend` only cleared the draft. Closes the chat half of hal0_dashboard_v2_rework_in_flight (#200).

What:
* New `ui/src/api/hooks/useChatCompletions.ts`: `streamChatCompletion` parses SSE frames (`data: {...}\n\n` + `[DONE]`) into running `content` / `reasoning_content` buffers. Non-streaming `chatCompletion` falls back to the same envelope.
* New `ui/src/dash/chat.jsx`: extracted Composer, ChatActive, ChatEmpty, PersonaPicker from dashboard.jsx and rewrote ChatActive around a real `useChat` hook. The hook owns message state, single-flight send, and translates SSE deltas into incremental updates on the in-flight assistant bubble. Errors render as a red row instead of being swallowed. Empty drafts no-op (button disabled + Enter guarded). Enter sends; Shift+Enter inserts newline.
* `dashboard.jsx` shrinks by ~300 lines: Composer/ChatActive/ChatEmpty/PersonaPicker moved out; the file now owns only the snapshot strip, memory map, throughput / health cards, and the DashboardView shell.
* `main.tsx` adds the chat.jsx side-effect import so its `Object.assign(window, …)` runs before `<App />` mounts.

Extracting to a separate file isolates this surface from the parallel `fix/dashboard-memmap-throughput-live` agent's edits, which touch MemoryMap/ThroughputCard in the same file (feedback_multi_agent_one_file).

Deferred (locked to brief):
* Streaming reasoning split (think/answer surface, separate scaffold).
* Tool calls / function-calling visualisation.
* Multimodal attachments.
* Persistence across page reload.
* Model picker in the composer.

Verification:
* `npm run typecheck` clean.
* `npm run build` clean (~803ms, 114 modules).
* Existing dashboard-v3 Playwright spec passes: composer still mounts inside dash-main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
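The send guards described in the commit message (empty drafts no-op, Enter sends, Shift+Enter inserts a newline, single-flight send) could be sketched roughly as below. This is an illustration only, not the code in `chat.jsx`: `shouldSendOnKey` and `createSingleFlight` are hypothetical names, and treating a whitespace-only draft as empty is an assumption.

```typescript
// Hypothetical helpers; a sketch of the guards, not the actual useChat hook.
interface KeyLike {
  key: string;
  shiftKey: boolean;
}

// Enter sends, Shift+Enter falls through (newline), empty drafts never send.
// Assumption: whitespace-only drafts count as empty.
function shouldSendOnKey(e: KeyLike, draft: string): boolean {
  return e.key === "Enter" && !e.shiftKey && draft.trim() !== "";
}

// Single-flight wrapper: while one request is running, further sends are ignored.
// The AbortController lets a Stop button cancel the in-flight request.
function createSingleFlight<T>(run: (signal: AbortSignal) => Promise<T>) {
  let controller: AbortController | null = null;
  return {
    get busy(): boolean {
      return controller !== null;
    },
    async send(): Promise<T | undefined> {
      if (controller) return undefined; // a request is already in flight
      controller = new AbortController();
      try {
        return await run(controller.signal);
      } finally {
        controller = null; // release the slot on success, error, or abort
      }
    },
    stop(): void {
      controller?.abort();
    },
  };
}
```

In a component, the `run` callback would pass `signal` to `fetch`, so that `stop()` cancels the stream rather than only hiding it.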
2cab099 to
740f2aa
Compare
thinmintdev added a commit that referenced this pull request May 28, 2026
…rough + gut installer auth section (#390)

- docs/operate/lemonade.md (new, .md canonical): operator reference for the v0.2 Lemonade runtime: what it is, where state lives, the /v1/* proxy + dispatcher fallthrough (PRs #248/#277), slot ↔ Lemonade model mapping (PRs #281/#282), max_loaded_models = 8 LRU cap (PR #283), per-type LRU eviction per ADR-0008 (supersedes nuclear-evict ADR-0007), OFFLINE-on-eviction (PR #276), and the three known v0.3 caveats (Vulkan KV gauge missing, whisper RUNPATH workaround, GPU cleanup unload hang).
- docs/dashboard/v3.md (new, .md canonical, new docs/dashboard/ dir): page-by-page tour of the v3 React dashboard shipped in v0.3.0-alpha.1 (PR #235). Covers the shell + Mock-badge convention, /dashboard (system overview after #356), /chat (real surface per #309/#314/#315/#351), /slots (sidebar mirror per #357 + #344 UX sweep), /models (#313/#319/#353), /mcp (#304/#300), /agents (Peers per #299), /memory (graph #297, throughput #308), Settings (no Auth tab post-ADR-0012), and the footer journal (Epic #322: PRs #321/#328/#329/#330/#332). Mock-fallback issues linked via the dashboard-v3 label, not enumerated.
- installer/README.md: gut ~95 lines of stale auth prose (Caddy, Bearer-token mint/use/revoke, first-run OTP claim wizard, HAL0_AUTH_ENABLED/HAL0_AUTH_DISABLED, password recovery, basic_auth upgrade path, the TLS recipe). Replace with one paragraph pointing at docs/operate/auth.mdx for the reverse-proxy recipe and docs/agents/identity.md for the X-hal0-Agent identity model. Auth was removed in v0.3.0-alpha.1 per ADR-0012; the README hadn't caught up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What was dummy / what was wired (`/diagnose` summary)

The v3 dashboard's main chat surface (`/`, root route) showed a fully scripted demo conversation:

* `ChatActive` in `ui/src/dash/dashboard.jsx` hardcoded ~250 lines of JSX: a fake "refactor my slot manager" exchange, a fake `read_file` tool block, a fake `generate_image` tool block, and a fake image attachment row.
* `onSend` was `() => setDraft("")`: typing and clicking send just cleared the input. Nothing was POSTed.
* `composerState` (idle / sending / streaming / swap / offline / no-tools) was real and useful, but only ever driven by the Tweaks selector, never by an actual request.

What was already wired and reused as-is:

* `useSlots()` (5s poll) gives the persona's live `model_id`.
* `POST /v1/chat/completions` is mounted in `src/hal0/api/routes/v1.py` (curated surface, not the proxy catch-all) and accepts standard OpenAI `{model, messages, stream}`.
* `.composer-banner` state machine + `.msg` / `.bubble` CSS classes in `dashboard.css`.

The fix

* `ui/src/api/hooks/useChatCompletions.ts`: `streamChatCompletion` and `chatCompletion`. Streaming uses `fetch` + a manual `ReadableStream` reader (not `EventSource`: POST not supported) to parse `data: {…}\n\n` frames and the terminal `data: [DONE]`. Handles Qwen3.5's `reasoning_content` split: surfaces both `content` and `reasoning_content` buffers via `onDelta`, with `content` preferred for the final answer (falls back to `reasoning_content` for very short prompts where the model never emits real content).
* `ui/src/dash/chat.jsx`: extracted `Composer`, `ChatActive`, `ChatEmpty`, `PersonaPicker` from `dashboard.jsx`. Rewrote `ChatActive` around a real `useChat` hook that owns message state, single-flight send, and an `AbortController` so the streaming Stop button actually works. Errors render as a red `.msg` row instead of being swallowed. Enter sends; Shift+Enter inserts newline; empty drafts disable the send button. The persona slot's `model_id || model` is what we send as the OpenAI `model` field.
* `dashboard.jsx` shrinks by ~300 lines and now owns only `SnapshotStrip` / `MemoryMap` / `HealthCard` / `ThroughputCard` / `DashboardView`.
* `main.tsx` adds the `./dash/chat.jsx` side-effect import after `dashboard.jsx` so the window-globals are installed before `<App />` mounts.

Why a new file rather than editing in place

Sibling PRs `fix/sidebar-live-counts` and `fix/dashboard-memmap-throughput-live` are also touching `dashboard.jsx`. The throughput agent's work-in-progress on the LXC was already adding `useHardware()` to `MemoryMap` (verified by stashing + diffing). Extracting the chat surface to its own file eliminates the per-hunk-git-add dance that the memory note `feedback_multi_agent_one_file` warns about. `dashboard.jsx` here only loses the chat-surface code; the sibling's MemoryMap / ThroughputCard hunks land cleanly on top with no overlap.

Endpoints called

* `POST /v1/chat/completions`: `Accept: text/event-stream`, `stream: true`. Body: `{model: <persona.model_id>, messages: [...], stream: true}`

No backend changes.
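For readers unfamiliar with parsing SSE by hand, the streaming path described under "The fix" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `useChatCompletions.ts`: `createSseParser`, `finalAnswer`, and the `StreamBuffers` shape are hypothetical names, frames are assumed to be separated by `\n\n`, and each chunk is assumed to follow the OpenAI streaming shape (`choices[0].delta`).

```typescript
// Hypothetical names throughout; a sketch, not the actual hook.
interface StreamBuffers {
  content: string;   // running answer text
  reasoning: string; // running reasoning_content text
  done: boolean;     // true once `data: [DONE]` has been seen
}

function createSseParser() {
  let pending = ""; // text received but not yet terminated by a blank line
  const buffers: StreamBuffers = { content: "", reasoning: "", done: false };

  return {
    // Feed one decoded chunk; returns a snapshot of the running buffers.
    push(chunk: string): StreamBuffers {
      pending += chunk;
      const frames = pending.split("\n\n");
      pending = frames.pop() ?? ""; // the last piece may be an incomplete frame
      for (const frame of frames) {
        for (const line of frame.split("\n")) {
          if (!line.startsWith("data:")) continue;
          const payload = line.slice(5).trim();
          if (payload === "") continue;
          if (payload === "[DONE]") {
            buffers.done = true;
            continue;
          }
          // Assumption: OpenAI-style chunk with choices[0].delta.
          const delta = JSON.parse(payload)?.choices?.[0]?.delta ?? {};
          if (typeof delta.content === "string") buffers.content += delta.content;
          if (typeof delta.reasoning_content === "string") {
            buffers.reasoning += delta.reasoning_content;
          }
        }
      }
      return { ...buffers };
    },
  };
}

// Prefer `content`; fall back to reasoning text when no content was emitted.
function finalAnswer(b: StreamBuffers): string {
  return b.content !== "" ? b.content : b.reasoning;
}
```

In a browser, the chunks would typically come from `response.body.getReader()` decoded with a `TextDecoder` in streaming mode. Keeping the parser as a pure function of the chunks makes the frame-splitting logic testable without a live slot.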
Verification commands actually run

On hal0-dev (this worktree):

On hal0 LXC (`/opt/hal0`), checked out this branch, `rm -rf node_modules/.vite dist && npm run build`, `systemctl restart hal0-api`. Then:

The real assistant bubble captured by the live spec:

i.e. the prompt "Reply with exactly: hello world" round-tripped through the dashboard → `/v1/chat/completions` → Lemonade → Qwen3.5-0.8B-GGUF and the model replied with exactly `hello world` as instructed. The throwaway `_chat-live-verify.spec.ts` was removed before commit: it depends on the LXC being reachable with a live primary slot and isn't gated for CI.

After the LXC verification, the LXC working tree was restored to its pre-test state (sibling-throughput WIP on `dashboard.jsx`, sibling-lemonade WIP on `idle.py`) and rebuilt.

Deferred (out of scope per brief)

PersonaPicker).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>