
feat(dash): real chat surface against primary slot #309

Merged
thinmintdev merged 1 commit into main from fix/chat-surface-functional
May 25, 2026

Conversation

@thinmintdev
Contributor

What was dummy / what was wired (/diagnose summary)

The v3 dashboard's main chat surface (/, root route) showed a fully scripted demo conversation:

  • ChatActive in ui/src/dash/dashboard.jsx hardcoded ~250 lines of JSX — a fake "refactor my slot manager" exchange, a fake read_file tool block, a fake generate_image tool block, and a fake image attachment row.
  • The composer's onSend was () => setDraft(""): typing and clicking send just cleared the input. Nothing was POSTed.
  • The Tweaks-panel composerState (idle / sending / streaming / swap / offline / no-tools) was real and useful, but only ever driven by the Tweaks selector — never by an actual request.

What was already wired and reused as-is:

  • useSlots() (5s poll) gives the persona's live model_id.
  • POST /v1/chat/completions is mounted in src/hal0/api/routes/v1.py (curated surface, not the proxy catch-all) and accepts standard OpenAI {model, messages, stream}.
  • The .composer-banner state machine + .msg / .bubble CSS classes in dashboard.css.

The fix

  • New ui/src/api/hooks/useChatCompletions.ts: exports streamChatCompletion and chatCompletion. Streaming uses fetch plus a manual ReadableStream reader (not EventSource, which cannot send a POST) to parse data: {…}\n\n frames and the terminal data: [DONE]. It handles Qwen3.5's reasoning_content split by surfacing both the content and reasoning_content buffers via onDelta, with content preferred for the final answer (falling back to reasoning_content for very short prompts where the model never emits real content).
  • New ui/src/dash/chat.jsx — extracted Composer, ChatActive, ChatEmpty, PersonaPicker from dashboard.jsx. Rewrote ChatActive around a real useChat hook that owns message state, single-flight send, and an AbortController so the streaming Stop button actually works. Errors render as a red .msg row instead of being swallowed. Enter sends; Shift+Enter inserts newline; empty drafts disable the send button. The persona slot's model_id || model is what we send as the OpenAI model field.
  • dashboard.jsx shrinks by ~300 lines — now owns only SnapshotStrip / MemoryMap / HealthCard / ThroughputCard / DashboardView.
  • main.tsx adds the ./dash/chat.jsx side-effect import after dashboard.jsx so the window-globals are installed before <App /> mounts.
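
The frame parsing described in the first bullet can be sketched roughly as follows. This is not the code in useChatCompletions.ts: the names `StreamBuffers`, `consumeFrames`, `finalAnswer`, and `streamChat` are hypothetical, and the sketch assumes the server separates frames with `\n\n` and sends OpenAI-style `choices[0].delta` chunks, as the description above implies.

```typescript
// Hypothetical names throughout; only the wire format comes from the PR text.
interface StreamBuffers {
  content: string;   // accumulated delta.content
  reasoning: string; // accumulated delta.reasoning_content
  done: boolean;     // set once the terminal "data: [DONE]" frame is seen
}

// Consume every complete SSE frame in `pending`, fold its delta into `buf`,
// and return the trailing partial frame to be prepended to the next chunk.
function consumeFrames(pending: string, buf: StreamBuffers): string {
  let end = pending.indexOf("\n\n");
  while (end !== -1) {
    const frame = pending.slice(0, end);
    pending = pending.slice(end + 2);
    for (const line of frame.split("\n")) {
      if (!line.startsWith("data:")) continue; // ignore comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "") continue;
      if (payload === "[DONE]") {
        buf.done = true;
        continue;
      }
      const delta = JSON.parse(payload).choices?.[0]?.delta ?? {};
      if (typeof delta.content === "string") buf.content += delta.content;
      if (typeof delta.reasoning_content === "string") {
        buf.reasoning += delta.reasoning_content;
      }
    }
    end = pending.indexOf("\n\n");
  }
  return pending;
}

// Prefer real content; fall back to reasoning when the model emitted none.
function finalAnswer(buf: StreamBuffers): string {
  return buf.content !== "" ? buf.content : buf.reasoning;
}

// POST the body and read the response stream chunk by chunk.
async function streamChat(
  url: string,
  body: unknown,
  onDelta: (buf: StreamBuffers) => void,
  signal?: AbortSignal,
): Promise<string> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", Accept: "text/event-stream" },
    body: JSON.stringify(body),
    signal,
  });
  if (!res.ok || !res.body) throw new Error(`chat request failed: HTTP ${res.status}`);
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  const buf: StreamBuffers = { content: "", reasoning: "", done: false };
  let pending = "";
  while (!buf.done) {
    const { value, done } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters split across chunks intact.
    pending = consumeFrames(pending + decoder.decode(value, { stream: true }), buf);
    onDelta(buf);
  }
  return finalAnswer(buf);
}
```

Keeping `consumeFrames` free of I/O means the buffering logic can be exercised with plain strings, including a frame that arrives split across two network chunks.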

Why a new file rather than editing in place

Sibling PRs fix/sidebar-live-counts and fix/dashboard-memmap-throughput-live are also touching dashboard.jsx. The throughput agent's work-in-progress on the LXC was already adding useHardware() to MemoryMap (verified by stashing + diffing). Extracting the chat surface to its own file eliminates the per-hunk-git-add dance that the memory note feedback_multi_agent_one_file warns about. dashboard.jsx here only loses the chat-surface code; the sibling's MemoryMap / ThroughputCard hunks land cleanly on top with no overlap.

Endpoints called

| When | Method | Path | Notes |
| --- | --- | --- | --- |
| Stream send | POST | /v1/chat/completions | Accept: text/event-stream, stream: true. Body: {model: <persona.model_id>, messages: [...], stream: true} |

No backend changes.
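
For illustration, here is a minimal sketch of how the request body described above might be assembled, and how a single-flight guard with an AbortController could back the Stop button. The `Slot` shape, `buildRequest`, and `SingleFlight` are assumptions made for this example, not the actual `useChat` implementation; only the `model_id || model` rule and the `{model, messages, stream}` body come from this PR.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Assumed slot shape; the PR only states that `model_id || model` is sent.
interface Slot {
  model_id?: string;
  model?: string;
}

// Build the OpenAI-style streaming request body for the persona slot.
function buildRequest(slot: Slot, messages: ChatMessage[]) {
  const model = slot.model_id || slot.model;
  if (!model) throw new Error("persona slot has no model");
  return { model, messages, stream: true as const };
}

// Single-flight guard: at most one task runs at a time; stop() aborts it.
class SingleFlight {
  private controller: AbortController | null = null;

  get busy(): boolean {
    return this.controller !== null;
  }

  // Runs `task` unless another one is in flight, in which case the call is
  // ignored and resolves to undefined.
  async run<T>(task: (signal: AbortSignal) => Promise<T>): Promise<T | undefined> {
    if (this.controller) return undefined;
    const controller = new AbortController();
    this.controller = controller;
    try {
      return await task(controller.signal);
    } finally {
      this.controller = null; // release the slot on success, error, or abort
    }
  }

  stop(): void {
    this.controller?.abort();
  }
}
```

A caller would pass the signal straight to `fetch`, for example `flight.run((signal) => fetch("/v1/chat/completions", { method: "POST", body: JSON.stringify(buildRequest(slot, messages)), signal }))`, so that `flight.stop()` cancels the in-flight stream.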

Verification commands actually run

On hal0-dev (this worktree):

cd /tmp/hal0-chat-live/ui
npm run typecheck       # clean
rm -rf node_modules/.vite dist && npm run build    # 114 modules, 803ms
npx playwright test dashboard-v3 --reporter=line   # 2/2 pass (existing smoke)

On the hal0 LXC (/opt/hal0): checked out this branch, ran rm -rf node_modules/.vite dist && npm run build, then systemctl restart hal0-api. Then:

# Pre-load primary so the first chat doesn't wait for a cold load
ssh hal0 'curl -sX POST http://127.0.0.1:13305/v1/load \
  -H content-type:application/json \
  -d {\"model_name\":\"Qwen3.5-0.8B-GGUF\"}'
# → {"status":"success","model_name":"Qwen3.5-0.8B-GGUF",...}

# Live Playwright spec: opens http://10.0.1.142:8080/, types
# "Reply with exactly: hello world", clicks send, asserts on a real
# assistant bubble + no dummy strings + send-disabled-when-empty.
HAL0_E2E_LIVE=1 HAL0_E2E_BASE_URL=http://10.0.1.142:8080 \
  npx playwright test _chat-live-verify --reporter=line
# → 3 passed (24.2s)

The real assistant bubble captured by the live spec:

=== Final assistant bubble ===
"hello world"
=== /Final ===

i.e. the prompt "Reply with exactly: hello world" round-tripped through the dashboard → /v1/chat/completions → Lemonade → Qwen3.5-0.8B-GGUF and the model replied with exactly hello world as instructed. The throwaway _chat-live-verify.spec.ts was removed before commit — it depends on the LXC being reachable with a live primary slot and isn't gated for CI.

After the LXC verification, the LXC working tree was restored to its pre-test state (sibling-throughput WIP on dashboard.jsx, sibling-lemonade WIP on idle.py) and rebuilt.

Deferred (out of scope per brief)

  • Streaming reasoning split (think/answer surface — there's a separate scaffold).
  • Tool-calling / function-calling visualisation (the prototype's toolblock UI is gone for now).
  • Multimodal attachments / voice input (the icons stay as "coming soon" toasts).
  • Persistence across page reload (no localStorage; conversation lives only in the page session).
  • Model picker in the composer (persona pick still routes via the existing PersonaPicker).
  • Backend changes — the proxy works, no new routes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the prototype's scripted demo bubbles + no-op send button with a
live round-trip against /v1/chat/completions. The chat surface now talks
to the persona slot's actual model via the hal0-api Lemonade proxy and
streams the response token-by-token.

Why: the prior `ChatActive` rendered ~250 lines of hardcoded JSX (fake
"refactor my code" exchange, fake tool blocks, fake image attachment)
and `onSend` only cleared the draft. Closes the chat half of
hal0_dashboard_v2_rework_in_flight (#200).

What:

* New `ui/src/api/hooks/useChatCompletions.ts` — `streamChatCompletion`
  parses SSE frames (`data: {...}\n\n` + `[DONE]`) into running
  `content` / `reasoning_content` buffers. Non-streaming
  `chatCompletion` falls back to the same envelope.

* New `ui/src/dash/chat.jsx` — extracted Composer, ChatActive,
  ChatEmpty, PersonaPicker from dashboard.jsx and rewrote ChatActive
  around a real `useChat` hook. The hook owns message state, single-
  flight send, and translates SSE deltas into incremental updates on
  the in-flight assistant bubble. Errors render as a red row instead
  of being swallowed. Empty drafts no-op (button disabled + Enter
  guarded). Enter sends; Shift+Enter inserts newline.

* `dashboard.jsx` shrinks by ~300 lines — Composer/ChatActive/
  ChatEmpty/PersonaPicker moved out; the file now owns only the
  snapshot strip, memory map, throughput / health cards, and the
  DashboardView shell.

* `main.tsx` adds the chat.jsx side-effect import so its
  `Object.assign(window, …)` runs before `<App />` mounts.

Extracting to a separate file isolates this surface from the parallel
`fix/dashboard-memmap-throughput-live` agent's edits, which touch
MemoryMap/ThroughputCard in the same file
(feedback_multi_agent_one_file).

Deferred (locked to brief):

* Streaming reasoning split (think/answer surface — separate scaffold).
* Tool calls / function-calling visualisation.
* Multimodal attachments.
* Persistence across page reload.
* Model picker in the composer.

Verification:

* `npm run typecheck` clean.
* `npm run build` clean (~803ms, 114 modules).
* Existing dashboard-v3 Playwright spec passes — composer still mounts
  inside dash-main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev force-pushed the fix/chat-surface-functional branch from 2cab099 to 740f2aa Compare May 25, 2026 18:21
@thinmintdev thinmintdev merged commit 625c938 into main May 25, 2026
@thinmintdev thinmintdev deleted the fix/chat-surface-functional branch May 25, 2026 18:21
thinmintdev added a commit that referenced this pull request May 28, 2026
…rough + gut installer auth section (#390)

- docs/operate/lemonade.md (new, .md canonical): operator reference for
  the v0.2 Lemonade runtime — what it is, where state lives, the /v1/*
  proxy + dispatcher fallthrough (PRs #248/#277), slot ↔ Lemonade
  model mapping (PRs #281/#282), max_loaded_models = 8 LRU cap (PR
  #283), per-type LRU eviction per ADR-0008 (supersedes nuclear-evict
  ADR-0007), OFFLINE-on-eviction (PR #276), and the three known v0.3
  caveats (Vulkan KV gauge missing, whisper RUNPATH workaround, GPU
  cleanup unload hang).

- docs/dashboard/v3.md (new, .md canonical, new docs/dashboard/ dir):
  page-by-page tour of the v3 React dashboard shipped in
  v0.3.0-alpha.1 (PR #235). Covers the shell + Mock-badge convention,
  /dashboard (system overview after #356), /chat (real surface per
  #309/#314/#315/#351), /slots (sidebar mirror per #357 + #344 UX
  sweep), /models (#313/#319/#353), /mcp (#304/#300), /agents (Peers
  per #299), /memory (graph #297, throughput #308), Settings (no Auth
  tab post-ADR-0012), and the footer journal (Epic #322 — PRs
  #321/#328/#329/#330/#332). Mock-fallback issues linked via the
  dashboard-v3 label, not enumerated.

- installer/README.md: gut ~95 lines of stale auth prose (Caddy,
  Bearer-token mint/use/revoke, first-run OTP claim wizard,
  HAL0_AUTH_ENABLED/HAL0_AUTH_DISABLED, password recovery, basic_auth
  upgrade path, the TLS recipe). Replace with one paragraph pointing
  at docs/operate/auth.mdx for the reverse-proxy recipe and
  docs/agents/identity.md for the X-hal0-Agent identity model. Auth
  was removed in v0.3.0-alpha.1 per ADR-0012; the README hadn't
  caught up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>