
test(e2e): rewrite update γ spec for Wave-3 RestartBanner #5

Merged
thinmintdev merged 1 commit into main from fix/playwright-green-2026-05-15
May 16, 2026

Conversation

@thinmintdev
Contributor

Summary

  • Rewrites update.spec.ts for the shipped Wave-3 RestartBanner (drives visibility from /api/updates/check; the apply flow is POST /apply → poll /status/{job_id} → applied)
  • Old spec asserted pre-Wave-3 stub button name and pre-Wave-3 toggle behavior
  • Mocks /api/updates/check with a mutable per-phase response, reloads in phase 2 to re-run doCheck()

Per-spec status (full sweep)

| spec | status |
| --- | --- |
| firstrun | PASS (~2.4s) |
| hardware | PASS (~360ms) |
| logs | PASS (~870ms) |
| models | FAIL (fixed in #13 + #17) |
| settings | PASS (~2s) |
| slot-lifecycle | PASS (~900ms) |
| update | PASS (this PR, ~990ms) |

Test plan

  • Playwright γ green for the 6 specs covered here

🤖 Generated with Claude Code

The shipped RestartBanner.vue now drives its visibility from
/api/updates/check (not mockState.status.update_available) and the
apply flow goes POST /apply → poll /status/{job_id} → 'applied'. The
old spec asserted the pre-Wave-3 stub button name ("Apply now") and
expected the banner to toggle off when system.status flipped — both
no longer true.

The new spec:
- Mocks /api/updates/check with a mutable per-phase response.
- Reloads the page in phase 2 to re-run doCheck(); the existing
  systemUpdateHint watcher guards on !check.value, so an in-place
  status flip is not enough.
- Clicks the renamed "Apply update" button, asserts the poll → applied
  state transition via banner text, and dismisses to clean up.
- Routes /api/updates/rollback for future Wave-4 use; flagged inline.
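
A minimal sketch of the mutable per-phase mock described above, written as plain TypeScript so it does not depend on Playwright's routing API. Only `/api/updates/check` and the `applied` terminal state come from the commit message; the `/api/updates` prefix on the apply and status paths, the response shapes, the job id, and the `createUpdateMock` helper are assumptions for illustration.

```typescript
// Hypothetical response shapes; the real payloads are not shown in this PR.
type CheckResponse = { update_available: boolean };
type StatusResponse = { job_id: string; state: "running" | "applied" };
type ApplyResponse = { job_id: string };

// In-memory stand-in for the endpoints the spec mocks.
function createUpdateMock(pollsBeforeApplied: number) {
  // Mutable per-phase response: the test reassigns it between phases.
  let check: CheckResponse = { update_available: false };
  let pollsSeen = 0;

  return {
    setCheck(next: CheckResponse): void {
      check = next;
    },
    // Dispatches on method + path, as a route handler would.
    handle(method: string, path: string): CheckResponse | StatusResponse | ApplyResponse {
      if (method === "GET" && path === "/api/updates/check") {
        return check;
      }
      if (method === "POST" && path === "/api/updates/apply") {
        pollsSeen = 0; // a new job starts in the "running" state
        return { job_id: "job-1" };
      }
      if (method === "GET" && path === "/api/updates/status/job-1") {
        pollsSeen += 1;
        return { job_id: "job-1", state: pollsSeen >= pollsBeforeApplied ? "applied" : "running" };
      }
      throw new Error(`unmocked request: ${method} ${path}`);
    },
  };
}
```

In a spec, phase 1 would leave `check` at its default, phase 2 would call `setCheck({ update_available: true })` and reload the page so the banner re-runs its check against the new response.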

models.spec.ts still fails because of a real product bug: the
Models.vue template references `pullProgress`, which is not defined
in the script setup, and this crashes the modal render (see follow-up
report). Not touched here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@thinmintdev thinmintdev merged commit 7daca2e into main May 16, 2026
1 of 4 checks passed
@thinmintdev thinmintdev deleted the fix/playwright-green-2026-05-15 branch May 21, 2026 20:11
thinmintdev added a commit that referenced this pull request May 23, 2026
…k/Menu/Toast/ToastStack) + 19th banner skip-path (slice #167) (#181)

Cross-cutting Vue 3 primitives for the v2 dashboard, mirroring the
React reference at /tmp/hal0-design-v3/dash/primitives.jsx 1:1 in
markup, class names and behaviour. Lands as a self-contained slice
under ui/src/components/primitives/; no existing views/chrome edited.

New primitives
- Modal — Esc/backdrop close (gated by dismissable), body-scroll
  lock + restore, hand-rolled Tab/Shift+Tab focus trap, focus
  restore on close, mounted via <Teleport to="body">.
- Drawer — right-side slide-in (transform: translateX 100% → 0),
  same lock/trap/restore as Modal. side="left" prop reserved.
- ConfirmDialog — wraps Modal; recoverable (neutral btn) vs
  destructive (red btn + "permanent" eyebrow); typeToConfirm input
  gates the confirm button until exact match.
- Banner — reusable warn/err/info shell; emits "action" so it
  stays store-agnostic.
- BannerStack — reads useBannerStore.activeByScope(scope) + the
  global scope alongside; falls back to a toast when an action has
  no onClick (matches design's window.__hal0Toast bridge).
- Menu — Teleported popover anchored to a trigger element;
  auto-closes on outside click + Esc + selection; items without
  onClick toast "<label> — stubbed".
- Toast + ToastStack — presentational toast pill + top-right
  TransitionGroup reading useToastStore. Mount stays for slice #5.
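
The component source is not part of this page, so the following is only a sketch of the index arithmetic a hand-rolled Tab/Shift+Tab focus trap typically needs, plus the exact-match gate described for `typeToConfirm`. The function names `nextFocusIndex` and `canConfirm` are hypothetical.

```typescript
// Index of the element that should receive focus after Tab / Shift+Tab,
// wrapping at both ends so focus never leaves the dialog.
function nextFocusIndex(current: number, count: number, shiftKey: boolean): number {
  if (count === 0) return -1; // nothing focusable: caller keeps focus on the dialog itself
  if (current < 0) return shiftKey ? count - 1 : 0; // focus was outside the trap
  const step = shiftKey ? -1 : 1;
  return (current + step + count) % count;
}

// typeToConfirm gate: the confirm button stays disabled until an exact match.
function canConfirm(typed: string, expected: string): boolean {
  return typed === expected;
}
```

A keydown handler would look up the active element's position among the dialog's focusable elements, call `nextFocusIndex`, and focus the result after calling `preventDefault()` on the event.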

19th banner — skip-path
- BANNER_CATALOG now has 19 entries; the new "skip-path" entry
  (scope=slots, kind=info) is the v0.3 fold-in for the FirstRun
  bundle-picker skip surface.

Tests
- tests/e2e/specs/primitives.spec.ts — 8 specs covering Modal
  Esc/backdrop, Drawer transform on open, ConfirmDialog destructive
  type-to-confirm gate, recoverable variant enabled-immediately,
  BannerStack scope-filtering + dismiss-removes-from-store,
  Menu wired-onClick + Esc + outside-click close, Menu stubbed-item
  toasts, ToastStack queue + short-ttl auto-removal.
- /_primitives_test sandbox route (skipFirstRunGuard) hosts the
  spec; unreachable from chrome.

Resolves the primitives half of #167.
thinmintdev added a commit that referenced this pull request May 23, 2026
#199)

* feat(ui): add design v2 token vocabulary as aliases over --hal0-* (#177)

Foundation for v0.2.1 dashboard rewrite (#148). Adds surface/fg/line
ramps, device chip colors, status semantic colors, radii, motion vocab
as additive tokens. NO component edits — existing dashboard renders
pixel-identical. Closes #164.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): pinia stores for v2 dashboard — lemonade/backends/banner/toast/tweaks (slice #165) (#178)

Stands up the Pinia store skeleton every v2 view depends on:

- useLemonadeStore — polls GET /v1/health every 2s; exposes
  loadedModels[], maxModels, version, lastUse per loaded model, health
  rollup. Refcounted init()/stop() so multiple callers share one
  timer.
- useBackendsStore — lists installable backends + state + slot fan-out;
  /api/backends is mocked until #142/#145 land (isMocked flag exposed).
- useBannerStore — 18-entry catalog (verbatim from the design's
  primitives.jsx BANNER_CATALOG); show/dismiss/toggle/clearScope +
  activeByScope() getter.
- useToastStore — v2 toast queue (separate from existing toasts.js so
  this slice is zero-regression for the v1 surface).
- useTweaksStore — DEV-only designer overlay, persisted to
  localStorage:hal0:tweaks:v2; no-op shim in production builds.
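
The refcounted `init()`/`stop()` pattern mentioned for useLemonadeStore can be sketched as below. This is not the store's code: `createRefcountedPoller` and `isRunning` are hypothetical names, and polling once immediately on the first `init()` is an assumption.

```typescript
type Timer = ReturnType<typeof setInterval>;

// Shares one interval timer between any number of callers.
function createRefcountedPoller(poll: () => void, intervalMs: number) {
  let refs = 0;
  let timer: Timer | null = null;

  return {
    init(): void {
      refs += 1;
      if (refs === 1) {
        poll(); // assumption: fetch once right away instead of waiting a full interval
        timer = setInterval(poll, intervalMs);
      }
    },
    stop(): void {
      if (refs === 0) return; // unbalanced stop() is a no-op
      refs -= 1;
      if (refs === 0 && timer !== null) {
        clearInterval(timer);
        timer = null;
      }
    },
    isRunning(): boolean {
      return timer !== null;
    },
  };
}
```

With this shape, two components that both call `init()` on mount and `stop()` on unmount share a single 2s timer, and the timer is cleared only when the last one unmounts.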

Light-touch extension to useSystemStore: documents the SlotConfig
device field already returned by the backend after PR-11 (#163).

useNuclearEvictBanner refactored to call useLemonadeStore.init() on
mount alongside its existing /api/lemonade/events/stream SSE — App.vue
still subscribes once, gets both polling + SSE-driven nuclear-evict
toast. PR-11's dashboard-lemonade-state.spec.ts continues to pass.

Closes data half of #148.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(internal): dashboard v2 implementation plan + v0.3 fold-in

Living plan doc + slice ledger for #148. Includes v0.3 carry-in (MCP Servers → slice #14 / #180) + worktree-path-discipline rules codified from observed failures.

* feat(ui): v2 primitives (Modal/Drawer/ConfirmDialog/Banner/BannerStack/Menu/Toast/ToastStack) + 19th banner skip-path (slice #167) (#181)

Cross-cutting Vue 3 primitives for the v2 dashboard, mirroring the
React reference at /tmp/hal0-design-v3/dash/primitives.jsx 1:1 in
markup, class names and behaviour. Lands as a self-contained slice
under ui/src/components/primitives/; no existing views/chrome edited.

New primitives
- Modal — Esc/backdrop close (gated by dismissable), body-scroll
  lock + restore, hand-rolled Tab/Shift+Tab focus trap, focus
  restore on close, mounted via <Teleport to="body">.
- Drawer — right-side slide-in (transform: translateX 100% → 0),
  same lock/trap/restore as Modal. side="left" prop reserved.
- ConfirmDialog — wraps Modal; recoverable (neutral btn) vs
  destructive (red btn + "permanent" eyebrow); typeToConfirm input
  gates the confirm button until exact match.
- Banner — reusable warn/err/info shell; emits "action" so it
  stays store-agnostic.
- BannerStack — reads useBannerStore.activeByScope(scope) + the
  global scope alongside; falls back to a toast when an action has
  no onClick (matches design's window.__hal0Toast bridge).
- Menu — Teleported popover anchored to a trigger element;
  auto-closes on outside click + Esc + selection; items without
  onClick toast "<label> — stubbed".
- Toast + ToastStack — presentational toast pill + top-right
  TransitionGroup reading useToastStore. Mount stays for slice #5.

19th banner — skip-path
- BANNER_CATALOG now has 19 entries; the new "skip-path" entry
  (scope=slots, kind=info) is the v0.3 fold-in for the FirstRun
  bundle-picker skip surface.

Tests
- tests/e2e/specs/primitives.spec.ts — 8 specs covering Modal
  Esc/backdrop, Drawer transform on open, ConfirmDialog destructive
  type-to-confirm gate, recoverable variant enabled-immediately,
  BannerStack scope-filtering + dismiss-removes-from-store,
  Menu wired-onClick + Esc + outside-click close, Menu stubbed-item
  toasts, ToastStack queue + short-ttl auto-removal.
- /_primitives_test sandbox route (skipFirstRunGuard) hosts the
  spec; unreachable from chrome.

Resolves the primitives half of #167.

* feat(ui): mock harness for absent endpoints — useMock + dispatch + Playwright fixture parity (slice #166) (#182)

Adds `ui/src/composables/useMock.js` carrying the v2/v0.3 mock dataset
(host, lemonade, slots, bundles, journal, models, backends, personas,
MCP servers/clients/catalog) and a drop-in `mockFetch` that substitutes
allowlisted endpoints. Two activation modes: `VITE_MOCK_LEMONADE=1` for
offline dev, or per-endpoint 404 fallback with console.warn.

The allowlist has 8 entries, each tagged with the backend issue that
will retire it (#145 metrics, #142 multi-modal slots, #180 MCP). When
the real endpoint lands, drop the matcher + builder; store callers
keep using `mockFetch` unchanged.
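
The two activation modes can be illustrated with a small decision function. This is a sketch under assumptions, not the contents of `useMock.js`: the `decide` helper, the `MockEntry` shape, the `/api/mcp/servers` path, and the payloads are hypothetical, and only two of the 8 allowlist entries are shown.

```typescript
type MockEntry = {
  match: (path: string) => boolean;
  retiredBy: string; // backend issue that will retire this entry
  build: () => unknown; // returns the mock payload
};

// Two illustrative entries; paths and payloads are placeholders.
const allowlist: MockEntry[] = [
  { match: (p) => p === "/api/backends", retiredBy: "#142/#145", build: () => ({ backends: [] }) },
  { match: (p) => p.startsWith("/api/mcp/"), retiredBy: "#180", build: () => ({ servers: [] }) },
];

type Decision = { source: "network" } | { source: "mock"; body: unknown };

// forceMock models VITE_MOCK_LEMONADE=1; realStatus is null when no request was made.
function decide(
  path: string,
  realStatus: number | null,
  forceMock: boolean,
  warn: (msg: string) => void = console.warn,
): Decision {
  const entry = allowlist.find((e) => e.match(path));
  if (entry === undefined) return { source: "network" }; // never mock outside the allowlist
  if (forceMock) return { source: "mock", body: entry.build() };
  if (realStatus === 404) {
    warn(`[mock] ${path} returned 404; serving mock until ${entry.retiredBy} lands`);
    return { source: "mock", body: entry.build() };
  }
  return { source: "network" };
}
```

Retiring an entry then means deleting one element of `allowlist`; callers see the real response without any change on their side.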

`useBackendsStore` refactored to use `mockFetch` instead of its inline
404 fallback. `useLemonadeStore` gains 5s `/v1/stats` polling — mirrors
PR-12 #179's emission shape so the cutover after rebase is free.

Playwright fixtures get a parallel `mock-data.ts` (Vite's
`import.meta.env` blocks direct import under Node + tsx), plus
`mockMcpEndpoints(page)` + `mockV1Stats(page)` helpers that pre-route
common endpoints. All 38 existing Playwright cases (12 specs) stay
green. `VITE_MOCK_LEMONADE=1 npm run dev` boots without a backend.

Refs #166. Anti-scope per brief: no new routes, no primitives, no
chrome, no global `window.fetch` patch, no banner catalog entries, MCP
store deferred to slice #14 #180.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 chrome — TopBar/Sidebar/Footer/BottomTabs + Agents·v0.3 group + journal pane upgrades (slice #168) (#185)

Replaces the v1 chrome with the v0.3 design vocabulary:

  * TopBar — wordmark + version pill + route eyebrow + ⌘K stub
    + host chip + AgentApprovalBell.
  * Sidebar — Dashboard / Slots / Models / Hardware / Backends /
    Logs / Agents·v0.3 sub-group (Agents / MCP Servers / Memory) /
    Settings. Lemonade status block at the bottom (state dot +
    N/M loaded) routes to /logs?source=lemond. Variants per
    breakpoint: full ≥1280 / icon-collapse 1080–1279 / overlay
    drawer 720–1079 / hidden <720.
  * Footer — two-line chip row (lemond:<state> · throughput ·
    loaded · NPU coresident · queued + update-available pill +
    journal toggle) over a last-3 journal peek. Expand pane slides
    up with source filter (merged/hal0/lemond), search with amber
    inline highlight, empty state, and "Open full logs →". Open
    state persists via sessionStorage:hal0:journal-pane.
  * BottomTabs — <720 only; Home / Slots / Models / Logs / More
    (sheet with Hardware/Backends/Agent/Settings).
  * App.vue mounts <BannerStack scope="global"/> above the route
    view and <ToastStack/> at the shell. Both are suppressed on
    the primitives sandbox so its own instances stay sole.
  * Router registers /agents/mcp + /agents/memory as
    ComingSoon.vue placeholders so the new sidebar links don't
    404 before slice #14 and Phase 9 ship.
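
The breakpoint rules listed for Sidebar and BottomTabs reduce to a pair of pure functions. The thresholds are taken from the list above; the function and type names are hypothetical, and the real components may derive the variant differently (for example from CSS media queries).

```typescript
type SidebarVariant = "full" | "icon" | "overlay" | "hidden";

// full ≥1280 / icon-collapse 1080–1279 / overlay drawer 720–1079 / hidden <720
function sidebarVariant(widthPx: number): SidebarVariant {
  if (widthPx >= 1280) return "full";
  if (widthPx >= 1080) return "icon";
  if (widthPx >= 720) return "overlay";
  return "hidden";
}

// BottomTabs render only below 720px, exactly where the sidebar is hidden.
function showBottomTabs(widthPx: number): boolean {
  return widthPx < 720;
}
```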

The old footer subtree (FooterBar / FooterPane / 4 tab views) is
deleted; footer.spec.ts is rewritten against the new chip-row +
journal-pane DOM and chrome.spec.ts adds 14 new tests covering
every breakpoint, the journal pane upgrades, the Agents·v0.3
group, and the Lemonade status block.

All 59 e2e specs pass. ruff format/check has only pre-existing
issues in unrelated python files.

Closes #168.

* feat(ui): v2 Dashboard / view — snapshot strip + persona picker + composer (5 states) + chat surface (slice #169) (#187)

Rewrite Dashboard.vue to the v0.3 chat-first layout. Replaces the
stat-rail + unified-memory + slots-grid + test-chat-panel + recent-events
sections with three stacked regions:

  1. Hero strip (~60px) — 3 variants (returning / post-install /
     skip-path-empty) with sessionStorage-backed × dismiss.
  2. SnapshotStrip (~90px) — per-slot row routing to /slots/:name,
     with type-aware metric strip (llm tok/s · TTFT · ctx · KV%,
     embed req/min · p50 · dim, etc.). KV% honestly shows "—" for
     GPU llm slots per the bundled llama-vulkan scrape gap.
  3. Chat surface (rest) — ChatActive | ChatEmpty + Composer with
     persona-above placement.

Adds under ui/src/components/dashboard/:
  - SnapshotStrip.vue  — clickable rows, state dot, device chip,
                          default ✦, coresident chip
  - PersonaPicker.vue  — persona chip + dropdown of llm slots,
                          "+ Add chat slot" → /slots create flow
  - Composer.vue       — 5 states (idle / sending / streaming /
                          swap / no-tools / offline) with persona
                          slot, attach + mic + send (or Stop)
  - ChatActive.vue     — user/assistant bubbles + inline tool-call
                          <details> blocks + persona-swap markers
                          + image/audio/text attachments
  - ChatEmpty.vue      — glyph + 3 example prompt chips

Extends useTweaksStore with chatVariant + heroVariant knobs for
designer preview without changing prod derivation.

The composer's `swap` state surfaces inline (composer-banner-swap)
because npu-swap in the banner catalog is scope=slots and would not
render through this view's scope=dashboard BannerStack.

Tests: new dashboard.spec.ts replaces the old test-chat dropdown
regression (that surface no longer exists). 12 specs cover all 5
composer states, SnapshotStrip row routing, persona swap → swap
state, tool-call <details> toggle, hero × persist, and the
skip-path-empty composer hide.

Full e2e: 69/69 green. Build clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 Models /models view — 3-pane catalog + AddByHF + Delete + DownloadRow 7 states (slice #171) (#188)

Wholesale replace v1 single-table Models.vue with the v2 3-pane layout
from the design source. Catalog list (left, 320px) + detail (right-top)
+ Downloads pane (right-bottom). <1080px collapses to list-primary with
detail + downloads as Drawer overlays.

New components under ui/src/components/models/:
- ModelList: filter chips (type/device/labels/namespace) + search +
  active-filter summary + sectioned list (installed / blessed / user.*)
- ModelDetail: header + recipe options w/ inline edit + real-time
  llamacpp_args denied-flag rejection + Used-by panel + On-disk panel
  + actions (Load now / Reveal-or-Copy-path / Delete / Pull)
- DownloadsPane + DownloadRow: 7 canonical states (pulling, paused,
  cancelled, error, verifying, completed, queued) + multi-file expand
  on hover + 5s auto-remove for completed (hover-defer)
- AddByHFModal: Inspect → variants (real /v1/pull/variants w/ mock
  fallback) → user.* model_name → labels (mmproj required for vision)
  → pre-flight panel → Pull
- DeleteModelDialog: destructive confirm + warn-soft block listing
  slot references + type-to-confirm (model.id) + omni-collection copy
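
A sketch of the seven DownloadRow states and the 5s auto-remove rule. The state names come from the list above; the `Row` shape and `prune` helper are hypothetical, and "hover-defer" is interpreted here as "a hovered row is never removed", which is an assumption about the intended behaviour.

```typescript
type DownloadState =
  | "pulling" | "paused" | "cancelled" | "error"
  | "verifying" | "completed" | "queued";

type Row = {
  id: string;
  state: DownloadState;
  completedAt: number | null; // ms timestamp, set when state becomes "completed"
  hovered: boolean;
};

const AUTO_REMOVE_MS = 5000;

// Drops completed rows older than 5s unless the pointer is over them.
function prune(rows: Row[], nowMs: number): Row[] {
  return rows.filter((row) => {
    const expired =
      row.state === "completed" &&
      row.completedAt !== null &&
      nowMs - row.completedAt >= AUTO_REMOVE_MS;
    return !(expired && !row.hovered);
  });
}
```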

E2E coverage:
- models-v2.spec.ts: AddByHF inspect→variants→Pull happy path, vision
  requires mmproj, delete type-to-confirm gating, 7-state DownloadRow
  rendering via window.__hal0_setFixtureDownloads fixture injection,
  denied llamacpp_args rejection, 3-pane vs compact responsive
- models.spec.ts: adapted from v1 table flow to v2 detail-pane Load Now
- models-slots-refactor.spec.ts: adapted from v1 scan/edit-slot flows
  to v2 a11y + filter-chip behaviour

Tests: 74/74 pass. Build clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 Settings /settings — 9 sections + Lemonade guardrails + OmniRouter + secrets/memory/auth (slice #173) (#190)

Rewrites Settings.vue around the v0.3 left-rail anchor layout from the
design source. Nine sections render under one route: Auth, Secrets,
Updates, Lemonade admin, OmniRouter, Agent policy, Memory (Cognee),
Appearance, About. Scroll-spy + hash deep-links wire the rail.

Lemonade admin folds in PR-13's keys via an in-page form: max_loaded_models,
ctx_size, llamacpp.backend/args, flm.args, whispercpp.backend, sdcpp.*,
log_level, global_timeout. `llamacpp.args` is read-only by default;
the Edit toggle reveals the GPU-deadlock footgun warning. Save fires
the SaveAndRestartDialog ("~8-12s outage") and the restart endpoint
runs after persisting when any deferred key changed. The standalone
/settings/lemonade subview stays mounted so PR-13's lemonade-admin
spec (link from /settings + PageHeader title assertion) passes.

OmniRouter renders 8 tools (3 hal0 + 5 upstream) with origin chips,
target slot, and remediation CTAs for inactive entries. Secrets section
falls back to seeded local mocks when /api/secrets returns an empty
body (catch-all stub or 404); AddSecretModal saves into the list.
Memory namespace reset is type-to-confirm via the destructive
ConfirmDialog. Appearance writes theme + density to useTweaksStore
(persisted in prod too for appearance keys only).

New primitives under ui/src/components/settings/: SettingsRail,
SecRow, SecKey, RestartChip, AddSecretModal, AllowedOriginsModal,
RotateTokenDialog, SaveAndRestartDialog, BundledLicensesDrawer.

Coverage: new settings-v2.spec.ts (7 cases) + adapted settings.spec.ts
smoke. Full Playwright suite 78/78. PR-13's lemonade-admin.spec.ts
still passes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 FirstRun /firstrun — bundle picker (pick/confirm/progress) + grid+matrix variants + skip dialog (slice #172) (#191)

Replaces the v1 8-step linear wizard with the v0.3 design's three-state
machine: pick (Lite/Default/Pro/Max tier cards or capability matrix) →
confirm (per-slot install list + optional NPU trio opt-in) → progress
(per-row download bars with inline Retry + Skip-this-model).

State + endpoint wiring lives in components/firstrun/useFirstRun.js
(rewritten from scratch). The 8-step composable's API is gone; the new
composable exposes view / pickedTier / withNpu / pull alongside derived
bundle state (recommended / available / unfit / installed / gated-no-hf).

Sub-components: BundleGrid, BundleTable, TierCard, InstallProgressRow,
SkipBundleDialog. Layout variant switches via useTweaksStore.firstrunLayout
(tiers = grid, wizard = capability matrix). Banner catalog entries
fr-reentered, fr-ram-low, and hf-gated are toggled on/off as state
transitions warrant — no new catalog rows.

LMX-Omni-52B-Halo pre-built kit surfaces when host RAM ≥ 100 GB. Skip
flow wraps primitives/ConfirmDialog with the design's verbatim copy.
Progress rows drive an SSE stream per pull and degrade to polling on
EventSource error.

The agent-flow spec's "first-run wizard surfaces the agent step" test
is .skip'd: the v0.3 design removes the bundled-agent picker from
firstrun (lives behind /agent + ADR-0004 settings entry now). Slice
#174 re-targets the install flow.

Closes #172.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 MCP Servers page (/agents/mcp) — KPI strip + clients ribbon + LiveTimeline oscilloscope + Install/Config/Logs/Connect modals (slice #14 / #180) (#193)

v0.3 Agents · MCP Servers surface (issue #180, dash-v2 slice 14).
Replaces the slice #168 ComingSoon placeholder.

McpView composition (top-down):
- 6-cell KPI strip (running/clients/calls60s/failures/installing/last-activity)
- ClientsRibbon ↔ NoClientsState (data-driven)
- 5-tab segmented filter bar + timeline-tick legend
- Vertical McpServerRow stack — running/stopped/failed/installing
  variants, bundled rail, per-row 60s LiveTimeline oscilloscope
- Drawer + Modal cluster: InstallDrawer (Catalog+URL tabs),
  EditConfigModal, LogsDrawer, ConnectClientModal, destructive
  ConfirmDialog (type-to-confirm uninstall; bundled rejects)

Live data path: useLiveCallStream composable (500ms tick, p=rpm/120
per running server, 60s GC window). Production swap hook = a WS
subscription on /api/mcp/stream.
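
The `p=rpm/120` figure follows from the tick rate: a 500ms tick fires 120 times per minute, so emitting a call with probability rpm/120 on each tick yields about rpm calls per minute on average. The sketch below shows one tick of such a simulator; `tick`, the `Call` and `Server` shapes, and the injectable `rand` are hypothetical, and the real composable may differ.

```typescript
const TICK_MS = 500;
const WINDOW_MS = 60000; // 60s GC window

type Call = { server: string; at: number };
type Server = { name: string; rpm: number; running: boolean };

// One simulator tick: garbage-collect old calls, then maybe emit one call per running server.
function tick(calls: Call[], servers: Server[], nowMs: number, rand: () => number = Math.random): Call[] {
  const kept = calls.filter((c) => nowMs - c.at <= WINDOW_MS);
  const ticksPerMinute = 60000 / TICK_MS; // 120
  for (const server of servers) {
    if (!server.running) continue;
    const p = server.rpm / ticksPerMinute;
    if (rand() < p) kept.push({ server: server.name, at: nowMs });
  }
  return kept;
}
```

Because at most one call is emitted per server per tick, this scheme saturates at 120 calls per minute; a server configured above that rate would need a different sampling rule.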

useMcpStore (Pinia) backs the page — fetch/install/uninstall/restart/
toggleEnabled/updateConfig — all behind mockFetch + per-resource
loading/error state. Endpoint stubs covered by mockMcpEndpoints fixture.

Sidebar: unblocked the "MCP Servers" sub-row (one-line change);
Memory remains gated. chrome.spec updated to expect 1 disabled
sub-row (Memory) instead of 2.

playwright.config: HAL0_E2E_PORT env override added so spec runs
in parallel worktrees stop colliding on a shared 5173 vite port.
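
One way such an override could be resolved, shown as a standalone helper rather than the actual playwright.config change. `HAL0_E2E_PORT` and the 5173 default come from the text above; `resolveE2EPort` and its validation rules are assumptions.

```typescript
// Reads the port override from an env-like object, falling back to Vite's default.
function resolveE2EPort(env: Record<string, string | undefined>, fallback: number = 5173): number {
  const raw = env["HAL0_E2E_PORT"];
  if (raw === undefined || raw === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`HAL0_E2E_PORT must be an integer in 1..65535, got "${raw}"`);
  }
  return port;
}
```

A config file would call this with `process.env` and use the result for both the dev-server port and the base URL, so each worktree can run with, for example, `HAL0_E2E_PORT=5180`.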

Coverage: ui/tests/e2e/specs/mcp-v2.spec.ts — 8 tests, all green
end-to-end + full 79-spec suite remains green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 Extras — Hardware/Backends/Logs/Agent + Backend modals + Persona edit (slice #174) (#194)

Rewrites four secondary-route views against the v0.3 design system:

- Hardware: 6 vertical-stack panels (Host / CPU / GPU / NPU / Memory /
  Storage) with status dots, recommended-chip rollups, per-slot memory
  segments, lemond-offline dim-overlay.
- Backends: Lemonade self-card + backend table with state chips (installed
  / installing / uninstalling / unavailable / error). Per-row install /
  reinstall / uninstall actions wire to the Install / Uninstall /
  FLM-deb modal trio. Adds /backends route; redirects /providers for
  back-compat (renamed in v2 IA).
- Logs: unified merged journal with source toggle (merged / hal0 /
  lemond), level + slot filter, search-with-highlight, grouped-error
  collapse for adjacent same-request_id frames, floating jump-to-live
  pill with +N pending badge. Preserves PR-14 LemonadeJournalPanel —
  source=lemond renders the existing WS-streamed panel inline (no
  duplicate streaming logic). ws-disconnect banner wires through
  useBannerStore.
- Agent: 5-tab surface (Overview / Inbox / Skills / Memory / Personas).
  Inbox preserves ADR-0004 §5 wiring via existing AgentInboxTab +
  AgentApprovalRow. NoBundledAgentCard radio (pi-coder vs Hermes) +
  install path. Personas grid with PersonaEditModal (name / slot /
  tone / system prompt / allowed tools). no-agent banner via store.
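
The grouped-error collapse in the Logs view can be sketched as a single pass that merges adjacent error frames sharing a `request_id`. The `Frame` and `Group` shapes and the `groupAdjacentErrors` name are hypothetical; the actual journal frame format is not shown here.

```typescript
type Level = "info" | "warn" | "error";
type Frame = { level: Level; request_id: string | null; msg: string };
type Group = { level: Level; request_id: string | null; frames: Frame[] };

// Merges runs of adjacent error frames with the same request_id into one group.
function groupAdjacentErrors(frames: Frame[]): Group[] {
  const groups: Group[] = [];
  for (const frame of frames) {
    const last = groups.length > 0 ? groups[groups.length - 1] : undefined;
    if (
      last !== undefined &&
      frame.level === "error" &&
      last.level === "error" &&
      frame.request_id !== null &&
      frame.request_id === last.request_id
    ) {
      last.frames.push(frame); // same request, directly adjacent: collapse
    } else {
      groups.push({ level: frame.level, request_id: frame.request_id, frames: [frame] });
    }
  }
  return groups;
}
```

Only adjacency triggers a merge, so two errors for the same request separated by an unrelated line stay in separate groups and the chronological order of the journal is preserved.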

Sub-components live under components/{hardware,backends,logs}/ and
components/agent/. New extras-v2.spec.ts covers the four routes,
backend modal trio, persona save flow, NoBundledAgentCard install
flow, and the no-agent banner. Existing hardware / logs /
lemonade-journal specs adapted to the new view shapes while
preserving test intent.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 Slots /slots view — SlotCard + NPU trio + Create/Edit/Swap modals (slice #170) (#192)

* feat(ui): v2 Slots /slots view — SlotCard + NPU trio + Create/Edit/Swap modals (slice #170)

Replaces the v1 list+capability-cards hybrid with the v2 grouped-card layout from
slots.jsx. Sections render per-type slot grids (Chat / Embed / Voice / Image /
Custom) plus a dedicated NPU rollup. Adds skip-path 6-card grid + banner #19.

What landed
-----------
- `ui/src/views/Slots.vue` — rewritten. Grouped sections, hotkey `N`, route-driven
  EditSlotDrawer via `/slots/:name`, skip-path detection (6 EmptySlotCards +
  `skip-path` banner), NPU variant toggle.
- `ui/src/components/SlotCard.vue` — rewritten to match slots.jsx::SlotCard.
  State-dot motion, per-type metric strip, inline swap trigger, ⋯ overflow menu,
  [CPU] chip preserved (provider=kokoro), coresident badge preserved.
- `ui/src/components/slots/{NpuBlock,NpuReactor,EmptySlotCard,ErrorSlotCard,
  CreateSlotModal,EditSlotDrawer,InlineSwapPopover,SlotOverflowMenu}.vue` —
  new components. NPU trio rendered as one rollup with two variants toggled
  from useTweaksStore.npuVariant ('block' default, 'reactor' tweak).
- `ui/src/components/primitives/{Modal,Drawer}.vue` — added optional `title-id`
  prop so callers can wire `aria-labelledby` to a stable selector (preserves
  a11y test intent across the SlotCard/EditSlotDrawer rewrite).
- `ui/src/stores/tweaks.js` — npuVariant defaults to 'block'; legal values
  'block' | 'reactor'.

Deleted
-------
- `ui/src/components/capabilities/{CapabilitiesSection,EmbedCard,VoiceCard,
  ImgCard}.vue` — capability cards collapsed into the SlotCard grid per the
  v2 brief. NPUBackendCard kept (still used by Dashboard.vue).

Tests
-----
- New `ui/tests/e2e/specs/slots-v2.spec.ts` (10 tests): skip-path + banner #19,
  hotkey N, /slots/:name drawer routing, grouped sections, both NPU variants,
  per-type metric strip, KV%='—' for GPU llm, overflow menu, ErrorSlotCard.
- Adapted `slot-lifecycle.spec.ts`, `models-slots-refactor.spec.ts`,
  `lemonade-voice-chip.spec.ts`, `dashboard-lemonade-state.spec.ts` to the
  new selectors (`.slot[data-slot-name=...]`, `#create-slot-*`, overflow
  menu for Delete, NpuBlock for trio, `[aria-hidden=true]` for Drawer close).
- Full Playwright suite: 80/80 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(slots): skip 5 models-side tests duplicated by models-v2.spec from #171

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ui): v2 polish — skeletons + a11y + TopBar overflow + drift banners (slice #175) (#197)

Final pass before the dash-v2 cutover. Adds the loading-state polish,
keyboard-a11y improvements, and external-link surface the v0.3 design
calls for.

  - 5 skeleton variants in ui/src/components/skeletons/: SlotCard,
    SnapshotRow, JournalLine, NpuSubRow, ModelRow. All use a shared
    `.skel` shimmer rule + respect prefers-reduced-motion. Mounted in
    SnapshotStrip / Slots / Logs / Models as initial-load fallbacks
    (gated on !system.status or empty loading lists so re-polls don't
    flash skeletons).
  - Skip-link in App.vue jumps to #main-content; styled in style.css
    as the first focusable element on every page.
  - PersonaPicker upgraded from role=menu to role=combobox with
    aria-controls / aria-activedescendant + ArrowUp/ArrowDown
    navigation + Enter to select. Options carry role=option +
    aria-selected.
  - Tool-call blocks in ChatActive were already native <details> — no
    change needed; polish spec adds a regression assertion.
  - --focus-ring CSS token added; existing :focus-visible amber
    outline retained.
  - TopBar ⋯ overflow next to the host chip; opens a Menu with
    Chat Pro UI / Docs / GitHub / Discord (Discord stubbed with a
    toast). External links open via target=_blank + rel=noopener.
  - Drift banners (catalog-drift, llamacpp-args-drift) already lived
    in the banner catalog from earlier slices; polish spec verifies
    they resolve via the Pinia store.
  - @axe-core/playwright added; polish.spec.ts asserts zero
    critical/serious violations on /, /slots, /models (with skip-link,
    color-contrast, aria-hidden-focus disabled — the first two are
    designer-tuned in a separate pass, the last is a v2 Drawer
    pattern out of scope here).
  - polish.spec.ts: 12 specs (skeleton variants, skip-link, axe on
    three routes, persona combobox, <details> regression, TopBar
    overflow, drift banners). Full suite 133 passed + 6 skipped.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(ui): delete v1-era orphans (slice #176 cutover)

Slice #176 v0.2.1 dashboard cutover dead-code sweep. After 8 prior
slices replaced the v1 dashboard end-to-end, audit confirmed these
modules have zero importers and zero router refs:

Components
- ui/src/components/EmptyState.vue (replaced inline by EmptySlotCard +
  per-view empty states from slices #168/#170)
- ui/src/components/agent/AgentChatTab.vue (PTY-tap surface dropped
  when v0.2 narrowed Agent → Hermes-only, per
  feedback_hal0_agents_v0.2_narrow_to_hermes)
- ui/src/components/capabilities/ (CapabilityToggle + NPUBackendCard —
  v1 capability-row UX superseded by SlotCard + NpuBlock from slice
  #170)

Composables
- ui/src/composables/useAutoscroll.js (folded into ChatActive)
- ui/src/composables/useCapabilities.js (capabilities surface gone)
- ui/src/composables/useSSE.js (useEvents is the v2 SSE primitive)

Tidied one stale comment in useMock.js that named the deleted
useCapabilities composable.

Verification
- npm run build clean
- ruff format --check src tests + ruff check src tests clean
- npm run test:e2e: 132 passed / 6 skipped (1 LiveTimeline flake passes
  on retry — CI has retries: 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 23, 2026
…raph extraction model gate) + cross-links (#269)

ADR-0013 — per-agent allow-list at /etc/hal0/agents/<name>.toml; default-deny on
server + tool axis; three-tier classification (allow/gated/blocked);
filesystem sandbox at /var/lib/hal0/agents/<name>/workspace;
ADR-0004 approval-queue integration. Resolves PLAN §1 stream #5 ships-when
"at least one MCP-client external source connectable from a bundled agent".

ADR-0014 — supersedes ADR-0005 §6 graph bullet. Cognee graph extraction
defaults OFF in v0.3; opt-in via dashboard toggle with typed route enum
(upstream / primary / agent). Privacy disclosure copy is part of the
contract. Eval suite deferred to v0.4. Resolves PLAN §1 stream #5
ships-when "configurable model" requirement.

Cross-links:
- ADR-0011 Related header now points at ADR-0013 (card schema v2+ may
  surface a derived allowed_tools projection from the allow-list).
- ADR-0005 §6 graph bullet now points at ADR-0014 as the settled
  v0.3 decision.

PLAN.md already references ADR-0013 / ADR-0014 (via #268). Tracker issues
#257-#260 + #265 + #261-#264 link these ADR files; landing the files
closes the broken-link.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 27, 2026
* fix(slots): zero-red-dots bundle — 7 fixes

Backend
  - manager.py: persist explicit model_id to slot TOML on load/swap
    so reconciliation never drifts back to "no model.default set"
    ERROR (Fix #1)
  - manager.py: _fail_watch transitions to OFFLINE (clean evict)
    instead of ERROR (red) when lemond drops a loaded model. RED
    reserved for spawn/health/load exceptions (Fix #2)
  - manager.py: load() short-circuits to OFFLINE+CTA when there's
    no resolvable model, instead of letting lemonade.load() throw
    and stamp ERROR every tick (Fix #4)
  - manager.py: reconcile_unconfigured_slots() one-shot startup
    pass migrates pre-fix stuck ERRORs to OFFLINE so the dashboard
    re-renders correctly without operator action (Fix #2/#4 cleanup)
  - api/__init__.py: wire reconcile_unconfigured_slots into lifespan
  - tests/slots/test_fail_watcher.py: assert OFFLINE+evict semantics

Frontend
  - dashboard.css: .dot.serving uses --ok (green) not --accent (Fix #3)
  - slot-modals.jsx: InlineSwapPopover chevron is now an
    independently-clickable <button> (own onClick + stopPropagation
    + keyboard handler); CSS adds focus-visible outline + hover
    feedback (Fix #5)
  - models.jsx: ModelDetail "Load now" wired through useSlotSwap
    against the first compatible slot; toast on multi-match (Fix #6)
  - slots.jsx: slotIndicator() rewritten so GREEN only fires while
    actively SERVING; loaded+waiting (ready/idle/lemo=loaded) and
    evicted (lemo=idle) both map to YELLOW. 1h hung-request guard
    flips a long-in-SERVING slot back to YELLOW with "stuck?"
    label (Fix #7)
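
The dot-state contract described for `slotIndicator()` can be written out as a pure function. This is a TypeScript sketch, not the code in slots.jsx: the `Slot` field names (`servingSinceMs` in particular) are hypothetical, and the `"none"` result for disabled or otherwise unmentioned states is an assumption, since the commit message does not say how those render.

```typescript
type Dot = "green" | "yellow" | "red" | "none";

type Slot = {
  enabled: boolean;
  state: string; // e.g. "serving" | "ready" | "idle" | "error" | "offline"
  lemonade_state?: string; // e.g. "loaded" | "idle"
  servingSinceMs?: number; // hypothetical: when the current request started
};

const STUCK_AFTER_MS = 60 * 60 * 1000; // 1h hung-request guard

function slotIndicator(slot: Slot, nowMs: number): { dot: Dot; label?: string } {
  if (!slot.enabled) return { dot: "none" }; // assumption: disabled slots show no colour
  if (slot.state === "error") return { dot: "red" }; // red is reserved for real failures
  if (slot.state === "serving") {
    const since = slot.servingSinceMs;
    if (since !== undefined && nowMs - since >= STUCK_AFTER_MS) {
      return { dot: "yellow", label: "stuck?" };
    }
    return { dot: "green" }; // green only while actively serving
  }
  const waiting =
    slot.state === "ready" ||
    slot.state === "idle" ||
    slot.lemonade_state === "loaded" ||
    slot.lemonade_state === "idle";
  if (waiting) return { dot: "yellow" }; // loaded-and-waiting or evicted
  return { dot: "none" }; // assumption: colour for other states is not specified
}
```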

* fix(slots): review-pass amendments — a11y + ERROR audit log

Backend
  - manager.py: log.error('slot.error', extra={...reason...}) on
    every ERROR transition so journald carries a durable audit
    trail in addition to the SSE event bus (closes user-spec
    audit demand #6 logging gap). NOTE: extra= cannot reuse
    'message' — it's a reserved LogRecord attribute and stdlib
    logging raises KeyError on collision; the gotcha is documented
    inline.
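The reserved-attribute collision described above can be reproduced with the standard library alone. This is a minimal sketch, not the code in manager.py: the logger name `hal0.slots.demo`, both function names, and the `slot_id` / `reason` keys are illustrative assumptions.

```python
import logging

log = logging.getLogger("hal0.slots.demo")  # hypothetical logger name


def log_slot_error(slot_id: str, reason: str) -> None:
    # Custom keys are accepted as long as they do not shadow LogRecord attributes.
    log.error("slot.error", extra={"slot_id": slot_id, "reason": reason})


def log_slot_error_broken(slot_id: str, reason: str) -> None:
    # 'message' is reserved by LogRecord, so stdlib logging raises KeyError here.
    log.error("slot.error", extra={"slot_id": slot_id, "message": reason})
```

Renaming the key (here to `reason`) is enough to avoid the collision; the same restriction applies to `asctime` and to any existing LogRecord attribute such as `name` or `levelname`.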

Frontend
  - slot-modals.jsx: dropped role="button" / tabIndex / onKeyDown
    from .swap-pop-item rows. The nested chevron <button> is the
    single keyboard/AT-accessible affordance — making the row
    ALSO a button created a double-announcement for screen
    readers. Mouse onClick on the row body still works.

* test(e2e): update slot-indicator spec for 2026-05-27 dot-state contract

Pre-existing tests asserted the OLD READY+fresh → green / READY+stale → yellow rule. Per the user spec, GREEN now fires only on state=serving (in-flight); all loaded-and-waiting states (ready / lemo=loaded / idle / lemo=idle) map to yellow. Added coverage for serving (fresh + stuck), !enabled, lemonade_state=loaded, and lemonade_state=idle.
thinmintdev added a commit that referenced this pull request May 28, 2026
…3 PR-5) (#395)

New systemd template unit `installer/systemd/hal0-agent@.service`,
parameterized by agent id (%i). v0.4-ready: dropping in
`hal0-agent@piccoder.service` later requires no template edit.

Per DA-sec-ops review (docs/internal/hermes-research-2026-05-28):

* `Wants=hal0-lemonade.service` (NOT Requires=/BindsTo=) per MUST-FIX #5
  — survives the Lemonade GPU-cleanup-after-unload deadlock documented
  in memory `hal0_lemonade_unload_gpu_cleanup_hang` without pinning
  the agent in "active (running)" forever
* `Type=notify` + `WatchdogSec=60` — systemd observes hangs in the
  agent itself (not just the model backend)
* `NoNewPrivileges`, `ProtectSystem=strict`, `ProtectHome=yes`,
  `PrivateTmp`, `ProtectKernelTunables/Modules/ControlGroups`,
  `RestrictSUIDSGID`, `RestrictRealtime` — defense-in-depth sandbox

ExecStart goes through the new `hal0-agent` CLI shim
(`src/hal0/cli/agent_shim.py`), which:

* resolves agent type from `/etc/hal0/agents/<id>.toml` + builtin map
* launches `hermes dashboard --tui --skip-build --no-open --host 127.0.0.1`
  — the ONLY Hermes subcommand that boots `hermes_cli/web_server.py`
  (the one serving `/api/pty`, `/api/events`, `/api/ws`). Verified at
  `~/src/hermes-agent/hermes_cli/main.py:14050-14102` →
  `cmd_dashboard` → `web_server.start_server` at line 10930-10939
* emits sd_notify READY/WATCHDOG/STOPPING via pure-stdlib AF_UNIX
  datagram — no `systemd-python` dep added to the wheel
* forwards SIGTERM/SIGINT/SIGHUP to the child (SIGHUP = persona swap)
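For readers unfamiliar with the sd_notify wire protocol mentioned above, it is a single datagram sent to the socket named in `NOTIFY_SOCKET`. The sketch below shows the general technique using only the standard library; the function name and the injectable `env` parameter are assumptions for illustration and may not match agent_shim.py.

```python
import os
import socket
from typing import Mapping, Optional


def sd_notify(state: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Send one sd_notify datagram; return False when no socket is configured."""
    env = os.environ if env is None else env
    target = env.get("NOTIFY_SOCKET", "")
    if not target:
        return False  # not started by systemd with Type=notify
    addr = target.encode()
    if addr.startswith(b"@"):
        # A leading '@' denotes a Linux abstract-namespace socket.
        addr = b"\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

Under this sketch the three states named above would be sent as `sd_notify("READY=1")`, `sd_notify("WATCHDOG=1")`, and `sd_notify("STOPPING=1")`.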

DA-sec-ops MUST-FIX #1 addressed: `mcp serve` mode is a query-only
MCP server with NO event stream — the chat surface would render
blank. Test `test_exec_start_never_uses_mcp_serve` enforces this.

`installer/systemd/hal0-agent@hermes.service.d/override.conf` pins
hermes-specific env (HERMES_HOME, HERMES_DASHBOARD_TUI,
HAL0_LEMONADE_BASE) without touching the generic template.

`installer/install.sh` lays down the unit + override at install time
and `systemctl enable --now`s the hermes instance when the venv
exists (PR-3 will land the venv).

`docs/agents/hermes/SERVICE.md` — operator recipes (start/stop/
restart/journalctl, failure mode triage, customisation patterns).

Tests:
- 36 tests in tests/cli/test_agent_shim.py — argv parsing, agent
  config resolution, Hermes invocation builder (incl. assertion that
  `dashboard` is chosen and `mcp serve` is not), Hermes env builder
  (HAL0_AGENT_ID + HERMES_HOME propagation, NOTIFY_SOCKET strip),
  sd_notify wire protocol over AF_UNIX, /proc child-pid discovery
  (cmdline AND env AND-gate), cmd_status / cmd_stop / cmd_reprovision
- 21 tests in tests/systemd/test_unit_files.py — directive presence
  (Wants= not Requires=, Type=notify, WatchdogSec=, hardening
  directives, ReadWritePaths covers all three state dirs,
  Environment="HAL0_AGENT_ID=%i", per-instance EnvironmentFile, no
  `mcp serve` in any ExecStart line) plus 3 tests on the hermes
  override.

Verified `systemd-analyze verify` on hal0 LXC (only error is the
missing hal0-agent binary — expected pre-merge).

Refs hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-5.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request May 28, 2026
* docs(agents,mcp,memory): user-facing docs for v0.3 surface (identity + private-ns are placeholders pending rename + #317)

Adds 10 docs pages covering the v0.3 agents / MCP / memory surface:

- docs/agents/overview.md — what an agent is in hal0, install/lifecycle, v0.3 = Hermes only.
- docs/agents/hermes-bootstrap.md — the 12-phase pipeline + plugin model + state paths.
- docs/agents/identity.md — ADR-0011 identity cards + the X-hal0-Agent target shape, with TO BE DOCUMENTED placeholder for the server-side header read still pending.
- docs/agents/mcp-client.md — ADR-0013 per-agent allow-list (refresh of the deleted PR #295 file, updated post-ADR-0012 to remove the inbound-bearer framing).
- docs/mcp/overview.md — hal0 as MCP host; transport, mount, identity (no auth post-ADR-0012).
- docs/mcp/hal0-admin.md — tool taxonomy (25 tools), gating, REST passthrough, audit, secret redaction.
- docs/mcp/hal0-memory.md — four tools, dataset model, REST shims, on-disk layout.
- docs/memory/overview.md — Cognee engine, datasets, surfaces, source stamping.
- docs/memory/graph.md — refresh of the deleted PR #294 file; ADR-0014 model gate, three routes, CLI + REST + dashboard.
- docs/memory/private-namespacing.md — target shape for private:<agent_id>, with TO BE DOCUMENTED placeholder for issue #317.

Every claim is anchored to a src/ path, an ADR, or a PR/issue number.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(agents): internal agent contracts — issue tracker, triage labels, domain glossary

Adds AGENTS.md (top-level pointer) plus docs/agents/{domain,issue-tracker,triage-labels}.md
covering the conventions agents follow when working in this repo: gh CLI on Hal0ai/hal0,
default triage label vocabulary, and single-context CONTEXT.md domain doc pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents,api): v0.3 agent plumbing hot-fix bundle (#393)

Four fixes that together restore the hal0 provider on a fresh Hermes
install. Without all four shipping together, PR-1 in isolation produced
no user-visible improvement (R4 + DA-arch).

1. ``_collect_chat_slots`` filter (R4 H1) — the live ``/api/slots`` payload
   uses ``type=="llm"`` for chat slots and ``kind=="local"`` for the
   deployment shape. The previous ``_slot_kind``-first check looked at
   ``kind`` first and rejected 100% of real slots; ``model_aliases:``
   never rendered. Filter now matches on ``type == "llm"`` and gates on
   ``_is_ready`` so only loaded models surface as aliases.

2. ``/api/upstreams`` dedup (R4 H2) — replaced per-slot upstream
   autoregistration with one composite ``hal0`` upstream pointed at
   hal0-api's own ``/v1``. Aggregates every chat-capable slot's model id
   through a new ``_fetch_hal0_composite_models`` helper with a 5s TTL
   cache (``time.monotonic()``-keyed module dict, NOT
   ``functools.lru_cache`` since that has no time-based expiry). The
   ``/v1/models`` handler short-circuits the composite case so it
   doesn't recurse over HTTP. The ``slot.state`` ready-edge subscriber
   punches the cache. Eliminates the duplicate ``primary`` +
   ``agent-hermes`` entries both pointing at ``127.0.0.1:8001``.

3. Removed legacy ``Hal0Profile`` plugin (R4 H4) — it hardcoded
   ``base_url=http://127.0.0.1:8000/api/v1`` which has no listener; the
   composite ``hal0`` upstream from fix #2 supersedes it. Install phase
   now stages only ``hal0-memory``; legacy plugin dir cleanup is
   idempotent.

4. ``hal0-memory`` client — stopped sending
   ``dataset="private:hermes-agent"``. The server resolves dataset from
   ``X-hal0-Agent`` + ``X-hal0-Private`` headers since PR #366; the
   client-side ``private:`` prefix was rejected by ``_AGENT_ID_PATTERN``
   and silently 4xx'd every memory write.
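The TTL cache from fix 2 can be sketched as follows. This illustrates the general pattern (a module-level dict stamped with ``time.monotonic()``) rather than the actual ``_fetch_hal0_composite_models`` helper; ``cached_models``, ``cache_clear``, and the injectable ``now`` clock are hypothetical names.

```python
import time
from typing import Callable, Dict, List, Tuple

_TTL_SECONDS = 5.0
_CACHE: Dict[str, Tuple[float, List[str]]] = {}  # key -> (stored_at, models)


def cached_models(
    key: str,
    fetch: Callable[[], List[str]],
    now: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return the cached model list for `key`, refetching once the TTL expires."""
    hit = _CACHE.get(key)
    if hit is not None and now() - hit[0] < _TTL_SECONDS:
        return hit[1]
    models = fetch()
    _CACHE[key] = (now(), models)
    return models


def cache_clear() -> None:
    """Drop every entry, for example when a slot reaches the ready state."""
    _CACHE.clear()
```

``functools.lru_cache`` evicts by size only, which is why a timestamped dict is the simpler fit when entries must expire after a fixed interval.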

Tests:
- ``tests/agents/test_hermes_provision_collect.py`` — three real-LXC slot
  fixtures (cold / primary-ready / all-ready), parametrized + capability
  + readiness guards. Fixtures captured from LXC 105 2026-05-28.
- ``tests/api/test_upstream_dedup.py`` — composite registration, TTL
  cache lifecycle, nested ``[model] default`` TOML shape, override
  precedence, idempotency.
- ``tests/agents/test_hal0_memory_client.py`` — locks the no-dataset
  contract for ``sync_turn`` + verifies graph forwarding still intact.

Existing test updates:
- ``test_install_phase_skips_install_when_binary_exists`` now asserts the
  legacy plugin dir is absent.
- ``test_hal0_profile_plugin_file_present`` renamed to
  ``test_legacy_hal0_profile_plugin_removed`` and inverted.
- ``test_model_automap_writes_aliases_from_chat_slots`` updated to use
  the real ``type=="llm"`` payload shape.
- ``test_lifespan_autoregisters_local_slot_as_upstream`` rewritten as
  ``test_lifespan_autoregisters_composite_hal0_upstream``.

LXC smoke verified: ``/api/upstreams`` returns one ``hal0`` entry,
``/v1/models`` aggregates both chat slot models, ``hal0-api`` restarts
clean.

Refs: docs/internal/hermes-research-2026-05-28 MASTER-PLAN.md §4
PR-1-bundle; R4 H1/H2/H4; #317 client-side closeout.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agents): hal0-agent@.service template + hal0-agent CLI shim (v0.3 PR-5) (#395)

* v0.3: hal0-cognee MemoryProvider (wraps hal0-memory REST; locks #317) (#394)

* v0.3: hal0-cognee MemoryProvider for Hermes

New Hermes-side memory plugin that wraps the hal0-memory REST API.
Vendored under src/hal0/agents/hermes/plugins/memory_cognee/ so the
installer can deploy it into Hermes's plugin tree at provision time.

- Subclasses upstream MemoryProvider ABC (per R3 holographic scaffold)
- httpx.AsyncClient to hal0-api at HAL0_MEMORY_BASE (default :8080)
- X-hal0-Agent identity header (ADR-0012 / PR #268)
- Omits explicit dataset field — server resolves via header (issue #317
  server-side fix in PR #366; this is the client-side completion)

Integration wiring depends on PR-3 (hermes_provision MCP register phase).
LXC smoke deferred to that PR.

Refs: hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): promote hermes.py to hermes/driver.py + re-export

Converting src/hal0/agents/hermes into a package (so memory_cognee/
can live under it) requires moving the original hermes.py module
content into the new package. Two-line migration:

- git mv src/hal0/agents/hermes.py → src/hal0/agents/hermes/driver.py
- hermes/__init__.py re-exports HermesDriver for backward compat
- driver.py _installer_script_path() parents[3] → parents[4]
  (one extra directory level now)
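The parents[3] → parents[4] change follows directly from the extra directory level. A small illustration, assuming a hypothetical checkout root of `/repo` (only the relative depth matters):

```python
from pathlib import PurePosixPath

# Hypothetical absolute paths before and after the move.
old = PurePosixPath("/repo/src/hal0/agents/hermes.py")
new = PurePosixPath("/repo/src/hal0/agents/hermes/driver.py")

old_root = old.parents[3]  # agents -> hal0 -> src -> /repo
new_root = new.parents[4]  # hermes -> agents -> hal0 -> src -> /repo
```

Leaving the index at 3 after the move would resolve to `/repo/src` instead of the repository root.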

Existing import `from hal0.agents.hermes import HermesDriver` continues
to work (e.g. tests/agents/test_hermes_wrapper.py:29).

Caught by CI on PR #394 (python 3.11 collection failure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: hermes_provision overhaul (MCP register, personas seed, prompt injection, CONFIG.md) (#396)

* v0.3: hermes_provision overhaul — MCP register, personas seed, prompt injection

Adds five new/reworked phases to hermes_provision (on top of PR-1's
filter + composite-upstream fixes):

- Phase 5 (config_write): now passes chat_slots + active persona's
  system_prompt_prelude + cached mcp_servers list on the first render
  so single-shot bootstrap lands the model_aliases + persona + MCP
  blocks all at once
- Phase 6 (mcp_wire): captures the live probe result in
  details.rendered_servers so Phase 5 (next run) and Phase 9 source
  the same canonical inventory; template loops over the list rather
  than hard-coding two server names
- Phase 7 (prompt injection, in config_write): persona TOML's
  system_prompt + the hal0 MCP usage block + approval policy summary
  composed by personas.build_prompt_addendum, rendered into
  agent.system_prompt_prelude
- Phase 8 (NEW persona_seed): seeds personas/{hermes,coder}.toml +
  active.txt -> hermes idempotently; --repair forces re-seed,
  operator edits and operator-chosen active persona survive normal
  re-runs (per master plan §6 user choice)
- Phase 9 (model_automap): demoted to idempotency check — passes the
  same persona + mcp_servers inputs as Phase 5 so hash-equal runs no-op

New module src/hal0/agents/personas.py:
- Persona/PersonaApproval dataclasses + from_dict/to_dict TOML
  round-trip (tomli_w for write, tomllib for read)
- load_persona, save_persona, list_personas (skips malformed with
  log+continue), get_active/set_active (atomic tmp+rename)
- seed_default_personas: idempotent persona file write with --repair
  overwrite semantics; preserves operator active-pointer choice
- build_prompt_addendum: composes hal0 MCP usage block + approval
  policy summary for the system prompt
- activate: write active.txt + best-effort JSON-RPC reload.env nudge
  to running Hermes (no full restart). PR-4 will wire the API
  endpoint to this helper
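The "atomic tmp+rename" write used for the active pointer can be sketched like this. It shows the general technique only; the signatures below are assumptions and may differ from the real get_active/set_active in personas.py.

```python
import contextlib
import os
import tempfile
from pathlib import Path


def set_active(personas_dir: Path, persona_id: str) -> None:
    """Atomically point active.txt at `persona_id` (temp file + rename)."""
    target = personas_dir / "active.txt"
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=personas_dir, prefix=".active.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(persona_id + "\n")
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, target)  # atomic replacement on POSIX
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise


def get_active(personas_dir: Path) -> str:
    """Read the active persona id back."""
    return (personas_dir / "active.txt").read_text(encoding="utf-8").strip()
```

A reader of active.txt sees either the old id or the new one, never a partially written file.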

New CLI subcommands under hal0 agent:
- reprovision <id> [--repair] — re-run bootstrap idempotently
- personas list — show personas + active marker
- personas show <id> — print the persona's TOML body
- personas activate <id> — switch active persona + nudge hot-reload

New docs/agents/hermes/CONFIG.md covers all eight config surfaces
(persona TOML, active pointer, overrides.yaml, config.yaml,
allowlist.toml, secrets env, provision.json, plugin manifests) with
write owners, precedence, and restart-vs-hot-reload semantics.
Addresses DA-arch must-fix #4 (master plan §1 #12, BLOCKING).

MCP registration verified against upstream Hermes config schema
(~/src/hermes-agent cli-config.yaml.example): mcp_servers map keyed
by server name with url + headers + timeout. ADR-0012 X-hal0-Agent
identity passthrough preserved.

Idempotency: new tests/agents/test_hermes_provision_idempotency.py
asserts byte-equal config.yaml + persona TOMLs across two consecutive
runs and verifies persona_seed sits before config_write in the phase
order so first-render system prompt is correct.

Live LXC smoke: reprovision is idempotent (no drift on re-run);
personas seeded under /var/lib/hal0/agents/hermes/personas/;
config.yaml carries system_prompt_prelude + personality + mcp_servers
with X-hal0-Agent: hermes-agent headers; CLI personas list/show/
activate all functional.

Refs: docs/internal/hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(api): clear _HAL0_MODEL_CACHE between tests (3.12 isolation)

PR-1's composite-upstream cache is module-level; PR-3's persona/provision
tests pollute it with `gemma3:1b`, which then leaks into
tests/api/test_v1_proxy.py::test_v1_models_still_handled_by_aggregator
under Python 3.12's test-collection ordering (3.11 collects in a
different order so the leak is masked).

Fix: autouse fixture in tests/api/conftest.py calls
_hal0_model_cache_clear() before and after each api test, matching the
helper's documented contract ("Tests also call this to keep state
isolated between cases").

Caught by CI on PR #396 python (3.12). PR-1 composite-cache helper:
src/hal0/api/__init__.py::_hal0_model_cache_clear.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: plugin host — manifest proxy + SDK shim + shadow-DOM isolation (#397)

hal0 dashboard consumes upstream Hermes plugin manifest. Kanban auto-
mounts as an agent tab in v0.3. SDK shim and isolation in place so any
future upstream plugin lands without hal0 code changes.

Backend:
- src/hal0/api/plugins/manifest_proxy.py — proxies /api/dashboard/plugins
  + /dashboard-plugins/<name>/* from hermes localhost
- Strips inbound Auth/Cookie; injects X-hal0-Agent outbound
- SRI verification (sha384/sha256/sha512) on bundles; mismatch returns 502
- Path-traversal validator (ported from GHSA-5qr3-c538-wm9j)
- CSP: script-src 'self' 'strict-dynamic' on manifest endpoint
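The SRI check on bundles follows the standard Subresource Integrity format, `<algo>-<base64 digest>`. A minimal sketch of that comparison, assuming hypothetical helper names (`sri_digest`, `sri_matches`) that are not taken from manifest_proxy.py:

```python
import base64
import hashlib
import hmac

_ALLOWED = ("sha256", "sha384", "sha512")


def sri_digest(body: bytes, algo: str = "sha384") -> str:
    """Return the SRI string for `body`, e.g. 'sha384-<base64 digest>'."""
    if algo not in _ALLOWED:
        raise ValueError(f"unsupported SRI algorithm: {algo}")
    raw = hashlib.new(algo, body).digest()
    return f"{algo}-{base64.b64encode(raw).decode('ascii')}"


def sri_matches(body: bytes, integrity: str) -> bool:
    """True when `integrity` names an allowed algorithm and matches `body`."""
    algo, sep, _ = integrity.partition("-")
    if not sep or algo not in _ALLOWED:
        return False
    # Constant-time comparison of the expected and declared values.
    return hmac.compare_digest(sri_digest(body, algo).encode(), integrity.encode())
```

In the proxy described above, a False result is the case that maps to the 502 response.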

UI:
- ui/src/dash/agents/plugin-host.jsx — PluginTabHost with shadow DOM
  per plugin, ErrorBoundary, hal0 CSS token bridge
- ui/src/dash/agents/plugin-sdk-shim.js — window.__HERMES_PLUGIN_SDK__
  mirroring upstream registry.ts:107-150 shape, plus
  window.__HAL0_PLUGINS__ alias for forward compat
- One new "Plugins" tab in AgentView nav (minimal extras.jsx edit;
  PR-8 owns the monolith split)

Refs MASTER-PLAN.md §4 PR-7. Addresses DA-sec-ops MUST-FIX #2 + #4.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: /api/agents/{id}/personas endpoints + hot-reload activate (#399)

New FastAPI router under src/hal0/api/agents/personas.py exposing the
persona TOML store that PR-3 introduced.

Routes:
- GET /api/agents/{id}/personas — list of {id, display_name, summary, active}
- GET /api/agents/{id}/personas/{pid} — detail (parsed + raw TOML)
- POST /api/agents/{id}/personas/{pid}/activate — write active.txt and
  call PR-3's persona-activation helper (sends reload.env JSON-RPC to a
  running Hermes if reachable; no-op when offline)

Agent id is parameterized from day 1 (master plan §2 generalization).
v0.3 only resolves "hermes" — pi-coder adds a registry entry in v0.4.

Refs MASTER-PLAN.md §4 PR-4.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: chat WS proxy + session REST shim for hermes (Origin+HMAC, no PTY) (#398)

Bridges browser to Hermes dashboard mode running on 127.0.0.1:9119
(per PR-5 systemd ExecStart). JSON-RPC over WebSocket + REST shim for
session operations. Replaces PR-1's xterm-PTY plan after DA-ux + DA-sec-ops
killed it (master plan §1 pivot #1).

Routes:
- WS /api/agents/{id}/events — mirrors hermes JSON-RPC event bus
- WS /api/agents/{id}/submit — bidi JSON-RPC client->hermes
- GET /api/agents/{id}/session/handshake — mints HMAC session cookie
- POST /api/agents/{id}/session/{create,resume}
- GET /api/agents/{id}/session/history

Security (DA-sec-ops MUST-FIX #2, #3):
- Hermes bound 127.0.0.1; proxy bridges browser to loopback
- Origin allowlist (config-driven via HAL0_ALLOWED_ORIGINS)
- HMAC session cookie issued on dashboard handshake; verified on every WS upgrade
- Authorization: Bearer <token> outbound to hermes (NEVER query string)
- runtime.json + secret.bin chmod 0600 on read
- uvicorn access log middleware scrubs query strings
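An HMAC session cookie of the kind listed above can be minted and verified with the standard library. The sketch below is illustrative only: the `agent_id.expires.signature` layout, the SHA-256 choice, the TTL, and both function names are assumptions, not the format this PR ships.

```python
import hashlib
import hmac
import time
from typing import Optional


def mint_cookie(secret: bytes, agent_id: str, ttl_s: int = 3600,
                now: Optional[float] = None) -> str:
    """Return 'agent_id.expires.signature' signed with HMAC-SHA256."""
    expires = int((time.time() if now is None else now) + ttl_s)
    payload = f"{agent_id}.{expires}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_cookie(secret: bytes, cookie: str,
                  now: Optional[float] = None) -> Optional[str]:
    """Return the agent id when the cookie is authentic and unexpired, else None."""
    payload, sep, sig = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        return None  # signature mismatch: tampered or signed with another key
    agent_id, sep, expires = payload.rpartition(".")
    if not sep or not expires.isdigit():
        return None
    if (time.time() if now is None else now) >= int(expires):
        return None  # expired
    return agent_id
```

Verification on every WS upgrade then reduces to one `verify_cookie` call before the handshake is accepted.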

Backpressure: server-side coalesce tool.progress events at 100ms,
keyed by tool_id. Non-progress events flush the buffer first so
tool.complete never lands before its preceding tool.progress.
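The coalescing rule above (keep only the newest tool.progress per tool_id within a 100ms window, and flush the buffer before any other event) can be sketched as a small synchronous class. The class name, the event dict shape, and the explicit `now` argument are assumptions made so the ordering guarantee is easy to see; the real proxy is presumably async.

```python
from typing import Dict, List, Optional

WINDOW_S = 0.100  # coalescing window from the description above


class ProgressCoalescer:
    def __init__(self) -> None:
        self._pending: Dict[str, dict] = {}  # tool_id -> newest tool.progress event
        self._window_start: Optional[float] = None

    def push(self, event: dict, now: float) -> List[dict]:
        """Accept one event; return the events to forward right now, in order."""
        if event.get("type") == "tool.progress":
            if self._window_start is None:
                self._window_start = now
            self._pending[event["tool_id"]] = event  # newer progress replaces older
            return self.tick(now)
        # Non-progress events flush buffered progress first to preserve ordering.
        return self._drain() + [event]

    def tick(self, now: float) -> List[dict]:
        """Flush buffered progress once the window has elapsed."""
        if self._window_start is not None and now - self._window_start >= WINDOW_S:
            return self._drain()
        return []

    def _drain(self) -> List[dict]:
        out = list(self._pending.values())
        self._pending.clear()
        self._window_start = None
        return out
```

Because `push` drains the buffer before returning a non-progress event, a tool.complete is always preceded by the last tool.progress seen for that tool.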

Refs MASTER-PLAN.md §4 PR-9. Addresses DA-sec-ops MUST-FIX #2 + #3.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: SidebarAgentBlock — service/persona/approvals/skills/memory + [Open chat] (#400)

New compact agent status block mounted in the left sidebar next to
lemond's SidebarStatusBlock. Replaces the stats card that used to live
in the Agents page Overview tab; the chat surface (PR-10) will take
that main-pane slot.

Renders:
- Service status dot (green/amber/red)
- Active persona name (from /api/agents/{id}/personas)
- Approvals pending count (red badge if >0)
- Skills count (existing /api/agents/skills)
- Memory writes count
- MCP server status pip (hal0-memory + hal0-admin)
- [Open chat] button + empty "Install Hermes" CTA

Polling: TanStack Query 5s refetch + revalidate-on-focus (master plan §2
state-mgmt policy: TanStack for fetch/cache, zustand only for runtime
state). Mounted via window-globals to match the existing build shim.

Refs MASTER-PLAN.md §4 PR-6.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: dashboard refactor — drop Inbox, fold Peers, split AgentView monolith (#401)

Six discrete UI changes per master plan §4 PR-8 + p4 dashboard refactor:

1. Inbox tab DELETED (approvals UX now via sidebar pip from PR-6 + future
   inline approval cards in PR-10 HermesChat).
2. Peers tab folded into Memory tab as "Peer memory" subsection — the
   live MCP search Peers used (R5 finding) is preserved, not deleted.
3. AgentView 974-LOC monolith split into ui/src/dash/agents/{agent-view,
   hermes-chat-tab,personas-tab,skills-tab,memory-tab,plugins-tab}.jsx.
4. HermesChatTab is now the default tab (placeholder; PR-10 fills in
   composer + transcript).
5. data.jsx purged of agent-related mock entries (HAL0_DATA.approvals).
6. Old test.skip-only agent-v3.spec.ts deleted; new minimal smoke spec
   agent-view-v3.spec.ts covers nav + default tab + Inbox/Peers removal +
   #peers legacy redirect.

Window-globals build shim preserved. Backend untouched.

Refs MASTER-PLAN.md §4 PR-8.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: ADR-0015 upstream Hermes pin + weekly hermes-sdk-diff CI job (#403)

DA-arch must-fix #1 ("Hermes is HOT upstream — ~40 commits/day,
registry.ts churns ~151 LOC/month") demanded an explicit upgrade lane.
Pin is now recorded in pyproject.toml under
[tool.hal0.upstream-hermes], a weekly job diffs upstream HEAD against
the pin for the surfaces hal0 depends on (registry.ts, slots.ts,
web_server.py, memory_provider.py, tools/registry.py, agent/events.py),
and opens a single upstream-drift/triage labeled issue on drift —
same one-issue-per-state shape as agent-shim-smoke.yml's notify job.

Operators can run scripts/hermes-sdk-diff.sh locally with the same
contract — exits 0 on no drift, 1 on drift, 2 on operational error.
Supports --dry-run (parse pin, print plan, no clone) and --bump <sha>
(rewrite the pin in-place inside the bump PR).

Bumps go through ADR-0015 §4: review drift issue → edit shim adapter
if needed → scripts/hermes-sdk-diff.sh --bump <sha> → delta-harness +
gamma-suite → open chore(hermes): bump upstream pin to <short-sha> PR.
48h freeze window around any v0.x release tag (reviewer-disciplined).

ADR number is 0015, not 0014 — ADR-0014 was already used for the
Cognee graph-extraction model gate (PR-3 territory).

Refs MASTER-PLAN.md §4 PR-12 + §5 upstream upgrade cadence.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: HermesChat surface — React composer + transcript + sidecar (#404)

Replaces PR-8's HermesChatTab placeholder with a hal0-native React chat
surface that streams Hermes JSON-RPC events over the WebSocket proxy
from PR-9. No xterm; no PTY; no Tailwind v4 (master plan §1 pivot #1 +
DA-ux #1).

New components under ui/src/dash/agents/chat/:
- Composer (Enter submits, Shift+Enter newline — user §6 decision)
- Transcript (sticky-bottom auto-scroll)
- MessageBubble / Markdown / ToolCallCard / ApprovalCard / ThinkingIndicator
- HermesSidecar (PersonaSwitcher + ModelBadge + MCPStatusRow + AgentControls)
- use-hermes-session: external store + WS connection manager

WS event routing covers every R1 taxonomy entry: message.{start,delta,
complete}, thinking/reasoning, tool.{start,progress,complete},
approval.request, status.update, error, sudo/clarify/secret.request.

Approvals UX: inline ApprovalCard + sidebar pip pulse + toast top-right
(user §6 #4: no desktop notification permission). Persona hot-swap
on next turn via POST /api/agents/hermes/personas/{pid}/activate (PR-4).

First-run hook: when sessionId is missing on connect, fire session.create
with first_run=true so Hermes auto-emits the welcome message per PR-3
system-prompt addendum. Sent from the submit-WS onopen handler so the
envelope can't race the WS becoming writable.

Mobile: composer sticky bottom + sidecar collapses to bottom sheet <768px
via the .hermes-chat-sheet-toggle pill.

State mgmt split per master plan §2: hand-rolled external store + React
useSyncExternalStore for runtime state (transcript, session, conn
state); TanStack Query (via window-globals bridges) for fetch/cache
(personas, mcp pip, model badge). Window-globals build shim preserved.

Reconnect strategy (PR-9 contract — proxy is stateless):
  jittered backoff base=250ms cap=4s with 1.0–1.5x jitter per step
  capped at attempt 5; handshake retried on every reconnect; session
  resumed via session.resume when a sessionId is held.
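One way to read the reconnect schedule above is sketched here. Two points are assumptions rather than facts from this PR: that the base delay doubles per attempt (which happens to reach the 4s cap exactly at attempt 5), and that the 1.0–1.5x jitter is applied after the cap. The shipped implementation lives in the UI code and may differ; `reconnect_delay` is a hypothetical name.

```python
import random
from typing import Callable

BASE_S = 0.250   # base=250ms
CAP_S = 4.0      # cap=4s
MAX_STEP = 5     # growth stops at attempt 5


def reconnect_delay(
    attempt: int,
    jitter: Callable[[], float] = lambda: random.uniform(1.0, 1.5),
) -> float:
    """Delay in seconds before reconnect `attempt` (1-based)."""
    step = max(1, min(attempt, MAX_STEP))
    # Assumed doubling: 0.25, 0.5, 1, 2, 4 seconds for steps 1..5.
    return min(CAP_S, BASE_S * 2 ** (step - 1)) * jitter()
```

Under these assumptions the longest possible wait is 6 seconds (4s cap times 1.5x jitter).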

Tests: tests/e2e/specs/hermes-chat.spec.ts (14 cases) backed by the
new tests/e2e/fixtures/wsHarness.ts WebSocket shim — covers composer
submit/Shift+Enter, message streaming, tool cards, approval card +
approve.respond, persona switch, restart confirm, reconnect, mobile
sheet, first-run session.create.

agent-view-v3.spec.ts updated: chat tab now shows hermes-chat-surface
(PR-10 surface) instead of hermes-chat-placeholder (PR-8 stub).

Refs MASTER-PLAN.md §4 PR-10 + §1 pivot #1 + §6 user decisions.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: tests + docs sweep + final missing endpoints (#405)

Closes out v0.3 Hermes integration before fold-to-main. Adds the three
endpoints PR-10/PR-6/PR-8 flagged as missing during integration:
- POST /api/agents/{id}/restart — systemctl restart wrapper
- GET /api/agents/skills — replaces static catalog
- GET /api/agents/{id}/memory/stats — pulls from hal0-memory MCP

Tests:
- unit endpoint coverage for each new route
- δ-harness integration: full chat WS roundtrip (mock hermes); persona
  activate roundtrip

Docs (master plan §1 #16):
- AGENTS.md narrative refresh for v0.3 reality
- ARCHITECTURE.md agents section + new module map
- CONTEXT.md glossary: composer, transcript, plugin host, sidecar agent
  block, persona TOML, hal0-cognee, hermes-sdk-diff, HMAC session cookie,
  X-hal0-Agent, composite hal0 upstream
- CHANGELOG.md v0.3.x-alpha entry covering PR-1..12
- ADR-0016: v0.3 Hermes integration decisions (cross-link master plan)
- docs/agents/hermes/CONFIG.md + SERVICE.md verification

Follow-up: hal0-web CONTENT_BRIEF + Astro updates land in a sibling PR
on Hal0ai/hal0-web (separate repo, separate review cadence).

Refs MASTER-PLAN.md §4 PR-11.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(adr): renumber v0.3 integration ADRs to 0018/0019 (avoid main collision)

`main` shipped its own ADR-0015 (`0015-mcp-as-host-platform.md`) and
ADR-0017 (`0017-bell-inbox-approval-ux.md`) via PR #389 while the v0.3
integration was in flight on `docs/v0.3-agents-mcp-memory`. To fold
this branch into `main` without an ADR-number collision:

- 0015-upstream-hermes-pin-and-upgrade.md → 0018-upstream-hermes-pin-and-upgrade.md
- 0016-v0_3-hermes-integration.md         → 0019-v0_3-hermes-integration.md

Updated every cross-reference (commit messages stay historical):
AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, CONTEXT.md, pyproject.toml,
scripts/hermes-sdk-diff.sh, src/hal0/api/__init__.py,
src/hal0/api/agents/skills.py, docs/agents/hermes/CONFIG.md, and
the two renumbered ADR files' self-references.

`docs/mcp/overview.md` carries a stale "no ADR-0015 in main yet"
note that pre-dates main's ADR-0015 ship; left for the integration-PR
merge to resolve against current main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): drop PR-11 duplicate /api/agents/skills endpoint (main shipped one)

PR-11 added a static-catalog /api/agents/skills endpoint + tests
assuming the route was new. Main shipped an equivalent endpoint via
PR #364 (src/hal0/api/routes/agents.py:76) that already serves the
sidebar. Registering both produced a route collision; FastAPI dispatch
order meant PR-11's tests asserted against main's older shape and
failed CI on python (3.11) — 11 assertions / KeyError cascade.

Delete:
- src/hal0/api/agents/skills.py (PR-11 static catalog endpoint)
- tests/agents/test_agent_skills_endpoint.py (asserted PR-11's shape)
- import + include_router stanza for the deleted module

Main's endpoint continues to serve /api/agents/skills returning
{skills:[...], count:N} which is what `useSidebarAgentRollup` consumes.

PR-11's drift-bump intent (one PR per upstream tools/registry.py
change, gated by ADR-0018 weekly diff) was never implemented and is
duplicated by main's persona.AGENT_SKILLS catalog. Future v0.4 work
can revisit if a richer catalog is needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>