test(openwebui): add CI smoke test proving prewire end-to-end #4
Merged
The bundled OpenWebUI install relies on env_writer.py producing /etc/hal0/openwebui.env so the container talks to hal0's /v1 API on boot. We had no test proving the round trip actually works: an override-key typo or a missing `--add-host` flag would only surface on a user's first launch.
Add tests/openwebui/test_prewire_smoke.py: a real `docker run` of ghcr.io/open-webui/open-webui:main against a uvicorn-hosted hal0-api fed by a Python stub upstream returning OpenAI-shaped /v1/models. After signing up the bootstrap admin (OpenWebUI requires a Bearer token for /api/models even with WEBUI_AUTH=False), the test polls the merged /api/models view until the stub-served model id appears, proving that env file → container env → upstream proxy → response is intact.
Wire a parallel `openwebui-prewire-smoke` job into the Integration (β) workflow alongside slot-integration. The job pre-pulls the 2 GB image outside the pytest timeout and captures container logs on failure.
Tests are gated behind @pytest.mark.integration plus a `docker info` preflight so `make test` on the dev VM still skips cleanly.
Co-Authored-By: Claude <noreply@anthropic.com>
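The polling step in the smoke test can be sketched as below. This is a minimal illustration, not the code in tests/openwebui/test_prewire_smoke.py: the helper name `wait_for_model`, its parameters, and the injected `fetch_model_ids` callable are all hypothetical, and the real test presumably fetches /api/models over HTTP with the Bearer token.

```python
import time
from typing import Callable, Iterable


def wait_for_model(
    fetch_model_ids: Callable[[], Iterable[str]],
    expected_id: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until expected_id shows up in the fetched ids or the timeout elapses.

    fetch_model_ids is a hypothetical callable standing in for an
    authenticated GET of the merged /api/models view.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if expected_id in set(fetch_model_ids()):
                return True
        except OSError:
            # The container may still be booting; treat connection
            # failures as "not ready yet" and keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper testable without a running container; a monotonic clock avoids false timeouts if the wall clock jumps.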
thinmintdev added a commit that referenced this pull request on May 27, 2026
* fix(slots): zero-red-dots bundle — 7 fixes
Backend
- manager.py: persist explicit model_id to slot TOML on load/swap
so reconciliation never drifts back to "no model.default set"
ERROR (Fix #1)
- manager.py: _fail_watch transitions to OFFLINE (clean evict)
instead of ERROR (red) when lemond drops a loaded model. RED
reserved for spawn/health/load exceptions (Fix #2)
- manager.py: load() short-circuits to OFFLINE+CTA when there's
no resolvable model, instead of letting lemonade.load() throw
and stamp ERROR every tick (Fix #4)
- manager.py: reconcile_unconfigured_slots() one-shot startup
pass migrates pre-fix stuck ERRORs to OFFLINE so the dashboard
re-renders correctly without operator action (Fix #2/#4 cleanup)
- api/__init__.py: wire reconcile_unconfigured_slots into lifespan
- tests/slots/test_fail_watcher.py: assert OFFLINE+evict semantics
Frontend
- dashboard.css: .dot.serving uses --ok (green) not --accent (Fix #3)
- slot-modals.jsx: InlineSwapPopover chevron is now an
independently-clickable <button> (own onClick + stopPropagation
+ keyboard handler); CSS adds focus-visible outline + hover
feedback (Fix #5)
- models.jsx: ModelDetail "Load now" wired through useSlotSwap
against the first compatible slot; toast on multi-match (Fix #6)
- slots.jsx: slotIndicator() rewritten so GREEN only fires while
actively SERVING; loaded+waiting (ready/idle/lemo=loaded) and
evicted (lemo=idle) both map to YELLOW. 1h hung-request guard
flips a long-in-SERVING slot back to YELLOW with "stuck?"
label (Fix #7)
* fix(slots): review-pass amendments — a11y + ERROR audit log
Backend
- manager.py: log.error('slot.error', extra={...reason...}) on
every ERROR transition so journald carries a durable audit
trail in addition to the SSE event bus (closes user-spec
audit demand #6 logging gap). NOTE: extra= cannot reuse
'message' — it's a reserved LogRecord attribute and stdlib
logging raises KeyError on collision; the gotcha is documented
inline.
Frontend
- slot-modals.jsx: dropped role="button" / tabIndex / onKeyDown
from .swap-pop-item rows. The nested chevron <button> is the
single keyboard/AT-accessible affordance — making the row
ALSO a button created a double-announcement for screen
readers. Mouse onClick on the row body still works.
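The `extra=` gotcha noted in the backend change above is standard-library behaviour and can be reproduced in isolation. The sketch below is illustrative only: the logger name and the `slot` key are assumptions, not taken from manager.py.

```python
import logging

# Hypothetical logger name; the real module's logger is not shown in the commit.
log = logging.getLogger("hal0.slots.demo")


def log_slot_error(slot: str, reason: str) -> None:
    """Emit an audit record; 'slot' and 'reason' do not clash with LogRecord."""
    log.error("slot.error", extra={"slot": slot, "reason": reason})


def message_key_collides() -> bool:
    """Return True if passing 'message' via extra= raises KeyError."""
    try:
        # 'message' is reserved: Logger.makeRecord refuses to overwrite it.
        log.error("slot.error", extra={"message": "boom"})
    except KeyError:
        return True
    return False
```

Using a distinct key such as `reason` sidesteps the collision while keeping the structured field available to journald formatters.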
* test(e2e): update slot-indicator spec for 2026-05-27 dot-state contract
Pre-existing tests asserted the OLD READY+fresh → green / READY+stale → yellow rule. Per the user spec, GREEN now fires only on state=serving (in-flight); all loaded-and-waiting states (ready / lemo=loaded / idle / lemo=idle) map to yellow. Added coverage for serving (fresh + stuck), !enabled, lemonade_state=loaded, and lemonade_state=idle.
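The dot-state contract described above can be written down as a small pure function. This is a Python sketch of the rule, not the slots.jsx implementation: the function name, the `serving_for_s` parameter, and the grey colour for disabled or offline slots are assumptions, and the separate lemonade_state input is left out for brevity.

```python
from typing import Optional, Tuple

STUCK_AFTER_S = 3600.0  # the 1h hung-request guard from the commit message


def slot_indicator(
    state: str, enabled: bool = True, serving_for_s: float = 0.0
) -> Tuple[str, Optional[str]]:
    """Map a slot state to (colour, label) under the 2026-05-27 contract."""
    if not enabled:
        return ("grey", None)  # assumption: disabled slots render neutral
    if state == "error":
        return ("red", None)  # red is reserved for spawn/health/load failures
    if state == "serving":
        if serving_for_s >= STUCK_AFTER_S:
            return ("yellow", "stuck?")  # long-running request: demote
        return ("green", None)  # green only while a request is in flight
    if state in ("ready", "idle"):
        return ("yellow", None)  # loaded-and-waiting or evicted
    return ("grey", None)  # assumption: offline/unknown render neutral
```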
Merged
thinmintdev added a commit that referenced this pull request on May 28, 2026
…injection, CONFIG.md) (#396)
* v0.3: hermes_provision overhaul: MCP register, personas seed, prompt injection
Adds five new/reworked phases to hermes_provision (on top of PR-1's filter + composite-upstream fixes):
- Phase 5 (config_write): now passes chat_slots + active persona's system_prompt_prelude + cached mcp_servers list on the first render so single-shot bootstrap lands the model_aliases + persona + MCP blocks all at once
- Phase 6 (mcp_wire): captures the live probe result in details.rendered_servers so Phase 5 (next run) and Phase 9 source the same canonical inventory; template loops over the list rather than hard-coding two server names
- Phase 7 (prompt injection, in config_write): persona TOML's system_prompt + the hal0 MCP usage block + approval policy summary composed by personas.build_prompt_addendum, rendered into agent.system_prompt_prelude
- Phase 8 (NEW persona_seed): seeds personas/{hermes,coder}.toml + active.txt -> hermes idempotently; --repair forces re-seed, operator edits and operator-chosen active persona survive normal re-runs (per master plan §6 user choice)
- Phase 9 (model_automap): demoted to idempotency check; passes the same persona + mcp_servers inputs as Phase 5 so hash-equal runs no-op
New module src/hal0/agents/personas.py:
- Persona/PersonaApproval dataclasses + from_dict/to_dict TOML round-trip (tomli_w for write, tomllib for read)
- load_persona, save_persona, list_personas (skips malformed with log+continue), get_active/set_active (atomic tmp+rename)
- seed_default_personas: idempotent persona file write with --repair overwrite semantics; preserves operator active-pointer choice
- build_prompt_addendum: composes hal0 MCP usage block + approval policy summary for the system prompt
- activate: write active.txt + best-effort JSON-RPC reload.env nudge to running Hermes (no full restart). PR-4 will wire the API endpoint to this helper
New CLI subcommands under hal0 agent:
- reprovision <id> [--repair]: re-run bootstrap idempotently
- personas list: show personas + active marker
- personas show <id>: print the persona's TOML body
- personas activate <id>: switch active persona + nudge hot-reload
New docs/agents/hermes/CONFIG.md covers all eight config surfaces (persona TOML, active pointer, overrides.yaml, config.yaml, allowlist.toml, secrets env, provision.json, plugin manifests) with write owners, precedence, and restart-vs-hot-reload semantics. Addresses DA-arch must-fix #4 (master plan §1 #12, BLOCKING).
MCP registration verified against upstream Hermes config schema (~/src/hermes-agent cli-config.yaml.example): mcp_servers map keyed by server name with url + headers + timeout. ADR-0012 X-hal0-Agent identity passthrough preserved.
Idempotency: new tests/agents/test_hermes_provision_idempotency.py asserts byte-equal config.yaml + persona TOMLs across two consecutive runs and verifies persona_seed sits before config_write in the phase order so first-render system prompt is correct.
Live LXC smoke: reprovision is idempotent (no drift on re-run); personas seeded under /var/lib/hal0/agents/hermes/personas/; config.yaml carries system_prompt_prelude + personality + mcp_servers with X-hal0-Agent: hermes-agent headers; CLI personas list/show/activate all functional.
Refs: docs/internal/hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(api): clear _HAL0_MODEL_CACHE between tests (3.12 isolation)
PR-1's composite-upstream cache is module-level; PR-3's persona/provision tests pollute it with `gemma3:1b`, which then leaks into tests/api/test_v1_proxy.py::test_v1_models_still_handled_by_aggregator under Python 3.12's test-collection ordering (3.11 collects in a different order so the leak is masked).
Fix: autouse fixture in tests/api/conftest.py calls _hal0_model_cache_clear() before and after each api test, matching the helper's documented contract ("Tests also call this to keep state isolated between cases").
Caught by CI on PR #396 python (3.12). PR-1 composite-cache helper: src/hal0/api/__init__.py::_hal0_model_cache_clear.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev added a commit that referenced this pull request on May 28, 2026
…397)
hal0 dashboard consumes upstream Hermes plugin manifest. Kanban auto-mounts as an agent tab in v0.3. SDK shim and isolation in place so any future upstream plugin lands without hal0 code changes.
Backend:
- src/hal0/api/plugins/manifest_proxy.py: proxies /api/dashboard/plugins + /dashboard-plugins/<name>/* from hermes localhost
- Strips inbound Auth/Cookie; injects X-hal0-Agent outbound
- SRI verification (sha384/sha256/sha512) on bundles; mismatch returns 502
- Path-traversal validator (ported from GHSA-5qr3-c538-wm9j)
- CSP: script-src 'self' 'strict-dynamic' on manifest endpoint
UI:
- ui/src/dash/agents/plugin-host.jsx: PluginTabHost with shadow DOM per plugin, ErrorBoundary, hal0 CSS token bridge
- ui/src/dash/agents/plugin-sdk-shim.js: window.__HERMES_PLUGIN_SDK__ mirroring upstream registry.ts:107-150 shape, plus window.__HAL0_PLUGINS__ alias for forward compat
- One new "Plugins" tab in AgentView nav (minimal extras.jsx edit; PR-8 owns the monolith split)
Refs MASTER-PLAN.md §4 PR-7. Addresses DA-sec-ops MUST-FIX #2 + #4.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
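The SRI check on plugin bundles mentioned in the backend list can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in manifest_proxy.py: `sri_digest` and `verify_sri` are hypothetical names, and the sketch accepts a match on any supported algorithm, whereas the W3C algorithm considers only the strongest one listed.

```python
import base64
import hashlib
import hmac

_SRI_ALGOS = {
    "sha256": hashlib.sha256,
    "sha384": hashlib.sha384,
    "sha512": hashlib.sha512,
}


def sri_digest(body: bytes, algo: str = "sha384") -> str:
    """Build an integrity token such as 'sha384-<base64 digest>'."""
    digest = _SRI_ALGOS[algo](body).digest()
    return f"{algo}-{base64.b64encode(digest).decode('ascii')}"


def verify_sri(body: bytes, integrity: str) -> bool:
    """Return True if body matches any supported token in the integrity string."""
    for token in integrity.split():
        algo, sep, expected = token.partition("-")
        if not sep or algo not in _SRI_ALGOS:
            continue  # unknown or malformed tokens are ignored
        expected = expected.split("?", 1)[0]  # drop optional '?options'
        actual = base64.b64encode(_SRI_ALGOS[algo](body).digest()).decode("ascii")
        if hmac.compare_digest(actual, expected):
            return True
    return False
```

A proxy following the commit's description would answer 502 when `verify_sri` returns False rather than forwarding the bundle.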
thinmintdev added a commit that referenced this pull request on May 28, 2026
Replaces PR-8's HermesChatTab placeholder with a hal0-native React chat surface that streams Hermes JSON-RPC events over the WebSocket proxy from PR-9. No xterm; no PTY; no Tailwind v4 (master plan §1 pivot #1 + DA-ux #1).
New components under ui/src/dash/agents/chat/:
- Composer (Enter submits, Shift+Enter newline; user §6 decision)
- Transcript (sticky-bottom auto-scroll)
- MessageBubble / Markdown / ToolCallCard / ApprovalCard / ThinkingIndicator
- HermesSidecar (PersonaSwitcher + ModelBadge + MCPStatusRow + AgentControls)
- use-hermes-session: external store + WS connection manager
WS event routing covers every R1 taxonomy entry: message.{start,delta,complete}, thinking/reasoning, tool.{start,progress,complete}, approval.request, status.update, error, sudo/clarify/secret.request.
Approvals UX: inline ApprovalCard + sidebar pip pulse + toast top-right (user §6 #4: no desktop notification permission).
Persona hot-swap on next turn via POST /api/agents/hermes/personas/{pid}/activate (PR-4).
First-run hook: when sessionId is missing on connect, fire session.create with first_run=true so Hermes auto-emits the welcome message per PR-3 system-prompt addendum. Sent from the submit-WS onopen handler so the envelope can't race the WS becoming writable.
Mobile: composer sticky bottom + sidecar collapses to bottom sheet <768px via the .hermes-chat-sheet-toggle pill.
State mgmt split per master plan §2: hand-rolled external store + React useSyncExternalStore for runtime state (transcript, session, conn state); TanStack Query (via window-globals bridges) for fetch/cache (personas, mcp pip, model badge). Window-globals build shim preserved.
Reconnect strategy (PR-9 contract: proxy is stateless): jittered backoff base=250ms cap=4s with 1.0–1.5x jitter per step capped at attempt 5; handshake retried on every reconnect; session resumed via session.resume when a sessionId is held.
Tests: tests/e2e/specs/hermes-chat.spec.ts (14 cases) backed by the new tests/e2e/fixtures/wsHarness.ts WebSocket shim. Covers composer submit/Shift+Enter, message streaming, tool cards, approval card + approve.respond, persona switch, restart confirm, reconnect, mobile sheet, first-run session.create.
agent-view-v3.spec.ts updated: chat tab now shows hermes-chat-surface (PR-10 surface) instead of hermes-chat-placeholder (PR-8 stub).
Refs MASTER-PLAN.md §4 PR-10 + §1 pivot #1 + §6 user decisions.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
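The reconnect strategy described above (base=250ms, cap=4s, 1.0–1.5x jitter, capped at attempt 5) can be sketched as a delay function. The commit does not state the growth factor or whether the cap applies before or after jitter, so this Python sketch assumes doubling per attempt (250ms doubles to 4s exactly at attempt 5) and clamps after jitter; the function name is hypothetical and the real logic lives in the JS session hook.

```python
import random
from typing import Optional

BASE_S = 0.25        # base=250ms
CAP_S = 4.0          # cap=4s
MAX_EXP_ATTEMPT = 5  # growth stops at attempt 5


def reconnect_delay(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Return the delay in seconds before reconnect attempt N (1-based)."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    source = rng if rng is not None else random
    step = min(attempt, MAX_EXP_ATTEMPT)
    raw = BASE_S * (2 ** (step - 1))            # assumption: doubling per attempt
    jittered = raw * source.uniform(1.0, 1.5)   # 1.0-1.5x jitter per step
    return min(jittered, CAP_S)                 # assumption: clamp after jitter
```

Passing a seeded `random.Random` makes the schedule reproducible in tests while production callers can rely on the module-level generator.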
thinmintdev added a commit that referenced this pull request on May 28, 2026
* docs(agents,mcp,memory): user-facing docs for v0.3 surface (identity + private-ns are placeholders pending rename + #317) Adds 10 docs pages covering the v0.3 agents / MCP / memory surface: - docs/agents/overview.md — what an agent is in hal0, install/lifecycle, v0.3 = Hermes only. - docs/agents/hermes-bootstrap.md — the 12-phase pipeline + plugin model + state paths. - docs/agents/identity.md — ADR-0011 identity cards + the X-hal0-Agent target shape, with TO BE DOCUMENTED placeholder for the server-side header read still pending. - docs/agents/mcp-client.md — ADR-0013 per-agent allow-list (refresh of the deleted PR #295 file, updated post-ADR-0012 to remove the inbound-bearer framing). - docs/mcp/overview.md — hal0 as MCP host; transport, mount, identity (no auth post-ADR-0012). - docs/mcp/hal0-admin.md — tool taxonomy (25 tools), gating, REST passthrough, audit, secret redaction. - docs/mcp/hal0-memory.md — four tools, dataset model, REST shims, on-disk layout. - docs/memory/overview.md — Cognee engine, datasets, surfaces, source stamping. - docs/memory/graph.md — refresh of the deleted PR #294 file; ADR-0014 model gate, three routes, CLI + REST + dashboard. - docs/memory/private-namespacing.md — target shape for private:<agent_id>, with TO BE DOCUMENTED placeholder for issue #317. Every claim is anchored to a src/ path, an ADR, or a PR/issue number. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(agents): internal agent contracts — issue tracker, triage labels, domain glossary Adds AGENTS.md (top-level pointer) plus docs/agents/{domain,issue-tracker,triage-labels}.md covering the conventions agents follow when working in this repo: gh CLI on Hal0ai/hal0, default triage label vocabulary, and single-context CONTEXT.md domain doc pointer. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agents,api): v0.3 agent plumbing hot-fix bundle (#393) Four fixes that together restore the hal0 provider on a fresh Hermes install. Without all four shipping together, PR-1 in isolation produced no user-visible improvement (R4 + DA-arch). 1. ``_collect_chat_slots`` filter (R4 H1) — the live ``/api/slots`` payload uses ``type=="llm"`` for chat slots and ``kind=="local"`` for the deployment shape. The previous ``_slot_kind``-first check looked at ``kind`` first and rejected 100% of real slots; ``model_aliases:`` never rendered. Filter now matches on ``type == "llm"`` and gates on ``_is_ready`` so only loaded models surface as aliases. 2. ``/api/upstreams`` dedup (R4 H2) — replaced per-slot upstream autoregistration with one composite ``hal0`` upstream pointed at hal0-api's own ``/v1``. Aggregates every chat-capable slot's model id through a new ``_fetch_hal0_composite_models`` helper with a 5s TTL cache (``time.monotonic()``-keyed module dict, NOT ``functools.lru_cache`` since that has no time-based expiry). The ``/v1/models`` handler short-circuits the composite case so it doesn't recurse over HTTP. The ``slot.state`` ready-edge subscriber punches the cache. Eliminates the duplicate ``primary`` + ``agent-hermes`` entries both pointing at ``127.0.0.1:8001``. 3. Removed legacy ``Hal0Profile`` plugin (R4 H4) — it hardcoded ``base_url=http://127.0.0.1:8000/api/v1`` which has no listener; the composite ``hal0`` upstream from fix #2 supersedes it. Install phase now stages only ``hal0-memory``; legacy plugin dir cleanup is idempotent. 4. ``hal0-memory`` client — stopped sending ``dataset="private:hermes-agent"``. The server resolves dataset from ``X-hal0-Agent`` + ``X-hal0-Private`` headers since PR #366; the client-side ``private:`` prefix was rejected by ``_AGENT_ID_PATTERN`` and silently 4xx'd every memory write. 
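Fix 2 above describes a 5s TTL cache kept in a module-level dict keyed on `time.monotonic()`, chosen over `functools.lru_cache` because the latter has no time-based expiry. A minimal sketch of that pattern follows; `cached_models` and `cache_clear` are hypothetical names, not the real `_fetch_hal0_composite_models` or `_hal0_model_cache_clear` helpers.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

_TTL_S = 5.0
# key -> (monotonic timestamp of last fetch, cached model ids)
_CACHE: Dict[str, Tuple[float, List[str]]] = {}


def cached_models(
    key: str,
    fetch: Callable[[], Iterable[str]],
    ttl_s: float = _TTL_S,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return cached ids for key, refetching once the entry is ttl_s old."""
    now = clock()
    hit = _CACHE.get(key)
    if hit is not None and now - hit[0] < ttl_s:
        return hit[1]
    models = list(fetch())
    _CACHE[key] = (now, models)
    return models


def cache_clear() -> None:
    """Drop every entry; a ready-edge subscriber or a test fixture can call this."""
    _CACHE.clear()
```

Because the cache is module-level state, tests that touch it need to clear it before and after each case, which is the isolation problem the later `_HAL0_MODEL_CACHE` fixture addresses.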
Tests: - ``tests/agents/test_hermes_provision_collect.py`` — three real-LXC slot fixtures (cold / primary-ready / all-ready), parametrized + capability + readiness guards. Fixtures captured from LXC 105 2026-05-28. - ``tests/api/test_upstream_dedup.py`` — composite registration, TTL cache lifecycle, nested ``[model] default`` TOML shape, override precedence, idempotency. - ``tests/agents/test_hal0_memory_client.py`` — locks the no-dataset contract for ``sync_turn`` + verifies graph forwarding still intact. Existing test updates: - ``test_install_phase_skips_install_when_binary_exists`` now asserts the legacy plugin dir is absent. - ``test_hal0_profile_plugin_file_present`` renamed to ``test_legacy_hal0_profile_plugin_removed`` and inverted. - ``test_model_automap_writes_aliases_from_chat_slots`` updated to use the real ``type=="llm"`` payload shape. - ``test_lifespan_autoregisters_local_slot_as_upstream`` rewritten as ``test_lifespan_autoregisters_composite_hal0_upstream``. LXC smoke verified: ``/api/upstreams`` returns one ``hal0`` entry, ``/v1/models`` aggregates both chat slot models, ``hal0-api`` restarts clean. Refs: docs/internal/hermes-research-2026-05-28 MASTER-PLAN.md §4 PR-1-bundle; R4 H1/H2/H4; #317 client-side closeout. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): hal0-agent@.service template + hal0-agent CLI shim (v0.3 PR-5) (#395) New systemd template unit `installer/systemd/hal0-agent@.service`, parameterized by agent id (%i). v0.4-ready: dropping in `hal0-agent@piccoder.service` later requires no template edit. 
Per DA-sec-ops review (docs/internal/hermes-research-2026-05-28): * `Wants=hal0-lemonade.service` (NOT Requires=/BindsTo=) per MUST-FIX #5 — survives the Lemonade GPU-cleanup-after-unload deadlock documented in memory `hal0_lemonade_unload_gpu_cleanup_hang` without pinning the agent in "active (running)" forever * `Type=notify` + `WatchdogSec=60` — systemd observes hangs in the agent itself (not just the model backend) * `NoNewPrivileges`, `ProtectSystem=strict`, `ProtectHome=yes`, `PrivateTmp`, `ProtectKernelTunables/Modules/ControlGroups`, `RestrictSUIDSGID`, `RestrictRealtime` — defense-in-depth sandbox ExecStart goes through the new `hal0-agent` CLI shim (`src/hal0/cli/agent_shim.py`), which: * resolves agent type from `/etc/hal0/agents/<id>.toml` + builtin map * launches `hermes dashboard --tui --skip-build --no-open --host 127.0.0.1` — the ONLY Hermes subcommand that boots `hermes_cli/web_server.py` (the one serving `/api/pty`, `/api/events`, `/api/ws`). Verified at `~/src/hermes-agent/hermes_cli/main.py:14050-14102` → `cmd_dashboard` → `web_server.start_server` at line 10930-10939 * emits sd_notify READY/WATCHDOG/STOPPING via pure-stdlib AF_UNIX datagram — no `systemd-python` dep added to the wheel * forwards SIGTERM/SIGINT/SIGHUP to the child (SIGHUP = persona swap) DA-sec-ops MUST-FIX #1 addressed: `mcp serve` mode is a query-only MCP server with NO event stream — the chat surface would render blank. Test `test_exec_start_never_uses_mcp_serve` enforces this. `installer/systemd/hal0-agent@hermes.service.d/override.conf` pins hermes-specific env (HERMES_HOME, HERMES_DASHBOARD_TUI, HAL0_LEMONADE_BASE) without touching the generic template. `installer/install.sh` lays down the unit + override at install time and `systemctl enable --now`s the hermes instance when the venv exists (PR-3 will land the venv). `docs/agents/hermes/SERVICE.md` — operator recipes (start/stop/ restart/journalctl, failure mode triage, customisation patterns). 
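The pure-stdlib sd_notify mentioned in the shim description amounts to sending a datagram to the socket named in `NOTIFY_SOCKET`. The following is a minimal sketch of that wire protocol, not the code in src/hal0/cli/agent_shim.py; the function signature is an assumption.

```python
import os
import socket
from typing import Optional


def sd_notify(state: str, socket_path: Optional[str] = None) -> bool:
    """Send a state string such as 'READY=1' to the systemd notify socket.

    Returns False when no socket is configured (for example when the
    process is not running under a Type=notify unit).
    """
    path = socket_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), path)
    return True
```

A watchdog loop under `WatchdogSec=60` would call `sd_notify("WATCHDOG=1")` at an interval comfortably shorter than the timeout, and `sd_notify("STOPPING=1")` on shutdown.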
Tests: - 36 tests in tests/cli/test_agent_shim.py — argv parsing, agent config resolution, Hermes invocation builder (incl. assertion that `dashboard` is chosen and `mcp serve` is not), Hermes env builder (HAL0_AGENT_ID + HERMES_HOME propagation, NOTIFY_SOCKET strip), sd_notify wire protocol over AF_UNIX, /proc child-pid discovery (cmdline AND env AND-gate), cmd_status / cmd_stop / cmd_reprovision - 21 tests in tests/systemd/test_unit_files.py — directive presence (Wants= not Requires=, Type=notify, WatchdogSec=, hardening directives, ReadWritePaths covers all three state dirs, Environment="HAL0_AGENT_ID=%i", per-instance EnvironmentFile, no `mcp serve` in any ExecStart line) plus 3 tests on the hermes override. Verified `systemd-analyze verify` on hal0 LXC (only error is the missing hal0-agent binary — expected pre-merge). Refs hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-5. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: hal0-cognee MemoryProvider (wraps hal0-memory REST; locks #317) (#394) * v0.3: hal0-cognee MemoryProvider for Hermes New Hermes-side memory plugin that wraps the hal0-memory REST API. Vendored under src/hal0/agents/hermes/plugins/memory_cognee/ so the installer can deploy it into Hermes's plugin tree at provision time. - Subclasses upstream MemoryProvider ABC (per R3 holographic scaffold) - httpx.AsyncClient to hal0-api at HAL0_MEMORY_BASE (default :8080) - X-hal0-Agent identity header (ADR-0012 / PR #268) - Omits explicit dataset field — server resolves via header (issue #317 server-side fix in PR #366; this is the client-side completion) Integration wiring depends on PR-3 (hermes_provision MCP register phase). LXC smoke deferred to that PR. Refs: hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-2. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agents): promote hermes.py to hermes/driver.py + re-export Converting src/hal0/agents/hermes into a package (so memory_cognee/ can live under it) requires moving the original hermes.py module content into the new package. Two-line migration: - git mv src/hal0/agents/hermes.py → src/hal0/agents/hermes/driver.py - hermes/__init__.py re-exports HermesDriver for backward compat - driver.py _installer_script_path() parents[3] → parents[4] (one extra directory level now) Existing import `from hal0.agents.hermes import HermesDriver` continues to work (e.g. tests/agents/test_hermes_wrapper.py:29). Caught by CI on PR #394 (python 3.11 collection failure). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: hermes_provision overhaul (MCP register, personas seed, prompt injection, CONFIG.md) (#396) * v0.3: hermes_provision overhaul — MCP register, personas seed, prompt injection Adds five new/reworked phases to hermes_provision (on top of PR-1's filter + composite-upstream fixes): - Phase 5 (config_write): now passes chat_slots + active persona's system_prompt_prelude + cached mcp_servers list on the first render so single-shot bootstrap lands the model_aliases + persona + MCP blocks all at once - Phase 6 (mcp_wire): captures the live probe result in details.rendered_servers so Phase 5 (next run) and Phase 9 source the same canonical inventory; template loops over the list rather than hard-coding two server names - Phase 7 (prompt injection, in config_write): persona TOML's system_prompt + the hal0 MCP usage block + approval policy summary composed by personas.build_prompt_addendum, rendered into agent.system_prompt_prelude - Phase 8 (NEW persona_seed): seeds personas/{hermes,coder}.toml + active.txt -> hermes idempotently; --repair forces re-seed, operator edits and operator-chosen active persona 
survive normal re-runs (per master plan §6 user choice) - Phase 9 (model_automap): demoted to idempotency check — passes the same persona + mcp_servers inputs as Phase 5 so hash-equal runs no-op New module src/hal0/agents/personas.py: - Persona/PersonaApproval dataclasses + from_dict/to_dict TOML round-trip (tomli_w for write, tomllib for read) - load_persona, save_persona, list_personas (skips malformed with log+continue), get_active/set_active (atomic tmp+rename) - seed_default_personas: idempotent persona file write with --repair overwrite semantics; preserves operator active-pointer choice - build_prompt_addendum: composes hal0 MCP usage block + approval policy summary for the system prompt - activate: write active.txt + best-effort JSON-RPC reload.env nudge to running Hermes (no full restart). PR-4 will wire the API endpoint to this helper New CLI subcommands under hal0 agent: - reprovision <id> [--repair] — re-run bootstrap idempotently - personas list — show personas + active marker - personas show <id> — print the persona's TOML body - personas activate <id> — switch active persona + nudge hot-reload New docs/agents/hermes/CONFIG.md covers all eight config surfaces (persona TOML, active pointer, overrides.yaml, config.yaml, allowlist.toml, secrets env, provision.json, plugin manifests) with write owners, precedence, and restart-vs-hot-reload semantics. Addresses DA-arch must-fix #4 (master plan §1 #12, BLOCKING). MCP registration verified against upstream Hermes config schema (~/src/hermes-agent cli-config.yaml.example): mcp_servers map keyed by server name with url + headers + timeout. ADR-0012 X-hal0-Agent identity passthrough preserved. Idempotency: new tests/agents/test_hermes_provision_idempotency.py asserts byte-equal config.yaml + persona TOMLs across two consecutive runs and verifies persona_seed sits before config_write in the phase order so first-render system prompt is correct. 
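The atomic tmp+rename write listed for get_active/set_active above can be sketched as follows. This is an assumption-laden illustration rather than the code in src/hal0/agents/personas.py: the signatures, the location of active.txt inside the directory passed in, and the `hermes` fallback are hypothetical choices based on the commit text.

```python
import os
import tempfile
from pathlib import Path


def set_active(personas_dir: Path, persona_id: str) -> Path:
    """Write active.txt atomically: temp file in the same directory, then rename."""
    target = personas_dir / "active.txt"
    fd, tmp_name = tempfile.mkstemp(dir=personas_dir, prefix=".active.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(persona_id + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # make the bytes durable before the rename
        os.replace(tmp_name, target)  # atomic on POSIX within one filesystem
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return target


def get_active(personas_dir: Path, default: str = "hermes") -> str:
    """Read the active persona id, falling back to default when unset."""
    try:
        text = (personas_dir / "active.txt").read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return default
    return text or default
```

Creating the temp file in the target directory matters: `os.replace` is only atomic when source and destination share a filesystem, so a reader sees either the old pointer or the new one, never a partial write.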
Live LXC smoke: reprovision is idempotent (no drift on re-run); personas seeded under /var/lib/hal0/agents/hermes/personas/; config.yaml carries system_prompt_prelude + personality + mcp_servers with X-hal0-Agent: hermes-agent headers; CLI personas list/show/ activate all functional. Refs: docs/internal/hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(api): clear _HAL0_MODEL_CACHE between tests (3.12 isolation) PR-1's composite-upstream cache is module-level; PR-3's persona/provision tests pollute it with `gemma3:1b`, which then leaks into tests/api/test_v1_proxy.py::test_v1_models_still_handled_by_aggregator under Python 3.12's test-collection ordering (3.11 collects in a different order so the leak is masked). Fix: autouse fixture in tests/api/conftest.py calls _hal0_model_cache_clear() before and after each api test, matching the helper's documented contract ("Tests also call this to keep state isolated between cases"). Caught by CI on PR #396 python (3.12). PR-1 composite-cache helper: src/hal0/api/__init__.py::_hal0_model_cache_clear. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: plugin host — manifest proxy + SDK shim + shadow-DOM isolation (#397) hal0 dashboard consumes upstream Hermes plugin manifest. Kanban auto- mounts as an agent tab in v0.3. SDK shim and isolation in place so any future upstream plugin lands without hal0 code changes. 
Backend: - src/hal0/api/plugins/manifest_proxy.py — proxies /api/dashboard/plugins + /dashboard-plugins/<name>/* from hermes localhost - Strips inbound Auth/Cookie; injects X-hal0-Agent outbound - SRI verification (sha384/sha256/sha512) on bundles; mismatch returns 502 - Path-traversal validator (ported from GHSA-5qr3-c538-wm9j) - CSP: script-src 'self' 'strict-dynamic' on manifest endpoint UI: - ui/src/dash/agents/plugin-host.jsx — PluginTabHost with shadow DOM per plugin, ErrorBoundary, hal0 CSS token bridge - ui/src/dash/agents/plugin-sdk-shim.js — window.__HERMES_PLUGIN_SDK__ mirroring upstream registry.ts:107-150 shape, plus window.__HAL0_PLUGINS__ alias for forward compat - One new "Plugins" tab in AgentView nav (minimal extras.jsx edit; PR-8 owns the monolith split) Refs MASTER-PLAN.md §4 PR-7. Addresses DA-sec-ops MUST-FIX #2 + #4. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: /api/agents/{id}/personas endpoints + hot-reload activate (#399) New FastAPI router under src/hal0/api/agents/personas.py exposing the persona TOML store that PR-3 introduced. Routes: - GET /api/agents/{id}/personas — list of {id, display_name, summary, active} - GET /api/agents/{id}/personas/{pid} — detail (parsed + raw TOML) - POST /api/agents/{id}/personas/{pid}/activate — write active.txt and call PR-3's persona-activation helper (sends reload.env JSON-RPC to a running Hermes if reachable; no-op when offline) Agent id is parameterized from day 1 (master plan §2 generalization). v0.3 only resolves "hermes" — pi-coder adds a registry entry in v0.4. Refs MASTER-PLAN.md §4 PR-4. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: chat WS proxy + session REST shim for hermes (Origin+HMAC, no PTY) (#398) Bridges browser to Hermes dashboard mode running on 127.0.0.1:9119 (per PR-5 systemd ExecStart). JSON-RPC over WebSocket + REST shim for session operations. 
Replaces PR-1's xterm-PTY plan after DA-ux + DA-sec-ops killed it (master plan §1 pivot #1). Routes: - WS /api/agents/{id}/events — mirrors hermes JSON-RPC event bus - WS /api/agents/{id}/submit — bidi JSON-RPC client->hermes - GET /api/agents/{id}/session/handshake — mints HMAC session cookie - POST /api/agents/{id}/session/{create,resume} - GET /api/agents/{id}/session/history Security (DA-sec-ops MUST-FIX #2, #3): - Hermes bound 127.0.0.1; proxy bridges browser to loopback - Origin allowlist (config-driven via HAL0_ALLOWED_ORIGINS) - HMAC session cookie issued on dashboard handshake; verified on every WS upgrade - Authorization: Bearer <token> outbound to hermes (NEVER query string) - runtime.json + secret.bin chmod 0600 on read - uvicorn access log middleware scrubs query strings Backpressure: server-side coalesce tool.progress events at 100ms, keyed by tool_id. Non-progress events flush the buffer first so tool.complete never lands before its preceding tool.progress. Refs MASTER-PLAN.md §4 PR-9. Addresses DA-sec-ops MUST-FIX #2 + #3. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: SidebarAgentBlock — service/persona/approvals/skills/memory + [Open chat] (#400) New compact agent status block mounted in the left sidebar next to lemond's SidebarStatusBlock. Replaces the stats card that used to live in the Agents page Overview tab; the chat surface (PR-10) will take that main-pane slot. Renders: - Service status dot (green/amber/red) - Active persona name (from /api/agents/{id}/personas) - Approvals pending count (red badge if >0) - Skills count (existing /api/agents/skills) - Memory writes count - MCP server status pip (hal0-memory + hal0-admin) - [Open chat] button + empty "Install Hermes" CTA Polling: TanStack Query 5s refetch + revalidate-on-focus (master plan §2 state-mgmt policy: TanStack for fetch/cache, zustand only for runtime state). Mounted via window-globals to match the existing build shim. Refs MASTER-PLAN.md §4 PR-6. 
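The backpressure rule described for the chat WS proxy (#398) above, coalescing tool.progress events at 100ms keyed by tool_id and flushing before any non-progress event, can be sketched like this. The class name, the event dict shape (`type`, `tool_id`), and the latest-wins policy are assumptions; a real proxy would also need a timer to flush when the window elapses with no further events, which this sketch omits.

```python
import time
from typing import Any, Callable, Dict, Optional

Event = Dict[str, Any]


class ProgressCoalescer:
    """Buffer tool.progress events per tool_id and emit them in batches."""

    def __init__(
        self,
        emit: Callable[[Event], None],
        window_s: float = 0.1,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._emit = emit
        self._window_s = window_s
        self._clock = clock
        self._pending: Dict[str, Event] = {}
        self._window_start: Optional[float] = None

    def push(self, event: Event) -> None:
        if event.get("type") == "tool.progress":
            if not self._pending:
                self._window_start = self._clock()
            # Assumption: only the latest progress per tool is worth sending.
            self._pending[event["tool_id"]] = event
            if self._clock() - self._window_start >= self._window_s:
                self.flush()
            return
        # Non-progress events flush first so tool.complete never
        # overtakes the progress that preceded it.
        self.flush()
        self._emit(event)

    def flush(self) -> None:
        for pending in self._pending.values():
            self._emit(pending)
        self._pending.clear()
        self._window_start = None
```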
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: dashboard refactor — drop Inbox, fold Peers, split AgentView monolith (#401) Six discrete UI changes per master plan §4 PR-8 + p4 dashboard refactor: 1. Inbox tab DELETED (approvals UX now via sidebar pip from PR-6 + future inline approval cards in PR-10 HermesChat). 2. Peers tab folded into Memory tab as "Peer memory" subsection — the live MCP search Peers used (R5 finding) is preserved, not deleted. 3. AgentView 974-LOC monolith split into ui/src/dash/agents/{agent-view, hermes-chat-tab,personas-tab,skills-tab,memory-tab,plugins-tab}.jsx. 4. HermesChatTab is now the default tab (placeholder; PR-10 fills in composer + transcript). 5. data.jsx purged of agent-related mock entries (HAL0_DATA.approvals). 6. Old test.skip-only agent-v3.spec.ts deleted; new minimal smoke spec agent-view-v3.spec.ts covers nav + default tab + Inbox/Peers removal + #peers legacy redirect. Window-globals build shim preserved. Backend untouched. Refs MASTER-PLAN.md §4 PR-8. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.3: ADR-0015 upstream Hermes pin + weekly hermes-sdk-diff CI job (#403) DA-arch must-fix #1 ("Hermes is HOT upstream — ~40 commits/day, registry.ts churns ~151 LOC/month") demanded an explicit upgrade lane. Pin is now recorded in pyproject.toml under [tool.hal0.upstream-hermes], a weekly job diffs upstream HEAD against the pin for the surfaces hal0 depends on (registry.ts, slots.ts, web_server.py, memory_provider.py, tools/registry.py, agent/events.py), and opens a single upstream-drift/triage labeled issue on drift — same one-issue-per-state shape as agent-shim-smoke.yml's notify job. Operators can run scripts/hermes-sdk-diff.sh locally with the same contract — exits 0 on no drift, 1 on drift, 2 on operational error. Supports --dry-run (parse pin, print plan, no clone) and --bump <sha> (rewrite the pin in-place inside the bump PR). 
Bumps go through ADR-0015 §4: review drift issue → edit shim adapter if needed → scripts/hermes-sdk-diff.sh --bump <sha> → delta-harness + gamma-suite → open chore(hermes): bump upstream pin to <short-sha> PR. 48h freeze window around any v0.x release tag (reviewer-disciplined).

ADR number is 0015, not 0014: ADR-0014 was already used for the Cognee graph-extraction model gate (PR-3 territory).

Refs MASTER-PLAN.md §4 PR-12 + §5 upstream upgrade cadence.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: HermesChat surface - React composer + transcript + sidecar (#404)

Replaces PR-8's HermesChatTab placeholder with a hal0-native React chat surface that streams Hermes JSON-RPC events over the WebSocket proxy from PR-9. No xterm; no PTY; no Tailwind v4 (master plan §1 pivot #1 + DA-ux #1).

New components under ui/src/dash/agents/chat/:
- Composer (Enter submits, Shift+Enter newline; user §6 decision)
- Transcript (sticky-bottom auto-scroll)
- MessageBubble / Markdown / ToolCallCard / ApprovalCard / ThinkingIndicator
- HermesSidecar (PersonaSwitcher + ModelBadge + MCPStatusRow + AgentControls)
- use-hermes-session: external store + WS connection manager

WS event routing covers every R1 taxonomy entry: message.{start,delta,complete}, thinking/reasoning, tool.{start,progress,complete}, approval.request, status.update, error, sudo/clarify/secret.request.

Approvals UX: inline ApprovalCard + sidebar pip pulse + toast top-right (user §6 #4: no desktop notification permission).

Persona hot-swap on next turn via POST /api/agents/hermes/personas/{pid}/activate (PR-4).

First-run hook: when sessionId is missing on connect, fire session.create with first_run=true so Hermes auto-emits the welcome message per PR-3 system-prompt addendum. Sent from the submit-WS onopen handler so the envelope can't race the WS becoming writable.

Mobile: composer sticky bottom + sidecar collapses to bottom sheet <768px via the .hermes-chat-sheet-toggle pill.
State mgmt split per master plan §2: hand-rolled external store + React useSyncExternalStore for runtime state (transcript, session, conn state); TanStack Query (via window-globals bridges) for fetch/cache (personas, mcp pip, model badge). Window-globals build shim preserved.

Reconnect strategy (PR-9 contract: proxy is stateless): jittered backoff base=250ms cap=4s with 1.0–1.5x jitter per step capped at attempt 5; handshake retried on every reconnect; session resumed via session.resume when a sessionId is held.

Tests: tests/e2e/specs/hermes-chat.spec.ts (14 cases) backed by the new tests/e2e/fixtures/wsHarness.ts WebSocket shim; covers composer submit/Shift+Enter, message streaming, tool cards, approval card + approve.respond, persona switch, restart confirm, reconnect, mobile sheet, first-run session.create. agent-view-v3.spec.ts updated: chat tab now shows hermes-chat-surface (PR-10 surface) instead of hermes-chat-placeholder (PR-8 stub).

Refs MASTER-PLAN.md §4 PR-10 + §1 pivot #1 + §6 user decisions.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: tests + docs sweep + final missing endpoints (#405)

Closes out v0.3 Hermes integration before fold-to-main.
Adds the three endpoints PR-10/PR-6/PR-8 flagged as missing during integration:
- POST /api/agents/{id}/restart: systemctl restart wrapper
- GET /api/agents/skills: replaces static catalog
- GET /api/agents/{id}/memory/stats: pulls from hal0-memory MCP

Tests:
- unit endpoint coverage for each new route
- δ-harness integration: full chat WS roundtrip (mock hermes); persona activate roundtrip

Docs (master plan §1 #16):
- AGENTS.md narrative refresh for v0.3 reality
- ARCHITECTURE.md agents section + new module map
- CONTEXT.md glossary: composer, transcript, plugin host, sidecar agent block, persona TOML, hal0-cognee, hermes-sdk-diff, HMAC session cookie, X-hal0-Agent, composite hal0 upstream
- CHANGELOG.md v0.3.x-alpha entry covering PR-1..12
- ADR-0016: v0.3 Hermes integration decisions (cross-link master plan)
- docs/agents/hermes/CONFIG.md + SERVICE.md verification

Follow-up: hal0-web CONTENT_BRIEF + Astro updates land in a sibling PR on Hal0ai/hal0-web (separate repo, separate review cadence).

Refs MASTER-PLAN.md §4 PR-11.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(adr): renumber v0.3 integration ADRs to 0018/0019 (avoid main collision)

`main` shipped its own ADR-0015 (`0015-mcp-as-host-platform.md`) and ADR-0017 (`0017-bell-inbox-approval-ux.md`) via PR #389 while the v0.3 integration was in flight on `docs/v0.3-agents-mcp-memory`. To fold this branch into `main` without an ADR-number collision:
- 0015-upstream-hermes-pin-and-upgrade.md → 0018-upstream-hermes-pin-and-upgrade.md
- 0016-v0_3-hermes-integration.md → 0019-v0_3-hermes-integration.md

Updated every cross-reference (commit messages stay historical): AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, CONTEXT.md, pyproject.toml, scripts/hermes-sdk-diff.sh, src/hal0/api/__init__.py, src/hal0/api/agents/skills.py, docs/agents/hermes/CONFIG.md, and the two renumbered ADR files' self-references.
`docs/mcp/overview.md` carries a stale "no ADR-0015 in main yet" note that pre-dates main's ADR-0015 ship; left for the integration-PR merge to resolve against current main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): drop PR-11 duplicate /api/agents/skills endpoint (main shipped one)

PR-11 added a static-catalog /api/agents/skills endpoint + tests assuming the route was new. Main shipped an equivalent endpoint via PR #364 (src/hal0/api/routes/agents.py:76) that already serves the sidebar. Registering both produced a route collision; FastAPI dispatch order meant PR-11's tests asserted against main's older shape and failed CI on python (3.11): 11 assertions / KeyError cascade.

Delete:
- src/hal0/api/agents/skills.py (PR-11 static catalog endpoint)
- tests/agents/test_agent_skills_endpoint.py (asserted PR-11's shape)
- import + include_router stanza for the deleted module

Main's endpoint continues to serve /api/agents/skills returning {skills:[...], count:N} which is what `useSidebarAgentRollup` consumes.

PR-11's drift-bump intent (one PR per upstream tools/registry.py change, gated by ADR-0018 weekly diff) was never implemented and is duplicated by main's persona.AGENT_SKILLS catalog. Future v0.4 work can revisit if a richer catalog is needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
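The backpressure rule from the WebSocket proxy commit above (coalesce tool.progress per tool_id over a 100ms window, and flush before any non-progress event) can be illustrated with a minimal Python sketch. This is not the proxy's actual code: the class name `ProgressCoalescer`, the event dict shape (`type` and `tool_id` keys), and the injected `now` timestamp are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

Event = dict  # assumed shape: {"type": "tool.progress", "tool_id": "...", ...}


class ProgressCoalescer:
    """Hypothetical sketch: keep only the latest tool.progress per tool_id."""

    def __init__(self, send: Callable[[Event], None], window_s: float = 0.1):
        self._send = send
        self._window_s = window_s
        self._pending: Dict[str, Event] = {}
        self._window_start: Optional[float] = None

    def push(self, event: Event, now: float) -> None:
        if event.get("type") == "tool.progress":
            if not self._pending:
                self._window_start = now  # first buffered event opens the window
            self._pending[event["tool_id"]] = event  # latest progress wins
            if now - self._window_start >= self._window_s:
                self.flush()
            return
        # Non-progress events flush the buffer first, so tool.complete
        # never overtakes the tool.progress that preceded it.
        self.flush()
        self._send(event)

    def flush(self) -> None:
        for buffered in self._pending.values():
            self._send(buffered)
        self._pending.clear()
        self._window_start = None
```

A real server would also need a timer that calls `flush()` when the window expires with no further traffic; the sketch only flushes when another event arrives.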
12 tasks
thinmintdev
added a commit
that referenced
this pull request
May 29, 2026
…Phase 0 OpenRouter prereq) (#409)

DA must-fix #4 from the OpenRouter integration analysis: OAuth PKCE callback re-introduces an auth surface ADR-0012 stripped. ADR-0020 constrains the callback to 127.0.0.1 only. ADR-0012's LAN-trust posture holds; users SSH-tunnel :8080 to complete the OR handshake.

Scaffolds:
- src/hal0/api/openrouter/auth.py: callback route (501; V1 fills in)
- src/hal0/api/openrouter/_loopback.py: is_loopback_host helper
- /api/openrouter/auth/callback enforces loopback guard from day 1

V1 (OpenRouter-as-Hermes-upstream) lands the actual code-exchange flow on top of this scaffold.

Refs openrouter-research-2026-05-28/PLANNING.md §3 Phase 0 + §5 Q1.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
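For readers unfamiliar with loopback guards, here is a rough Python sketch of what a helper like is_loopback_host could look like. The function name comes from the commit message, but the body is an assumption: the real helper in _loopback.py is not shown here and may be stricter (the ADR speaks of 127.0.0.1 only). The sketch accepts IP literals only and deliberately rejects hostnames such as "localhost", since name resolution is outside the caller's control.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Hypothetical sketch: True only for loopback IP literals."""
    candidate = host.strip()
    # Strip the brackets used for IPv6 literals in URLs, e.g. "[::1]".
    if candidate.startswith("[") and candidate.endswith("]"):
        candidate = candidate[1:-1]
    try:
        return ipaddress.ip_address(candidate).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname): treat as non-loopback.
        return False
```

A callback route would call this on the client address of the incoming request and reject anything that fails the check before doing any OAuth work.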
Summary
- `tests/openwebui/test_prewire_smoke.py`: real `docker run` of open-webui against uvicorn-hosted hal0-api with stub upstream
- Polls `/api/models` (after bootstrap-admin signup) until stub model appears; proves env file → container env → upstream proxy → response is intact
- Wires `openwebui-prewire-smoke` job into `.github/workflows/integration.yml` alongside slot-integration
- Gated behind `@pytest.mark.integration` + `docker info` preflight

Verified locally
- `tests/openwebui/` suite: 15/15 in 26.99s

Test plan
🤖 Generated with Claude Code
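The polling step described in the Summary (wait until the stub-served model id shows up in /api/models) might look roughly like the following. This is a sketch under stated assumptions, not the code in test_prewire_smoke.py: `wait_for_model`, its parameters, and the injected `fetch_model_ids` callable are hypothetical names chosen so the loop can be exercised without Docker or a network.

```python
import time
from typing import Callable, Iterable


def wait_for_model(
    fetch_model_ids: Callable[[], Iterable[str]],
    model_id: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hypothetical sketch: poll until model_id is listed or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        try:
            if model_id in set(fetch_model_ids()):
                return True
        except OSError:
            # The container may still be booting; treat as "not yet".
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In a real test, `fetch_model_ids` would issue the authenticated GET against the container and extract the ids from the response body; injecting it keeps the retry logic separate from HTTP details.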