fix(ci): clear ruff lint + format errors blocking CI #2
Merged
Auto-fixed 25 (import sorting, unused imports, asyncio.TimeoutError → TimeoutError, RUF022 __all__ sort) via ruff check --fix; manually addressed the remaining 7:
- src/hal0/api/routes/slots.py: replace ambiguous U+2212 with hyphen in the docstring (RUF002)
- src/hal0/auth/tokens.py: try/except/pass → contextlib.suppress on the metadata-write best-effort path (SIM105)
- src/hal0/hardware/probe.py: drop redundant int() around round() (RUF046)
- src/hal0/providers/llama_server.py: hoist _gpu import to the top block, restoring E402 compliance
- tests/api/test_slots_routes.py: try/except/pass → contextlib.suppress on cancelled-task cleanup (SIM105)
- tests/auth/test_tokens.py: rename unused `tok` to `_tok` (RUF059)
- tests/registry/test_curated.py: tighten blind pytest.raises(Exception) to pydantic.ValidationError (B017)
No logic changes; `ruff check src tests` is clean. Pre-existing unrelated test failure in tests/providers/test_comfyui.py (image_ref returns a sha256 digest but the test expects ":v1") is left untouched; tracked separately.
CI's "Format check" step (`ruff format --check src tests`) was masked by the failing Lint step. With Lint green, format drift across 31 files would block CI next. Pure `ruff format` pass — no logic changes.
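For readers unfamiliar with the SIM105 rewrite mentioned in the description, here is a minimal sketch of the before/after shape. The function names and the file-writing body are hypothetical stand-ins, not the code in src/hal0/auth/tokens.py; only `contextlib.suppress` itself comes from the PR text.

```python
import contextlib


def write_metadata_before(path: str, payload: str) -> None:
    # Shape flagged by SIM105: an except clause whose only body is `pass`.
    try:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(payload)
    except OSError:
        pass


def write_metadata_after(path: str, payload: str) -> None:
    # Same behaviour: OSError is swallowed, any other exception propagates.
    with contextlib.suppress(OSError), open(path, "w", encoding="utf-8") as fh:
        fh.write(payload)
```

Both versions treat the write as best-effort; the second simply states that intent in one line.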
This was referenced May 16, 2026
thinmintdev
added a commit
that referenced
this pull request
May 16, 2026
chore(ci): reformat 3 files added by post-#2 PRs
This was referenced May 16, 2026
thinmintdev
added a commit
that referenced
this pull request
May 21, 2026
Close finding #2 from tests/harness/FINDINGS.md. The slot-create CLI flag --backend was always really the provider; the actual hardware backend was hardcoded to vulkan, blocking ROCm + CPU slot creation from the command line. Most of the split (new --provider / --hardware flags, hidden --backend alias, _detect_default_hardware probe) had already landed; this commit finishes the brief:
- Deprecation warning now goes to stderr via typer.echo(..., err=True) with the exact phrasing the brief calls out, so stdout stays parseable when scripts pipe the success line elsewhere.
- Same stderr-routing applied to slot edit's --backend alias for consistency.
- New test: bare 'hal0 slot create primary' on a Strix Halo fixture (AMD iGPU, vulkan_capable=True, compute_capable=False) auto-resolves hardware=vulkan, the platform hal0 v1 most cares about.
- New test: --hardware foo is rejected at the Typer/Click parse layer before the command body runs (no API call made).
- Existing legacy-backend test now asserts the deprecation lands on stderr, not stdout, so re-introducing the old console.print path fails loudly.
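The reason for routing the deprecation notice to stderr can be shown with a small stdlib-only sketch. It deliberately swaps `typer.echo(..., err=True)` for `print(..., file=sys.stderr)` so it runs without Typer installed; `create_slot`, `run_captured`, the provider value, and the warning wording are all hypothetical, not the hal0 CLI.

```python
import contextlib
import io
import sys


def create_slot(name, provider=None, backend=None):
    """Hypothetical command body: resolve the deprecated alias, then report success."""
    if backend is not None:
        # The notice goes to stderr so stdout stays machine-parseable.
        # Placeholder wording, not the phrasing the brief specifies.
        print("warning: --backend is deprecated; use --provider", file=sys.stderr)
        provider = provider or backend
    print(f"created slot {name} provider={provider}")


def run_captured(**kwargs):
    """Run create_slot and return (stdout, stderr) as strings."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        create_slot(**kwargs)
    return out.getvalue(), err.getvalue()
```

A script that pipes stdout elsewhere sees only the success line, whether or not the legacy flag was used.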
3 tasks
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
… outputs (#154) Post-grill source of truth for v0.2 Lemonade migration. Supersedes ADR-0006 + ADR-0007; locks the 22-PR implementation sequence. - ADR-0008: Lemonade adoption as unified inference runtime (Path 4). Rescinds ADR-0007's preload validation (per-type LRU + nuclear-evict exemption list make it unnecessary). Locks --threads N mandatory. - ADR-0009: FLM trio NPU packing (chat + asr + embed in one AMDXDNA HW context via --asr 1 --embed 1). - ADR-0010: bundle picker first-run UX (no default stack). - ADR-0006/0007: Status -> Superseded by ADR-0008. - CONTEXT.md: glossary additions from grill (slot type, group, FLM trio, bundle tiers, model namespace, fresh install, v0.1.x -> v0.2 upgrade). - lemonade-adoption-plan-2026-05-22.md: 13 sections, 22-PR roadmap, service topology, slot model, NPU+FLM trio, model layout, OmniRouter spec, bundle picker, v0.1.x->v0.2 clean break, slot architecture migration, implementation sequence, operational caveats. - lemonade-spike-2-findings + runbook: empirical Phase A/B/C results, /diagnose chain that uncovered --threads deadlock, FLM trio verification. - lemonade-research-2026-05-22/{researcher,architect,api,ui}.md: 4-agent design pass with deep code references. Implementation contract: docs/internal/lemonade-adoption-plan-2026-05-22.md. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
…ndpoints (PR-3) PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the LemonadeClient skeleton shipped in #137 in line with the locked adoption plan (docs/internal/lemonade-adoption-plan-2026-05-22.md) and ADR-0008 before any later PR depends on it. Changes: - Fix the llamacpp_args serialization bug. Wire format is a single space-separated string; Lemonade's nlohmann::json parser raises "type must be string, but is array" on a list (spike #2 findings + lemonade-research-2026-05-22/api.md §1.3). The client now accepts str | list[str] | None: None omits the key (never send JSON null, per the v1_load_schema memory + nlohmann unconditional accessor), str passes through verbatim, list joins on single spaces, [] becomes the empty-string sentinel ("use default" via is_empty_option). - DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 (ADR-0008 §1 + plan §3 + §12.2 lock the port). - Add the four loopback-only /internal/* endpoints from plan §2.2: shutdown() (POST /internal/shutdown — systemd ExecStop), internal_config() (GET /internal/config — admin panel source of truth), internal_set(values) (POST /internal/set — atomic config setter for both immediate-effect and deferred-until-next-load keys), and internal_cleanup_cache() (POST /internal/cleanup-cache — weekly HF cache hygiene cron). All four route through _raise_for_status, so non-2xx surfaces as LemonadeHTTPError, not LemonadeLoadError (reserved for /v1/load's evict-all blast radius). - Update stale ADR references in docstrings/comments. ADR-0006 → ADR-0008 (parent decision). ADR-0007's no-retry-on-/v1/load logic is still valid but the ADR ref is stale; rephrased to cite ADR-0008 §3's nuclear-evict + not-found exemption. Drop "preload validation" and "preload validator" — preload was removed from main in #155. - Stats docstring now points at plan §12.1's KV%-missing caveat so the metrics-shim author doesn't re-discover it later. Tests: 13 new cases extending tests/lemonade/test_client.py. 
- llamacpp_args matrix: None omits key; "--threads 8" forwards verbatim; ["--parallel", "1", "--threads", "8"] joins to "--parallel 1 --threads 8"; [] → "". - Each /internal/* endpoint verified for HTTP method, path, Bearer auth header, and request body shape (or absence for the no-body endpoints). One shared parametrised test confirms all four raise LemonadeHTTPError on 403 (not LemonadeLoadError). Out of scope (per the PR-3 brief): preload module (removed in #155 already), forward-plane endpoints, hal0-api wiring of the idle driver, and any Pydantic/TypedDict response layer (plain dicts are fine for v0.2). Refs: docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 + §3 + §11; docs/internal/adr/0008-lemonade-adoption.md §1, §3, §4, §7; docs/internal/lemonade-spike-2-findings-2026-05-22.md; docs/internal/lemonade-research-2026-05-22/api.md §1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
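The llamacpp_args matrix above can be sketched as a small payload builder. This is an illustration under assumptions: `build_load_payload` and the `model_name` key are hypothetical, and the real LemonadeClient may structure the request differently. Only the four serialization rules come from the commit message.

```python
from typing import Union

LlamacppArgs = Union[str, list[str], None]


def build_load_payload(model_name: str, llamacpp_args: LlamacppArgs = None) -> dict:
    """Build a JSON-serialisable load body (key names other than llamacpp_args are assumed)."""
    payload = {"model_name": model_name}
    if llamacpp_args is None:
        # Omit the key entirely; the commit says JSON null must never be sent.
        return payload
    if isinstance(llamacpp_args, str):
        # A string passes through verbatim.
        payload["llamacpp_args"] = llamacpp_args
    else:
        # A list joins on single spaces; [] becomes the empty-string sentinel.
        payload["llamacpp_args"] = " ".join(llamacpp_args)
    return payload
```

The value sent on the wire is always a string, which is what the server-side parser described above expects.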
5 tasks
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
…ndpoints (PR-3) (#156) * feat(lemonade): extend client — fix llamacpp_args + add /internal/* endpoints (PR-3) PR-3 of the v0.2 Lemonade migration sequence (plan §11). Brings the LemonadeClient skeleton shipped in #137 in line with the locked adoption plan (docs/internal/lemonade-adoption-plan-2026-05-22.md) and ADR-0008 before any later PR depends on it. Changes: - Fix the llamacpp_args serialization bug. Wire format is a single space-separated string; Lemonade's nlohmann::json parser raises "type must be string, but is array" on a list (spike #2 findings + lemonade-research-2026-05-22/api.md §1.3). The client now accepts str | list[str] | None: None omits the key (never send JSON null, per the v1_load_schema memory + nlohmann unconditional accessor), str passes through verbatim, list joins on single spaces, [] becomes the empty-string sentinel ("use default" via is_empty_option). - DEFAULT_BASE_URL: 127.0.0.1:9100 → 127.0.0.1:13305 (ADR-0008 §1 + plan §3 + §12.2 lock the port). - Add the four loopback-only /internal/* endpoints from plan §2.2: shutdown() (POST /internal/shutdown — systemd ExecStop), internal_config() (GET /internal/config — admin panel source of truth), internal_set(values) (POST /internal/set — atomic config setter for both immediate-effect and deferred-until-next-load keys), and internal_cleanup_cache() (POST /internal/cleanup-cache — weekly HF cache hygiene cron). All four route through _raise_for_status, so non-2xx surfaces as LemonadeHTTPError, not LemonadeLoadError (reserved for /v1/load's evict-all blast radius). - Update stale ADR references in docstrings/comments. ADR-0006 → ADR-0008 (parent decision). ADR-0007's no-retry-on-/v1/load logic is still valid but the ADR ref is stale; rephrased to cite ADR-0008 §3's nuclear-evict + not-found exemption. Drop "preload validation" and "preload validator" — preload was removed from main in #155. 
- Stats docstring now points at plan §12.1's KV%-missing caveat so the metrics-shim author doesn't re-discover it later. Tests: 13 new cases extending tests/lemonade/test_client.py. - llamacpp_args matrix: None omits key; "--threads 8" forwards verbatim; ["--parallel", "1", "--threads", "8"] joins to "--parallel 1 --threads 8"; [] → "". - Each /internal/* endpoint verified for HTTP method, path, Bearer auth header, and request body shape (or absence for the no-body endpoints). One shared parametrised test confirms all four raise LemonadeHTTPError on 403 (not LemonadeLoadError). Out of scope (per the PR-3 brief): preload module (removed in #155 already), forward-plane endpoints, hal0-api wiring of the idle driver, and any Pydantic/TypedDict response layer (plain dicts are fine for v0.2). Refs: docs/internal/lemonade-adoption-plan-2026-05-22.md §2.2 + §3 + §11; docs/internal/adr/0008-lemonade-adoption.md §1, §3, §4, §7; docs/internal/lemonade-spike-2-findings-2026-05-22.md; docs/internal/lemonade-research-2026-05-22/api.md §1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: apply ruff format to client.py + test_client.py CI's `ruff format --check` step caught two formatting deltas the author agent missed (it ran `ruff check` but not `ruff format`). Pure formatter output; no logic changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
…R-5) (#159) Adds the `Lemonade daemon` install step after PR-4's system prerequisites and before PR-6's server_models.json block: * Pinned embeddable tarball (v10.6.0) — sha256 verified with `HAL0_SKIP_LEMONADE_SHA=1` escape hatch, extracted into /opt/lemonade/ with a `.installed-version` marker so re-runs skip the multi-hundred-MB download when the binary is already current. * Dedicated `hal0` system user/group (idempotent) — owns /opt/lemonade + /var/lib/hal0/lemonade so lemond runs unprivileged per ADR-0008 §1 (internal loopback-only runtime). * /var/lib/hal0/lemonade/config.json written atomically every run with the locked baseline from lemonade-adoption-plan §3 (config_version=1, port 13305 loopback, max_loaded_models=4, rocm_channel=stable, flm.args="--asr 1 --embed 1", kokoro cpu_bin=builtin, ...). * MANDATORY `llamacpp.args = "--parallel 1 --threads N"` per ADR-0008 §4 + memory `hal0_lemonade_threads_deadlock`. Formula: N = max(2, (nproc - 2) / 4) — splits across the four-process capability rollup (primary + embed + rerank + voice) and avoids the spike #2 Vulkan-dispatch deadlock. Defaults to 2 + warn when nproc is unavailable; the flag is never omitted. * /etc/systemd/system/hal0-lemonade.service — Type=simple, User=hal0, LimitMEMLOCK=infinity, CPUQuota=80%, ExecStop curl /internal/shutdown for clean drain. Verbatim from plan §3. * Service start block extended to `systemctl enable --now hal0-lemonade` before hal0-api, with a 30 s wait_active. * Full DEV_MODE skip with what-would-happen logging. * ERR-trap recovery hint for the new step. * UI_STEP_TOTAL bumped 9 → 10. Refs: lemonade-adoption-plan-2026-05-22 §3 / §11 PR-5 / §12.2-12.3, ADR-0008 §1, §3, §4, memory `hal0_lemonade_threads_deadlock`. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
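The thread-count formula above works out as follows in a short sketch. Two assumptions are made here: the division is integer division (as it would be in shell arithmetic), and the helper names are hypothetical. The installer itself is a shell script, so this only mirrors the arithmetic.

```python
from typing import Optional


def lemonade_threads(nproc: Optional[int]) -> int:
    """N = max(2, (nproc - 2) / 4), falling back to 2 when nproc is unknown."""
    if nproc is None:
        return 2  # the flag is never omitted, so a default is always produced
    return max(2, (nproc - 2) // 4)


def llamacpp_args_for(nproc: Optional[int]) -> str:
    """Render the mandatory llamacpp.args value for a given core count."""
    return f"--parallel 1 --threads {lemonade_threads(nproc)}"
```

Under these assumptions a 16-core host gets `--threads 3`, a 32-core host gets `--threads 7`, and anything with 10 or fewer cores falls back to the floor of 2.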
4 tasks
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
Surfaces the kokoro:cpu constraint locked in plan §1 #2 + ADR-0008 §2 as a small "[CPU]" chip with hover/focus tooltip on the voice slot card's TTS sub-section. Hard-coded to provider === 'kokoro' per plan; no device-detection logic. GPU-accelerated TTS lands in v0.3. Chip also added to SlotCard.vue header for any custom kokoro slot operators add outside the capability surface — same disclosure, same neutral slate palette (info, not warning). Both chips share verbatim copy: "Kokoro TTS runs on CPU in v0.2. GPU-accelerated TTS is planned for v0.3." A11y: chip is focusable with `tabindex="0"` so keyboard users see the native title= tooltip; aria-label carries the full disclosure for screen-reader users. Tests: ui/tests/e2e/specs/lemonade-voice-chip.spec.ts — kokoro present → chip visible + correct aria-label; non-kokoro provider → chip absent; tooltip text matches the brief verbatim. Mocks /api/capabilities to mount VoiceCard's TTS sub-section in isolation. Plan §11 PR-15. ADR-0008 §2. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 23, 2026
thinmintdev
added a commit
that referenced
this pull request
May 23, 2026
…gation 1) (#283)
The 5-capability-slot loadout (primary + embed + rerank + tts + stt) exceeds the global budget of 4. Lemonade evicts to make room; if the incoming load fails (e.g., whisper backend missing), it nuclear-evicts EVERYTHING, leaving /v1/health.loaded[] empty and all hal0 slots in the post-#276 OFFLINE drift state (was ERROR pre-#276).
Bump to 8 matches the slot-port ceiling (8081-8099 minus reserved) so the canonical full loadout (5 capability slots + 3 NPU trio + 1 image) fits without forcing eviction churn. Doesn't fix the nuclear-evict behavior itself (out of our control without a lemond upstream PR) but reduces how often we hit it.
Tracked in #275 bug 7 deep-dive comment. Mitigation #2 (collateral-eviction recovery in SlotManager) + #3 (meaningful "install backend X" error message) are separate follow-ups.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thinmintdev
added a commit
that referenced
this pull request
May 27, 2026
* fix(slots): zero-red-dots bundle — 7 fixes
Backend
- manager.py: persist explicit model_id to slot TOML on load/swap
so reconciliation never drifts back to "no model.default set"
ERROR (Fix #1)
- manager.py: _fail_watch transitions to OFFLINE (clean evict)
instead of ERROR (red) when lemond drops a loaded model. RED
reserved for spawn/health/load exceptions (Fix #2)
- manager.py: load() short-circuits to OFFLINE+CTA when there's
no resolvable model, instead of letting lemonade.load() throw
and stamp ERROR every tick (Fix #4)
- manager.py: reconcile_unconfigured_slots() one-shot startup
pass migrates pre-fix stuck ERRORs to OFFLINE so the dashboard
re-renders correctly without operator action (Fix #2/#4 cleanup)
- api/__init__.py: wire reconcile_unconfigured_slots into lifespan
- tests/slots/test_fail_watcher.py: assert OFFLINE+evict semantics
Frontend
- dashboard.css: .dot.serving uses --ok (green) not --accent (Fix #3)
- slot-modals.jsx: InlineSwapPopover chevron is now an
independently-clickable <button> (own onClick + stopPropagation
+ keyboard handler); CSS adds focus-visible outline + hover
feedback (Fix #5)
- models.jsx: ModelDetail "Load now" wired through useSlotSwap
against the first compatible slot; toast on multi-match (Fix #6)
- slots.jsx: slotIndicator() rewritten so GREEN only fires while
actively SERVING; loaded+waiting (ready/idle/lemo=loaded) and
evicted (lemo=idle) both map to YELLOW. 1h hung-request guard
flips a long-in-SERVING slot back to YELLOW with "stuck?"
label (Fix #7)
* fix(slots): review-pass amendments — a11y + ERROR audit log
Backend
- manager.py: log.error('slot.error', extra={...reason...}) on
every ERROR transition so journald carries a durable audit
trail in addition to the SSE event bus (closes user-spec
audit demand #6 logging gap). NOTE: extra= cannot reuse
'message' — it's a reserved LogRecord attribute and stdlib
logging raises KeyError on collision; the gotcha is documented
inline.
Frontend
- slot-modals.jsx: dropped role="button" / tabIndex / onKeyDown
from .swap-pop-item rows. The nested chevron <button> is the
single keyboard/AT-accessible affordance — making the row
ALSO a button created a double-announcement for screen
readers. Mouse onClick on the row body still works.
* test(e2e): update slot-indicator spec for 2026-05-27 dot-state contract
Pre-existing tests asserted the OLD READY+fresh → green / READY+stale → yellow rule. Per the user spec, GREEN now fires only on state=serving (in-flight); all loaded-and-waiting states (ready / lemo=loaded / idle / lemo=idle) map to yellow. Added coverage for serving (fresh + stuck), !enabled, lemonade_state=loaded, and lemonade_state=idle.
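The dot-state contract described in these commits can be summarised as a small decision function. This is a Python sketch of rules the real code implements in slots.jsx. The function name, argument names, and the exact threshold comparison are assumptions, and states the commit message does not spell out (error, offline, disabled) are deliberately not modelled.

```python
from typing import Optional, Tuple

STUCK_AFTER_S = 60 * 60  # the 1h hung-request guard


def slot_indicator(
    state: str,
    lemonade_state: Optional[str] = None,
    serving_for_s: float = 0.0,
) -> Optional[Tuple[str, Optional[str]]]:
    """Return (colour, label) for the covered states, or None when not covered."""
    if state == "serving":
        if serving_for_s >= STUCK_AFTER_S:
            # A long-in-SERVING slot flips back to yellow with a hint.
            return ("yellow", "stuck?")
        return ("green", None)  # green only while a request is in flight
    if state in {"ready", "idle"} or lemonade_state in {"loaded", "idle"}:
        # Loaded-and-waiting and evicted both read as yellow.
        return ("yellow", None)
    return None  # precedence against error/disabled is outside this sketch
```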
6 tasks
thinmintdev
added a commit
that referenced
this pull request
May 28, 2026
Four fixes that together restore the hal0 provider on a fresh Hermes install. Without all four shipping together, PR-1 in isolation produced no user-visible improvement (R4 + DA-arch). 1. ``_collect_chat_slots`` filter (R4 H1) — the live ``/api/slots`` payload uses ``type=="llm"`` for chat slots and ``kind=="local"`` for the deployment shape. The previous ``_slot_kind``-first check looked at ``kind`` first and rejected 100% of real slots; ``model_aliases:`` never rendered. Filter now matches on ``type == "llm"`` and gates on ``_is_ready`` so only loaded models surface as aliases. 2. ``/api/upstreams`` dedup (R4 H2) — replaced per-slot upstream autoregistration with one composite ``hal0`` upstream pointed at hal0-api's own ``/v1``. Aggregates every chat-capable slot's model id through a new ``_fetch_hal0_composite_models`` helper with a 5s TTL cache (``time.monotonic()``-keyed module dict, NOT ``functools.lru_cache`` since that has no time-based expiry). The ``/v1/models`` handler short-circuits the composite case so it doesn't recurse over HTTP. The ``slot.state`` ready-edge subscriber punches the cache. Eliminates the duplicate ``primary`` + ``agent-hermes`` entries both pointing at ``127.0.0.1:8001``. 3. Removed legacy ``Hal0Profile`` plugin (R4 H4) — it hardcoded ``base_url=http://127.0.0.1:8000/api/v1`` which has no listener; the composite ``hal0`` upstream from fix #2 supersedes it. Install phase now stages only ``hal0-memory``; legacy plugin dir cleanup is idempotent. 4. ``hal0-memory`` client — stopped sending ``dataset="private:hermes-agent"``. The server resolves dataset from ``X-hal0-Agent`` + ``X-hal0-Private`` headers since PR #366; the client-side ``private:`` prefix was rejected by ``_AGENT_ID_PATTERN`` and silently 4xx'd every memory write. Tests: - ``tests/agents/test_hermes_provision_collect.py`` — three real-LXC slot fixtures (cold / primary-ready / all-ready), parametrized + capability + readiness guards. 
Fixtures captured from LXC 105 2026-05-28. - ``tests/api/test_upstream_dedup.py`` — composite registration, TTL cache lifecycle, nested ``[model] default`` TOML shape, override precedence, idempotency. - ``tests/agents/test_hal0_memory_client.py`` — locks the no-dataset contract for ``sync_turn`` + verifies graph forwarding still intact. Existing test updates: - ``test_install_phase_skips_install_when_binary_exists`` now asserts the legacy plugin dir is absent. - ``test_hal0_profile_plugin_file_present`` renamed to ``test_legacy_hal0_profile_plugin_removed`` and inverted. - ``test_model_automap_writes_aliases_from_chat_slots`` updated to use the real ``type=="llm"`` payload shape. - ``test_lifespan_autoregisters_local_slot_as_upstream`` rewritten as ``test_lifespan_autoregisters_composite_hal0_upstream``. LXC smoke verified: ``/api/upstreams`` returns one ``hal0`` entry, ``/v1/models`` aggregates both chat slot models, ``hal0-api`` restarts clean. Refs: docs/internal/hermes-research-2026-05-28 MASTER-PLAN.md §4 PR-1-bundle; R4 H1/H2/H4; #317 client-side closeout. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
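The choice of a `time.monotonic()`-stamped module dict over `functools.lru_cache` in fix 2 can be illustrated with a minimal TTL cache. The names `cached_models` and `invalidate`, and the injectable `now` clock, are hypothetical; the commit names only `_fetch_hal0_composite_models` and the 5s TTL.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

TTL_S = 5.0
_CACHE: Dict[str, Tuple[float, List[str]]] = {}  # key -> (stored_at, value)


def cached_models(
    key: str,
    fetch: Callable[[], List[str]],
    now: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return the cached value while it is younger than TTL_S, else refetch."""
    t = now()
    entry = _CACHE.get(key)
    if entry is not None and t - entry[0] < TTL_S:
        return entry[1]
    value = fetch()
    _CACHE[key] = (t, value)
    return value


def invalidate(key: Optional[str] = None) -> None:
    """Drop one entry, or everything; a state-change subscriber could call this."""
    if key is None:
        _CACHE.clear()
    else:
        _CACHE.pop(key, None)
```

`lru_cache` evicts by size and recency only, so a stale model list would be served indefinitely; stamping entries with a monotonic clock gives expiry that is also immune to wall-clock adjustments.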
This was referenced May 28, 2026
thinmintdev
added a commit
that referenced
this pull request
May 28, 2026
…397) hal0 dashboard consumes upstream Hermes plugin manifest. Kanban auto-mounts as an agent tab in v0.3. SDK shim and isolation in place so any future upstream plugin lands without hal0 code changes.
Backend:
- src/hal0/api/plugins/manifest_proxy.py: proxies /api/dashboard/plugins + /dashboard-plugins/<name>/* from hermes localhost
- Strips inbound Auth/Cookie; injects X-hal0-Agent outbound
- SRI verification (sha384/sha256/sha512) on bundles; mismatch returns 502
- Path-traversal validator (ported from GHSA-5qr3-c538-wm9j)
- CSP: script-src 'self' 'strict-dynamic' on manifest endpoint
UI:
- ui/src/dash/agents/plugin-host.jsx: PluginTabHost with shadow DOM per plugin, ErrorBoundary, hal0 CSS token bridge
- ui/src/dash/agents/plugin-sdk-shim.js: window.__HERMES_PLUGIN_SDK__ mirroring upstream registry.ts:107-150 shape, plus window.__HAL0_PLUGINS__ alias for forward compat
- One new "Plugins" tab in AgentView nav (minimal extras.jsx edit; PR-8 owns the monolith split)
Refs MASTER-PLAN.md §4 PR-7. Addresses DA-sec-ops MUST-FIX #2 + #4.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
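The SRI check on plugin bundles follows the standard `<algo>-<base64 digest>` integrity format. A minimal sketch of such a check is below; `sri_digest` and `verify_sri` are hypothetical names, and how manifest_proxy.py actually obtains the expected value is not shown in the commit message.

```python
import base64
import hashlib
import hmac

_ALGOS = {"sha256": hashlib.sha256, "sha384": hashlib.sha384, "sha512": hashlib.sha512}


def sri_digest(data: bytes, algo: str = "sha384") -> str:
    """Compute an integrity string such as 'sha384-<base64>' for a bundle."""
    digest = _ALGOS[algo](data).digest()
    return f"{algo}-{base64.b64encode(digest).decode('ascii')}"


def verify_sri(data: bytes, integrity: str) -> bool:
    """Return True only when the bundle bytes match the declared integrity value."""
    algo, sep, expected = integrity.partition("-")
    if not sep or algo not in _ALGOS:
        return False  # unknown or malformed algorithm prefix
    actual = base64.b64encode(_ALGOS[algo](data).digest()).decode("ascii")
    # Constant-time comparison, so timing does not leak how much matched.
    return hmac.compare_digest(actual.encode("ascii"), expected.encode("utf-8"))
```

A proxy following the behaviour described above would answer 502 whenever `verify_sri` returns False.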
thinmintdev
added a commit
that referenced
this pull request
May 28, 2026
Bridges browser to Hermes dashboard mode running on 127.0.0.1:9119 (per PR-5 systemd ExecStart). JSON-RPC over WebSocket + REST shim for session operations. Replaces PR-1's xterm-PTY plan after DA-ux + DA-sec-ops killed it (master plan §1 pivot #1). Routes: - WS /api/agents/{id}/events — mirrors hermes JSON-RPC event bus - WS /api/agents/{id}/submit — bidi JSON-RPC client->hermes - GET /api/agents/{id}/session/handshake — mints HMAC session cookie - POST /api/agents/{id}/session/{create,resume} - GET /api/agents/{id}/session/history Security (DA-sec-ops MUST-FIX #2, #3): - Hermes bound 127.0.0.1; proxy bridges browser to loopback - Origin allowlist (config-driven via HAL0_ALLOWED_ORIGINS) - HMAC session cookie issued on dashboard handshake; verified on every WS upgrade - Authorization: Bearer <token> outbound to hermes (NEVER query string) - runtime.json + secret.bin chmod 0600 on read - uvicorn access log middleware scrubs query strings Backpressure: server-side coalesce tool.progress events at 100ms, keyed by tool_id. Non-progress events flush the buffer first so tool.complete never lands before its preceding tool.progress. Refs MASTER-PLAN.md §4 PR-9. Addresses DA-sec-ops MUST-FIX #2 + #3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
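The backpressure rule (coalesce tool.progress per tool_id, flush before any other event) can be sketched as a small buffer. This is an illustration under assumptions: the event shape with `type` and `tool_id` keys, the class name, and the caller-supplied `now` timestamp are hypothetical, and a real proxy would also flush from a timer rather than only on the next push.

```python
from typing import Dict, List, Optional


class ProgressCoalescer:
    """Keep only the latest tool.progress per tool_id within a time window."""

    def __init__(self, window_s: float = 0.1) -> None:
        self.window_s = window_s
        self._pending: Dict[str, dict] = {}  # tool_id -> latest progress event
        self._window_start: Optional[float] = None

    def flush(self) -> List[dict]:
        """Return buffered progress events and reset the window."""
        out = list(self._pending.values())
        self._pending.clear()
        self._window_start = None
        return out

    def push(self, event: dict, now: float) -> List[dict]:
        """Accept one event; return the events that should be sent now, in order."""
        if event.get("type") == "tool.progress":
            if self._window_start is None:
                self._window_start = now
            self._pending[event["tool_id"]] = event  # newer progress replaces older
            if now - self._window_start >= self.window_s:
                return self.flush()
            return []
        # Non-progress events flush the buffer first, so a tool.complete
        # can never overtake the progress that preceded it.
        return self.flush() + [event]
```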
thinmintdev
added a commit
that referenced
this pull request
May 28, 2026
…TY) (#398) Bridges browser to Hermes dashboard mode running on 127.0.0.1:9119 (per PR-5 systemd ExecStart). JSON-RPC over WebSocket + REST shim for session operations. Replaces PR-1's xterm-PTY plan after DA-ux + DA-sec-ops killed it (master plan §1 pivot #1). Routes: - WS /api/agents/{id}/events — mirrors hermes JSON-RPC event bus - WS /api/agents/{id}/submit — bidi JSON-RPC client->hermes - GET /api/agents/{id}/session/handshake — mints HMAC session cookie - POST /api/agents/{id}/session/{create,resume} - GET /api/agents/{id}/session/history Security (DA-sec-ops MUST-FIX #2, #3): - Hermes bound 127.0.0.1; proxy bridges browser to loopback - Origin allowlist (config-driven via HAL0_ALLOWED_ORIGINS) - HMAC session cookie issued on dashboard handshake; verified on every WS upgrade - Authorization: Bearer <token> outbound to hermes (NEVER query string) - runtime.json + secret.bin chmod 0600 on read - uvicorn access log middleware scrubs query strings Backpressure: server-side coalesce tool.progress events at 100ms, keyed by tool_id. Non-progress events flush the buffer first so tool.complete never lands before its preceding tool.progress. Refs MASTER-PLAN.md §4 PR-9. Addresses DA-sec-ops MUST-FIX #2 + #3. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
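The "HMAC session cookie issued on handshake, verified on every WS upgrade" item can be sketched with the standard library. Everything about the cookie layout here is an assumption: the `agent_id:expiry` payload, SHA-256, the dot separator, and the function names are illustrative, since the commit message does not describe the actual format.

```python
import base64
import hashlib
import hmac
from typing import Optional


def mint_cookie(secret: bytes, agent_id: str, expires_at: int) -> str:
    """Sign 'agent_id:expires_at' and return 'payload.signature' (both base64url)."""
    payload = f"{agent_id}:{expires_at}".encode("utf-8")
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(sig).decode("ascii")
    )


def verify_cookie(secret: bytes, cookie: str, now: int) -> Optional[str]:
    """Return the agent id for a valid, unexpired cookie, else None."""
    try:
        payload_b64, sig_b64 = cookie.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong part count or bad base64 (binascii.Error is a ValueError)
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None  # signature checked before the payload is trusted
    agent_id, _, expiry = payload.decode("utf-8").rpartition(":")
    if not expiry.isdigit() or int(expiry) < now:
        return None
    return agent_id
```

Checking the signature before parsing the payload means a forged cookie is rejected without its contents ever being interpreted.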
thinmintdev
added a commit
that referenced
this pull request
May 28, 2026
* docs(agents,mcp,memory): user-facing docs for v0.3 surface (identity + private-ns are placeholders pending rename + #317) Adds 10 docs pages covering the v0.3 agents / MCP / memory surface: - docs/agents/overview.md — what an agent is in hal0, install/lifecycle, v0.3 = Hermes only. - docs/agents/hermes-bootstrap.md — the 12-phase pipeline + plugin model + state paths. - docs/agents/identity.md — ADR-0011 identity cards + the X-hal0-Agent target shape, with TO BE DOCUMENTED placeholder for the server-side header read still pending. - docs/agents/mcp-client.md — ADR-0013 per-agent allow-list (refresh of the deleted PR #295 file, updated post-ADR-0012 to remove the inbound-bearer framing). - docs/mcp/overview.md — hal0 as MCP host; transport, mount, identity (no auth post-ADR-0012). - docs/mcp/hal0-admin.md — tool taxonomy (25 tools), gating, REST passthrough, audit, secret redaction. - docs/mcp/hal0-memory.md — four tools, dataset model, REST shims, on-disk layout. - docs/memory/overview.md — Cognee engine, datasets, surfaces, source stamping. - docs/memory/graph.md — refresh of the deleted PR #294 file; ADR-0014 model gate, three routes, CLI + REST + dashboard. - docs/memory/private-namespacing.md — target shape for private:<agent_id>, with TO BE DOCUMENTED placeholder for issue #317. Every claim is anchored to a src/ path, an ADR, or a PR/issue number. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(agents): internal agent contracts — issue tracker, triage labels, domain glossary Adds AGENTS.md (top-level pointer) plus docs/agents/{domain,issue-tracker,triage-labels}.md covering the conventions agents follow when working in this repo: gh CLI on Hal0ai/hal0, default triage label vocabulary, and single-context CONTEXT.md domain doc pointer. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agents,api): v0.3 agent plumbing hot-fix bundle (#393) Four fixes that together restore the hal0 provider on a fresh Hermes install. Without all four shipping together, PR-1 in isolation produced no user-visible improvement (R4 + DA-arch). 1. ``_collect_chat_slots`` filter (R4 H1) — the live ``/api/slots`` payload uses ``type=="llm"`` for chat slots and ``kind=="local"`` for the deployment shape. The previous ``_slot_kind``-first check looked at ``kind`` first and rejected 100% of real slots; ``model_aliases:`` never rendered. Filter now matches on ``type == "llm"`` and gates on ``_is_ready`` so only loaded models surface as aliases. 2. ``/api/upstreams`` dedup (R4 H2) — replaced per-slot upstream autoregistration with one composite ``hal0`` upstream pointed at hal0-api's own ``/v1``. Aggregates every chat-capable slot's model id through a new ``_fetch_hal0_composite_models`` helper with a 5s TTL cache (``time.monotonic()``-keyed module dict, NOT ``functools.lru_cache`` since that has no time-based expiry). The ``/v1/models`` handler short-circuits the composite case so it doesn't recurse over HTTP. The ``slot.state`` ready-edge subscriber punches the cache. Eliminates the duplicate ``primary`` + ``agent-hermes`` entries both pointing at ``127.0.0.1:8001``. 3. Removed legacy ``Hal0Profile`` plugin (R4 H4) — it hardcoded ``base_url=http://127.0.0.1:8000/api/v1`` which has no listener; the composite ``hal0`` upstream from fix #2 supersedes it. Install phase now stages only ``hal0-memory``; legacy plugin dir cleanup is idempotent. 4. ``hal0-memory`` client — stopped sending ``dataset="private:hermes-agent"``. The server resolves dataset from ``X-hal0-Agent`` + ``X-hal0-Private`` headers since PR #366; the client-side ``private:`` prefix was rejected by ``_AGENT_ID_PATTERN`` and silently 4xx'd every memory write. 
Tests: - ``tests/agents/test_hermes_provision_collect.py`` — three real-LXC slot fixtures (cold / primary-ready / all-ready), parametrized + capability + readiness guards. Fixtures captured from LXC 105 2026-05-28. - ``tests/api/test_upstream_dedup.py`` — composite registration, TTL cache lifecycle, nested ``[model] default`` TOML shape, override precedence, idempotency. - ``tests/agents/test_hal0_memory_client.py`` — locks the no-dataset contract for ``sync_turn`` + verifies graph forwarding still intact. Existing test updates: - ``test_install_phase_skips_install_when_binary_exists`` now asserts the legacy plugin dir is absent. - ``test_hal0_profile_plugin_file_present`` renamed to ``test_legacy_hal0_profile_plugin_removed`` and inverted. - ``test_model_automap_writes_aliases_from_chat_slots`` updated to use the real ``type=="llm"`` payload shape. - ``test_lifespan_autoregisters_local_slot_as_upstream`` rewritten as ``test_lifespan_autoregisters_composite_hal0_upstream``. LXC smoke verified: ``/api/upstreams`` returns one ``hal0`` entry, ``/v1/models`` aggregates both chat slot models, ``hal0-api`` restarts clean. Refs: docs/internal/hermes-research-2026-05-28 MASTER-PLAN.md §4 PR-1-bundle; R4 H1/H2/H4; #317 client-side closeout. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): hal0-agent@.service template + hal0-agent CLI shim (v0.3 PR-5) (#395) New systemd template unit `installer/systemd/hal0-agent@.service`, parameterized by agent id (%i). v0.4-ready: dropping in `hal0-agent@piccoder.service` later requires no template edit. 
Per DA-sec-ops review (docs/internal/hermes-research-2026-05-28):
* `Wants=hal0-lemonade.service` (NOT Requires=/BindsTo=) per MUST-FIX #5 — survives the Lemonade GPU-cleanup-after-unload deadlock documented in memory `hal0_lemonade_unload_gpu_cleanup_hang` without pinning the agent in "active (running)" forever
* `Type=notify` + `WatchdogSec=60` — systemd observes hangs in the agent itself (not just the model backend)
* `NoNewPrivileges`, `ProtectSystem=strict`, `ProtectHome=yes`, `PrivateTmp`, `ProtectKernelTunables/Modules/ControlGroups`, `RestrictSUIDSGID`, `RestrictRealtime` — defense-in-depth sandbox

ExecStart goes through the new `hal0-agent` CLI shim (`src/hal0/cli/agent_shim.py`), which:
* resolves agent type from `/etc/hal0/agents/<id>.toml` + builtin map
* launches `hermes dashboard --tui --skip-build --no-open --host 127.0.0.1` — the ONLY Hermes subcommand that boots `hermes_cli/web_server.py` (the one serving `/api/pty`, `/api/events`, `/api/ws`). Verified at `~/src/hermes-agent/hermes_cli/main.py:14050-14102` → `cmd_dashboard` → `web_server.start_server` at line 10930-10939
* emits sd_notify READY/WATCHDOG/STOPPING via pure-stdlib AF_UNIX datagram — no `systemd-python` dep added to the wheel
* forwards SIGTERM/SIGINT/SIGHUP to the child (SIGHUP = persona swap)

DA-sec-ops MUST-FIX #1 addressed: `mcp serve` mode is a query-only MCP server with NO event stream — the chat surface would render blank. Test `test_exec_start_never_uses_mcp_serve` enforces this.

`installer/systemd/hal0-agent@hermes.service.d/override.conf` pins hermes-specific env (HERMES_HOME, HERMES_DASHBOARD_TUI, HAL0_LEMONADE_BASE) without touching the generic template.

`installer/install.sh` lays down the unit + override at install time and `systemctl enable --now`s the hermes instance when the venv exists (PR-3 will land the venv).

`docs/agents/hermes/SERVICE.md` — operator recipes (start/stop/restart/journalctl, failure mode triage, customisation patterns).
Tests:
- 36 tests in tests/cli/test_agent_shim.py — argv parsing, agent config resolution, Hermes invocation builder (incl. assertion that `dashboard` is chosen and `mcp serve` is not), Hermes env builder (HAL0_AGENT_ID + HERMES_HOME propagation, NOTIFY_SOCKET strip), sd_notify wire protocol over AF_UNIX, /proc child-pid discovery (cmdline AND env AND-gate), cmd_status / cmd_stop / cmd_reprovision
- 21 tests in tests/systemd/test_unit_files.py — directive presence (Wants= not Requires=, Type=notify, WatchdogSec=, hardening directives, ReadWritePaths covers all three state dirs, Environment="HAL0_AGENT_ID=%i", per-instance EnvironmentFile, no `mcp serve` in any ExecStart line) plus 3 tests on the hermes override.

Verified `systemd-analyze verify` on hal0 LXC (only error is the missing hal0-agent binary — expected pre-merge).

Refs hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-5.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: hal0-cognee MemoryProvider (wraps hal0-memory REST; locks #317) (#394)

* v0.3: hal0-cognee MemoryProvider for Hermes

New Hermes-side memory plugin that wraps the hal0-memory REST API. Vendored under src/hal0/agents/hermes/plugins/memory_cognee/ so the installer can deploy it into Hermes's plugin tree at provision time.

- Subclasses upstream MemoryProvider ABC (per R3 holographic scaffold)
- httpx.AsyncClient to hal0-api at HAL0_MEMORY_BASE (default :8080)
- X-hal0-Agent identity header (ADR-0012 / PR #268)
- Omits explicit dataset field — server resolves via header (issue #317 server-side fix in PR #366; this is the client-side completion)

Integration wiring depends on PR-3 (hermes_provision MCP register phase). LXC smoke deferred to that PR.

Refs: hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): promote hermes.py to hermes/driver.py + re-export

Converting src/hal0/agents/hermes into a package (so memory_cognee/ can live under it) requires moving the original hermes.py module content into the new package. Two-line migration:

- git mv src/hal0/agents/hermes.py → src/hal0/agents/hermes/driver.py
- hermes/__init__.py re-exports HermesDriver for backward compat
- driver.py _installer_script_path() parents[3] → parents[4] (one extra directory level now)

Existing import `from hal0.agents.hermes import HermesDriver` continues to work (e.g. tests/agents/test_hermes_wrapper.py:29).

Caught by CI on PR #394 (python 3.11 collection failure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: hermes_provision overhaul (MCP register, personas seed, prompt injection, CONFIG.md) (#396)

* v0.3: hermes_provision overhaul — MCP register, personas seed, prompt injection

Adds five new/reworked phases to hermes_provision (on top of PR-1's filter + composite-upstream fixes):

- Phase 5 (config_write): now passes chat_slots + active persona's system_prompt_prelude + cached mcp_servers list on the first render so single-shot bootstrap lands the model_aliases + persona + MCP blocks all at once
- Phase 6 (mcp_wire): captures the live probe result in details.rendered_servers so Phase 5 (next run) and Phase 9 source the same canonical inventory; template loops over the list rather than hard-coding two server names
- Phase 7 (prompt injection, in config_write): persona TOML's system_prompt + the hal0 MCP usage block + approval policy summary composed by personas.build_prompt_addendum, rendered into agent.system_prompt_prelude
- Phase 8 (NEW persona_seed): seeds personas/{hermes,coder}.toml + active.txt -> hermes idempotently; --repair forces re-seed, operator edits and operator-chosen active persona survive normal re-runs (per master plan §6 user choice)
- Phase 9 (model_automap): demoted to idempotency check — passes the same persona + mcp_servers inputs as Phase 5 so hash-equal runs no-op

New module src/hal0/agents/personas.py:
- Persona/PersonaApproval dataclasses + from_dict/to_dict TOML round-trip (tomli_w for write, tomllib for read)
- load_persona, save_persona, list_personas (skips malformed with log+continue), get_active/set_active (atomic tmp+rename)
- seed_default_personas: idempotent persona file write with --repair overwrite semantics; preserves operator active-pointer choice
- build_prompt_addendum: composes hal0 MCP usage block + approval policy summary for the system prompt
- activate: write active.txt + best-effort JSON-RPC reload.env nudge to running Hermes (no full restart). PR-4 will wire the API endpoint to this helper

New CLI subcommands under hal0 agent:
- reprovision <id> [--repair] — re-run bootstrap idempotently
- personas list — show personas + active marker
- personas show <id> — print the persona's TOML body
- personas activate <id> — switch active persona + nudge hot-reload

New docs/agents/hermes/CONFIG.md covers all eight config surfaces (persona TOML, active pointer, overrides.yaml, config.yaml, allowlist.toml, secrets env, provision.json, plugin manifests) with write owners, precedence, and restart-vs-hot-reload semantics. Addresses DA-arch must-fix #4 (master plan §1 #12, BLOCKING).

MCP registration verified against upstream Hermes config schema (~/src/hermes-agent cli-config.yaml.example): mcp_servers map keyed by server name with url + headers + timeout. ADR-0012 X-hal0-Agent identity passthrough preserved.

Idempotency: new tests/agents/test_hermes_provision_idempotency.py asserts byte-equal config.yaml + persona TOMLs across two consecutive runs and verifies persona_seed sits before config_write in the phase order so first-render system prompt is correct.
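The "atomic tmp+rename" pattern noted for `set_active` above usually looks something like the sketch below. `write_atomic` is a hypothetical helper written for illustration; the real `personas.py` may differ in naming, fsync behaviour, and error handling.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(path: Path, text: str) -> None:
    """Write text so readers see either the old file or the new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic rename on POSIX
    except BaseException:
        # Best-effort cleanup of the temp file, then re-raise.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because the rename replaces the pointer in a single step, a process reading `active.txt` while a persona is being activated should never observe a half-written file.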
Live LXC smoke: reprovision is idempotent (no drift on re-run); personas seeded under /var/lib/hal0/agents/hermes/personas/; config.yaml carries system_prompt_prelude + personality + mcp_servers with X-hal0-Agent: hermes-agent headers; CLI personas list/show/activate all functional.

Refs: docs/internal/hermes-research-2026-05-28/MASTER-PLAN.md §4 PR-3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(api): clear _HAL0_MODEL_CACHE between tests (3.12 isolation)

PR-1's composite-upstream cache is module-level; PR-3's persona/provision tests pollute it with `gemma3:1b`, which then leaks into tests/api/test_v1_proxy.py::test_v1_models_still_handled_by_aggregator under Python 3.12's test-collection ordering (3.11 collects in a different order so the leak is masked).

Fix: autouse fixture in tests/api/conftest.py calls _hal0_model_cache_clear() before and after each api test, matching the helper's documented contract ("Tests also call this to keep state isolated between cases").

Caught by CI on PR #396 python (3.12). PR-1 composite-cache helper: src/hal0/api/__init__.py::_hal0_model_cache_clear.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: plugin host — manifest proxy + SDK shim + shadow-DOM isolation (#397)

hal0 dashboard consumes upstream Hermes plugin manifest. Kanban auto-mounts as an agent tab in v0.3. SDK shim and isolation in place so any future upstream plugin lands without hal0 code changes.
Backend:
- src/hal0/api/plugins/manifest_proxy.py — proxies /api/dashboard/plugins + /dashboard-plugins/<name>/* from hermes localhost
- Strips inbound Auth/Cookie; injects X-hal0-Agent outbound
- SRI verification (sha384/sha256/sha512) on bundles; mismatch returns 502
- Path-traversal validator (ported from GHSA-5qr3-c538-wm9j)
- CSP: script-src 'self' 'strict-dynamic' on manifest endpoint

UI:
- ui/src/dash/agents/plugin-host.jsx — PluginTabHost with shadow DOM per plugin, ErrorBoundary, hal0 CSS token bridge
- ui/src/dash/agents/plugin-sdk-shim.js — window.__HERMES_PLUGIN_SDK__ mirroring upstream registry.ts:107-150 shape, plus window.__HAL0_PLUGINS__ alias for forward compat
- One new "Plugins" tab in AgentView nav (minimal extras.jsx edit; PR-8 owns the monolith split)

Refs MASTER-PLAN.md §4 PR-7. Addresses DA-sec-ops MUST-FIX #2 + #4.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: /api/agents/{id}/personas endpoints + hot-reload activate (#399)

New FastAPI router under src/hal0/api/agents/personas.py exposing the persona TOML store that PR-3 introduced.

Routes:
- GET /api/agents/{id}/personas — list of {id, display_name, summary, active}
- GET /api/agents/{id}/personas/{pid} — detail (parsed + raw TOML)
- POST /api/agents/{id}/personas/{pid}/activate — write active.txt and call PR-3's persona-activation helper (sends reload.env JSON-RPC to a running Hermes if reachable; no-op when offline)

Agent id is parameterized from day 1 (master plan §2 generalization). v0.3 only resolves "hermes" — pi-coder adds a registry entry in v0.4.

Refs MASTER-PLAN.md §4 PR-4.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: chat WS proxy + session REST shim for hermes (Origin+HMAC, no PTY) (#398)

Bridges browser to Hermes dashboard mode running on 127.0.0.1:9119 (per PR-5 systemd ExecStart). JSON-RPC over WebSocket + REST shim for session operations.
Replaces PR-1's xterm-PTY plan after DA-ux + DA-sec-ops killed it (master plan §1 pivot #1).

Routes:
- WS /api/agents/{id}/events — mirrors hermes JSON-RPC event bus
- WS /api/agents/{id}/submit — bidi JSON-RPC client->hermes
- GET /api/agents/{id}/session/handshake — mints HMAC session cookie
- POST /api/agents/{id}/session/{create,resume}
- GET /api/agents/{id}/session/history

Security (DA-sec-ops MUST-FIX #2, #3):
- Hermes bound 127.0.0.1; proxy bridges browser to loopback
- Origin allowlist (config-driven via HAL0_ALLOWED_ORIGINS)
- HMAC session cookie issued on dashboard handshake; verified on every WS upgrade
- Authorization: Bearer <token> outbound to hermes (NEVER query string)
- runtime.json + secret.bin chmod 0600 on read
- uvicorn access log middleware scrubs query strings

Backpressure: server-side coalesce tool.progress events at 100ms, keyed by tool_id. Non-progress events flush the buffer first so tool.complete never lands before its preceding tool.progress.

Refs MASTER-PLAN.md §4 PR-9. Addresses DA-sec-ops MUST-FIX #2 + #3.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: SidebarAgentBlock — service/persona/approvals/skills/memory + [Open chat] (#400)

New compact agent status block mounted in the left sidebar next to lemond's SidebarStatusBlock. Replaces the stats card that used to live in the Agents page Overview tab; the chat surface (PR-10) will take that main-pane slot.

Renders:
- Service status dot (green/amber/red)
- Active persona name (from /api/agents/{id}/personas)
- Approvals pending count (red badge if >0)
- Skills count (existing /api/agents/skills)
- Memory writes count
- MCP server status pip (hal0-memory + hal0-admin)
- [Open chat] button + empty "Install Hermes" CTA

Polling: TanStack Query 5s refetch + revalidate-on-focus (master plan §2 state-mgmt policy: TanStack for fetch/cache, zustand only for runtime state). Mounted via window-globals to match the existing build shim.

Refs MASTER-PLAN.md §4 PR-6.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: dashboard refactor — drop Inbox, fold Peers, split AgentView monolith (#401)

Six discrete UI changes per master plan §4 PR-8 + p4 dashboard refactor:

1. Inbox tab DELETED (approvals UX now via sidebar pip from PR-6 + future inline approval cards in PR-10 HermesChat).
2. Peers tab folded into Memory tab as "Peer memory" subsection — the live MCP search Peers used (R5 finding) is preserved, not deleted.
3. AgentView 974-LOC monolith split into ui/src/dash/agents/{agent-view,hermes-chat-tab,personas-tab,skills-tab,memory-tab,plugins-tab}.jsx.
4. HermesChatTab is now the default tab (placeholder; PR-10 fills in composer + transcript).
5. data.jsx purged of agent-related mock entries (HAL0_DATA.approvals).
6. Old test.skip-only agent-v3.spec.ts deleted; new minimal smoke spec agent-view-v3.spec.ts covers nav + default tab + Inbox/Peers removal + #peers legacy redirect.

Window-globals build shim preserved. Backend untouched.

Refs MASTER-PLAN.md §4 PR-8.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: ADR-0015 upstream Hermes pin + weekly hermes-sdk-diff CI job (#403)

DA-arch must-fix #1 ("Hermes is HOT upstream — ~40 commits/day, registry.ts churns ~151 LOC/month") demanded an explicit upgrade lane. Pin is now recorded in pyproject.toml under [tool.hal0.upstream-hermes], a weekly job diffs upstream HEAD against the pin for the surfaces hal0 depends on (registry.ts, slots.ts, web_server.py, memory_provider.py, tools/registry.py, agent/events.py), and opens a single upstream-drift/triage labeled issue on drift — same one-issue-per-state shape as agent-shim-smoke.yml's notify job.

Operators can run scripts/hermes-sdk-diff.sh locally with the same contract — exits 0 on no drift, 1 on drift, 2 on operational error. Supports --dry-run (parse pin, print plan, no clone) and --bump <sha> (rewrite the pin in-place inside the bump PR).
Bumps go through ADR-0015 §4: review drift issue → edit shim adapter if needed → scripts/hermes-sdk-diff.sh --bump <sha> → delta-harness + gamma-suite → open chore(hermes): bump upstream pin to <short-sha> PR. 48h freeze window around any v0.x release tag (reviewer-disciplined).

ADR number is 0015, not 0014 — ADR-0014 was already used for the Cognee graph-extraction model gate (PR-3 territory).

Refs MASTER-PLAN.md §4 PR-12 + §5 upstream upgrade cadence.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: HermesChat surface — React composer + transcript + sidecar (#404)

Replaces PR-8's HermesChatTab placeholder with a hal0-native React chat surface that streams Hermes JSON-RPC events over the WebSocket proxy from PR-9. No xterm; no PTY; no Tailwind v4 (master plan §1 pivot #1 + DA-ux #1).

New components under ui/src/dash/agents/chat/:
- Composer (Enter submits, Shift+Enter newline — user §6 decision)
- Transcript (sticky-bottom auto-scroll)
- MessageBubble / Markdown / ToolCallCard / ApprovalCard / ThinkingIndicator
- HermesSidecar (PersonaSwitcher + ModelBadge + MCPStatusRow + AgentControls)
- use-hermes-session: external store + WS connection manager

WS event routing covers every R1 taxonomy entry: message.{start,delta,complete}, thinking/reasoning, tool.{start,progress,complete}, approval.request, status.update, error, sudo/clarify/secret.request.

Approvals UX: inline ApprovalCard + sidebar pip pulse + toast top-right (user §6 #4: no desktop notification permission).

Persona hot-swap on next turn via POST /api/agents/hermes/personas/{pid}/activate (PR-4).

First-run hook: when sessionId is missing on connect, fire session.create with first_run=true so Hermes auto-emits the welcome message per PR-3 system-prompt addendum. Sent from the submit-WS onopen handler so the envelope can't race the WS becoming writable.

Mobile: composer sticky bottom + sidecar collapses to bottom sheet <768px via the .hermes-chat-sheet-toggle pill.
State mgmt split per master plan §2: hand-rolled external store + React useSyncExternalStore for runtime state (transcript, session, conn state); TanStack Query (via window-globals bridges) for fetch/cache (personas, mcp pip, model badge). Window-globals build shim preserved.

Reconnect strategy (PR-9 contract — proxy is stateless): jittered backoff base=250ms cap=4s with 1.0–1.5x jitter per step capped at attempt 5; handshake retried on every reconnect; session resumed via session.resume when a sessionId is held.

Tests: tests/e2e/specs/hermes-chat.spec.ts (14 cases) backed by the new tests/e2e/fixtures/wsHarness.ts WebSocket shim — covers composer submit/Shift+Enter, message streaming, tool cards, approval card + approve.respond, persona switch, restart confirm, reconnect, mobile sheet, first-run session.create.

agent-view-v3.spec.ts updated: chat tab now shows hermes-chat-surface (PR-10 surface) instead of hermes-chat-placeholder (PR-8 stub).

Refs MASTER-PLAN.md §4 PR-10 + §1 pivot #1 + §6 user decisions.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.3: tests + docs sweep + final missing endpoints (#405)

Closes out v0.3 Hermes integration before fold-to-main.
Adds the three endpoints PR-10/PR-6/PR-8 flagged as missing during integration:
- POST /api/agents/{id}/restart — systemctl restart wrapper
- GET /api/agents/skills — replaces static catalog
- GET /api/agents/{id}/memory/stats — pulls from hal0-memory MCP

Tests:
- unit endpoint coverage for each new route
- δ-harness integration: full chat WS roundtrip (mock hermes); persona activate roundtrip

Docs (master plan §1 #16):
- AGENTS.md narrative refresh for v0.3 reality
- ARCHITECTURE.md agents section + new module map
- CONTEXT.md glossary: composer, transcript, plugin host, sidecar agent block, persona TOML, hal0-cognee, hermes-sdk-diff, HMAC session cookie, X-hal0-Agent, composite hal0 upstream
- CHANGELOG.md v0.3.x-alpha entry covering PR-1..12
- ADR-0016: v0.3 Hermes integration decisions (cross-link master plan)
- docs/agents/hermes/CONFIG.md + SERVICE.md verification

Follow-up: hal0-web CONTENT_BRIEF + Astro updates land in a sibling PR on Hal0ai/hal0-web (separate repo, separate review cadence).

Refs MASTER-PLAN.md §4 PR-11.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(adr): renumber v0.3 integration ADRs to 0018/0019 (avoid main collision)

`main` shipped its own ADR-0015 (`0015-mcp-as-host-platform.md`) and ADR-0017 (`0017-bell-inbox-approval-ux.md`) via PR #389 while the v0.3 integration was in flight on `docs/v0.3-agents-mcp-memory`. To fold this branch into `main` without an ADR-number collision:

- 0015-upstream-hermes-pin-and-upgrade.md → 0018-upstream-hermes-pin-and-upgrade.md
- 0016-v0_3-hermes-integration.md → 0019-v0_3-hermes-integration.md

Updated every cross-reference (commit messages stay historical): AGENTS.md, ARCHITECTURE.md, CHANGELOG.md, CONTEXT.md, pyproject.toml, scripts/hermes-sdk-diff.sh, src/hal0/api/__init__.py, src/hal0/api/agents/skills.py, docs/agents/hermes/CONFIG.md, and the two renumbered ADR files' self-references.
`docs/mcp/overview.md` carries a stale "no ADR-0015 in main yet" note that pre-dates main's ADR-0015 ship; left for the integration-PR merge to resolve against current main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): drop PR-11 duplicate /api/agents/skills endpoint (main shipped one)

PR-11 added a static-catalog /api/agents/skills endpoint + tests assuming the route was new. Main shipped an equivalent endpoint via PR #364 (src/hal0/api/routes/agents.py:76) that already serves the sidebar. Registering both produced a route collision; FastAPI dispatch order meant PR-11's tests asserted against main's older shape and failed CI on python (3.11) — 11 assertions / KeyError cascade.

Delete:
- src/hal0/api/agents/skills.py (PR-11 static catalog endpoint)
- tests/agents/test_agent_skills_endpoint.py (asserted PR-11's shape)
- import + include_router stanza for the deleted module

Main's endpoint continues to serve /api/agents/skills returning {skills:[...], count:N} which is what `useSidebarAgentRollup` consumes.

PR-11's drift-bump intent (one PR per upstream tools/registry.py change, gated by ADR-0018 weekly diff) was never implemented and is duplicated by main's persona.AGENT_SKILLS catalog. Future v0.4 work can revisit if a richer catalog is needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
thinmintdev
added a commit
that referenced
this pull request
May 29, 2026
…ackends) (#410)

DA must-fix #2 from the OpenRouter integration analysis: R7 claimed upstream Hermes ships 7 spawn backends, but tests/agents/ had zero delegate_task coverage. Verifies orchestration end-to-end for local + docker + modal with mocked backend handlers (no Modal credits or docker pulls in CI). Gates V3a observability panel scope.

Adds:
- tests/harness/integration/_delegate_fakes.py — FakeLocalBackend, FakeDockerBackend, FakeModalBackend implementing the upstream BaseEnvironment ABC, capturing invocations for assertions
- _delegate_runner.py — in-process orchestration harness wiring the fakes into a simulated delegate_task dispatch loop
- test_delegate_task_{local,docker,modal}.py — happy path + error path + invocation payload shape per backend
- test_delegate_task_dispatch_matrix.py — parametrised fan-out across the 3 backends asserting orchestration works uniformly, plus an upstream-contract drift gate that runs against tools.environments.base.BaseEnvironment when ~/src/hermes-agent is on PYTHONPATH (skips cleanly on CI)

Upstream audit at pin 0554ef1a corrected R7's "7 backends" marketing: upstream actually ships 6 (local/docker/singularity/modal/daytona/ssh). Vercel Sandbox is NOT a BaseEnvironment subclass upstream. The gap is documented in FINDINGS.md §46 so V3a's UI design can target the real backend list. The three covered backends round-trip cleanly, so V3a observability survives intact; singularity / daytona / ssh can be added incrementally per README §14.

Refs openrouter-research-2026-05-28/PLANNING.md §3 Phase 0 + §4 #2.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `ruff check` errors (auto-fixed 25, manually fixed 7: U+2212→hyphen, try/except/pass→contextlib.suppress, etc.)
- `ruff format src tests`

Test plan
- `ruff check src tests` clean
- `ruff format --check src tests` clean
- `pytest` green (locally 592 pass / 6 skip / 1 pre-existing comfyui fail)

🤖 Generated with Claude Code