Follow-up to #168. Surfaced during review by @kaghni, who flagged that pi and openclaw had no or minimal changes in the embed-daemon fix PR. Investigation found two related gaps worth fixing together.
Problem 1 — pi: documented-but-missing spawn-on-miss
`pi/extension-source/hivemind.ts:166-175` documents auto-spawn-on-miss:
"If the socket isn't there yet, we spawn the canonical daemon at `~/.hivemind/embed-deps/embed-daemon.js` (deposited by `hivemind embeddings install`) and wait for it to listen, mirroring the auto-spawn-on-miss logic in `src/embeddings/client.ts`. Subsequent agents (codex, CC, cursor, hermes, ...) connect to the SAME daemon — pi pays the cold-start cost only when it's the first user on the box."
But the implementation at lines 183-210 (`tryEmbedOverSocket`) only calls `connect()` and settles with `null` on any error; it never spawns. As a result, if pi is the first agent to talk to the daemon on a given box (e.g. after a reboot), it silently writes NULL into `message_embedding` until some other agent spawns the daemon. This is the same regression vector that PR #168 was opened to close, on a different code path.
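A minimal sketch of the documented connect-then-spawn behavior, assuming Node's `net` and `child_process` modules and assuming the daemon can be launched as `node embed-daemon.js` and picks its own socket path. The names `connectOrSpawn`, `connectOnce`, and `isSocketMiss`, the 100 ms poll interval, and the 5 s default timeout are illustrative and not taken from the codebase; the reference logic is the one in `src/embeddings/client.ts`.

```typescript
import { connect, Socket } from "node:net";
import { spawn } from "node:child_process";
import { existsSync } from "node:fs";

/** True for errors that mean "no daemon is listening" rather than a protocol failure. */
export function isSocketMiss(err: unknown): boolean {
  const code = (err as { code?: unknown } | null | undefined)?.code;
  return code === "ENOENT" || code === "ECONNREFUSED";
}

/** Resolve with a connected socket, or reject with the connect error. */
function connectOnce(socketPath: string): Promise<Socket> {
  return new Promise((resolve, reject) => {
    const sock = connect(socketPath);
    sock.once("error", reject);
    sock.once("connect", () => {
      sock.removeListener("error", reject); // caller owns error handling from here
      resolve(sock);
    });
  });
}

/**
 * Hypothetical helper: connect to the daemon socket, spawning the daemon on a miss.
 * Returns null when the daemon is not installed or never starts listening.
 */
export async function connectOrSpawn(
  socketPath: string,
  daemonPath: string,
  timeoutMs = 5000,
): Promise<Socket | null> {
  try {
    return await connectOnce(socketPath);
  } catch (err) {
    // Only a miss is worth a spawn, and only if the user opted in (file present).
    if (!isSocketMiss(err) || !existsSync(daemonPath)) return null;
  }

  // Assumption: the daemon is a plain Node script that outlives this process.
  const child = spawn(process.execPath, [daemonPath], { detached: true, stdio: "ignore" });
  child.on("error", () => { /* a failed spawn surfaces as a timeout below */ });
  child.unref();

  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      return await connectOnce(socketPath);
    } catch {
      await new Promise((r) => setTimeout(r, 100)); // daemon not listening yet
    }
  }
  return null;
}
```

The sketch does not guard against two agents spawning at the same moment; whether the real daemon handles that case is not stated here.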
Problem 2 — openclaw doesn't produce embeddings
OpenClaw is MCP-mode (`hivemind_*` tool contracts) and currently writes NULL on every vector column:
- `upsertRowSql` (memory): `summary_embedding = NULL` on UPDATE, `NULL` literal on INSERT
- session capture (`dist/index.js:1884`): `message_embedding` omitted from the column list → defaults NULL
OpenClaw already imports from `../../src/`, already uses `createRequire(import.meta.url)` to bypass the esbuild stub on `node:child_process` (lines 79-80), and already spawns long-lived workers (`spawnOpenclawSkillifyWorker` at line 406). So the capability to spawn the daemon and embed is already there; it just isn't used.
OpenClaw must remain consumer-only if the user hasn't run `hivemind embeddings install`. This respects the opt-in invariant set in `src/user-config.ts` from #168: no 600MB transformers download as a side effect. It should become a producer only when `~/.hivemind/embed-deps/embed-daemon.js` is already present.
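The gate described above can be sketched as a simple file-presence check. This is an illustration only: `canonicalDaemonPath`, `resolveEmbedRole`, and the `EmbedRole` type are hypothetical names, and the sketch looks at nothing but the daemon file, whereas the actual invariant in `src/user-config.ts` may involve more than that.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

export type EmbedRole = "producer" | "consumer-only";

/** Path where `hivemind embeddings install` deposits the daemon. */
export function canonicalDaemonPath(home: string = homedir()): string {
  return join(home, ".hivemind", "embed-deps", "embed-daemon.js");
}

/**
 * Hypothetical gate: produce embeddings only if the daemon is already installed.
 * Never triggers an install or download itself. `exists` is injectable for tests.
 */
export function resolveEmbedRole(
  home: string = homedir(),
  exists: (path: string) => boolean = existsSync,
): EmbedRole {
  return exists(canonicalDaemonPath(home)) ? "producer" : "consumer-only";
}
```

Keeping the check read-only is the point: the function can only observe an earlier opt-in, it cannot create one.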
Plan
1. Extract a shared spawn-on-miss state machine in `src/embeddings/standalone-embed-client.ts` (~80 LOC), used by openclaw (which imports from `src/`). Pi can't import from `src/` (raw .ts ship constraint), so it gets a parallel inline implementation; the test suite covers both via the shared helper.
2. Fix pi: replace the broken `tryEmbedOverSocket` with the spawn-on-miss version. This is a bug fix.
3. Wire openclaw as a producer: `tryEmbedOverSocket` + spawn-on-miss in `upsertRowSql` and the sessions INSERT.
Three focused commits, matching the repo's "never >3 src files per commit across different layers" rule.
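One way to picture the shared state machine from step 1 is as a pure transition function, which is what makes it testable with a mocked spawn. The state and event names below are assumptions made for illustration; the planned `standalone-embed-client.ts` may be structured differently.

```typescript
// Hypothetical states: "ready" and "unavailable" are terminal.
export type State = "connecting" | "spawning" | "waiting" | "ready" | "unavailable";

// Hypothetical events reported by the I/O layer (socket connect, child spawn, timer).
export type Event =
  | { type: "connected" }
  | { type: "miss"; daemonInstalled: boolean }
  | { type: "spawned" }
  | { type: "spawn_failed" }
  | { type: "timeout" };

/** Pure transition: no sockets or processes are touched here. */
export function step(state: State, event: Event): State {
  switch (state) {
    case "connecting":
      if (event.type === "connected") return "ready";
      // Spawn only when the user has already installed the daemon.
      if (event.type === "miss") return event.daemonInstalled ? "spawning" : "unavailable";
      return "unavailable";
    case "spawning":
      return event.type === "spawned" ? "waiting" : "unavailable";
    case "waiting":
      if (event.type === "connected") return "ready";
      if (event.type === "miss") return "waiting"; // daemon still starting, keep polling
      return "unavailable"; // timeout or unexpected event
    default:
      return state; // terminal states ignore further events
  }
}

/** Fold a sequence of events from the initial state. */
export function run(events: Event[]): State {
  return events.reduce<State>(step, "connecting");
}
```

For example, `run([{ type: "miss", daemonInstalled: true }, { type: "spawned" }, { type: "connected" }])` ends in `"ready"`, while a miss with `daemonInstalled: false` ends in `"unavailable"` without ever reaching `"spawning"`.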
Acceptance criteria
- `src/embeddings/standalone-embed-client.ts` exists with unit tests covering all 11 edge cases from the edge case matrix (mock spawn + real socket against a stub daemon)
- `pi/extension-source/hivemind.ts` actually spawns the daemon when the socket is absent (matches its own documentation)
- `openclaw/src/index.ts`: `upsertRowSql` writes the document embedding (not NULL) when the daemon is available; the session capture INSERT includes `message_embedding` in its column list
- E2E verification on `test_plugin/default/sessions_test` (never prod) showing that openclaw writes produce non-NULL embeddings with semantic recall working
- Coverage targets met (per-file ≥90% on all 3 new/modified files)
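As a rough illustration of the session-capture criterion, the INSERT could add `message_embedding` to the column list only when an embedding was obtained. Everything except the `message_embedding` column name is an assumption here: `buildSessionInsert`, the `id` and `message` columns, and the `$n` placeholder style are hypothetical and may not match how openclaw builds SQL.

```typescript
// Hypothetical row shape; the real sessions table has its own columns.
export interface SessionRow {
  id: string;
  message: string;
}

export interface SqlStatement {
  text: string;
  params: unknown[];
}

/**
 * Build a parameterized INSERT. When `embedding` is null (daemon unavailable or
 * not installed) the column is omitted, so it defaults to NULL as it does today.
 * `table` must be a trusted identifier, since it is interpolated directly.
 */
export function buildSessionInsert(
  table: string,
  row: SessionRow,
  embedding: number[] | null,
): SqlStatement {
  const columns = ["id", "message"];
  const params: unknown[] = [row.id, row.message];
  if (embedding !== null) {
    columns.push("message_embedding");
    params.push(embedding);
  }
  const placeholders = params.map((_, i) => `$${i + 1}`).join(", ");
  return {
    text: `INSERT INTO ${table} (${columns.join(", ")}) VALUES (${placeholders})`,
    params,
  };
}
```

Omitting the column on a miss keeps the consumer-only path byte-for-byte identical to the current behavior, which makes the change easy to verify in the E2E run.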
Out of scope
Recycle-after-hello-mismatch logic in pi or openclaw. The shared canonical daemon binary is updated when the user reruns `hivemind embeddings install`; pi/openclaw just consume whatever's there.