openclaw: name message_embedding in auto-capture INSERT to avoid deeplake vector::at #180

Open
kaghni wants to merge 1 commit into main from fix/openclaw-insert-embedding-column

Conversation

@kaghni (Collaborator) commented May 18, 2026

Summary

Auto-capture has been failing in production on every openclaw turn with `Auto-capture failed: Query failed: 500: {"error":"Database error: Failed to insert tuple: vector::at out of range"}`. The failure was diagnosed during the live E2E test of #170/#171/#172/#124 against @deeplake/hivemind@0.7.32.

Root cause

The openclaw auto-capture INSERT omits the `message_embedding` column entirely. The bundle stubs `node:child_process`, so it can't spawn the embed daemon; the prior assumption was that an unmentioned `FLOAT4[]` column would default to NULL. It doesn't: deeplake-api's tuple-builder hits an internal C++ bounds check on the missing column.

Side-by-side:

-- openclaw (current, fails with vector::at):
INSERT INTO "sessions" (id, path, filename, message,                     author, ...)
                                                    ^ column OMITTED
VALUES                  (uuid, ..., jsonb,                              author, ...)

-- claude-code (works, even with NULL embedding):
INSERT INTO "sessions" (id, path, filename, message, message_embedding, author, ...)
                                                     ^^^^^^^^^^^^^^^^^
VALUES                  (uuid, ..., jsonb,           NULL,              author, ...)
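
One way to picture the difference between the two statements is the toy model below. It is an illustration only: nothing here is taken from deeplake-api's source, and `SCHEMA`, `at`, and `buildTuple` are hypothetical names. The assumption is that a tuple builder walks the table schema and reads one value per column through a bounds-checked accessor, so a column left out of the INSERT leaves the values one short and the final lookup goes out of range instead of falling back to NULL.

```typescript
// Toy model (assumption): NOT deeplake-api code. It only illustrates how a
// bounds-checked lookup can fail when an INSERT supplies fewer values than
// the table has columns.
const SCHEMA = ["id", "path", "filename", "message", "message_embedding", "author"] as const;

// Bounds-checked access, loosely analogous to C++ std::vector::at.
function at<T>(values: readonly T[], index: number): T {
  if (index < 0 || index >= values.length) {
    throw new RangeError("vector::at out of range");
  }
  return values[index] as T;
}

// Hypothetical builder that assumes exactly one value per schema column.
function buildTuple(values: readonly (string | null)[]): Record<string, string | null> {
  const tuple: Record<string, string | null> = {};
  SCHEMA.forEach((column, i) => {
    tuple[column] = at(values, i);
  });
  return tuple;
}

// Column named, value NULL: six values for six columns, so the build succeeds.
const withNull = buildTuple(["uuid", "/p", "f.jsonl", "{}", null, "alice"]);

// Column omitted: five values for six columns, so the last lookup throws.
let omittedError = "";
try {
  buildTuple(["uuid", "/p", "f.jsonl", "{}", "alice"]);
} catch (err) {
  omittedError = (err as Error).message;
}
console.log(withNull.message_embedding, omittedError);
```

Whether the real storage layer works this way is not confirmed by anything in this PR; the model only shows why "named with NULL" and "omitted" can behave differently.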

Decisive evidence this is fixable on our side

PR #168's body explicitly states that pre-fix claude-code wrote rows with `message_embedding = NULL` successfully ("writes through as NULL forever after"). Those rows weren't rejected; they just had NULL embeddings. So deeplake-api does accept a NULL `FLOAT4[]` when the column is named in the INSERT. The literal difference between "writes succeed with NULL" and "vector::at out of range" is whether the column appears in the INSERT column list.

Fix

Name message_embedding in the openclaw INSERT and pass NULL in the values tuple — mirroring the working claude-code-with-NULL-embedding shape. One semantic delta at the call site (openclaw/src/index.ts).
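
The shape of the fix can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in openclaw/src/index.ts: `SessionRow`, `sqlString`, and `buildSessionInsert` are hypothetical, the body of `embeddingSqlLiteral` is a guess (only its name and its `null` → `"NULL"` behavior appear in the commit message), the array literal syntax for a non-null embedding is assumed, and the column list is limited to the columns the PR names because the rest are elided there.

```typescript
// Sketch under assumptions: hypothetical helpers, shortened column list.
interface SessionRow {
  id: string;
  path: string;
  filename: string;
  message: string; // JSON text
  author: string;
}

// Minimal SQL string literal that doubles single quotes.
// (Assumption: the real plugin has its own escaping helper.)
function sqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Guess at the helper named in the commit message: null maps to bare NULL.
// The non-null branch uses an assumed Postgres-style array literal.
function embeddingSqlLiteral(embedding: number[] | null): string {
  if (embedding === null) return "NULL";
  return `ARRAY[${embedding.join(", ")}]::float4[]`;
}

function buildSessionInsert(row: SessionRow, embedding: number[] | null): string {
  const columns = ["id", "path", "filename", "message", "message_embedding", "author"];
  const values = [
    sqlString(row.id),
    sqlString(row.path),
    sqlString(row.filename),
    `${sqlString(row.message)}::jsonb`,
    embeddingSqlLiteral(embedding), // always present, even when NULL
    sqlString(row.author),
  ];
  return `INSERT INTO "sessions" (${columns.join(", ")}) VALUES (${values.join(", ")})`;
}

// openclaw cannot compute an embedding, so it always passes null.
const sql = buildSessionInsert(
  { id: "u1", path: "/s/u1", filename: "u1.jsonl", message: '{"text":"hi"}', author: "alice" },
  null,
);
console.log(sql);
```

Keeping the column list and the values list side by side in one function makes it harder for the two to drift apart again.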

Plus a regex window bump in `tests/claude-code/skillify-session-start-injection.test.ts`: the inline comment documenting why we need the explicit NULL pushed the span between `agent_end` and `Auto-captured` past the previous 3500-char window.
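
For readers unfamiliar with this kind of bundle-scan assertion, here is a rough sketch of a bounded-window check. The helper `withinWindow` is hypothetical and the real assertion in the test file may be written differently; the value 4000 is purely illustrative, since the PR does not state the new window size.

```typescript
// Hypothetical bundle-scan check: does `end` appear within `maxGap`
// characters after some occurrence of `start`?
function withinWindow(source: string, start: string, end: string, maxGap: number): boolean {
  // Escape regex metacharacters so the markers are matched literally.
  const escape = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`${escape(start)}[\\s\\S]{0,${maxGap}}?${escape(end)}`);
  return pattern.test(source);
}

// Simulated bundle where 3600 characters separate the two markers,
// as happens once a comment and a longer INSERT are added in between.
const bundle = "agent_end" + "x".repeat(3600) + "Auto-captured";

const fitsOldWindow = withinWindow(bundle, "agent_end", "Auto-captured", 3500);
const fitsNewWindow = withinWindow(bundle, "agent_end", "Auto-captured", 4000);
console.log(fitsOldWindow, fitsNewWindow);
```

A bounded window keeps the test from matching an unrelated `Auto-captured` far away in the bundle, at the cost of needing a bump whenever the code between the markers grows.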

Test plan

  • npm run typecheck clean
  • npm run build produces the openclaw bundle with the new INSERT
  • npm test — 2708/2709 pass; the 1 failure is deeplake-fs.test.ts > batches and flushes on BATCH_SIZE writes which also fails on origin/main (pre-existing flake, unrelated)
  • npm run audit:openclaw -- --criticals-only — 0 critical
  • Live verification once merged: install 0.7.33 (or whatever the bump-bot produces), restart openclaw gateway, send a Telegram message, confirm Auto-captured N messages log line appears instead of vector::at

Out of scope (separate)

  • The Query timeout after 10000ms failures — once vector::at stops, we'll know whether the timeouts also stop (most are likely the slow-failure path of the same bug) or are a real deeplake-side write-latency issue that warrants a separate report.
  • The pre-existing structural bug where the skillify spawn lives inside the auto-capture try/catch (so when capture fails, mining also gets skipped) — separate small PR.
  • Nice-to-have feedback for the deeplake team: `vector::at out of range` is a leaked C++ bounds-check error that's hard to debug from the plugin side; a clearer "INSERT missing required column for column-store table" message would have saved hours.
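
The structural bug in the second bullet can be illustrated with a small sketch of the intended shape. Everything here is hypothetical: `onAgentEnd`, `captureTurn`, and `spawnSkillify` are stand-ins, the "Skillify spawn failed" log line is invented for the example, and the steps are synchronous for brevity even though the real handlers are presumably async.

```typescript
// Sketch: hypothetical stand-ins, synchronous for brevity.
type Step = () => void;

function onAgentEnd(captureTurn: Step, spawnSkillify: Step, log: (line: string) => void): void {
  try {
    captureTurn();
  } catch (err) {
    log(`Auto-capture failed: ${(err as Error).message}`);
  }
  // Outside the capture try/catch, so a capture failure no longer skips mining.
  try {
    spawnSkillify();
  } catch (err) {
    log(`Skillify spawn failed: ${(err as Error).message}`);
  }
}

// Simulate a failing capture and confirm that mining still runs.
const state = { miningRan: false };
const lines: string[] = [];
onAgentEnd(
  () => {
    throw new Error("Query failed: 500");
  },
  () => {
    state.miningRan = true;
  },
  (line) => {
    lines.push(line);
  },
);
console.log(state.miningRan, lines);
```

Giving each step its own try/catch keeps the two failure modes independent, which also makes it easier to tell from the logs which one actually failed.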

Summary by CodeRabbit

  • Refactor
    • Enhanced session data capture to explicitly include all required fields with consistent handling
  • Tests
    • Updated test assertions to validate improved session capture behavior

Symptom: production gateway emits `Auto-capture failed: Query failed:
500: {"error":"Database error: Failed to insert tuple: vector::at out
of range"}` from deeplake-api on every openclaw turn (request IDs in
the gateway journal, multiple workspaces affected).

Root cause: the openclaw auto-capture INSERT omitted the
`message_embedding` column entirely:

  INSERT INTO "sessions" (id, path, filename, message, author, ...)
                                                       ^ skipped
  VALUES                  (uuid, ..., jsonb,           userName, ...)

The bundle stubs `node:child_process` so it can't spawn the embed
daemon, and the prior assumption was that an unmentioned `FLOAT4[]`
column would default to NULL. It doesn't — deeplake-api's storage
layer hits an internal C++ bounds check (`vector::at out of range`)
when building the tuple for a row that has no value for the embedding
column.

Decisive negative evidence from #168's PR body: pre-#168 claude-code
sessions wrote successfully with NULL embeddings, because its INSERT
NAMED the column with `embeddingSqlLiteral(null)` → `"NULL"`. The
literal difference between "writes succeed with NULL" and
"vector::at" is whether the column appears in the INSERT.

Fix: list `message_embedding` in the column list and pass `NULL` in
the values tuple — mirroring the working claude-code shape when its
embed daemon returns null.

No schema migration. No deeplake-api change. One-line semantic delta
at the call site, plus a regex window bump in the bundle-scan test
(the comment + slightly longer INSERT statement pushed past 3500
chars between `agent_end` and `Auto-captured`).

Confidence: high — column-count math lines up with the C++
assertion, and #168's working-with-NULL path is direct counter-
evidence for the "deeplake rejects NULL FLOAT4[]" hypothesis.
@coderabbitai

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b2f2e49b-0f74-409f-b0cb-d70fa040ba1f

📥 Commits

Reviewing files that changed from the base of the PR and between 66ad723 and 9a56ff1.

📒 Files selected for processing (2)
  • openclaw/src/index.ts
  • tests/claude-code/skillify-session-start-injection.test.ts

Walkthrough

This PR adds an explicit message_embedding column to the session auto-capture SQL INSERT statement in openclaw/src/index.ts during agent_end handling, passing NULL for that column. The corresponding test assertion regex bounds are expanded to accommodate the revised code structure.

Changes

Session Embedding Capture

| Layer / File(s) | Summary |
| --- | --- |
| Session embedding column auto-capture and test assertion: openclaw/src/index.ts, tests/claude-code/skillify-session-start-injection.test.ts | The agent_end auto-capture SQL INSERT now includes an explicit message_embedding column with NULL value. The test regex assertion is updated with larger segment windows to validate the expanded code structure. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

  • activeloopai/hivemind#80: Addresses the same message_embedding column addition to session capture, resolving a reported parity gap with claude-code/codex.
  • activeloopai/hivemind#178: Documents the session capture omission of message_embedding that this PR remedies by adding the column to the INSERT.

Possibly related PRs

  • activeloopai/hivemind#172: Modifies adjacent agent_end session capture and spawn/dedupe logic in the same code path; the test structure expectations align with this PR's assertion updates.

Suggested reviewers

  • efenocchi

Poem

🐰 A column was lost in the session's embrace,
message_embedding left without place,
Now NULL takes its stand in the INSERT's flow,
And tests hop along where the structure will go! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: adding message_embedding to the openclaw auto-capture INSERT to fix the deeplake vector::at error. |
| Description check | ✅ Passed | The description is comprehensive and covers the summary, root cause diagnosis, fix explanation, and test plan. The Version Bump section checkbox is not explicitly checked, which is a gap in following the template. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

@coderabbitai coderabbitai Bot requested a review from efenocchi May 18, 2026 22:23
@github-actions
Contributor

Coverage Report

No src/*.ts files changed in this PR.

Generated for commit ec522d4.
