
fix: cascade-clean orphaned vec_episodes rows on import_from_dict force-overwrite#87

Merged

AxDSan merged 1 commit into AxDSan:main from kohai-ut:fix/orphan-vec-episodes-cleanup on May 12, 2026

Conversation

@kohai-ut
Contributor

TL;DR

vec_episodes is a sqlite-vec virtual table keyed by episodic_memory.rowid (AUTOINCREMENT). When BeamMemory.import_from_dict(force=True) DELETEs an existing episodic_memory row to replace it, the new row gets a fresh rowid via cursor.lastrowid — leaving the old row's vector embedding stranded in vec_episodes pointing at a rowid that AUTOINCREMENT will never re-issue. Operators with high import churn accumulate dead vec_episodes entries indefinitely.

This PR adds a cascade DELETE FROM vec_episodes WHERE rowid = existing["rowid"] BEFORE the episodic_memory DELETE, plus broadens the cleanup exception catch to sqlite3.Error so a cleanup failure doesn't abort the import mid-loop.
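The orphan mechanism described above can be reproduced in a few lines. This is a minimal, self-contained sketch: plain sqlite3 tables stand in for the real schema (the actual `vec_episodes` is a sqlite-vec virtual table, not needed here), and the column names are illustrative, not BeamMemory's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodic_memory "
             "(id INTEGER PRIMARY KEY AUTOINCREMENT, uuid TEXT UNIQUE, content TEXT)")
conn.execute("CREATE TABLE vec_episodes (rowid INTEGER PRIMARY KEY, embedding BLOB)")

# Initial import: the episode gets rowid 1, its embedding is keyed to it.
cur = conn.execute("INSERT INTO episodic_memory (uuid, content) VALUES ('e1', 'v1')")
conn.execute("INSERT INTO vec_episodes (rowid, embedding) VALUES (?, x'00')",
             (cur.lastrowid,))

# Force re-import WITHOUT the cascade: DELETE + INSERT yields a fresh rowid.
conn.execute("DELETE FROM episodic_memory WHERE uuid = 'e1'")
cur = conn.execute("INSERT INTO episodic_memory (uuid, content) VALUES ('e1', 'v2')")
new_rowid = cur.lastrowid  # AUTOINCREMENT never re-issues 1, so this is 2

# The old embedding now points at a rowid no episode will ever have again.
orphans = conn.execute(
    "SELECT COUNT(*) FROM vec_episodes v "
    "LEFT JOIN episodic_memory e ON e.id = v.rowid WHERE e.id IS NULL"
).fetchone()[0]
print(new_rowid, orphans)  # 2 1
```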

5 regression tests, all passing. Originally flagged by Codex adversarial on PR #84 as a deferred follow-up.

Why this matters

Storage hygiene: long-running deployments that do regular import/export cycles (backups, multi-source sync) accumulate orphaned vec_episodes rows that never get cleaned. Each entry is ~768 bytes (float32) or ~48 bytes (binary quantized) — small per row but unbounded growth at scale.

This is the only DELETE FROM episodic_memory site in production code (verified via grep). Mnemosyne.forget() doesn't currently cascade to episodic_memory at all — that's the separate C17 concern in the ledger.

What this PR does

Single commit:

  • import_from_dict cascade-cleans vec_episodes before the episodic_memory DELETE on the force-overwrite path
  • Guarded by _vec_available(self.conn) — sqlite-vec is optional
  • Hoisted vec_ok = _vec_available(self.conn) from before the embeddings-import loop to before the episodic_memory loop — both code paths now share one check
  • Broad except sqlite3.Error (not the narrower OperationalError) catches every sqlite3 exception subclass and logs a WARNING — working_memory was already committed at line 3978, so propagating a cleanup error would abort the import mid-loop with partial state

/review army findings (all addressed)

| Finding | Sources | Fix |
| --- | --- | --- |
| Initial "best-effort failure" test dropped vec_episodes before calling import, causing a vec_ok=False short-circuit instead of exercising the try/except | 2-source: Codex structured P2 GATE FAIL + Claude C1 CRITICAL | Test now monkey-patches _vec_available to return True after dropping the table, so the cascade runs and the DELETE fails, exercising the try/except |
| Missing end-to-end test with a non-empty episodic_embeddings payload — every other test used [], leaving the old_to_new_rowid mapping path uncovered for the force-overwrite case | Claude H1 | Added test_import_from_dict_force_with_new_embeddings_no_orphan |
| except sqlite3.OperationalError too narrow — other sqlite3 exception subclasses propagate and abort the import mid-loop | Claude H2 | Broadened to except sqlite3.Error as cleanup_exc with a WARNING log |
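The first finding can be shown in miniature: with `vec_episodes` dropped, a real availability check would return False and skip the cascade entirely, so the fixed test forces the check to True; the cascade DELETE then hits the missing table and raises, which is exactly the failure the try/except must absorb. Plain sqlite3 sketch, table name per the PR:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vec_episodes (rowid INTEGER PRIMARY KEY, embedding BLOB)")
conn.execute("DROP TABLE vec_episodes")  # simulate the test's dropped table

try:
    # With _vec_available monkey-patched to True, this line now runs.
    conn.execute("DELETE FROM vec_episodes WHERE rowid = 1")
    raised = None
except sqlite3.Error as exc:
    raised = type(exc).__name__
print(raised)  # OperationalError ("no such table"), a sqlite3.Error subclass
```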

What is NOT in this PR (intentional, documented)

  • Mnemosyne.forget() doesn't cascade to episodic_memory at all (C17 in ledger) — pre-existing different concern
  • memory_embeddings fallback table gets stale content on force-overwrite — different orphan shape, separate concern
  • binary_vector column NULL after force-INSERT — pre-existing, not affected by this PR
  • Concurrent import_from_dict race — import_from_dict is not advertised as concurrent-safe; out of scope

Test plan

  • All 5 tests in tests/test_orphan_vec_episodes_cleanup.py pass
  • All 20 tests in tests/test_integration.py pass
  • Targeted test_beam.py -k "import or export" run: green
  • Codex structured /review: GATE PASS after commit 1's fix
  • Claude adversarial /review: CRITICAL + 2 HIGH (all addressed)
  • CI: full suite green on Python 3.9 / 3.10 / 3.11 / 3.12

🤖 Generated with Claude Code

fix: cascade-clean orphaned vec_episodes rows on import_from_dict force-overwrite

`vec_episodes` is a sqlite-vec virtual table keyed by
`episodic_memory.rowid` (AUTOINCREMENT INTEGER PK). When
`BeamMemory.import_from_dict(force=True)` DELETEs an existing
episodic_memory row to replace it, the new row gets a fresh rowid
via `cursor.lastrowid` — leaving the old row's vector embedding
stranded in vec_episodes pointing at a rowid that AUTOINCREMENT
will never re-issue. Operators with high import churn (backup/
restore cycles, multi-source imports, periodic sync) accumulate
dead vec_episodes entries indefinitely.

Bug originally surfaced by Codex adversarial /review on PR AxDSan#84
as a deferred follow-up — not E2.a.5's bug, but a sibling
storage-hygiene concern in the same code area.

Fix:
- In `import_from_dict`'s episodic_memory loop, when an existing
  row is found and `force=True`, DELETE FROM vec_episodes
  WHERE rowid = existing["rowid"] BEFORE deleting the
  episodic_memory row.
- Guarded by `_vec_available(self.conn)` — sqlite-vec is optional.
- Hoisted `vec_ok = _vec_available(self.conn)` from before the
  embeddings-import loop to before the episodic_memory loop so
  both code paths share one check.
- Broad `except sqlite3.Error` (not just OperationalError) catches
  all sqlite3 exception subclasses with logging — `working_memory`
  was already committed, so propagating any cleanup error would
  abort the import mid-loop leaving partial state. Best-effort
  cleanup: log + continue. Data integrity > orphan cleanup.

/review army (Codex structured GATE FAIL + Claude adversarial
CRITICAL/HIGH) caught two issues in commit 1's initial test:

CRITICAL (2-source: Codex P2 + Claude C1) — the "best-effort
failure" test dropped vec_episodes before calling import, but the
production `vec_ok = _vec_available()` check then returned False
and the cascade was SKIPPED entirely. The try/except path was
never exercised — the test would have passed even with the
try/except removed. Fixed by monkey-patching `_vec_available` to
return True even after the table is dropped, so the cascade runs,
the DELETE fails, and the try/except is actually exercised.

HIGH (Claude H1) — every other test used `"episodic_embeddings":
[]`. The real-world payload carries embeddings, and the
interesting case is: (a) cascade cleans old vec entry, (b)
INSERT assigns new rowid, (c) embeddings section maps old-payload-
rowid → new-rowid → reinserts. Added
`test_import_from_dict_force_with_new_embeddings_no_orphan` to
verify the full round-trip produces exactly 1 episodic_memory
row + 1 vec_episodes row with vec_episodes.rowid ==
episodic_memory.rowid.
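The (a)→(b)→(c) round-trip the H1 test covers can be sketched as follows. A minimal sketch: the schema and the `old_to_new_rowid` dict follow the PR's description of how `import_from_dict` rekeys the embeddings payload, not the actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodic_memory "
             "(id INTEGER PRIMARY KEY AUTOINCREMENT, uuid TEXT)")
conn.execute("CREATE TABLE vec_episodes (rowid INTEGER PRIMARY KEY, embedding BLOB)")

# Existing state: one episode at rowid 1 with its embedding.
conn.execute("INSERT INTO episodic_memory (uuid) VALUES ('e1')")
conn.execute("INSERT INTO vec_episodes VALUES (1, x'aa')")

old_to_new_rowid = {}
# (a) cascade cleans the old vec entry, then the episodic row is replaced
conn.execute("DELETE FROM vec_episodes WHERE rowid = 1")
conn.execute("DELETE FROM episodic_memory WHERE id = 1")
# (b) the INSERT assigns a fresh rowid; record the mapping
cur = conn.execute("INSERT INTO episodic_memory (uuid) VALUES ('e1')")
old_to_new_rowid[1] = cur.lastrowid
# (c) the embeddings section reinserts under the remapped rowid
for old_rowid, blob in [(1, b"\xaa")]:  # payload keyed by old rowids
    conn.execute("INSERT INTO vec_episodes VALUES (?, ?)",
                 (old_to_new_rowid[old_rowid], blob))

rows = conn.execute("SELECT rowid FROM vec_episodes").fetchall()
ids = conn.execute("SELECT id FROM episodic_memory").fetchall()
print(rows, ids)  # exactly one row each, with matching rowid 2
```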

HIGH (Claude H2) — `except sqlite3.OperationalError` was too
narrow. Other sqlite3.Error subclasses (DatabaseError,
NotSupportedError, IntegrityError from a corrupted vec0 shadow,
etc.) would propagate and abort the import mid-loop with
working_memory already committed — partial state. Broadened to
`except sqlite3.Error` with a WARNING log so operators can see
when the cleanup failed.
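The hierarchy behind this fix is worth making explicit: in the standard library's sqlite3 module, OperationalError, IntegrityError, and NotSupportedError are siblings under DatabaseError, so an `except sqlite3.OperationalError` clause misses the other subclasses, while `except sqlite3.Error` (the common base) catches them all.

```python
import sqlite3

# OperationalError does NOT cover its siblings...
caught_by_operational = issubclass(sqlite3.IntegrityError, sqlite3.OperationalError)
# ...but sqlite3.Error is the base of every concrete sqlite3 exception.
caught_by_error = all(
    issubclass(exc, sqlite3.Error)
    for exc in (sqlite3.OperationalError, sqlite3.DatabaseError,
                sqlite3.IntegrityError, sqlite3.NotSupportedError)
)
print(caught_by_operational, caught_by_error)  # False True
```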

Out of scope (documented, separate tickets):
- `Mnemosyne.forget()` doesn't cascade to episodic_memory at all
  (C17 in ledger) — pre-existing different concern
- `memory_embeddings` fallback table gets stale content on force-
  overwrite — different orphan shape, separate concern
- `binary_vector` column NULL after force-INSERT — pre-existing
- Concurrent `import_from_dict` race — `import_from_dict` is not
  advertised as concurrent-safe; out of scope

5 new tests in `tests/test_orphan_vec_episodes_cleanup.py`:
- test_import_from_dict_force_cleans_vec_episodes_orphan
- test_import_from_dict_no_force_does_not_touch_vec_episodes
- test_import_from_dict_force_idempotent_no_orphan_accumulation
- test_import_from_dict_cleanup_failure_is_best_effort
- test_import_from_dict_force_with_new_embeddings_no_orphan

29 tests pass (5 new + 24 existing import/integration/E4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AxDSan AxDSan merged commit 373e056 into AxDSan:main May 12, 2026
5 checks passed
@kohai-ut kohai-ut deleted the fix/orphan-vec-episodes-cleanup branch May 12, 2026 18:10