Skip to content

fix(store): make LanceStore::update atomic via merge_insert#20

Merged
yhyyz merged 1 commit into
ourmem:mainfrom
doctatortot:fix/atomic-memory-update
May 21, 2026
Merged

fix(store): make LanceStore::update atomic via merge_insert#20
yhyyz merged 1 commit into
ourmem:mainfrom
doctatortot:fix/atomic-memory-update

Conversation

@doctatortot
Copy link
Copy Markdown
Contributor

Summary

LanceStore::update() can silently lose a memory if the operation is interrupted partway through. It replaces a row with two separate, independently-committed operations — a delete followed by an add — and if execution stops between them, the row is deleted but never re-inserted.

Root cause

// store/lancedb.rs — update()
table.delete(&format!("id = '{}'", escape_sql(&mem.id))).await?;   // commit #1
// ... build batch ...
table.add(reader).execute().await?;                                // commit #2

These are two distinct transactions. If the future is dropped at the second .await — which happens whenever the caller's request is cancelled (e.g. an HTTP client disconnects/aborts and the axum handler future is dropped) — the delete has already committed but the add never runs. The row is gone with no replacement.

soft_delete() goes through the same update() path, so DELETE /v1/memories/{id} and batch_soft_delete are exposed too: a cancelled delete can drop the row entirely instead of tombstoning it (state = 'deleted').

When it bites

It's easy to miss because the happy path works and create() (a single add) is already atomic. It shows up under interruption, and is most likely when an update is slow enough that a client times out and aborts mid-flight — e.g. a content-changing update that re-embeds a large document on a busy/slow host. The result is a memory that "vanished" with no error on the server side.

Fix

Replace the delete-then-add with a single atomic merge_insert upsert keyed on id:

let mut merge = table.merge_insert(&["id"]);
merge.when_matched_update_all(None);
merge.when_not_matched_insert_all();
merge.execute(reader).await?;

This commits as one transaction, so an interrupted update either fully applies or not at all — it can never leave the row deleted-without-replacement. Behavior is otherwise preserved: the version bump + updated_at are still computed before the write, and updating an id that doesn't exist still inserts it (the old delete-no-op + add did the same; when_not_matched_insert_all keeps that). Rows are keyed by unique ids, so the single-match update path applies.

Tests

  • test_update_replaces_row_atomically — update replaces a row in place: exactly one copy remains (no duplicate, no loss), content + version updated.
  • test_update_upserts_when_row_absent — update inserts when the row is absent.
  • Full suite green (cargo test, 381 passed). cargo fmt --check and clippy clean for the change.

Notes

merge_insert is available in the pinned lancedb (0.27). No schema or API changes; purely an internal durability fix in update().

`update()` replaced a memory by deleting the row and then adding the new
version as two separate operations. That sequence is not atomic: if the
process is interrupted between the delete and the re-insert — e.g. the HTTP
request is cancelled, so the axum handler future is dropped at an `.await` —
the row ends up deleted but never restored, silently losing the memory.
`soft_delete()` (and therefore `delete_memory` / `batch_soft_delete`) goes
through the same path, so a cancelled delete could likewise drop the row
instead of marking it deleted.

Replace the delete-then-add with a single `merge_insert` upsert keyed on
`id` (`when_matched_update_all` + `when_not_matched_insert_all`), which
commits as one transaction. An interrupted update now either fully applies
or not at all — it can never leave the row deleted-without-replacement.

Adds two regression tests: update replaces a row in place (exactly one
copy, content + version updated) and upserts when the row is absent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yhyyz yhyyz merged commit 100ff5b into ourmem:main May 21, 2026
1 check passed
yhyyz added a commit that referenced this pull request May 21, 2026
PR #14 added include_superseded param to list(), PR #20's tests
used the old 2-arg signature. Updated to list(100, 0, false).
@yhyyz
Copy link
Copy Markdown
Contributor

yhyyz commented May 21, 2026

Merged and deployed — great catch @doctatortot! The delete-then-add race condition was a real data-loss bug waiting to happen. merge_insert is the correct fix. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants