fix(store): make LanceStore::update atomic via merge_insert #20
Merged
`update()` replaced a memory by deleting the row and then adding the new version as two separate operations. That sequence is not atomic: if the process is interrupted between the delete and the re-insert (e.g. the HTTP request is cancelled, so the axum handler future is dropped at an `.await`), the row ends up deleted but never restored, silently losing the memory. `soft_delete()` (and therefore `delete_memory` / `batch_soft_delete`) goes through the same path, so a cancelled delete could likewise drop the row instead of marking it deleted.

Replace the delete-then-add with a single `merge_insert` upsert keyed on `id` (`when_matched_update_all` + `when_not_matched_insert_all`), which commits as one transaction. An interrupted update now either fully applies or not at all; it can never leave the row deleted-without-replacement.

Adds two regression tests: update replaces a row in place (exactly one copy, content + version updated) and upserts when the row is absent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yhyyz added a commit that referenced this pull request on May 21, 2026
Contributor
Merged and deployed: great catch @doctatortot! The delete-then-add race condition was a real data-loss bug waiting to happen.
Summary
`LanceStore::update()` can silently lose a memory if the operation is interrupted partway through. It replaces a row with two separate, independently-committed operations, a `delete` followed by an `add`, and if execution stops between them, the row is deleted but never re-inserted.

Root cause
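A minimal, std-only sketch of the failure mode, not the PR's actual code. It assumes a hypothetical in-memory `Table` in place of the Lance table, and every name in it (`Table`, `YieldOnce`, `delete_row`, `add_row`, `update_non_atomic`, `cancel_mid_update`) is illustrative. It only models the shape of the problem: a delete that lands, followed by an add that suspends at an `.await` and is dropped before it writes.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical in-memory stand-in for the table: id -> content.
type Table = RefCell<HashMap<String, String>>;

/// Future that is `Pending` on its first poll, modelling an I/O suspension.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// First commit: in this model the delete lands without suspending.
async fn delete_row(table: &Table, id: &str) {
    table.borrow_mut().remove(id);
}

/// Second commit: suspends once (as real I/O would) before writing the row.
async fn add_row(table: &Table, id: &str, content: &str) {
    YieldOnce(false).await;
    table.borrow_mut().insert(id.to_string(), content.to_string());
}

/// The delete-then-add shape: two separate steps, each behind its own `.await`.
async fn update_non_atomic(table: &Table, id: &str, content: &str) {
    delete_row(table, id).await;
    add_row(table, id, content).await; // a drop here leaves the row deleted
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the update once, then drops it, as a cancelled request would.
/// Returns true if the future was still pending when it was dropped.
fn cancel_mid_update(table: &Table, id: &str) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(update_non_atomic(table, id, "new content"));
    let was_pending = fut.as_mut().poll(&mut cx).is_pending();
    drop(fut); // cancellation: the insert in `add_row` never runs
    was_pending
}
```

Seeding the table with one row and calling `cancel_mid_update` on its id leaves the table without that row: the delete took effect on the first poll, and the add was still suspended when the future was dropped.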
These are two distinct transactions. If the future is dropped at the second `.await`, which happens whenever the caller's request is cancelled (e.g. an HTTP client disconnects/aborts and the axum handler future is dropped), the delete has already committed but the add never runs. The row is gone with no replacement.

`soft_delete()` goes through the same `update()` path, so `DELETE /v1/memories/{id}` and `batch_soft_delete` are exposed too: a cancelled delete can drop the row entirely instead of tombstoning it (`state = 'deleted'`).

When it bites
It's easy to miss because the happy path works and `create()` (a single `add`) is already atomic. It shows up under interruption, and is most likely when an update is slow enough that a client times out and aborts mid-flight, e.g. a content-changing update that re-embeds a large document on a busy/slow host. The result is a memory that "vanished" with no error on the server side.

Fix
Replace the delete-then-add with a single atomic `merge_insert` upsert keyed on `id` (`when_matched_update_all` + `when_not_matched_insert_all`).

This commits as one transaction, so an interrupted update either fully applies or does not apply at all; it can never leave the row deleted-without-replacement. Behavior is otherwise preserved: the version bump and `updated_at` are still computed before the write, and updating an id that doesn't exist still inserts it (the old delete-no-op + add did the same; `when_not_matched_insert_all` keeps that). Rows are keyed by unique ids, so the single-match update path applies.

Tests
- `test_update_replaces_row_atomically`: update replaces a row in place; exactly one copy remains (no duplicate, no loss), content + version updated.
- `test_update_upserts_when_row_absent`: update inserts when the row is absent.
- `cargo test`: 381 passed.
- `cargo fmt --check` and `clippy` clean for the change.

Notes
`merge_insert` is available in the pinned `lancedb` (0.27). No schema or API changes; purely an internal durability fix in `update()`.
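
The upsert semantics described under Fix can be modelled without `lancedb`. The sketch below is an assumption-laden illustration, not the library's API: `Memory`, `Store`, and `copies_of` are hypothetical, and this `merge_insert` is a plain method on an in-memory `Vec` that only mimics "update when matched, insert when not matched" with a single publish step.

```rust
/// Hypothetical row shape; the real schema is not shown in this PR.
#[derive(Clone, Debug, PartialEq)]
struct Memory {
    id: String,
    content: String,
    version: u64,
}

/// Hypothetical in-memory table whose only write path is one commit.
#[derive(Default)]
struct Store {
    rows: Vec<Memory>,
}

impl Store {
    /// Upsert keyed on `id`: overwrite the matching row, or insert if none matches.
    fn merge_insert(&mut self, new_row: Memory) {
        // Stage the next table version off to the side.
        let mut next = self.rows.clone();
        match next.iter().position(|r| r.id == new_row.id) {
            Some(i) => next[i] = new_row, // "when matched, update all"
            None => next.push(new_row),   // "when not matched, insert all"
        }
        // Publish in one step. Before this line the old row is visible,
        // after it the new one is; no state has the row missing.
        self.rows = next;
    }

    /// Number of rows carrying the given id (should never exceed 1).
    fn copies_of(&self, id: &str) -> usize {
        self.rows.iter().filter(|r| r.id == id).count()
    }
}
```

Mirroring the two regression tests: upserting an existing id leaves exactly one copy with the new content and version, and upserting an unknown id inserts it. Because the model has no intermediate "deleted" state, an interruption before the publish step leaves the old row in place.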