fix(retain): defer memory_links → memory_units FKs to break cascade deadlock #1398
Merged
nicoloboschi merged 2 commits into main on May 4, 2026
…eadlock
Concurrent INSERT into memory_links (from retain link generation —
temporal, semantic, entity, causal — via _bulk_insert_links) and any
DELETE that cascades through memory_units → memory_links (e.g.
delta-retain superseding chunks: chunks → memory_units → memory_links)
can deadlock under sustained single-tenant write load.
The cycle:
  Tx A: DELETE FROM chunks WHERE chunk_id = ANY(...)
        → CASCADE acquires row locks on memory_units, then on
          memory_links rows where to_unit_id matches the deleted units.
  Tx B: INSERT INTO memory_links (...) referencing one of the same
        memory_units rows.
        → The immediate FK check takes FOR KEY SHARE on those
          memory_units rows.
The two transactions take row locks on the same memory_units rows in
opposite orders depending on which side started first. PostgreSQL
detects the cycle and aborts one of them; the loser is killed mid-batch
and the worker has to retry. Under sustained write load the pattern
repeats.
The _bulk_insert_links sort by (from_unit_id, to_unit_id) prevents
INSERT-vs-INSERT contention but doesn't help INSERT-vs-cascading-DELETE.
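
To make "PostgreSQL detects the cycle" concrete, here is a toy wait-for graph in Python. It is only an illustration of the idea, not PostgreSQL's lock manager: the transaction labels and the `find_wait_cycle` helper are hypothetical, and each blocked transaction is assumed to wait on exactly one other.

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return one cycle in a wait-for graph, or None if there is none.

    waits_for maps a blocked transaction to the transaction that holds
    the lock it needs (toy model: at most one outgoing edge per node).
    """
    for start in waits_for:
        path: List[str] = []
        seen = set()
        node = start
        # Follow "waits for" edges until we run out or revisit a node.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            # We came back to a node on the current path: that is a cycle.
            return path[path.index(node):]
    return None


# Cascading DELETE (tx_a) and link INSERT (tx_b) each wait on the other.
deadlocked = find_wait_cycle({"tx_a": "tx_b", "tx_b": "tx_a"})

# If only one side ever waits, there is no cycle and nothing to abort.
one_sided = find_wait_cycle({"tx_b": "tx_a"})
```

In this model a deadlock needs both edges at once; removing the wait on one side is enough to break the cycle.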
Fix: make both memory_links → memory_units FKs DEFERRABLE INITIALLY
DEFERRED. INSERT no longer takes FOR KEY SHARE on the FK target row at
INSERT time; the check runs at COMMIT instead. A concurrent DELETE
cascades freely; if it has removed the target row by COMMIT, the INSERT
transaction fails with a clean FK violation (sqlstate 23503) instead of
both transactions getting tangled in a deadlock (sqlstate 40P01). The
WHERE EXISTS filter in _bulk_insert_links continues to handle the
typical "stale unit_id" case at INSERT time; the deferred FK is just
the backstop for the narrow race window between EXISTS and COMMIT.
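
Since the failure mode moves from 40P01 to 23503, a caller may want to tell the two apart. The sketch below shows one way to do that in Python. The two SQLSTATE codes are the standard PostgreSQL ones quoted above; the `Outcome` enum, the `classify_commit_failure` helper, and the policy attached to each code are assumptions for illustration, not what the retain worker actually does.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    RETRY_BATCH = "retry_batch"        # transient: run the same batch again
    REFILTER_BATCH = "refilter_batch"  # a referenced unit vanished: drop stale links first
    PROPAGATE = "propagate"            # anything else: let the error surface


# Standard PostgreSQL SQLSTATE codes.
DEADLOCK_DETECTED = "40P01"
FOREIGN_KEY_VIOLATION = "23503"


def classify_commit_failure(sqlstate: Optional[str]) -> Outcome:
    """Map a SQLSTATE from a failed transaction to a (hypothetical) policy."""
    if sqlstate == DEADLOCK_DETECTED:
        return Outcome.RETRY_BATCH
    if sqlstate == FOREIGN_KEY_VIOLATION:
        # With deferred FKs this surfaces at COMMIT, not at INSERT.
        return Outcome.REFILTER_BATCH
    return Outcome.PROPAGATE
```

How the SQLSTATE is read off the exception depends on the database driver, so that part is left out here.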
ON DELETE CASCADE semantics are preserved — only the *timing* of the
constraint check moves. The entity_id FK is left immediate (entities
aren't part of the observed deadlock cycle).
PG-only: Oracle's deferrable-FK semantics differ and the deadlock cycle
was only observed on PostgreSQL.
Tests:
* test_memory_links_deferred_fk verifies both FKs end up
condeferrable=true, condeferred=true, confdeltype='c' (CASCADE)
after the migration runs. Schema-shape invariant — locks in the fix
so a future migration can't regress it accidentally.
* test_migration_shape passes — the new migration uses the
run_for_dialect dispatcher correctly.
A behaviour test (concurrent INSERT + cascading DELETE no longer
deadlocks) is hard to write deterministically because the deadlock
depends on a timing-sensitive interleaving of the two transactions;
the schema-shape test is the durable guard.
review: fix stale migration ID + simplify FK recreation
Address review feedback on the deferred-FK migration:
* tests/test_memory_links_deferred_fk.py: replace stale migration ID
  references (a2v3w4x5y6z7) with the actual ID (9f8e7d6c5b4a) in the
  module docstring and assertion failure message.
* 9f8e7d6c5b4a_memory_links_deferrable_fk.py: replace _FK_NAMES tuple +
  substring-based column derivation with an explicit _FK_COLUMNS dict.
  Drop the misleading DO $$ ... EXCEPTION WHEN duplicate_object blocks;
  DROP CONSTRAINT IF EXISTS already provides idempotence and the
  EXCEPTION clause was unreachable after a successful drop.
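
A minimal sketch of what an explicit name-to-column mapping plus drop-then-add could look like, written as a plain Python statement builder. This is not the migration's actual code: the constraint names, the referenced column `id`, and the `deferrable_fk_statements` helper are assumptions, and mapping the two FKs to `from_unit_id` / `to_unit_id` is inferred from the column names mentioned in the description.

```python
from typing import Dict, List

# Hypothetical constraint names; the real ones live in the migration file.
_FK_COLUMNS: Dict[str, str] = {
    "memory_links_from_unit_id_fkey": "from_unit_id",
    "memory_links_to_unit_id_fkey": "to_unit_id",
}


def deferrable_fk_statements(
    fk_columns: Dict[str, str],
    table: str = "memory_links",
    ref_table: str = "memory_units",
    ref_column: str = "id",  # assumed primary-key column name
) -> List[str]:
    """Build drop-then-add DDL that recreates each FK as deferred."""
    statements: List[str] = []
    for name, column in fk_columns.items():
        # IF EXISTS makes the drop a no-op on re-run, so no EXCEPTION block is needed.
        statements.append(f"ALTER TABLE {table} DROP CONSTRAINT IF EXISTS {name}")
        # Same ON DELETE CASCADE as before; only the check timing changes.
        statements.append(
            f"ALTER TABLE {table} ADD CONSTRAINT {name} "
            f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column}) "
            "ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED"
        )
    return statements


statements = deferrable_fk_statements(_FK_COLUMNS)
```

Keeping the column next to the constraint name avoids deriving one from the other by substring matching, which is the simplification the review asked for.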
liling pushed a commit to liling/hindsight that referenced this pull request on May 5, 2026
Problem
Concurrent retain link generation (_bulk_insert_links writing temporal / semantic / entity / causal links) and any DELETE that cascades through memory_units → memory_links (most prominently delta-retain superseding chunks: chunks → memory_units → memory_links) can deadlock under sustained single-tenant write load.

The cycle:
* DELETE FROM chunks WHERE chunk_id = ANY(...) → CASCADE acquires row locks on memory_units, then on memory_links rows where to_unit_id matches the deleted units.
* INSERT INTO memory_links (...) referencing one of the same memory_units rows → the immediate FK check takes FOR KEY SHARE on those rows.

The two transactions take row locks on the same memory_units rows in opposite orders. PostgreSQL detects the cycle and aborts one of them; the loser is killed mid-batch and has to retry. Under sustained write load the pattern repeats.

The existing _bulk_insert_links sort by (from_unit_id, to_unit_id) prevents INSERT-vs-INSERT contention but doesn't help INSERT-vs-cascading-DELETE.

Fix
Make both memory_links → memory_units FKs DEFERRABLE INITIALLY DEFERRED. INSERT no longer takes FOR KEY SHARE on the FK target row at INSERT time; the check runs at COMMIT instead. A concurrent DELETE cascades freely; if it has removed the target row by COMMIT, the INSERT transaction fails with a clean FK violation (sqlstate 23503) instead of both transactions getting tangled in a deadlock (sqlstate 40P01).

The WHERE EXISTS filter in _bulk_insert_links continues to handle the typical "stale unit_id" case at INSERT time; the deferred FK is just the backstop for the narrow race window between EXISTS and COMMIT.

ON DELETE CASCADE semantics are preserved; only the timing of the constraint check moves. The entity_id FK on memory_links is left immediate (entities aren't part of the observed deadlock cycle).

Why PG-only
The migration intentionally omits the Oracle slot. Oracle's deferrable-FK semantics differ from PostgreSQL's and the deadlock cycle was only observed on PG. This follows the established pattern documented in CLAUDE.md for dialect-only migrations.

Test plan
* test_memory_links_deferred_fk (new): verifies both FKs end up condeferrable=true, condeferred=true, and still confdeltype='c' (CASCADE) after the migration runs. This is a schema-shape invariant that locks in the fix so a future migration can't regress it accidentally.
* test_migration_shape: confirms the new migration uses the run_for_dialect dispatcher correctly. All 62 migrations pass.
* ruff check + ruff format clean.

A behaviour test (concurrent INSERT + cascading DELETE no longer deadlocks) is hard to write deterministically because the deadlock depends on a timing-sensitive interleaving of the two transactions. The schema-shape test is the durable guard; once the constraints are deferred, the deadlock cycle is structurally impossible.
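
As a rough picture of what a schema-shape check on these FKs involves, the Python sketch below pairs a pg_constraint query with a pure validation function. It is not the test from this PR: `FK_SHAPE_QUERY` and `fk_shape_problems` are hypothetical names, and rows are assumed to arrive as dict-like objects with Python bool and str values (some drivers return the "char" column confdeltype in another form and would need normalising first).

```python
from typing import Any, Iterable, List, Mapping

# Catalog query for FKs from memory_links to memory_units.
# The columns are standard pg_constraint columns; confdeltype 'c' means CASCADE.
FK_SHAPE_QUERY = """
SELECT conname, condeferrable, condeferred, confdeltype
FROM pg_constraint
WHERE conrelid = 'memory_links'::regclass
  AND confrelid = 'memory_units'::regclass
  AND contype = 'f'
"""


def fk_shape_problems(
    rows: Iterable[Mapping[str, Any]], expected_count: int = 2
) -> List[str]:
    """Return human-readable problems; an empty list means the shape is right."""
    rows = list(rows)
    problems: List[str] = []
    if len(rows) != expected_count:
        problems.append(f"expected {expected_count} FKs, found {len(rows)}")
    for row in rows:
        name = row["conname"]
        if not row["condeferrable"]:
            problems.append(f"{name}: not DEFERRABLE")
        if not row["condeferred"]:
            problems.append(f"{name}: not INITIALLY DEFERRED")
        if row["confdeltype"] != "c":
            problems.append(f"{name}: ON DELETE action is {row['confdeltype']!r}, expected 'c'")
    return problems
```

A test would run the query after migrations and assert that the returned list is empty, so a later migration that recreates either FK as immediate fails loudly.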
Migration safety
DO $$ ... EXCEPTION WHEN duplicate_object THEN NULL block, so the migration is safe to re-run on schemas where the constraint was already recreated.
ALTER TABLE ... ADD CONSTRAINT validates existing rows but doesn't touch them.