Skip to content

fix(retain): defer memory_links → memory_units FKs to break cascade deadlock#1398

Merged
nicoloboschi merged 2 commits intomainfrom
fix/memory-links-deferred-fk
May 4, 2026
Merged

fix(retain): defer memory_links → memory_units FKs to break cascade deadlock#1398
nicoloboschi merged 2 commits intomainfrom
fix/memory-links-deferred-fk

Conversation

@cdbartholomew
Copy link
Copy Markdown
Contributor

Problem

Concurrent retain link generation (_bulk_insert_links writing temporal / semantic / entity / causal links) and any DELETE that cascades through memory_units → memory_links (most prominently delta-retain superseding chunks: chunks → memory_units → memory_links) can deadlock under sustained single-tenant write load.

The cycle:

  • Tx A: DELETE FROM chunks WHERE chunk_id = ANY(...) → CASCADE acquires row locks on memory_units, then on memory_links rows where to_unit_id matches the deleted units.
  • Tx B: INSERT INTO memory_links (...) referencing one of the same memory_units rows → the immediate FK check takes FOR KEY SHARE on those rows.

The two transactions take row locks on the same memory_units rows in opposite orders. PostgreSQL detects the cycle and aborts one of them; the loser is killed mid-batch and has to retry. Under sustained write load the pattern repeats.

The existing _bulk_insert_links sort by (from_unit_id, to_unit_id) prevents INSERT-vs-INSERT contention but doesn't help INSERT-vs-cascading-DELETE.

Fix

Make both memory_links → memory_units FKs DEFERRABLE INITIALLY DEFERRED. INSERT no longer takes FOR KEY SHARE on the FK target row at INSERT time — checked at COMMIT instead. Concurrent DELETE cascades freely; if it has removed the target row by COMMIT, the INSERT transaction fails with a clean FK violation (sqlstate 23503) instead of both transactions getting tangled in a deadlock (sqlstate 40P01).

The WHERE EXISTS filter in _bulk_insert_links continues to handle the typical "stale unit_id" case at INSERT time; the deferred FK is just the backstop for the narrow race window between EXISTS and COMMIT.

ON DELETE CASCADE semantics are preserved — only the timing of the constraint check moves. The entity_id FK on memory_links is left immediate (entities aren't part of the observed deadlock cycle).

Why PG-only

The migration intentionally omits the Oracle slot. Oracle's deferrable-FK semantics differ from PostgreSQL's and the deadlock cycle was only observed on PG. Following the established pattern documented in CLAUDE.md for dialect-only migrations.

Test plan

  • test_memory_links_deferred_fk (new) — verifies both FKs end up condeferrable=true, condeferred=true, and still confdeltype='c' (CASCADE) after the migration runs. Schema-shape invariant — locks in the fix so a future migration can't regress it accidentally.
  • test_migration_shape — confirms the new migration uses the run_for_dialect dispatcher correctly. All 62 migrations pass.
  • ruff check + ruff format clean.

A behaviour test (concurrent INSERT + cascading DELETE no longer deadlocks) is hard to write deterministically because PG's deadlock detector is racy. The schema-shape test is the durable guard; once the constraints are deferred, the deadlock cycle is structurally impossible.

Migration safety

  • Idempotent: each FK is dropped + recreated inside a DO $$ ... EXCEPTION WHEN duplicate_object THEN NULL block, so the migration is safe to re-run on schemas where the constraint was already recreated.
  • No data movement, no rewrite — ALTER TABLE ... ADD CONSTRAINT validates existing rows but doesn't touch them.
  • Downgrade reverts to the default (non-deferrable) form, restoring prior behaviour exactly.

…eadlock

Concurrent INSERT into memory_links (from retain link generation —
temporal, semantic, entity, causal — via _bulk_insert_links) and any
DELETE that cascades through memory_units → memory_links (e.g.
delta-retain superseding chunks: chunks → memory_units → memory_links)
can deadlock under sustained single-tenant write load.

The cycle:

  Tx A: DELETE FROM chunks WHERE chunk_id = ANY(...)
        → CASCADE acquires row locks on memory_units, then on
          memory_links rows where to_unit_id matches the deleted units.

  Tx B: INSERT INTO memory_links (...) referencing one of the same
        memory_units rows.
        → The immediate FK check takes FOR KEY SHARE on those
          memory_units rows.

The two transactions take row locks on the same memory_units rows in
opposite orders depending on which side started first. PostgreSQL
detects the cycle and aborts one of them; the loser is killed mid-batch
and the worker has to retry. Under sustained write load the pattern
repeats.

The _bulk_insert_links sort by (from_unit_id, to_unit_id) prevents
INSERT-vs-INSERT contention but doesn't help INSERT-vs-cascading-DELETE.

Fix: make both memory_links → memory_units FKs DEFERRABLE INITIALLY
DEFERRED. INSERT no longer takes FOR KEY SHARE on the FK target row at
INSERT time — checked at COMMIT instead. Concurrent DELETE cascades
freely; if it has removed the target row by COMMIT, the INSERT
transaction fails with a clean FK violation (sqlstate 23503) instead of
both transactions getting tangled in a deadlock (sqlstate 40P01). The
WHERE EXISTS filter in _bulk_insert_links continues to handle the
typical "stale unit_id" case at INSERT time; the deferred FK is just
the backstop for the narrow race window between EXISTS and COMMIT.

ON DELETE CASCADE semantics are preserved — only the *timing* of the
constraint check moves. The entity_id FK is left immediate (entities
aren't part of the observed deadlock cycle).

PG-only: Oracle's deferrable-FK semantics differ and the deadlock cycle
was only observed on PostgreSQL.

Tests:
  * test_memory_links_deferred_fk verifies both FKs end up
    condeferrable=true, condeferred=true, confdeltype='c' (CASCADE)
    after the migration runs. Schema-shape invariant — locks in the fix
    so a future migration can't regress it accidentally.
  * test_migration_shape passes — the new migration uses the
    run_for_dialect dispatcher correctly.

A behaviour test (concurrent INSERT + cascading DELETE no longer
deadlocks) is hard to write deterministically because PG's deadlock
detector is racy; the schema-shape test is the durable guard.
@cdbartholomew cdbartholomew requested a review from nicoloboschi May 3, 2026 15:11
Address review feedback on the deferred-FK migration:

* tests/test_memory_links_deferred_fk.py: replace stale migration ID
  references (a2v3w4x5y6z7) with the actual ID (9f8e7d6c5b4a) in the
  module docstring and assertion failure message.
* 9f8e7d6c5b4a_memory_links_deferrable_fk.py: replace _FK_NAMES tuple +
  substring-based column derivation with an explicit _FK_COLUMNS dict.
  Drop the misleading DO $$ ... EXCEPTION WHEN duplicate_object blocks;
  DROP CONSTRAINT IF EXISTS already provides idempotence and the
  EXCEPTION clause was unreachable after a successful drop.
@nicoloboschi nicoloboschi merged commit 06e45ab into main May 4, 2026
60 of 63 checks passed
liling pushed a commit to liling/hindsight that referenced this pull request May 5, 2026
…eadlock (vectorize-io#1398)

* fix(retain): defer memory_links → memory_units FKs to break cascade deadlock

Concurrent INSERT into memory_links (from retain link generation —
temporal, semantic, entity, causal — via _bulk_insert_links) and any
DELETE that cascades through memory_units → memory_links (e.g.
delta-retain superseding chunks: chunks → memory_units → memory_links)
can deadlock under sustained single-tenant write load.

The cycle:

  Tx A: DELETE FROM chunks WHERE chunk_id = ANY(...)
        → CASCADE acquires row locks on memory_units, then on
          memory_links rows where to_unit_id matches the deleted units.

  Tx B: INSERT INTO memory_links (...) referencing one of the same
        memory_units rows.
        → The immediate FK check takes FOR KEY SHARE on those
          memory_units rows.

The two transactions take row locks on the same memory_units rows in
opposite orders depending on which side started first. PostgreSQL
detects the cycle and aborts one of them; the loser is killed mid-batch
and the worker has to retry. Under sustained write load the pattern
repeats.

The _bulk_insert_links sort by (from_unit_id, to_unit_id) prevents
INSERT-vs-INSERT contention but doesn't help INSERT-vs-cascading-DELETE.

Fix: make both memory_links → memory_units FKs DEFERRABLE INITIALLY
DEFERRED. INSERT no longer takes FOR KEY SHARE on the FK target row at
INSERT time — checked at COMMIT instead. Concurrent DELETE cascades
freely; if it has removed the target row by COMMIT, the INSERT
transaction fails with a clean FK violation (sqlstate 23503) instead of
both transactions getting tangled in a deadlock (sqlstate 40P01). The
WHERE EXISTS filter in _bulk_insert_links continues to handle the
typical "stale unit_id" case at INSERT time; the deferred FK is just
the backstop for the narrow race window between EXISTS and COMMIT.

ON DELETE CASCADE semantics are preserved — only the *timing* of the
constraint check moves. The entity_id FK is left immediate (entities
aren't part of the observed deadlock cycle).

PG-only: Oracle's deferrable-FK semantics differ and the deadlock cycle
was only observed on PostgreSQL.

Tests:
  * test_memory_links_deferred_fk verifies both FKs end up
    condeferrable=true, condeferred=true, confdeltype='c' (CASCADE)
    after the migration runs. Schema-shape invariant — locks in the fix
    so a future migration can't regress it accidentally.
  * test_migration_shape passes — the new migration uses the
    run_for_dialect dispatcher correctly.

A behaviour test (concurrent INSERT + cascading DELETE no longer
deadlocks) is hard to write deterministically because PG's deadlock
detector is racy; the schema-shape test is the durable guard.

* review: fix stale migration ID + simplify FK recreation

Address review feedback on the deferred-FK migration:

* tests/test_memory_links_deferred_fk.py: replace stale migration ID
  references (a2v3w4x5y6z7) with the actual ID (9f8e7d6c5b4a) in the
  module docstring and assertion failure message.
* 9f8e7d6c5b4a_memory_links_deferrable_fk.py: replace _FK_NAMES tuple +
  substring-based column derivation with an explicit _FK_COLUMNS dict.
  Drop the misleading DO $$ ... EXCEPTION WHEN duplicate_object blocks;
  DROP CONSTRAINT IF EXISTS already provides idempotence and the
  EXCEPTION clause was unreachable after a successful drop.

---------

Co-authored-by: Nicolò Boschi <boschi1997@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants