Skip to content

staging-v24.1.29: release-24.1: optbuilder: take FOR SHARE locking during RC FK child checks#169973

Merged
rail merged 3 commits intocockroachdb:staging-v24.1.29from
rail:backportstaging-v24.1.29-169926
May 8, 2026
Merged

staging-v24.1.29: release-24.1: optbuilder: take FOR SHARE locking during RC FK child checks#169973
rail merged 3 commits intocockroachdb:staging-v24.1.29from
rail:backportstaging-v24.1.29-169926

Conversation

@rail
Copy link
Copy Markdown
Member

@rail rail commented May 8, 2026

Backport 3/3 commits from #169926.

/cc @cockroachdb/release


Backport 1/1 commits from #150291 and 2/2 commits from #152245.

/cc @cockroachdb/release


opt: take exclusive locks for foreign key cascades under read-committed

Previously foreign key cascades would never take locks on the rows they modified. Under serializable isolation, this presents no difficulty as locks are purely advisory at that isolation level. However, under repeatable read and read committed isolations, locks are necessary to avoid incorrect enforcement.

For example, consider this timing:

t1: R-C statement that deletes from parent starts.
t2: transaction that writes to child commits. This row references the
parent that the previous statement is deleting, but still sees it
because that statement hasn't done anything yet.
t3: R-C statement deletes child keys, but has a timestamp of t1, so
doesn't see the row inserted by t2.

After the R-C transaction commits, we're left with a child table that has a row it should not. A similar problem exists for update cascades.

Fortunately, the fix is relatively simple. By taking an exclusive lock on the child rows before we delete or update them, we force the KV to check write intents. KV will see the write at t2 and either bump t1's timestamp or force it to restart.

Fixes: #150282

Release note (bug fix): A but that would allow a race condition in foreign key cascades under read committed and repeatable read isolations has been fixed.


logictest: fix fk_cascade_race_150282

In #150291 we added subtest fk_cascade_race_150282 which exercises FK
checks performed by concurrent Serializable and Read Committed
transactions. Unfortunately thanks to this test we've discovered that
the locking added in #150291 is insufficient for preventing all FK
violations. We need to turn on the disabled parent-FK-check locking for
the Serializable transaction.

This commmit fixes subtest fk_cascade_race_150282 by turning on these
settings for the Serializable transaction:

  • enable_implicit_fk_locking_for_serializable
  • enable_shared_locking_for_serializable
  • enable_durable_locking_for_serializable

Informs: #151663

Release note: None


optbuilder: take FOR SHARE locking during RC FK child checks

Under Read Committed isolation, we need to take FOR SHARE locking during
FK child checks to ensure that the rows read by the FK check have not
already been written to at a later timestamp. (This is the same reason
that we needed to add FOR UPDATE locking to RC FK cascades in #150291.)

Even with this locking, we still need Serializable transactions to have
the FK parent locking enabled, if both SSI and RC transactions are going
to concurrently modify rows involved in FK relationships. This is
because FOR SHARE (and FOR UPDATE) locking does not prevent phantoms.

Informs: #151663

Release note (bug fix): A bug that would allow FK violations as a result
of some combinations of concurrent Read Committed and Serializable
transactions has been fixed.

Note that if both Serializable and weaker-isolation transactions will
concurrently modify rows involved in FK relationships, the Serializable
transactions must have the following session variables set to prevent
any possible FK violations:

  • SET enable_implicit_fk_locking_for_serializable = on;
  • SET enable_shared_locking_for_serializable = on;
  • SET enable_durable_locking_for_serializable = on;

Release justification: fix for serious FK violation issue.

Release justification:

@rail rail requested a review from a team as a code owner May 8, 2026 15:29
@rail rail requested review from DrewKimball and removed request for a team May 8, 2026 15:29
@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented May 8, 2026

Thanks for opening a backport.

Before merging, please confirm that it falls into one of the following categories (select one):

  • Non-production code changes OR fixes for serious issues. Non-production includes test-only changes, build system changes, etc. Serious issues are defined in the policy as correctness, stability, or security issues, data corruption/loss, significant performance regressions, breaking working and widely used functionality, or an inability to detect and debug production issues.
  • Other approved changes. These changes must be gated behind a disabled-by-default feature flag unless there is a strong justification not to. Reference the approved ENGREQ ticket in the PR body (e.g., "Fixes ENGREQ-123").

Add a brief release justification to the PR description explaining your selection.

Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy.

All backports must be reviewed by the TL and EM for the owning area.

@blathers-crl blathers-crl Bot added backport Label PR's that are backports to older release branches T-sql-queries SQL Queries Team labels May 8, 2026
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

mw5h and others added 3 commits May 8, 2026 13:05
Previously foreign key cascades would never take locks on the rows they
modified. Under serializable isolation, this presents no difficulty as
locks are purely advisory at that isolation level. However, under
repeatable read and read committed isolations, locks are necessary to
avoid incorrect enforcement.

For example, consider this timing:

t1: R-C statement that deletes from parent starts.
t2: transaction that writes to child commits. This row references the
    parent that the previous statement is deleting, but still sees it
    because that statement hasn't done anything yet.
t3: R-C statement deletes child keys, but has a timestamp of t1, so
    doesn't see the row inserted by t2.

After the R-C transaction commits, we're left with a child table that
has a row it should not. A similar problem exists for update cascades.

Fortunately, the fix is relatively simple. By taking an exclusive lock
on the child rows before we delete or update them, we force the KV
to check write intents. KV will see the write at t2 and either bump
t1's timestamp or force it to restart.

Fixes: cockroachdb#150282
Release note (bug fix): A bug that would allow a race condition in
foreign key cascades under read committed and repeatable read isolations
has been fixed.
In cockroachdb#150291 we added subtest `fk_cascade_race_150282` which exercises FK
checks performed by concurrent Serializable and Read Committed
transactions. Unfortunately thanks to this test we've discovered that
the locking added in cockroachdb#150291 is insufficient for preventing all FK
violations. We need to turn on the disabled parent-FK-check locking for
the Serializable transaction.

This commmit fixes subtest `fk_cascade_race_150282` by turning on these
settings for the Serializable transaction:
- `enable_implicit_fk_locking_for_serializable`
- `enable_shared_locking_for_serializable`
- `enable_durable_locking_for_serializable`

Informs: cockroachdb#151663

Release note: None
Under Read Committed isolation, we need to take FOR SHARE locking during
FK child checks to ensure that the rows read by the FK check have not
already been written to at a later timestamp. (This is the same reason
that we needed to add FOR UPDATE locking to RC FK cascades in cockroachdb#150291.)

Even with this locking, we still need Serializable transactions to have
the FK parent locking enabled, if both SSI and RC transactions are going
to concurrently modify rows involved in FK relationships. This is
because FOR SHARE (and FOR UPDATE) locking does not prevent phantoms.

Informs: cockroachdb#151663

Release note (bug fix): A bug that would allow FK violations as a result
of some combinations of concurrent Read Committed and Serializable
transactions has been fixed.

Note that if both Serializable and weaker-isolation transactions will
concurrently modify rows involved in FK relationships, the Serializable
transactions must have the following session variables set to prevent
any possible FK violations:

- `SET enable_implicit_fk_locking_for_serializable = on;`
- `SET enable_shared_locking_for_serializable = on;`
- `SET enable_durable_locking_for_serializable = on;`
@rail rail force-pushed the backportstaging-v24.1.29-169926 branch from 905e352 to f0296aa Compare May 8, 2026 17:05
@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented May 8, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@rail rail merged commit de8d079 into cockroachdb:staging-v24.1.29 May 8, 2026
23 of 24 checks passed
@rail rail deleted the backportstaging-v24.1.29-169926 branch May 8, 2026 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport Label PR's that are backports to older release branches T-sql-queries SQL Queries Team target-release-24.1.29

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants