Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-21.2: sql: disallow multiple modification subqueries of same table #71595

Merged
merged 1 commit into from
Nov 23, 2021

Conversation

blathers-crl[bot]
Copy link

@blathers-crl blathers-crl bot commented Oct 15, 2021

Backport 1/1 commits from #70976 on behalf of @michae2.

/cc @cockroachdb/release

Release justification: fix for high-severity bug in existing functionality.


Addresses: #70731

Disallow multiple INSERT ON CONFLICT, UPSERT, UPDATE, or DELETE
subqueries on the same table until we are better able to detect multiple
modifications of the same row by a single statement. (Note that multiple
INSERTs without ON CONFLICT do not suffer from this issue and are always
allowed.)

Release note (sql change): Statements containing multiple INSERT ON
CONFLICT, UPSERT, UPDATE, or DELETE subqueries can cause data corruption
if they modify the same row multiple times. For example, the following
SELECT 1 statement will cause corruption of table t:

CREATE TABLE t (i INT, j INT, PRIMARY KEY (i), INDEX (j));
INSERT INTO t VALUES (0, 0);
WITH
  cte1 AS (UPDATE t SET j = 1 WHERE i = 0 RETURNING *),
  cte2 AS (UPDATE t SET j = 2 WHERE i = 0 RETURNING *)
SELECT 1;

Until this is fixed, this change disallows statements with multiple
subqueries which modify the same table. Applications can work around
this by rewriting problematic statements. For example, the query above
can be rewritten as an explicit multi-statement transaction:

BEGIN;
UPDATE t SET j = 1 WHERE i = 0;
UPDATE t SET j = 2 WHERE i = 0;
SELECT 1;
COMMIT;

or, if it doesn't matter which update "wins", as multiple non-mutating
CTEs on an UPDATE statement:

WITH
  cte1 AS (SELECT 1),
  cte2 AS (SELECT 2)
UPDATE t SET j = x.j
FROM (SELECT * FROM cte1 UNION ALL SELECT * FROM cte2) AS x (j)
WHERE i = 0
RETURNING 1;

which in this case could be written more simply as:

UPDATE t SET j = x.j FROM (VALUES (1), (2)) AS x (j) WHERE i = 0 RETURNING 1;

(Note that in these last two rewrites the first update will win, rather
than the last.) None of these rewrites suffer from the corruption problem.

To override this change and allow these statements in spite of the risk of
corruption, applications can:

SET CLUSTER SETTING sql.multiple_modifications_of_table.enabled = true

But be warned that with this enabled there is nothing to prevent this
type of corruption from occuring if the same row is modified multiple
times by a single statment.

To check for corruption, use the EXPERIMENTAL SCRUB command:

EXPERIMENTAL SCRUB TABLE t WITH OPTIONS INDEX ALL;

@blathers-crl blathers-crl bot requested a review from a team as a code owner October 15, 2021 00:50
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-21.2-70976 branch from d6e535d to 14bcb65 Compare October 15, 2021 00:50
@blathers-crl
Copy link
Author

blathers-crl bot commented Oct 15, 2021

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@michae2 michae2 force-pushed the blathers/backport-release-21.2-70976 branch from 14bcb65 to a5b5eee Compare October 15, 2021 05:03
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: (but wait for 21.2.1)

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @michae2)

Addresses: #70731

Disallow multiple INSERT ON CONFLICT, UPSERT, UPDATE, or DELETE
subqueries on the same table until we are better able to detect multiple
modifications of the same row by a single statement. (Note that multiple
INSERTs without ON CONFLICT do not suffer from this issue and are always
allowed.)

Release note (sql change): Statements containing multiple INSERT ON
CONFLICT, UPSERT, UPDATE, or DELETE subqueries can cause data corruption
if they modify the same row multiple times. For example, the following
SELECT 1 statement will cause corruption of table t:

```sql
CREATE TABLE t (i INT, j INT, PRIMARY KEY (i), INDEX (j));
INSERT INTO t VALUES (0, 0);
WITH
  cte1 AS (UPDATE t SET j = 1 WHERE i = 0 RETURNING *),
  cte2 AS (UPDATE t SET j = 2 WHERE i = 0 RETURNING *)
SELECT 1;
```

Until this is fixed, this change disallows statements with multiple
subqueries which modify the same table. Applications can work around
this by rewriting problematic statements. For example, the query above
can be rewritten as an explicit multi-statement transaction:

```sql
BEGIN;
UPDATE t SET j = 1 WHERE i = 0;
UPDATE t SET j = 2 WHERE i = 0;
SELECT 1;
COMMIT;
```

or, if it doesn't matter which update "wins", as multiple non-mutating
CTEs on an UPDATE statement:

```sql
WITH
  cte1 AS (SELECT 1),
  cte2 AS (SELECT 2)
UPDATE t SET j = x.j
FROM (SELECT * FROM cte1 UNION ALL SELECT * FROM cte2) AS x (j)
WHERE i = 0
RETURNING 1;
```

which in this case could be written more simply as:

```sql
UPDATE t SET j = x.j FROM (VALUES (1), (2)) AS x (j) WHERE i = 0 RETURNING 1;
```

(Note that in these last two rewrites the first update will win, rather
than the last.) None of these rewrites suffer from the corruption problem.

To override this change and allow these statements in spite of the risk of
corruption, applications can:

```sql
SET CLUSTER SETTING sql.multiple_modifications_of_table.enabled = true
```

But be warned that with this enabled there is nothing to prevent this
type of corruption from occuring if the same row is modified multiple
times by a single statment.

To check for corruption, use the EXPERIMENTAL SCRUB command:

```sql
EXPERIMENTAL SCRUB TABLE t WITH OPTIONS INDEX ALL;
```
@rytaft rytaft force-pushed the blathers/backport-release-21.2-70976 branch from a5b5eee to bbc99af Compare November 23, 2021 22:08
@rytaft rytaft merged commit afa06ad into release-21.2 Nov 23, 2021
@rytaft rytaft deleted the blathers/backport-release-21.2-70976 branch November 23, 2021 23:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants