Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: explicit SELECT FOR UPDATE should be applied after predicate #75457

Closed
nvanbenschoten opened this issue Jan 24, 2022 · 9 comments
Closed
Assignees
Labels
A-read-committed Related to the introduction of Read Committed A-sql-execution Relating to SQL execution. A-sql-optimizer SQL logical planning and optimizations. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-queries SQL Queries Team
Projects

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Jan 24, 2022

Describe the problem

This is similar to #57031, but it's not the same. Currently, SELECT FOR UPDATE places locks on each key scanned by the base index scan. This means that even if some of those keys are later filtered out by a predicate which could not be pushed into the scan, they will still be locked.

This is not the correct behavior (according to PG), but was initially implemented for ease of implementation and to avoid a two-phase locking process with multiple RPCs.

To Reproduce

Setup

create table t(a int primary key, b bool);
insert into t values (1, false), (2, true);

Example

--tx1                                          --tx2
begin transaction;
                                               begin transaction;
select * from t  
where b=true for update;
                                               select * from 
                                               t where b=false for update; -- BLOCKS!

Expected behavior

txn2's SELECT query should not block, because txn1 should have only acquire a lock on the row that it returned ((2, true)).

Jira issue: CRDB-12682

Epic CRDB-25322

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-execution Relating to SQL execution. T-sql-queries SQL Queries Team labels Jan 24, 2022
@nvanbenschoten nvanbenschoten added this to Triage in SQL Queries via automation Jan 24, 2022
@yuzefovich
Copy link
Member

@nvanbenschoten what's the urgency on this?

@rytaft
Copy link
Collaborator

rytaft commented Feb 15, 2022

Moving this to the backlog for now, @nvanbenschoten please let us know if there is urgency on this. Thanks!

@nvanbenschoten
Copy link
Member Author

This is also the kind of issue that won't be urgent until it suddenly is, so it's tough to say. The effect of this is that an application or tool built for PG that uses SELECT FOR UPDATE could see unexpected contention/transaction deadlocks when run on CockroachDB, because CRDB grabs too many row-level locks in some cases due to this issue. The only mitigations right now would be to remove the FOR UPDATE clause or to add secondary indexes to ensure that all predicates are pushed into scans, but this second options runs into issues with #57031.

@rytaft
Copy link
Collaborator

rytaft commented Feb 15, 2022

Ok makes sense -- moving this to a higher priority bucket -- we'll try to get to this during stability if not before

@rytaft rytaft moved this from Backlog to 22.1 Medium Likelihood (60%) in SQL Queries Feb 15, 2022
@nvanbenschoten
Copy link
Member Author

I suspect that this is too large for stability as it's not trivial, so we may just want to consider it for the v22.2 release cycle.

@rytaft
Copy link
Collaborator

rytaft commented Feb 15, 2022

sounds good -- cc @vy-ton for awareness

@michae2
Copy link
Collaborator

michae2 commented Jan 19, 2023

One of the things we'll have to address when fixing this: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/exec/execbuilder/mutation.go#L1002-L1008

@michae2 michae2 moved this from Backlog (DO NOT ADD NEW ISSUES) to 23.2 Release in SQL Queries Mar 27, 2023
@michae2 michae2 added the A-sql-optimizer SQL logical planning and optimizations. label Mar 27, 2023
@nvanbenschoten nvanbenschoten added the A-read-committed Related to the introduction of Read Committed label Mar 30, 2023
michae2 added a commit to michae2/cockroach that referenced this issue May 22, 2023
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation
levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ
COMMITTED`). We will allow them again when the following issues are
fixed:

- cockroachdb#57031
- cockroachdb#75457
- cockroachdb#100193
- cockroachdb#100194

Fixes: cockroachdb#100144

Release note: None
michae2 added a commit to michae2/cockroach that referenced this issue May 23, 2023
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation
levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ
COMMITTED`). We will allow them again when the following issues are
fixed:

- cockroachdb#57031
- cockroachdb#75457
- cockroachdb#100193
- cockroachdb#100194

Fixes: cockroachdb#100144

Release note: None
michae2 added a commit to michae2/cockroach that referenced this issue May 25, 2023
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation
levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ
COMMITTED`). We will allow them again when the following issues are
fixed:

- cockroachdb#57031
- cockroachdb#75457
- cockroachdb#100193
- cockroachdb#100194

Fixes: cockroachdb#100144

Release note: None
craig bot pushed a commit that referenced this issue May 25, 2023
103734: opt: disallow SELECT FOR UPDATE under weak isolation levels r=nvanbenschoten,mgartner a=michae2

**querycache: remove unused field from CachedData**

Remove the `IsCorrelated` flag from `querycache.CachedData`, which is no
longer used.

Release note: None

---

**sql/opt: add locking durability**

In addition to strength and wait policy, we now add a third property to
locks: durability. Locks with `LockDurabilityGuaranteed` are guaranteed
to be held until commit (if the transaction commits). Durable locks must
be used when correctness depends on locking. This is never the case
under our `SERIALIZABLE` isolation, but under `SNAPSHOT` and `READ
COMMITTED` isolation it will be the case for `SELECT FOR UPDATE`
statements, which will be the first users of durable locks.

This commit adds locking durability to the optimizer and `EXPLAIN`
output, but does not plumb it into KV yet. It will be used in the next
commit to temporarily disallow `SELECT FOR UPDATE` statements under
`SNAPSHOT` and `READ COMMITTED` isolation.

Release note: None

---

**opt: disallow SELECT FOR UPDATE under weak isolation levels**

Temporarily disallow `SELECT FOR UPDATE` statements under all isolation
levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ
COMMITTED`). We will allow them again when the following issues are
fixed:

- #57031
- #75457
- #100193
- #100194

Fixes: #100144

Release note: None

Co-authored-by: Michael Erickson <michae2@cockroachlabs.com>
michae2 added a commit to michae2/cockroach that referenced this issue Sep 29, 2023
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of
the data sources in a query's FROM list, depending on whether they have
targets (FOR UPDATE OF x). Without targets, they always apply within
subqueries in the FROM list. With targets, they apply within subqueries
if the subquery alias matches the target.

Because of this scope-like nature of FOR UPDATE and FOR SHARE, we
implement semantic analysis using a stack of locking items that grow as
we build inner subqueries deeper in the recursive optbuilder calls.

Prior to this change, we only used the stack of locking items during
buildScan, at the very bottom of the recursion. Because of this, calls
to `lockingSpec.filter` could afford to compress the stack into a single
locking item on our way deeper in the recursion.

As part of the upcoming fix for cockroachdb#75457, however, we will need to build a
new Lock operator when popping off locking items after returning from
the recursion. That Lock operator will need some information gathered
from buildScan at the bottom of the recursion.

To support this, we refactor the stack of locking items to be two
stacks: one that tracks all locking items in scope, and a second that
tracks which locking items currently apply. This will allow buildScan to
associate table information with the correct locking item(s), which can
then be used to build Lock operators when popping the locking items.

As a bonus, by using only the applied locking item stack in
`validateLockingInFrom` we can make validation of SELECT FOR UPDATE
queries a little more precise, which allows some queries we were
incorrectly disallowing.

Informs: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Allow FOR UPDATE on some queries that were
previously disallowed. Queries that use the following operations are now
allowed to have FOR UPDATE OF as long as the prohibited operation is in
a subquery not locked by the FOR UPDATE OF:

- UNION
- INTERSECT
- EXCEPT
- DISTINCT
- GROUP BY
- HAVING
- aggregations
- window functions

For example, the following query is now allowed because the subquery
using the prohibited operations is not affected by the FOR UPDATE OF:

```
SELECT * FROM t,
  (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u
  FOR UPDATE OF t;
```

This matches PostgreSQL.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 3, 2023
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of
the data sources in a query's FROM list, depending on whether they have
targets (FOR UPDATE OF x). Without targets, they always apply within
subqueries in the FROM list. With targets, they apply within subqueries
if the subquery alias matches the target.

Because of this scope-like nature of FOR UPDATE and FOR SHARE, we
implement semantic analysis using a stack of locking items that grow as
we build inner subqueries deeper in the recursive optbuilder calls.

Prior to this change, we only used the stack of locking items during
buildScan, at the very bottom of the recursion. Because of this, calls
to `lockingSpec.filter` could afford to compress the stack into a single
locking item on our way deeper in the recursion.

As part of the upcoming fix for cockroachdb#75457, however, we will need to build a
new Lock operator when popping off locking items after returning from
the recursion. That Lock operator will need some information gathered
from buildScan at the bottom of the recursion.

To support this, we refactor the stack of locking items to be two
stacks: one that tracks all locking items in scope, and a second that
tracks which locking items currently apply. This will allow buildScan to
associate table information with the correct locking item(s), which can
then be used to build Lock operators when popping the locking items.

As a bonus, by using only the applied locking item stack in
`validateLockingInFrom` we can make validation of SELECT FOR UPDATE
queries a little more precise, which allows some queries we were
incorrectly disallowing.

Informs: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Allow FOR UPDATE on some queries that were
previously disallowed. Queries that use the following operations are now
allowed to have FOR UPDATE OF as long as the prohibited operation is in
a subquery not locked by the FOR UPDATE OF:

- UNION
- INTERSECT
- EXCEPT
- DISTINCT
- GROUP BY
- HAVING
- aggregations
- window functions

For example, the following query is now allowed because the subquery
using the prohibited operations is not affected by the FOR UPDATE OF:

```
SELECT * FROM t,
  (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u
  FOR UPDATE OF t;
```

This matches PostgreSQL.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 6, 2023
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of
the data sources in a query's FROM list, depending on whether they have
targets (FOR UPDATE OF x). Without targets, they always apply within
subqueries in the FROM list. With targets, they apply within subqueries
if the subquery alias matches the target.

Because of this scope-like nature of FOR UPDATE and FOR SHARE, we
implement semantic analysis using a stack of locking items that grow as
we build inner subqueries deeper in the recursive optbuilder calls.

Prior to this change, we only used the stack of locking items during
buildScan, at the very bottom of the recursion. Because of this, calls
to `lockingSpec.filter` could afford to compress the stack into a single
locking item on our way deeper in the recursion.

As part of the upcoming fix for cockroachdb#75457, however, we will need to build a
new Lock operator when popping off locking items after returning from
the recursion. That Lock operator will need some information gathered
from buildScan at the bottom of the recursion.

To support this, we refactor the stack of locking items to be two
stacks: one that tracks all locking items in scope, and a second that
tracks which locking items currently apply. This will allow buildScan to
associate table information with the correct locking item(s), which can
then be used to build Lock operators when popping the locking items.

As a bonus, by using only the applied locking item stack in
`validateLockingInFrom` we can make validation of SELECT FOR UPDATE
queries a little more precise, which allows some queries we were
incorrectly disallowing.

Informs: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Allow FOR UPDATE on some queries that were
previously disallowed. Queries that use the following operations are now
allowed to have FOR UPDATE OF as long as the prohibited operation is in
a subquery not locked by the FOR UPDATE OF:

- UNION
- INTERSECT
- EXCEPT
- DISTINCT
- GROUP BY
- HAVING
- aggregations
- window functions

For example, the following query is now allowed because the subquery
using the prohibited operations is not affected by the FOR UPDATE OF:

```
SELECT * FROM t,
  (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u
  FOR UPDATE OF t;
```

This matches PostgreSQL.
craig bot pushed a commit that referenced this issue Oct 6, 2023
111258: optbuilder: refactor semantic analysis of FOR UPDATE r=DrewKimball,msirek a=michae2

**optbuilder: a few minor tweaks to building of FOR UPDATE**

Make a few minor tweaks to semantic analysis of FOR UPDATE locking
clauses.

1. Disallow multiple FOR UPDATE clauses on parenthesized queries. We do
   not currently handle scopes of parenthesized queries correctly.
   Because of this, we disallow multiple ORDER BY and LIMIT clauses on
   parenthesized queries. The previous implementation of FOR UPDATE was
   simple enough that we could work around this, but the upcoming
   changes make it impossible to support.

2. Allow FOR UPDATE on statements with VALUES in the FROM list (but
   continue to disallow FOR UPDATE on VALUES directly). This matches
   Postgres.

Release note (sql change): Two minor changes to FOR UPDATE clauses:

1. Multiple FOR UPDATE clauses on fully-parenthesized queries are now
disallowed. For example, the following statements are now disallowed:

```
(SELECT 1 FOR UPDATE) FOR UPDATE;
SELECT * FROM ((SELECT 1 FOR UPDATE) FOR UPDATE) AS x;
```

whereas statements like the following are still allowed:

```
SELECT * FROM (SELECT 1 FOR UPDATE) AS x FOR UPDATE;
SELECT (SELECT 1 FOR UPDATE) FOR UPDATE;
```

This does not match PostgreSQL, which allows all of these, but does
match our behavior for ORDER BY and LIMIT.

2. FOR UPDATE is now allowed on statements with VALUES in the FROM list
or as a subquery. For example, the following statements are now allowed:

```
SELECT (VALUES (1)) FOR UPDATE;
SELECT * FROM (VALUES (1)) AS x FOR UPDATE;
```

Using FOR UPDATE directly on VALUES is still disallowed:

```
VALUES (1) FOR UPDATE;
(VALUES (1)) FOR UPDATE;
INSERT INTO t VALUES (1) FOR UPDATE;
```

This matches PostgreSQL.

**optbuilder: refactor semantic analysis of FOR UPDATE**

Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of
the data sources in a query's FROM list, depending on whether they have
targets (FOR UPDATE OF x). Without targets, they always apply within
subqueries in the FROM list. With targets, they apply within subqueries
if the subquery alias matches the target.

Because of this scope-like nature of FOR UPDATE and FOR SHARE, we
implement semantic analysis using a stack of locking items that grow as
we build inner subqueries deeper in the recursive optbuilder calls.

Prior to this change, we only used the stack of locking items during
buildScan, at the very bottom of the recursion. Because of this, calls
to `lockingSpec.filter` could afford to compress the stack into a single
locking item on our way deeper in the recursion.

As part of the upcoming fix for #75457, however, we will need to build a
new Lock operator when popping off locking items after returning from
the recursion. That Lock operator will need some information gathered
from buildScan at the bottom of the recursion.

To support this, we refactor the stack of locking items to be two
stacks: one that tracks all locking items in scope, and a second that
tracks which locking items currently apply. This will allow buildScan to
associate table information with the correct locking item(s), which can
then be used to build Lock operators when popping the locking items.

As a bonus, by using only the applied locking item stack in
`validateLockingInFrom` we can make validation of SELECT FOR UPDATE
queries a little more precise, which allows some queries we were
incorrectly disallowing.

Informs: #57031, #75457

Epic: CRDB-25322

Release note (sql change): Allow FOR UPDATE on some queries that were
previously disallowed. Queries that use the following operations are now
allowed to have FOR UPDATE OF as long as the prohibited operation is in
a subquery not locked by the FOR UPDATE OF:

- UNION
- INTERSECT
- EXCEPT
- DISTINCT
- GROUP BY
- HAVING
- aggregations
- window functions

For example, the following query is now allowed because the subquery
using the prohibited operations is not affected by the FOR UPDATE OF:

```
SELECT * FROM t,
  (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u
  FOR UPDATE OF t;
```

This matches PostgreSQL.

Co-authored-by: Michael Erickson <michae2@cockroachlabs.com>
michae2 added a commit to michae2/cockroach that referenced this issue Oct 9, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 10, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 10, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 10, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 10, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
craig bot pushed a commit that referenced this issue Oct 10, 2023
111662: opt: add lock operator r=DrewKimball a=michae2

**execbuilder: change error message to match PostgreSQL**

Release note (sql change): Change the following error message:

```
statement error cannot execute FOR UPDATE in a read-only transaction
```

to this to match PostgreSQL:

```
statement error cannot execute SELECT FOR UPDATE in a read-only transaction
```

**opt: add lock operator**

Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: #57031, #75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.

111934: sql: cache role existence checks to fix perf regression r=rafiss a=rafiss

In 12ec26f, we started to check for the existence of a role whenever a privilege for the public role was checked. This can hapen multiple times during some pg_catalog queries, so it introduced a performance regression.

Now, the role existence is cached so we avoid the penalty. The existence of a role is cached for the duration of the entire transaction, and also gets inherited by any internal executor used to implement a command run by the user's transaction.

No release note since this is new in 23.2.

informs #20718
fixes #112102
Release note: None

Co-authored-by: Michael Erickson <michae2@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
@msirek
Copy link
Contributor

msirek commented Oct 10, 2023

Fixed by #111662

@msirek msirek closed this as completed Oct 10, 2023
blathers-crl bot pushed a commit that referenced this issue Oct 10, 2023
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: #57031, #75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
@michae2
Copy link
Collaborator

michae2 commented Nov 20, 2023

This is fixed in 23.2 when using the new SFU implementation (optimizer_use_lock_op_for_serializable).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-read-committed Related to the introduction of Read Committed A-sql-execution Relating to SQL execution. A-sql-optimizer SQL logical planning and optimizations. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-queries SQL Queries Team
Projects
Archived in project
SQL Queries
23.2 Release
Development

No branches or pull requests

6 participants