sql, kv: support non-default key locking with leaf txns #94290

Open
Nican opened this issue Dec 25, 2022 · 18 comments
Labels
branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-kv KV Team T-sql-queries SQL Queries Team X-blathers-triaged blathers was able to find an owner

Comments

Nican commented Dec 25, 2022

With #94399 we no longer choose to use the leaf txn (which would allow us to use the streamer API and parallelize local scans in some cases) if we find non-default key locking strength. This stems from the fact that we don't propagate lock spans for leaf txns, and even if we did, we'd run into another complication (quoting Nathan):

The other complication that we would have run into with this approach is that only the root txn coordinator runs a heartbeat loop to keep its locks alive. If we only acquired locks in leaf txns, we would never have started the txn heartbeat and the txn could have been aborted after 5s. We could have launched the heartbeat when merging the LeafTxnFinalState, but that still only works if we call UpdateRootWithLeafFinalState within 5s of the initial lock being acquired.

I think this again hints at our implementation of explicit row-level locking and pushing that locking directly into scans (in all cases) being a little funky (see #75457 and #57031). If we had a separate RowLock execution node that lived above a scan/filter/limit/etc. then we could run that using the RootTxn and distribute the rest of the query as much as we'd like. We're going to have to do part of that work for read-committed.

We should figure out how to lift this restriction.
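
To make the 5s window quoted above concrete, here is a minimal, self-contained sketch. The types, names, and durations are illustrative assumptions, not CockroachDB internals:

package main

import (
	"fmt"
	"time"
)

// abortWindow is an assumed liveness window: without a heartbeat, the txn
// record (and hence its locks) is considered abandoned after this long.
const abortWindow = 5 * time.Second

// lockSurvives reports whether a lock acquired only through a leaf txn at
// lockAcquired is still live when the root merges the leaf's final state
// at mergeTime (i.e. UpdateRootWithLeafFinalState is called in time).
func lockSurvives(lockAcquired, mergeTime time.Time) bool {
	return mergeTime.Sub(lockAcquired) < abortWindow
}

func main() {
	t0 := time.Now()
	fmt.Println(lockSurvives(t0, t0.Add(2*time.Second))) // true: merged within the window
	fmt.Println(lockSurvives(t0, t0.Add(6*time.Second))) // false: the txn could already have been aborted
}

The point is only the timing relationship: a lock acquired solely by a leaf txn stays alive only if the root (which runs the heartbeat loop) learns about it before the liveness window expires.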

Original issue description

Edit from @jordanlewis: See #94290 (comment) for a trivial repro

To work around this problem in 22.2.0 and 22.2.1, run the following

SET CLUSTER SETTING sql.distsql.use_streamer.enabled = false;

Describe the problem

It looks like v22.2.1 has a regression with FOR UPDATE: a lock is being held on the row for 5 seconds after the transaction finishes. Several of my e2e tests are now failing due to timeouts. I drilled into one of the tests.

Running EXPLAIN ANALYZE on the SELECT query after the transaction finishes shows cumulative time spent in KV: 4.9s and cumulative time spent due to contention: 4.9s. A debug zip from EXPLAIN ANALYZE (DEBUG) of the query is also attached.

Removing FOR UPDATE fixes the issue, as does reverting to v22.1.8.

To Reproduce

The test in my codebase fails pretty consistently. From what I can tell, this is the sequence of events (a code sketch of the sequence follows the list):

  1. Transaction starts.
  2. Transaction runs select * from tbl where id = 1 for update
  3. The row in tbl is NOT updated inside of the transaction, but other data in unrelated tables is updated/inserted.
  4. Transaction completes successfully.
  5. (only run after transaction is complete) select * from tbl now takes 5 seconds to run.
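
A minimal Go sketch of the sequence above, for reproducing it from application code. The driver, connection string, and tbl schema are assumptions; on affected versions the final SELECT stalls for roughly 5 seconds:

package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/lib/pq" // assumed Postgres-wire driver; any pgwire driver should behave the same
)

func main() {
	// Connection string is an assumption; point it at the affected cluster.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Steps 1-4: lock the row with FOR UPDATE, update nothing in tbl, commit.
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := tx.Exec(`SELECT * FROM tbl WHERE id = 1 FOR UPDATE`); err != nil {
		log.Fatal(err)
	}
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}

	// Step 5: a plain read after the commit; on affected versions this
	// stalls for ~5 seconds waiting on the stale lock.
	start := time.Now()
	rows, err := db.Query(`SELECT * FROM tbl`)
	if err != nil {
		log.Fatal(err)
	}
	rows.Close()
	fmt.Printf("post-commit SELECT took %s\n", time.Since(start))
}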

Expected behavior
Running SELECT * FROM tbl; when there is no other workload on the database should take a few milliseconds.

Additional data / screenshots
stmt-bundle-825387661464305665.zip

Environment:

  • CockroachDB version v22.2.1
  • Server OS: Linux on docker
  • Client app: Sequelize

Additional context
E2E tests (including ones that click buttons in the UI) are failing due to timeouts.

Jira issue: CRDB-22800

@Nican Nican added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Dec 25, 2022

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Dec 25, 2022
@Nican Nican changed the title FOR UPDATE unexpected holds a lock for longer than expected FOR UPDATE inexpertly holds a lock for longer than expected Dec 25, 2022
@Nican Nican changed the title FOR UPDATE inexpertly holds a lock for longer than expected FOR UPDATE unexpectedly holds a lock for longer than expected Dec 25, 2022
@rafiss rafiss added this to Incoming in KV via automation Dec 26, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Dec 26, 2022

rafiss commented Dec 26, 2022

Thanks for the report! Have you tried with v22.2.0?


Nican commented Dec 26, 2022

@rafiss Thanks for the response. I should have tried that.

I tried v22.2.0, v22.2.0-rc.1, v22.2.0-beta.1, and v22.2.0-alpha.1, and they are all broken and display the 5s contention behavior.

v22.1.12 works as expected.


Nican commented Dec 28, 2022

Hello @rafiss,

I was able to create a minimal repro of the issue. Note that each set of commands needs to be run on a separate connection. This is running v22.2.1.

Create table:

-- drop table if exists dealers cascade;
CREATE TABLE dealers (
	"organizationId" UUID NOT NULL,
	"conventionId" INT8 NOT NULL,
	id INT8 NOT NULL,
	"userId" INT8 NULL,
	CONSTRAINT dealers_pkey PRIMARY KEY ("organizationId" ASC, id ASC),
	UNIQUE INDEX "dealers_userId_conventionId_key" ("userId" ASC, "conventionId" ASC),
	UNIQUE INDEX "dealers_conventionId_key" ("conventionId" ASC, id ASC)
);

Connection 1 (10ms):

insert into dealers values ('00000000-0000-0000-0000-000000000000', 5427958, 1, 1);

Connection 2 (Takes 5100ms to run):

START TRANSACTION;
	select * from dealers where "conventionId" = 5427958 and id = 1 for update;
COMMIT;
select * from dealers; 

Oddly, removing the dealers_userId_conventionId_key index from the table fixes the issue. Removing FOR UPDATE also fixes the issue.

Please let me know if you are able to repro the issue.

@jordanlewis
Member

I can't repro this on 22.2.1. What is the exact order/interleaving of the commands you're running on each connection?

@jordanlewis
Member

OK, I could reproduce it. The two-connection setup is a red herring; you can repro on one connection like this:

root@127.0.0.1:26257/defaultdb> create table a (a int, b int, c int, primary key(a), unique index(b));
root@127.0.0.1:26257/defaultdb> insert into a values(1,2,3);
root@127.0.0.1:26257/defaultdb> select * from a where b = 2 for update;
  a | b | c
----+---+----
  1 | 2 | 3
(1 row)

Time: 3ms total (execution 3ms / network 0ms)

root@127.0.0.1:26257/defaultdb> select * from a where b = 2 for update;
  a | b | c
----+---+----
  1 | 2 | 3
(1 row)

Time: 4.542s total (execution 4.541s / network 0.001s)

Once you have the select for update in the shell, just hit up/enter quickly and you'll see the slow query.

I think this has to do with the FOR UPDATE locking strength on the induced index join somehow; the issue doesn't repro unless the plan includes an index join.

EXPLAIN ANALYZE output on the slow iteration looks like this:

root@127.0.0.1:26257/defaultdb> explain analyze select * from a where b = 2 for update;
                                        info
------------------------------------------------------------------------------------
  planning time: 614µs
  execution time: 4.5s
  distribution: local
  vectorized: true
  rows read from KV: 2 (23 B, 2 gRPC calls)
  cumulative time spent in KV: 4.5s
  cumulative time spent due to contention: 4.5s
  maximum memory usage: 60 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • index join
  │ nodes: n1
  │ regions: us-east1
  │ actual row count: 1
  │ KV time: 1ms
  │ KV contention time: 0µs
  │ KV rows read: 1
  │ KV bytes read: 13 B
  │ KV gRPC calls: 1
  │ estimated max memory allocated: 30 KiB
  │ estimated max sql temp disk usage: 0 B
  │ estimated row count: 1
  │ table: a@a_pkey
  │ locking strength: for update
  │
  └── • scan
        nodes: n1
        regions: us-east1
        actual row count: 1
        KV time: 4.5s
        KV contention time: 4.5s
        KV rows read: 1
        KV bytes read: 10 B
        KV gRPC calls: 1
        estimated max memory allocated: 20 KiB
        estimated row count: 1 (100% of the table; stats collected 47 seconds ago)
        table: a@a_b_key
        spans: [/2 - /2]
        locking strength: for update
(40 rows)

Note the 4.5 second "contention time".

@jordanlewis jordanlewis added the S-1 High impact: many users impacted, serious risk of high unavailability or data loss label Dec 28, 2022
@jordanlewis jordanlewis changed the title FOR UPDATE unexpectedly holds a lock for longer than expected kv: FOR UPDATE unexpectedly holds a lock for longer than expected Dec 28, 2022
@jordanlewis
Member

Bisected to b72c109 (#77878), which unfortunately doesn't tell us much. Is the streamer using APIs incorrectly, or is there a bug in the API that the streamer is using?

cc @yuzefovich

Bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [3b76f78d724dfc1e7bc8d697f5a7de960d8d1e98] Merge pull request #93143 from adityamaru/show-backup-speedup
git bisect good 3b76f78d724dfc1e7bc8d697f5a7de960d8d1e98
# status: waiting for bad commit, 1 good commit known
# bad: [77667a1b0101cd323090011f50cf910aaa933654] Merge pull request #91926 from cockroachdb/blathers/backport-release-22.2.0-91911
git bisect bad 77667a1b0101cd323090011f50cf910aaa933654
# good: [a1c1879e01ceee79a81693c67a1dba184b5fc1b1] Merge #77561 #77775
git bisect good a1c1879e01ceee79a81693c67a1dba184b5fc1b1
# bad: [8edc7ba73079eee467a8b9348260e01c703ac96c] Merge #83941
git bisect bad 8edc7ba73079eee467a8b9348260e01c703ac96c
# bad: [ce7b1972441f8eaf5aa7772c56250b2d83806cab] Merge #80706
git bisect bad ce7b1972441f8eaf5aa7772c56250b2d83806cab
# bad: [94fe8ddfaec046ac3a769506d65e98ce8d674048] Merge #79038
git bisect bad 94fe8ddfaec046ac3a769506d65e98ce8d674048
# bad: [7d941bbf95172d423c3fbfb23d554471e113c639] Merge #77785 #78266 #78268 #78470
git bisect bad 7d941bbf95172d423c3fbfb23d554471e113c639
# bad: [5bd50f5659a3ca88e8ad378be40c2f5a10496ec8] Merge #78276
git bisect bad 5bd50f5659a3ca88e8ad378be40c2f5a10496ec8
# bad: [0b6d5024573cd39b7dcf8c93cc9dce5eaa38312d] Merge #77993
git bisect bad 0b6d5024573cd39b7dcf8c93cc9dce5eaa38312d
# good: [8179b6f996149ef9e04eb62f9c95ecc992b9f30e] Merge #77853
git bisect good 8179b6f996149ef9e04eb62f9c95ecc992b9f30e
# bad: [d063fcf94d83ed330b548c4fd3e0c4db6a20ce10] opt: add missing documentation for opt tester options
git bisect bad d063fcf94d83ed330b548c4fd3e0c4db6a20ce10
# good: [6c1e2c29ddcf4ac190eb91f82ca530392ef5078e] roachprod: update thrift artifact url for use with charybdefs
git bisect good 6c1e2c29ddcf4ac190eb91f82ca530392ef5078e
# good: [f43648aeea968840c3ea9932eb8e3e13f45140c5] sql: use IndexFetchSpec for inverted joiner
git bisect good f43648aeea968840c3ea9932eb8e3e13f45140c5
# bad: [1a35e55e854ddb90632ac581b10a593e6e4f07d9] Merge #77875 #77878 #77968 #77995 #78013 #78023
git bisect bad 1a35e55e854ddb90632ac581b10a593e6e4f07d9
# good: [242118f0ad421dc949108165c61967e4ff697d82] roachtest: gracefully fail in sst-corruption test
git bisect good 242118f0ad421dc949108165c61967e4ff697d82
# bad: [b72c10955f5fab3f4d069568169d09a1934cbe15] kvstreamer: re-enable streamer by default
git bisect bad b72c10955f5fab3f4d069568169d09a1934cbe15
# first bad commit: [b72c10955f5fab3f4d069568169d09a1934cbe15] kvstreamer: re-enable streamer by default
commit b72c10955f5fab3f4d069568169d09a1934cbe15
Author: Yahor Yuzefovich <yahor@cockroachlabs.com>
Date:   Tue Mar 15 19:53:57 2022 -0700

    kvstreamer: re-enable streamer by default

    Release note: None

 pkg/sql/kvstreamer/large_keys_test.go |  3 ---
 pkg/sql/kvstreamer/streamer_test.go   | 12 ++----------
 pkg/sql/mem_limit_test.go             |  7 -------
 pkg/sql/row/kv_batch_streamer.go      |  2 +-
 4 files changed, 3 insertions(+), 21 deletions(-)

@jordanlewis
Member

Sure enough, disabling the streamer cluster setting removes the problem:

demo@127.0.0.1:26257/defaultdb> SET CLUSTER SETTING sql.distsql.use_streamer.enabled = false;
demo@127.0.0.1:26257/defaultdb> select * from a where b = 2 for update;

  a | b | c
----+---+----
  1 | 2 | 3
(1 row)


Time: 3ms total (execution 3ms / network 0ms)

demo@127.0.0.1:26257/defaultdb> select * from a where b = 2 for update;

  a | b | c
----+---+----
  1 | 2 | 3
(1 row)


Time: 4ms total (execution 4ms / network 0ms)

EXPLAIN ANALYZE:

demo@127.0.0.1:26257/defaultdb> explain analyze select * from a where b = 2 for update;

                                       info
----------------------------------------------------------------------------------
  planning time: 251µs
  execution time: 1ms
  distribution: local
  vectorized: true
  rows read from KV: 2 (27 B, 2 gRPC calls)
  cumulative time spent in KV: 1ms
  maximum memory usage: 50 KiB
  network usage: 0 B (0 messages)
  estimated RUs consumed: 0

  • index join
  │ nodes: n1
  │ actual row count: 1
  │ KV time: 391µs
  │ KV contention time: 0µs
  │ KV rows read: 1
  │ KV bytes read: 15 B
  │ KV gRPC calls: 1
  │ estimated max memory allocated: 20 KiB
  │ estimated max sql temp disk usage: 0 B
  │ estimated row count: 1
  │ table: a@a_pkey
  │ locking strength: for update
  │
  └── • scan
        nodes: n1
        actual row count: 1
        KV time: 738µs
        KV contention time: 0µs
        KV rows read: 1
        KV bytes read: 12 B
        KV gRPC calls: 1
        estimated max memory allocated: 20 KiB
        estimated row count: 1 (100% of the table; stats collected 1 minute ago)
        table: a@a_b_key
        spans: [/2 - /2]
        locking strength: for update
(37 rows)


Time: 3ms total (execution 3ms / network 0ms)

@jordanlewis jordanlewis added this to Triage in SQL Queries via automation Dec 28, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Dec 28, 2022
@jordanlewis jordanlewis changed the title kv: FOR UPDATE unexpectedly holds a lock for longer than expected kvstreamer: FOR UPDATE unexpectedly holds a lock for longer than expected Dec 28, 2022
@yuzefovich
Member

I think the difference here is that the streamer is using LeafTxn (since there might be concurrency going on) whereas the non-streamer code path uses RootTxn. (I confirmed this by hacking the code for the streamer to also use the root, and the problem disappeared.)

It seems that locks acquired via the leaf txn are only released about 4.5s after the txn that acquired them was committed. Maybe this was already mentioned elsewhere, but the problematic stmt is not the second statement that blocks (which can actually use either the streamer or the non-streamer code path) but the first stmt, which used the streamer.

I think we need KV expertise here. Perhaps @nvanbenschoten can share some wisdom.

@yuzefovich
Member

@jordanlewis what's your take on the severity of this issue? Should we block the 22.2.2 release? Should we consider disabling the streamer in 22.2.2? I'm tentatively adding the corresponding release-blocker labels.

@yuzefovich yuzefovich added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 labels Dec 28, 2022

knz commented Dec 28, 2022

I thought the API contract was that a LeafTxn cannot perform writes, including placing locks? That's what I remember from the txn / txncoordsender tech note.

Nvm, it can place lock spans. But they need to be "imported" into the RootTxn at the end.

@DrewKimball
Collaborator

But they need to be "imported" into the RootTxn at the end.

That should happen during metadata draining here where GetLeafTxnFinalState is called by the index join and then here when UpdateRootWithLeafFinalState is called in the DistSQLReceiver, right? But the result of GetLeafTxnFinalState has no lock spans when I try the example.

@yuzefovich
Member

GetLeafTxnFinalState is only about the refresh spans (at least right now). Perhaps a gap was introduced when FOR UPDATE was implemented - namely, it was assumed that we'd always be using the root txn (since FOR UPDATE implies that the txn will be performing writes, which would prohibit the use of a leaf txn), but this assumption is never verified, and the usage of the streamer breaks it.
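
Roughly, the missing verification amounts to something like the following sketch. All of the types and names here are hypothetical stand-ins, not the actual kv code:

package main

import (
	"errors"
	"fmt"
)

type txnType int

const (
	rootTxn txnType = iota
	leafTxn
)

type lockStrength int

const (
	lockNone      lockStrength = iota // default, non-locking read
	lockForShare
	lockForUpdate
)

type request struct {
	strength lockStrength
}

var errLeafLocking = errors.New("locking request issued by leaf txn")

// checkLockingRequest makes the previously unverified assumption explicit:
// a leaf txn must never issue a request with non-default locking strength.
func checkLockingRequest(t txnType, r request) error {
	if t == leafTxn && r.strength != lockNone {
		return errLeafLocking
	}
	return nil
}

func main() {
	fmt.Println(checkLockingRequest(rootTxn, request{strength: lockForUpdate})) // <nil>
	fmt.Println(checkLockingRequest(leafTxn, request{strength: lockForUpdate})) // error
}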

@yuzefovich
Member

Probably the best quick fix is to examine the plan to see whether non-default key-locking is used and to not use the streamer if so. I'll work on that today.
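
As a sketch, the quick fix described above boils down to a walk over the plan looking for non-default key locking; the types below are toys, not the actual planner code:

package main

import "fmt"

type lockStrength int

const (
	lockDefault lockStrength = iota
	lockForUpdate
)

type planNode struct {
	name     string
	locking  lockStrength
	children []*planNode
}

// hasNonDefaultLocking reports whether any node in the (toy) plan requests
// a non-default key locking strength.
func hasNonDefaultLocking(n *planNode) bool {
	if n == nil {
		return false
	}
	if n.locking != lockDefault {
		return true
	}
	for _, c := range n.children {
		if hasNonDefaultLocking(c) {
			return true
		}
	}
	return false
}

func main() {
	plan := &planNode{
		name: "index join", locking: lockForUpdate,
		children: []*planNode{{name: "scan", locking: lockForUpdate}},
	}
	useStreamer := !hasNonDefaultLocking(plan)
	fmt.Println("use streamer:", useStreamer) // false: fall back to the root-txn code path
}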


knz commented Dec 28, 2022

Is it not simpler to extend GetLeafTxnFinalState to also include the lock spans if there are any?

@yuzefovich
Member

Thanks. Indeed, that seems simple enough.

@yuzefovich yuzefovich self-assigned this Dec 28, 2022
@yuzefovich yuzefovich moved this from Triage to Active in SQL Queries Dec 28, 2022

knz commented Dec 28, 2022

maybe double check that UpdateRootWithLeafFinalState merges the lock spans too
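
For reference, a toy sketch of the propagation being discussed; the real GetLeafTxnFinalState / UpdateRootWithLeafFinalState signatures and span types differ:

package main

import "fmt"

type span struct{ key, endKey string }

type leafTxnFinalState struct {
	refreshSpans []span
	lockSpans    []span // the piece that would need to be added
}

type rootTxnState struct {
	refreshSpans []span
	lockSpans    []span
}

// updateRootWithLeafFinalState merges both refresh spans and lock spans
// from the leaf into the root, so the root cleans the locks up on commit.
func (r *rootTxnState) updateRootWithLeafFinalState(l leafTxnFinalState) {
	r.refreshSpans = append(r.refreshSpans, l.refreshSpans...)
	r.lockSpans = append(r.lockSpans, l.lockSpans...)
}

func main() {
	root := &rootTxnState{}
	leaf := leafTxnFinalState{lockSpans: []span{{key: "/Table/104/1/1", endKey: ""}}}
	root.updateRootWithLeafFinalState(leaf)
	fmt.Printf("root lock spans after merge: %+v\n", root.lockSpans)
}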

craig bot pushed a commit that referenced this issue Dec 29, 2022
94399: sql: don't use streamer for local flows and non-default key locking r=yuzefovich a=yuzefovich

This commit makes it so that we don't use the streamer API when running
fully-local plans if some processors in that flow require non-default
key locking. This change allows us to fix a regression with `SELECT FOR
UPDATE` where the acquired locks by that stmt would not be properly
cleaned up on the txn commit (because we currently don't propagate lock
spans for leaf txns, and the streamer requires us to use the leaf txns).
The initial attempt to fix this propagation exposed an issue with
multi-tenant setups, so for now we choose to simply not use the streamer
in such cases.

Additionally, this commit disables the parallelization of local scans
when non-default key locking strength is found. The parallelization of
local scans also requires us to use the leaf txns, and it was introduced
in the 21.2 release; however, we haven't had any user reports.
seems reasonable to update that code path with the recent learnings.

Addresses: #94290.
Addresses: #94400.

Epic: None

Release note (bug fix): Previously, CockroachDB could delay the release
of the locks acquired when evaluating SELECT FOR UPDATE statements in
some cases. This delay (up to 5s) could then block future readers.
The bug was introduced in 22.2.0, and the temporary workaround without
upgrading to a release with this fix is to set the undocumented cluster
setting `sql.distsql.use_streamer.enabled` to `false`.

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@yuzefovich yuzefovich removed S-1 High impact: many users impacted, serious risk of high unavailability or data loss release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Dec 29, 2022
@yuzefovich
Member

@Nican thanks for filing the issue, and @jordanlewis thanks for working out the concise reproduction. We have merged the fixes to the master and 22.2 branches, and the fix should be included in the 22.2.2 release.

This issue now becomes about figuring out how to lift the restriction introduced in #94399. In particular, we either need to propagate the lock spans across leaf txns or re-implement how we do row-level locking. I'll update the issue description accordingly.

@yuzefovich yuzefovich removed their assignment Dec 29, 2022
@yuzefovich yuzefovich removed this from Active in SQL Queries Dec 29, 2022
@yuzefovich yuzefovich changed the title kvstreamer: FOR UPDATE unexpectedly holds a lock for longer than expected sql, kv: support non-default key locking with leaf txns Dec 29, 2022
@yuzefovich yuzefovich added this to Triage in SQL Queries via automation Dec 29, 2022
@DrewKimball DrewKimball moved this from Triage to Backlog in SQL Queries Jan 3, 2023
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Jan 23, 2023
miraradeva added a commit to miraradeva/cockroach that referenced this issue Mar 21, 2023
Previously, as noted in cockroachdb#94290, it was possible for a LeafTxn to issue locking requests as part of SELECT FOR UPDATE. This behavior was unexpected and the RootTxn wasn't properly cleaning up the locks, resulting in others waiting for those locks to be released. The issue was resolved, in cockroachdb#94399, by ensuring non-default locking strength transactions don't use the streamer API and always run as RootTxn.

This patch adds an assertion on the kv side to prevent other existing or future attempts of LeafTxn issuing locking requests. We don't expect that there are such existing cases, so we don't expect this assertion to fail, but will keep an eye on the nightly tests to make sure.

Fixes: cockroachdb#97817
Release note: None
miraradeva added a commit to miraradeva/cockroach that referenced this issue Mar 21, 2023
miraradeva added a commit to miraradeva/cockroach that referenced this issue Mar 22, 2023
craig bot pushed a commit that referenced this issue Mar 22, 2023
98741: ci: update bazel builder image r=rickystewart a=cockroach-teamcity

Release note: None
Epic: None


98878: backupccl: fix occasional TestRestoreErrorPropagates flake r=stevendanna a=adityamaru

Very rarely, under stress race, another automatic job would race with the restore and increment the error count. This would result in the count being greater than our expected value of 1. This change disables all the automatic jobs, eliminating the chance of this race.

Fixes: #98037

Release note: None

99099: kvserver: deflake TestReplicaTombstone r=andrewbaptist a=tbg

Like many other tests, this test could flake because we'd sometimes
catch a "cannot remove learner while snapshot is in flight" error.

I think the root cause is that sometimes there are errant Raft snapshots
in the system[^1] and these get mistaken for LEARNERs that are still
being caught up by the replicate queue. I tried to address this general
class of issues by making the check for in-flight learner snapshots not
care about *raft* snapshots.

I was able to stress TestReplicaTombstone for 30+ minutes without a
failure using that approach, whereas previously it usually failed within
a few minutes.

```
./dev test --stress pkg/kv/kvserver/ --filter TestReplicaTombstone 2>&1 | tee stress.log
[...]
2461 runs so far, 0 failures, over 35m45s
```

[^1]: #87553

Fixes #98883.

Epic: none
Release note: None


99126: kv: return error on locking request in LeafTxn r=nvanbenschoten a=miraradeva

Previously, as noted in #94290, it was possible for a LeafTxn to issue locking requests as part of SELECT FOR UPDATE. This behavior was unexpected and the RootTxn wasn't properly cleaning up the locks, resulting in others waiting for those locks to be released. The issue was resolved, in #94399, by ensuring non-default locking strength transactions don't use the streamer API and always run as RootTxn.

This patch adds an assertion on the kv side to prevent other existing or future attempts of LeafTxn issuing locking requests. We don't expect that there are such existing cases, so we don't expect this assertion to fail, but will keep an eye on the nightly tests to make sure.

Fixes: #97817
Release note: None

99150: backupccl: stop logging unsanitized backup stmt in schedule executor r=stevendanna a=msbutler

Informs #99145

Release note: None

Co-authored-by: cockroach-teamcity <teamcity@cockroachlabs.com>
Co-authored-by: adityamaru <adityamaru@gmail.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Mira Radeva <mira@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this issue Mar 23, 2023
@yuzefovich yuzefovich added the T-sql-queries SQL Queries Team label Feb 5, 2024