Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: skip locked does not work correctly with read committed #115057

Closed
arulajmani opened this issue Nov 24, 2023 · 5 comments · Fixed by #117888
Closed

kv: skip locked does not work correctly with read committed #115057

arulajmani opened this issue Nov 24, 2023 · 5 comments · Fixed by #117888
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-23.2.0-rc (deleted) C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-known-limitation T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented Nov 24, 2023

Describe the problem

Currently, when a skip locked request is evaluating, it only checks for in-memory exclusive and shared locks. Notably, it does not check for replicated locks. As a result, read committed transactions (which have a potential to acquire replicated locks) do not work well with SKIP LOCKED.

Related to #110743

Jira issue: CRDB-33860

Epic CRDB-34172

@arulajmani arulajmani added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team A-read-committed Related to the introduction of Read Committed labels Nov 24, 2023
@arulajmani arulajmani self-assigned this Nov 24, 2023
Copy link

blathers-crl bot commented Dec 12, 2023

Hi @michae2, please add branch-* labels to identify which branch(es) this GA-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@michae2 michae2 added branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-23.2.0-rc (deleted) labels Dec 12, 2023
@nvanbenschoten
Copy link
Member

@arulajmani until we address it, should we add in a patch to reject this combination? We had been considering this to be almost as much work as fixing this, but I think it might actually just be a matter of catching LockConflictErrors coming from acquireLocksOnKeys for SkipLocked Get/Scan/RevScans and turning these into unsupported errors.

This wouldn't support non-locking skip-locked requests, but that's not a combination that's actually used by SQL.

@michae2
Copy link
Collaborator

michae2 commented Dec 13, 2023

After talking with @arulajmani and @nvanbenschoten, removing the GA-blocker label. We'll rely on documenting this initially, and maybe @nvanbenschoten's suggestion above if we can get to it.

arulajmani added a commit to arulajmani/cockroach that referenced this issue Dec 14, 2023
Currently, requests running with the skip locked wait policy only have
a handle into the in-memory lock table during evaluation, which excludes
uncontended replicated locks. As such, any conflicting replicated locks
may result in a `LockConflictError` after evaluation instead of getting
skipped over -- this breaks expectations about the skip locked wait
policy.

The linked issue talks about fixing this behaviour. Until fixed, we
disallow interactions between the skip locked wait policy and replicated
locks by transforming `LockConflictErrors` returned by evaluation of
skip locked requests into unimplemented errors.

Informs cockroachdb#115057

Release note: None
@michae2
Copy link
Collaborator

michae2 commented Dec 14, 2023

The limitation to document is that SELECT statements using SKIP LOCKED will not yet always skip over replicated locks (which are the kind used by read committed transactions).

nvanbenschoten pushed a commit to arulajmani/cockroach that referenced this issue Dec 18, 2023
Currently, requests running with the skip locked wait policy only have
a handle into the in-memory lock table during evaluation, which excludes
uncontended replicated locks. As such, any conflicting replicated locks
may result in a `LockConflictError` after evaluation instead of getting
skipped over -- this breaks expectations about the skip locked wait
policy.

The linked issue talks about fixing this behaviour. Until fixed, we
disallow interactions between the skip locked wait policy and replicated
locks by transforming `LockConflictErrors` returned by evaluation of
skip locked requests into unimplemented errors.

Informs cockroachdb#115057

Release note: None
craig bot pushed a commit that referenced this issue Dec 18, 2023
114787: sql: replace single node test clusters with test servers r=yuzefovich a=yuzefovich

This commit audits test clusters that are initialized with a single server in `pkg/sql` and replaces many with just test servers directly. This is a minor cleanup. Additionally, this commit adjusts some tests to work under multi-tenancy once enabled.

One notable change that required an investigation was in `TestLeaseTxnDeadlineExtension`. In particular, there we set the lease duration to 0. Then, with the switch to single test server, we were no longer setting `PartOfCluster` knob which made it so that we'd disable the replication on the server. That, in turn, results in "set-zone" internal queries, and those queries would get retried indefinitely because the query wouldn't be able to commit - IIUC due to zero lease duration it would always set the txn deadline in such a way that it would pass by the time we attempt to commit (both of these things are done in `connExecutor.handleAutoCommit`). Thus, to go around this problem the test continues to set `PartOfCluster` explicitly (similar to what we did in a new test in 3b46520).

Informs: #114769.

Epic: None

Release note: None

116129: logictest: break up multiple stmts into separate queries r=yuzefovich a=yuzefovich

We just saw a test failure where a stmt timed out; however, it's not clear which stmt it was because in `grant_in_txn` we currently have like 50 stmts issued together as part of a single `statement ok` directive. This commit breaks them down into separate commands to make it easier to understand the failure in the future.

Fixes: #115975.

Release note: None

116461: kv: disallow use of skip locked in conjunction with replicated locks  r=nvanbenschoten a=arulajmani

Currently, requests running with the skip locked wait policy only have
a handle into the in-memory lock table during evaluation, which excludes
uncontended replicated locks. As such, any conflicting replicated locks
may result in a `LockConflictError` after evaluation instead of getting
skipped over -- this breaks expectations about the skip locked wait
policy.

The linked issue talks about fixing this behaviour. Until fixed, we
disallow interactions between the skip locked wait policy and replicated
locks by transforming `LockConflictErrors` returned by evaluation of
skip locked requests into unimplemented errors.

Informs #115057

Release note: None

116475: cli/zip: show redacted CREATE TYPE statement in debug zip r=rafiss a=rafiss

The CREATE statement was not possible to view at all before, but now we use the hide_sql_constants builtin to show the redacted version.

Epic: None
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
arulajmani added a commit to arulajmani/cockroach that referenced this issue Dec 19, 2023
Currently, requests running with the skip locked wait policy only have
a handle into the in-memory lock table during evaluation, which excludes
uncontended replicated locks. As such, any conflicting replicated locks
may result in a `LockConflictError` after evaluation instead of getting
skipped over -- this breaks expectations about the skip locked wait
policy.

The linked issue talks about fixing this behaviour. Until fixed, we
disallow interactions between the skip locked wait policy and replicated
locks by transforming `LockConflictErrors` returned by evaluation of
skip locked requests into unimplemented errors.

Informs cockroachdb#115057

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue Dec 19, 2023
Currently, requests running with the skip locked wait policy only have
a handle into the in-memory lock table during evaluation, which excludes
uncontended replicated locks. As such, any conflicting replicated locks
may result in a `LockConflictError` after evaluation instead of getting
skipped over -- this breaks expectations about the skip locked wait
policy.

The linked issue talks about fixing this behaviour. Until fixed, we
disallow interactions between the skip locked wait policy and replicated
locks by transforming `LockConflictErrors` returned by evaluation of
skip locked requests into unimplemented errors.

Informs cockroachdb#115057

Release note: None
@mikeCRL
Copy link
Contributor

mikeCRL commented Jan 8, 2024

@arulajmani Do you know if the suggested limitation will be making it into 23.2 GA? I'm writing up the Known Limitations and it looks like the answer will impact the entry for #115057.

arulajmani added a commit to arulajmani/cockroach that referenced this issue Jan 17, 2024
Previously, we weren't correctly handling replicated locks when using
the SKIP LOCKED wait policy. Instead of skipping the replicated lock,
we'd instead throw an unsupported error. This patch lifts this
limitation. After doing so, we're able to support SKIP LOCKED in
conjunction with both replicated locks and read committed transactions,
as demostrated by the change to logic tests.

Closes cockroachdb#115057

Release note: None
craig bot pushed a commit that referenced this issue Jan 23, 2024
117888: kv: correctly handle replicated locks when using the SKIP LOCKED policy r=nvanbenschoten a=arulajmani

Previously, we weren't correctly handling replicated locks when using the SKIP LOCKED wait policy. Instead of skipping the replicated lock, we'd instead throw an unsupported error. This patch lifts this limitation. After doing so, we're able to support SKIP LOCKED in conjunction with both replicated locks and read committed transactions, as demostrated by the change to logic tests.

Closes #115057

Release note: None

Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
@craig craig bot closed this as completed in bac5c82 Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-23.2.0-rc (deleted) C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-known-limitation T-kv KV Team
Projects
No open projects
Status: Closed
Development

Successfully merging a pull request may close this issue.

4 participants