Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: eliminate range write latch during ranged intent resolution #70137

Open
nvanbenschoten opened this issue Sep 13, 2021 · 1 comment
Open
Labels
A-kv-transactions Relating to MVCC and the transactional model. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Sep 13, 2021

#22349 and more recently #41720 (comment) detail a proposal for eliminating all latching during intent resolution. This issue explores a way in which we could make ranged intent resolution less disruptive to foreground traffic even before we eliminate all latching.

Currently, intent resolution acquires a write latch across its entire span at the target transaction's min_timestamp:

latchSpans.AddMVCC(spanset.SpanReadWrite, req.Header().Span(), minTxnTS)

Because this is a write latch and because it has an early timestamp, this effectively blocks all traffic across the entire resolution span.

For point intent resolution, this is unfortunate but not terribly so. Anyone who conflicts with the point latch would have otherwise conflicted with the intent being resolved anyway, so it wasn't introducing any new coordination between background and foreground processes. Additionally, if the intent had already been resolved and the intent resolution request was redundant, the request would perform a point read, notice the missing intent, skip Raft, and release its latches quickly.

For ranged intent resolution, things are worse for a few reasons. First, this is disruptive to other requests in the same span, even if those would not otherwise conflict with the intents being resolved. This means that ranged intent resolution introduces new coordination between background and foreground processes that didn't otherwise already exist. So a ranged intent resolution request scanning from [a, d) to resolve intents at key a and c will block a read or write at key b. Second, because ranges intent resolution requires an expensive scan, ranged intent resolution requests are disruptive even when they are redundant. This second point is why #66741 got so bad. #66268 went a long way to make ranged intent resolution cheaper, so we're in a better spot than we were before, but these two problems still remain.

Even before removing all latching from intent resolution, we could make ranged intent resolution less disruptive by avoiding ranged write latches. Ranged intent resolution can be thought of a two-step process:

  1. scan to find intents in key span (with an optional key limit)
  2. resolve each of these intents

If we handled each of these steps discretely, we could be more selective about when we hold latches and how disruptive we are to other traffic. For instance, an alternative approach to ranged intent resolution would be to evaluate it as:

1. acquire read latch across the entire span
2. scan and collect intent keys to resolve
3. release read latch
4. acquire point write latches on each intent key
5. resolve intent keys in a batch (evaluation + consensus)
6. release write latches

This works because point intent resolution already handles missing intents properly, so there are no races here.

It also follows that latches may not even be necessary for the initial scan at all, as long as it is still properly synchronized with range lifecycle events (splits, merges, rebalances, etc). We'll have to solve this problem for latch-less intent resolution, so we should be aware of it here as well.

Jira issue: CRDB-9956

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-transactions Relating to MVCC and the transactional model. labels Sep 13, 2021
@nvanbenschoten nvanbenschoten added this to Incoming in KV via automation Sep 13, 2021
@blathers-crl blathers-crl bot added the T-kv KV Team label Sep 13, 2021
@nvanbenschoten nvanbenschoten added this to Isolation: Transactions in KV 22.1 Planning (WIP) Sep 13, 2021
@nvanbenschoten nvanbenschoten moved this from Incoming to On Hold in KV Sep 13, 2021
@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team
Projects
KV
On Hold
KV 22.1 Planning (WIP)
Isolation: Transactions
Development

No branches or pull requests

1 participant