release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

kvoli · 2023-06-07T23:53:36Z

Backport 1/1 commits from #89217 on behalf of @kvoli.
Backport 1/1 commits from #88720 on behalf of @kvoli.
Backport 1/1 commits from #91462 on behalf of @kvoli.
Backport 1/3 commits from #103690 on behalf of @kvoli.
Backport 1/1 commits from #104082 on behalf of @kvoli.

/cc @cockroachdb/release

See individual commit messages. The most important commits are the
last two, backported to 23.1 in #103876. There was some serious
wrestling to backport this fix. In order to avoid future problems if
we decide to backport again, I included some dependency commits which
are lower risk:

kvserver: Use response data in the load-based splitter #89217
- Use response data in load based splitter
kvserver: Fix performance regression due to new call to collectSpansRead #91462
- Fixes performance regression introduced in above commit.
split: add observability for when load based splitting cannot find a key #88720
- Observability improvements for load based splits.

#103690 fixes the unsafe load based splits and #104082 addresses
a bug in #103690 where a panic could occur if attempting to load-based
split meta1. These are backported already to 23.1.

It was possible for a SQL row to be torn across two ranges due to the
load-based splitter not rejecting potentially unsafe split keys. It is
impossible to determine with just the sampled request keys, whether a
key is certainly unsafe or safe, so a split key is returned regardless of error.

This PR side steps this problem by re-using the
adminSplitWithDescriptor command to find the first real key, after or
at the provided args.SplitKey. This ensures that the split key will
always be a real key whilst not requiring any checks in the splitter
itself.

The updated adminSplitWithDescriptor is local only and requires opting
into finding the first safe key by setting findFirstSafeKey to true.

As such, all safe split key checks are also removed from the split
pkg, with a warning added that the any split key returned is unsafe.

Resolves: #103483

Release note (bug fix): It was possible for a SQL row to be split across
two ranges. When this occurred, SQL queries could return unexpected
errors. This bug is resolved by these changes, as we now sample the real
keys, rather than just request keys to determine load-based split points.

Release justification: Serious bug fix.

We investigated why running YCSB Workload E results in a single hot range and we observed that range queries of the form SELECT * FROM table WHERE pkey >= A LIMIT B will result in all request spans having the same end key - similar to [A, range_end) - rather than end keys that take into account the specified LIMIT. Since the majority of request spans have the same end key, the load splitter algorithm cannot find a split key without too many contained and balance between left and right requests. A proposed solution is to use the response span rather than the request span, since the response span is more accurate in reflecting the keys that this request truly iterated over. We utilize the request span as well as the response's resume span to derive the key span that this request truly iterated over. Using response data (resume span) rather than just the request span in the load-based splitter (experimentally) allows the load-based splitter to find a split key under range query workloads (YCSB Workload E, KV workload with spans). Release note (ops change): We use response data rather than just the request span in the load-based splitter to pass more accurate data about the keys iterated over to the load splitter to find a suitable split key, enabling the load splitter to find a split key under heavy range query workloads.

Previously, there were no metrics or logging in the load-based splitter. This was inadequate because there is minimal observability into why the load splitter could not find a split key. To address this, this patch adds metrics and logging to the load splitter, including counter metrics indicating number of times could not find a split key and popular key (>25% occurrence) and logging indicating causes for no split key (insufficient counters, imbalance, too many contained). Release note (ops change): Added observability for when load based splitting cannot find a key to indicate the reasons why the load splitter could not find a split key, enabling us to have more observability and insight to debug why a range is not splitting more easily.

When we incorporated the use of response data in the load-based splitter, we called collectSpansRead, which is allocation heavy and computationally expensive, resulting in a performance regression. To address this, this patch performs 3 optimizations: 1. Remove the call to collectSpansRead; instead, add a custom function to iterate over the batch of requests / responses and calculate the true spans 2. Instead of constructing a *spanset.SpanSet and finding the union of spans (which uses O(batch_size) memory), we directly compute the union of spans while iterating over the batch resulting in only O(1) memory used 3. Lazily compute the union of true spans only when it is truly needed i.e. we are under heavy load (e.g. >2500QPS) and a load-based splitter has been initialized Release note: None

It was possible for a SQL row to be torn across two ranges due to the load-based splitter not rejecting potentially unsafe split keys. It is impossible to determine with keys sampled from response spans, whether a key is certainly unsafe or safe. This commit side steps this problem by re-using the `adminSplitWithDescriptor` command to find the first real key, after or at the provided `args.SplitKey`. This ensures that the split key will always be a real key whilst not requiring any checks in the splitter itself. The updated `adminSplitWithDescriptor` is local only and requires opting into finding the first safe key by setting `findFirstSafeKey` to `true`. As such, all safe split key checks are also removed from the `split` pkg, with a warning added that the any split key returned is unsafe. Resolves: cockroachdb#103483 Release note (bug fix): It was possible for a SQL row to be split across two ranges. When this occurred, SQL queries could return unexpected errors. This bug is resolved by these changes, as we now inspect the real keys, rather than just request keys to determine load-based split points.

It was possible that a load based split was suggested for `meta1`, which would call `MVCCFirstSplitKey` and panic as the `meta1` start key `\Min` is local, whilst the `meta1` end key is global `0x02 0xff 0xff`. Add a check that the start key is greater than the `meta1` end key before processing in `MVCCFirstSplitKey` to prevent the panic. Note `meta1` would never be split regardless, as `storage.IsValidSplitKey` would fail after finding a split key. Also note that if the desired split key is a local key, the same problem doesn't exist as the minimum split key would be used to seek the first split key instead. Fixes: cockroachdb#104007 Release note: None

cockroach-teamcity · 2023-06-07T23:53:49Z

This change is

nvanbenschoten

Reviewed 1 of 1 files at r1, 16 of 16 files at r2, 4 of 4 files at r3, 10 of 10 files at r4, 2 of 2 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @sumeerbhola)

pkg/kv/kvserver/replica_split_load.go line 60 at r2 (raw file):

	}
	shouldInitSplit := r.loadBasedSplitter.Record(ctx, timeutil.Now(), len(ba.Requests), func() roachpb.Span {
		return spans.BoundarySpan(spanset.SpanGlobal)

We discussed on Slack that this change allows us to recommend local split points. These local keys will be handled correctly in adminSplitWithDescriptor. There is also protection against local keys being used as split points later in the function.

pkg/storage/mvcc.go line 5506 at r5 (raw file):

	// cannot be split.
	if startKey.Less(roachpb.RKey(keys.LocalMax)) {
		return nil, nil

Did you consider setting startKey = keys.MetaMin in this case, instead of failing the split? We don't want to split in meta1, but we do want to allow meta1 to split away from meta2. The effect of this logic is that we disable all load-based splits on range 1.

kvoli

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)

pkg/storage/mvcc.go line 5506 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Did you consider setting startKey = keys.MetaMin in this case, instead of failing the split? We don't want to split in meta1, but we do want to allow meta1 to split away from meta2. The effect of this logic is that we disable all load-based splits on range 1.

I did consider it but assumed that there are plenty of size splits on meta2 such that we'd never need to look for LB splits in r1.

The size split code does what you're suggesting.

cockroach/pkg/storage/mvcc.go

Lines 5864 to 5866 in 6235182

    
           if key.Less(roachpb.RKey(keys.LocalMax)) { 
        
           	key = roachpb.RKey(keys.LocalMax) 
        
           }

I'm thinking my assumption is okay but doing what you suggested and size splits do would be better.

Should I open up a PR on master and start the backport process back to here - also should this hold up the backport to 22.2?

nvanbenschoten

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli and @sumeerbhola)

pkg/storage/mvcc.go line 5506 at r5 (raw file):

Should I open up a PR on master and start the backport process back to here

That sounds good. Disabling load-based splits on range 1 is a limitation, but it doesn't feel like a serious probable (for the reasons you listed), so we don't need to hold up this backport on that other PR.

kvoli · 2023-06-15T18:11:03Z

Filed the issue #104995

TYFTR!

KaiSun314 and others added 5 commits June 7, 2023 12:46

kvoli changed the title ~~kvserver: use response data in the load-based splitter~~ kvserver: avoid load based splits in middle of SQL row Jun 7, 2023

kvoli mentioned this pull request Jun 8, 2023

kv: load-based splitter permits splits between SQL rows backport tracker #104353

Closed

3 tasks

kvoli marked this pull request as ready for review June 8, 2023 02:27

kvoli requested review from a team as code owners June 8, 2023 02:27

kvoli requested a review from sumeerbhola June 8, 2023 02:27

dhartunian removed the request for review from a team June 8, 2023 14:23

yuzefovich changed the title ~~kvserver: avoid load based splits in middle of SQL row~~ release-22.2: kvserver: avoid load based splits in middle of SQL row Jun 8, 2023

nvanbenschoten self-requested a review June 9, 2023 14:45

nvanbenschoten reviewed Jun 15, 2023

View reviewed changes

kvoli requested a review from nvanbenschoten June 15, 2023 15:57

kvoli commented Jun 15, 2023

View reviewed changes

nvanbenschoten approved these changes Jun 15, 2023

View reviewed changes

kvoli merged commit a416bb2 into cockroachdb:release-22.2 Jun 15, 2023
6 checks passed

This was referenced Jun 16, 2023

PR #104563 - kvserver: use response data in the load-based splitter cockroachdb/docs#17285

Open

PR #104563 - split: add observability for when load based splitting cannot find a key cockroachdb/docs#17286

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

kvoli commented Jun 7, 2023

cockroach-teamcity commented Jun 7, 2023

nvanbenschoten left a comment

kvoli left a comment

nvanbenschoten left a comment

kvoli commented Jun 15, 2023

	if key.Less(roachpb.RKey(keys.LocalMax)) {
	key = roachpb.RKey(keys.LocalMax)
	}

release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

release-22.2: kvserver: avoid load based splits in middle of SQL row #104563

Conversation

kvoli commented Jun 7, 2023

cockroach-teamcity commented Jun 7, 2023

nvanbenschoten left a comment

Choose a reason for hiding this comment

kvoli left a comment

Choose a reason for hiding this comment

nvanbenschoten left a comment

Choose a reason for hiding this comment

kvoli commented Jun 15, 2023