ccl/backupccl,ccl/importccl: disable merges during IMPORT and RESTORE #32538

Merged
merged 2 commits into cockroachdb:master from petermattis:pmattis/disable-merges on Nov 26, 2018

Conversation

7 participants
@petermattis (Contributor) commented Nov 21, 2018

Add a facility for periodically gossiping keys which disable range
merges for a set of tables. This is used by IMPORT and RESTORE to
disable range merges on tables being imported / restored. This is done
to prevent the merge queue from fighting against the splitting performed
by IMPORT and RESTORE.

Fixes #29268

Release note: None
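A minimal, self-contained sketch of the gossip side of this idea, assuming a hypothetical gossiper interface and key format (the real facility lives in pkg/gossip and uses gossip.MakeTableDisableMergesKey, referenced later in this review; its API may differ):

```go
package importsketch

import (
	"context"
	"fmt"
	"time"
)

// gossiper models the single gossip method this sketch needs; the real
// gossip handle's API may differ.
type gossiper interface {
	AddInfo(key string, val []byte, ttl time.Duration) error
}

// disableMergesKey builds a hypothetical per-table gossip key; the real
// key constructor is gossip.MakeTableDisableMergesKey.
func disableMergesKey(tableID uint32) string {
	return fmt.Sprintf("table-disable-merges:%d", tableID)
}

// gossipDisableMerges periodically re-gossips short-lived keys for the
// given tables until ctx is canceled. Because each key carries a TTL,
// merges re-enable themselves automatically if the node driving the
// IMPORT/RESTORE dies without cleaning up.
func gossipDisableMerges(ctx context.Context, g gossiper, tableIDs []uint32) {
	const interval = 15 * time.Second // assumed refresh cadence
	const ttl = 5 * interval          // keys expire soon after refreshes stop

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		for _, id := range tableIDs {
			// Best effort: a missed gossip only means merges may briefly
			// re-enable for this table.
			_ = g.AddInfo(disableMergesKey(id), nil, ttl)
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}
```

Because the keys carry a TTL and are refreshed in a loop, the disable-merges state expires on its own once the job stops, which is why this mechanism leaves no permanent trace.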

petermattis requested review from cockroachdb/core-prs as code owners on Nov 21, 2018

@cockroach-teamcity (Member) commented Nov 21, 2018

This change is Reviewable

petermattis requested review from dt, benesch and tbg on Nov 21, 2018

petermattis force-pushed the petermattis:pmattis/disable-merges branch from 6fd3063 to 80eba6d on Nov 21, 2018

@benesch (Collaborator) left a comment

:lgtm_strong: 🎉 thank you!

Reviewed 9 of 9 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/merge_queue.go, line 127 at r1 (raw file):

}

func (mq *mergeQueue) mergesDisabled(desc *roachpb.RangeDescriptor) bool {

slight preference for mergesDisabledForRange or similar to disambiguate from the cluster-wide case


pkg/storage/merge_queue.go, line 167 at r1 (raw file):

		return false, 0
	}

nit: this splits the computation of sizeRatio from its usage. I think this should go after NeedsSplit and before sizeRatio.

@dt (Member) approved these changes Nov 21, 2018

:lgtm:

petermattis force-pushed the petermattis:pmattis/disable-merges branch 3 times, most recently from b123243 to 6aef338 on Nov 21, 2018

@petermattis (Contributor) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 245 at r3 (raw file):

						"cr.store.queue.merge.process.success": 20,
						"cr.store.queue.merge.process.failure": 20,
					})

I'm not sure if these thresholds are right as I'm still seeing a handful of merges during restore. PTAL at how this mechanism works.


pkg/cmd/roachtest/restore.go, line 272 at r3 (raw file):

// metric does not exceed some threshold during a test. For example, the
// restore and import tests verify that the range merge queue is inactive.
func verifyMetrics(ctx context.Context, c *cluster, m map[string]float64) error {

@mrtracy Inspired by verifyTxnPerSecond. PTAL and tell me if I'm doing anything wrong.
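As a rough, self-contained sketch of the shape of such a check: a hypothetical sampleFetcher stands in for the real timeseries query, and the actual helper instead takes a *cluster plus a map of metric-name limits like the one quoted above.

```go
package roachtestsketch

import (
	"context"
	"fmt"
	"time"
)

// sampleFetcher stands in for the timeseries query the real helper
// issues against the cluster; it returns the most recent datapoint for
// a metric.
type sampleFetcher func(ctx context.Context, metric string) (value float64, ts time.Time, err error)

// verifyMetricsSketch polls each metric until ctx is done and fails as
// soon as any sample meets or exceeds its limit.
func verifyMetricsSketch(ctx context.Context, fetch sampleFetcher, limits map[string]float64) error {
	ticker := time.NewTicker(10 * time.Second) // assumed sample period
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil // test finished without tripping a limit
		case <-ticker.C:
		}
		for metric, limit := range limits {
			value, ts, err := fetch(ctx, metric)
			if err != nil {
				continue // tolerate transient query errors
			}
			if value >= limit {
				// Include the sample timestamp so a failure can be
				// correlated with the node logs.
				return fmt.Errorf("%s: %.1f >= %.1f at %s", metric, value, limit, ts)
			}
		}
	}
}
```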

petermattis force-pushed the petermattis:pmattis/disable-merges branch from 6aef338 to 6f83dc5 on Nov 21, 2018

@mrtracy (Collaborator) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 302 at r5 (raw file):

		now := timeutil.Now()
		request.StartNanos = now.Add(-sample).UnixNano()

What's the specific value in querying this as-soon-as-possible? I would suggest querying for a different range, something like [now-sample*3, now-sample], to ensure that you always get a few samples in the near past (and also don't run into the 'cannot query time series in the future' issue you're seeing here). The time series data is always historical, so you will still detect the same failures you were detecting before, you'll just detect them ten seconds later in the test.
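A small sketch of the suggested window, using only the standard library; the two values would feed the request's start and end bounds.

```go
package roachtestsketch

import "time"

// queryWindow returns a window that ends one sample period in the past,
// so the still-open, most recent resolution bucket is never requested
// (the source of the "cannot query time series in the future" error).
func queryWindow(now time.Time, sample time.Duration) (startNanos, endNanos int64) {
	return now.Add(-3 * sample).UnixNano(), now.Add(-sample).UnixNano()
}
```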


pkg/cmd/roachtest/restore.go, line 324 at r5 (raw file):

			value := data[n-1].Value
			if value >= limit {
				return fmt.Errorf("%s: %.1f >= %.1f", name, value, limit)

Might want to output the timestamp of the failed sample to correlate it with logs.

@mrtracy (Collaborator) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 302 at r5 (raw file):

Previously, mrtracy (Matt Tracy) wrote…

What's the specific value in querying this as-soon-as-possible? I would suggest querying for a different range, something like [now-sample*3, now-sample], to ensure that you always get a few samples in the near past (and also don't run into the 'cannot query time series in the future' issue you're seeing here). The time series data is always historical, so you will still detect the same failures you were detecting before, you'll just detect them ten seconds later in the test.

s/as-soon-as-possible/as-recently-as-possible/

@petermattis (Contributor) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 245 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I'm not sure if these thresholds are right as I'm still seeing a handful of merges during restore. PTAL at how this mechanism works.

Ok, I figured out the reason for the handful of merges. My mechanism to disable range merges was only checking the start key of a range. The merge queue was periodically picking up the range immediately before the table being restored and seeing it as a candidate for merging and queueing the range. When the queue went to process the range it would decide not to do anything. But despite not doing anything the merge queue success metric would be bumped because that is how queue metrics work.

Something curious is that I wasn't seeing the log message from the merge queue I was expecting. Specifically, I expected the ranges not to be merged because the merged range would require a split. I'm going to dig into this further.


pkg/storage/merge_queue.go, line 143 at r5 (raw file):

	}
	_, err = mq.gossip.GetInfo(gossip.MakeTableDisableMergesKey(uint32(tableID2)))
	return err == nil

@benesch The logic changed here. Now I'm checking both the start key and the end key so that we don't queue a range if merges are disabled on it or on its right sibling.
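A sketch of that check with stand-ins for the key decoding and the gossip lookup; the real code decodes the table IDs from desc.StartKey and desc.EndKey via keys.DecodeTablePrefix, and the names and key format here are hypothetical.

```go
package mergequeuesketch

import "fmt"

// gossipReader models the lookup used below; GetInfo is assumed to
// return an error when the key is not present.
type gossipReader interface {
	GetInfo(key string) ([]byte, error)
}

// disableMergesKey repeats the hypothetical key format from the earlier
// sketch so this snippet stands on its own.
func disableMergesKey(tableID uint32) string {
	return fmt.Sprintf("table-disable-merges:%d", tableID)
}

// mergesDisabledForRange reports whether merges are disabled for a range
// whose start key falls in startTableID and whose end key falls in
// endTableID. Checking both IDs keeps the queue from picking up the
// range immediately to the left of a table being imported or restored.
func mergesDisabledForRange(g gossipReader, startTableID, endTableID uint32) bool {
	if _, err := g.GetInfo(disableMergesKey(startTableID)); err == nil {
		return true
	}
	if _, err := g.GetInfo(disableMergesKey(endTableID)); err == nil {
		return true
	}
	return false
}
```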

@petermattis (Contributor) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 245 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Ok, I figured out the reason for the handful of merges. My mechanism to disable range merges was only checking the start key of a range. The merge queue was periodically picking up the range immediately before the table being restored and seeing it as a candidate for merging and queueing the range. When the queue went to process the range it would decide not to do anything. But despite not doing anything the merge queue success metric would be bumped because that is how queue metrics work.

Something curious is that I wasn't seeing the log message from the merge queue I was expecting. Specifically, I expected the ranges not to be merged because the merged range would require a split. I'm going to dig into this further.

@benesch I added a bit of logic to the merge queue to understand the weirdness I'm seeing and disabled a portion of this PR. The log message says:

I181121 20:31:38.268606 1751 storage/merge_queue.go:272  [n1,merge,s1,r20/1:/Table/{23-53}] merging ranges: /Table/{23-53} + /{Table/53-Max} -> /{Table/23-Max}

This is output immediately after the call to shouldSplitRange. But this is super bizarre because /Table/53 is the table being restored.

Oh, I see what the problem is. We've allocated ID 53 for the table being restored, but the descriptor isn't written until the restore is finished. So sysCfg.NeedsSplit doesn't see a need for a split at /Table/53. I'm going to file an issue about this. It is somewhat surprising that a table being restored doesn't show up in the admin UI, and it also has an effect here.


pkg/cmd/roachtest/restore.go, line 302 at r5 (raw file):

Previously, mrtracy (Matt Tracy) wrote…

s/as-soon-as-possible/as-recently-as-possible/

How will now-sample*3 avoid the "cannot query time series in the future" issue? Is that the only change you're suggesting here? There is no particular need to have the most up-to-date data, I just want the last time series value for these metrics. Tell me what code to type in order to achieve that, sensei.


pkg/cmd/roachtest/restore.go, line 324 at r5 (raw file):

Previously, mrtracy (Matt Tracy) wrote…

Might want to output the timestamp of the failed sample to correlate it with logs.

Done (or it will be when I push, I'm pulling on a thread right now that wants to unravel).

petermattis force-pushed the petermattis:pmattis/disable-merges branch from 6f83dc5 to 213930a on Nov 21, 2018

petermattis force-pushed the petermattis:pmattis/disable-merges branch from 213930a to a931344 on Nov 21, 2018

@mrtracy (Collaborator) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 302 at r5 (raw file):

Previously, petermattis (Peter Mattis) wrote…

How will now-sample*3 avoid the "cannot query time series in the future" issue? Is that the only change you're suggesting here? There is no particular need to have the most up-to-date data, I just want the last time series value for these metrics. Tell me what code to type in order to achieve that, sensei.

Well, the most efficient option in terms of queries would be to wait until the end of the test, then query the time series for the entire duration of the test and verify every individual timestamp. However, it appears your intention here is to stop the test as soon as this condition fails.

The "cannot query in the future" is actually occurring because you cannot query for the current time period while it is still open (to avoid incomplete datapoints, where not all servers have reported). So, it's not really safe to query only ten seconds in the past, you need to have a start time earlier than that.

My suggestion is just to grab the last few datapoints by querying starting from three sample periods in the past, instead of just one; this will definitely return datapoints and will not run into this particular error. From a query perspective this is not more expensive than querying a single datapoint.

So in clear terms, if you use this line:

request.StartNanos = now.Add(-sample*3).UnixNano()

I think this will work as-is and the code should never encounter the "cannot query time series in the future" error; in fact, you should remove that check because it would indicate an actual problem.

@dt (Member) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 245 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

@benesch I added a bit of logic to the merge queue to understand the weirdness I'm seeing and disabled a portion of this PR. The log message says:

I181121 20:31:38.268606 1751 storage/merge_queue.go:272  [n1,merge,s1,r20/1:/Table/{23-53}] merging ranges: /Table/{23-53} + /{Table/53-Max} -> /{Table/23-Max}

This is output immediately after the call to shouldSplitRange. But this is super bizarre because /Table/53 is the table being restored.

Oh, I see what the problem is. We've allocated ID 53 for the table being restored, but the descriptor isn't written until the restore is finished. So sysCfg.NeedsSplit doesn't see a need for a split at /Table/53. I'm going to file an issue about this. It is somewhat surprising that a table being restored doesn't show up in the admin UI, and it also has an effect here.

Holding off on writing the descriptor until the end was an intentional decision -- we were pretty uncomfortable with AddSSTable's relatively unsafe interaction with anyone else trying to operate in the same key space, so not writing a descriptor was how we kept SQL, the admin UI, etc. from having any reason to be in our key range.

@@ -123,6 +124,25 @@ func (mq *mergeQueue) enabled() bool {
	return st.Version.IsMinSupported(cluster.VersionRangeMerges) && storagebase.MergeQueueEnabled.Get(&st.SV)
}

func (mq *mergeQueue) mergesDisabledForRange(desc *roachpb.RangeDescriptor) bool {
	_, tableID, err := keys.DecodeTablePrefix(desc.StartKey.AsRawKey())

@tbg (Member) commented Nov 21, 2018

Couldn't the range span multiple tables? I think this means you have to look at the range tableID1..tableID2

petermattis force-pushed the petermattis:pmattis/disable-merges branch from d7a1b14 to c176bf5 on Nov 21, 2018

@petermattis (Contributor) left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/storage/merge_queue.go, line 128 at r4 (raw file):

Previously, tbg (Tobias Schottdorf) wrote…

Couldn't the range span multiple tables? I think this means you have to look at the range tableID1..tableID2

Yes, the range can span multiple tables, but the sysCfg.NeedsSplit check will capture that. The only case here that is interesting is when restore or import have created a table that isn't yet in system.descriptors. I believe that makes it fine to only check the table IDs in the start and end key, and not everything in between.

@benesch (Collaborator) left a comment

Reviewed 1 of 1 files at r2, 1 of 2 files at r3, 4 of 4 files at r4, 1 of 2 files at r5, 1 of 1 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 245 at r3 (raw file):

Ok, I figured out the reason for the handful of merges. My mechanism to disable range merges was only checking the start key of a range. The merge queue was periodically picking up the range immediately before the table being restored and seeing it as a candidate for merging and queueing the range. When the queue went to process the range it would decide not to do anything. But despite not doing anything the merge queue success metric would be bumped because that is how queue metrics work.

We need to fix this more generally at the metrics level. There are various other queues that can nope out of processing a replica but still have that counted as a successful action. The replica GC queue comes to mind.

Your solution here LGTM though.


pkg/cmd/roachtest/restore.go, line 328 at r3 (raw file):

		}
	}
}

I didn't review this, just FYI. Figured @mrtracy had it covered.


pkg/storage/merge_queue.go, line 128 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Yes, the range can span multiple tables, but the sysCfg.NeedsSplit will capture that. The only case here that is interesting is when restore or import have created a table that isn't yet in system.descriptors. I believe that makes it fine to only check the table IDs in the start and end key, and not everything in between.

Agreed.


pkg/storage/merge_queue.go, line 143 at r5 (raw file):

Previously, petermattis (Peter Mattis) wrote…

@benesch The logic changed here. Now I'm checking both the start key and the end key so that we don't queue a range if merges are disabled on it or on its right sibling.

LGTM


pkg/storage/merge_queue_test.go, line 168 at r4 (raw file):

		{
			startKey: tableKey(4),
			endKey:   append(tableKey(3), 'z'),

Isn't this interval inverted? Did you want append(tableKey(4), 'z')?

@spencerkimball (Member) commented Nov 22, 2018

I was under the impression that when splitting by load or Admin intention, we noted that fact in the range descriptor to prevent merging for some period of time. Do we not prevent the merge queue from attempting to merge load based splits?

@petermattis (Contributor) commented Nov 23, 2018

> I was under the impression that when splitting by load or Admin intention, we noted that fact in the range descriptor to prevent merging for some period of time. Do we not prevent the merge queue from attempting to merge load based splits?

Yes, the merge queue has logic to avoid merging ranges if load-based splitting would split them right away. It does this by assuming the load on the RHS range is the same as the load on the LHS range (it only knows the load on the LHS range because the RHS might be remote).

We don't currently have any mechanism to prevent the merge queue from merging ranges split manually.
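Sketched under that stated assumption (the RHS load is taken to equal the LHS load, since only the local replica's QPS is known; the threshold name is hypothetical):

```go
package mergequeuesketch

// shouldSkipMergeForLoad sketches the heuristic: if the combined QPS of
// the two ranges would immediately trigger a load-based split of the
// merged range, skip the merge. The RHS QPS is assumed equal to the LHS
// QPS because the RHS replica may be remote.
func shouldSkipMergeForLoad(lhsQPS, splitQPSThreshold float64) bool {
	assumedRHSQPS := lhsQPS
	return lhsQPS+assumedRHSQPS >= splitQPSThreshold
}
```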

@spencerkimball (Member) commented Nov 23, 2018

The lack of a mechanism to introduce hysteresis here feels short-sighted. As soon as load slacks on a busy LBS range, it'll get merged. Seems we should wait a day for that, as load is somewhat (at least) likely to reappear on formerly busy ranges.

It feels overly complex to introduce a gossip-based merge-freezing mechanism when something more fundamental at the range descriptor level feels necessary for load-based splitting, and would handle both cases with more deliberate intention.

@petermattis (Contributor) commented Nov 23, 2018

@benesch Care to weigh in here? I know you considered placing some marker in the RangeDescriptor, but I don't recall why that wasn't implemented.

@tbg (Member) commented Nov 23, 2018

Adding this to the range descriptor is the alternative to this (and probably the solution for 2.2), but it could be a little more involved than anticipated. Adding fields to protos which we use via CPut is difficult because old versions will (perhaps?) drop the new field on the floor and get bogus mismatches (failed replica changes). I think we've been burned by this in the past more than once.

On the flip side, this change leaves no permanent trace and can be kicked out at any time, so it neither locks us into any particular decision nor is it likely to cause trouble (including in backports).

@spencerkimball (Member) commented Nov 23, 2018

@tbg, I don't fully understand the point about CPut. Wouldn't the value of whatever sub-proto we might add – to indicate the reason for a split – just be transmitted with the expected value on the next CPut?

@tbg (Member) commented Nov 26, 2018

@spencerkimball imagine you're a node which doesn't know about this new field, and you read a range descriptor which has the field. You get RangeDescriptor{Stuff} instead of RangeDescriptor{Stuff, NewField: NewStuff} (ignoring XXX_Unrecognized which may save the day but I'm not sure about that). When you go and do a CPut, you'll serialize RangeDescriptor{Stuff} instead of RangeDescriptor{Stuff, NewField: NewStuff} and the CPut fails.

This can all be made to work (and maybe it'll "just work"), but it's more work than this change and I don't want to introduce any more bugs right now. I agree that there's a shortcoming with respect to load-based splitting, but the current priority is getting our fundamental stability problems under control (#31409), which is what ultimately motivated this change.

Perhaps the concern is overblown and going the other route right away would've made sense, which is something you could explore if you have the bandwidth, but note that you'd need mixed-version testing and must not introduce a cluster setting (if this needs to be backported, which maybe isn't a requirement with this PR in).
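To make the failure mode concrete, here is a toy model of the conditional-put semantics involved; the descriptor strings are illustrative placeholders, not real encodings.

```go
package descriptorsketch

import "bytes"

// conditionalPut models the KV-level CPut used for range descriptor
// updates: the write succeeds only if the stored bytes equal expBytes.
func conditionalPut(stored, expBytes, newBytes []byte) (ok bool, result []byte) {
	if !bytes.Equal(stored, expBytes) {
		return false, stored // mismatch: the descriptor update fails
	}
	return true, newBytes
}

// oldNodeUpdateFails shows the scenario described above: the stored
// descriptor contains a field the old node doesn't know about, so the
// expected value the old node re-serializes no longer matches and the
// CPut (and with it the replica change) fails.
func oldNodeUpdateFails() bool {
	stored := []byte("RangeDescriptor{Stuff, NewField: NewStuff}")
	expectedByOldNode := []byte("RangeDescriptor{Stuff}") // NewField dropped on decode
	proposed := []byte("RangeDescriptor{Stuff'}")
	ok, _ := conditionalPut(stored, expectedByOldNode, proposed)
	return ok // false
}
```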

@benesch (Collaborator) commented Nov 26, 2018

I added a field to the RangeDescriptor proto without too much trouble here: ce93650. The trick is to avoid the set-but-zero state. New fields need to be either entirely unset or set to a non-zero value.

I agree that we should commit this fix in the short term. I also agree that the right solution involves storing persistent data somewhere (maybe the range descriptor, maybe a range-local key), but I think we want to send any change like that through a lightweight RFC process.
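An illustration of that rule with a purely hypothetical field (this is not the field added in ce93650); using a pointer keeps "unset" distinguishable from "set".

```go
package descriptorsketch

import "errors"

// descriptorSketch stands in for a proto-backed struct that grows a new
// field. The setter enforces the rule quoted above: the field is either
// entirely unset (nil) or set to a non-zero value, never set-but-zero.
type descriptorSketch struct {
	NewField *int32 // hypothetical new field; nil means unset
}

func (d *descriptorSketch) setNewField(v int32) error {
	if v == 0 {
		return errors.New("leave the field unset instead of writing its zero value")
	}
	d.NewField = &v
	return nil
}
```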

petermattis force-pushed the petermattis:pmattis/disable-merges branch from c176bf5 to 3f7b4e5 on Nov 26, 2018

petermattis added some commits Nov 21, 2018

ccl/backupccl,ccl/importccl: disable merges during IMPORT and RESTORE
Add a facility for periodically gossiping keys which disable range
merges for a set of tables. This is used by IMPORT and RESTORE to
disable range merges on tables being imported / restored. This is done
to prevent the merge queue from fighting against the splitting performed
by IMPORT and RESTORE.

Fixes #29268

Release note (enterprise change): Disable range merges on tables that
are being restored or imported into.

roachtest: add a facility for verifying timeseries metrics
Extend `restore2TB/nodes=10` and `import/tpch/nodes=4` to verify that
the range merge queue is not active during the test.

Release note: None

petermattis force-pushed the petermattis:pmattis/disable-merges branch from 3f7b4e5 to 776bc06 on Nov 26, 2018

@petermattis (Contributor) left a comment

Ok, this should be good to go.

@spencerkimball Did the response from @tbg and @benesch mitigate your concerns?

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 302 at r5 (raw file):

Previously, mrtracy (Matt Tracy) wrote…

Well, the most efficient option in terms of queries would be to wait until the end of the test, then query the time series for the entire duration of the test and verify every individual timestamp. However, it appears your intention here is to stop the test as soon as this condition fails.

The "cannot query in the future" is actually occurring because you cannot query for the current time period while it is still open (to avoid incomplete datapoints, where not all servers have reported). So, it's not really safe to query only ten seconds in the past, you need to have a start time earlier than that.

My suggestion is just to grab the last few datapoints by querying starting from three sample periods in the past, instead of just one; this will definitely return datapoints and will not run into this particular error. From a query perspective this is not more expensive than querying a single datapoint.

So in clear terms, if you use this line:

request.StartNanos = now.Add(-sample*3).UnixNano()

I think this will work as-is and the code should never encounter the "cannot query time series in the future" error; in fact, you should remove that check because it would indicate an actual problem.

Done.


pkg/cmd/roachtest/restore.go, line 242 at r8 (raw file):

				// TODO(peter): This currently causes the test to fail because we see a
				// flurry of valid merges when the restore finishes.

I had to disable my nifty merge check in this test because it would fire incorrectly at the end of the test. Once the restore is finished merges are unblocked and the merge queue would merrily merge a number of ranges. I need to think what can be done about this, but I don't want the fix to hold up this PR.


pkg/storage/merge_queue.go, line 128 at r4 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

Agreed.

Done.


pkg/storage/merge_queue_test.go, line 168 at r4 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

Isn't this interval inverted? Did you want append(tableKey(4), 'z')?

Yep. Fixed.

@spencerkimball (Member) commented Nov 26, 2018 (comment minimized)

@benesch (Collaborator) left a comment

Reviewed 3 of 3 files at r9, 2 of 2 files at r10.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/restore.go, line 242 at r8 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I had to disable my nifty merge check in this test because it would fire incorrectly at the end of the test. Once the restore is finished merges are unblocked and the merge queue would merrily merge a number of ranges. I need to think what can be done about this, but I don't want the fix to hold up this PR.

Why is the restore creating ranges that should be merged away? o_O Perhaps the store dump has some bad splits that restore is faithfully recreating, in which case we could regen the store dump on a cluster with merges enabled.

@petermattis (Contributor) commented Nov 26, 2018

> Why is the restore creating ranges that should be merged away? o_O Perhaps the store dump has some bad splits that restore is faithfully recreating, in which case we could regen the store dump on a cluster with merges enabled.

Unfortunately, I nuked that cluster. I'm sure this would reproduce if you ran restore2TB yourself. 😄

@petermattis (Contributor) commented Nov 26, 2018

bors r=benesch,dt

craig bot pushed a commit that referenced this pull request Nov 26, 2018

Merge #32538
32538: ccl/backupccl,ccl/importccl: disable merges during IMPORT and RESTORE r=benesch,dt a=petermattis

Add a facility for periodically gossiping keys which disable range
merges for a set of tables. This is used by IMPORT and RESTORE to
disable range merges on tables being imported / restored. This is done
to prevent the merge queue from fighting against the splitting performed
by IMPORT and RESTORE.

Fixes #29268

Release note: None

Co-authored-by: Peter Mattis <petermattis@gmail.com>
@craig (bot) commented Nov 26, 2018

Build failed

@petermattis (Contributor) commented Nov 26, 2018

Known CI failure: #31974

bors r=benesch,dt

craig bot pushed a commit that referenced this pull request Nov 26, 2018

Merge #32491 #32538
32491: roachprod: change geo flag to group nodes in a region better r=m-schneider a=m-schneider

Currently, when using the --geo flag with --clouds=aws, we round-robin
regions and availability zones for AWS. This results in a cluster where
the nodes in a given region are scattered.

This PR groups zones within a region, so given 3 regions, the first x
nodes will all be in region 1, the next x in region 2, and the remainder
in region 3.

Release note: None

32538: ccl/backupccl,ccl/importccl: disable merges during IMPORT and RESTORE r=benesch,dt a=petermattis

Add a facility for periodically gossiping keys which disable range
merges for a set of tables. This is used by IMPORT and RESTORE to
disable range merges on tables being imported / restored. This is done
to prevent the merge queue from fighting against the splitting performed
by IMPORT and RESTORE.

Fixes #29268

Release note: None

Co-authored-by: Masha Schneider <masha@cockroachlabs.com>
Co-authored-by: Peter Mattis <petermattis@gmail.com>
@benesch (Collaborator) commented Nov 26, 2018

The Bors batch timed out. Sigh. Bors, why don't you have any retry loops?

bors r+

craig bot pushed a commit that referenced this pull request Nov 26, 2018

Merge #31928 #32491 #32538 #32612
31928: docs: Add statistics SQL diagrams r=lhirata a=lhirata

Release notes: none

32491: roachprod: change geo flag to group nodes in a region better r=benesch a=m-schneider

Currently, when using the --geo flag with --clouds=aws, we round-robin
regions and availability zones for AWS. This results in a cluster where
the nodes in a given region are scattered.

This PR groups zones within a region, so given 3 regions, the first x
nodes will all be in region 1, the next x in region 2, and the remainder
in region 3.

Release note: None

32538: ccl/backupccl,ccl/importccl: disable merges during IMPORT and RESTORE r=benesch a=petermattis

Add a facility for periodically gossiping keys which disable range
merges for a set of tables. This is used by IMPORT and RESTORE to
disable range merges on tables being imported / restored. This is done
to prevent the merge queue from fighting against the splitting performed
by IMPORT and RESTORE.

Fixes #29268

Release note: None

32612: opt: relax provided orderings assertion r=RaduBerinde a=RaduBerinde

This race-only check for provided orderings is too strict - in some
cases, the provided ordering takes into account facts that are not
reflected in the expression's FDs; the currently known case is when a
constrained scan (created during exploration) detects a contradiction
that is not detected during normalization. #32320 tracks the issue of
removing this gap between constraints and index constraints. Until
then, we disable this check.

Fixes #32520.

Release note: None

Co-authored-by: Lauren <lauren@cockroachlabs.com>
Co-authored-by: Masha Schneider <masha@cockroachlabs.com>
Co-authored-by: Peter Mattis <petermattis@gmail.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
@craig (bot) commented Nov 26, 2018

Build succeeded

craig bot merged commit 776bc06 into cockroachdb:master on Nov 26, 2018

3 checks passed

GitHub CI (Cockroach): TeamCity build finished
bors: Build succeeded
license/cla: Contributor License Agreement is signed.

petermattis deleted the petermattis:pmattis/disable-merges branch on Nov 27, 2018
