
backupccl: prepare RESTORE router for multitenancy #81989

Open
msbutler opened this issue May 27, 2022 · 3 comments
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. sync-me sync-me-5 T-disaster-recovery

Comments

@msbutler
Collaborator

msbutler commented May 27, 2022

In a multi-tenant cluster, RESTORE's DistSQL processors are assigned to SQL instances by sqlInstanceID. Currently, the splitAndScatterProcessor routes a scattered range to the SQL instance running the restoreProcessor using the nodeID returned by the adminScatterRequest, which actually identifies a KV instance. In other words, to route ranges for restore ingestion after scatter, we currently assume the list of sqlInstanceIDs from planning is identical to the nodeIDs returned by split and scatter during execution. That is certainly not the case, which implies multitenant restore could be significantly slower. For example, if there are fewer KV instances than planned SQL instances, a subset of SQL instances would never be sent any ranges to ingest!

In a non-multiregion multitenant cluster, we don't know (or even care) which SQL instance is "closest" to a given KV instance; thus, we ought to route ranges for ingestion such that we balance load across all available SQL instances.

  • A simple solution: use a hashRouter as opposed to a rangeRouter. During planning, map each available KV node to a set of SQL instances. If the restore job detects significant churn of SQL instances, the job should be replanned.
  • A better solution: route ranges to sql instances dynamically. I'm not sure if this is possible right now.
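
The "simple solution" above could be sketched as follows. This is a minimal, hypothetical illustration in plain Go: `planRouting` and `routeRange` are invented names, and the ID types stand in for `roachpb.NodeID` and `base.SQLInstanceID`; a real implementation would hook into the DistSQL router interfaces instead.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Stand-ins for roachpb.NodeID and base.SQLInstanceID.
type kvNodeID int
type sqlInstanceID int

// planRouting assigns every available SQL instance to some KV node in
// round-robin order, so every SQL instance appears in exactly one set and
// receives work even when there are fewer KV nodes than SQL instances.
// Assumes len(sqlInstances) >= len(kvNodes), so every KV node gets a set.
func planRouting(kvNodes []kvNodeID, sqlInstances []sqlInstanceID) map[kvNodeID][]sqlInstanceID {
	m := make(map[kvNodeID][]sqlInstanceID)
	for i, inst := range sqlInstances {
		kv := kvNodes[i%len(kvNodes)]
		m[kv] = append(m[kv], inst)
	}
	return m
}

// routeRange picks a SQL instance for a scattered range by hashing the
// range's start key into the set mapped to the KV node that adminScatter
// reported, spreading that node's ranges across its instance set.
func routeRange(m map[kvNodeID][]sqlInstanceID, kv kvNodeID, startKey []byte) sqlInstanceID {
	set := m[kv]
	h := fnv.New32a()
	h.Write(startKey)
	return set[int(h.Sum32())%len(set)]
}

func main() {
	// 2 KV nodes, 4 SQL instances: with the current nodeID-based routing,
	// instances 3 and 4 would never be sent any ranges.
	m := planRouting([]kvNodeID{1, 2}, []sqlInstanceID{1, 2, 3, 4})
	fmt.Println(m[1], m[2]) // [1 3] [2 4]
	fmt.Println(routeRange(m, 1, []byte("table/5/a")))
}
```

Because the KV-node-to-set mapping is fixed at planning time, this is where the replanning caveat bites: if SQL instances churn significantly, the planned sets go stale and the job should be replanned.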

In a multiregion multitenant cluster, we will likely want to route a range to a sql instance that is "close" to the range's leaseholder (or at least a follower?). Solution: apply the solution above, by region.
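
The regional variant could be sketched like this, again as a hypothetical, self-contained illustration (`regionalSets` and the region names are invented; real code would pull localities from node/instance descriptors):

```go
package main

import (
	"fmt"
	"sort"
)

// Stand-ins for roachpb.NodeID and base.SQLInstanceID.
type kvNodeID int
type sqlInstanceID int

// regionalSets maps each KV node to the SQL instances in the same region,
// so ranges scattered to that node are ingested "close" to the leaseholder.
// A KV node in a region with no SQL instances falls back to all instances.
func regionalSets(
	kvRegion map[kvNodeID]string,
	sqlRegion map[sqlInstanceID]string,
) map[kvNodeID][]sqlInstanceID {
	byRegion := make(map[string][]sqlInstanceID)
	var all []sqlInstanceID
	for inst, region := range sqlRegion {
		byRegion[region] = append(byRegion[region], inst)
		all = append(all, inst)
	}
	out := make(map[kvNodeID][]sqlInstanceID)
	for kv, region := range kvRegion {
		set := byRegion[region]
		if len(set) == 0 {
			set = all
		}
		// Sort for deterministic routing; map iteration order is random.
		sort.Slice(set, func(i, j int) bool { return set[i] < set[j] })
		out[kv] = set
	}
	return out
}

func main() {
	sets := regionalSets(
		map[kvNodeID]string{1: "us-east", 2: "us-west"},
		map[sqlInstanceID]string{10: "us-east", 11: "us-east", 12: "us-west"},
	)
	fmt.Println(sets[1], sets[2]) // [10 11] [12]
}
```

Within each region the same hash-based balancing as the non-multiregion case would then apply; the open question of routing to a follower's region instead would only change how `kvRegion` is derived.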

Jira issue: CRDB-16375

@msbutler msbutler added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-disaster-recovery T-disaster-recovery labels May 27, 2022
@blathers-crl

blathers-crl bot commented May 27, 2022

cc @cockroachdb/bulk-io

@msbutler msbutler changed the title backupccl: prepare restore router for Multitenancy backupccl: prepare RESTORE router for Multitenancy May 27, 2022
@msbutler msbutler changed the title backupccl: prepare RESTORE router for Multitenancy backupccl: prepare RESTORE router for multitenancy May 27, 2022
@mari-crl mari-crl added sync-me and removed sync-me labels Jun 2, 2022
@livlobo livlobo moved this from Triage to Backup/Restore in Disaster Recovery Backlog Jun 7, 2022

github-actions bot commented Dec 4, 2023

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!

@msbutler
Collaborator Author

Not working on this, but this is still a problem.

@msbutler msbutler removed their assignment Dec 15, 2023
3 participants