restore: limit number of files restore span entries #119785

Closed
dt opened this issue Feb 29, 2024 · 1 comment · Fixed by #119840
Labels: C-bug, O-postmortem, O-support, P-1, T-disaster-recovery

Comments

dt commented Feb 29, 2024

Today we size restore span entries based solely on the total size of the files grouped into each entry, aiming for a target size (384 MB by default).

However, if a backup contains many small files, a single span may need to include many files to reach its target size. Since all files in a span are allowed to overlap, we are obligated to open all of them to merge them. This combines poorly with the size-only heuristic, which can put a very large number (hundreds or thousands) of files into one span: the restore processor then needs to open all of those files concurrently to process that span.

We should add a secondary target limit on the number of files per span (64? 128?) when constructing these spans.
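To illustrate the idea, here is a minimal sketch in Go of grouping files into entries with both a size target and a file-count cap. This is not the actual restore code: `backupFile`, `spanEntry`, and `groupFiles` are hypothetical names, and the sketch ignores key ranges entirely, only showing how the two limits could interact.

```go
package main

import "fmt"

// backupFile is a hypothetical stand-in for one file in a backup.
type backupFile struct {
	Name string
	Size int64 // bytes
}

// spanEntry is a hypothetical group of files that would be opened and
// merged together when the entry is processed.
type spanEntry struct {
	Files []backupFile
	Size  int64 // sum of the file sizes, in bytes
}

// groupFiles packs files, in order, into entries. An entry is closed as soon
// as it reaches targetSize bytes or, when maxFiles > 0, maxFiles files,
// whichever comes first. maxFiles <= 0 means size-only grouping.
func groupFiles(files []backupFile, targetSize int64, maxFiles int) []spanEntry {
	var entries []spanEntry
	var cur spanEntry
	for _, f := range files {
		cur.Files = append(cur.Files, f)
		cur.Size += f.Size
		hitSize := cur.Size >= targetSize
		hitCount := maxFiles > 0 && len(cur.Files) >= maxFiles
		if hitSize || hitCount {
			entries = append(entries, cur)
			cur = spanEntry{} // start a fresh entry with its own slice
		}
	}
	if len(cur.Files) > 0 {
		entries = append(entries, cur)
	}
	return entries
}

// largest returns the highest file count found in any entry.
func largest(entries []spanEntry) int {
	n := 0
	for _, e := range entries {
		if len(e.Files) > n {
			n = len(e.Files)
		}
	}
	return n
}

func main() {
	// 1000 small files of 1 MiB each (made-up numbers for illustration).
	files := make([]backupFile, 1000)
	for i := range files {
		files[i] = backupFile{Name: fmt.Sprintf("f%d.sst", i), Size: 1 << 20}
	}
	const target = 384 << 20

	sizeOnly := groupFiles(files, target, 0)
	capped := groupFiles(files, target, 64)
	fmt.Println("size only:", len(sizeOnly), "entries, largest has", largest(sizeOnly), "files")
	fmt.Println("with cap: ", len(capped), "entries, largest has", largest(capped), "files")
}
```

With these made-up inputs, size-only grouping yields entries of up to 384 files each, while the capped variant never exceeds 64 files per entry at the cost of producing more, smaller entries.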

Jira issue: CRDB-36313

dt added the C-bug, O-support, and P-1 labels on Feb 29, 2024
dt added this to Backlog in Disaster Recovery Backlog via automation on Feb 29, 2024
blathers-crl bot commented Feb 29, 2024

cc @cockroachdb/disaster-recovery
