ddl: make local sort generate only one subtask for each instance #50925
Codecov Report
Additional details and impacted files:

@@               Coverage Diff                @@
##             master     #50925        +/-   ##
================================================
+ Coverage   70.5281%   72.6600%   +2.1319%
================================================
  Files          1466       1466
  Lines        434112     437986      +3874
================================================
+ Hits         306171     318241     +12070
+ Misses       108699      99707      -8992
- Partials      19242      20038       +796
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: wjhuang2016, zimulala
What problem does this PR solve?
Issue Number: ref #48795
Problem Summary:
Previously, the number of subtasks generated for local sort was unreasonable:
tidb/pkg/ddl/backfilling_dist_scheduler.go
Lines 325 to 327 in 35a7c9e
Using a fixed 96M per region to estimate the size of the index data is inaccurate. As a result, the number of subtasks becomes very large, which is inefficient.
What changed and how does it work?
Make local sort generate only one subtask for each instance.
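The idea can be illustrated with a minimal sketch. This is not the code from this PR: `regionRange` and `splitPerInstance` are hypothetical names, and regions are represented only by their index. It assumes the table's regions are divided into contiguous ranges, one per instance, instead of deriving the subtask count from an estimated data size.

```go
package main

import "fmt"

// regionRange is a hypothetical stand-in for a contiguous run of regions,
// identified here only by index: [Start, End).
type regionRange struct {
	Start, End int
}

// splitPerInstance divides regionCnt regions into at most instanceCnt
// contiguous ranges, so that each instance receives exactly one subtask.
// This is an illustrative assumption, not TiDB's actual implementation.
func splitPerInstance(regionCnt, instanceCnt int) []regionRange {
	if regionCnt <= 0 || instanceCnt <= 0 {
		return nil
	}
	// Never create more subtasks than there are regions.
	if instanceCnt > regionCnt {
		instanceCnt = regionCnt
	}
	base, extra := regionCnt/instanceCnt, regionCnt%instanceCnt
	ranges := make([]regionRange, 0, instanceCnt)
	start := 0
	for i := 0; i < instanceCnt; i++ {
		size := base
		if i < extra {
			size++ // spread the remainder over the first ranges
		}
		ranges = append(ranges, regionRange{Start: start, End: start + size})
		start += size
	}
	return ranges
}

func main() {
	for i, r := range splitPerInstance(10, 3) {
		fmt.Printf("subtask %d: regions [%d, %d)\n", i, r.Start, r.End)
	}
}
```

With this scheme the subtask count depends only on the number of instances, so it no longer grows with the estimated size of the table.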
Check List
Tests
Manual test on a 1.194 TiB table with a worker count of 8:
Before this PR: 1 hour 1 min 29 sec
After this PR: 26 min 10 sec