tikv: update the Load Base Split introduction #9656

Merged
merged 6 commits into from Aug 5, 2022

Oreoxmt
Collaborator

@Oreoxmt Oreoxmt commented Jul 20, 2022

First-time contributors' checklist

What is changed, added or deleted? (Required)

Update the description of load base split:

  • Add split.region-cpu-overload-threshold-ratio

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v6.2 (TiDB 6.2 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)
  • v5.2 (TiDB 5.2 versions)
  • v5.1 (TiDB 5.1 versions)
  • v5.0 (TiDB 5.0 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@Oreoxmt Oreoxmt added the following labels Jul 20, 2022: translation/from-docs-cn (This PR is translated from a PR in pingcap/docs-cn.), do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.), area/scheduling (Indicates that the Issue or PR belongs to the area of scheduling.), v6.2 (This PR/issue applies to TiDB v6.2.)
@ti-chi-bot
Member

ti-chi-bot commented Jul 20, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • TomShawn

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jul 20, 2022
@Oreoxmt Oreoxmt self-assigned this Jul 21, 2022
@Oreoxmt Oreoxmt removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 21, 2022
- `split.byte-threshold`: The threshold of read load for the Region to be split, default is 30 MiB per second.
- `split.region-cpu-overload-threshold-ratio`: The threshold of CPU usage (the percentage of CPU time of the read thread pool) for the Region to be split, default is `0.25`.

If the sum of all types of read requests per second for a Region exceeds the QPS threshold or traffic threshold for 10 consecutive seconds, PD splits the Region.
Member

What about the CPU usage threshold?

Member

Also, it's not that "PD splits the Region", this is a TiKV feature that doesn't involve PD, so the precise description should be like "TiKV tries to split the Region".

@@ -207,6 +207,7 @@ The following TiKV configuration items can be modified online:
| `backup.num-threads` | The number of backup threads (supported since v4.0.3) |
| `split.qps-threshold` | The threshold to execute `load-base-split` on a Region. If the QPS of read requests for a Region exceeds `qps-threshold` for a consecutive period of time, this Region should be split.|
| `split.byte-threshold` | The threshold to execute `load-base-split` on a Region. If the traffic of read requests for a Region exceeds the `byte-threshold` for a consecutive period of time, this Region should be split. |
| `split.region-cpu-overload-threshold-ratio` | The threshold to execute `load-base-split` on a Region. If the CPU usage of unified read pool for a Region exceeds the `region-cpu-overload-threshold-ratio` for a consecutive period of time, this Region should be split. (supported since v6.2.0) |
Member

Unified Read Pool is a TiKV-specific concept, so I think it should be capitalized.

Suggested change
| `split.region-cpu-overload-threshold-ratio` | The threshold to execute `load-base-split` on a Region. If the CPU usage of unified read pool for a Region exceeds the `region-cpu-overload-threshold-ratio` for a consecutive period of time, this Region should be split. (supported since v6.2.0) |
| `split.region-cpu-overload-threshold-ratio` | The threshold to execute `load-base-split` on a Region. If the CPU usage of Unified Read Pool for a Region exceeds the `region-cpu-overload-threshold-ratio` for a consecutive period of time, this Region should be split. (supported since v6.2.0) |

@Oreoxmt Oreoxmt requested a review from JmPotato July 25, 2022 02:59
Member

@JmPotato JmPotato left a comment

The rest LGTM.

configure-load-base-split.md (outdated, resolved)
Co-authored-by: JmPotato <github@ipotato.me>
@Oreoxmt
Collaborator Author

Oreoxmt commented Jul 25, 2022

/status LGT1

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 25, 2022

The Region split by Load Base Split will not be merged quickly. On the one hand, PD's `MergeChecker` skips the hot Regions; on the other hand, PD also determines whether to merge two Regions according to `QPS` in the heartbeat information, to avoid the merging of two Regions with high `QPS`.

## Usage

The Load Base Split feature is currently controlled by the `split.qps-threshold` parameter (QPS threshold) and `split.byte-threshold` parameter (traffic threshold). If the sum of all types of read requests per second for a Region exceeds the QPS threshold or traffic threshold for 10 consecutive seconds, PD splits the Region.
The Load Base Split feature is currently controlled by three parameters:
Contributor

Suggested change
The Load Base Split feature is currently controlled by three parameters:
The Load Base Split feature is currently controlled by the following parameters:


Load Base Split is enabled by default, but the parameter is set to a rather high value. `split.qps-threshold` defaults to `3000` and `split.byte-threshold` defaults to 30MB/s. If you want to disable this feature, set the two thresholds high enough at the same time.
- `split.qps-threshold`: The threshold of QPS for the Region to be split, default is `3000` per second.
Contributor

Suggested change
- `split.qps-threshold`: The threshold of QPS for the Region to be split, default is `3000` per second.
- `split.qps-threshold`: The QPS threshold at which a Region is identified as a hotspot. The default value is `3000` per second.


Load Base Split is enabled by default, but the parameter is set to a rather high value. `split.qps-threshold` defaults to `3000` and `split.byte-threshold` defaults to 30MB/s. If you want to disable this feature, set the two thresholds high enough at the same time.
- `split.qps-threshold`: The threshold of QPS for the Region to be split, default is `3000` per second.
- `split.byte-threshold`: The threshold of read load for the Region to be split, default is 30 MiB per second.
Contributor

Suggested change
- `split.byte-threshold`: The threshold of read load for the Region to be split, default is 30 MiB per second.
- `split.byte-threshold`: The traffic threshold at which a Region is identified as a hotspot. The default value is 30 MiB per second.

Load Base Split is enabled by default, but the parameter is set to a rather high value. `split.qps-threshold` defaults to `3000` and `split.byte-threshold` defaults to 30MB/s. If you want to disable this feature, set the two thresholds high enough at the same time.
- `split.qps-threshold`: The threshold of QPS for the Region to be split, default is `3000` per second.
- `split.byte-threshold`: The threshold of read load for the Region to be split, default is 30 MiB per second.
- `split.region-cpu-overload-threshold-ratio`: The threshold of CPU usage (the percentage of CPU time of the read thread pool) for the Region to be split, default is `0.25`.
Contributor

English and Chinese are inconsistent. Please check and update.

- `split.byte-threshold`: The threshold of read load for the Region to be split, default is 30 MiB per second.
- `split.region-cpu-overload-threshold-ratio`: The threshold of CPU usage (the percentage of CPU time of the read thread pool) for the Region to be split, default is `0.25`.

If the sum of all types of read requests per second for a Region exceeds the QPS threshold, traffic threshold, or CPU usage threshold for 10 consecutive seconds, TiKV tries to split the Region.
Contributor

Suggested change
If the sum of all types of read requests per second for a Region exceeds the QPS threshold, traffic threshold, or CPU usage threshold for 10 consecutive seconds, TiKV tries to split the Region.
If a Region meets one of the following conditions for 10 consecutive seconds, TiKV tries to split the Region:
- the sum of its read requests exceeds `split.qps-threshold`.
- its traffic exceeds `split.byte-threshold`.
- its CPU usage in the Unified Read Pool exceeds `split.region-cpu-overload-threshold-ratio`.
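
To illustrate the condition described in the suggestion above, here is a minimal Python sketch. It is not TiKV's implementation: the class and function names are hypothetical, the defaults are the ones quoted in this PR, and it assumes one load sample per second and that exceeding any one of the three thresholds in a given second keeps the 10-second streak going.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

REQUIRED_SECONDS = 10  # "for 10 consecutive seconds", as stated in this PR


@dataclass
class SplitThresholds:
    """Hypothetical holder for the three parameters discussed in this PR."""
    qps_threshold: float = 3000.0               # split.qps-threshold
    byte_threshold: float = 30 * 1024 * 1024    # split.byte-threshold (30 MiB/s)
    cpu_overload_ratio: float = 0.25            # split.region-cpu-overload-threshold-ratio


@dataclass
class RegionSample:
    """One per-second load sample for a Region (assumed shape, for illustration)."""
    qps: float         # read requests per second
    read_bytes: float  # read traffic in bytes per second
    cpu_ratio: float   # share of Unified Read Pool CPU time used by the Region


def exceeds_any(sample: RegionSample, t: SplitThresholds) -> bool:
    # A sample counts as "hot" if it is over at least one threshold.
    return (
        sample.qps > t.qps_threshold
        or sample.read_bytes > t.byte_threshold
        or sample.cpu_ratio > t.cpu_overload_ratio
    )


def should_try_split(
    samples: Iterable[RegionSample],
    thresholds: Optional[SplitThresholds] = None,
) -> bool:
    """Return True once REQUIRED_SECONDS consecutive samples are over a threshold."""
    t = thresholds or SplitThresholds()
    streak = 0
    for sample in samples:
        # Any sample under all thresholds resets the streak.
        streak = streak + 1 if exceeds_any(sample, t) else 0
        if streak >= REQUIRED_SECONDS:
            return True
    return False
```

For example, ten consecutive samples with `qps=5000` would return `True`, while nine such samples followed by a quiet second would not.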

@Oreoxmt Oreoxmt requested a review from TomShawn August 4, 2022 05:56
@Oreoxmt Oreoxmt added the status/PTAL This PR is ready for reviewing. label Aug 4, 2022
@@ -207,6 +207,7 @@ The following TiKV configuration items can be modified online:
| `backup.num-threads` | The number of backup threads (supported since v4.0.3) |
| `split.qps-threshold` | The threshold to execute `load-base-split` on a Region. If the QPS of read requests for a Region exceeds `qps-threshold` for a consecutive period of time, this Region should be split.|
| `split.byte-threshold` | The threshold to execute `load-base-split` on a Region. If the traffic of read requests for a Region exceeds the `byte-threshold` for a consecutive period of time, this Region should be split. |
| `split.region-cpu-overload-threshold-ratio` | The threshold to execute `load-base-split` on a Region. If the CPU usage in the Unified Read Pool for a Region exceeds the `region-cpu-overload-threshold-ratio` for a consecutive period of time, this Region should be split. (supported since v6.2.0) |
Contributor

Does "for a consecutive period of time" mean 10 seconds?

Collaborator Author

> Does "for a consecutive period of time" mean 10 seconds?

Yes, fixed in 46fa37c

Contributor

@TomShawn TomShawn left a comment

rest LGTM

@ti-chi-bot ti-chi-bot added the status/LGT2 label (Indicates that a PR has LGTM 2.) and removed the status/LGT1 label (Indicates that a PR has LGTM 1.) Aug 4, 2022
@Oreoxmt Oreoxmt removed the status/PTAL This PR is ready for reviewing. label Aug 5, 2022
@Oreoxmt
Collaborator Author

Oreoxmt commented Aug 5, 2022

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 46fa37c

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 5, 2022
@ti-chi-bot ti-chi-bot merged commit de9b182 into pingcap:master Aug 5, 2022
@Oreoxmt Oreoxmt deleted the translate/docs-cn/10519 branch February 14, 2023 08:33
Labels
  • area/scheduling: Indicates that the Issue or PR belongs to the area of scheduling.
  • size/S: Denotes a PR that changes 10-29 lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • translation/from-docs-cn: This PR is translated from a PR in pingcap/docs-cn.
  • v6.2: This PR/issue applies to TiDB v6.2.

4 participants