From 7bdd8e3240446f304cad1324b4f485caf23c5f83 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 31 Oct 2023 11:58:59 +0800
Subject: [PATCH 1/6] Polish docs

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index d8add90c1688d..cebdac2933d09 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -298,3 +298,7 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow
 Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout.
 
 In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
+
+ > **Note:**
+>
+> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..

From 25e482bed7492a4bd964c63c1bb8fa5d418a2165 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 31 Oct 2023 16:33:35 +0800
Subject: [PATCH 2/6] Polish format.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index cebdac2933d09..0f00f72336e0f 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -299,6 +299,6 @@ Practically, if a node failure is considered unrecoverable, you can immediately
 
 In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
 
- > **Note:**
+> **Note:**
 >
 > When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..

From 98707fea8b396abc57559813e338d92ab76b46e8 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Thu, 2 Nov 2023 10:47:34 +0800
Subject: [PATCH 3/6] Polish codes.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 0f00f72336e0f..ba4a199aac9d8 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -301,4 +301,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..
+> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From 2751c8e36c1c0074db180275d09af3ee737e7207 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Fri, 3 Nov 2023 16:19:21 +0800
Subject: [PATCH 4/6] Polish comments.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index ba4a199aac9d8..fcdebc3e9c8d3 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -301,4 +301,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From 29c20256f40917a58540ab0bd6a67441390e60ef Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 28 Nov 2023 18:07:22 +0800
Subject: [PATCH 5/6] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 best-practices/pd-scheduling-best-practices.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index fcdebc3e9c8d3..bbe62d1382c18 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -297,8 +297,8 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow
 
 Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout.
 
-In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
+In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the Leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
 
 > **Note:**
 >
-> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From 046d340c0ce7553a66300626af531bd3e188dc68 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Wed, 29 Nov 2023 14:59:18 +0800
Subject: [PATCH 6/6] Update best-practices/pd-scheduling-best-practices.md

Co-authored-by: xixirangrang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index bbe62d1382c18..2c10ad95150ec 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -301,4 +301,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when you enable `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) as well to mitigate this situation.
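For reviewers trying the documented behavior: the scheduler this series describes is added through `pd-ctl`, per the pd-control page linked in the note. A minimal sketch, not part of the patches themselves — the PD endpoint `http://127.0.0.1:2379` is an illustrative assumption:

```shell
# Add the slow-node eviction scheduler described in the note above.
# The PD endpoint is an example; replace it with your PD address.
pd-ctl -u http://127.0.0.1:2379 scheduler add evict-slow-store-scheduler

# Confirm that the scheduler now appears in the scheduler list.
pd-ctl -u http://127.0.0.1:2379 scheduler show
```

The same `scheduler remove`/`pause` subcommands from that pd-control section apply if the scheduler needs to be backed out after testing.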
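Likewise, the `store-io-pool-size` setting that the final note recommends lives in the TiKV configuration file. A sketch of the relevant fragment, assuming the value `1` (per the linked tikv-configuration-file entry, a value greater than `0` enables the store I/O thread pool; the exact value should be tuned per deployment):

```toml
# tikv.toml (fragment, illustrative): enabling the store I/O pool keeps
# Leader-eviction scheduling requests on slow nodes from queuing behind
# delayed write I/O, as described in the note above.
[raftstore]
store-io-pool-size = 1
```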