From f590607fd41bdae4c3fadf093dd690e60f461a1d Mon Sep 17 00:00:00 2001 From: lucasliang Date: Tue, 31 Oct 2023 11:58:59 +0800 Subject: [PATCH 1/6] Polish docs Signed-off-by: lucasliang --- best-practices/pd-scheduling-best-practices.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index 4edf5e14e51be..df238429b1e5e 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -297,3 +297,7 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout. In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`). + + > **Note:** +> +> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. 
It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation.. From fd3ada5bf9d8e851fa2852ee79f2df9cb7138072 Mon Sep 17 00:00:00 2001 From: lucasliang Date: Tue, 31 Oct 2023 16:33:35 +0800 Subject: [PATCH 2/6] Polish format. Signed-off-by: lucasliang --- best-practices/pd-scheduling-best-practices.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index df238429b1e5e..3bdf015349ad1 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -298,6 +298,6 @@ Practically, if a node failure is considered unrecoverable, you can immediately In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`). - > **Note:** +> **Note:** > > When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation.. 
From 9b82cd3f016336af77e40917b925b33338227563 Mon Sep 17 00:00:00 2001 From: lucasliang Date: Thu, 2 Nov 2023 10:47:34 +0800 Subject: [PATCH 3/6] Polish codes. Signed-off-by: lucasliang --- best-practices/pd-scheduling-best-practices.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index 3bdf015349ad1..bcf445d39bd6c 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa > **Note:** > -> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation.. +> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. From 87e63d5e0b1a227c7ae8e521cabbf3cd2a94ea5f Mon Sep 17 00:00:00 2001 From: lucasliang Date: Fri, 3 Nov 2023 16:19:21 +0800 Subject: [PATCH 4/6] Polish comments. 
Signed-off-by: lucasliang --- best-practices/pd-scheduling-best-practices.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index bcf445d39bd6c..4c60fc3dd7eef 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa > **Note:** > -> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. +> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. 
From 08aad210aa48d118433271fd63a8383d6b35b710 Mon Sep 17 00:00:00 2001 From: lucasliang Date: Tue, 28 Nov 2023 18:07:22 +0800 Subject: [PATCH 5/6] Apply suggestions from code review Co-authored-by: Grace Cai --- best-practices/pd-scheduling-best-practices.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index 4c60fc3dd7eef..106413558896d 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -296,8 +296,8 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout. -In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`). +In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. 
You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the Leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`). > **Note:** > -> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. +> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. 
From f629780ebad69016d88309525bbd53e040b619fc Mon Sep 17 00:00:00 2001 From: lucasliang Date: Wed, 29 Nov 2023 14:59:18 +0800 Subject: [PATCH 6/6] Update best-practices/pd-scheduling-best-practices.md Co-authored-by: xixirangrang --- best-practices/pd-scheduling-best-practices.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md index 106413558896d..1aa1ec324cee5 100644 --- a/best-practices/pd-scheduling-best-practices.md +++ b/best-practices/pd-scheduling-best-practices.md @@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa > **Note:** > -> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation. +> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when you enable `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) as well to mitigate this situation.
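
For reviewers, the eviction rule described in the patched paragraph (a score from 1 to 100, a store marked slow at 80 or above, and Leaders evicted only when exactly one store is slow) can be sketched as follows. This is an illustrative model only, not PD's actual implementation; `SLOW_SCORE_LIMIT`, `store_to_evict`, and the dictionary of scores are hypothetical names introduced for this example.

```python
from typing import Dict, Optional

# Threshold quoted in the documentation text ("80 by default").
SLOW_SCORE_LIMIT = 80


def store_to_evict(slow_scores: Dict[int, int],
                   limit: int = SLOW_SCORE_LIMIT) -> Optional[int]:
    """Return the ID of the store whose Leaders would be evicted, or None.

    Hypothetical model of the documented rule: a store is slow when its
    score is >= limit, and eviction applies only when exactly one store
    is detected as slow.
    """
    slow = [store_id for store_id, score in slow_scores.items() if score >= limit]
    return slow[0] if len(slow) == 1 else None


if __name__ == "__main__":
    # One slow store: its Leaders are candidates for eviction.
    print(store_to_evict({1: 10, 2: 85, 3: 20}))
    # Two slow stores: the rule in the text does not apply.
    print(store_to_evict({1: 90, 2: 85, 3: 20}))
```

The sketch deliberately leaves out the request queueing described in the note; it only shows when eviction is triggered, not how long it takes.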