From 3e3e65e5e4b4e087ee7bd6ce0824688cdaa6b6e2 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 31 Oct 2023 11:58:59 +0800
Subject: [PATCH 1/6] Polish docs

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index c78306820b055..92e1864661d8b 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -297,3 +297,7 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow
 Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout.
 
 In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
+
+ > **Note:**
+>
+> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..

From a78e168a9c8f3a8e2eded581675ffb82d6019f05 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 31 Oct 2023 16:33:35 +0800
Subject: [PATCH 2/6] Polish format.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 92e1864661d8b..dba9e8ebed9b8 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -298,6 +298,6 @@ Practically, if a node failure is considered unrecoverable, you can immediately
 
 In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
 
- > **Note:**
+> **Note:**
 >
 > When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..

From 301b1d0a2d4e3d5fa8bb93048a4a0fcab4749ab1 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Thu, 2 Nov 2023 10:47:34 +0800
Subject: [PATCH 3/6] Polish codes.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index dba9e8ebed9b8..5ec4688a64e69 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to synchronize the configuration with the [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to avoid this situation..
+> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From 6d5a5f5c1af1431527783ec22b68cd9fb521c54f Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Fri, 3 Nov 2023 16:19:21 +0800
Subject: [PATCH 4/6] Polish comments.

Signed-off-by: lucasliang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 5ec4688a64e69..63aa847026e98 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> When the `evict-slow-store-scheduler` is enabled, there is a possibility that some leaders on slow nodes may have to wait for delayed requests to be processed before the leader eviction process can proceed. This can result in an overall extended duration for the leader eviction. It is recommended to enable the configuration [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From e5cea1ca74064c99ec6916808c3be06bf3275e92 Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Tue, 28 Nov 2023 18:07:22 +0800
Subject: [PATCH 5/6] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 best-practices/pd-scheduling-best-practices.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 63aa847026e98..27621fad40339 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -296,8 +296,8 @@ If a TiKV node fails, PD defaults to setting the corresponding node to the **dow
 
 Practically, if a node failure is considered unrecoverable, you can immediately take it offline. This makes PD replenish replicas soon in another node and reduces the risk of data loss. In contrast, if a node is considered recoverable, but the recovery cannot be done in 30 minutes, you can temporarily adjust `max-store-down-time` to a larger value to avoid unnecessary replenishment of the replicas and resources waste after the timeout.
 
-In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
+In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sampling the requests in TiKV, this mechanism works out a score ranging from 1 to 100. A TiKV node with a score higher than or equal to 80 is marked as slow. You can add [`evict-slow-store-scheduler`](/pd-control.md#scheduler-show--add--remove--pause--resume--config--describe) to detect and schedule slow nodes. If only one TiKV is detected as slow, and the slow score reaches the limit (80 by default), the Leader in this node will be evicted (similar to the effect of `evict-leader-scheduler`).
 
 > **Note:**
 >
-> "Leader eviction" is accomplished by PD (Placement Driver) sending scheduling requests to TiKV slow nodes, and TiKV must process these requests sequentially. Slow nodes, affected by factors such as "slow I/O," may experience request accumulation. This can result in certain leaders having to wait for delayed requests to be processed before proceeding with the leader eviction process, leading to an extended overall leader eviction time. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.

From bae943345d97bcbaf4d56f76a32c3890599c6e6e Mon Sep 17 00:00:00 2001
From: lucasliang
Date: Wed, 29 Nov 2023 14:59:18 +0800
Subject: [PATCH 6/6] Update best-practices/pd-scheduling-best-practices.md

Co-authored-by: xixirangrang
---
 best-practices/pd-scheduling-best-practices.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/best-practices/pd-scheduling-best-practices.md b/best-practices/pd-scheduling-best-practices.md
index 27621fad40339..42d21015f8280 100644
--- a/best-practices/pd-scheduling-best-practices.md
+++ b/best-practices/pd-scheduling-best-practices.md
@@ -300,4 +300,4 @@ In TiDB v5.2.0, TiKV introduces the mechanism of slow TiKV node detection. By sa
 
 > **Note:**
 >
-> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when enabling `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) to mitigate this situation.
+> **Leader eviction** is accomplished by PD sending scheduling requests to TiKV slow nodes and then TiKV executing the received scheduling requests sequentially. Due to factors such as **slow I/O**, slow nodes might experience request accumulation, causing some Leaders to wait until the delayed requests are processed before handling **Leader eviction** requests. This results in an overall extended time for **Leader eviction**. Therefore, when you enable `evict-slow-store-scheduler`, it is recommended to enable [`store-io-pool-size`](/tikv-configuration-file.md#store-io-pool-size-new-in-v530) as well to mitigate this situation.
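
For reference, the recommendation this series converges on has two parts: add the `evict-slow-store-scheduler` in PD (via pd-ctl, e.g. `scheduler add evict-slow-store-scheduler`), and enable TiKV's async write I/O pool so that eviction-related scheduling requests are less likely to queue behind slow disk writes. A minimal sketch of the TiKV side follows; the value `1` is an illustrative choice, not taken from these patches:

```toml
# tikv.toml -- illustrative fragment.
# store-io-pool-size (new in v5.3.0) defaults to 0, which handles Raft I/O
# synchronously in the store thread; a value greater than 0 enables a
# dedicated async write I/O thread pool.
[raftstore]
store-io-pool-size = 1
```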