From ffe071a5390af3d6c121fd29c93806d66a5a2e6b Mon Sep 17 00:00:00 2001
From: ykadowak
Date: Mon, 23 Oct 2023 16:44:29 +0900
Subject: [PATCH 1/4] Add index correction document

---
 docs/user-guides/index-correction.md | 37 ++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 docs/user-guides/index-correction.md

diff --git a/docs/user-guides/index-correction.md b/docs/user-guides/index-correction.md
new file mode 100644
index 0000000000..f0374ec3f6
--- /dev/null
+++ b/docs/user-guides/index-correction.md
@@ -0,0 +1,37 @@
+# Index Correction
+
+In the Vald cluster, the same Index is replicated to multiple agents due to the `index_replica` setting. However, inconsistencies between replicas may occur due to pod eviction or the occurrence of OOM killer during vector insertions. For example,
+
+1. The timestamp of the index differs between agents (some agents have an old index saved and it has not been updated).
+2. The number of replicas does not meet the value set in `index_replica`.
+
+To resolve these inconsistencies, you can use the `Index Correction` feature.
+
+`Index Correction` is implemented as a [`CronJob`](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/), checking the consistency between replicas regularly and resolving any inconsistencies.
+
+## Settings
+
+- enabled
+Turns the index correction feature on/off.
+- schedule
+Sets the interval for the job start in cron notation (the default value is `3 6 * * *`, which means 3:06 AM every day).
+- suspend
+[Temporary suspension setting](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-suspension) for CronJob.
+
+```yaml
+manager:
+  index:
+    corrector:
+      enabled: true
+      schedule: "3 6 * * *"
+      suspend: false
+```
+
+## Important Notes
+
+- Processing time
+Under conditions of 10 million vectors and agent replica *10, it takes about 10~20 minutes. The process is O(MN) where M is the number of vector items and N is the number of agent replicas.
+- concurrencyPolicy
+`Forbid` is set internally, so a new job will not be created while an existing job is running. In other words, if the process does not finish within the interval specified by the schedule, the next job will not be scheduled.
+- Index operations during correction
+Vector operations performed after the start of the index correction job are not considered in that job.

From 431b04dc95a13f677c3fbdbaca14c2a502f5d357 Mon Sep 17 00:00:00 2001
From: "deepsource-autofix[bot]" <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Date: Mon, 23 Oct 2023 07:45:30 +0000
Subject: [PATCH 2/4] style: format code with Gofumpt and Prettier

This commit fixes the style issues introduced in ffe071a according to the output from Gofumpt and Prettier.

Details: https://github.com/vdaas/vald/pull/2217
---
 docs/user-guides/index-correction.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/user-guides/index-correction.md b/docs/user-guides/index-correction.md
index f0374ec3f6..a512943018 100644
--- a/docs/user-guides/index-correction.md
+++ b/docs/user-guides/index-correction.md
@@ -12,11 +12,11 @@ To resolve these inconsistencies, you can use the `Index Correction` feature.
 ## Settings
 
 - enabled
-Turns the index correction feature on/off.
+  Turns the index correction feature on/off.
 - schedule
-Sets the interval for the job start in cron notation (the default value is `3 6 * * *`, which means 3:06 AM every day).
+  Sets the interval for the job start in cron notation (the default value is `3 6 * * *`, which means 3:06 AM every day).
 - suspend
-[Temporary suspension setting](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-suspension) for CronJob.
+  [Temporary suspension setting](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-suspension) for CronJob.
 
 ```yaml
 manager:
@@ -30,8 +30,8 @@ manager:
 ## Important Notes
 
 - Processing time
-Under conditions of 10 million vectors and agent replica *10, it takes about 10~20 minutes. The process is O(MN) where M is the number of vector items and N is the number of agent replicas.
+  Under conditions of 10 million vectors and agent replica \*10, it takes about 10~20 minutes. The process is O(MN) where M is the number of vector items and N is the number of agent replicas.
 - concurrencyPolicy
-`Forbid` is set internally, so a new job will not be created while an existing job is running. In other words, if the process does not finish within the interval specified by the schedule, the next job will not be scheduled.
+  `Forbid` is set internally, so a new job will not be created while an existing job is running. In other words, if the process does not finish within the interval specified by the schedule, the next job will not be scheduled.
 - Index operations during correction
-Vector operations performed after the start of the index correction job are not considered in that job.
+  Vector operations performed after the start of the index correction job are not considered in that job.

From b045658111d506c3877ea73d045aeb6a9134f7e4 Mon Sep 17 00:00:00 2001
From: Yusuke Kadowaki
Date: Wed, 25 Oct 2023 17:48:49 +0900
Subject: [PATCH 3/4] Update docs/user-guides/index-correction.md

Co-authored-by: Hiroto Funakoshi
---
 docs/user-guides/index-correction.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/user-guides/index-correction.md b/docs/user-guides/index-correction.md
index a512943018..658ebf8cab 100644
--- a/docs/user-guides/index-correction.md
+++ b/docs/user-guides/index-correction.md
@@ -31,7 +31,9 @@ manager:
 
 - Processing time
   Under conditions of 10 million vectors and agent replica \*10, it takes about 10~20 minutes. The process is O(MN) where M is the number of vector items and N is the number of agent replicas.
+
 - concurrencyPolicy
   `Forbid` is set internally, so a new job will not be created while an existing job is running. In other words, if the process does not finish within the interval specified by the schedule, the next job will not be scheduled.
+
 - Index operations during correction
   Vector operations performed after the start of the index correction job are not considered in that job.

From e9f5ecd28b0240e2368871579a67234765d5e073 Mon Sep 17 00:00:00 2001
From: ykadowak
Date: Fri, 27 Oct 2023 11:59:14 +0900
Subject: [PATCH 4/4] Update notes on processing time

---
 docs/user-guides/index-correction.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guides/index-correction.md b/docs/user-guides/index-correction.md
index 658ebf8cab..f6759ab57f 100644
--- a/docs/user-guides/index-correction.md
+++ b/docs/user-guides/index-correction.md
@@ -30,7 +30,7 @@ manager:
 ## Important Notes
 
 - Processing time
-  Under conditions of 10 million vectors and agent replica \*10, it takes about 10~20 minutes. The process is O(MN) where M is the number of vector items and N is the number of agent replicas.
+  Under conditions of 10 million identical vectors(not including `index_replica`) and 10 agent replicas, the processing takes about 30~40 minutes (this is only a reference, and the actual execution time may vary depending on the infrastructure). Time complexity of the process is `O(MN)` where M is the number of identical vector items and N is the number of agent replicas. `index_replica` does not matter for the processing time.
 
 - concurrencyPolicy
   `Forbid` is set internally, so a new job will not be created while an existing job is running. In other words, if the process does not finish within the interval specified by the schedule, the next job will not be scheduled.
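Review note on the `schedule` description added in this series: in standard five-field cron notation the fields are minute, hour, day of month, month, and day of week, in that order. Read that way, `3 6 * * *` fires at 06:03 every day, while 3:06 AM would be written `6 3 * * *`, so the expression and its description appear to disagree. Which of the two matches the shipped default is not stated in the patch. The sketch below only illustrates the field order; `describe_daily_cron` is a hypothetical helper written for this note, not part of Vald or Kubernetes, and it handles only a fixed minute and hour with `*` in the remaining fields.

```python
def describe_daily_cron(expr: str) -> str:
    """Return the HH:MM at which a once-a-day cron expression fires.

    Hypothetical helper for illustration only. It accepts a five-field
    expression with a fixed minute and hour and `*` in the other fields.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")

    # Field order in standard cron: minute, hour, day of month, month, day of week.
    minute, hour, day_of_month, month, day_of_week = fields
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("only daily schedules are supported in this sketch")

    m, h = int(minute), int(hour)
    if not (0 <= m <= 59 and 0 <= h <= 23):
        raise ValueError("minute or hour out of range")
    return f"{h:02d}:{m:02d}"


print(describe_daily_cron("3 6 * * *"))  # 06:03
print(describe_daily_cron("6 3 * * *"))  # 03:06
```

If 3:06 AM is the intended start time, the expression in the document would need to be `6 3 * * *`; if `3 6 * * *` is the intended expression, the description would need to say 6:03 AM.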