diff --git a/modules/search/examples/run-search-full-request.jsonc b/modules/search/examples/run-search-full-request.jsonc
index 5d04c3143..4587e3702 100644
--- a/modules/search/examples/run-search-full-request.jsonc
+++ b/modules/search/examples/run-search-full-request.jsonc
@@ -47,6 +47,10 @@
   "knn": [
     {
       "k": 10,
+      "params": {
+        "ivf_nprobe_pct": 1,
+        "ivf_max_codes_pct": 0.2
+      },
       "field": "vector_field",
       "vector": [ 0.707106781186548, 0, 0.707106781186548 ]
     }
diff --git a/modules/search/pages/search-request-params.adoc b/modules/search/pages/search-request-params.adoc
index a14951e99..b5527ced7 100644
--- a/modules/search/pages/search-request-params.adoc
+++ b/modules/search/pages/search-request-params.adoc
@@ -173,6 +173,12 @@ The Search Service returns the `k` closest vectors to the vector given in `vecto
 
 NOTE: The <> overrides any value set in `k`.
 
+|params |Object |No a|
+
+Enter additional parameters to control how the Search Service compares vectors when running a Vector Search request.
+
+For more information about the `params` object, see <>.
+
 |field |String |Yes a|
 
 The name of the field that contains the vector data you want to search.
@@ -199,6 +205,40 @@ For more information about the dimension value, see the xref:search-index-params
 
 |====
 
+[#knn-params]
+=== Knn params Object
+
+Use the `params` object inside a `knn` object to fine-tune the probes and centroids that the Search Service uses and searches while running a Vector Search request.
+
+The `params` object can contain the following properties:
+
+[cols="1,1,1,4"]
+|====
+|Property |Type |Required? |Description
+
+|ivf_nprobe_pct |Number (percentage) |No a|
+
+Set the `ivf_nprobe_pct` value to control the percentage of probes, or the percentage of clusters, that the Search Service searches during a single Vector Search query.
+
+The Search Service automatically calculates a default `nprobe` percentage based on the vectors in a given partition of your Vector Search index.
+For more information about this calculation, see xref:vector-search:fine-tune-vector-search.adoc[].
+
+If you set the value of `ivf_nprobe_pct` higher than this default calculated value, the Search Service searches a higher percentage of clusters in your processed vectors.
+This can increase your accuracy and recall for Vector Search, but requires more compute time for each query.
+
+In the example, the Search Service searches only `1%` of the total available clusters.
+
+|ivf_max_codes_pct |Number (percentage out of 100) |No a|
+
+Set the `ivf_max_codes_pct` value to control the maximum percentage of centroids that the Search Service accesses during a single Vector Search query.
+
+By default, this value is 100%.
+
+If you reduce your `ivf_max_codes_pct` value, the Search Service accesses fewer centroids, which reduces your Vector Search accuracy and recall, but gives faster compute times for your search.
+
+In the example, the Search Service searches only `0.2%` of the available centroids in your vector data.
+|====
+
 [#query-object]
 == Query Object
 
diff --git a/modules/search/partials/vector-search-field-descriptions.adoc b/modules/search/partials/vector-search-field-descriptions.adoc
index fbe64610d..30c3f555b 100644
--- a/modules/search/partials/vector-search-field-descriptions.adoc
+++ b/modules/search/partials/vector-search-field-descriptions.adoc
@@ -1,5 +1,5 @@
 // tag::optimized_for[]
-For a `vector` child field, choose whether the Search Service should prioritize recall or latency when returning similar vectors in search results:
+For a `vector` child field, choose whether the Search Service should prioritize recall, latency, or memory efficiency when returning similar vectors in search results:
 
 * *recall*: The Search Service prioritizes returning the most accurate result.
 This may increase resource usage for Search queries.
@@ -12,6 +12,11 @@ This may reduce the accuracy of results.
 +
 The Search Service uses half the `nprobe` value calculated for *recall* priority.
 
+* *memory-efficient*: The Search Service prioritizes reducing memory usage and optimizes search operations to use fewer resources.
+This may reduce both accuracy (recall) and latency.
++
+The Search Service uses either an inverted file index with scalar quantization, or a directly mapped index with exact vector comparisons, depending on the number of vectors in your data.
+
 For more information about Vector Search indexes, see xref:vector-search:vector-search.adoc[] or xref:vector-search:create-vector-search-index-ui.adoc[].
 // end::optimized_for[]
 // tag::similarity_metric[]
diff --git a/modules/vector-search/pages/fine-tune-vector-search.adoc b/modules/vector-search/pages/fine-tune-vector-search.adoc
index 44a348fd7..a277f218b 100644
--- a/modules/vector-search/pages/fine-tune-vector-search.adoc
+++ b/modules/vector-search/pages/fine-tune-vector-search.adoc
@@ -119,7 +119,7 @@ If you have a Vector Search index with `vector_index_optimized_for` set to `"rec
 
 == Fine-Tuning Query Parameters
 
-You can add set the values of `ivf_nprobe_pct` and `ivf_max_codes_pct` in your Vector Search queries to tune the recall or accuracy of your search.
+You can set the values of `ivf_nprobe_pct` and `ivf_max_codes_pct` in your Vector Search queries to tune the recall or accuracy of your search.
 
 You can add the following parameters to your query: