From 2ce50d729cead0e54ea37b31c48d08f2f238a9d5 Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Wed, 11 Dec 2024 12:00:15 -0500 Subject: [PATCH 1/8] [DOC-12430] Adding anchor to child-field-options-reference First draft of vector-search-index-architecture --- .../pages/child-field-options-reference.adoc | 2 +- .../vector-search-index-architecture.adoc | 121 ++++++++++++++++++ 2 files changed, 122 insertions(+), 1 deletion(-) create mode 100644 modules/vector-search/pages/vector-search-index-architecture.adoc diff --git a/modules/search/pages/child-field-options-reference.adoc b/modules/search/pages/child-field-options-reference.adoc index c763baf04..c69b168bc 100644 --- a/modules/search/pages/child-field-options-reference.adoc +++ b/modules/search/pages/child-field-options-reference.adoc @@ -23,7 +23,7 @@ include::partial$vector-search-field-descriptions.adoc[tag=dimension] include::partial$vector-search-field-descriptions.adoc[tag=similarity_metric] -|Optimized For (Vector Fields Only) a| +|[[optimized]]Optimized For (Vector Fields Only) a| include::partial$vector-search-field-descriptions.adoc[tag=optimized_for] diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc new file mode 100644 index 000000000..ed489d7cc --- /dev/null +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -0,0 +1,121 @@ += Vector Search Index Architecture +:page-topic-type: concept +:description: Vector Search indexes use features from traditional Search indexes, with unique indexing algorithms and features that allow you to compare vectors in nearest neighbor searches. +:page-toclevels: 3 + +[abstract] +{description} + +A Vector Search index still relies on <> and uses <> to manage merging and persisting data to disk in your cluster. 
+ +[#sync] +== Synchronization with Database Change Protocol (DCP) and the Data Service + +The Search Service uses batches to process data that comes in from xref:server:learn:clusters-and-availability/intra-cluster-replication.adoc#database-change-protocol[DCP] and the xref:server:learn:services-and-indexes:services/data-service.adoc[Data Service]. +DCP and Data Service changes are introduced gradually, based on available memory on Search Service nodes, until reindexing operations for an index are complete. + +The Search Service can merge batches into a single batch before they're sent to the disk write queue, to reduce the resources required for batch processing. + +The Search Service maintains index snapshots on each Search index partition. +These snapshots contain a representation of document mutations on either a write queue, or in storage. + +If the Search Service loses connection to the Data Service, the Search Service compares its rollback sequence numbers in its snapshots with the Data Service when the connection is reestablished. +If the index snapshots on the Search Service are too far ahead, the Search Service performs a full rollback to get back in sync with the Data Service. + +[#segments] +== Search Index Segments + +Search and Vector Search indexes in Couchbase Server are built with segments. + +All Search indexes contain a root segment, which includes all data for the Search index but excludes any segments that might be stale. +Stale segments are eventually removed by the Search Service's persister or merger routines. + +The persister reads in-memory segments from the disk write queue and flushes them to disk, completing batch operations as part of <>. +The merger works with the persister to consolidate flushed files and flush the consolidated results back through the persister - while purging the smaller, older files. + +The persister and merger interact to continuously flush and merge new in-memory segments to disk, and remove stale segments.
+ +Segments are marked as stale when they're replaced by a new merged segment created by the merger. +Stale segements are deleted when they're no longer used by any new queries. + +As smaller segments are merged together through the merger routine, the Search Service automatically runs any needed retraining for Vector Search indexes. +The segments for a Vector Search index can contain different index types and use a separate indexing pipeline, choosing the appropriate indexing algorithm based on the size of your available documents. + +== Vector Search and FAISS + +Vector Search specifically uses https://faiss.ai/index.html[FAISS^] indexes. +Any vectors inside your documents are indexed using FAISS, to create a new query vector that can be searched for similar vectors inside your Vector Search index. + +Vector Search chooses the best https://github.com/facebookresearch/faiss/wiki/Faiss-indexes[FAISS index class^], or vector search algorithm, for your data, and automatically tunes parameters to provide a balance of recall and latency. +You can choose to prioritize recall or latency with the xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting on your index. +You can also choose to xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries] to override the default balancing for your index, and change the number of centroids or probes searched in a query. + +The FAISS indexes created for your vector data can be: + +* <> +* <> + +[#flat] +=== FLAT Indexes + +The most basic kind of index that Vector Search can use for your vectors is a flat index. + +Vector Search uses flat indexes for data that contains less than 1000 vectors. + +Flat indexes are a list of vectors. +Searches run on a nearest neighbor process, based on examining the query vector against each vector in the index and calculating the distance. +Results for flat indexes are very accurate, but performance does not scale well as a dataset grows. 
+ +If a Vector Search index uses only flat indexes, no training is required - IDs are mapped directly to vectors with exact vector comparisons, with no need for preprocessing or learning on the data. + +[#ivf] +=== Inverted File Index (IVF) + +For reduced latency, Vector Search can also use Inverted File Indexes (IVF). + +Vector Search uses a combination of IVF and flat indexes for data that contains between 1000 and 9999 vectors. +For even larger datasets, Vector Search uses IVF indexes with <>. + +IVF creates partitions called Voronoi cells in an index. +The total number of cells is the *nlist* parameter. + +Every cell has a centroid. +Every vector in the processed dataset is assigned to a cell that corresponds to its nearest centroid. + +In an IVF index, a Vector Search first tries to find the cell that the query vector belongs to. +After it knows the cell to search, Vector Search uses another algorithm to find out the exact vector that's closest to the query vector in that cell. + +The result of an IVF index search can be less accurate, as the nearest vector to a query vector can be in a different cell than the chosen cell. +You can increase accuracy by changing the *nprobe* parameter when you xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries]. + +Larger IVF indexes automatically train to learn the data distribution of your vectors, and the centroids of cells in your dataset. +The training data helps to encode and compress the vectors in your index with <>. +All training occurs during building and merging <>. + +IVF indexes that also use flat indexing automatically train to determine the centroids of cells, but still uses exact vector comparisons within each cell. +Training still occurs while building and merging <>. + +[#scalar-quant] +==== Scalar Quantization + +Vector Search uses scalar quantization on large datasets to reduce the size of your indexes. 
+ +Scalar quantization is an important data compression technique that turns the floating point values that could be present in a large vector into low-dimensional integers. +For example, a float32 value could be reduced to an int8 value. + +Scalar quantization in Vector Search does not have a significant effect on the recall, or accuracy, of query results on large datasets. + +== Search Request Processing + +The Search Service uses a scatter-gather process for running Search queries, when there are multiple nodes in the cluster running the Search Service. + +The Search Service node that receives the Search request is assigned as the coordinating node. +Using https://grpc.io/[gRPC^], the coordinating node scatters the request to all other partitions for the Search index in the request across other nodes. +The coordinating node applies filters to the results received from the other partitions, and returns the final result set. + +== See Also + +* xref:fine-tune-vector-search.adoc[] +* xref:search:search-request-params.adoc[] +* xref:create-vector-search-index-rest-api.adoc[] +* xref:create-vector-search-index-ui.adoc[] \ No newline at end of file From 54271493b0e3c74aa4683565b2fe876ca6c0ac52 Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Wed, 11 Dec 2024 12:02:07 -0500 Subject: [PATCH 2/8] [DOC-12430] Add entry to nav.adoc --- modules/vector-search/partials/nav.adoc | 1 + 1 file changed, 1 insertion(+) diff --git a/modules/vector-search/partials/nav.adoc b/modules/vector-search/partials/nav.adoc index 43d3ac6c0..95afc4f47 100644 --- a/modules/vector-search/partials/nav.adoc +++ b/modules/vector-search/partials/nav.adoc @@ -1,6 +1,7 @@ * xref:7.6@server:vector-search:vector-search.adoc[] ** xref:7.6@server:vector-search:create-vector-search-index-ui.adoc[] ** xref:7.6@server:vector-search:create-vector-search-index-rest-api.adoc[] +** xref:7.6@server:vector-search:vector-search-index-architecture.adoc[] ** xref:7.6@server:vector-search:run-vector-search-ui.adoc[] 
** xref:7.6@server:vector-search:run-vector-search-rest-api.adoc[] ** xref:7.6@server:vector-search:run-vector-search-sdk.adoc[] \ No newline at end of file From f43b526ac1180b3da6141b4d54c3dd8a9bf5e6d2 Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Thu, 12 Dec 2024 10:32:43 -0500 Subject: [PATCH 3/8] [DOC-12430] Elaboration on when each index type is used + other fixes --- .../vector-search-index-architecture.adoc | 30 +++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index ed489d7cc..05f8e81ee 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -7,6 +7,7 @@ {description} A Vector Search index still relies on <> and uses <> to manage merging and persisting data to disk in your cluster. +All changes from DCP and the Data Service are introduced to a Search index in batches, which are further managed by segments. [#sync] == Synchronization with Database Change Protocol (DCP) and the Data Service @@ -55,6 +56,29 @@ The FAISS indexes created for your vector data can be: * <> * <> +The specific type of index used depends on the number of vectors in your dataset: + +|==== +| Vector Count | Index Types | Description + +| >=10,000 +| IVF with scalar quantization +a| Vectors are indexed with <> indexes and <>. + +If xref:search:child-field-options-reference.adoc#optimized[Optimized For] is set to *recall* or *latency*, Vector Search uses 8bit scalar quantization. +If set to *memory-efficient*, Vector Search uses 4bit scalar quantization. + +| >=1000 +| IVF with Flat +| Vectors are indexed with <> combined with <>. +Indexes do not use <>. + +| <1000 +| Flat +| Vectors are indexed with <>. +Indexes do not use <>. 
+|==== + [#flat] === FLAT Indexes @@ -92,7 +116,7 @@ Larger IVF indexes automatically train to learn the data distribution of your ve The training data helps to encode and compress the vectors in your index with <>. All training occurs during building and merging <>. -IVF indexes that also use flat indexing automatically train to determine the centroids of cells, but still uses exact vector comparisons within each cell. +IVF indexes that also use flat indexing automatically train to determine the centroids of cells, but still use exact vector comparisons within each cell. Training still occurs while building and merging <>. [#scalar-quant] @@ -103,7 +127,9 @@ Vector Search uses scalar quantization on large datasets to reduce the size of y Scalar quantization is an important data compression technique that turns the floating point values that could be present in a large vector into low-dimensional integers. For example, a float32 value could be reduced to an int8 value. -Scalar quantization in Vector Search does not have a significant effect on the recall, or accuracy, of query results on large datasets. +Scalar quantization in Vector Search does not have a significant effect on the recall, or accuracy, of query results on large datasets. + +Vector Search uses both 8bit and 4bit scalar quantization for indexes, based on your xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting. == Search Request Processing From fa6403f2758bdbdc4c323e014a5064c6c26f4a06 Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Thu, 12 Dec 2024 11:44:32 -0500 Subject: [PATCH 4/8] [DOC-12430] Tying processing in with scoring. 
--- .../pages/vector-search-index-architecture.adoc | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index 05f8e81ee..dc57bb06b 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -133,12 +133,17 @@ Vector Search uses both 8bit and 4bit scalar quantization for indexes, based on == Search Request Processing -The Search Service uses a scatter-gather process for running Search queries, when there are multiple nodes in the cluster running the Search Service. +The Search Service uses a scatter-gather process for running all Search queries, when there are multiple nodes in the cluster running the Search Service. The Search Service node that receives the Search request is assigned as the coordinating node. -Using https://grpc.io/[gRPC^], the coordinating node scatters the request to all other partitions for the Search index in the request across other nodes. +Using https://grpc.io/[gRPC^], the coordinating node scatters the request to all other partitions for the Search or Vector Search index in the request across other nodes. The coordinating node applies filters to the results received from the other partitions, and returns the final result set. +Results are scored, and based on the xref:search:search-request-params.adoc#sort[Sort Object] provided in the Search request, returned in a list. + +For a Vector Search query, search results include the top `k` nearest neighbor vectors to the vector in the Search query. +For more information about how results are scored and returned for Search requests, see xref:search:run-searches.adoc#scoring[Scoring for Search Queries]. 
+ == See Also * xref:fine-tune-vector-search.adoc[] From e6d6e8fa9617a6c3ec7bead53e2a8c461420085b Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Thu, 12 Dec 2024 16:29:45 -0500 Subject: [PATCH 5/8] [DOC-12430] Addressing some comments from SME review --- .../pages/vector-search-index-architecture.adoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index dc57bb06b..275920879 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -48,7 +48,7 @@ Vector Search specifically uses https://faiss.ai/index.html[FAISS^] indexes. Any vectors inside your documents are indexed using FAISS, to create a new query vector that can be searched for similar vectors inside your Vector Search index. Vector Search chooses the best https://github.com/facebookresearch/faiss/wiki/Faiss-indexes[FAISS index class^], or vector search algorithm, for your data, and automatically tunes parameters to provide a balance of recall and latency. -You can choose to prioritize recall or latency with the xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting on your index. +You can choose to prioritize recall, latency, or memory efficiency with the xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting on your index. You can also choose to xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries] to override the default balancing for your index, and change the number of centroids or probes searched in a query. The FAISS indexes created for your vector data can be: @@ -106,11 +106,11 @@ The total number of cells is the *nlist* parameter. Every cell has a centroid. Every vector in the processed dataset is assigned to a cell that corresponds to its nearest centroid. 
-In an IVF index, a Vector Search first tries to find the cell that the query vector belongs to. -After it knows the cell to search, Vector Search uses another algorithm to find out the exact vector that's closest to the query vector in that cell. +In an IVF index, Vector Search first tries to find a centroid vector closest to the query vector. +After finding this closest centroid vector, Vector Search uses the default `nprobe` and `max_codes` values to search over adjoining cells to the closest centroid and find the top `k` number of vectors. -The result of an IVF index search can be less accurate, as the nearest vector to a query vector can be in a different cell than the chosen cell. -You can increase accuracy by changing the *nprobe* parameter when you xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries]. +IVF index searches are not exhaustive searches. +You can increase accuracy by changing the `max_nprobe_pct` parameter or `max_codes_pct` when you xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries]. Larger IVF indexes automatically train to learn the data distribution of your vectors, and the centroids of cells in your dataset. The training data helps to encode and compress the vectors in your index with <>. 
From 625f294b35c3da81d2d92f8ebd5ab9db43a9c30e Mon Sep 17 00:00:00 2001 From: sarahlwelton <110928505+sarahlwelton@users.noreply.github.com> Date: Wed, 15 Jan 2025 11:15:07 -0500 Subject: [PATCH 6/8] Update modules/vector-search/pages/vector-search-index-architecture.adoc Co-authored-by: Rebecca Martinez <167447972+Rebecca-Martinez007@users.noreply.github.com> --- .../vector-search/pages/vector-search-index-architecture.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index 275920879..9837cd68f 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -37,7 +37,7 @@ The merger works with the persister to consolidate flushed files and flush the c The persister and merger interact to continuously flush and merge new in-memory segments to disk, and remove stale segments. Segments are marked as stale when they're replaced by a new merged segment created by the merger. -Stale segements are deleted when they're no longer used by any new queries. +Stale segments are deleted when they're no longer used by any new queries. As smaller segments are merged together through the merger routine, the Search Service automatically runs any needed retraining for Vector Search indexes. The segments for a Vector Search index can contain different index types and use a separate indexing pipeline, choosing the appropriate indexing algorithm based on the size of your available documents. 
From 903ce18bdc777b326f34ef450f02437dba81dcc5 Mon Sep 17 00:00:00 2001 From: sarahlwelton <110928505+sarahlwelton@users.noreply.github.com> Date: Wed, 15 Jan 2025 11:15:21 -0500 Subject: [PATCH 7/8] Update modules/vector-search/pages/vector-search-index-architecture.adoc Co-authored-by: Rebecca Martinez <167447972+Rebecca-Martinez007@users.noreply.github.com> --- .../vector-search/pages/vector-search-index-architecture.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index 9837cd68f..7470bf434 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -107,7 +107,7 @@ Every cell has a centroid. Every vector in the processed dataset is assigned to a cell that corresponds to its nearest centroid. In an IVF index, Vector Search first tries to find a centroid vector closest to the query vector. -After finding this closest centroid vector, Vector Search uses the default `nprobe` and `max_codes` values to search over adjoining cells to the closest centroid and find the top `k` number of vectors. +After finding this closest centroid vector, Vector Search uses the default `nprobe` and `max_codes` values to search over adjoining cells to the closest centroid and finds the top `k` number of vectors. IVF index searches are not exhaustive searches. You can increase accuracy by changing the `max_nprobe_pct` parameter or `max_codes_pct` when you xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries]. 
From be53aa75dfe7fb23fe1558c44086eb59e27332bc Mon Sep 17 00:00:00 2001 From: Sarah Welton Date: Wed, 15 Jan 2025 11:18:42 -0500 Subject: [PATCH 8/8] [DOC-12430] Changes/suggestions from peer review --- .../vector-search/pages/vector-search-index-architecture.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/modules/vector-search/pages/vector-search-index-architecture.adoc b/modules/vector-search/pages/vector-search-index-architecture.adoc index 7470bf434..c29e8a018 100644 --- a/modules/vector-search/pages/vector-search-index-architecture.adoc +++ b/modules/vector-search/pages/vector-search-index-architecture.adoc @@ -7,7 +7,7 @@ {description} A Vector Search index still relies on <> and uses <> to manage merging and persisting data to disk in your cluster. -All changes from DCP and the Data Service are introduced to a Search index in batches, which are further managed by segments. +All changes from Database Change Protocol (DCP) and the Data Service are introduced to a Search index in batches, which are further managed by segments. [#sync] == Synchronization with Database Change Protocol (DCP) and the Data Service @@ -112,7 +112,7 @@ After finding this closest centroid vector, Vector Search uses the default `npro IVF index searches are not exhaustive searches. You can increase accuracy by changing the `max_nprobe_pct` parameter or `max_codes_pct` when you xref:fine-tune-vector-search.adoc[fine tune your Vector Search queries]. -Larger IVF indexes automatically train to learn the data distribution of your vectors, and the centroids of cells in your dataset. +The Search Service automatically trains larger IVF indexes to learn the data distribution of your vectors, and the centroids of cells in your dataset. The training data helps to encode and compress the vectors in your index with <>. All training occurs during building and merging <>.
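
The IVF lookup described in this patch series (assign each vector to its nearest centroid, then search only the cells closest to the query) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the FAISS or Search Service implementation: the function names `assign_cells` and `ivf_search` are hypothetical, the centroids are fixed by hand instead of being learned during training, and `nprobe` is passed as a raw cell count, whereas the documented query settings are `max_nprobe_pct` and `max_codes_pct`.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_cells(vectors, centroids):
    """Assign every vector ID to the cell of its nearest centroid."""
    cells = {i: [] for i in range(len(centroids))}
    for vid, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        cells[nearest].append(vid)
    return cells


def ivf_search(query, vectors, centroids, cells, k, nprobe):
    """Return the top-k vector IDs found in the nprobe cells nearest the query.

    Inside the probed cells the comparison is exact (flat); vectors in
    cells that are not probed are never examined.
    """
    probe_order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [vid for cell in probe_order[:nprobe] for vid in cells[cell]]
    ranked = sorted(candidates, key=lambda vid: l2(query, vectors[vid]))
    return ranked[:k]


# Hypothetical toy data: five 2-dimensional vectors and two hand-picked centroids.
vectors = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (4.4, 0.0),
    "d": (9.0, 0.0),
    "e": (10.0, 0.0),
}
centroids = [(0.0, 0.0), (10.0, 0.0)]  # assumption: fixed, not trained
cells = assign_cells(vectors, centroids)

query = (5.5, 0.0)
narrow = ivf_search(query, vectors, centroids, cells, k=1, nprobe=1)
wide = ivf_search(query, vectors, centroids, cells, k=1, nprobe=2)
print(narrow, wide)
```

In this toy dataset the query sits slightly closer to the second centroid, while its true nearest neighbor `c` was assigned to the first cell. Probing one cell therefore returns `d`, and probing both cells returns `c`. That is the recall and latency trade-off behind the statement that IVF index searches are not exhaustive.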