[DOC-12430] Vector Search Index Architecture #308

Merged: sarahlwelton merged 10 commits into release/7.6 from DOC-12430-search-index-arch-vs-server on Jan 16, 2025

Commits (10):
* 2ce50d7 [DOC-12430] Adding anchor to child-field-options-reference (sarahlwelton)
* 5427149 [DOC-12430] Add entry to nav.adoc (sarahlwelton)
* 8a63caf Merge branch 'release/7.6' into DOC-12430-search-index-arch-vs-server (sarahlwelton)
* f43b526 [DOC-12430] Elaboration on when each index type is used + other fixes (sarahlwelton)
* fa6403f [DOC-12430] Tying processing in with scoring. (sarahlwelton)
* 788ef26 Merge remote-tracking branch 'origin/release/7.6' into DOC-12430-sear… (sarahlwelton)
* e6d6e8f [DOC-12430] Addressing some comments from SME review (sarahlwelton)
* 625f294 Update modules/vector-search/pages/vector-search-index-architecture.adoc (sarahlwelton)
* 903ce18 Update modules/vector-search/pages/vector-search-index-architecture.adoc (sarahlwelton)
* be53aa7 [DOC-12430] Changes/suggestions from peer review (sarahlwelton)

modules/vector-search/pages/vector-search-index-architecture.adoc (152 additions, 0 deletions)

= Vector Search Index Architecture
:page-topic-type: concept
:description: Vector Search indexes use features from traditional Search indexes, with unique indexing algorithms and features that allow you to compare vectors in nearest neighbor searches.
:page-toclevels: 3

[abstract]
{description}

A Vector Search index still relies on <<sync,>> and uses <<segments,>> to manage merging and persisting data to disk in your cluster.
All changes from Database Change Protocol (DCP) and the Data Service are introduced to a Search index in batches, which are further managed by segments.

[#sync]
== Synchronization with Database Change Protocol (DCP) and the Data Service

The Search Service uses batches to process data that comes in from xref:server:learn:clusters-and-availability/intra-cluster-replication.adoc#database-change-protocol[DCP] and the xref:server:learn:services-and-indexes:services/data-service.adoc[Data Service].
DCP and Data Service changes are introduced gradually, based on available memory on Search Service nodes, until reindexing operations for an index are complete.

The Search Service can merge batches into a single batch before they're sent to the disk write queue, to reduce the resources required for batch processing.

The Search Service maintains index snapshots on each Search index partition.
These snapshots contain a representation of document mutations on either a write queue, or in storage.

If the Search Service loses connection to the Data Service, the Search Service compares the rollback sequence numbers in its snapshots with those from the Data Service when the connection is reestablished.
If the index snapshots on the Search Service are too far ahead, the Search Service performs a full rollback to get back in sync with the Data Service.
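
A minimal sketch of that rollback check, assuming a simplified model in which a snapshot and the Data Service each expose a single comparable sequence number per partition. The function and variable names are illustrative only, not Couchbase APIs:

[source,python]
----
def needs_full_rollback(snapshot_seqno: int, data_service_seqno: int) -> bool:
    """Illustrative check: if the Search index snapshot is ahead of the
    history the Data Service can vouch for, the partition can't be
    reconciled incrementally and must be rolled back and rebuilt."""
    return snapshot_seqno > data_service_seqno

# Example: the snapshot recorded mutations up to 5,200, but after the
# connection is reestablished the Data Service only guarantees 4,800.
if needs_full_rollback(snapshot_seqno=5_200, data_service_seqno=4_800):
    print("Perform a full rollback to get back in sync with the Data Service")
----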

[#segments]
== Search Index Segments

Search and Vector Search indexes in Couchbase Server are built with segments.

All Search indexes contain a root segment, which includes all data for the Search index but excludes any segments that might be stale.
Stale segments are eventually removed by the Search Service's persister or merger routines.

The persister reads in-memory segments from the disk write queue and flushes them to disk, completing batch operations as part of <<sync,>>.
The merger works with the persister to consolidate flushed files and flush the consolidated results back through the persister, while purging the smaller, older files.

The persister and merger interact to continuously flush and merge new in-memory segments to disk, and remove stale segments.

Segments are marked as stale when they're replaced by a new merged segment created by the merger.
Stale segments are deleted when they're no longer used by any new queries.

As smaller segments are merged together through the merger routine, the Search Service automatically runs any needed retraining for Vector Search indexes.
The segments for a Vector Search index can contain different index types and use a separate indexing pipeline, choosing the appropriate indexing algorithm based on the size of your available documents.
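
The way the persister and merger cooperate can be pictured with a simplified sketch. The `Segment` and `SegmentStore` types below are illustrative only and do not reflect the Search Service's actual data structures:

[source,python]
----
from dataclasses import dataclass, field
from typing import List

@dataclass
class Segment:
    doc_count: int
    stale: bool = False   # set once the segment is replaced by a merged segment

@dataclass
class SegmentStore:
    write_queue: List[Segment] = field(default_factory=list)   # in-memory segments
    on_disk: List[Segment] = field(default_factory=list)       # flushed segments

    def persist(self) -> None:
        """Persister: flush in-memory segments from the disk write queue to disk."""
        while self.write_queue:
            self.on_disk.append(self.write_queue.pop(0))

    def merge(self, small_threshold: int = 1000) -> None:
        """Merger: consolidate small flushed segments into one larger segment
        and mark the originals as stale."""
        small = [s for s in self.on_disk if s.doc_count < small_threshold and not s.stale]
        if len(small) > 1:
            self.on_disk.append(Segment(doc_count=sum(s.doc_count for s in small)))
            for s in small:
                s.stale = True

    def purge_stale(self) -> None:
        """Remove stale segments; in practice this waits until no running
        queries still reference them."""
        self.on_disk = [s for s in self.on_disk if not s.stale]
----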

== Vector Search and FAISS

Vector Search specifically uses https://faiss.ai/index.html[FAISS^] indexes.
Any vectors inside your documents are indexed with FAISS, so that a query vector can be searched against them for similar vectors inside your Vector Search index.

Vector Search chooses the best https://github.com/facebookresearch/faiss/wiki/Faiss-indexes[FAISS index class^], or vector search algorithm, for your data, and automatically tunes parameters to provide a balance of recall and latency.
You can choose to prioritize recall, latency, or memory efficiency with the xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting on your index.
You can also choose to xref:fine-tune-vector-search.adoc[fine-tune your Vector Search queries] to override the default balancing for your index, and change the number of centroids or probes searched in a query.

The FAISS indexes created for your vector data can be:

* <<flat,>>
* <<ivf,>>

The specific type of index used depends on the number of vectors in your dataset:

|====
| Vector Count | Index Types | Description

| >= 10,000
| IVF with scalar quantization
a| Vectors are indexed with <<ivf,>> indexes and <<scalar-quant,>>.

If xref:search:child-field-options-reference.adoc#optimized[Optimized For] is set to *recall* or *latency*, Vector Search uses 8bit scalar quantization.
If set to *memory-efficient*, Vector Search uses 4bit scalar quantization.

| 1,000 to 9,999
| IVF with Flat
| Vectors are indexed with <<ivf,>> combined with <<flat,>>.
Indexes do not use <<scalar-quant,>>.

| < 1,000
| Flat
| Vectors are indexed with <<flat,>>.
Indexes do not use <<scalar-quant,>>.
|====
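
As an illustration only, the thresholds in the table can be expressed as FAISS `index_factory` description strings. The selection function and the `nlist` value of 100 below are assumptions made for this sketch, not Couchbase's internal logic:

[source,python]
----
import faiss  # the open-source FAISS library


def faiss_index_description(vector_count: int, optimized_for: str = "recall") -> str:
    """Map a vector count to a FAISS index description, mirroring the table above."""
    if vector_count >= 10_000:
        sq = "SQ4" if optimized_for == "memory-efficient" else "SQ8"
        return f"IVF100,{sq}"      # IVF cells with scalar-quantized vectors
    if vector_count >= 1_000:
        return "IVF100,Flat"       # IVF cells storing exact (flat) vectors
    return "Flat"                  # a plain list of vectors, searched exhaustively


# Example: build a 128-dimensional index for a 5,000-vector dataset.
index = faiss.index_factory(128, faiss_index_description(5_000))
print(type(index))                 # an IVF index with flat storage
----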

[#flat]
=== Flat Indexes

The most basic kind of index that Vector Search can use for your vectors is a flat index.

Vector Search uses flat indexes for data that contains fewer than 1,000 vectors.

Flat indexes are a list of vectors.
Searches run as a nearest neighbor process: the query vector is compared against each vector in the index to calculate the distance between them.
Results for flat indexes are very accurate, but performance does not scale well as a dataset grows.

If a Vector Search index uses only flat indexes, no training is required: IDs are mapped directly to vectors with exact vector comparisons, with no need for preprocessing or learning on the data.
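
You can see this behavior directly with the open-source FAISS Python API. The following sketch uses random, purely illustrative data to build a flat index and run an exhaustive nearest neighbor search:

[source,python]
----
import faiss
import numpy as np

d = 128                                           # vector dimension
rng = np.random.default_rng(42)
vectors = rng.random((500, d), dtype="float32")   # fewer than 1,000 vectors

index = faiss.IndexFlatL2(d)                # a flat index is just a list of vectors
index.add(vectors)                          # no training or preprocessing required

query = rng.random((1, d), dtype="float32")
k = 3                                       # number of nearest neighbors to return
distances, ids = index.search(query, k)     # compares the query against every vector

print(ids[0])         # positions of the 3 closest vectors
print(distances[0])   # their squared L2 distances
----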

[#ivf]
=== Inverted File Index (IVF)

For reduced latency, Vector Search can also use Inverted File Indexes (IVF).

Vector Search uses a combination of IVF and flat indexes for data that contains between 1,000 and 9,999 vectors.
For even larger datasets, Vector Search uses IVF indexes with <<scalar-quant,>>.

IVF creates partitions called Voronoi cells in an index.
The total number of cells is the *nlist* parameter.

Every cell has a centroid.
Every vector in the processed dataset is assigned to the cell that corresponds to its nearest centroid.

In an IVF index, Vector Search first finds the centroid vector closest to the query vector.
After finding this closest centroid vector, Vector Search uses the default `nprobe` and `max_codes` values to search over cells adjoining the closest centroid and find the top `k` vectors.

IVF index searches are not exhaustive searches.
You can increase accuracy by changing the `max_nprobe_pct` or `max_codes_pct` parameters when you xref:fine-tune-vector-search.adoc[fine-tune your Vector Search queries].
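
The `nlist` and `nprobe` concepts map directly onto the open-source FAISS Python API. The values in this sketch are arbitrary examples; Couchbase chooses and tunes these parameters automatically:

[source,python]
----
import faiss
import numpy as np

d = 128
rng = np.random.default_rng(7)
vectors = rng.random((5_000, d), dtype="float32")

nlist = 64                                    # number of Voronoi cells (centroids)
quantizer = faiss.IndexFlatL2(d)              # used to assign vectors to their nearest centroid
index = faiss.IndexIVFFlat(quantizer, d, nlist)

index.train(vectors)                          # learn the centroids from the data
index.add(vectors)                            # assign each vector to its nearest cell

index.nprobe = 8                              # cells to scan per query: higher improves recall, costs latency
query = rng.random((1, d), dtype="float32")
k = 3
distances, ids = index.search(query, k)       # non-exhaustive: only nprobe cells are scanned

print(ids[0], distances[0])
----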

The Search Service automatically trains larger IVF indexes to learn the data distribution of your vectors, and the centroids of cells in your dataset.
The training data helps to encode and compress the vectors in your index with <<scalar-quant,>>.
All training occurs during building and merging <<segments,>>.

IVF indexes that also use flat indexing automatically train to determine the centroids of cells, but still use exact vector comparisons within each cell.
Training still occurs while building and merging <<segments,>>.

[#scalar-quant]
==== Scalar Quantization

Vector Search uses scalar quantization on large datasets to reduce the size of your indexes.

Scalar quantization is a data compression technique that turns the floating point values present in a large vector into lower-precision integers.
For example, a float32 value could be reduced to an int8 value.

Scalar quantization in Vector Search does not have a significant effect on the recall, or accuracy, of query results on large datasets.

Vector Search uses either 8bit or 4bit scalar quantization for an index, based on your xref:search:child-field-options-reference.adoc#optimized[Optimized For] setting.
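
For illustration, the open-source FAISS Python API exposes both quantization levels. The pairing of 8bit with *recall* or *latency* and 4bit with *memory-efficient* follows the table earlier on this page; the `nlist` value and the random data are arbitrary examples:

[source,python]
----
import faiss
import numpy as np

d = 128
rng = np.random.default_rng(11)
vectors = rng.random((20_000, d), dtype="float32")   # large enough for IVF with scalar quantization

nlist = 128

# 8bit scalar quantization: each float32 component is stored as a single byte.
index_sq8 = faiss.IndexIVFScalarQuantizer(
    faiss.IndexFlatL2(d), d, nlist, faiss.ScalarQuantizer.QT_8bit)

# 4bit scalar quantization packs two components per byte, trading some recall
# for a smaller index.
index_sq4 = faiss.IndexIVFScalarQuantizer(
    faiss.IndexFlatL2(d), d, nlist, faiss.ScalarQuantizer.QT_4bit)

for index in (index_sq8, index_sq4):
    index.train(vectors)    # learns centroids and the value ranges used for quantization
    index.add(vectors)

distances, ids = index_sq8.search(vectors[:1], 3)
print(ids[0], distances[0])
----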

== Search Request Processing

When there are multiple nodes in the cluster running the Search Service, the Search Service uses a scatter-gather process for running all Search queries.

The Search Service node that receives the Search request is assigned as the coordinating node.
Using https://grpc.io/[gRPC^], the coordinating node scatters the request across the other nodes to all other partitions for the Search or Vector Search index in the request.
The coordinating node applies filters to the results received from the other partitions, and returns the final result set.
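
The gather step can be pictured with a small sketch that merges per-partition top-`k` hits into one result set. This is a conceptual illustration only, not the Search Service's implementation, and the gRPC transport is omitted:

[source,python]
----
import heapq
from typing import Dict, List, Tuple

# Each partition returns its own top hits as (score, document ID) pairs.
# In the real service these arrive over gRPC; here they're hard-coded.
partition_results: Dict[str, List[Tuple[float, str]]] = {
    "partition-1": [(0.92, "doc-14"), (0.81, "doc-2")],
    "partition-2": [(0.95, "doc-77"), (0.66, "doc-31")],
    "partition-3": [(0.88, "doc-5"), (0.40, "doc-90")],
}

def gather_top_k(results: Dict[str, List[Tuple[float, str]]], k: int) -> List[Tuple[float, str]]:
    """Coordinating node step: merge every partition's hits and keep the
    global top k by score."""
    all_hits = [hit for hits in results.values() for hit in hits]
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])

print(gather_top_k(partition_results, k=3))
# [(0.95, 'doc-77'), (0.92, 'doc-14'), (0.88, 'doc-5')]
----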

Results are scored and returned in a list, ordered based on the xref:search:search-request-params.adoc#sort[Sort Object] provided in the Search request.

Review comment on this line: Should mention tf-idf and vector distance scores are summed during hybrid search and the user can influence the scoring with the

For a Vector Search query, search results include the top `k` nearest neighbor vectors to the vector in the Search query.
For more information about how results are scored and returned for Search requests, see xref:search:run-searches.adoc#scoring[Scoring for Search Queries].

== See Also

* xref:fine-tune-vector-search.adoc[]
* xref:search:search-request-params.adoc[]
* xref:create-vector-search-index-rest-api.adoc[]
* xref:create-vector-search-index-ui.adoc[]

Review comment: Should mention here the 3rd optimization we offer - `memory_efficient` as well?

Reply: Good shout! Yes, let's add that in the next line.