-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Description
To support ANN search, Elasticsearch indexes the vector values of each segment as an HNSW graph. Usually an index has multiple segments, which means ANN must check several HNSW graphs as it searches one segment after another. With fewer segments, the ANN search is more efficient, since we check a smaller number of graphs. Another way to describe it: searching an HNSW graph is sub-linear in the number of vectors, so it's better to search larger graphs vs. several smaller ones.
It's certainly possible to force merge the index to one segment, but force merge can take substantial time and has some downsides (explained here).
This meta issue explores ideas to improve performance that don't require force merge. These are early ideas and may change.
- Improvements to Elasticsearch's flush logic
- Fix the surprising behavior in Elasticsearch flush where we sometimes create tiny segments (#90776)
- Enhance
IndexMemoryController
to no longer flush all pending segments when hitting theindices.memory.index_buffer_size
threshold, and instead only flush the largest pending segment (#34553) - When uploading a large set of vectors in bulk, the segment size is usually limited by
index.translog.flush_threshold_size
(512MB by default). Are there ways to avoid bumping up against this limit?
- Enable merge-on-refresh to avoid publishing tiny segments in point-in-time views of shards.
- Consider a dedicated merge policy optimized for vector search that encourages larger segments.
- Decrease
index.merge.policy.segments_per_tier
to encourage the merge policy to merge segments more aggressively. - Increase
index.merge.policy.max_merged_segment
to not stop merging segments at 5GB. - Increase
index.merge.policy.floor_segment
to further bias the merge policy against small segments.
- Decrease
- Improve search speed in presence of multiple segments.
- Maybe we don't need to gather the full
num_candidates
from each segment?
- Maybe we don't need to gather the full
- Perhaps we could have a special "bulk vector upload" operation that does not allow concurrent searches, and makes sure to create a single graph by tweaking settings under-the-hood.