Fixing sorted indices for GPU built indices #138138
mayya-sharipova
commented
Nov 16, 2025
- For flush, vectors are now reordered according to sortMap before building the GPU index, ensuring that HNSW graph node ordinals match the sorted document order.
- Merge, on the other hand, does not require explicit sortMap handling, since Lucene's MergedVecto utilities apply docMaps internally.
- Enhanced tests with both approximate and exact KNN searches to validate sorting correctness.
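
For readers unfamiliar with the flush-time reordering described above, here is a minimal sketch of the idea, not the code from this PR. It assumes the sort order is available as a plain `newToOld` permutation array; `SortedVectorReorder`, `newToOld`, and `reorder` are hypothetical names chosen for illustration, and the real change works against Lucene's sort map rather than raw arrays.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper for illustration only; not part of Elasticsearch or Lucene. */
final class SortedVectorReorder {

    /**
     * Builds a permutation where newToOld[newOrd] is the original ordinal of the
     * vector that ends up at position newOrd after sorting by sortKeys (ascending).
     * Ties keep their original relative order because Arrays.sort on objects is stable.
     */
    static int[] newToOld(long[] sortKeys) {
        Integer[] ords = new Integer[sortKeys.length];
        for (int i = 0; i < ords.length; i++) {
            ords[i] = i;
        }
        Arrays.sort(ords, Comparator.comparingLong((Integer o) -> sortKeys[o]));
        int[] result = new int[ords.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = ords[i];
        }
        return result;
    }

    /**
     * Returns the vectors laid out in sorted order, so that the ordinal handed to
     * the index builder equals the position of the document in the sorted segment.
     */
    static float[][] reorder(float[][] vectors, int[] newToOld) {
        if (vectors.length != newToOld.length) {
            throw new IllegalArgumentException("vectors and sort map differ in length");
        }
        float[][] sorted = new float[vectors.length][];
        for (int newOrd = 0; newOrd < newToOld.length; newOrd++) {
            sorted[newOrd] = vectors[newToOld[newOrd]];
        }
        return sorted;
    }
}
```

With the vectors reordered before the graph is built, node ordinal `i` refers to the `i`-th document in sorted order, which is the property the enhanced approximate and exact KNN tests check.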
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Hi @mayya-sharipova, I've created a changelog YAML for you.
ldematte
left a comment
Change looks good to me. I'm not a Lucene expert, so I cannot say whether it's the right way to do it; I'll trust you/Chris on this.
You probably want to merge in the changes from #138155 and add the test-gpu flag to be sure the tests pass.
ChrisHegarty
left a comment
The changes look good to me. I guess I'm surprised not to see some changes to ES92GpuHnswVectorsFormatTests! Are there no sorting tests there?
These tests use tests inside BaseKnnVectorsFormatTestCase, such as
@ChrisHegarty Do you suggest we add tests to ES92GpuHnswVectorsFormatTests? I can do that.
First, I think that this PR is good to be merged as-is.
Right. It surprises me that sorting was completely unimplemented, and that no scenarios in
Thanks Chris, I will merge this PR and look into adding more sorted-index tests to BaseKnnVectorsFormatTestCase.
💚 Backport successful