Fix TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults #13361
…dAndUnsortedIndicesReturnSameResults.
@timgrein could you determine whether the scores are the same or not? I wonder if we are getting tripped up by doc IDs being the tie breaker for equal scores. |
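To make the tie-breaker concern concrete, here is a minimal sketch, not Lucene code. It assumes hits are ordered by descending score and that equal scores are ordered by ascending doc ID. The `Hit` class, the `topK` helper, and the sample data are hypothetical. The same three logical documents get different doc IDs in the unsorted and sorted index, so the top 2 differ even though every score is identical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TieBreakSketch {
  // Hypothetical hit: a stable logical id, an index-specific doc ID, and a score.
  static final class Hit {
    final String id;
    final int docId;
    final float score;

    Hit(String id, int docId, float score) {
      this.id = id;
      this.docId = docId;
      this.score = score;
    }
  }

  // Assumed ordering: higher score first, then lower doc ID on ties.
  static List<String> topK(List<Hit> hits, int k) {
    List<Hit> sorted = new ArrayList<>(hits);
    sorted.sort(
        Comparator.comparingDouble((Hit h) -> -h.score).thenComparingInt(h -> h.docId));
    List<String> ids = new ArrayList<>();
    for (Hit h : sorted.subList(0, Math.min(k, sorted.size()))) {
      ids.add(h.id);
    }
    return ids;
  }

  // Same logical docs and scores; only the doc ID assignment differs.
  static List<Hit> unsortedIndex() {
    return List.of(new Hit("a", 0, 0.9f), new Hit("b", 1, 0.5f), new Hit("c", 2, 0.5f));
  }

  static List<Hit> sortedIndex() {
    return List.of(new Hit("a", 2, 0.9f), new Hit("b", 1, 0.5f), new Hit("c", 0, 0.5f));
  }

  public static void main(String[] args) {
    System.out.println(topK(unsortedIndex(), 2));
    System.out.println(topK(sortedIndex(), 2));
  }
}
```

If the two id lists differ only among entries with equal scores, the mismatch comes from tie-breaking rather than from one graph missing a better document.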
Without increasing
(So it seems like the first/unsorted index doesn't find document Increasing
|
@timgrein what is beamWidth set to in the failing case? We may want to increase beamWidth just to make the test more consistent.
|
@benwtrent The beam width for the failing test case was the smallest value possible |
I would rather not. If we keep bumping it up, eventually we will stop searching the graph altogether and just brute-force, which defeats the purpose of the test. |
Makes sense, decreased |
LGTM
…ults (apache#13361) Considering that the graphs of 2 indices are organized differently we need to explore a lot of candidates to ensure that both searchers find the same docs. Increasing beamWidth (number of nearest neighbor candidates to track while searching the graph for each newly inserted node) from 5 to 10 fixes the test.
Closes #13210
Description
The following test failed because it produced two different lists of ids for a sorted and an unsorted HNSW byte vector graph: one graph missed a higher-scoring doc that the other one found:
gradlew test --tests TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults -Dtests.seed=B41BEC5619361A16 -Dtests.locale=hi-IN -Dtests.timezone=Atlantic/Stanley -Dtests.asserts=true -Dtests.file.encoding=UTF-8
Considering that the graphs of the 2 indices are organized differently, we need to explore a lot of candidates to ensure that both searchers find the same docs. Increasing beamWidth (number of nearest neighbor candidates to track while searching the graph for each newly inserted node) from 5 to 10 fixes the test.
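Why the number of tracked candidates matters can be illustrated with a toy best-first graph search. This is a sketch under stated assumptions, not Lucene's HNSW implementation: the 1-D values, the four-node graph, and the `nearest` method are all made up for illustration, and the search runs on a fixed graph instead of during node insertion. With a beam of 1 the search discards the neighbor that leads to the true nearest node; with a beam of 2 it keeps that neighbor and finds it.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

class BeamWidthSketch {
  // Hypothetical 1-D "vectors": VALUES[node] is the node's coordinate.
  static final double[] VALUES = {10, 6, 7, 1};
  // Hypothetical undirected graph: 0-1, 0-2, 2-3.
  static final int[][] NEIGHBORS = {{1, 2}, {0}, {0, 3}, {2}};

  static double dist(int node, double query) {
    return Math.abs(VALUES[node] - query);
  }

  // Best-first search that keeps at most beamWidth results at any time.
  static int nearest(double query, int entry, int beamWidth) {
    Comparator<Integer> byDist = Comparator.comparingDouble(n -> dist(n, query));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDist); // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDist.reversed()); // furthest first
    Set<Integer> visited = new HashSet<>();
    candidates.add(entry);
    results.add(entry);
    visited.add(entry);

    while (!candidates.isEmpty()) {
      int c = candidates.poll();
      // Stop once the closest remaining candidate is worse than the worst kept result.
      if (results.size() >= beamWidth && dist(c, query) > dist(results.peek(), query)) {
        break;
      }
      for (int n : NEIGHBORS[c]) {
        if (!visited.add(n)) {
          continue; // already seen
        }
        if (results.size() < beamWidth || dist(n, query) < dist(results.peek(), query)) {
          candidates.add(n);
          results.add(n);
          if (results.size() > beamWidth) {
            results.poll(); // drop the furthest result
          }
        }
      }
    }

    // Pick the closest node among the kept results.
    int best = results.peek();
    for (int n : results) {
      if (dist(n, query) < dist(best, query)) {
        best = n;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    System.out.println("beam 1 -> node " + nearest(0.0, 0, 1));
    System.out.println("beam 2 -> node " + nearest(0.0, 0, 2));
  }
}
```

In this toy graph, node 2 is slightly worse than node 1 but is the only path to node 3, the true nearest neighbor of the query 0. A narrow beam prunes node 2 immediately, which mirrors how two differently organized graphs can disagree when too few candidates are tracked.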