Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults #13361

Merged

Conversation

timgrein
Copy link
Contributor

@timgrein timgrein commented May 12, 2024

Closes #13210

Description

The following test failed as it produced two different lists of ids for a sorted and unsorted HNSW byte vector graph as one graph didn't find a higher scoring doc the other one found:
gradlew test --tests TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults -Dtests.seed=B41BEC5619361A16 -Dtests.locale=hi-IN -Dtests.timezone=Atlantic/Stanley -Dtests.asserts=true -Dtests.file.encoding=UTF-8

Considering that the graphs of 2 indices are organized differently we need to explore a lot of candidates to ensure that both searchers find the same docs. Increasing beamWidth (number of nearest neighbor candidates to track while searching the graph for each newly inserted node) from 5 to 10 fixes the test.

@benwtrent
Copy link
Member

@timgrein could you determine if the scores the same or not? I wonder if we are getting tripped up by doc IDs being the tie breaker for equal scores.

@timgrein
Copy link
Contributor Author

timgrein commented May 14, 2024

@benwtrent

Without increasing k we'll get the following for the failing test instance:

TOP 1 docs:
Document<stored<id:23>> 9.601536E-5
Document<stored<id:119>> 7.3713694E-5
Document<stored<id:163>> 7.087675E-5
Document<stored<id:148>> 7.051192E-5
Document<stored<id:51>> 6.879472E-5
  
TOP 2 docs:
Document<stored<id:23>> 9.601536E-5
Document<stored<id:193>> 8.53898E-5
Document<stored<id:119>> 7.3713694E-5
Document<stored<id:163>> 7.087675E-5
Document<stored<id:148>> 7.051192E-5

(So it seems like the first/unsorted index doesn't find document 193, when k is too small (60); I guess due to the different index structure?)

Increasing k to 80 leads to the following results for the previously failing test instance:

TOP 1 docs:
Document<stored<id:23>> 9.601536E-5
Document<stored<id:193>> 8.53898E-5
Document<stored<id:119>> 7.3713694E-5
Document<stored<id:163>> 7.087675E-5
Document<stored<id:148>> 7.051192E-5

TOP 2 docs:
Document<stored<id:23>> 9.601536E-5
Document<stored<id:193>> 8.53898E-5
Document<stored<id:119>> 7.3713694E-5
Document<stored<id:163>> 7.087675E-5
Document<stored<id:148>> 7.051192E-5

@benwtrent
Copy link
Member

@timgrein what is the beamwidth set to in the failing case?

We may want to increase the beamWidth size to just make the test more consistent.

int beamWidth = random().nextInt(10) + 10; // from the previous base of 5

@timgrein
Copy link
Contributor Author

timgrein commented May 14, 2024

@benwtrent The beam width for the failing test case was the smallest value possible 5. Increased the minimum to 10 according to your suggestion (which also fixes the test without increasing k). Do we still want to keep the increased k? Increasing it didn't seem to affect execution time too much at least for this test instance and it should probably reduce the likelihood of a flaky fail even further.

@benwtrent
Copy link
Member

Do we still want to keep the increased k?

I would rather not, we keep bumping it up, eventually we are going to stop searching in the graph altogether and just brute force, which ruins the reason for the test.

@timgrein
Copy link
Contributor Author

eventually we are going to stop searching in the graph altogether and just brute force, which ruins the reason for the test

Makes sense, decreased k again to 60 👍

Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ChrisHegarty ChrisHegarty merged commit 24fd426 into apache:main May 17, 2024
3 checks passed
ChrisHegarty pushed a commit to ChrisHegarty/lucene that referenced this pull request May 17, 2024
…ults (apache#13361)

Considering that the graphs of 2 indices are organized differently we need to explore a lot of candidates to ensure that both searchers find the same docs. Increasing beamWidth (number of nearest neighbor candidates to track while searching the graph for each newly inserted node) from 5 to 10 fixes the test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test failures in TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults
3 participants