Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky tests that are caused by small float vectors #12943

Merged
merged 1 commit into from Dec 14, 2023

Conversation

benwtrent
Copy link
Member

While quantization generally works well, when the number of dimensions is tiny (just two like in our tests), and we are indexing a circle, and we have random merge policies, we can end up getting unexpected ordering on the resulting vectors.

closes: #12940

@@ -100,22 +100,18 @@ public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
}
};

if (vectorEncoding == VectorEncoding.FLOAT32) {
float32Codec = codec;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this was relying on the default codec being a FLOAT32 codec? TBH I haven't kept up with how we now select whether or not to use quantization. Is it the default? Do we need to override the codec to select it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msokolov yeah, two tests are really relying on perfect scores & things being float. I mistakenly turned on quantization which adds some error bands to the scores (obviously, because its lossy) and thus its flaky.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was just another failure on the dev list

FAILED: org.apache.lucene.util.hnsw.TestHnswFloatVectorGraph.testSortedAndUnsortedIndicesReturnSameResults

Error Message:
java.lang.AssertionError: expected:<[43, 199, 163, 3, 180]> but was:<[270, 43, 199, 163, 269]>

I wonder if it could be a similar root cause?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be, I was going to look into that one next.

Copy link
Contributor

@msokolov msokolov Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that does repro for me, but also on branch_9_8, so I think it must be a different issue - I'll open a new one to track. Umm correction it does not repro on branch_9_8, but it is in any case a different issue: #12945

@benwtrent benwtrent merged commit 423f827 into apache:main Dec 14, 2023
4 checks passed
@benwtrent benwtrent deleted the test/fix-tests-for-small-floats branch December 14, 2023 19:38
benwtrent added a commit that referenced this pull request Dec 14, 2023
While quantization generally works well, when the number of dimensions is tiny (just two like in our tests), and we are indexing a circle, and we have random merge policies, we can end up getting unexpected ordering on the resulting vectors.

closes: #12940
slow-J pushed a commit to slow-J/lucene that referenced this pull request Dec 17, 2023
While quantization generally works well, when the number of dimensions is tiny (just two like in our tests), and we are indexing a circle, and we have random merge policies, we can end up getting unexpected ordering on the resulting vectors.

closes: apache#12940
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test failure in TestKnnGraph.testMultiThreadedSearch
2 participants