Fix flaky tests that are caused by small float vectors #12943
@@ -100,22 +100,18 @@ public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
}
};

if (vectorEncoding == VectorEncoding.FLOAT32) {
float32Codec = codec;
so this was relying on the default codec being a FLOAT32 codec? TBH I haven't kept up with how we now select whether or not to use quantization. Is it the default? Do we need to override the codec to select it?
@msokolov yeah, two tests are really relying on perfect scores and on things being float. I mistakenly turned on quantization, which adds some error bands to the scores (obviously, because it's lossy), and thus the tests are flaky.
there was just another failure on the dev list
FAILED: org.apache.lucene.util.hnsw.TestHnswFloatVectorGraph.testSortedAndUnsortedIndicesReturnSameResults
Error Message:
java.lang.AssertionError: expected:<[43, 199, 163, 3, 180]> but was:<[270, 43, 199, 163, 269]>
I wonder if it could be a similar root cause?
It could be, I was going to look into that one next.
OK, that does repro for me, but also on branch_9_8, so I think it must be a different issue. I'll open a new one to track. Umm, correction: it does not repro on branch_9_8, but it is in any case a different issue: #12945
While quantization generally works well, we can end up with unexpected ordering of the resulting vectors when the number of dimensions is tiny (just two, as in our tests), we are indexing a circle, and we have random merge policies.
closes: #12940
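To make the failure mode concrete, here is a minimal sketch of how lossy quantization can erase the score difference between two nearby 2-d vectors on a circle. It is not Lucene's quantizer: the fixed `[-1, 1]` range, the 127-bucket linear mapping, and all names (`QuantizedCircleSketch`, `pointOnCircle`, `quantize`, `dequantize`, `dot`) are assumptions made up for illustration.

```java
import java.util.Arrays;

public class QuantizedCircleSketch {
  // Hypothetical fixed range: components of unit-circle points lie in [-1, 1].
  static final float MIN = -1f;
  static final float MAX = 1f;
  // Number of bucket steps, chosen only for illustration.
  static final int LEVELS = 127;

  // A 2-d point on the unit circle at the given angle (radians).
  static float[] pointOnCircle(double angle) {
    return new float[] {(float) Math.cos(angle), (float) Math.sin(angle)};
  }

  // Linearly map each component from [MIN, MAX] to an integer bucket in [0, LEVELS].
  static int[] quantize(float[] v) {
    int[] q = new int[v.length];
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(MIN, Math.min(MAX, v[i]));
      q[i] = Math.round((clamped - MIN) / (MAX - MIN) * LEVELS);
    }
    return q;
  }

  // Map buckets back to floats; the rounding error from quantize() is not recoverable.
  static float[] dequantize(int[] q) {
    float[] v = new float[q.length];
    for (int i = 0; i < q.length; i++) {
      v[i] = MIN + q[i] * (MAX - MIN) / LEVELS;
    }
    return v;
  }

  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] query = pointOnCircle(0.0);
    float[] a = pointOnCircle(0.001);
    float[] b = pointOnCircle(0.002);

    // With raw floats, a scores strictly higher than b against the query.
    System.out.println("float scores:     " + dot(query, a) + " vs " + dot(query, b));

    // After the lossy round trip, both vectors land in the same buckets.
    float[] qa = dequantize(quantize(a));
    float[] qb = dequantize(quantize(b));
    System.out.println("quantized scores: " + dot(query, qa) + " vs " + dot(query, qb));
    System.out.println("same buckets:     " + Arrays.equals(quantize(a), quantize(b)));
  }
}
```

In this sketch the two vectors are distinct as floats but identical once quantized, so their scores tie and the final order depends on something other than the score. A test that asserts an exact result order would then plausibly pass or fail depending on how segments happened to be merged, which is consistent with keeping these tests on a FLOAT32 codec.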