keep using LittleEndian if hnsw+pq for backwards compatibility #4348
Given that both compressors (PQ and BQ) may use the same bucket name, is there already validation in place to detect which compressor is being used, so that a change in settings can be handled?
Do we also need to keep LittleEndian for flat+bq compatibility?
Currently, we do not allow switching from one compression method to the other, so there is no need to handle config updates of that kind. We do need to detect the method when the server resumes, and we already have a mechanism to detect the proper compression method.
In the case you pointed out, it is the encoding of the vector values, not the ids, and I don't think we have changed anything in that part. Check here: the compressor code also uses LittleEndian.
I have updated the description to make this clearer.
What's being changed:
Reverting the use of BigEndian in case of HNSW+PQ for backwards compatibility.
Originally, we used LittleEndian everywhere. In 1.23 we introduced flat and flat+bq, changed how the ids are stored, and switched to BigEndian so that we could seek in the files by id correctly. This was necessary for the filter logic. Later, when we unified, we introduced the bug, because hnsw+pq was originally using LittleEndian for the ids. For backward compatibility we therefore cannot use BigEndian for the ids in the hnsw+pq case; the rest should be fine.
Review checklist