Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep using LittleEndian if hnsw+pq for backwards compatibility #4348

Merged
merged 2 commits into from
Feb 29, 2024

Conversation

abdelr
Copy link
Contributor

@abdelr abdelr commented Feb 28, 2024

What's being changed:

Reverting the use of BigEndian in case of HNSW+PQ for backwards compatibility.

Originally, we used LittleEndian everywhere. Then in 1.23 we introduced flat and flat+bq and we changed how the Ids are stored and used BigEndian instead, so we could seek in the files using the Ids correctly. This was necessary for the filter logic. Later, when we unified, we introduced the bug since hnsw+pq was originally using LittleEndian for the Ids. Now, for backward compatibility, we cannot use BigEndian for the Ids in the hnsw+pq case, but the rest should be fine.

Review checklist

  • Documentation has been updated, if necessary. Link to changed documentation:
  • Chaos pipeline run or not necessary. Link to pipeline:
  • All new code is covered by tests where it is reasonable.
  • Performance tests have been run or not necessary.

Copy link

sonarcloud bot commented Feb 28, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@abdelr abdelr self-assigned this Feb 28, 2024
Copy link
Contributor

@jeroiraz jeroiraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given both compressors (PQ and BQ) may be using the same bucket name, is it already in place a validation to detect which compressor is being used? so to handle a change in settings

@trengrj
Copy link
Member

trengrj commented Feb 29, 2024

Do we also need to keep LittleEndian for flat+bq compatibility?

binary.LittleEndian.PutUint64(slice[i*8:], vector[i])

@abdelr
Copy link
Contributor Author

abdelr commented Feb 29, 2024

given both compressors (PQ and BQ) may be using the same bucket name, is it already in place a validation to detect which compressor is being used? so to handle a change in settings

Currently, we do not allow switching from one compression method to the other. No need for handling config updates in such sense. When resuming the server though, we do need, but we have indeed a mechanism to detect the proper compression method.

@abdelr
Copy link
Contributor Author

abdelr commented Feb 29, 2024

Do we also need to keep LittleEndian for flat+bq compatibility?

In the case you pointed out, it is the encoding of the vector values, not the ids. I think, we have not changed anything in this part. Check here the compressor code is using also LittleEndian

@abdelr
Copy link
Contributor Author

abdelr commented Feb 29, 2024

I have updated the description for better understanding.

@parkerduckworth parkerduckworth merged commit b1c603e into stable/v1.24 Feb 29, 2024
32 of 33 checks passed
@parkerduckworth parkerduckworth deleted the fixing_wrong_endian branch February 29, 2024 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants