keep using LittleEndian if hnsw+pq for backwards compatibility #4348

abdelr · 2024-02-28T20:29:50Z

What's being changed:

Reverts the use of BigEndian for the HNSW+PQ case, for backward compatibility.

Originally, we used LittleEndian everywhere. In 1.23 we introduced flat and flat+bq, changed how the IDs are stored, and switched to BigEndian so that we could seek in the files by ID correctly. This was necessary for the filter logic. Later, when we unified, we introduced this bug, because hnsw+pq had originally used LittleEndian for the IDs. For backward compatibility, we therefore cannot use BigEndian for the IDs in the hnsw+pq case; the rest should be fine.

Review checklist

Documentation has been updated, if necessary. Link to changed documentation:
Chaos pipeline run or not necessary. Link to pipeline:
All new code is covered by tests where it is reasonable.
Performance tests have been run or not necessary.

sonarcloud · 2024-02-28T21:47:57Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

jeroiraz

Given that both compressors (PQ and BQ) may be using the same bucket name, is there already validation in place to detect which compressor is being used, so that a change in settings can be handled?

trengrj · 2024-02-29T04:28:23Z

Do we also need to keep LittleEndian for flat+bq compatibility?

weaviate/adapters/repos/db/vector/flat/index.go

Line 202 in 5a02ebb

binary.LittleEndian.PutUint64(slice[i*8:], vector[i])

abdelr · 2024-02-29T08:08:02Z

Given that both compressors (PQ and BQ) may be using the same bucket name, is there already validation in place to detect which compressor is being used, so that a change in settings can be handled?

Currently, we do not allow switching from one compression method to the other, so there is no need to handle config updates in that sense. When the server restarts, though, we do need to know which one is in use, and we already have a mechanism to detect the proper compression method.

abdelr · 2024-02-29T08:15:38Z

Do we also need to keep LittleEndian for flat+bq compatibility?

In the case you pointed out, it is the encoding of the vector values, not the IDs. I don't think we have changed anything in this part. Check here: the compressor code also uses LittleEndian.
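As a rough sketch of the distinction, the snippet below round-trips vector values through LittleEndian in the same way as the line referenced above. The function names `encodeVector` and `decodeVector` are hypothetical and not taken from the codebase. Because these bytes are stored as values and are not used as seek keys, their byte order only has to match between writer and reader.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVector packs each uint64 word of the vector into 8 bytes,
// LittleEndian, mirroring the PutUint64 call quoted in the review.
// (Hypothetical helper, for illustration only.)
func encodeVector(vector []uint64) []byte {
	slice := make([]byte, len(vector)*8)
	for i := range vector {
		binary.LittleEndian.PutUint64(slice[i*8:], vector[i])
	}
	return slice
}

// decodeVector reverses encodeVector. It must use the same byte order
// as the writer, otherwise the decoded words are scrambled.
func decodeVector(data []byte) []uint64 {
	vector := make([]uint64, len(data)/8)
	for i := range vector {
		vector[i] = binary.LittleEndian.Uint64(data[i*8:])
	}
	return vector
}

func main() {
	v := []uint64{0x0102030405060708, 42}
	data := encodeVector(v)
	fmt.Printf("% x\n", data[:8]) // 08 07 06 05 04 03 02 01
	fmt.Println(decodeVector(data))
}
```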

abdelr · 2024-02-29T08:19:29Z

I have updated the description for better understanding.

abdelr added 2 commits February 28, 2024 21:28

keep using LittleEndian if hnsw+pq for backwards compatibility

d6db5b5

renaming and removing the flag in the parameters

22e6eb7

abdelr self-assigned this Feb 28, 2024

jeroiraz reviewed Feb 28, 2024


aliszka approved these changes Feb 29, 2024


amourao mentioned this pull request Feb 29, 2024

Revert id endianness to littleendian on vector cache #4345

Closed

4 tasks

asdine approved these changes Feb 29, 2024


parkerduckworth merged commit b1c603e into stable/v1.24 Feb 29, 2024
32 of 33 checks passed

parkerduckworth deleted the fixing_wrong_endian branch February 29, 2024 21:36
