Enabling BQ compression through update user config #3875

Merged
abdelr merged 110 commits into HNSW_BQ from Allow_BQ_on_UpdateUserConfig
Dec 13, 2023

abdelr
Contributor

@abdelr abdelr commented Dec 8, 2023

What's being changed:

Enabling BQ compression through update user config

Review checklist

  • Documentation has been updated, if necessary. Link to changed documentation:
  • Chaos pipeline run or not necessary. Link to pipeline:
  • All new code is covered by tests where it is reasonable.
  • Performance tests have been run or not necessary.

name: "setting bq compression on",
initial: ent.UserConfig{
	BQ: ent.BQConfig{
		Enabled: true,
Member

Should `Enabled` be false initially?

Comment on lines -65 to -67
if err := h.commitLog.AddPQ(h.compressor.ExposeFields()); err != nil {
	return errors.Wrap(err, "Adding PQ to the commit logger")
}
Member

Is this no longer needed? If so, ExposeFields() could be removed from the VectorCompressor interface, as this seems to be its only usage.

Comment on lines 28 to 30
h.shardedNodeLocks.RLock(0)
node := h.nodes[0]
h.shardedNodeLocks.RUnlock(0)
Member

Could you please merge master into the HNSW_BQ branch? I believe this code has already changed there.

Comment on lines +47 to 53
cleanData := make([][]float32, 0, len(data))
for _, point := range data {
	if point == nil {
		continue
	}
	cleanData = append(cleanData, point)
}
Member

@aliszka aliszka Dec 12, 2023

As I understand it, the index of a vector in the cache is the vector's id. Here all empty elements are removed, but as a result the indexes change. Is this change safe? Are we sure that indexes are not lost or mixed up later on?
(I see that the compressor's cache is grown to the size of the cleaned-up slice, but that may be too small for the highest ids if some empty elements were indeed removed.)

Contributor Author

cleanData is only used for fitting KMeans. Later, the points are added using data instead of cleanData, and this is when they are added to the cache. The two slices are needed because KMeans expects a nil-free array. The code is safe because the id is only needed for the cache, where we use data and not cleanData.
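
For readers following the thread, the distinction between the two slices can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `filterNil` and `buildCache` are hypothetical names, and the plain slice used as a cache is only an assumed stand-in for the compressor's cache. It shows that dropping nil entries for training shifts positions in the filtered slice, while a cache filled from the original slice keeps index == id.

```go
package main

import "fmt"

// filterNil returns only the non-nil vectors. The result is meant as
// training input (e.g. for fitting KMeans); its positions no longer
// correspond to vector ids.
func filterNil(data [][]float32) [][]float32 {
	clean := make([][]float32, 0, len(data))
	for _, point := range data {
		if point == nil {
			continue
		}
		clean = append(clean, point)
	}
	return clean
}

// buildCache is a hypothetical stand-in for an id-indexed cache.
// It is sized and filled from the original slice, so the slot for
// id i is always cache[i], even when earlier ids are empty.
func buildCache(data [][]float32) [][]float32 {
	cache := make([][]float32, len(data))
	for id, point := range data {
		if point == nil {
			continue
		}
		cache[id] = point
	}
	return cache
}

func main() {
	data := [][]float32{{1, 0}, nil, {0, 1}, {1, 1}}

	clean := filterNil(data)  // 3 vectors, positions shifted
	cache := buildCache(data) // 4 slots, id 3 still at index 3

	fmt.Println(len(clean), len(cache), cache[3]) // prints "3 4 [1 1]"
}
```

In this sketch the vector with id 3 sits at position 2 in the filtered slice but stays at index 3 in the cache, which is why the filtered slice must not be used for anything keyed by id.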

@abdelr abdelr requested a review from a team as a code owner December 13, 2023 10:05
@abdelr abdelr merged commit f9e6cb7 into HNSW_BQ Dec 13, 2023
7 of 15 checks passed
@abdelr abdelr deleted the Allow_BQ_on_UpdateUserConfig branch December 13, 2023 10:07
sonarcloud bot commented Dec 13, 2023

Quality Gate failed

Failed conditions

6.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

10 participants