what does parameter "PQ_disk_bytes" mean? #14

Closed
HannanKan opened this issue Sep 7, 2021 · 4 comments

Comments


HannanKan commented Sep 7, 2021

I would appreciate it if you could tell me what the "PQ_disk_bytes" parameter controls. The only hint in the source code is "(for very large dimensionality, use 0 for full vectors)", and I still don't understand it.

HannanKan changed the title from "what does parameter PQ_disk_bytes" to "what does parameter "PQ_disk_bytes" mean?" on Sep 7, 2021
Contributor

ShikharJ commented Sep 7, 2021

@harsha-simhadri

@ShikharJ
Contributor

@rakri Would you be able to provide some insight?

Contributor

rakri commented Sep 16, 2021

For very high-dimensional datasets (say, more than 1024 dimensions), DiskANN as currently implemented will crash, because the contents of a single node exceed the 4KB disk sector. In such situations (or when you want a lower disk footprint), we can store a separate compressed representation in the disk index file. PQ_disk_bytes is the number of bytes per vector that you want this representation to occupy on disk. If you specify 0, the vectors are stored in full precision, occupying dim*sizeof(data-type) bytes per point.
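The arithmetic behind this can be sketched roughly as follows. This is an illustrative Python calculation, not DiskANN code: the function names, the `max_degree` default, and the node layout (vector, then a 4-byte neighbor count, then 4-byte neighbor ids) are assumptions made for the example. Only the 4KB sector size and the dim*sizeof(data-type) rule come from the explanation above.

```python
SECTOR_BYTES = 4096  # the 4KB disk sector mentioned above

# Element sizes in bytes for a few common data types (assumed for this sketch).
TYPE_SIZES = {"float": 4, "int8": 1, "uint8": 1}


def vector_bytes_on_disk(dim, data_type, pq_disk_bytes=0):
    """Bytes one vector occupies in the disk index file.

    pq_disk_bytes == 0 means full precision: dim * sizeof(data_type).
    Any positive value is the size of the compressed representation.
    """
    if pq_disk_bytes < 0:
        raise ValueError("pq_disk_bytes must be >= 0")
    if pq_disk_bytes == 0:
        return dim * TYPE_SIZES[data_type]
    return pq_disk_bytes


def fits_in_sector(dim, data_type, pq_disk_bytes=0, max_degree=64):
    """Check whether one node fits in a single sector.

    Hypothetical node layout: vector + 4-byte neighbor count
    + max_degree neighbor ids of 4 bytes each.
    """
    vec = vector_bytes_on_disk(dim, data_type, pq_disk_bytes)
    node_bytes = vec + 4 + 4 * max_degree
    return node_bytes <= SECTOR_BYTES


if __name__ == "__main__":
    # 128-dim floats at full precision: small enough for one sector.
    print(vector_bytes_on_disk(128, "float"), fits_in_sector(128, "float"))
    # 2048-dim floats at full precision: the vector alone is two sectors.
    print(vector_bytes_on_disk(2048, "float"), fits_in_sector(2048, "float"))
    # Same data with a 256-byte compressed representation on disk.
    print(fits_in_sector(2048, "float", pq_disk_bytes=256))
```

Under these assumptions, a full-precision 2048-dimensional float vector needs 8192 bytes and cannot fit in a sector, while a 256-byte compressed representation leaves plenty of room for the neighbor list.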

@HannanKan
Author

Got it. Thank you!
