ProductQuantizer compute_codes get wrong codes when nbits not 8 #2285

Closed
jasstionzyf opened this issue Apr 4, 2022 · 2 comments

jasstionzyf commented Apr 4, 2022

import faiss
import numpy as np

d = 10      # vector dimension
n = 400000  # number of training vectors
cs = 5      # number of sub-quantizers
nbits = 6   # bits per sub-quantizer code
np.random.seed(123)
x = np.random.random(size=(n, d)).astype('float32')
test_inputs = np.random.random(size=(1, d)).astype('float32')
print(test_inputs)
pq = faiss.ProductQuantizer(d, cs, nbits)
pq.verbose = True
pq.train(x)
codes = pq.compute_codes(test_inputs)
# Expected 5 codes, each in the range 0-63, but got 4 values that are not in that range
print(codes.shape)

mdouze added the question label Apr 4, 2022

mdouze commented Apr 4, 2022

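This is expected: the 5 codes of 6 bits each are packed into ceil(5 * 6 / 8) = 4 bytes, so the returned array has shape (1, 4) and its bytes are not the individual codes.
To access the individual codes, use BitstringReader: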

bs = faiss.BitstringReader(faiss.swig_ptr(codes[0]), codes.shape[1])
for i in range(cs): 
    print(bs.read(6))  # read 6 bits at a time

Admittedly, the BitstringReader API could be made more Python-friendly.
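
For reference, here is a rough pure-Python sketch of the same unpacking without BitstringReader. It assumes the codes are packed least-significant-bit first, continuing across byte boundaries, which is how I understand the Faiss layout; please check it against the BitstringReader output before relying on it. `code_size_bytes`, `pack_codes` and `unpack_codes` are hypothetical helper names, not part of Faiss.

```python
import math


def code_size_bytes(m, nbits):
    """Number of bytes needed to store m codes of nbits bits each."""
    return math.ceil(m * nbits / 8)


def unpack_codes(packed, m, nbits):
    """Split a packed byte sequence into m integer codes of nbits bits.

    Assumption: bits are stored least-significant-bit first, so the
    whole byte string can be read as one little-endian integer.
    """
    value = int.from_bytes(bytes(packed), "little")
    mask = (1 << nbits) - 1
    return [(value >> (i * nbits)) & mask for i in range(m)]


def pack_codes(codes, nbits):
    """Inverse of unpack_codes, used here only to illustrate the layout."""
    value = 0
    for i, c in enumerate(codes):
        value |= c << (i * nbits)
    return value.to_bytes(code_size_bytes(len(codes), nbits), "little")


if __name__ == "__main__":
    example = [1, 63, 0, 42, 7]       # five 6-bit codes
    packed = pack_codes(example, 6)   # 4 bytes, as in the issue
    print(len(packed), unpack_codes(packed, 5, 6))
```

With the script from the issue, `unpack_codes(codes[0], cs, 6)` should give the same five values as the BitstringReader loop, provided the layout assumption holds.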

@jasstionzyf

@mdouze thanks very much!
