MultiProbe for L2 Similarity #73
Comments
Some rough ideas on how to implement this:
- Keep the same L2Lsh Mapping structure. This seems to be strictly a query-time optimization.
- Add a parameter
- Hashing function will generate the additional
- Otherwise the query looks exactly the same. It takes the generated hashes and goes on to look them up the same as any other approx. query.
- Implement the naive version first (enumerate, score, sort all possible perturbations). Make sure that you can get equivalent recall on SIFT with fewer tables and ideally, but not necessarily, shorter query time.
- Then go back and implement the optimized version, which precomputes perturbation sets, estimates the scores, etc. Make sure the optimized version matches the naive version.
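To make the "enumerate, score, sort" step concrete, here is a minimal Python sketch for a single hash table. It is not the project's implementation. It assumes the usual L2 hash `floor((a·v + b) / w)` and scores a perturbation by the sum of squared distances from the query's projection to the bucket boundaries it crosses, as in the multi-probe paper linked in this issue. The names `naive_probes`, `projections`, `w`, and `max_probes` are hypothetical.

```python
import itertools
import math


def naive_probes(projections, w, max_probes):
    """Naively generate probe hashes for one table (illustrative sketch).

    projections[i] is a_i . q + b_i for the i-th hash function (assumed input).
    Returns up to max_probes hash tuples, lowest perturbation score first.
    The first entry is always the unperturbed hash of the query.
    """
    base = [math.floor(f / w) for f in projections]
    # Distance from each projection to its lower (-1) and upper (+1) bucket boundary.
    lower = [f - h * w for f, h in zip(projections, base)]
    upper = [w - x for x in lower]

    scored = []
    # Enumerate all 3^k perturbation vectors with entries in {-1, 0, +1}.
    for deltas in itertools.product((-1, 0, 1), repeat=len(base)):
        score = sum(
            (lower[i] if d == -1 else upper[i]) ** 2
            for i, d in enumerate(deltas)
            if d != 0
        )
        scored.append((score, deltas))

    # Sort every perturbation by score and keep the best ones.
    scored.sort()
    return [
        tuple(h + d for h, d in zip(base, deltas))
        for _, deltas in scored[:max_probes]
    ]
```

The cost is exponential in the number of hash functions per table (3^k candidates), which is why this only serves as a baseline to validate a faster version against.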
Implemented the naive version in #123. As far as I can tell, the naivety is not a bottleneck in the current benchmarking configuration ( Even with So for now there are likely more worthwhile things to implement than the optimized version.
🎆 🥳 🎈 Implemented Algorithm 1 from Qin Lv et al. in #124. This generates the perturbation sets iteratively instead of generating them exhaustively and sorting all of them. I'd say this is good enough for now. There doesn't seem to be a need to implement the expected scores optimization from section 4.5, and it's also not obvious to me why/how it works.
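For readers who have not seen the iterative generation, the following Python sketch shows the general idea as I understand it from the paper: sort the 2k boundary distances, then grow perturbation sets from a min-heap using "shift" and "expand" operations, skipping sets that perturb the same hash function twice. This is an illustration under those assumptions, not the code from #124. The names `iterative_probes`, `projections`, `w`, and `max_probes` are hypothetical.

```python
import heapq
import math


def iterative_probes(projections, w, max_probes):
    """Generate probe hashes in score order using a min-heap (illustrative sketch).

    projections[i] is a_i . q + b_i for the i-th hash function (assumed input).
    """
    base = [math.floor(f / w) for f in projections]
    # All 2k (distance, hash function index, delta) entries, nearest boundary first.
    z = sorted(
        entry
        for i, (f, h) in enumerate(zip(projections, base))
        for entry in ((f - h * w, i, -1), (w - (f - h * w), i, +1))
    )

    def score(s):
        # Sum of squared boundary distances for the indices in perturbation set s.
        return sum(z[j][0] ** 2 for j in s)

    def valid(s):
        # A set is invalid if it perturbs the same hash function more than once.
        coords = [z[j][1] for j in s]
        return len(coords) == len(set(coords))

    out = [tuple(base)]  # the unperturbed hash always comes first
    heap = [(score((0,)), (0,))] if z else []
    while heap and len(out) < max_probes:
        _, s = heapq.heappop(heap)
        if valid(s):
            probe = list(base)
            for j in s:
                _, i, d = z[j]
                probe[i] += d
            out.append(tuple(probe))
        nxt = s[-1] + 1
        if nxt < len(z):
            shifted = s[:-1] + (nxt,)  # shift: replace the largest index with the next one
            expanded = s + (nxt,)      # expand: add the next index
            heapq.heappush(heap, (score(shifted), shifted))
            heapq.heappush(heap, (score(expanded), expanded))
    return out
```

Only as many sets as needed are popped from the heap, so the work scales with the number of probes requested rather than with all 3^k possible perturbations.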
This is the original paper: https://www.cs.princeton.edu/cass/papers/mplsh_vldb07.pdf
These lectures (16 and 17) should also be useful: https://www.youtube.com/watch?v=c5DHtx5VxX8
There was an unmerged PR in the main ES repo implementing multiprobe: https://github.com/elastic/elasticsearch/pull/44374/files