Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a dynamic LSH table #2

Open
ludwigschmidt opened this issue Dec 10, 2015 · 4 comments
Open

Add a dynamic LSH table #2

ludwigschmidt opened this issue Dec 10, 2015 · 4 comments
Assignees

Comments

@ludwigschmidt
Copy link
Collaborator

No description provided.

@ludwigschmidt ludwigschmidt self-assigned this Dec 10, 2015
@hxwu
Copy link

hxwu commented Nov 21, 2016

Excuse me, could you please show me how to add a dynamic LSH table?

@hxwu
Copy link

hxwu commented Nov 21, 2016

I just want to insert or remove some record from the established LSH table

@ludwigschmidt
Copy link
Collaborator Author

Hi hxwu,

updates are supported in some of the core classes, in particular the DynamicLinearProbingHashTable and the DynamicCompositeHashTable:

https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/probing_hash_table.h
https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/composite_hash_table.h

What we need next is a dynamic version of the StaticLSHTable in

https://github.com/FALCONN-LIB/FALCONN/blob/master/src/include/falconn/core/lsh_table.h

We should be able to re-use quite a bit of code here. The main difference is that the static LSH table always assumes that the key range is between 0 and data_set_size - 1. This allows us to simply keep a large array for checking whether a certain candidate already appears in the answer for the current query. If the key range is arbitrary, we need a less specialized set data structure here (a hash table would probably be the right thing). Overall it should not be a big performance issue, but it's good to get it right.

The code is organized so that the query objects in the LSHTable are only responsible for producing a list of candidates. The actual distances are then computed in the NearestNeighborQuery class:

https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/nn_query.h

Two questions about your application:

  • Are you planning to use the C++ or Python interface?
  • How is your dynamic point set stored? Would it make sense to always have the point indices in a small range (say 0 to some_constant * largest_point_set_size)?

Cheers,
Ludwig

@pengfu008
Copy link

Insert and remove record is very important in my application, I need to dynamic update LSH table, I use python interface, Do you have the plan to support it? @ludwigschmidt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants