do search while at the same time adding incrementally #492

Closed
KangRinpoche opened this issue Jun 14, 2018 · 4 comments

@KangRinpoche

KangRinpoche commented Jun 14, 2018

Hi,
I have 100 million 128-d vectors in my index, already trained and added,
and I'm going to add around half a million more vectors every 4 hours.
Can I search the index while new vectors are being added at the same time?
Thanks for your help!

@mdouze
Contributor

mdouze commented Jun 14, 2018

Hi,
No, you can't; see:

https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls

Your options are, from simplest to most complex:

  • lock the index against searches for the duration of the add.

  • perform the add in a separate temp index (trained in the same way as the main index), then merge the temp index into the main index (see https://github.com/facebookresearch/faiss/wiki/Special-operations-on-indexes#splitting-and-merging-indexes). The index will still be unavailable for search during the merge, but the downtime will be shorter.

  • keep serving searches from the live index during the 4 hours; meanwhile, copy it to an offline index and add the new vectors to that copy, then swap the two indexes every 4 hours. No downtime, but the index is stored twice.
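The first and third options can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not faiss code: `ToyIndex`, `LockedIndex`, and `SwappingIndex` are hypothetical names, and `ToyIndex` is a brute-force stand-in for a real index. The sketch assumes that copying the live index is a read-only operation that may overlap with searches; with faiss, one would presumably pass `faiss.clone_index` as the `clone` argument.

```python
import copy
import threading


class ToyIndex:
    """Hypothetical stand-in for a real index (brute-force L2 search)."""

    def __init__(self):
        self.vectors = []

    @property
    def ntotal(self):
        return len(self.vectors)

    def add(self, batch):
        self.vectors.extend(batch)

    def search(self, query, k):
        # Positions of the k stored vectors closest to `query`.
        def dist(v):
            return sum((a - b) ** 2 for a, b in zip(v, query))

        order = sorted(range(len(self.vectors)), key=lambda i: dist(self.vectors[i]))
        return order[:k]


class LockedIndex:
    """Option 1: one lock shared by add and search, so they never overlap.

    Note that this simple lock also serializes searches with each other.
    """

    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()

    def add(self, batch):
        with self._lock:  # searches wait until the add has finished
            self._index.add(batch)

    def search(self, query, k):
        with self._lock:
            return self._index.search(query, k)


class SwappingIndex:
    """Option 3: add to an offline copy, then swap it in."""

    def __init__(self, index, clone=copy.deepcopy):
        self._live = index
        self._clone = clone
        self._writer_lock = threading.Lock()  # one writer at a time

    def search(self, query, k):
        live = self._live  # snapshot the reference; the live index is never mutated
        return live.search(query, k)

    def add_and_swap(self, batch):
        with self._writer_lock:
            offline = self._clone(self._live)  # second copy: memory use doubles
            offline.add(batch)                 # searches keep using the old index
            self._live = offline               # swap by rebinding the reference


main = ToyIndex()
main.add([(0.0, 0.0), (1.0, 1.0)])
serving = SwappingIndex(main)
serving.add_and_swap([(5.0, 5.0)])
print(serving.search((5.0, 5.0), 1))  # [2]
print(main.ntotal)                    # 2 (the old index was left untouched)
```

The swap variant trades memory for availability: searches never wait, but two full copies of the index exist while the offline copy is being updated.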

@KangRinpoche
Author

KangRinpoche commented Jun 14, 2018

Thanks for these helpful options, @mdouze!
As far as I understand, faiss assigns vector IDs in the order the vectors were added.
E.g., if vector_a was the 1000th vector added, it gets ID No. 1000 in the faiss index.
So for the second option, "perform the add in a separate temp index", should I worry about the ID numbers?
E.g., if the main index has 1000 vectors, will the 1st vector of the temp index get ID No. 1001 in the final index?
Thanks!
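To make the numbering in this question concrete: faiss sequential IDs start at 0, so a main index with 1000 vectors holds IDs 0 to 999, and the first vector coming from the temp index would become ID 1000 (the 1001st vector) if the merge shifts the temp index's IDs by the main index's size. The faiss `merge_from` method on IVF indexes takes an `add_id` offset for this purpose. The snippet below is only a pure-Python model of that arithmetic; `merged_ids` is a hypothetical helper, not a faiss function.

```python
def merged_ids(main_ntotal, temp_ntotal):
    """IDs the temp index's vectors would carry after a merge.

    Models sequential 0-based IDs with an offset equal to the size of
    the main index (the value one would pass as the merge offset).
    """
    add_id = main_ntotal  # offset added to every ID coming from the temp index
    return [add_id + local_id for local_id in range(temp_ntotal)]


ids = merged_ids(main_ntotal=1000, temp_ntotal=3)
print(ids)  # [1000, 1001, 1002]
```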

@mdouze
Contributor

mdouze commented Jul 6, 2018

No activity, closing.

mdouze closed this as completed Jul 6, 2018
@xukefang

@KangRinpoche
Just pass the ID numbers you need to add_with_ids.
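One way to follow this suggestion is to keep a single counter that hands out explicit IDs for each 4-hourly batch, so the IDs stay unique no matter which index (main or temp) the batch is added to. The sketch below uses only the standard library; `IdAllocator` and `reserve` are hypothetical names, and the starting value of 1000 is just the example from the question above.

```python
import threading


class IdAllocator:
    """Hands out consecutive, never-reused integer IDs for new batches."""

    def __init__(self, next_id=0):
        self._next_id = next_id
        self._lock = threading.Lock()  # safe if several threads prepare batches

    def reserve(self, count):
        # Atomically claim `count` IDs and return them as a list.
        with self._lock:
            start = self._next_id
            self._next_id += count
        return list(range(start, start + count))


allocator = IdAllocator(next_id=1000)  # main index already holds IDs 0..999
batch_ids = allocator.reserve(3)
print(batch_ids)  # [1000, 1001, 1002]

# With faiss, the IDs would be passed as an int64 array alongside the
# vectors, e.g. (assuming numpy as np and a batch array xb):
#   index.add_with_ids(xb, np.asarray(batch_ids, dtype="int64"))
```

Persisting the counter's value alongside the index would be needed so that IDs continue from the right place after a restart.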
