Faster Imports and better import DX: Async vector index building #3463

Closed
etiennedi opened this issue Aug 27, 2023 · 1 comment

Comments

@etiennedi
Member

At the moment, a batch request is fully synchronous: it only completes once every object in the batch has been fully indexed.

Instead, it should be possible to import asynchronously, so that only the object storage (and possibly the inverted index) is written synchronously. Vector indexing then happens asynchronously, in larger and better-organized batches.

This should lead to:

  • Better CPU utilization, because the import queue does not run dry
  • Less lock contention, since some operations can run in larger batches
  • Fewer import-vs-delete races and less contention, i.e. the whole delete re-build logic can be rewritten
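As a rough illustration of the proposed split, the sketch below writes objects synchronously and hands vector indexing to a background worker that drains a queue in batches. It is a minimal Python sketch under stated assumptions, not the actual implementation: `AsyncImporter`, `import_batch`, `wait_until_indexed`, and the plain dict standing in for the vector index are all hypothetical names chosen for this example.

```python
import queue
import threading


class AsyncImporter:
    """Hypothetical sketch: sync object write, async batched vector indexing."""

    def __init__(self, batch_size=4):
        self.object_store = {}   # written synchronously on import
        self.vector_index = {}   # stand-in for a real vector index
        self._queue = queue.Queue()
        self._batch_size = batch_size
        worker = threading.Thread(target=self._index_loop, daemon=True)
        worker.start()

    def import_batch(self, objects):
        """Store objects right away; only enqueue their ids for indexing."""
        for obj_id, vector in objects:
            self.object_store[obj_id] = vector
            self._queue.put(obj_id)
        # Returns as soon as the objects are stored, before they are indexed.

    def _index_loop(self):
        while True:
            # Block for the first id, then greedily fill up the batch.
            batch = [self._queue.get()]
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            # A real index could take its write lock once per batch here.
            for obj_id in batch:
                self.vector_index[obj_id] = self.object_store[obj_id]
                self._queue.task_done()

    def wait_until_indexed(self):
        """Block until every enqueued object has been indexed."""
        self._queue.join()
```

With this shape, `import_batch` returns once the object store is updated, and a caller that needs the index to be complete calls `wait_until_indexed()` explicitly.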

To avoid the negative user impact of the index not being available immediately, we need to make sure that:

  • If feasible, a search during an import uses the partially built index and additionally runs a brute-force search over the objects that are not yet indexed
  • It is easy for the user to see whether an index is still being built and when it becomes ready. Clients should offer methods to wait until indexing is complete.
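The two requirements above could be sketched as follows: merge the top results from the partially built index with a brute-force scan over the objects still waiting to be indexed, and give clients a polling helper that waits until nothing is pending. This is only an illustration. `hybrid_search`, `wait_until_ready`, the `index_search` callback, and the `get_pending_count` callback are hypothetical names, and Euclidean distance is an assumed metric.

```python
import heapq
import math
import time


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_search(query, k, index_search, unindexed):
    """Combine results from a partial index with a brute-force scan.

    index_search(query, k) is assumed to return a list of (distance, id)
    pairs from the partially built index; unindexed maps id -> vector for
    objects that are stored but not yet indexed.
    """
    from_index = list(index_search(query, k))
    from_scan = [(l2(query, vec), obj_id) for obj_id, vec in unindexed.items()]
    # Keep the k closest candidates across both sources.
    return heapq.nsmallest(k, from_index + from_scan)


def wait_until_ready(get_pending_count, timeout_s=5.0, poll_s=0.01):
    """Poll until no objects are pending; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_pending_count() == 0:
            return True
        time.sleep(poll_s)
    return False
```

The brute-force part costs time linear in the number of pending objects, so this only stays cheap while the indexing queue is short relative to the indexed set.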
@etiennedi
Member Author

Delivered in 1.23
