dense_vector data type in OpenSearch built on Lucene vector field #39

Closed
vamshin opened this issue Jun 9, 2021 · 8 comments

@vamshin
Member

vamshin commented Jun 9, 2021

Is your feature request related to a problem? Please describe.
OpenSearch currently depends on the knn_vector field, which is part of the k-NN plugin, for any vector operations.

Describe the solution you'd like
Introduce a new vector field type, dense_vector, available in OpenSearch itself, that piggybacks on the Lucene vector field available from Lucene 9.

Describe alternatives you've considered
None

Additional context
None

@jmazanec15
Member

This will be great. I would be interested to get feedback about moving the painless logic and some (or all) of the custom scoring to OpenSearch as well once Lucene's dense vector is available in OpenSearch.

I think if we were to do that, this plugin could just focus on using native libraries for ANN search.

@codebrain

Lucene 9 released yesterday: https://lucene.apache.org/core/corenews.html#apache-lucenetm-900-available
I imagine this GitHub issue is a prerequisite: opensearch-project/OpenSearch#1065

@jmazanec15
Member

Hi @codebrain, we are looking into this. Will keep this thread updated.

@codebrain

Thank you @jmazanec15, is vector search still targeted for the OpenSearch 2.0 release?

@troilus-canva

I think Lucene 9 was just merged 3 days ago:
opensearch-project/OpenSearch#1065

@vamshin
Member Author

vamshin commented Mar 25, 2022

Hi @codebrain @troilus-canva, we are still evaluating Lucene-based vector search, which uses the HNSW algorithm, against the k-NN plugin's native (C++) vector search, which uses the same HNSW algorithm. Our plan is to make this feature available if the Lucene-based solution does as well as or better than the current solution in terms of indexing latency, search latency, and recall. This is to avoid the duplication of having two solutions for the same problem.

We currently do not plan to make this Lucene-based solution part of the 2.0 release. By the time of the 2.1 release we will have clarity on the performance results, and we will then decide among the following options:
-> Have both solutions
-> Have Lucene only
-> Have native (C++) only
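
For readers following the comparison, here is a rough sketch of how recall is commonly measured for approximate nearest-neighbor search: the ids returned by the approximate index are compared against a brute-force exact top-k for the same query. This is only an illustration under assumptions; the function names `exact_top_k` and `recall_at_k` are hypothetical and are not part of the k-NN plugin or Lucene, and Euclidean distance is assumed as the metric.

```python
import math

def exact_top_k(vectors, query, k):
    """Brute-force ground truth: ids of the k vectors closest to query (Euclidean)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbors that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 0.0
    return len(set(approx_ids) & exact) / len(exact)

# Hypothetical data: four 2-d vectors and a query at the origin.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
truth = exact_top_k(vectors, (0.0, 0.0), k=2)

# Pretend an approximate index returned ids 0 and 2 for the same query.
print(recall_at_k([0, 2], truth))
```

Averaging this value over a set of test queries gives a single recall figure that can be compared across the two implementations alongside indexing and search latency.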

@martin-gaievski
Member

Published RFC 3545, which aims to implement the changes requested here.

@martin-gaievski
Member

Done in the k-NN plugin under the existing knn_vector type; more details in #380.
