Integrate weaviate as another DocumentStore #957

Closed
lalitpagaria opened this issue Apr 10, 2021 · 10 comments
Labels
Contributions wanted! Looking for external contributions type:feature New feature or request

Comments

@lalitpagaria
Contributor

Is your feature request related to a problem? Please describe.
Haystack already supports vector search via FAISS and Milvus. In both of these solutions, the documents/data reside in a SQL store.
So the main idea is: what if we keep the data and the embeddings close to each other, which is what Weaviate does? (Yes, Elasticsearch has this capability as well, but it is not performant.) That would reduce the number of network calls.
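
To make the network-call argument concrete, here is a minimal in-memory sketch. Both classes (`SplitStore`, `ColocatedStore`) and the `round_trips` counter are hypothetical stand-ins invented for illustration; they are not Haystack, FAISS, Milvus, or Weaviate APIs. The sketch only counts how many backend hops a query would need under each layout.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SplitStore:
    """Hypothetical layout: vectors in one backend, texts in another."""

    def __init__(self):
        self.vectors = {}      # stand-in for the vector index
        self.texts = {}        # stand-in for the SQL document store
        self.round_trips = 0   # counts simulated backend calls

    def write(self, doc_id, text, embedding):
        self.vectors[doc_id] = embedding
        self.texts[doc_id] = text

    def query(self, embedding, top_k=1):
        # Hop 1: ask the vector index for the ids of the nearest vectors.
        self.round_trips += 1
        ranked = sorted(
            self.vectors,
            key=lambda i: cosine(embedding, self.vectors[i]),
            reverse=True,
        )
        ids = ranked[:top_k]
        # Hop 2: fetch the matching texts from the separate document store.
        self.round_trips += 1
        return [self.texts[i] for i in ids]


class ColocatedStore:
    """Hypothetical layout: text and vector stored together per object."""

    def __init__(self):
        self.objects = {}      # doc_id -> (text, embedding)
        self.round_trips = 0

    def write(self, doc_id, text, embedding):
        self.objects[doc_id] = (text, embedding)

    def query(self, embedding, top_k=1):
        # Single hop: the nearest objects already carry their text.
        self.round_trips += 1
        ranked = sorted(
            self.objects.values(),
            key=lambda obj: cosine(embedding, obj[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]
```

With the same documents written to both, a query returns the same texts, but the split layout records two simulated hops per query and the co-located one records a single hop. Real latency will of course depend on the actual backends.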

Describe the solution you'd like
What about integrating Weaviate as another document store?

Describe alternatives you've considered
I thought about using FAISS as the embedding store and RocksDB as the document store (keeping only the vectorId-to-text mapping). I am sure this would beat many systems, but it would not be as customisable as other solutions :)
Also, making it distributed would be a challenge, as would adding filter queries.

Additional context
I feel it would be easier to integrate via the Python bindings, as was done in the case of Milvus. Everything would go through the GraphQL API interface.

@lalitpagaria lalitpagaria added the type:feature New feature or request label Apr 10, 2021
@tholor tholor added the Contributions wanted! Looking for external contributions label Apr 20, 2021
@venuraja79
Contributor

@lalitpagaria
Contributor Author

lalitpagaria commented Apr 30, 2021

@venuraja79 Would you like to contribute and create a PR?
You can check ElasticsearchDocumentStore, which uses the elasticsearch client, or MilvusDocumentStore, which uses the milvus client, as sample integrations with Haystack.

Obviously, the Haystack community can support you in this journey.

@venuraja79
Contributor

Sure @lalitpagaria. I have just started reviewing the Weaviate docs.

@venuraja79
Contributor

A few design decisions:

  1. Haystack index == Weaviate class
  2. Haystack Document meta (dict): to be stored as a property in Weaviate
  3. text2vec-transformers to create the vectors; this will be configurable, though

Just a quick update:
I have made some progress on creating the schema, writing docs, and querying the system. With this, I can start implementing the document store and will post further progress here.
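
The three design decisions above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the helper names (`index_to_class_name`, `build_class_schema`, `document_to_object`) are hypothetical, the dictionary keys follow the shape of a Weaviate class schema as I understand it and should be checked against the Weaviate version in use, and serialising `meta` to a JSON string is just one possible way to fit a dict into a single property.

```python
import json


def index_to_class_name(index):
    """Map a Haystack index name to a Weaviate-style class name.

    Assumption: class names are expected to start with an uppercase letter.
    """
    return index[:1].upper() + index[1:]


def build_class_schema(index, vectorizer="text2vec-transformers"):
    """Build a class schema dict for one Haystack index (decisions 1 and 3).

    The vectorizer is a parameter so that it stays configurable.
    """
    return {
        "class": index_to_class_name(index),
        "vectorizer": vectorizer,
        "properties": [
            {"name": "text", "dataType": ["text"]},
            # Assumption: meta is kept as one JSON-encoded text property.
            {"name": "meta", "dataType": ["text"]},
        ],
    }


def document_to_object(text, meta):
    """Turn a document's text and meta dict into a flat object (decision 2)."""
    return {"text": text, "meta": json.dumps(meta, sort_keys=True)}


def object_to_meta(obj):
    """Recover the original meta dict from a stored object."""
    return json.loads(obj["meta"])
```

A JSON-encoded `meta` keeps the schema fixed regardless of which keys a document carries, at the cost of not being able to filter on individual meta fields; storing each meta key as its own property would be the alternative trade-off.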

@lalitpagaria
Contributor Author

Awesome! @venuraja79
Can you please create a WIP PR so people can review it and give early feedback on the design?

@venuraja79
Contributor

venuraja79 commented May 16, 2021

All - I have raised a WIP PR. The write, get, and query methods have been tested offline. I'll create automated tests and post an update in the next iteration.
A few design questions and the dev status (pending items, etc.) are in the PR itself. Please feel free to review.

#1064

@LarsAC

LarsAC commented May 16, 2021

Great idea. Weaviate and Haystack look like a good fit. Happy to support / test if help is needed.

@LarsAC

LarsAC commented May 19, 2021

Confirmed working for a simple scenario, very nice. If there is something specific you would like me to test, please let me know.

@venuraja79
Contributor

Hi @LarsAC, thanks for your help earlier. We have made a few design changes since the last version and have updated the code & tests. Except for the query and update-embeddings methods, the others have been validated. Any review / tests from your side would be great when you get a chance.

@tholor
Member

tholor commented Jun 15, 2021

Implemented in #1064

@tholor tholor closed this as completed Jun 15, 2021
4 participants