Integrate weaviate as another DocumentStore #957
Comments
Found this notebook that uses the Python client to connect to Weaviate: https://github.com/semi-technologies/Getting-Started-With-Weaviate-Python-Client/blob/main/Getting-Started-With-Weaviate-Python-Client.ipynb
@venuraja79 Would you like to contribute and create a PR? The Haystack community can of course support you in this journey.
Sure, @lalitpagaria. I've just started reviewing the Weaviate docs.
Few design decisions -
Just a quick update -
Awesome! @venuraja79
All - I've raised a WIP PR. The write, get and query methods have been tested offline. I'll create automated tests and post an update during the next iteration.
Great idea. Weaviate and Haystack look like a good fit. Happy to support / test if help is needed.
Confirmed working for a simple scenario, very nice. If there is something specific you would like me to test, please let me know.
Hi @LarsAC, thanks for your help earlier. We have made a few design changes since the last version and have updated the code and tests. Everything except the query and update-embeddings methods has been validated. Any review or testing from your side would be great when you get a chance.
Implemented in #1064
Is your feature request related to a problem? Please describe.
Haystack already supports vector search via FAISS and Milvus. In both of these solutions, the documents/data reside in a SQL store.
So the main idea is: what if the data and the embeddings were stored close to each other, as Weaviate does? (Elasticsearch has this capability too, but it is not performant.) This would reduce the number of network calls.
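To make the network-call argument concrete, here is a minimal in-memory sketch. It is not Haystack or Weaviate code: `SplitStore`, `CoLocatedStore` and the `round_trips` counter are hypothetical names, and each counter increment just stands in for one call to a backend.

```python
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


class SplitStore:
    """Hypothetical model of the FAISS/Milvus + SQL layout:
    vectors live in one backend, document text in another."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}  # stands in for the vector index
        self.texts: Dict[str, str] = {}            # stands in for the SQL store
        self.round_trips = 0

    def write(self, doc_id: str, text: str, emb: List[float]) -> None:
        self.vectors[doc_id] = emb
        self.texts[doc_id] = text

    def query(self, emb: List[float], top_k: int = 1) -> List[str]:
        self.round_trips += 1  # call 1: ask the vector index for the best ids
        ids = sorted(self.vectors, key=lambda i: dot(self.vectors[i], emb),
                     reverse=True)[:top_k]
        self.round_trips += 1  # call 2: fetch the matching rows from SQL
        return [self.texts[i] for i in ids]


class CoLocatedStore:
    """Hypothetical model of a store that keeps text and embedding together."""

    def __init__(self) -> None:
        self.objects: Dict[str, Tuple[str, List[float]]] = {}
        self.round_trips = 0

    def write(self, doc_id: str, text: str, emb: List[float]) -> None:
        self.objects[doc_id] = (text, emb)

    def query(self, emb: List[float], top_k: int = 1) -> List[str]:
        self.round_trips += 1  # single call: search and payload come back together
        ranked = sorted(self.objects.values(), key=lambda o: dot(o[1], emb),
                        reverse=True)[:top_k]
        return [text for text, _ in ranked]
```

Both stores return the same documents for the same query; the only difference in this toy model is that the split layout needs two calls per query and the co-located one needs one.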
Describe the solution you'd like
What about integrating Weaviate as another document store?
Describe alternatives you've considered
I thought about using FAISS as the embedding store and RocksDB as the document store (keeping only a vectorId-to-text mapping). I am sure this would beat many systems, but it would not be as customisable as other solutions :)
Making it distributed would also be a challenge, as would adding filter queries.
Additional context
I feel it would be easier to integrate via the Python client. Everything would go through its GraphQL API, similar to how the Milvus integration was done.
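As a rough illustration of what a vector query over the GraphQL interface could look like, the sketch below only builds the query string and does not talk to a server. The helper name `build_near_vector_query`, the class name `Document` and the property `text` are assumptions made up for this example; the `Get` / `nearVector` / `_additional { certainty }` shape follows Weaviate's GraphQL conventions as I understand them and should be checked against the Weaviate docs.

```python
import json
from typing import Sequence


def build_near_vector_query(
    class_name: str,
    properties: Sequence[str],
    vector: Sequence[float],
    limit: int = 10,
) -> str:
    """Build a GraphQL 'Get' query that ranks objects by vector similarity.

    Hypothetical helper: the class and property names are whatever the
    document store's schema defines; nothing here is Haystack's actual API.
    """
    props = " ".join(properties)
    vec = json.dumps(list(vector))  # e.g. [0.1, 0.2]
    return (
        "{ Get { %s(nearVector: {vector: %s}, limit: %d) "
        "{ %s _additional { certainty } } } }"
        % (class_name, vec, limit, props)
    )


# Example: ask for the 3 nearest 'Document' objects and return their text.
query = build_near_vector_query("Document", ["text"], [0.1, 0.2], limit=3)
```

The resulting string could then be sent through whatever raw-query facility the Python client offers. Because the text comes back in the same response as the similarity ranking, no second lookup in a SQL store is needed.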