Data from Solr #81

Closed
adjouama opened this issue Apr 29, 2020 · 4 comments
@adjouama

Hello,

First, thank you very much for this repo. I have a quick question.
My data is indexed in Solr. Is it possible to use it instead of Elasticsearch?

I can't find any documentation in this repo on data preparation. Any guidance?

Thank you very much in advance.

Best

@tholor
Member

tholor commented Apr 30, 2020

Hey @adjouama,

We currently don't have Solr integrated. However, as Haystack is completely modular, it would definitely be possible to do that.

What you would need to do:

  1. Implement a SolrDocumentStore
    You could take the ElasticsearchDocumentStore as an example and adjust it to Solr.
    This will be the bulk of the work, but you won't need all of the functions implemented there (e.g. query_by_embedding is not mandatory). The document store is needed for indexing docs, triggering queries, and storing the mapping of fields from the DB to Haystack Documents.

  2. Implement a SolrRetriever (or use the in-memory TfidfRetriever for smaller prototyping)
    The retriever can be rather lightweight; consider, for example, the basic ElasticsearchRetriever, which has only about 6 lines of code.
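
The two steps above could be sketched roughly as follows. This is a minimal illustration, not Haystack code: the `Document` dataclass is a simplified stand-in for Haystack's own class, and the constructor parameters (`base_url`, `core`, `text_field`, `id_field`) and the injectable `fetch` hook are assumptions made for this example. It relies only on Solr's standard `/select` handler with the `q`, `df`, `rows` and `wt` parameters, and it covers querying only, not indexing.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Document:
    # Simplified stand-in for Haystack's Document (assumption for this sketch).
    id: str
    text: str
    meta: Dict[str, Any] = field(default_factory=dict)


def http_get_json(url: str) -> Dict[str, Any]:
    # Default transport: plain HTTP GET, response body parsed as JSON.
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


class SolrDocumentStore:
    """Hypothetical read-only document store backed by a Solr core."""

    def __init__(
        self,
        base_url: str = "http://localhost:8983/solr",
        core: str = "documents",
        text_field: str = "text",
        id_field: str = "id",
        fetch: Callable[[str], Dict[str, Any]] = http_get_json,
    ):
        self.select_url = f"{base_url.rstrip('/')}/{core}/select"
        self.text_field = text_field
        self.id_field = id_field
        self.fetch = fetch  # injectable so the store can be tested offline

    def build_query_url(self, query: str, top_k: int = 10) -> str:
        # q: query string, df: default search field, rows: number of hits.
        params = {"q": query, "df": self.text_field, "rows": top_k, "wt": "json"}
        return f"{self.select_url}?{urllib.parse.urlencode(params)}"

    def query(self, query: str, top_k: int = 10) -> List[Document]:
        payload = self.fetch(self.build_query_url(query, top_k))
        raw_docs = payload.get("response", {}).get("docs", [])
        return [self._to_document(raw) for raw in raw_docs]

    def _to_document(self, raw: Dict[str, Any]) -> Document:
        # Map Solr fields onto the Document fields; the rest goes into meta.
        text = raw.get(self.text_field, "")
        if isinstance(text, list):  # multi-valued Solr fields arrive as lists
            text = " ".join(str(part) for part in text)
        meta = {
            key: value
            for key, value in raw.items()
            if key not in (self.id_field, self.text_field)
        }
        return Document(id=str(raw.get(self.id_field, "")), text=text, meta=meta)


class SolrRetriever:
    """Thin wrapper that delegates retrieval to the document store."""

    def __init__(self, document_store: SolrDocumentStore):
        self.document_store = document_store

    def retrieve(self, query: str, top_k: int = 10) -> List[Document]:
        return self.document_store.query(query, top_k=top_k)
```

Against a running Solr instance, `SolrRetriever(SolrDocumentStore(core="my_core")).retrieve("some question", top_k=5)` would return the top hits as `Document` objects (the core name here is a placeholder). Raw user questions may contain characters that Solr's query parser treats as syntax, so a real implementation would likely need to escape them first.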

The other major components (Reader, Finder ...) are independent of this and will work out-of-the-box. We also plan to have some advanced Retrievers soon that are independent of the datastore.

Let us know if you want to tackle this in a PR. We are happy to support you along the way.

tholor added the type:feature label on May 6, 2020
@stale

stale bot commented Jul 18, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs.

stale bot added the stale label on Jul 18, 2020
stale bot closed this as completed on Aug 1, 2020
@Always-prog

Hi @adjouama,
I also need to adapt Haystack so that it works with Solr.
Did you end up doing it? If you did adapt Haystack for Solr, could I use your code?
Thanks.

@adjouama
Author

Hi @Always-prog, unfortunately I did not spend time on the Solr implementation. I moved all my data to Elasticsearch and I am very satisfied with the result. I'd suggest that you migrate as well :)
