Integration with ElasticSearch #117

Open
rth opened this issue Mar 21, 2017 · 1 comment

rth commented Mar 21, 2017

E-discovery tools are typically used together with a search engine, so this issue aims to evaluate the possibility of integrating FreeDiscovery with ElasticSearch.

This question has several aspects:

  • API design: it might be worthwhile to make the FreeDiscovery REST API follow the philosophy and principles of the ElasticSearch API as closely as possible, which would make it easier for users to work with both.
  • ElasticSearch integration within FreeDiscovery: FreeDiscovery methods that perform nearest-neighbor search in the semantic space (including LSI + 1-NN categorization) could accept an ElasticSearch index as optional input, in order to combine search in the semantic space with regular search.
  • FreeDiscovery integration within ElasticSearch: ElasticSearch has a plugin system for extending its base functionality, and it might be possible to add a FreeDiscovery plugin (though the specifics remain to be determined).
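
To make the second point more concrete, here is a minimal sketch of combining a keyword search with nearest-neighbor search in a semantic space. It is not FreeDiscovery's API: the function names, the toy vectors, and the `keyword_hits` list (a stand-in for document IDs returned by a search engine query) are all hypothetical, and plain Python lists stand in for LSI vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(query_vec, doc_vectors, candidate_ids=None, k=3):
    """Rank documents by cosine similarity to query_vec.

    If candidate_ids is given (e.g. hits from a regular keyword search),
    only those documents are considered.
    """
    if candidate_ids is None:
        ids = list(doc_vectors)
    else:
        ids = [i for i in candidate_ids if i in doc_vectors]
    scored = [(doc_id, cosine(query_vec, doc_vectors[doc_id])) for doc_id in ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical document vectors in a 2-dimensional "semantic space".
doc_vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.7, 0.7],
}

# Assumption: these IDs came back from a keyword query against a search index.
keyword_hits = ["c", "d"]

# Semantic ranking restricted to the keyword hits.
results = nearest_neighbors([1.0, 0.0], doc_vectors, candidate_ids=keyword_hits, k=1)
```

With the filter applied, document "a" is excluded even though it is the closest match overall, which is the intended effect of combining the two kinds of search.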

This is a long-term discussion. It may also be related to the Solr integration discussion in #118.

@rth rth added this to the v2.0 milestone Mar 21, 2017

sany2k8 commented Jan 29, 2019

Is there any update on the integration with ElasticSearch or Solr? I need to find near-duplicate JSON documents in my Elasticsearch index. Could you let me know how to do that? If it is not possible yet, how can I run near-duplicate detection (NDD) now if I export all the documents from my Elasticsearch indices into a folder, in the same way that 20_newsgroups keeps text documents in sub-folders as files such as 1.txt?
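
For illustration only, near-duplicate detection over JSON documents exported from an index can be sketched with word shingles and Jaccard similarity. This is a generic technique, not FreeDiscovery's method or API; the `"text"` field name, the threshold, and the sample documents are assumptions made for the example.

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    if len(words) < size:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.6, size=3):
    """Return (id1, id2, similarity) for every pair at or above threshold.

    docs maps a document ID to a JSON-like dict with a "text" field
    (hypothetical field name).
    """
    sets = {doc_id: shingles(doc["text"], size) for doc_id, doc in docs.items()}
    pairs = []
    for left, right in combinations(sorted(sets), 2):
        score = jaccard(sets[left], sets[right])
        if score >= threshold:
            pairs.append((left, right, score))
    return pairs


# Hypothetical documents, as they might look after export from an index.
docs = {
    "1": {"text": "the quick brown fox jumps over the lazy dog"},
    "2": {"text": "the quick brown fox jumps over the lazy cat"},
    "3": {"text": "completely unrelated content about search engines"},
}

pairs = near_duplicates(docs)
```

Comparing every pair is quadratic in the number of documents, so this sketch only suits small collections; larger ones would need a hashing scheme to narrow the candidate pairs first.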
