datacleaner/extension_elasticsearch

ElasticSearch for DataCleaner

This is a DataCleaner (http://datacleaner.org) extension that uses the ElasticSearch (http://www.elasticsearch.org/) search engine for indexing and searching reference data.

Currently the extension contains these DataCleaner components:

  • ElasticSearch indexer (Analyze menu)

    This component lets you populate a new or existing search index by feeding records into it. Each record becomes a document in the search index, and each column of the record must be mapped to a field in the index.

  • ElasticSearch document ID lookup (Transform menu)

    Performs a document lookup for each record, based on ID. This transformation is the equivalent of looking up records in a database by their primary key.

  • ElasticSearch full text search (Transform menu)

    Performs a search against a search index for each record. The component can search across all fields or match on one specific field. The result of the transformation is a Document ID and a Document (represented as a map), which can be processed further by, for example, the built-in Data structures (Transform menu) components of DataCleaner.
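
To make the three operations concrete, here is a minimal sketch of the kind of requests they correspond to on the ElasticSearch REST API. It is not the extension's implementation: the extension is a DataCleaner component, and its client code and API version may differ. The helper names (`record_to_document`, `index_request`, `lookup_request`, `search_request`), the `customers` index, and the sample columns are all hypothetical, and the `_doc` paths assume the conventions of recent ElasticSearch versions. The functions only build requests; nothing is sent over the network.

```python
import json
from urllib.parse import quote


def record_to_document(record, column_to_field):
    """Map each mapped record column to its index field (hypothetical helper)."""
    return {
        column_to_field[column]: value
        for column, value in record.items()
        if column in column_to_field
    }


def index_request(index, doc_id, document):
    """Indexer: one record becomes one document, stored under its ID."""
    path = f"/{index}/_doc/{quote(str(doc_id), safe='')}"
    return ("PUT", path, json.dumps(document))


def lookup_request(index, doc_id):
    """Document ID lookup: fetch a single document by ID, like a primary-key read."""
    path = f"/{index}/_doc/{quote(str(doc_id), safe='')}"
    return ("GET", path, None)


def search_request(index, text, field=None):
    """Full text search: match on one field, or across all fields when field is None."""
    if field is None:
        # Assumption: with no "fields" given, multi_match falls back to the
        # index's default field setting, which covers all fields by default.
        query = {"multi_match": {"query": text}}
    else:
        query = {"match": {field: text}}
    return ("POST", f"/{index}/_search", json.dumps({"query": query}))


if __name__ == "__main__":
    # Hypothetical record and column-to-field mapping.
    record = {"id": 7, "name": "Acme", "city": "Oslo"}
    mapping = {"name": "company_name", "city": "city"}

    document = record_to_document(record, mapping)
    for request in (
        index_request("customers", record["id"], document),
        lookup_request("customers", record["id"]),
        search_request("customers", "Acme", field="company_name"),
        search_request("customers", "Acme"),
    ):
        print(request)
```

Each function returns a `(method, path, body)` tuple, so the requests can be inspected or passed to any HTTP client. Unmapped columns (here `id`) are left out of the document, mirroring the requirement that indexed columns be mapped to fields.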

Please feel free to fork, and to provide feedback in any form.