dakkusingh/search_api_elasticsearch_attachments

Search API Elasticsearch Attachments

Elasticsearch is generally used to index data of types like string, number, date, etc. However, what if you wanted to index a file like a .pdf or a .doc directly and make it searchable?

This module allows Drupal to index files (attachments) in Elasticsearch by making use of the Elasticsearch "attachment" data type.
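For orientation, a field of type "attachment" receives the file contents as a base64-encoded string. The following is a minimal sketch of building such a document by hand, not a description of how this module does it internally. The helper name `attachment_doc` and the field name `file` are assumptions chosen for illustration.

```shell
# Hypothetical helper: print a JSON document whose "file" field holds the
# base64-encoded contents of the path given as $1.
# The field name "file" is an illustrative assumption, not the module's own.
attachment_doc() {
  # tr removes the line wraps that some base64 implementations insert
  encoded=$(base64 < "$1" | tr -d '\n')
  # base64 output contains only A-Z, a-z, 0-9, +, / and =, so it is JSON-safe
  printf '{"file": "%s"}\n' "$encoded"
}
```

A document produced this way could be sent to an index whose mapping declares the field as type "attachment"; the index name, type name, and mapping are up to your own setup.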

Requirements

This module requires:

  • Drupal 8
  • Search API Module
  • Elasticsearch Connector module (8.x-5.0-alpha3 or higher)
  • Elasticsearch Version 5.6
  • Elasticsearch mapper-attachments plugin

Elasticsearch Plugin Installation

The first step is to install the Elasticsearch mapper-attachments plugin, which enables Elasticsearch to recognise the "attachment" data type. The plugin uses Apache Tika for content extraction and supports file types such as .pdf, .doc, .xls, .rtf, .html and .odt.

$ES_HOME> bin/elasticsearch-plugin install mapper-attachments

That's the hard work done.
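To confirm the plugin is present, one option is to check the output of `bin/elasticsearch-plugin list`. Elasticsearch also needs a restart before it loads a newly installed plugin. The sketch below assumes the listing prints one plugin name per line; `has_mapper_attachments` is a hypothetical helper name, not part of Elasticsearch or this module.

```shell
# Hypothetical helper: read a plugin listing on stdin and succeed (exit 0)
# if a line starting with "mapper-attachments" is present.
has_mapper_attachments() {
  grep -q '^mapper-attachments'
}

# Usage, assuming ES_HOME points at your Elasticsearch install:
#   "$ES_HOME/bin/elasticsearch-plugin" list | has_mapper_attachments \
#     && echo "mapper-attachments is installed"
```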

Install this module with Composer

composer require 'drupal/search_api_elasticsearch_attachments:^1.2'

Version Information (Important)

You have to choose the correct version of the module depending on your Elasticsearch server setup. Please see the table below for compatibility.

If you are using Elasticsearch Connector 8.x-5.x, please use the 8.x-1.x version of the search_api_elasticsearch_attachments module.

Search API Elasticsearch Attachments | Elasticsearch Connector | Elasticsearch Version | Attachment Plugin Support
8.x-1.x | 8.x-5.x | 5.x | Mapper Attachments Plugin
8.x-5.x (todo) | 8.x-5.x | 5.x | Ingest Attachment Processor Plugin
8.x-6.x | 8.x-6.x | 6.x | Ingest Attachment Processor Plugin

Elasticsearch Attachments Configuration

Enable and Configure the Elasticsearch Attachments Processor
