Elasticsearch is generally used to index data of types like string, number, date, etc. However, what if you wanted to index a file like a .pdf or a .doc directly and make it searchable?
This module allows Drupal to index files (attachments) in Elasticsearch by using the Elasticsearch "attachment" data type.
This module requires:
- Drupal 8
- Search API Module
- Elasticsearch Connector module (8.x-5.0-alpha3 or higher)
- Elasticsearch Version 5.6
- Elasticsearch mapper-attachments plugin
The first step is to install the Elasticsearch plugin mapper-attachments,
which enables ES to recognise the "attachment" data type. In turn, it uses
Apache Tika for content extraction and supports several file types such as
.pdf, .doc, .xls, .rtf, .html, .odt, etc.
$ES_HOME> bin/elasticsearch-plugin install mapper-attachments
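To illustrate what the "attachment" data type expects, here is a minimal Python sketch. It builds a mapping that declares a field of type "attachment" and a document body that carries the file content as base64 text, which is the form the mapper-attachments plugin reads. The field name `file` and the helper function names are assumptions made for this example. This is not necessarily how the module builds its own requests.

```python
import base64
import json

# Assumption: the attachment lives in a field called "file" (illustrative name).
ATTACHMENT_FIELD = "file"


def attachment_mapping(field: str = ATTACHMENT_FIELD) -> dict:
    """Return mapping properties declaring `field` with the "attachment" type."""
    return {"properties": {field: {"type": "attachment"}}}


def attachment_document(raw: bytes, field: str = ATTACHMENT_FIELD) -> dict:
    """Return a document body holding the raw file bytes as base64 text."""
    return {field: base64.b64encode(raw).decode("ascii")}


if __name__ == "__main__":
    # Stand-in for the bytes of a real .pdf or .doc file.
    sample = b"Hello from a sample attachment"
    print(json.dumps(attachment_mapping(), indent=2))
    print(json.dumps(attachment_document(sample), indent=2))
```

In practice the bytes would come from reading the file in binary mode. The two dictionaries would then be sent to Elasticsearch as the mapping and the document body respectively.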
That's the hard work done. Next, install the module with Composer:
composer require 'drupal/search_api_elasticsearch_attachments:^1.2'
You have to choose the correct version of the module depending on your Elasticsearch server setup. Please see the table below for compatibility.
If you are using Elasticsearch Connector 8.x-5.x, please use version 8.x-1.x of the search_api_elasticsearch_attachments module.
Search API Elasticsearch Attachments | Elasticsearch Connector | Elasticsearch Version | Attachment Plugin Support |
---|---|---|---|
8.x-1.x | 8.x-5.x | 5.x | Mapper Attachments Plugin |
8.x-5.x (todo) | 8.x-5.x | 5.x | Ingest Attachment Processor Plugin |
8.x-6.x | 8.x-6.x | 6.x | Ingest Attachment Processor Plugin |