Description
Using the new plugins system, implement the attachments plugin. It adds a mapping type called attachment, which accepts the binary content of an attachment (base64 encoded) to index.
Installation is simple: download the plugin zip file and place it under the plugins directory within the installation. When building from source, the plugin will be under build/distributions/plugins. Once placed in the installation, the attachment mapper type is supported automatically.
Using the attachment type is simple: in your mapping JSON, map a certain JSON element as attachment, for example:
{
    "person" : {
        "properties" : {
            "myAttachment" : { "type" : "attachment" }
        }
    }
}
In this case, the JSON to index can be:
{
    "myAttachment" : "... base64 encoded attachment ..."
}
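The document body above can be produced by base64-encoding the raw bytes of the file. Here is a minimal Python sketch using only the standard library. The helper name build_attachment_doc is hypothetical, and the field name myAttachment is taken from the example mapping above.

```python
import base64
import json


def build_attachment_doc(raw_bytes, field="myAttachment"):
    """Return the JSON body for a document with one attachment field.

    `field` must match the name mapped with type "attachment";
    "myAttachment" is the name used in the example mapping.
    """
    # The attachment type expects base64 text, not raw bytes.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({field: encoded})


if __name__ == "__main__":
    # Stand-in bytes; in practice, read them from the file to index.
    print(build_attachment_doc(b"hello attachment"))
```

The resulting string is what would be sent as the document source when indexing into the person type.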
The attachment type not only indexes the content of the document, but also automatically adds metadata about the attachment (when available). The supported metadata fields are: date, title, author, and keywords. They can be queried using the "dot notation", for example: myAttachment.author.
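As an illustration of the dot notation, the sketch below joins the attachment field name and a metadata name into a field path and wraps it in a query body. The helper names are hypothetical, and the term query shape is an assumption about the query DSL, not something this description specifies. The author value is a placeholder.

```python
import json

# Metadata names listed in this description.
METADATA_FIELDS = ("date", "title", "author", "keywords")


def metadata_path(attachment_field, meta_name):
    """Join the attachment field and a metadata name with dot notation."""
    if meta_name not in METADATA_FIELDS:
        raise ValueError("unsupported metadata field: %s" % meta_name)
    return "%s.%s" % (attachment_field, meta_name)


def term_query(attachment_field, meta_name, value):
    """Build a term query body (assumed DSL shape) on a metadata field."""
    path = metadata_path(attachment_field, meta_name)
    return {"query": {"term": {path: value}}}


if __name__ == "__main__":
    # "jane" is a placeholder author value.
    print(json.dumps(term_query("myAttachment", "author", "jane")))
```

Whether a given value matches depends on how the metadata field is analyzed, which can be controlled in the mappings.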
Both the metadata and the actual content are simple core type mappers (string, date, ...), so they can be controlled in the mappings. For example:
{
    "person" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "fields" : {
                    "file" : { "index" : "no" },
                    "date" : { "store" : "yes" },
                    "author" : { "analyzer" : "myAnalyzer" }
                }
            }
        }
    }
}
In the above example, the actual content is mapped under fields with the name file (the same name as the attachment field), and we decide not to index it, so it will only be available in the _all field. The other entries under fields map to their respective metadata names, but there is no need to specify the type (like string or date) since it is already known.
The plugin uses Apache Tika (http://lucene.apache.org/tika/) to parse attachments, so many formats are supported (listed at http://lucene.apache.org/tika/0.6/formats.html).