Support for 'attachment' data type #12
I can only give this a cursory answer at the moment, but the Does that answer your question? |
OK,
but when I do a :
document file is a |
I've also tried to put my mapping definition in the |
My default analyzer is "french" :
|
Hi, I'm back with my attachments... and I've got it! I MANUALLY updated FIELD_MAPPINGS in haystack's elasticsearch_backend.py file, which works and gives the correct mapping from the server. So I have to find a way to override the build_schema method in your ConfigurableElasticBackend to use this feature:
I'll post my code afterwards. Thanks!
OK, I've got it:

from django.conf import settings
from elasticstack.backends import ConfigurableElasticBackend, ConfigurableElasticSearchEngine
from haystack.constants import DJANGO_CT, DJANGO_ID

# Fallback mapping for field types missing from FIELD_MAPPINGS.
DEFAULT_FIELD_MAPPING = {'type': 'string', 'analyzer': 'snowball'}

FIELD_MAPPINGS = {
    'edge_ngram': {'type': 'string', 'analyzer': 'edgengram_analyzer'},
    'ngram': {'type': 'string', 'analyzer': 'ngram_analyzer'},
    'date': {'type': 'date'},
    'datetime': {'type': 'date'},
    'location': {'type': 'geo_point'},
    'boolean': {'type': 'boolean'},
    'float': {'type': 'float'},
    'long': {'type': 'long'},
    'integer': {'type': 'long'},
    'attachment': {'type': 'attachment'},  # added: attachment support by default
}


def get_default_field_mappings():
    """
    Gets the default field mapping from settings `ELASTICSEARCH_DEFAULT_FIELD_MAPPINGS` if it exists,
    otherwise returns the DEFAULT_FIELD_MAPPING dict.
    :return: dict of properties used for unknown field types
    """
    return getattr(settings, 'ELASTICSEARCH_DEFAULT_FIELD_MAPPINGS', DEFAULT_FIELD_MAPPING)


def get_field_mappings():
    """
    Gets the field mappings from settings `ELASTICSEARCH_FIELD_MAPPINGS` if it exists,
    otherwise returns the FIELD_MAPPINGS dict.
    :return: dict of mappings from field types to properties
    """
    return getattr(settings, 'ELASTICSEARCH_FIELD_MAPPINGS', FIELD_MAPPINGS)


class ExtendedElasticsearchBackend(ConfigurableElasticBackend):
    """
    Adds `attachment` support to the elasticsearch backend settings
    """

    def build_schema(self, fields):
        """
        Merge of the haystack and elasticstack elasticsearch backend `build_schema` methods.
        It provides an additional feature: custom field mappings, from settings or the default FIELD_MAPPINGS dict.
        :param fields: dict of field names to haystack field instances
        :return: tuple of (content_field_name, mapping)
        """
        content_field_name = ''
        mapping = {
            DJANGO_CT: {'type': 'string', 'index': 'not_analyzed', 'include_in_all': False},
            DJANGO_ID: {'type': 'string', 'index': 'not_analyzed', 'include_in_all': False},
        }
        field_mappings = get_field_mappings()
        default_field_mappings = get_default_field_mappings()

        for field_name, field_class in fields.items():
            # Copy so that per-field tweaks never mutate the shared mapping table.
            field_mapping = field_mappings.get(field_class.field_type, default_field_mappings).copy()
            if field_class.boost != 1.0:
                field_mapping['boost'] = field_class.boost

            if field_class.document is True:
                content_field_name = field_class.index_fieldname

            # Do this last to override `text` fields.
            if field_mapping['type'] == 'string' and field_class.indexed:
                if not hasattr(field_class, 'facet_for') and field_class.field_type not in ('ngram', 'edge_ngram'):
                    field_mapping['analyzer'] = getattr(field_class, 'analyzer', self.DEFAULT_ANALYZER)

            mapping[field_class.index_fieldname] = field_mapping

        return content_field_name, mapping


class ExtendedElasticSearchEngine(ConfigurableElasticSearchEngine):
    backend = ExtendedElasticsearchBackend

This class can be used as a backend; it's a quick and dirty merge of the haystack and elasticstack elasticsearch backends.
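
As a rough sketch of how this could be wired up in a Django settings file: the module path `myproject.search_backends`, the URL and the index name below are made-up placeholders, not values from this thread. The two `ELASTICSEARCH_*` names are the settings read by `get_field_mappings()` and `get_default_field_mappings()` above.

```python
# settings.py (sketch). "myproject.search_backends" is a hypothetical module
# holding the ExtendedElasticSearchEngine class from the comment above.
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'myproject.search_backends.ExtendedElasticSearchEngine',
        'URL': 'http://127.0.0.1:9200/',  # assumed local Elasticsearch node
        'INDEX_NAME': 'haystack',         # assumed index name
    },
}

# Optional override read by get_field_mappings(). getattr() returns this dict
# instead of the module-level FIELD_MAPPINGS, so it replaces the defaults
# wholesale: every field type in use has to be listed here.
ELASTICSEARCH_FIELD_MAPPINGS = {
    'edge_ngram': {'type': 'string', 'analyzer': 'edgengram_analyzer'},
    'ngram': {'type': 'string', 'analyzer': 'ngram_analyzer'},
    'date': {'type': 'date'},
    'datetime': {'type': 'date'},
    'location': {'type': 'geo_point'},
    'boolean': {'type': 'boolean'},
    'float': {'type': 'float'},
    'long': {'type': 'long'},
    'integer': {'type': 'long'},
    'attachment': {'type': 'attachment'},
}

# Optional override read by get_default_field_mappings(): the fallback used
# for field types that are missing from the table above.
ELASTICSEARCH_DEFAULT_FIELD_MAPPINGS = {'type': 'string', 'analyzer': 'snowball'}
```

Leaving both `ELASTICSEARCH_*` settings out should give the same result, since the getters fall back to the module-level dicts.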

import base64

from filer.models import File as fi_File
from haystack.fields import SearchField

# NOTE: get_content_type() is a helper that is not shown in this comment.


class AttachmentField(SearchField):
    field_type = 'attachment'
    author_field = 'author'

    def __init__(self, **kwargs):
        if 'content_type_field' in kwargs:
            self.content_type_field = kwargs.pop('content_type_field')
        if 'author_field' in kwargs:
            self.author_field = kwargs.pop('author_field')
        super(AttachmentField, self).__init__(**kwargs)

    def convert(self, value):
        if isinstance(value, fi_File):
            field_file = value.file.file
            name = value.label
            content_type = get_content_type(name)
        else:  # isinstance(value, dj_File)
            field_file = value
            name = None
            content_type = None

        content_length = len(field_file)
        try:
            content = base64.b64encode(field_file.read())
        except AttributeError:
            # Not a file-like object: encode the raw value directly.
            content = base64.b64encode(field_file)

        output = {
            '_language': 'fr',
            '_content': content,
            '_content_type': content_type,
            '_name': name,
            '_title': name,
            '_content_length': content_length,
        }
        return output

... And it seems to work!
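
To see the shape of the dict that `convert()` returns without needing Django or filer, here is a minimal standalone sketch. It is not the code from this thread: `build_attachment_payload` is a made-up name, and `mimetypes.guess_type` only stands in for the `get_content_type` helper, whose implementation is not shown here.

```python
import base64
import mimetypes


def build_attachment_payload(raw_bytes, name=None, language='fr'):
    """Build a dict shaped like the output of AttachmentField.convert()."""
    # Assumption: mimetypes.guess_type replaces the thread's get_content_type.
    content_type = mimetypes.guess_type(name)[0] if name else None
    return {
        '_language': language,
        # b64encode returns bytes on Python 3; decode to get a JSON-friendly str.
        '_content': base64.b64encode(raw_bytes).decode('ascii'),
        '_content_type': content_type,
        '_name': name,
        '_title': name,
        '_content_length': len(raw_bytes),
    }


payload = build_attachment_payload(b'Bonjour le monde', name='hello.txt')
print(sorted(payload))
```

Decoding `payload['_content']` with `base64.b64decode` gives back the original bytes, which is a quick way to check what was actually sent to the index.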
My final version: https://gist.github.com/frague59/aab071f0bdce5b010ce4 Do WTF you want with it ;)
Thanks @frague59 for sharing! 👍 |
Hi,
I'm trying to use haystack + elasticsearch to populate an index with documents (i.e. files), using the mapper-attachments elasticsearch plugin.
My files are uploaded to the index as base64 streams; I've built a SearchField that provides this functionality. The upload works, but the files show up as plain strings in the elasticsearch index.
My question is: how can I bind the field type 'attachment' into the mappings?
As far as I can see in the haystack code, it uses a hard-coded set of rules and does not provide this kind of mapping. In elasticstack, show_mapping --detail uses the same mapping rules, so I'm stuck. Do you have any idea how to provide this functionality? Thanks!
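
A small sketch of why the files end up as plain strings, assuming the stock backend resolves field types with the same lookup-with-fallback pattern as the `build_schema` code posted in this thread. The table is abridged, and `resolve_mapping` is a made-up helper name used only for illustration.

```python
# Fallback used when a field type has no entry in the table.
DEFAULT_FIELD_MAPPING = {'type': 'string', 'analyzer': 'snowball'}

# Abridged hard-coded table: note that it has no 'attachment' entry.
FIELD_MAPPINGS = {
    'date': {'type': 'date'},
    'boolean': {'type': 'boolean'},
    'integer': {'type': 'long'},
}


def resolve_mapping(field_type, table):
    """Return a copy of the mapping for field_type, or of the default."""
    return table.get(field_type, DEFAULT_FIELD_MAPPING).copy()


# Unknown type: silently falls back to an analyzed string.
before = resolve_mapping('attachment', FIELD_MAPPINGS)

# With an extra entry, the same lookup yields the attachment type.
extended = dict(FIELD_MAPPINGS, attachment={'type': 'attachment'})
after = resolve_mapping('attachment', extended)

print(before)
print(after)
```

So the fix amounts to getting an `'attachment'` entry into the table that the backend consults when it builds the schema.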