Elasticsearch add support for custom_score to pass script in #804

Closed
nkeilar opened this Issue Jun 11, 2013 · 17 comments

nkeilar commented Jun 11, 2013

I'd like to extend Haystack to support Elasticsearch's 'custom_score' query:

http://www.elasticsearch.org/guide/reference/query-dsl/custom-score-query/

I've been looking through the source for a while now and am trying to figure out the best place to add this via an override. Can someone with a little more tribal knowledge give me a tip? Generally, though, I think it would be nice to add this feature.

I'll also be looking to pass some per-user parameters in from a Form/View.

nkeilar commented Jun 11, 2013

Proposed approach: subclass ElasticsearchSearchBackend and override the build_search_kwargs method to take some parameters and adjust the resulting query.

curl -XGET 'http://127.0.0.1:9200/app_haystack/modelresult/_search?pretty=1'  -d '
{
    "query":{
        "custom_score" : {
            "script" : "_score * doc[\"field\"].value * fieldweight",
            "params": { "vegan": 0 },
            "query": {
                "query_string":{
                    "query":"(red)",
                    "default_operator":"AND",
                    "default_field":"text",
                    "auto_generate_phrase_queries":true,
                    "analyze_wildcard":true
                }
            }
        }
    },
    "facets":{
        "size_exact":{
            "terms":{
                "field":"size_exact",
                "size":100
            }
        }
    },
    "from":0,
    "size":30
}
' | sed ':a;N;$!ba;s/\\n/\n /g'

Seems to work okay, but I think I mistakenly dropped one of the filter constraints. I'll attempt to add that back when subclassing. If I'm heading down the wrong path, please let me know! Thanks.
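
A minimal sketch of the wrapping step, assuming the request body is a plain dict. Nesting the whole existing query inside custom_score, instead of rebuilding it by hand, keeps any filter it already carries. The helper name wrap_with_custom_score and the sample "filtered" body are hypothetical and only stand in for whatever the backend actually builds.

```python
from copy import deepcopy


def wrap_with_custom_score(search_kwargs, script, params=None):
    """Return a copy of a request body whose query is wrapped in custom_score.

    Everything outside "query" (facets, from, size) is left untouched.
    """
    body = deepcopy(search_kwargs)
    custom_score = {
        "script": script,
        # The original query, filters included, becomes the inner query.
        "query": body["query"],
    }
    if params:
        custom_score["params"] = params
    body["query"] = {"custom_score": custom_score}
    return body


# Hypothetical stand-in for a body that already carries a filter constraint.
original = {
    "query": {
        "filtered": {
            "query": {"query_string": {"query": "(red)", "default_field": "text"}},
            "filter": {"terms": {"django_ct": ["app.item"]}},
        }
    },
    "from": 0,
    "size": 30,
}

wrapped = wrap_with_custom_score(original, "_score * doc['score'].value * p", {"p": 1})
```

Because the function works on a deep copy, the body passed in is not modified, so it can still be logged or reused.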

nkeilar commented Jun 12, 2013

I've been able to get it working by overriding ElasticsearchSearchBackend.build_search_kwargs()

I am just trying to work out the best way to pass params in. It's a bit tricky to follow the breadcrumbs from Views to Forms to Queries to Backends and Engines. Can anyone give a clear explanation of how the query is passed down this chain to the Elasticsearch backend?

        if prioritise:
            kwargs['query'] = {
                "custom_score": {
                    "params": {"p": 1},
                    "script": "_score * doc['score'].value * p",
                    "query": kwargs['query'],
                }
            }

        return kwargs
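
One way to picture the last two links of the chain asked about above is the toy model below. It is a simplified sketch, not Haystack's code: ToyQuery and ToyBackend are hypothetical stand-ins, and the only behaviour assumed from Haystack is that whatever the query object's build_params() returns is handed to the backend as keyword arguments, which the backend forwards to build_search_kwargs().

```python
class ToyBackend:
    """Hypothetical stand-in for a search backend."""

    def build_search_kwargs(self, query_string, custom_score=None, **extra):
        kwargs = {"query": {"query_string": {"query": query_string}}}
        if custom_score:
            # Same wrapping idea as the override shown above.
            kwargs["query"] = {
                "custom_score": {
                    "script": custom_score["script"],
                    "params": custom_score["params"],
                    "query": kwargs["query"],
                }
            }
        return kwargs

    def search(self, query_string, **kwargs):
        # Every keyword received here is forwarded to build_search_kwargs.
        return self.build_search_kwargs(query_string, **kwargs)


class ToyQuery:
    """Hypothetical stand-in for the query object between queryset and backend."""

    def __init__(self, backend):
        self.backend = backend
        self.query_string = "*:*"
        self.custom_score = {}

    def build_params(self):
        # The returned dict becomes the keyword arguments of backend.search().
        params = {}
        if self.custom_score:
            params["custom_score"] = self.custom_score
        return params

    def run(self):
        return self.backend.search(self.query_string, **self.build_params())


query = ToyQuery(ToyBackend())
query.query_string = "(red)"
query.custom_score = {"script": "_score * doc['score'].value * p", "params": {"p": 1}}
request_body = query.run()
```

Under that assumption, a new keyword only reaches the backend if the query object both stores it and emits it from build_params(), so the query class needs overriding as well as the backend.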

nkeilar commented Jun 14, 2013

This is the solution I came up with:

__author__ = 'Nathan Keilar'
__company__ = 'Hunted Hive Web Studio'
__website__ = 'http://huntedhive.com'

from copy import deepcopy
from django.conf import settings

from haystack.backends.elasticsearch_backend import ElasticsearchSearchBackend, ElasticsearchSearchQuery
from haystack.backends.elasticsearch_backend import ElasticsearchSearchEngine
from haystack.query import SearchQuerySet
from haystack.constants import DEFAULT_ALIAS

class ConfigurableSearchQuerySet(SearchQuerySet):
    def custom_score(self, score_query_string=None, params=None):
        """Return a clone of this queryset with custom_score arguments added to the query."""
        clone = self._clone()
        clone.query.add_custom_score(score_query_string, params)
        return clone


class ConfigurableElasticBackend(ElasticsearchSearchBackend):
    """
    http://www.wellfireinteractive.com/blog/custom-haystack-elasticsearch-backend/
    """

    def __init__(self, connection_alias, **connection_options):
        super(ConfigurableElasticBackend, self).__init__(
            connection_alias, **connection_options)
        # Keep the backend's built-in DEFAULT_SETTINGS unless the project
        # defines ELASTICSEARCH_INDEX_SETTINGS in its Django settings.
        user_settings = getattr(settings, 'ELASTICSEARCH_INDEX_SETTINGS', None)
        if user_settings:
            self.DEFAULT_SETTINGS = user_settings


    def build_search_kwargs(self, query_string, sort_by=None, start_offset=0, end_offset=None,
                        fields='', highlight=False, facets=None,
                        date_facets=None, query_facets=None,
                        narrow_queries=None, spelling_query=None,
                        within=None, dwithin=None, distance_point=None,
                        models=None, limit_to_registered_models=None,
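                        result_class=None, custom_score=None):
        """
        Build the search kwargs as usual, then optionally wrap the query
        in a custom_score query.

        :param custom_score: dict with 'score_query_string' (the script) and
            'score_query_params' (the script params, may be None)
        :return: the search kwargs dict
        """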
        out = super(ConfigurableElasticBackend, self).build_search_kwargs(query_string, sort_by, start_offset, end_offset,
                                                               fields, highlight, facets,
                                                               date_facets, query_facets,
                                                               narrow_queries, spelling_query,
                                                               within, dwithin, distance_point,
                                                               models, limit_to_registered_models,
                                                               result_class)


        if custom_score:
            # Wrap the query built by the parent class so the script can
            # rescale each hit's _score.
            out['query'] = {
                "custom_score": {
                    "script": custom_score['score_query_string'],
                    "query": out['query'],
                }
            }
            # Only send params when some were given.
            if custom_score.get('score_query_params'):
                out['query']['custom_score']['params'] = custom_score['score_query_params']

        return out


class ConfigurableElasticsearchSearchQuery(ElasticsearchSearchQuery):
    def __init__(self, using=DEFAULT_ALIAS):
        super(ConfigurableElasticsearchSearchQuery, self).__init__(using)
        self.custom_score = {}

    def add_custom_score(self, score_query_string=None, params=None):
        """Adds arguments for custom_score to the query"""
        self.custom_score = {
            'score_query_string': score_query_string,
            'score_query_params': params,
            }

    def build_params(self, spelling_query=None, **kwargs):
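        """
        Add the custom_score arguments to the params passed to the backend.

        :param spelling_query: passed through to the parent implementation
        :param kwargs: passed through to the parent implementation
        :return: the search kwargs, with 'custom_score' added when set
        """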
        search_kwargs = super(ConfigurableElasticsearchSearchQuery, self).build_params(spelling_query, **kwargs)
        if self.custom_score:
            search_kwargs['custom_score'] = self.custom_score

        return search_kwargs

    def _clone(self, klass=None, using=None):
        clone = super(ConfigurableElasticsearchSearchQuery, self)._clone(klass, using)
        clone.custom_score = self.custom_score
        return clone


class ConfigurableElasticSearchEngine(ElasticsearchSearchEngine):
    backend = ConfigurableElasticBackend
    query = ConfigurableElasticsearchSearchQuery

urls.py

url(r'^search/$', MyBlahSearchView(
        template='blah_blah/search/faceted_search.html',
        # searchqueryset=SearchQuerySet().models(Blah, Item).facet('size').facet('model_type'),
        searchqueryset=(
            ConfigurableSearchQuerySet()
            .models(Blah, Item)
            .custom_score(
                "_score * doc['category_one_id_exact'].value * blah",
                params={"blah": 2}
            )
            .facet('size')
            .facet('model_type')
        ),
        form_class=FacetedSearchForm
    ), name='haystack_search'),
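
The URLconf above fixes the script params once, at import time. For the per-user parameters mentioned at the start of this issue, one option is to keep a shared base queryset and derive a scored clone per request, for example from a view or form method that has access to the request. The sketch below uses a toy class to show the clone-and-carry pattern only; ToySearchQuerySet, queryset_for_user and user_boost are all hypothetical names, not Haystack API.

```python
class ToySearchQuerySet:
    """Hypothetical stand-in for a chainable queryset."""

    def __init__(self, custom_score=None):
        self._custom_score = custom_score

    def _clone(self):
        return ToySearchQuerySet(self._custom_score)

    def custom_score(self, script, params=None):
        # Chain on a clone so the shared base queryset is never modified.
        clone = self._clone()
        clone._custom_score = {"script": script, "params": dict(params or {})}
        return clone


# Built once at import time, like the searchqueryset argument in urls.py.
BASE_SQS = ToySearchQuerySet()


def queryset_for_user(user_boost):
    """Derive a per-request queryset; user_boost is a hypothetical per-user value."""
    return BASE_SQS.custom_score(
        "_score * doc['score'].value * boost", {"boost": user_boost}
    )


alice = queryset_for_user(2)
bob = queryset_for_user(5)
```

Since each call works on a clone, one user's params never leak into another user's search, and the base queryset stays unscored.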

toastdriven commented Jun 14, 2013

That solution is completely reasonable (& the intended way Haystack should be extended). Sorry that it's invasive in your case, but it looks good. A blog post on it, or a pull request creating (& adding this to) a "cookbook" doc in Haystack, would be welcome!

@nkeilar

This comment has been minimized.

Copy link

nkeilar commented Jun 15, 2013

Okay, I'll see if I can pump something out this weekend. It was great reading through the code. Learnt a thing or two in the process, I'm sure. Have a good weekend.

davecm commented Sep 10, 2013

Great to see this - I was actually just getting ready to write this same thing myself.
@madteckhead - did you ever get around to doing a pull request or a cookbook doc?

nkeilar commented Sep 10, 2013

Sorry. I have not. I intend to, but free time has not been forthcoming.
Will try in December.

davecm commented Sep 10, 2013

No worries - message me if you get around to it. I also added custom_filter_score to it, if you're interested in clumping it together.

nkeilar commented Sep 10, 2013

Sure. Please share your solution

aolieman commented Nov 19, 2013

Thanks for sharing your solution madteckhead!

I'm trying to get the combination of a custom score query and a nested query working. In pure Elasticsearch this is quite simple to do. See for example:

http://stackoverflow.com/questions/13095866/elastic-search-tagging-strength-nested-child-document-boosting

Defining a field type that (seems to) store a list containing dictionary values in the index was easy. But I've not yet succeeded in getting ES to actually accept these values as nested, although I've tried this in .build_schema(). Using a query that returns any of the required sub-documents is also something I'm still figuring out.

Would there be any straightforward way to extend the ES backend to update the mapping with nested properties, and to do a nested query?

Any suggestions would be appreciated, and I'm more than happy to contribute the solution to a haystack cookbook.
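
For reference, the pure-Elasticsearch side of this can be sketched as two dict builders: one for a mapping fragment that declares a field as nested, one for a nested query whose inner query is wrapped in custom_score. This is only a sketch under assumptions: the helper names, the "tags" field and its sub-fields are hypothetical, and the "string" type and "total" score mode are assumed to match the Elasticsearch version used in this thread. It does not show how to hook either dict into Haystack.

```python
def nested_mapping(field, properties):
    """Mapping fragment that declares `field` as a nested object."""
    return {field: {"type": "nested", "properties": properties}}


def nested_custom_score_query(path, inner_query, script, params=None,
                              score_mode="total"):
    """Nested query whose sub-document scores come from a custom_score script."""
    custom_score = {"query": inner_query, "script": script}
    if params:
        custom_score["params"] = params
    return {
        "nested": {
            "path": path,
            # How the scores of matching sub-documents are combined.
            "score_mode": score_mode,
            "query": {"custom_score": custom_score},
        }
    }


# Hypothetical field layout: each document carries weighted tags.
mapping = nested_mapping(
    "tags", {"name": {"type": "string"}, "weight": {"type": "float"}}
)
query = nested_custom_score_query(
    "tags",
    {"term": {"tags.name": "vegan"}},
    "_score * doc['tags.weight'].value",
)
```

The mapping fragment has to be in place before documents are indexed for the nested query to match anything, which fits the difficulty described above with getting the values accepted as nested.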

aolieman commented Nov 29, 2013

It all worked out in the end ;-D!

Would there be a logical place in the docs to put an "extending backend(s)" piece? Or at least to link to a blog post that I might write about my further extensions of @madteckhead's solution?

thedrow commented Dec 17, 2013

So should this be closed now that a blog post was written about it?

aolieman commented Dec 17, 2013

I'm still writing and not so sure if the blog post solves this issue for everyone ;-)

This particular issue does seem pretty much solved if a link to @madteckhead + @davecm 's solution and / or my blog post ends up somewhere in the docs. It seems to belong under Creating New Backends, at least that's where I started looking. Perhaps it could be renamed "Creating and Extending Backends", since there is not much content at this moment.

Can we still use this issue to coordinate a bit, or should I just do a pull request to add a link whenever I've finished the blog post (soon!)?

aolieman commented Dec 19, 2013

And here it is: http://www.stamkracht.com/extending-haystacks-elasticsearch-backend/

Any comments are welcome, and a link from the Haystack docs would be highly appreciated. If anyone thinks the post needs some modifications, it's probably easiest to fork the gist that I used to write it in. I could then update the post on stamkracht.com.

HonzaKral commented Jan 6, 2014

Thank you for the writeup. I will be working on the Elasticsearch backend and will hopefully provide an easier way to extend it. This is very useful!

Closing in the meantime since this is not a Haystack issue.

HonzaKral closed this Jan 6, 2014

speedplane commented May 13, 2014

Any movement on this? I'm working on extending the implementation by @aolieman to support nested facets and filtering on multiple nested objects. It's coming together, but it's a bit of a mess, requiring a lot of subclassing and overridden functions.

Would be better if this issue could be reopened and these features could be merged into haystack itself.

sabine commented Oct 2, 2014

@speedplane did you get things working? I'm now in the same situation - I could use nested facets and filtering on nested objects. Before I start working on this it would be interesting to know whether you have something you'd consider sharing. :)
