Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phrase Suggester: Add option to filter out phrase suggestions #6773

Closed
wants to merge 15 commits into from

Conversation

areek
Copy link
Contributor

@areek areek commented Jul 7, 2014

The newly added filter option will let the user provide a template query which will be executed for every
phrase suggestions generated to ensure that the suggestion matches at least one document for the query.
The filter query is only executed on the local node for now. When the new filter option is used, the
size of the suggestion is restricted to 20.

Closes #3482

…ching any documents for a given query

The newly added filter option will let the user provide a template query which will be executed for every
phrase suggestions generated to ensure that the suggestion matches at least one document for the query.
The filter query is only executed on the local node for now. When the new filter option is used, the
size of the suggestion is restricted to 20.

Closes elastic#3482
}
CompiledScript compiledScript = suggester.scriptService().compile("mustache", templateNameOrTemplateContent);
suggestion.setFilterQueryScript(compiledScript);
if (suggestion.getSize() >= MAX_RESULT_SIZE_FILTER) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we should do this.... we don't introduce limits elsewhere though let them figure that out themself :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed limits check

@s1monw
Copy link
Contributor

s1monw commented Jul 8, 2014

I like this PR a lot. left a bunch of comments

@s1monw s1monw removed the review label Jul 8, 2014
@areek
Copy link
Contributor Author

areek commented Jul 8, 2014

Updated PR; incorporated feedback and Thanks for the review!

/**
* Routing Preference Type
*/
public final class PreferenceType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name it just Preference...maybe we can make this an enum and have a parse method on it like:

public Preference parse(String string) {

   switch (valueBeforeColon(string)) {
      case "_shards":
          return SHARDS;
      case "_prefer_node":
      case "_preferNode":
          return PREFER_NODE;
     //...

   } 
}

/**
  * this splits off the values ie in _prefer_node / _shards*/
public abstract String values();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it to Preference and changed it to an enum, but it seems the way I have it currently, there is more boiler plate added rather then being a clean solution. Examples:

Preference.ONLY_LOCAL.type()

is needed for the string representation of '_local_only'.

The values() business feels wrong on the enum, some options have to have values others don't and its not obvious wants happening.

I got rid of all the changes to the unrelated files for Preference, but still left the implementation for the feedback and how to move forward with it.

@s1monw
Copy link
Contributor

s1monw commented Jul 9, 2014

I left some more comments. This looks great though. I'd like to get @clintongormley to give some feedback to make sure the REST interface is good

… API; Added support for accepting filters and template params
@areek
Copy link
Contributor Author

areek commented Jul 10, 2014

I have incorporated the feedback (would like to get some comments on the Preference (see comment above).

After discussion on the REST interface, this is what it looks like now:

curl -XPOST 'localhost:9200/_search' -d {
    "suggest": {
        "text": "Xor the Got-Jewel",
        "simple_phrase": {
            "phrase": {
                "field": "body",
                "size": 5,
                "shard_size": 10,
                "confidence": 2,
                "collate": {
                    "query": {
                        "body": "{{suggestion}}"
                    },
                    "preference": "_primary"
                }
            }
        }
    }
}

collate also accepts a filter instead of the query and there is an optional params which the user can use to inject additional params to the query/filter template

@areek areek added the review label Jul 10, 2014
case "_onlyLocal":
return ONLY_LOCAL;
case "_only_node":
return ONLY_NODE.setValue(value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default case should throw an exception instead of return null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@s1monw
Copy link
Contributor

s1monw commented Jul 11, 2014

I left some cosmetic comments. I think this looks very good @clintongormley WDYT... In a second PR we could add the ability to just mark the results if the query / filter matched or not instead of filtering that might help folks to see what is going on.

@@ -161,6 +161,54 @@ can contain misspellings (See parameter descriptions below).
in a row are changed the entire phrase of changed tokens
is wrapped rather than each token.

`collate`::

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a suggestion to make the docs slightly easier to read:

 `collate`::

     Checks each suggestion against the specified `query` or `filter` to
     prune suggestions for which not matching docs exist in the index. Either
     a `query` or a `filter` must be specified, and it is run as a <<query-
     dsl-template-query,`template` query>>. The current suggestion is
     automatically made available as the `{{suggestion}}`  variable, which
     should be used in your query/filter.  You can still specify  your own
     template `params` -- the `suggestion` value will be added to the
     variables you specify. You can also specify a `preference` to control
     on which shards the query is executed (see <<search-request-preference).
     The default value is `_only_local`.
+ 
 [source,js]
 --------------------------------------------------
 curl -XPOST 'localhost:9200/_search' -d {
   "suggest" : {
     "text" : "Xor the Got-Jewel",
     "simple_phrase" : {
       "phrase" : {
         "field" :  "bigram",
         "size" :   1,
         "direct_generator" : [ {
           "field" :            "body",
           "suggest_mode" :     "always",
           "min_word_length" :  1
         } ],
         "collate": {
           "query": { <1>
             "match": {
                 "{{field_name}}" : "{{suggestion}}" <2>
             }
           },
           "params": {"field_name" : "title"}, <3>
           "preference": "_primary", <4>
         }
       }
     }
   }
 }
 --------------------------------------------------
<1> This query will be run once for every suggestion.
<2> The `{{suggestion}}` variable will be replaced by the text
    of each suggestion.
<3> An additional `field_name` variable has been specified in
    `params` and is used by the `match` query.
<4> The default `preference` has been changed to `_primary`.

I also removed most of the options from the suggest request which are not relevant to collate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @clintongormley, I have updated the docs

@clintongormley
Copy link

Suggested some doc changes, but otherwise LGTM

@areek
Copy link
Contributor Author

areek commented Jul 11, 2014

I have updated the PR with the feedback. I think the enhancement is ready but still not happy with the state of Preference (see comment in Preference).

@s1monw
Copy link
Contributor

s1monw commented Jul 14, 2014

...but still not happy with the state of Preference (see comment in Preference)

I didn't see a comment, what are you not happy with?

this LGTM actually and I think we should move forward with it unless the issue is major which I didn't see yet?

@s1monw s1monw added v1.4.0 and removed review labels Jul 14, 2014
@areek
Copy link
Contributor Author

areek commented Jul 14, 2014

It seems the way I have Preference currently, there is more boiler plate added rather then being a clean solution. Examples:

Preference.ONLY_LOCAL.type()

is needed for '_local_only'.

The values() business feels wrong on the enum, some options have to have values others don't and its not obvious wants happening.

That was my concern, I think if we plan to have it as is, I will atleast want to use the new Preference type in more places than in PhraseSuggester (at least use it in the PlainOperationRouting)

@areek
Copy link
Contributor Author

areek commented Jul 14, 2014

Updated PR:

  • added static helper methods to parse Preference type & value
  • use new Preference enum in PlainOperationRouting

@areek areek added the review label Jul 14, 2014
}
if ("_local".equals(preference)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think now that we have an enum we should just use a switch / case statement here?

@s1monw
Copy link
Contributor

s1monw commented Jul 14, 2014

LGTM please squash and push

@areek areek removed review labels Jul 14, 2014
@areek areek closed this Jul 14, 2014
@clintongormley clintongormley added >feature v2.0.0-beta1 v1.3.0 :Search/Suggesters "Did you mean" and suggestions as you type labels Jun 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :Search/Suggesters "Did you mean" and suggestions as you type v1.3.0 v2.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggester: Phrase suggest option to limit suggestions to exising phrases
3 participants