More Like This Query: defaults to all possible fields for items
Items with no specified fields now default to all the possible fields from the
document source. Previously, 'fields' had to be specified either as a
top-level parameter or for each item. The default behavior is now similar to
the MLT API.

Closes #7382
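
The rule described in the message above can be sketched in plain Java. This is a minimal illustration, not the code from this commit: the class and method names are hypothetical, a `null` list stands for "not specified", and the real parser also checks the item's fetch-source context, which is left out here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of the per-item field default.
final class MltItemFields {

    /**
     * Picks the fields to fetch for one item.
     * topLevelFields == null: no top-level 'fields' parameter was given.
     * itemFields == null: the item did not name its own fields.
     */
    static List<String> resolveItemFields(List<String> topLevelFields, List<String> itemFields) {
        if (itemFields != null) {
            return itemFields;           // fields set on the item are kept
        }
        if (topLevelFields == null) {
            return Arrays.asList("*");   // new default: all possible fields
        }
        return topLevelFields;           // otherwise inherit the top-level list
    }

    public static void main(String[] args) {
        System.out.println(resolveItemFields(null, null));                            // [*]
        System.out.println(resolveItemFields(Arrays.asList("title", "body"), null)); // [title, body]
        System.out.println(resolveItemFields(null, Arrays.asList("tags")));          // [tags]
    }
}
```

Before this change, the first case would presumably have fallen back to the query's default field rather than `*`.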
alexksikes committed Aug 22, 2014
1 parent a1a9aad commit e78694a
Showing 2 changed files with 23 additions and 11 deletions.
14 changes: 8 additions & 6 deletions docs/reference/query-dsl/queries/mlt-query.asciidoc
@@ -46,7 +46,6 @@ If only one document is specified, the query behaves the same as the
}
--------------------------------------------------


`more_like_this` can be shortened to `mlt`.

Under the hood, `more_like_this` simply creates multiple `should` clauses in a `bool` query of
@@ -61,26 +60,29 @@ such as `min_word_length`, `max_word_length` or `stop_words`, to control what
terms should be considered as interesting. In order to give more weight to
more interesting terms, each boolean clause associated with a term could be
boosted by the term tf-idf score times some boosting factor `boost_terms`.

+When a search for multiple `docs` is issued, More Like This generates a
+`more_like_this` query per document field in `fields`. These `fields` are
+specified as a top level parameter or within each `doc`.

IMPORTANT: The fields must be indexed and of type `string`. Additionally, when
using `ids` or `docs`, the fields must be either `stored`, store `term_vector`
or `_source` must be enabled.

The `more_like_this` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`fields` |A list of the fields to run the more like this query against.
-Defaults to the `_all` field.
+Defaults to the `_all` field for `like_text` and to all possible fields
+for `ids` or `docs`.

|`like_text` |The text to find documents like it, *required* if `ids` or `docs` are
not specified.

|`ids` or `docs` |A list of documents following the same syntax as the
-<<docs-multi-get,Multi GET API>>. This parameter is *required* if
-`like_text` is not specified. The texts are fetched from `fields` unless
-specified in each `doc`, and cannot be set to `_all`.
+<<docs-multi-get,Multi GET API>>. The text is fetched from `fields`
+unless specified otherwise in each `doc`.

|`include` |When using `ids` or `docs`, specifies whether the documents should be
included from the search. Defaults to `false`.
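
The documentation hunk above describes how interesting terms are filtered by `min_word_length`, `max_word_length` and `stop_words`, then boosted by tf-idf times `boost_terms`. The following is a toy model of that idea in plain Java, not Lucene's or Elasticsearch's implementation. The class name, the method signature, the idf formula `1 + ln(numDocs / (docFreq + 1))`, and the choice to treat a `maxWordLength` of 0 as "no upper bound" are all assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of interesting-term selection.
final class InterestingTerms {

    /** A term with the boost this sketch would attach to its clause. */
    static final class ScoredTerm {
        final String term;
        final double score;

        ScoredTerm(String term, double score) {
            this.term = term;
            this.score = score;
        }
    }

    static List<ScoredTerm> select(List<String> tokens, Map<String, Integer> docFreq, int numDocs,
                                   int minWordLength, int maxWordLength,
                                   Set<String> stopWords, double boostTerms) {
        // Count term frequencies, skipping tokens the filters reject.
        Map<String, Integer> termFreq = new HashMap<>();
        for (String token : tokens) {
            if (token.length() < minWordLength) continue;
            if (maxWordLength > 0 && token.length() > maxWordLength) continue;
            if (stopWords.contains(token)) continue;
            termFreq.merge(token, 1, Integer::sum);
        }
        // Score each surviving term as tf * idf * boostTerms.
        List<ScoredTerm> scored = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : termFreq.entrySet()) {
            int df = docFreq.getOrDefault(entry.getKey(), 0);
            double idf = Math.log((double) numDocs / (df + 1)) + 1.0;
            scored.add(new ScoredTerm(entry.getKey(), entry.getValue() * idf * boostTerms));
        }
        // Highest score first.
        scored.sort(Comparator.comparingDouble((ScoredTerm s) -> s.score).reversed());
        return scored;
    }

    public static void main(String[] args) {
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("quick", 1);
        docFreq.put("fox", 9);
        Set<String> stopWords = new HashSet<>(Arrays.asList("the"));
        List<ScoredTerm> terms = select(Arrays.asList("the", "quick", "quick", "fox", "a"),
                docFreq, 10, 2, 0, stopWords, 1.0);
        for (ScoredTerm t : terms) {
            System.out.println(t.term + " " + t.score);
        }
    }
}
```

In this model each `ScoredTerm` would become one `should` clause whose boost is the score. The rarer, more frequent-in-document term ("quick") ranks above the common one ("fox").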
@@ -164,28 +164,34 @@ public Query parse(QueryParseContext parseContext) throws IOException, QueryPars
if (mltQuery.getLikeText() == null && items.isEmpty()) {
throw new QueryParsingException(parseContext.index(), "more_like_this requires at least 'like_text' or 'ids/docs' to be specified");
}
-if (moreLikeFields != null && moreLikeFields.isEmpty()) {
-throw new QueryParsingException(parseContext.index(), "more_like_this requires 'fields' to be non-empty");
-}

// set analyzer
if (analyzer == null) {
analyzer = parseContext.mapperService().searchAnalyzer();
}
mltQuery.setAnalyzer(analyzer);

-if (moreLikeFields == null) {
+// set like text fields
+boolean useDefaultField = (moreLikeFields == null);
+if (useDefaultField) {
moreLikeFields = Lists.newArrayList(parseContext.defaultField());
+} else if (moreLikeFields.isEmpty()) {
+throw new QueryParsingException(parseContext.index(), "more_like_this requires 'fields' to be non-empty");
}

// possibly remove unsupported fields
removeUnsupportedFields(moreLikeFields, analyzer, failOnUnsupportedField);
if (moreLikeFields.isEmpty()) {
return null;
}
mltQuery.setMoreLikeFields(moreLikeFields.toArray(Strings.EMPTY_ARRAY));

// support for named query
if (queryName != null) {
parseContext.addNamedQuery(queryName, mltQuery);
}

// handle items
if (!items.isEmpty()) {
// set default index, type and fields if not specified
for (MultiGetRequest.Item item : items) {
@@ -201,7 +207,11 @@ public Query parse(QueryParseContext parseContext) throws IOException, QueryPars
}
}
if (item.fields() == null && item.fetchSourceContext() == null) {
-item.fields(moreLikeFields.toArray(new String[moreLikeFields.size()]));
+if (useDefaultField) {
+item.fields("*");
+} else {
+item.fields(moreLikeFields.toArray(new String[moreLikeFields.size()]));
+}
}
}
// fetching the items with multi-termvectors API
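
The order of the top-level checks in the parser hunk above can be modelled without any Elasticsearch classes. This is a rough sketch under stated assumptions: the class and method names are hypothetical, `IllegalArgumentException` stands in for `QueryParsingException`, and a plain set of supported names stands in for `removeUnsupportedFields` (whose fail-on-unsupported option is not modelled).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the top-level 'fields' handling in the parser.
final class MltFieldValidation {

    /** Returns the fields to query, or null when no supported field is left. */
    static List<String> resolveQueryFields(String likeText, int itemCount, List<String> fields,
                                           String defaultField, Set<String> supportedFields) {
        if (likeText == null && itemCount == 0) {
            throw new IllegalArgumentException(
                    "more_like_this requires at least 'like_text' or 'ids/docs' to be specified");
        }
        List<String> resolved;
        if (fields == null) {
            // No 'fields' given: fall back to the default field.
            resolved = new ArrayList<>(Collections.singletonList(defaultField));
        } else if (fields.isEmpty()) {
            throw new IllegalArgumentException("more_like_this requires 'fields' to be non-empty");
        } else {
            resolved = new ArrayList<>(fields);
        }
        // Stand-in for removing unsupported fields.
        resolved.retainAll(supportedFields);
        return resolved.isEmpty() ? null : resolved;
    }

    public static void main(String[] args) {
        Set<String> supported = new HashSet<>(Arrays.asList("_all", "title"));
        System.out.println(resolveQueryFields("some text", 0, null, "_all", supported));                  // [_all]
        System.out.println(resolveQueryFields("some text", 0, Arrays.asList("location"), "_all", supported)); // null
    }
}
```

Folding the empty-list check into the same branch as the default keeps a single place that knows whether the default was used, which is what lets the item loop choose `*` later.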