Merge search_type=count and size=0. #9296

Merged
jpountz merged 2 commits into elastic:master from enhancement/remove_count_search_type on Mar 31, 2015

Conversation

jpountz
Contributor

@jpountz jpountz commented Jan 14, 2015

This commit brings the benefits of the count search type to search requests
that have a size of 0:

  • a single round-trip to shards (no fetch phase)
  • ability to use the query cache

Since count now provides no benefits over query_then_fetch, it has been
deprecated.

Close #7630
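
As a rough illustration of the first benefit listed above, the sketch below models which phases a request goes through depending on `size`. `SearchPhasePlanner` and `phasesFor` are hypothetical names invented for this example; they are not classes or methods from the Elasticsearch code base.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model, not Elasticsearch code: all names are illustrative only.
final class SearchPhasePlanner {

    /** Phases that would run for a query_then_fetch request with the given size. */
    static List<String> phasesFor(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be >= 0, got " + size);
        }
        if (size == 0) {
            // No hits are requested, so there is nothing to fetch:
            // a single round-trip to the shards is enough.
            return Collections.singletonList("query");
        }
        return Arrays.asList("query", "fetch");
    }

    public static void main(String[] args) {
        System.out.println("size=0  -> " + phasesFor(0));
        System.out.println("size=10 -> " + phasesFor(10));
    }
}
```

The point of the sketch is only that the decision depends on `size`, not on a dedicated search type, which is why `count` no longer adds anything.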

@jpountz jpountz force-pushed the enhancement/remove_count_search_type branch from a573a8e to 1e28b5c Compare January 14, 2015 11:44
@jpountz jpountz added review :Search/Search Search-related issues that do not fall into other categories v2.0.0-beta1 labels Jan 14, 2015
@@ -348,12 +339,8 @@ public static void writeTopDocs(StreamOutput out, TopDocs topDocs, int from) thr
out.writeBoolean(sortField.getReverse());
}

-out.writeVInt(topDocs.scoreDocs.length - from);
-int index = 0;
+out.writeVInt(topDocs.scoreDocs.length);
Contributor

Why do we not subtract from now?

Contributor Author

This method was always called with from = 0 so I simplified it.

Contributor

ok, fair enough

@jpountz
Contributor Author

jpountz commented Jan 15, 2015

@colings86 Thanks for the review, I pushed a new commit.

@colings86
Contributor

@jpountz LGTM but as I am new to this area you might want to get another review?

@jpountz
Contributor Author

jpountz commented Jan 15, 2015

Makes sense. @martijnvg @bleskes @kimchy Maybe one of you would be the right person to look at this PR?

@martijnvg
Member

I like the fact that without any additional overhead we can rely on size=0 to optimize the search execution. LGTM

if (context.size() != 0) {
return false;
}
// We cannot cache with DFS because results depend not only on the content of the index but also
Contributor

the only thing I could think of here is top hits aggs relying on score. Are there other uses where this goes wrong? (o.w. I think it might be handy to add that to the comment, it's not obvious imho)

Contributor Author

There is also an issue with search scripts since all search scripts can read the score of the current document. I'll add a comment.
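
To make the cacheability rules discussed in this thread easier to follow, here is a minimal sketch of the two checks visible in the diff. `QueryCachePolicy`, its `SearchType` enum and the two-argument `canCache` are hypothetical stand-ins; the real method takes a `ShardSearchRequest` and a `SearchContext` and may apply further conditions that are not shown in this excerpt.

```java
// Hypothetical stand-ins for the real request/context types; only the two
// checks quoted in the diff are modelled here.
final class QueryCachePolicy {

    enum SearchType { QUERY_THEN_FETCH, DFS_QUERY_THEN_FETCH }

    static boolean canCache(SearchType searchType, int size) {
        // Only requests that return no hits are candidates for the query cache.
        if (size != 0) {
            return false;
        }
        // DFS is excluded: per the comment in the diff, its results depend on
        // more than the content of the index. The review thread names top hits
        // aggregations and search scripts as places where scores are visible.
        if (searchType == SearchType.DFS_QUERY_THEN_FETCH) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canCache(SearchType.QUERY_THEN_FETCH, 0));
        System.out.println(canCache(SearchType.QUERY_THEN_FETCH, 10));
        System.out.println(canCache(SearchType.DFS_QUERY_THEN_FETCH, 0));
    }
}
```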

@jpountz
Contributor Author

jpountz commented Jan 15, 2015

@bleskes Pushed a new commit.

@@ -218,7 +221,7 @@ public boolean canCache(ShardSearchRequest request, SearchContext context) {
* to have a single load operation that will cause other requests with the same key to wait till its loaded an reuse
* the same cache.
*/
-public QuerySearchResultProvider load(final ShardSearchRequest request, final SearchContext context, final QueryPhase queryPhase) throws Exception {
+public void load(final ShardSearchRequest request, final SearchContext context, final QueryPhase queryPhase) throws Exception {
Contributor

Now that it doesn't return a value, maybe rename it to loadIntoContext, or add something to the Javadoc to indicate where the output is put.

@bleskes
Contributor

bleskes commented Jan 16, 2015

LGTM. I left some comments with questions and suggestions. I was also wondering about the implications of this for stored search templates and whether we have a backward-compatibility (bwc) issue there. I'm not sure we need to solve it (no one has 1TB of stored searches, yet :)) but we have to document it if so.

@kimchy
Member

kimchy commented Jan 16, 2015

it looks good, I know this change is aimed at master (2.0), I wonder if we can still support search_type count (at least on the rest layer) and convert it to size=0 and query_then_fetch?

Also, it's a shame we are losing the non-deserialization aspect of the query cache. Is there still a chance to keep it?
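
The conversion suggested in this comment (accept `search_type=count` at the REST layer and turn it into `size=0` plus `query_then_fetch`) could be sketched roughly as follows. `CountSearchTypeShim` and `normalize` are hypothetical names, and request parameters are modelled as a plain string map purely for illustration; this is not how the Elasticsearch REST layer represents requests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: it only illustrates the
// parameter rewrite suggested in the comment above.
final class CountSearchTypeShim {

    /** Returns a copy of the parameters with search_type=count rewritten. */
    static Map<String, String> normalize(Map<String, String> params) {
        Map<String, String> normalized = new HashMap<>(params);
        if ("count".equals(normalized.get("search_type"))) {
            // count returns no hits, which is exactly what size=0 expresses.
            normalized.put("search_type", "query_then_fetch");
            normalized.put("size", "0");
        }
        return normalized;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("search_type", "count");
        System.out.println(normalize(params));
    }
}
```

Returning a copy rather than mutating the input keeps the original request available, for example to report that a deprecated value was used.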

@jpountz
Contributor Author

jpountz commented Jan 16, 2015

it looks good, I know this change is aimed at master (2.0), I wonder if we can still support search_type count (at least on the rest layer) and convert it to size=0 and query_then_fetch?

Why should we keep it? Isn't it less confusing to have a single way to not execute the fetch phase?

Also, it's a shame we are losing the non-deserialization aspect of the query cache. Is there still a chance to keep it?

Is it really worth doing? When I worked on the PR it made things complicated because other code had to make sure to use the QuerySearchResult that came back from the cache instead of the one that was attached to the context. And running a few benchmarks with this PR vs. master didn't show differences.

@javanna
Member

javanna commented Mar 17, 2015

What's the status here? I think we should get it in. I would love to go ahead with #9117 as a next step and make the count API a shortcut to size=0.

@jpountz
Contributor Author

jpountz commented Mar 19, 2015

@javanna Agreed it is a good change! :) I did not forget about it, just wanted to get #9595 in first, as it would get this change more coverage. I have been traveling recently but will get back to these issues soon.

Also I totally agree that the count API should become a shortcut to the search API and be documented as such!

@javanna
Member

javanna commented Mar 19, 2015

cool thanks @jpountz no pressure :)

@s1monw s1monw self-assigned this Mar 20, 2015
jpountz added a commit to jpountz/elasticsearch that referenced this pull request Mar 30, 2015
Even if there is a background thread that periodically closes search contexts
that seem unused (every minute by default), it is important to close search
contexts as soon as possible in order to not keep unnecessary open files or
to prevent segments from being deleted.

This check would help ensure that refactorings of the SearchContext management
like elastic#9296 are correct.
@jpountz jpountz force-pushed the enhancement/remove_count_search_type branch from 970a87a to 64f817a Compare March 30, 2015 15:55
@jpountz
Contributor Author

jpountz commented Mar 30, 2015

OK, I pushed a new version of this change that keeps search_type=count but deprecates it using ParseField. I had to rebase since quite a few changes have happened since the last version of this PR. I have had a couple of reviews already and I think I addressed all concerns. So if you would like to give it another look or have concerns, please tell me. Otherwise I will merge it soon.
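
The general idea of keeping a value accepted while marking it deprecated can be sketched as below. This is a hypothetical illustration only: `SearchTypeParser` is an invented class and does not reproduce the `ParseField` API or how this PR actually wires it in. The list of search types is taken from the documentation hunk quoted later in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; this is NOT the ParseField API, only the general idea
// of still accepting a deprecated value while flagging its use.
final class SearchTypeParser {

    private static final List<String> SUPPORTED = Arrays.asList(
            "dfs_query_then_fetch", "dfs_query_and_fetch",
            "query_then_fetch", "query_and_fetch", "scan", "count");

    private static final List<String> DEPRECATED = Collections.singletonList("count");

    private final List<String> warnings = new ArrayList<>();

    /** Validates the value and records a warning when it is deprecated. */
    String parse(String value) {
        if (!SUPPORTED.contains(value)) {
            throw new IllegalArgumentException("No search type for [" + value + "]");
        }
        if (DEPRECATED.contains(value)) {
            warnings.add("search_type [" + value + "] is deprecated, use size: 0 instead");
        }
        return value;
    }

    List<String> warnings() {
        return Collections.unmodifiableList(warnings);
    }

    public static void main(String[] args) {
        SearchTypeParser parser = new SearchTypeParser();
        parser.parse("count");
        System.out.println(parser.warnings());
    }
}
```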

request without any docs (represented in `total_hits`), and possibly,
including aggregations as well. In general, this is preferable to the `count`
API as it provides more options.

Member

nitpick...maybe you wanna leave this and mark it as deprecated?

@javanna
Member

javanna commented Mar 31, 2015

I left a bunch of docs related comments, mainly around the fact that we are now deprecating search_type count instead of removing it, hence some things need to be adjusted there.

@@ -94,7 +94,7 @@ Defaults to no terminate_after.

|`search_type` |The type of the search operation to perform. Can be
`dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`,
-`query_and_fetch`, `count`, `scan`. Defaults to `query_then_fetch`. See
+`query_and_fetch`, `scan` or `count` (deprecated). Defaults to `query_then_fetch`. See
Member

I think you could do this here: deprecated[2.0,Replaced by size: 0]

@jpountz jpountz force-pushed the enhancement/remove_count_search_type branch from 1d8a2ea to a608db1 Compare March 31, 2015 09:34
jpountz added a commit that referenced this pull request Mar 31, 2015
…_type

Search: Merge `search_type=count` and `size=0`.

Close #9226
@jpountz jpountz merged commit 0a6be2c into elastic:master Mar 31, 2015
@jpountz jpountz deleted the enhancement/remove_count_search_type branch March 31, 2015 09:42
@jpountz jpountz removed the review label Mar 31, 2015
jpountz added a commit that referenced this pull request Mar 31, 2015
@clintongormley clintongormley changed the title Search: Merge search_type=count and size=0. Merge search_type=count and size=0. Jun 8, 2015
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v2.0.0-beta1
Development

Successfully merging this pull request may close these issues.

If size:0, automatically set search_type=count
8 participants