New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge search_type=count
and size=0
.
#9296
Merge search_type=count
and size=0
.
#9296
Conversation
a573a8e
to
1e28b5c
Compare
@@ -348,12 +339,8 @@ public static void writeTopDocs(StreamOutput out, TopDocs topDocs, int from) thr | |||
out.writeBoolean(sortField.getReverse()); | |||
} | |||
|
|||
out.writeVInt(topDocs.scoreDocs.length - from); | |||
int index = 0; | |||
out.writeVInt(topDocs.scoreDocs.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not subtract from
now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method was always called with from
= 0 so I simplified it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fair enough
@colings86 Thanks for the review, I pushed a new commit. |
@jpountz LGTM but as I am new to this area you might want to get another review? |
Makes sense. @martijnvg @bleskes @kimchy Maybe one of you would be the right person to look at this PR? |
I like the fact that without any additional overhead we can rely on |
if (context.size() != 0) { | ||
return false; | ||
} | ||
// We cannot cache with DFS because results depend not only on the content of the index but also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only thing I could think of here is top hits aggs relying on score. Are there other uses where this goes wrong? (o.w. I think it might be handy to add that to the comment, it's not obvious imho)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also an issue with search scripts since all search scripts can read the score of the current document. I'll add a comment.
@bleskes Pushed a new commit. |
@@ -218,7 +221,7 @@ public boolean canCache(ShardSearchRequest request, SearchContext context) { | |||
* to have a single load operation that will cause other requests with the same key to wait till its loaded an reuse | |||
* the same cache. | |||
*/ | |||
public QuerySearchResultProvider load(final ShardSearchRequest request, final SearchContext context, final QueryPhase queryPhase) throws Exception { | |||
public void load(final ShardSearchRequest request, final SearchContext context, final QueryPhase queryPhase) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that it doesn't return a value, maybe rename in to loadIntoContext , or add something to the java docs to indicate where the output is put.
LGTM. I left some comments with questions and suggestions. I was also wondering about the implications of this stored search templates and whether we have some bwc compatibility issue there. I'm not sure we need to solve it (no one has 1TB of stored searches, yet :)) but we have to document it if so. |
it looks good, I know this change is aimed at master (2.0), I wonder if we can still support search_type count (at least on the rest layer) and convert it to size=0 and query_then_fetch? Also, its a shame we are loosing the non deserialization aspect of the query cache, is there still a chance to keep it? |
Why should we keep it, it's less confusing to have a single way to not execute the fetch phase?
Is it really worth doing? When I worked on the PR it made things complicated because other code had to make sure to use the QuerySearchResult that came back from the cache instead of the one that was attached to the context. And running a few benchmarks with this PR vs. master didn't show differences. |
What's the status here? I think we should get it, would love to go ahead with #9117 as a next step and make the count api become a shortcut to |
@javanna Agreed it is a good change! :) I did not forget about it, just wanted to get #9595 in first, as it would get this change more coverage. I have been traveling recently but will get back to these issues soon. Also I totally agree that the count API should become a shortcut to the search API and be documented as such! |
cool thanks @jpountz no pressure :) |
Even if there is a background thread that periodically closes search contexts that seem unused (every minute by default), it is important to close search contexts as soon as possible in order to not keep unnecessary open files or to prevent segments from being deleted. This check would help ensure that refactorings of the SearchContext management like elastic#9296 are correct.
970a87a
to
64f817a
Compare
OK, I pushed a new version of this change that keeps |
request without any docs (represented in `total_hits`), and possibly, | ||
including aggregations as well. In general, this is preferable to the `count` | ||
API as it provides more options. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick...maybe you wanna leave this and mark it as deprecated?
I left a bunch of docs related comments, mainly around the fact that we are now deprecating search_type count instead of removing it, hence some things need to be adjusted there. |
@@ -94,7 +94,7 @@ Defaults to no terminate_after. | |||
|
|||
|`search_type` |The type of the search operation to perform. Can be | |||
`dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`, | |||
`query_and_fetch`, `count`, `scan`. Defaults to `query_then_fetch`. See | |||
`query_and_fetch`, `scan` or `count` (deprecated). Defaults to `query_then_fetch`. See |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could do this here: deprecated[2.0,Replaced by size: 0
]
Even if there is a background thread that periodically closes search contexts that seem unused (every minute by default), it is important to close search contexts as soon as possible in order to not keep unnecessary open files or to prevent segments from being deleted. This check would help ensure that refactorings of the SearchContext management like elastic#9296 are correct.
This commit brings the benefits of the `count` search type to search requests that have a `size` of 0: - a single round-trip to shards (no fetch phase) - ability to use the query cache Since `count` now provides no benefits over `query_then_fetch`, it has been deprecated. Close elastic#7630
1d8a2ea
to
a608db1
Compare
…_type Search: Merge `search_type=count` and `size=0`. Close #9226
search_type=count
and size=0
.search_type=count
and size=0
.
This commit brings the benefits of the
count
search type to search requeststhat have a
size
of 0:Since
count
now provides no benefits overquery_then_fetch
, it has beendeprecated.
Close #7630