Add terminate after behavior to concurrent segment search #5143
Conversation
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: Jay Deng <jayd0104@gmail.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
I just have a couple of tweaks and need some clarification.
The [`terminate_after` search parameter]({{site.url}}{{site.baseurl}}/api-reference/search/#url-parameters) is used to terminate a search request once a specified number of documents has been collected. In the non-concurrent search workflow, this count is evaluated for each shard. However, in the concurrent search workflow, it is evaluated for each leaf slice instead in order to avoid synchronizing document counts between threads. With concurrent search, the request performs more work than expected because each segment slice on the shard collects up to the specified number of documents. The intent to terminate collection after the threshold is reached is evaluated at the slice level. Thus, the hit count in the results will be greater than the `terminate_after` threshold but less than `slice_count * terminate_after`. The actual number of returned hits will be controlled by the `size` parameter.

The [`terminate_after` search parameter]({{site.url}}{{site.baseurl}}/api-reference/search/#url-parameters) is used to terminate a search request once a specified number of documents has been collected. If you include the `terminate_after` parameter in a request, concurrent segment search is disabled and the request is run in a non-concurrent manner.

Typically, queries are expected to be used with smaller `terminate_after` values and thus complete very quickly because the search is performed on a reduced dataset, so concurrent search may not improve performance in this case. Moreover, when `terminate_after` is used with other search request parameters, such as `track_total_hits` and `size`, it adds complexity and changes the expected query behavior. Falling back to non-concurrent path for these search requests ensures consistent results between concurrent and non-concurrent requests.
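For readers comparing the two versions of the paragraph, the following toy simulation may help show why slice-level evaluation collects more documents than shard-level evaluation. It is only a sketch: the function names, the per-segment document counts, and the slice assignment are made up for illustration, and it models nothing beyond simple counting (it is not how OpenSearch implements collection).

```python
def collect_per_shard(segment_doc_counts, terminate_after):
    """Shard-level evaluation: one counter shared by all segments on the shard."""
    collected = 0
    for count in segment_doc_counts:
        # Take only as many documents as the remaining budget allows.
        collected += min(count, terminate_after - collected)
        if collected >= terminate_after:
            break
    return collected


def collect_per_slice(slices, terminate_after):
    """Slice-level evaluation: every slice keeps its own independent counter."""
    return sum(collect_per_shard(s, terminate_after) for s in slices)


# Hypothetical shard: matching documents per segment, split into two slices.
segments = [40, 25, 10, 5]
slices = [[40, 25], [10, 5]]
terminate_after = 30

shard_hits = collect_per_shard(segments, terminate_after)
slice_hits = collect_per_slice(slices, terminate_after)
upper_bound = len(slices) * terminate_after

print(shard_hits, slice_hits, upper_bound)
```

With these made-up numbers, the shard-level count stops at the threshold, while the slice-level count lands between the threshold and `slice_count * terminate_after`, which is the range described in the earlier wording.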
I had to read this a few times to understand the meaning. Are you recommending that they use smaller `terminate_after` values so that the query completes more quickly for non-concurrent searches? What do you mean by 'falling back to non-concurrent path'? I think you mean that by using this parameter, you can ensure that concurrent and non-concurrent requests perform more consistently?
Here's what it means: A. If you use a `terminate_after` parameter, the request completes very quickly anyway so there's no use for concurrent search. B. If you use a `terminate_after` parameter with other search parameters, concurrent and non-concurrent requests may produce different results. Because of A and B, we decided that if you use a `terminate_after` parameter, we'll always run the request as non-concurrent.
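The decision described above can be sketched as a small predicate. This is a hedged illustration only: `use_concurrent_search` and `concurrent_enabled` are hypothetical names, and treating a missing or zero `terminate_after` as "not set" is an assumption, not something confirmed in this thread.

```python
def use_concurrent_search(request_params, concurrent_enabled=True):
    """Hypothetical helper: decide which search path a request takes.

    Assumption: a missing or zero `terminate_after` means the parameter is not set.
    """
    terminate_after = request_params.get("terminate_after", 0)
    if terminate_after:
        # Any request that sets terminate_after falls back to the non-concurrent path.
        return False
    return concurrent_enabled


print(use_concurrent_search({"terminate_after": 10, "size": 5}))
print(use_concurrent_search({"size": 5}))
```

The first call takes the non-concurrent path because `terminate_after` is set; the second stays eligible for concurrent segment search.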
@hdhalter I've rewritten the paragraph. Please review again. Thank you.
Falling back is a common computer science term.
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
LGTM
@kolchfa-aws LGTM with one minor change. Thanks!
Co-authored-by: Nathan Bower <nbower@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
…-project#5143)
* Add terminate after behavior to concurrent segment search
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update _search-plugins/concurrent-segment-search.md
Co-authored-by: Jay Deng <jayd0104@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Doc review feedback
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Remove extra space
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update _search-plugins/concurrent-segment-search.md
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---------
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Jay Deng <jayd0104@gmail.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Fixes #5142