Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add terminate after behavior to concurrent segment search #5143

Merged
merged 5 commits into from
Oct 16, 2023

Conversation

kolchfa-aws
Copy link
Collaborator

Fixes #5142

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
@kolchfa-aws kolchfa-aws added release-notes PR: Include this PR in the automated release notes v2.11.0 labels Oct 3, 2023
@kolchfa-aws kolchfa-aws self-assigned this Oct 3, 2023
_search-plugins/concurrent-segment-search.md Outdated Show resolved Hide resolved
Co-authored-by: Jay Deng <jayd0104@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Copy link
Collaborator

@hdhalter hdhalter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a couple of tweaks and need some clarification.

_search-plugins/concurrent-segment-search.md Show resolved Hide resolved
The [`terminate_after` search parameter]({{site.url}}{{site.baseurl}}/api-reference/search/#url-parameters) is used to terminate a search request once a specified number of documents has been collected. In the non-concurrent search workflow, this count is evaluated for each shard. However, in the concurrent search workflow, it is evaluated for each leaf slice instead in order to avoid synchronizing document counts between threads. With concurrent search, the request performs more work than expected because each segment slice on the shard collects up to the specified number of documents. The intent to terminate collection after the threshold is reached is evaluated at the slice level. Thus, the hit count in the results will be greater than the `terminate_after` threshold but less than `slice_count * terminate_after`. The actual number of returned hits will be controlled by the `size` parameter.
The [`terminate_after` search parameter]({{site.url}}{{site.baseurl}}/api-reference/search/#url-parameters) is used to terminate a search request once a specified number of documents has been collected. If you include the `terminate_after` parameter in a request, concurrent segment search is disabled and the request is run in a non-concurrent manner.

Typically, queries are expected to be used with smaller `terminate_after` values and thus complete very quickly because the search is performed on a reduced dataset, so concurrent search may not improve performance in this case. Moreover, when `terminate_after` is used with other search request parameters, such as `track_total_hits` and `size`, it adds complexity and changes the expected query behavior. Falling back to non-concurrent path for these search requests ensures consistent results between concurrent and non-concurrent requests.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to read this a few times to understand the meaning. Are you recommending that they use smaller terminate_after values so that the query completes more quickly for non-concurrent searches? What do you mean by 'falling back to non-concurrent path'? I think you mean that by using this parameter, you can ensure that concurrent and non-concurrent requests perform more consistently?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what it means: A. If you use a terminate_after parameter, the request completes very quickly anyway so there's no use for concurrent search. B. If you use a terminate_after parameter with other search parameters, concurrent and non-concurrent requests may produce different results. Because of A and B, we decided that if you use a terminate_after parameter, we'll always run the request as non-concurrent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hdhalter I've rewritten the paragraph. Please review again. Thank you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Falling back is a common computer science term.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
@hdhalter hdhalter added the 4 - Doc Review PR: Doc review in progress label Oct 10, 2023
Copy link
Collaborator

@vagimeli vagimeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@natebower natebower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolchfa-aws LGTM with one minor change. Thanks!

_search-plugins/concurrent-segment-search.md Outdated Show resolved Hide resolved
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
@kolchfa-aws kolchfa-aws added 6 - Done but waiting to merge PR: The work is done and ready to merge and removed 4 - Doc Review PR: Doc review in progress labels Oct 10, 2023
@kolchfa-aws kolchfa-aws merged commit cb2c2f5 into main Oct 16, 2023
5 checks passed
harshavamsi pushed a commit to harshavamsi/documentation-website that referenced this pull request Oct 31, 2023
…-project#5143)

* Add terminate after behavior to concurrent segment search

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _search-plugins/concurrent-segment-search.md

Co-authored-by: Jay Deng <jayd0104@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Doc review feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Remove extra space

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _search-plugins/concurrent-segment-search.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Jay Deng <jayd0104@gmail.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
vagimeli pushed a commit that referenced this pull request Dec 21, 2023
* Add terminate after behavior to concurrent segment search

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _search-plugins/concurrent-segment-search.md

Co-authored-by: Jay Deng <jayd0104@gmail.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Doc review feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Remove extra space

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _search-plugins/concurrent-segment-search.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Jay Deng <jayd0104@gmail.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
@kolchfa-aws kolchfa-aws deleted the terminate-after branch March 28, 2024 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6 - Done but waiting to merge PR: The work is done and ready to merge release-notes PR: Include this PR in the automated release notes v2.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DOC] Concurrent Segment Search is disabled in terminate_after workflow
5 participants