Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Executes incremental reduce in the search thread pool #58461

Merged
merged 19 commits into from
Jul 28, 2020

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Jun 23, 2020

This change forks the execution of partial reduces in the coordinating node to the search thread pool.
It also ensures that partial reduces are executed sequentially and asynchronously in order to limit the
memory and cpu that a single search request can use but also to avoid blocking a network thread.
If a partial reduce fails with an exception, the search request is cancelled and the reporting of the error is
delayed to the start of the fetch phase (when the final reduce is performed). This ensures that we cleanup the
in-flight search requests before returning an error to the user.

Closes #53411
Relates #51857

This change forks the execution of partial
reduces in the coordinating node to the search thread pool.
It also ensures that partial reduces are executed sequentially
and asynchronously in order to limit the memory and cpu that a
single search request can use but also to avoid blocking a
network thread.
If a partial reduce fails with an exception, the search
request is cancelled and the reporting of the error is
delayed to the start of the fetch phase (when the final
reduce is performed). This ensures that we cleanup the
in-flight search requests before returning an error to
the user.

Closes elastic#53411
Relates elastic#51857
@jimczi jimczi added >enhancement :Search/Search Search-related issues that do not fall into other categories v8.0.0 v7.9.0 labels Jun 23, 2020
@jimczi jimczi requested review from nik9000 and javanna June 23, 2020 22:51
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jun 23, 2020
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of minor comments/questions. I think I will have to look at it another 3/4 times before I can properly review it :)

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few things, mostly small. I think this is the sort of thing that we need, but I also feel like I don't know JVM threading stuff well enough to be super clear. I feel like we're being fairly paranoid here with the threading stuff. Which is probably fine because this won't trigger on small searches. But I don't know enough to be sure that my instinct is right about it either.

@jimczi jimczi added v7.10.0 and removed v7.9.0 labels Jul 20, 2020
@jimczi
Copy link
Contributor Author

jimczi commented Jul 27, 2020

Thanks for looking @nik9000 @javanna . I pushed another commit to address your comments and to simplify the change a bit. Would you mind taking another look ?

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another look this morning and I think I mostly understand it. I left two small things.

@jimczi jimczi merged commit 13e2004 into elastic:master Jul 28, 2020
@jimczi jimczi deleted the partial_reduce_thread_pool branch July 28, 2020 09:35
jimczi added a commit that referenced this pull request Jul 28, 2020
This change forks the execution of partial
reduces in the coordinating node to the search thread pool.
It also ensures that partial reduces are executed sequentially
and asynchronously in order to limit the memory and cpu that a
single search request can use but also to avoid blocking a
network thread.
If a partial reduce fails with an exception, the search
request is cancelled and the reporting of the error is
delayed to the start of the fetch phase (when the final
reduce is performed). This ensures that we cleanup the
in-flight search requests before returning an error to
the user.

Closes #53411
Relates #51857
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Partial reduce failures can throw AssertionError
5 participants