Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make TaskExecutor cx public and use TaskExecutor for concurrent HNSW graph build #12799

Merged
merged 5 commits into from
Nov 21, 2023

Conversation

shubhamvishu
Copy link
Contributor

Description

This PR proposes to :

  • Make the TaskExecutor public which is currently pkg-private - As we are making more use to concurrency and not restrictive at searching time eg: now at indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge #12660). We could use the TaskExecutor implementation to do this for us.
  • Use TaskExecutor#invokeAll in HnswConcurrentMergeBuilder#build to run the workers concurrently.

Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see more parallelism introduced, and new ways to use the current TaskExecutor!

I initially wondered if it was a good idea to use TaskExecutor outside of the searcher, but it looks like it is already general enough to be able to do so. It has protection against deadlock (for concurrent tasks trying to parallelize further), which applies to any usage.

The fact that a separate instance is created makes sense to me, as the underlying executor won't necessarily be the one for search.

This seems like a good idea all in all.

@jpountz would you like to double check?

Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a small comment on exception handling. Could you add a changes entry please as well?

try {
workers[finalI].run(maxOrd);
return null;
} catch (IOException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to catch IOException here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I retained this as it is. I'll remove it in the next revision

@shubhamvishu
Copy link
Contributor Author

shubhamvishu commented Nov 16, 2023

@javanna I have added the CHANGES entry and addressed the comment. Seems the precommit check fails on :lucene:documentation:markdownToHtml task, which looks unrelated? Not sure.

@@ -53,7 +53,7 @@ public final class TaskExecutor {

private final Executor executor;

TaskExecutor(Executor executor) {
public TaskExecutor(Executor executor) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that its public, you might have to add a JavaDoc here?

Copy link
Contributor Author

@shubhamvishu shubhamvishu Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see....just wondering why it wasn't caught locally with gradlew check. I think it should have? Also the initial revision passed successfully which had it public too. Dunno but maybe there is some opportunity for some sort of fix?

Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for making the changes I requested, I found another small thing, sorry for not noticing that straight-away.

@shubhamvishu
Copy link
Contributor Author

@javanna I have addressed the latest comments now. Thanks!

Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one last small comment, sorry I could have made my earlier comment clearer and avoid more back and forth. Thanks for your patience. Could you also address the merge conflict in the changes file please?

@shubhamvishu
Copy link
Contributor Author

shubhamvishu commented Nov 20, 2023

I left one last small comment, sorry I could have made my earlier comment clearer and avoid more back and forth. Thanks for your patience.

No problem. I have addressed your comment to hold the TE in the classes instead of ES in the latest revision. Thanks for reviewing @javanna !

Could you also address the merge conflict in the changes file please?

Done

* generated by this format to do the merge
*/
public Lucene99HnswVectorsFormat(
int maxConn, int beamWidth, int numMergeWorkers, ExecutorService mergeExec) {
int maxConn, int beamWidth, int numMergeWorkers, TaskExecutor mergeExec) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have rather left ExecutorService as a constructor argument, but create a TaskExecutor out of it and hold a reference to it instead of the executor. Does that make sense to you?

Copy link
Contributor Author

@shubhamvishu shubhamvishu Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I'll change it, but tbh I feel maybe it was better to keep the ES itself instead of storing the TE reference because ES is capable of doing other things as well that TE does not support yet eg: timeout support etc. I understand there is no use case as such currently in hnsw indexing side here but if maybe there could be some in future that would help in having this a bit extensible(?). Just wanted to make this point here.

Copy link
Contributor

@javanna javanna Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for voicing your opinion. My main goal was to create one instance of TaskExecutor per Executor, hence create it as soon as possible. I also believe that we will want to extend TaskExecutor's capabilities instead of relying on executor service capabilities in specific places. The idea would be that we use TaskExecutor for general task execution, and that is used in different places like query rewrite, search against slices, loading of terms statistics, and now also building the graph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@javanna Totally makes sense to me! I have addressed your comment in the latest commit. Please have a look. Thanks!

The idea would be that we use TaskExecutor for general task execution, and that is used in different places like query rewrite, search against slices, loading of terms statistics, and now also building the graph.

+1 to extend TE capabilities and use it for all concurrency use cases.

Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@javanna javanna added this to the 9.9.0 milestone Nov 21, 2023
@javanna javanna merged commit 4628327 into apache:main Nov 21, 2023
4 checks passed
@javanna
Copy link
Contributor

javanna commented Nov 21, 2023

Thanks @shubhamvishu !

javanna pushed a commit to javanna/lucene that referenced this pull request Nov 21, 2023
…graph build (apache#12799)

Make the TaskExecutor public which is currently pkg-private. At indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge apache#12660). We could use the TaskExecutor implementation to do this for us.
Use TaskExecutor#invokeAll in HnswConcurrentMergeBuilder#build to run the workers concurrently.
javanna pushed a commit to javanna/lucene that referenced this pull request Nov 21, 2023
…graph build (apache#12799)

Make the TaskExecutor public which is currently pkg-private. At indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge apache#12660). We could use the TaskExecutor implementation to do this for us.
Use TaskExecutor#invokeAll in HnswConcurrentMergeBuilder#build to run the workers concurrently.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants