-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make TaskExecutor cx public and use TaskExecutor for concurrent HNSW graph build #12799
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see more parallelism introduced, and new ways to use the current TaskExecutor!
I initially wondered if it was a good idea to use TaskExecutor outside of the searcher, but it looks like it is already general enough to be able to do so. It has protection against deadlock (for concurrent tasks trying to parallelize further), which applies to any usage.
The fact that a separate instance is created makes sense to me, as the underlying executor won't necessarily be the one for search.
This seems like a good idea all in all.
@jpountz would you like to double check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a small comment on exception handling. Could you add a changes entry please as well?
try { | ||
workers[finalI].run(maxOrd); | ||
return null; | ||
} catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to catch IOException here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. I retained this as it is. I'll remove it in the next revision
61a5f33
to
c374e20
Compare
@javanna I have added the CHANGES entry and addressed the comment. Seems the precommit check fails on |
@@ -53,7 +53,7 @@ public final class TaskExecutor { | |||
|
|||
private final Executor executor; | |||
|
|||
TaskExecutor(Executor executor) { | |||
public TaskExecutor(Executor executor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that its public, you might have to add a JavaDoc here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh I see....just wondering why it wasn't caught locally with gradlew check
. I think it should have? Also the initial revision passed successfully which had it public too. Dunno but maybe there is some opportunity for some sort of fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for making the changes I requested, I found another small thing, sorry for not noticing that straight-away.
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java
Outdated
Show resolved
Hide resolved
6a8d8cd
to
3456d7d
Compare
@javanna I have addressed the latest comments now. Thanks! |
lucene/core/src/java/org/apache/lucene/util/hnsw/ConcurrentHnswMerger.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one last small comment, sorry I could have made my earlier comment clearer and avoid more back and forth. Thanks for your patience. Could you also address the merge conflict in the changes file please?
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java
Outdated
Show resolved
Hide resolved
8a24931
to
e42abc7
Compare
No problem. I have addressed your comment to hold the TE in the classes instead of ES in the latest revision. Thanks for reviewing @javanna !
Done |
* generated by this format to do the merge | ||
*/ | ||
public Lucene99HnswVectorsFormat( | ||
int maxConn, int beamWidth, int numMergeWorkers, ExecutorService mergeExec) { | ||
int maxConn, int beamWidth, int numMergeWorkers, TaskExecutor mergeExec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have rather left ExecutorService as a constructor argument, but create a TaskExecutor out of it and hold a reference to it instead of the executor. Does that make sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll change it, but tbh I feel maybe it was better to keep the ES itself instead of storing the TE reference because ES is capable of doing other things as well that TE does not support yet eg: timeout support etc. I understand there is no use case as such currently in hnsw indexing side here but if maybe there could be some in future that would help in having this a bit extensible(?). Just wanted to make this point here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for voicing your opinion. My main goal was to create one instance of TaskExecutor per Executor, hence create it as soon as possible. I also believe that we will want to extend TaskExecutor's capabilities instead of relying on executor service capabilities in specific places. The idea would be that we use TaskExecutor for general task execution, and that is used in different places like query rewrite, search against slices, loading of terms statistics, and now also building the graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna Totally makes sense to me! I have addressed your comment in the latest commit. Please have a look. Thanks!
The idea would be that we use TaskExecutor for general task execution, and that is used in different places like query rewrite, search against slices, loading of terms statistics, and now also building the graph.
+1 to extend TE capabilities and use it for all concurrency use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @shubhamvishu ! |
…graph build (apache#12799) Make the TaskExecutor public which is currently pkg-private. At indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge apache#12660). We could use the TaskExecutor implementation to do this for us. Use TaskExecutor#invokeAll in HnswConcurrentMergeBuilder#build to run the workers concurrently.
…graph build (apache#12799) Make the TaskExecutor public which is currently pkg-private. At indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge apache#12660). We could use the TaskExecutor implementation to do this for us. Use TaskExecutor#invokeAll in HnswConcurrentMergeBuilder#build to run the workers concurrently.
Description
This PR proposes to :
TaskExecutor
public which is currently pkg-private - As we are making more use to concurrency and not restrictive at searching time eg: now at indexing time we concurrently create the hnsw graph (Concurrent HNSW Merge #12660). We could use theTaskExecutor
implementation to do this for us.TaskExecutor#invokeAll
inHnswConcurrentMergeBuilder#build
to run the workers concurrently.