-
Notifications
You must be signed in to change notification settings - Fork 975
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse neighborqueue during hnsw index build (attempt 2) #12372
Conversation
Additionally, the original change only re-used the candidates queues within a single addNode call, so this is improved in that respect as well. |
Hey @jbellis the change looks nice to me. But, I ran https://github.com/mikemccand/luceneutil Am I missing something? Could you provide the benchmark you ran to track the 2.5% improvement? |
I'm using the million-row sift dataset via this harness https://github.com/jbellis/hnswdemo/tree/benchmarking I believe what is happening is that allocation is basically free and there is enough slack across the JVM in these synthetic benchmarks to soak up the extra GC required -- I do not see any changes when run normally, either. So I forced it to a single core with
(it is a hyperthreaded core so i give it logical cores 0-1) Also if you take a look at the jfr files it is very clear that a significant amount of allocation is gone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change seems sane to me. Thanks for the optimizations!
CI says this test is failing
Not sure how that's related to this code or how to fix it tests pass locally |
I think this is due to newly cut 9.7 branch so we probably just need to wait a bit more. I see all the auto testing is complaining too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pursuing this! LGTM
@jbellis if you are ready, I can merge and backport this change. |
Ready. Thanks! |
This changes HnswGraphBuilder to re-use the same candidates queues for adding nodes by allocating them in the Builder instance. This saves about 2.5% of build time and takes memory allocations of NQ long[] from 25% of total to 0%. JFR runs are attached. The difference from the first attempt (which actually made things slower for some graphs) is that it preserves the original code's behavior of using a 1-sized queue for the search in the levels above where the node actually gets added. * Re-use NeighborQueue during build's search * improve javadoc for OnHeapHnswGraphSearcher * assert that results parameter is minheap as expected * update CHANGES
This changes HnswGraphBuilder to re-use the same candidates queues for adding nodes by allocating them in the Builder instance.
This saves about 2.5% of build time and takes memory allocations of NQ long[] from 25% of total to 0%. JFR runs are attached.
The difference from the first attempt (which actually made things slower for some graphs) is that it preserves the original code's behavior of using a 1-sized queue for the search in the levels above where the node actually gets added.
main.jfr.gz
nq2.jfr.gz