Reuse neighborqueue during hnsw index build (attempt 2) #12372

jbellis · 2023-06-14T19:27:17Z

This changes HnswGraphBuilder to re-use the same candidates queues for adding nodes by allocating them in the Builder instance.

This saves about 2.5% of build time and takes memory allocations of NeighborQueue (NQ) long[] from 25% of total to 0%. JFR runs are attached.

The difference from the first attempt (which actually made things slower for some graphs) is that it preserves the original code's behavior of using a 1-sized queue for the search in the levels above where the node actually gets added.
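
For readers who want the shape of the idea without opening the diff, here is a minimal sketch of builder-level queue reuse. It is not the code in this PR: `ScratchQueue`, `BuilderSketch`, `upperLevelQueue`, `insertLevelQueue`, and `totalAllocations` are hypothetical names, the queue is a plain bounded buffer rather than a real heap, and the "search" is a placeholder that only fills the queue. It assumes two scratch queues owned by the builder: a 1-sized one for the levels above where the node is added, and a beam-width one for the levels where it is added.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a candidates queue backed by a long[] (not Lucene's NeighborQueue). */
final class ScratchQueue {
  private long[] buffer;
  private int size;
  private int allocations; // number of backing arrays allocated so far

  ScratchQueue(int capacity) {
    buffer = new long[Math.max(1, capacity)];
    allocations = 1;
  }

  void add(long encoded) {
    if (size == buffer.length) { // grow only if the initial sizing was too small
      buffer = Arrays.copyOf(buffer, buffer.length * 2);
      allocations++;
    }
    buffer[size++] = encoded;
  }

  /** Forgets the contents but keeps the backing array so the next search can reuse it. */
  void clear() {
    size = 0;
  }

  int size() {
    return size;
  }

  int allocations() {
    return allocations;
  }
}

/** Hypothetical builder that owns its scratch queues instead of allocating them per search. */
final class BuilderSketch {
  private final int beamWidth;
  private final ScratchQueue upperLevelQueue = new ScratchQueue(1); // levels above the insert level
  private final ScratchQueue insertLevelQueue;                      // levels where the node is added

  BuilderSketch(int beamWidth) {
    this.beamWidth = beamWidth;
    this.insertLevelQueue = new ScratchQueue(beamWidth);
  }

  void addNode(int node, int nodeLevel, int maxLevel) {
    // Above the node's level only the single best entry point is needed: 1-sized queue.
    for (int level = maxLevel; level > nodeLevel; level--) {
      upperLevelQueue.clear();
      upperLevelQueue.add(node); // placeholder for a greedy search on this level
    }
    // On the node's own levels, collect up to beamWidth candidates in the reused queue.
    for (int level = Math.min(nodeLevel, maxLevel); level >= 0; level--) {
      insertLevelQueue.clear();
      for (int i = 0; i < beamWidth; i++) {
        insertLevelQueue.add(((long) level << 32) | i); // placeholder candidates
      }
    }
  }

  int totalAllocations() {
    return upperLevelQueue.allocations() + insertLevelQueue.allocations();
  }

  public static void main(String[] args) {
    BuilderSketch builder = new BuilderSketch(4);
    for (int node = 0; node < 1000; node++) {
      builder.addNode(node, node % 3, 2);
    }
    System.out.println("backing arrays allocated: " + builder.totalAllocations());
  }
}
```

In this sketch the number of backing arrays stays at two no matter how many nodes are added, because `clear()` resets the size without dropping the array. Keeping the upper-level queue at size 1 mirrors the behavior described above; the first attempt's slowdown is not reproduced here.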

main.jfr.gz
nq2.jfr.gz

jbellis · 2023-06-14T19:29:35Z

Additionally, the original change only re-used the candidates queues within a single addNode call, so this is improved in that respect as well.

jbellis · 2023-06-16T12:51:15Z

cc @msokolov @benwtrent @zhaih

benwtrent · 2023-06-16T13:02:34Z

Hey @jbellis, the change looks nice to me. But I ran knnPerfTest from https://github.com/mikemccand/luceneutil and saw no change at all in indexing time.

Am I missing something? Could you provide the benchmark you ran to track the 2.5% improvement?

jbellis · 2023-06-16T14:18:53Z

I'm using the million-row sift dataset via this harness https://github.com/jbellis/hnswdemo/tree/benchmarking

I believe what is happening is that allocation is basically free and there is enough slack across the JVM in these synthetic benchmarks to soak up the extra GC required -- I do not see any changes when run normally, either. So I forced it to a single core with

$ taskset -c 0-1 ./gradlew runTexmex -PsiftName=sift

(It is a hyperthreaded core, so I give it logical cores 0-1.)

Also if you take a look at the jfr files it is very clear that a significant amount of allocation is gone.

benwtrent

Change seems sane to me. Thanks for the optimizations!

lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java

jbellis · 2023-06-16T16:08:20Z

CI says this test is failing

   >     java.lang.AssertionError: Missing backcompat test files:
   >       9.6.0-cfs
   >       9.7.0-cfs
   >         at __randomizedtesting.SeedInfo.seed([6833F9FB78703710:78EC16BBCCE7213C]:0)
   >         at junit@4.13.1/org.junit.Assert.fail(Assert.java:89)
   >         at org.apache.lucene.backward_index.TestBackwardsCompatibility.testAllVersionsTested(TestBackwardsCompatibility.java:818)

Not sure how that's related to this code or how to fix it

Tests pass locally.

zhaih · 2023-06-16T16:44:09Z

> Not sure how that's related to this code or how to fix it

I think this is due to the newly cut 9.7 branch, so we probably just need to wait a bit more. I see all the automated tests are complaining too.

zhaih

Thank you for pursuing this! LGTM

benwtrent · 2023-06-20T16:28:44Z

@jbellis if you are ready, I can merge and backport this change.

jbellis · 2023-06-20T16:43:13Z

Ready. Thanks!

jbellis added 2 commits June 14, 2023 14:02

Re-use NeighborQueue during build's search

4372f53

improve javadoc for OnHeapHnswGraphSearcher

73e5af4

alessandrobenedetti added the vector-based-search label Jun 15, 2023

benwtrent approved these changes Jun 16, 2023

assert that results parameter is minheap as expected

567177c

zhaih approved these changes Jun 16, 2023

jbellis added 2 commits June 20, 2023 07:14

Merge branch 'main' into hnsw-nq-2

762d488

update CHANGES

fdd7233

benwtrent merged commit fe0278e into apache:main Jun 20, 2023
4 checks passed

zhaih added this to the 9.8.0 milestone Sep 20, 2023
