
LUCENE-10054 Make HnswGraph hierarchical #416

Closed
wants to merge 17 commits into from

Conversation

mayya-sharipova
Contributor

@mayya-sharipova mayya-sharipova commented Oct 27, 2021

Currently HNSW has only a single layer; this patch makes it multi-layered.

This PR is based on the feature branch that already went through several PRs' approval: #250, #267, #287, #315, #536.

To keep changes small, this PR only adds multiple layers in the HnswGraph class. It handles hierarchy in graph construction and search, but only in memory.

Changes:

- HnswGraphBuilder has an extra parameter: ml, the normalization factor for
  level generation. When ml = 0, the graph has only a single layer (SNW);
  when ml > 0, the graph has multiple layers.
  - The ml value recommended in the 2018 HNSW paper is
    ml = 1 / Math.log(1.0 * maxConn), which is used for tests in the
    TestHnswGraph class.
  - When ml = 0, we use the previous code in the method buildSNW.
  - When ml > 0, the method buildHNSW is used, which, following the paper,
    for every new node generates a random level and then places the node,
    with its connections, on the levels up to that level.
- HnswGraph's search method has also been modified to handle two cases:
  - When ml = 0, we use the previous code for a flat graph: generate
    boundedNumSeed random entry points and use them to search the flat
    graph.
  - When ml > 0, we search the hierarchical graph: from the max level down
    to level 1 with topK=1, and on level 0 with topK=boundedNumSeed.
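The level generation described above can be sketched as follows. This is an illustrative sketch of the paper's formula, not the actual code in this PR; the class and method names are made up.

```java
// Illustrative sketch of random level assignment for a new HNSW node
// (names are made up, not the actual code in this PR).
import java.util.SplittableRandom;

class LevelGenerator {
    private final double ml; // normalization factor, e.g. 1 / Math.log(1.0 * maxConn)
    private final SplittableRandom random = new SplittableRandom(42);

    LevelGenerator(double ml) {
        this.ml = ml;
    }

    /**
     * Returns a random level for a new node: level 0 is the most likely,
     * and the probability of higher levels decays exponentially.
     */
    int randomLevel() {
        if (ml == 0) {
            return 0; // flat graph: all nodes live on the single layer
        }
        double u = 1.0 - random.nextDouble(); // in (0, 1], avoids log(0)
        return (int) (-Math.log(u) * ml);
    }
}
```

With ml = 1 / ln(maxConn), roughly a 1/maxConn fraction of nodes on a level is promoted to the level above it.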

Work left for the future: handle hierarchy on disk.

Currently we store the neighbourhoods of all nodes on all levels,
even though levels > 0 contain fewer nodes.

This patch fixes this as follows:
- For each level, the graph stores a list of neighbourhoods only for the
  nodes present on that level. Neighbours on all levels are represented
  as level 0 ordinals.
- nodesByLevel defines which nodes are stored on each level, represented
  as level 0 node ordinals.
- To get the neighbours of a given node on a level > 0, we first do a
  binary search on nodesByLevel for that level to find the node's index
  in the graph's list of neighbourhoods for that level. We then use this
  index to retrieve the desired neighbourhood.
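The binary-search lookup described above can be sketched like this; the class and field names are illustrative, not the actual structures in the PR.

```java
// Illustrative sketch of the per-level neighbour lookup (names are made up,
// not the actual structures in this PR).
import java.util.Arrays;

class LevelNeighbours {
    // nodesByLevel[level] holds the level-0 ordinals present on that level, sorted
    private final int[][] nodesByLevel;
    // neighbourhoods[level][index] holds neighbour ordinals, as level-0 ordinals
    private final int[][][] neighbourhoods;

    LevelNeighbours(int[][] nodesByLevel, int[][][] neighbourhoods) {
        this.nodesByLevel = nodesByLevel;
        this.neighbourhoods = neighbourhoods;
    }

    /** Returns the neighbours of {@code node} (a level-0 ordinal) on {@code level}. */
    int[] neighbours(int level, int node) {
        if (level == 0) {
            return neighbourhoods[0][node]; // level 0 stores every node directly
        }
        // levels > 0 store only the nodes present there, so locate the node first
        int index = Arrays.binarySearch(nodesByLevel[level], node);
        if (index < 0) {
            throw new IllegalArgumentException("node " + node + " is not on level " + level);
        }
        return neighbourhoods[level][index];
    }
}
```

The binary search costs O(log n) per lookup on upper levels, traded for not storing a full-size neighbourhood list per level.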
Disk write and read of the hierarchical NSW graph.

Modify the graph files as follows:

- .vem - meta file
- .vec - vector data file
- .vex - vector graph index, index file for the vector graph file
- .vgr - vector graph file, stores graph connections (previously
  called .vex file)

.vem file
+-------------+--------+------------------+------------------+-
| FieldNumber | SimFun | VectorDataOffset | vectorDataLength |
+-------------+--------+------------------+------------------+-

-+------------------+------------------+-----------------+-----------------+-
 | graphIndexOffset | graphIndexLength | graphDataOffset | graphDataLength |
-+------------------+------------------+-----------------+-----------------+-

--+-------------+------------+------+--------+
  | NumOfLevels | Dimensions | Size | DocIds |
--+-------------+------------+------+--------+

- graph offsets are moved to the .vex file.
This keeps the metadata file smaller, and possibly
allows us later to load graph offsets on first use, or to make
them off-heap.

.vec file stays the same

.vex file:
+-------------+--------------------+--------------------+---------------+--
| NumOfLevels | NumOfNodesOnLevel0 | NumOfNodesOnLevel1 | NodesOnLevel1 | ...
+-------------+--------------------+--------------------+---------------+--

-+----------------------+-----------------+-
 | NumOfNodesOnLevelMax | NodesOnLevelMax |
-+----------------------+-----------------+--

--+------------------------------+----+--------------------------------+
  | GraphOffsetsForNodesOnLevel0 | ...| GraphOffsetsForNodesOnLevelMax |
--+------------------------------+----+--------------------------------+

.vgr file:
+------------------------+------------------------+-----+--------------------------+
| Level0NodesConnections | Level1NodesConnections | ....| LevelMaxNodesConnections |
+------------------------+------------------------+-----+--------------------------+
@mayya-sharipova
Contributor Author

mayya-sharipova commented Oct 27, 2021

Benchmarking based on @jtibshirani's setup

baseline: main branch
candidate: this PR

glove-25-angular

|              | baseline recall | baseline QPS | candidate recall | candidate QPS |
|--------------|-----------------|--------------|------------------|---------------|
| n_cands=10   | 0.626           | 10962.821    | 0.631            | 8869.807      |
| n_cands=50   | 0.888           | 4409.952     | 0.889            | 4111.685      |
| n_cands=100  | 0.946           | 2621.846     | 0.947            | 2734.787      |
| n_cands=500  | 0.994           | 661.253      | 0.994            | 686.700       |
| n_cands=800  | 0.997           | 430.172      | 0.997            | 459.356       |
| n_cands=1000 | 0.998           | 342.915      | 0.998            | 355.238       |

glove-200-angular

|              | baseline recall | baseline QPS | candidate recall | candidate QPS |
|--------------|-----------------|--------------|------------------|---------------|
| n_cands=10   | 0.285           | 4843.028     | 0.312            | 5208.453      |
| n_cands=50   | 0.556           | 2119.933     | 0.558            | 2250.213      |
| n_cands=100  | 0.655           | 1399.261     | 0.648            | 1454.996      |
| n_cands=500  | 0.806           | 379.745      | 0.806            | 410.553       |
| n_cands=800  | 0.836           | 252.796      | 0.836            | 276.456       |
| n_cands=1000 | 0.849           | 201.012      | 0.849            | 220.739      |

sift-128-euclidean

|              | baseline recall | baseline QPS | candidate recall | candidate QPS |
|--------------|-----------------|--------------|------------------|---------------|
| n_cands=10   | 0.601           | 6948.736     | 0.607            | 6677.931      |
| n_cands=50   | 0.889           | 3003.781     | 0.892            | 3202.925      |
| n_cands=100  | 0.952           | 1622.276     | 0.953            | 1996.992      |
| n_cands=500  | 0.996           | 444.135      | 0.996            | 540.368       |
| n_cands=800  | 0.998           | 296.835      | 0.998            | 367.316       |
| n_cands=1000 | 0.999           | 245.498      | 0.999            | 311.339       |

As can be seen from the comparison, the hierarchy brings only a slight change: a small increase in recall at the expense of lower QPS.

@mayya-sharipova mayya-sharipova requested review from msokolov, jtibshirani and jpountz and removed request for msokolov October 27, 2021 17:21
@jtibshirani
Member

As can be seen from the comparison, the hierarchy brings only a slight change: a small increase in recall at the expense of lower QPS

It looks like QPS is sometimes worse, but often better (like in all the sift-128-euclidean runs, num_cands >=50). I wonder if the first runs are affected by a lack of warm-up? My original set-up you linked to didn't include a warm-up, and in LUCENE-9937 we found that this can have a big impact on the first runs.

@mayya-sharipova
Contributor Author

mayya-sharipova commented Oct 27, 2021

@jtibshirani Thanks for the comment.

I wonder if the first runs are affected by a lack of warm-up?

In my benchmarks I had a warm-up stage as well, starting with bogus query args in the ann benchmarking algorithm.

@@ -205,6 +215,43 @@ private FieldEntry readField(DataInput input) throws IOException {
return new FieldEntry(input, similarityFunction);
}

private void fillGraphNodesAndOffsetsByLevel() throws IOException {
for (FieldEntry entry : fields.values()) {
IndexInput input =
Member

If I understand correctly, the graph index file is only used to populate these data structures on FieldEntry (the graph level information). It feels a bit surprising that FieldEntry is constructed across two different files (both the metadata and graph index files). It also means FieldEntry isn't immutable.

Hmm, I'm not sure what typically belongs in a metadata file. I assume it doesn't make sense to move the graph index information there, alongside the ord -> doc mapping. Maybe we could keep the file as-is but create a new class to hold the graph index, something like GraphLevels?

Contributor Author

Yes, indeed we populate FieldEntry from the graph index (levels) file.

It feels a bit surprising that FieldEntry is constructed across two different files

Indeed, I was not happy with this either. Alternatively, we can put all this information about levels and layers into a single meta file, as is currently done. Happy to hear other suggestions as well.

It also means FieldEntry isn't immutable

No, FieldEntry is immutable, as all its fields are final, they are initialized and filled in the Lucene90HnswVectorsReader constructor.

In #315 I was trying to fill graph data lazily on first use, but it looks like we don't like lazy loading.

Maybe we could keep the file as-is but create a new class to hold the graph index, something like GraphLevels?

Would GraphLevels replace FieldEntry? I like the approach of a single FieldEntry for every field, as it follows how other field types are organized.

Contributor Author

@mayya-sharipova mayya-sharipova Nov 22, 2021

@jpountz I wonder if you can suggest a better way to organize the vector files?

For context: currently, .vem (vector metadata) can be quite big, as it contains graph offsets (for 1M docs: 1M * 8 bytes = 8 MB; for 10M docs: 80 MB). This PR tries to store graph offsets in a separate file, but still loads them during initialization of FieldEntry. The problem then, as @jtibshirani noticed, is that FieldEntry data is constructed from two files: the metadata file and the file that stores graph offsets. Maybe it is worth keeping the original format and storing all this data in the metadata file?

Contributor Author

@jtibshirani

It feels a bit surprising that FieldEntry is constructed across two different files (both the metadata and graph index files)

As an alternative, to build FieldEntry from a single meta file only, I've created a PR that keeps the current 3 files (.vec, .vex, and .vem), where the .vem file stores all the levels' nodes; but instead of storing neighbours' offsets we calculate them, which keeps the .vem metadata file smaller. What do you think of this approach?

Member

I like this idea. How would you like to proceed -- work on that PR first (since it seems useful on its own), or move forward with this one and follow-up with a fix soon after?

To clarify, I was not thinking that GraphLevels would replace FieldEntry. It would be a second data structure.

Contributor Author

How would you like to proceed -- work on that PR first (since it seems useful on its own), or move forward with this one and follow-up with a fix soon after?

I was under the impression that we are not happy with the current state of this PR and would not want to merge it without some changes. No?

To clarify, I was not thinking that GraphLevels would replace FieldEntry. It would be a second data structure.

Can you please elaborate on how you see it organized? How is GraphLevels connected to a FieldEntry? Do you suggest putting GraphLevels into a separate file and loading it on first use?

Member

The issue is proving tricky to resolve and personally I don't feel like it needs to block merging. We could look into improving it in a follow-up PR.

*
* @param query search query vector
* @param topK the number of nodes to be returned
* @param numSeed the size of the queue maintained while searching, and controls the number of
* random entry points to sample
* @param numSeed the size of the queue maintained while searching
Member

Could we remove this parameter numSeed? It seems particularly out-of-place now that we don't sample random entry points.

I think it'd be a nice simplification to only have one parameter topK. I can't think of a case where callers would set numSeed to be something different from topK.

Contributor Author

@jtibshirani Thanks for the comment. I have thought long about this, and I think we should keep it until we have a better strategy.

Currently, we search HNSW graphs through KnnVectorQuery providing k as numCandidates (numSeed or fanout), as this query doesn't support a numCandidates parameter. I think it is not optimal that instead of returning, say, the top 10 docs from every segment (or shard in a distributed search), we return 100 (200, 800, etc.).

Also, the signatures of KnnVectorQuery and HnswGraph::search need to be updated to reflect the fact that k is indeed the number of candidates we are considering.

Maybe we can discuss outside this PR whether and where it makes sense to have numCandidates.

Member

It works for me to have a separate discussion. Maybe at least in this PR we can rename this to numCandidates, since the 'seed' naming no longer makes sense?

As context, I still think it makes sense to remove the numCandidates vs. k distinction in HnswGraph. The public signature KnnVectorsReader#search does not include a notion of "num candidates", so users have no way to even use this distinction. I'd be in favor of removing it from HnswGraph, then having a follow-up discussion about whether the vector search APIs should handle numCandidates vs k.

Contributor Author

@jtibshirani Thanks; for now, removing numSeed with a possible re-discussion makes sense to me. Addressed in b421dce

// Typically with diversity criteria we see nodes not fully occupied;
// average fanout seems to be about 1/2 maxConn.
// There is some indexing time penalty for under-allocating, but saves RAM
graph.get(i).add(new NeighborArray(Math.max(32, maxConn / 4)));
Member

I am a little confused why we are careful not to overallocate here, but in addNode we do new NeighborArray(maxConn + 1)? I see you maintained the existing behavior, so it's not critical to address in this PR; I was just curious about it.

Contributor Author

I am curious as well; this was indeed carried over from the existing code.

Contributor

Heh, this makes no sense at all! Git-spelunking, it looks like I added this fancy dancing when adding the diversity criteria, but overlooked addNode, which does the majority of the allocation! It could be worth taking a look to see if we can cheaply save some RAM.

* @param level level for which to get all nodes
* @return an iterator over nodes where {@code nextDoc} returns a next node ordinal
*/
// TODO: return a more suitable iterator over nodes than DocIdSetIterator
Member

hmm, I guess we never addressed this TODO. I don't have a good alternative suggestion, maybe others do.

Contributor

I guess we could make one up

Contributor

Would IntStream be a good fit? Or PrimitiveIterator.OfInt?

@jtibshirani
Member

Got it, it sounds like you already adjusted my set-up to include warm-ups. Overall it looks like a positive performance improvement. I'm in favor of merging this even though the improvement is relatively small -- I think it's good to implement the actual algorithm that we claim to! I also think this sets us up well for future performance improvements, by closely comparing to other HNSW implementations.

One last thing to check regarding performance: does it have an impact on indexing speed?

Reviewing the code with fresh eyes, I found some more parts where I had questions. I know 9.0 feature freeze is coming up really soon, maybe we want to discuss the timing of this PR?

@mayya-sharipova
Contributor Author

mayya-sharipova commented Nov 17, 2021

@jtibshirani Thanks for your extensive review. I've tried to address some comments; some others I think we still need to discuss.

does it have an impact on indexing speed?

I've run another benchmarking experiment with KnnGraphTester, and as can be seen below there seems to be a small impact on indexing speed: a 6% increase in indexing time.

Comparison of glove-100

baseline: main branch
candidate: this PR

|          | baseline | candidate |
|----------|----------|-----------|
| indexing | built 1180000 in 23061/2280885 ms | built 1180000 in 23113/2295566 ms |
| flush    | segment=_1 ramUsed=474.061 MB newFlushedSize=513.284 MB docs/MB=2,305.768 | segment=_0 ramUsed=474.061 MB newFlushedSize=517.153 MB docs/MB=2,288.517 |
|             | baseline recall | baseline QPS | candidate recall | candidate QPS |
|-------------|-----------------|--------------|------------------|---------------|
| n_cands=10  | 0.482           | 4199.190     | 0.496            | 3982.133      |
| n_cands=20  | 0.553           | 3563.974     | 0.561            | 3412.251      |
| n_cands=40  | 0.631           | 2713.881     | 0.635            | 2455.465      |
| n_cands=80  | 0.707           | 1902.787     | 0.709            | 1863.012      |
| n_cands=120 | 0.748           | 1497.979     | 0.747            | 1400.822      |
| n_cands=200 | 0.790           | 1031.353     | 0.791            | 1018.244      |
| n_cands=400 | 0.840           | 577.106      | 0.841            | 600.762       |
| n_cands=600 | 0.865           | 417.961      | 0.865            | 436.488       |
| n_cands=800 | 0.881           | 319.900      | 0.881            | 334.503       |

@jtibshirani
Member

@mayya-sharipova thanks for all your work on this PR. For me, there aren't blockers to merging. I'll do a final review tomorrow (with my morning coffee :)) to make sure I didn't miss anything. What are the next steps -- are you also hoping for a review from Adrien or Mike, who are experienced at designing file formats? Any open questions for them?

@mayya-sharipova
Contributor Author

@jtibshirani Thanks for continuing your review.

My main question to @jpountz and @msokolov is indeed about file formats.

Is it better to keep the current design of 3 files: .vem (metadata), .vec (vectors), and .vex (graph neighbours), and put the extra information about graph levels into the metadata file? This makes the metadata file even bigger, but allows loading all info for FieldEntry from a single meta file.

Or is it better to break the current metadata file into 2 files: 1) .vem, a smaller field description file, and 2) a bigger .vel file with graph levels and neighbours' offsets? This makes the metadata file small, but FieldEntry is loaded from two files.

@msokolov
Contributor

@mayya-sharipova I don't have strong feelings about the file structure; either way seems OK to me. I guess if pressed I would probably choose to keep a smaller number of files since it will reduce the amount of managing/opening/closing files, and the directory will be less cluttered (most of the formats have one or two files?). I don't see any particular downside to having different kinds of information in a file, although then I guess the file parsing code can be a little more complicated, so it's a tradeoff without a clear deciding factor for me?

Currently we store, for each node, an offset in the graph
neighbours file from where to read this node's neighbours.
We also load these offsets onto the heap.

This patch, instead of storing offsets, calculates them when needed.
This saves both heap space and disk space for offsets.

To make offsets calculable:
1) we write neighbours as Int instead of the current VInt, to
have predictable offsets (some extra space here)
2) for each node we allocate ((maxConn + 1) * 4) bytes for
storing the node's neighbours, where "maxConn" is the maximum number
of connections the node can have. If a node has fewer than maxConn
neighbours, we add extra padding to fill the leftover space (some extra
space here). In big graphs most nodes have "maxConn" neighbours, so there
should not be much wasted space.
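Under this fixed-size layout, a node's offset becomes a simple multiplication. A minimal sketch, assuming one length int plus maxConn neighbour ints per node; the class name is made up, not the actual Lucene code.

```java
// Illustrative sketch of computing a node's neighbour-list offset under the
// fixed-size layout described above (names made up, not the actual Lucene code).
class GraphOffsets {
    private final long graphDataOffset; // start of the graph connections data
    private final int maxConn;          // max connections per node

    GraphOffsets(long graphDataOffset, int maxConn) {
        this.graphDataOffset = graphDataOffset;
        this.maxConn = maxConn;
    }

    /**
     * Each node occupies (maxConn + 1) ints: one int for the neighbour count,
     * followed by up to maxConn neighbour ordinals (padded if fewer), so no
     * per-node offset needs to be stored or loaded on the heap.
     */
    long offsetFor(int node) {
        return graphDataOffset + (long) node * (maxConn + 1) * Integer.BYTES;
    }
}
```

The (long) cast matters: for large graphs, node * (maxConn + 1) * 4 can overflow an int.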
@jpountz
Contributor

jpountz commented Jan 11, 2022

Sorry @mayya-sharipova I had missed your ping. I realize we don't have good guidelines on writing file formats so I started something at #598. I'll have a quick look at this PR now.

Contributor

@jpountz jpountz left a comment

The way files are organized makes sense to me.

* @param level level for which to get all nodes
* @return an iterator over nodes where {@code nextDoc} returns a next node ordinal
*/
// TODO: return a more suitable iterator over nodes than DocIdSetIterator
Contributor

Would IntStream be a good fit? Or PrimitiveIterator.OfInt?

Member

@jtibshirani jtibshirani left a comment

This looks great! I left some last tiny comments, but it looks ready to me. As we discussed on #536, before merging we plan to move these changes to a new format Lucene91HnswVectorsFormat to preserve backwards compatibility.

latch.await();
IndexSearcher searcher = manager.acquire();
try {
KnnVectorQuery query = new KnnVectorQuery("vector", new float[] {0f, 0.1f}, 5);
Member

Tiny comment, should we use assertGraphSearch or doKnnSearch here like we do elsewhere in the test?

Contributor Author

assertGraphSearch expects an IndexReader and uses the reader for search, while in this test we have a SearcherManager that uses an IndexSearcher for search; but the assertion of results is copied from assertGraphSearch.

The test verifies that there is no problem with accessing a graph from multiple threads at the same time. We can move this test to TestKnnVectorQuery if you think it will fit there better.

mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this pull request Jan 17, 2022
Currently HNSW has only a single layer.
This patch makes HNSW graph multi-layered.

This PR is based on the following PRs:
 apache#250, apache#267, apache#287, apache#315, apache#536, apache#416

Main changes:
- Multi layers are introduced into HnswGraph and HnswGraphBuilder
- A new Lucene91HnswVectorsFormat with new Lucene91HnswVectorsReader
and Lucene91HnswVectorsWriter are introduced to encode graph
layers' information
- Lucene90Codec, Lucene90HnswVectorsFormat, and the reading logic of
Lucene90HnswVectorsReader and Lucene90HnswGraph are moved to
backward_codecs to support reading and searching of graphs built
in pre 9.1 version. Lucene90HnswVectorsWriter is deleted.
- For backwards compatible tests, previous Lucene90 graph reading and
writing logic was copied into test files of
Lucene90RWHnswVectorsFormat, Lucene90HnswVectorsWriter,
Lucene90HnswGraphBuilder and Lucene90HnswRWGraph.
mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this pull request Jan 17, 2022
TODO: tests for KNN search for graphs built in pre 9.1 version;
tests for merge of indices of pre 9.1 + current versions.
@mayya-sharipova
Contributor Author

Thanks everyone for the review. I am closing this PR in favour of #608 that copies these changes into a new Lucene91Codec and Lucene91HnswVectorsFormat.

mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this pull request Jan 17, 2022
mayya-sharipova added a commit that referenced this pull request Jan 25, 2022
mayya-sharipova added a commit to mayya-sharipova/lucene that referenced this pull request Jan 27, 2022
mayya-sharipova added a commit that referenced this pull request Jan 28, 2022