Add support for recursive graph bisection. #12489
Conversation
Recursive graph bisection is an extremely effective algorithm to reorder doc IDs in a way that improves both storage and query efficiency by clustering similar documents together. It usually performs better than other techniques that try to achieve a similar goal such as sorting the index in natural order (e.g. by URL) or by a min-hash, though it comes at a higher index-time cost. The [original paper](https://arxiv.org/pdf/1602.08820.pdf) is good but I found this [follow-up paper](http://engineering.nyu.edu/~suel/papers/bp-ecir19.pdf) to describe the algorithm in more practical ways.
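For intuition, the move-gain computation at the heart of each bisection step can be sketched in plain Java. This is a toy model using the papers' `deg * log2(n / (deg + 1))` cost estimate for a term's posting-list gaps; the class, method names, and data layout are illustrative, not the patch's actual code:

```java
import java.util.HashMap;
import java.util.Map;

public class BisectionGain {
  // Cost estimate for a term occurring in `deg` docs of a partition of size `n`:
  // deg * log2(n / (deg + 1)), the gap-cost model from the BP papers.
  static double cost(int n, int deg) {
    if (deg <= 0) {
      return 0;
    }
    return deg * (Math.log((double) n / (deg + 1)) / Math.log(2));
  }

  // Gain of moving one document (its terms given as term IDs) from the left
  // partition to the right one: positive when the move shrinks the estimated
  // total cost of encoding doc-ID gaps.
  static double moveGain(int[] docTerms, int nLeft, int nRight,
                         Map<Integer, Integer> degLeft, Map<Integer, Integer> degRight) {
    double gain = 0;
    for (int t : docTerms) {
      int dl = degLeft.getOrDefault(t, 0);
      int dr = degRight.getOrDefault(t, 0);
      double before = cost(nLeft, dl) + cost(nRight, dr);
      double after = cost(nLeft, dl - 1) + cost(nRight, dr + 1);
      gain += before - after;
    }
    return gain;
  }

  public static void main(String[] args) {
    // A doc on the left whose only term is already dense on the right:
    // moving it toward the dense side should have positive gain.
    Map<Integer, Integer> degLeft = new HashMap<>(Map.of(7, 1));
    Map<Integer, Integer> degRight = new HashMap<>(Map.of(7, 90));
    double g = moveGain(new int[] {7}, 100, 100, degLeft, degRight);
    System.out.println(g > 0); // prints "true"
  }
}
```

In the real algorithm this gain is computed for every document on both sides, documents are sorted by gain, and the best pairs are swapped until convergence.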
I'm opening this draft in case someone wants to take a look. I only checked the output on very small indices for now. I also ran it on larger indices, such as a 1.8M-docs wikimedium10m segment, to see how long it takes (4 minutes on my 24-core machine), but I haven't checked whether the result makes sense yet. It's probably full of bugs!
I think it's starting to look better now. I worked on some inefficiencies and applied some of the optimizations suggested by Mackenzie et al. in "Tradeoff Options for Bipartite Graph Partitioning":
With the suggested defaults of minDocFreq=4,096 and minPartitionSize=32, I'm getting the following performance numbers on wikimedium10m (10M docs):
Then comparing query performance, I'm getting interesting results. I had to disable verification of scores and counts because of the reordering. A quick manual check suggests that results are valid. I can guess why some queries like conjunctions are faster, but I'm not sure for
I ran the benchmark multiple times to see if the slowdown on
First, the reordering works pretty well, as there are 11 contiguous ranges of 100k doc IDs that don't have a single occurrence of But I suspect that it is also the source of the slowdown with the disjunction:
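A quick way to quantify this kind of clustering is to count how many fixed-size doc-ID ranges contain no occurrence of a term at all. A self-contained sketch of such a diagnostic (class and method names are hypothetical, not the benchmark's code):

```java
import java.util.HashSet;
import java.util.Set;

public class EmptyRanges {
  // Count how many aligned ranges of `rangeSize` doc IDs in [0, maxDoc)
  // contain no doc from the (sorted) posting list of a term.
  static int emptyRanges(int[] postings, int maxDoc, int rangeSize) {
    Set<Integer> occupied = new HashSet<>();
    for (int doc : postings) {
      occupied.add(doc / rangeSize);
    }
    int numRanges = (maxDoc + rangeSize - 1) / rangeSize;
    return numRanges - occupied.size();
  }

  public static void main(String[] args) {
    // A well-clustered term: all 1,000 occurrences packed into docs 200k..300k.
    int[] clustered = new int[1000];
    for (int i = 0; i < 1000; i++) {
      clustered[i] = 200_000 + i * 100;
    }
    System.out.println(emptyRanges(clustered, 1_000_000, 100_000)); // prints "9"
  }
}
```

Long empty ranges are what make impact-style skipping cheap for conjunctions, while a disjunction still has to visit the dense ranges of every clause.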
I opened #12526 for a potential solution to this problem.
@jpountz did you measure any change to index size with the reordered docids?
I did. My wikimedium file is sorted by title, which already gives some compression compared to random ordering. Disappointingly, recursive graph bisection only improved compression of postings (doc) by 1.5%. It significantly hurts stored fields though, I suspect it's because the
This made me doubt at first whether the algorithm was correctly implemented, but the query speedups and postings distributions suggest it is not completely wrong. I should run on wikibigall too.
Wikibigall. Less space is spent on doc values this time since I did not enable indexing of facets. There is a more significant size reduction of postings this time (-10.5%). This is not misaligned with the reproducibility paper, which observed size reductions of 18% with partitioned Elias-Fano and 5% with SVByte on the Wikipedia dataset. I would expect PFor to be somewhere in between, as it's better able to take advantage of small gaps between docs than SVByte, but less so than partitioned Elias-Fano.
Benchmarks still show slowdowns on phrase queries and speedups on conjunctions, though it's less spectacular than on wikimedium10m.
Thanks @jpountz -- these are fascinating results! I wonder why stored fields index size wasn't hurt nearly as much for It's interesting that
This is because wikimedium uses chunks of articles as documents, and every chunk has the title of the Wikipedia article, so there are often ten or more adjacent docs that have the same title. This is a best case for stored fields compression, as only the first title is actually stored and other occurrences of the same title are replaced with a reference to the first occurrence. With reordering, these duplicate titles are no longer in the same block, so it goes back to just deduplicating bits of title strings instead of entire titles. wikibig doesn't have this best-case scenario for stored fields compression. Ordering only helps a bit because articles are in title order, so there are more duplicate strings in a block of stored fields (shared prefixes) compared to the reordered index.
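This effect is easy to reproduce with any LZ-style compressor: when documents are compressed in fixed-size blocks, a duplicate title costs almost nothing if it lands in the same block as its first occurrence. A self-contained sketch using `java.util.zip.Deflater` as a stand-in for Lucene's stored-fields compression (the titles, block size, and class names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.Deflater;

public class StoredFieldsDedup {
  // Size of `s` after DEFLATE compression.
  static int deflatedSize(String s) {
    Deflater d = new Deflater();
    byte[] in = s.getBytes();
    d.setInput(in);
    d.finish();
    byte[] buf = new byte[in.length + 64];
    int n = 0;
    while (!d.finished()) {
      n += d.deflate(buf);
    }
    d.end();
    return n;
  }

  // Total size when docs are compressed in independent blocks of `blockSize`,
  // mimicking block-wise stored-fields compression.
  static int totalSize(List<String> docs, int blockSize) {
    int total = 0;
    for (int i = 0; i < docs.size(); i += blockSize) {
      StringBuilder block = new StringBuilder();
      for (int j = i; j < Math.min(i + blockSize, docs.size()); j++) {
        block.append(docs.get(j));
      }
      total += deflatedSize(block.toString());
    }
    return total;
  }

  // Title-sorted order keeps duplicate titles in the same block; a reordering
  // that interleaves articles splits them across blocks.
  static boolean sortedBeatsShuffled() {
    String a = "History of mathematics in medieval Islamic civilization\n";
    String b = "List of sovereign states and dependent territories in Europe\n";
    List<String> sorted = new ArrayList<>(Collections.nCopies(8, a));
    sorted.addAll(Collections.nCopies(8, b));
    List<String> shuffled = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      shuffled.add(a);
      shuffled.add(b);
    }
    return totalSize(sorted, 2) < totalSize(shuffled, 2);
  }

  public static void main(String[] args) {
    System.out.println(sortedBeatsShuffled()); // prints "true"
  }
}
```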
Regarding positions, the reproducibility paper noted that the algorithm helped term frequencies a bit, though not as much as docs. It doesn't say anything about positions, though I suspect that if it tends to group together docs that have the same freq for the same term, then gaps in positions also tend to be more regular.
I just found a bug that in practice only made BP run one iteration per level, fixing it makes performance better (wikibigall):
Space savings are also bigger on postings:
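For reference, the overall shape of the algorithm, and where the per-level iteration count comes in, looks roughly like this. This is a toy skeleton: `swapBestPairs` here is a simple numeric stand-in for the real gain-based swapping, and none of the names come from the patch:

```java
import java.util.ArrayList;
import java.util.List;

public class RecursiveBisection {
  // Split docs in two halves, run up to `maxIters` swap iterations at this
  // level (the bug discussed above effectively capped this at one), then
  // recurse on both halves until partitions reach `minPartitionSize`.
  static List<Integer> reorder(List<Integer> docs, int minPartitionSize, int maxIters) {
    if (docs.size() <= minPartitionSize) {
      return docs;
    }
    int mid = docs.size() / 2;
    List<Integer> left = new ArrayList<>(docs.subList(0, mid));
    List<Integer> right = new ArrayList<>(docs.subList(mid, docs.size()));
    for (int iter = 0; iter < maxIters; iter++) {
      if (!swapBestPairs(left, right)) {
        break; // converged early at this level
      }
    }
    reorder(left, minPartitionSize, maxIters);
    reorder(right, minPartitionSize, maxIters);
    for (int i = 0; i < left.size(); i++) {
      docs.set(i, left.get(i));
    }
    for (int i = 0; i < right.size(); i++) {
      docs.set(mid + i, right.get(i));
    }
    return docs;
  }

  // Toy stand-in for gain-based swapping: move smaller IDs left so that
  // "similar" (numerically close) IDs cluster. The real algorithm swaps the
  // highest-gain pairs instead.
  static boolean swapBestPairs(List<Integer> left, List<Integer> right) {
    boolean moved = false;
    for (int i = 0; i < left.size() && i < right.size(); i++) {
      if (left.get(i) > right.get(i)) {
        Integer tmp = left.get(i);
        left.set(i, right.get(i));
        right.set(i, tmp);
        moved = true;
      }
    }
    return moved;
  }

  public static void main(String[] args) {
    List<Integer> docs = new ArrayList<>(List.of(5, 1, 4, 0, 3, 7, 2, 6));
    // Smaller IDs end up clustered in the first half.
    System.out.println(reorder(docs, 1, 20));
  }
}
```

Running only one iteration per level leaves many profitable swaps on the table, which is consistent with the fix improving both query performance and postings size.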
Since it's fairly unintrusive to other functionality, I felt free to merge.
Using the default ThreadFactory for fork-join pools runs tests without permissions when the security manager is enabled.
See test failures here: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Windows/794/
```java
public void testSingleTermWithForkJoinPool() throws IOException {
    int concurrency = TestUtil.nextInt(random(), 1, 8);
    ForkJoinPool pool = new ForkJoinPool(concurrency);
```
The default implementation of ForkJoinPool executes tasks without any permissions. This causes the test to fail if an FS-based directory implementation is used:
To fix this, use a thread factory that does not remove all permissions.
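One way to do that with plain JDK APIs is to pass a custom `ForkJoinWorkerThreadFactory` whose workers are ordinary `ForkJoinWorkerThread` subclasses; unlike the default factory's "innocuous" permission-less threads, these inherit the creating thread's context. The class and method names below are illustrative, not the test's actual code:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.stream.IntStream;

public class PermissionedPool {
  // A plain worker subclass: it keeps the creating thread's permissions,
  // unlike the default factory's innocuous workers under a security manager.
  static final class Worker extends ForkJoinWorkerThread {
    Worker(ForkJoinPool pool) {
      super(pool);
    }
  }

  // Build a pool that uses the permission-preserving factory.
  public static ForkJoinPool newPool(int parallelism) {
    return new ForkJoinPool(parallelism, Worker::new, null, false);
  }

  public static void main(String[] args) {
    ForkJoinPool pool = newPool(2);
    int sum = pool.submit(() -> IntStream.rangeClosed(1, 10).sum()).join();
    System.out.println(sum); // prints "55"
    pool.shutdown();
  }
}
```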