
Counter-intuitive result: more RAM = slower indexing (standard inverted indexes) #2294

Closed
lintool opened this issue Dec 5, 2023 · 3 comments

Comments

@lintool
Member

lintool commented Dec 5, 2023

I'm currently on: #2275 at 3b8bee7

I've bumped up the default memory buffer size from 4G to 16G, as follows:

config.setRAMBufferSizeMB(args.memoryBuffer);
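For context, here is a minimal sketch of where that knob lives in plain Lucene (this is an illustrative fragment, not Anserini's actual indexing code; the 16 GB value just mirrors the run above):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

// Illustrative setup: raise the in-memory buffer so Lucene flushes
// larger (and therefore fewer) segments to disk.
IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
config.setRAMBufferSizeMB(16 * 1024); // e.g. 16 GB
// Flush based on RAM usage only, not on a document count.
config.setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH);
```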

But I've discovered that more RAM actually slows indexing. Here are some runs with the SPLADE++ ED model on MS MARCO v2, sweeping the buffer size over {2G, 4G, 8G, 16G, 32G}:

logs/log.msmarco-v2-passage-splade-pp-ed.02gb.1:2023-12-03 14:21:47,515 INFO  [main] index.AbstractIndexer (AbstractIndexer.java:315) - Total 138,364,198 documents indexed in 02:33:25
logs/log.msmarco-v2-passage-splade-pp-ed.04gb.1:2023-12-03 16:56:44,825 INFO  [main] index.AbstractIndexer (AbstractIndexer.java:315) - Total 138,364,198 documents indexed in 02:34:55
logs/log.msmarco-v2-passage-splade-pp-ed.08gb.1:2023-12-03 20:01:08,410 INFO  [main] index.AbstractIndexer (AbstractIndexer.java:315) - Total 138,364,198 documents indexed in 03:04:20
logs/log.msmarco-v2-passage-splade-pp-ed.16gb.1:2023-12-03 23:34:57,443 INFO  [main] index.AbstractIndexer (AbstractIndexer.java:315) - Total 138,364,198 documents indexed in 03:33:45
logs/log.msmarco-v2-passage-splade-pp-ed.32gb.1:2023-12-04 02:54:37,891 INFO  [main] index.AbstractIndexer (AbstractIndexer.java:315) - Total 138,364,198 documents indexed in 03:19:36

It seems like more RAM actually slows indexing... is this expected behavior? (This is with spinning disks; on SSDs, the same pattern persists, although not as pronounced.)

@jpountz @benwtrent @ChrisHegarty @tteofili any ideas here?

@jpountz
Contributor

jpountz commented Dec 5, 2023

My guess is that the smaller buffer isn't actually faster overall: it just takes a bit of work off the indexing threads and adds more work to merging, which is running asynchronously in its own threads.

Indexing boils down to updating large hash tables (inverted indexes) or graphs (HNSW). The bigger they get, the slower the updates become, because you get more cache misses, etc. So flushing N segments of size S is more costly than flushing 2N segments of size S/2. But in turn, the smaller segments add more work for merging. In your case, I'm assuming that you are not maxing out your CPU, so merging can take all the CPU it wants and indexing appears to be faster. But if you were trying to max out indexing, so that indexing and merging were competing for the same resources, then you would see a slowdown when decreasing the RAM buffer. Likewise if you told Lucene to run merging in the indexing threads rather than in their own threads (SerialMergeScheduler instead of ConcurrentMergeScheduler).
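To illustrate that last point, Lucene's merge scheduler is configurable on IndexWriterConfig (a sketch for reference, not Anserini's actual configuration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SerialMergeScheduler;

IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());

// Default: merges run in their own background threads, so they can
// overlap with indexing whenever spare CPU is available.
config.setMergeScheduler(new ConcurrentMergeScheduler());

// Alternative: merges run inline on the indexing threads. With this
// scheduler, the extra merge work caused by a small RAM buffer would
// show up directly as slower indexing.
// config.setMergeScheduler(new SerialMergeScheduler());
```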

@lintool
Member Author

lintool commented Dec 5, 2023

Ah, makes sense! I am using ConcurrentMergeScheduler.

Also, I guess that merging is (typically) disk-throughput bound... and quite efficient, since merging sorted lists is a linear-time operation.
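As a toy illustration of that linear-time merge (plain Java, not Lucene's actual postings-merge code): two sorted docid lists combine in a single O(n + m) pass with one pointer per list, which is why segment merging is mostly sequential I/O.

```java
import java.util.Arrays;

public class PostingsMerge {
    // Merge two sorted docid arrays into one sorted array in O(n + m).
    public static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        // Copy whichever list still has elements remaining.
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[]{1, 4, 7}, new int[]{2, 5, 9});
        System.out.println(Arrays.toString(merged)); // [1, 2, 4, 5, 7, 9]
    }
}
```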

@jpountz
Contributor

jpountz commented Dec 5, 2023

Right. It's rather efficient, but almost always still more expensive than avoiding some of that merging in the first place, by accumulating bigger flushed segments via a bigger RAM buffer.

@lintool lintool closed this as completed Dec 16, 2023