
Fix splitting shards with large purge sequences #3739

Merged
merged 1 commit into 3.x from fix-shard-splitting-with-lots-of-purges on Sep 10, 2021

Conversation

@nickva (Contributor) commented Sep 10, 2021

Previously, if the source db purge sequence was greater than `purge_infos_limit`, shard splitting would crash with the `{{invalid_start_purge_seq,0}, [{couch_bt_engine,fold_purge_infos,5...` error. That was because purge sequences were always copied starting from 0, which only works as long as the total number of purges stays below the `purge_infos_limit` threshold. In order to correctly gather the purge sequences, the start sequence must be based on the actual oldest purge sequence currently available.

An example of how it should be done is in the `mem3_rpc` module, when loading purge infos, so here we do exactly the same. The `MinSeq - 1` logic is also evident from inspecting the `fold_purge_infos` function.
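
A minimal sketch of that start-sequence calculation, assuming the `couch_db` purge API (`get_oldest_purge_seq/1` and `fold_purge_infos/5`); the module, function, and accumulator names are placeholders rather than the exact code in this PR:

```erlang
-module(purge_copy_sketch).
-export([copy_purge_infos/2]).

%% Sketch only: fold purge infos starting from the oldest retained purge
%% sequence instead of 0. The fold is exclusive of its start seq, hence
%% the MinSeq - 1, clamped so it never goes below 0.
copy_purge_infos(SourceDb, Acc0) ->
    OldestPurgeSeq = couch_db:get_oldest_purge_seq(SourceDb),
    StartSeq = max(0, OldestPurgeSeq - 1),
    %% The real copier writes each purge info into the target shards; this
    %% placeholder callback just accumulates them.
    CopyFun = fun(PurgeInfo, Acc) -> {ok, [PurgeInfo | Acc]} end,
    couch_db:fold_purge_infos(SourceDb, StartSeq, CopyFun, Acc0, []).
```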

The test sets up the exact scenario described above: it reduces the purge info limit to 10, then purges 20 documents. By purging more than the limit, we ensure the starting sequence is no longer 0. However, the purge sequence btree is only trimmed down during compaction, which is why there are a few extra helper functions to ensure compaction runs and finishes before shard splitting starts.
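
As a rough outline of that scenario (purely illustrative; every helper below is a hypothetical stand-in, not one of the actual test helpers added in this PR):

```erlang
%% Illustrative outline only; all helpers here are hypothetical stand-ins.
purges_above_limit_then_split(DbName) ->
    ok = set_purge_infos_limit(DbName, 10),
    %% Purge more documents than the limit so the oldest retained purge
    %% sequence ends up greater than 0.
    lists:foreach(fun(DocId) -> purge_doc(DbName, DocId) end, doc_ids(1, 20)),
    %% The purge seq btree is only trimmed during compaction, so run it and
    %% wait for it to finish before splitting.
    ok = compact_and_wait(DbName),
    %% Shard splitting should now start its purge-info fold from the oldest
    %% available sequence instead of 0 and succeed.
    ok = split_shard(DbName).
```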

Fixes: #3738

@nickva force-pushed the fix-shard-splitting-with-lots-of-purges branch 4 times, most recently from f51a949 to ce2293d on September 10, 2021 06:24
@kocolosk (Member) commented:

Code looks great to me. I saw you mention a shard splitting PR on IRC and figured I'd need an extra cup of coffee this morning, then saw a two-line change with detailed tests 🎉

A couple of small typos in the commit message: "purge sequences were always copied", "In order to correctly gather and (split?) the purge sequences". Nice work! 👍

Previously, if the source db purge sequence > `purge_infos_limit`, shard
splitting would crash with the `{{invalid_start_purge_seq,0},
[{couch_bt_engine,fold_purge_infos,5...` error. That was because purge
sequences were always copied starting from 0. That would only work as long as
the total number of purges stayed below the `purge_infos_limit` threshold. In
order to correctly gather the purge sequences, the start sequence must be
based on the actual oldest purge sequence currently available.

An example of how it should be done is in the `mem3_rpc` module, when loading
purge infos [0], so here we do exactly the same. The `MinSeq - 1` logic is also
evident from inspecting the `fold_purge_infos` [1] function.

The test sets up the exact scenario described above: it reduces the purge info
limit to 10, then purges 20 documents. By purging more than the limit, we ensure
the starting sequence is now != 0. However, the purge sequence btree is only
trimmed down during compaction. That is why there are a few extra helper
functions to ensure compaction runs and finishes before shard splitting starts.

Fixes: #3738

[0] https://github.com/apache/couchdb/blob/4ea9f1ea1a2078162d0e281948b56469228af3f7/src/mem3/src/mem3_rpc.erl#L206-L207
[1] https://github.com/apache/couchdb/blob/3.x/src/couch/src/couch_bt_engine.erl#L625-L637

@nickva force-pushed the fix-shard-splitting-with-lots-of-purges branch from ce2293d to 9721046 on September 10, 2021 14:42
@nickva merged commit 9f08191 into 3.x on Sep 10, 2021
@nickva deleted the fix-shard-splitting-with-lots-of-purges branch on September 10, 2021 15:11
@nickva (Contributor, Author) commented Sep 10, 2021

Thank you for taking a look, Adam!
