
remove storage of intermediate checkpoints in restore blocks #2143

Merged
merged 6 commits into master from KtorZ/2035/long-checkpoint-times on Sep 18, 2020

Conversation

KtorZ
Member

@KtorZ KtorZ commented Sep 16, 2020

Issue Number

#2035

Overview

This was perhaps a too-early optimization, trying to reduce the time spent
rolling back. The sparseCheckpoint function returns an empty list for
pretty much the entire restoration, except when reaching the last k
blocks, where it returns a list of checkpoints to save: sparse for
older blocks and dense near the tip. With the current parameters,
blocks are kept if:

  • Their block height is not older than (tip - k), or it is 0
  • And their block height is a multiple of 100, or it is within
    10 blocks of the last known block.
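
For illustration, here is a minimal sketch of this "keep the checkpoint?"
rule. It is not the actual sparseCheckpoint implementation; the function
name, the hard-coded 100-block gap and 10-block edge, and the Word32
heights are assumptions made for the example:

import Data.Word (Word32)

-- Hypothetical predicate mirroring the two bullets above.
keepCheckpoint
    :: Word32  -- ^ k, the security parameter
    -> Word32  -- ^ tip, height of the last known block
    -> Word32  -- ^ h, height of the candidate checkpoint
    -> Bool
keepCheckpoint k tip h = notTooOld && (onGrid || nearTip)
  where
    notTooOld = h == 0 || h + k >= tip   -- not older than (tip - k), or genesis
    onGrid    = h `mod` 100 == 0         -- sparse: one checkpoint every 100 blocks
    nearTip   = h + 10 >= tip            -- dense: within 10 blocks of the tip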

We currently do this calculation and filtering in two places:

  1. In restoreBlocks, to pre-filter checkpoints to store in the database
  2. In prune from the wallet DBLayer, to garbage collect old checkpoints

Yet, what (1) buys us is very little gain on standard wallets, at a huge
performance cost on large wallets. Let's analyze the two cases:

A/ Small Wallets

  • The time to create a checkpoint is very small compared to the slot
    length.

  • Restoring blocks is fast (up to 10K blocks per second on empty
    wallets).

Therefore, rolling back 2, 3, or 100 blocks makes pretty much no
difference. Being able to store more checkpoints near the tip adds
very little performance benefit, especially for the first restoration.

B/ Large Wallets

  • The time to create a checkpoint is significant compared to the slot
    length (we've seen up to 4s).

  • Restoring blocks is still quite fast (the time needed to process
    blocks remains small compared to the time needed to read and create
    new checkpoints).

The main problem with large wallets occurs when the wallet is almost
synced and reaches the last 10 blocks of the chain. By trying to store
intermediate checkpoints, not only does the wallet spend roughly 10×
more time in restoreBlocks than normal, but it also keeps the database
lock for that whole duration. Consider a wallet that takes 4s to read
and 4s to create a checkpoint, plus an additional 2s to prune old ones
(these are actual figures from large exchanges): by default, 10s is
spent creating one checkpoint, and at least 40s more creating the
intermediate ones. During this time, between 1 and 3 new checkpoints
have been created, so the wallet already needs to prune out the last
one it spent 12s creating, and must create new checkpoints right away.
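
As a back-of-the-envelope illustration of these figures (the 4s / 4s / 2s
numbers come from the paragraph above; the declarations below are just an
example, not code from this PR):

-- Rough cost model for one restoreBlocks pass near the tip of a large wallet.
readTime, createTime, pruneTime :: Double
readTime   = 4.0   -- seconds to read the current checkpoint
createTime = 4.0   -- seconds to create one checkpoint
pruneTime  = 2.0   -- seconds to prune old checkpoints

-- Baseline: one checkpoint per pass.
baselineCost :: Double
baselineCost = readTime + createTime + pruneTime      -- ~10 s

-- Storing ~10 intermediate checkpoints in the same pass adds at least
-- another 10 × createTime, all while the database lock is held.
withIntermediates :: Double
withIntermediates = baselineCost + 10 * createTime    -- ~50 s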

As a consequence, a lot of other functionalities are made needlessly
slower than they could be, because for the whole duration of the
restoreBlocks function, the wallet is holding the database lock.

Now, what happens if we avoid storing the "intermediate" checkpoints in
restoreBlocks? Blocks near the tip will eventually get stored, but one
by one. So, when we just reach the tip for the first time, we'll only
store the last checkpoint; but then, the next 9 checkpoints created
won't be pruned out. So, the worst that can happen is that the wallet
is asked to roll back right after we've reached the tip, before many
checkpoints have been created. Still, we would have at least two
checkpoints in the past that are at most 2K blocks from the tip
(because we fetch blocks in batches of 1000). So it's important that
the batch size remains smaller than k, so that we can be sure that
there's always one checkpoint in the database.
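
This batch-size argument can be phrased as a small property. The sketch
below is only a simplified model of the reasoning above, not the PR's
actual prop_checkpointsEventuallyEqual; the name, the five-batch model, and
the value ranges are assumptions:

import Data.Word (Word32)
import Test.QuickCheck

-- Model: store genesis plus the last block of each batch, prune everything
-- older than (tip - k) except genesis, and check that a rollback target
-- other than genesis and the tip itself survives whenever batchSize < k.
prop_checkpointSurvivesPruning :: Property
prop_checkpointSurvivesPruning =
    forAll (choose (1, 1000 :: Word32)) $ \batchSize ->
    forAll (choose (1, 5000 :: Word32)) $ \k ->
        let nBatches    = 5
            tip         = nBatches * batchSize
            checkpoints = 0 : [ i * batchSize | i <- [1 .. nBatches] ]
            -- Usable rollback target: neither genesis nor the tip, and it
            -- survives pruning to the last k blocks.
            usable h    = h /= 0 && h /= tip && h + k >= tip
        in  batchSize < k ==> any usable checkpoints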

Comments

@KtorZ KtorZ added the RESOLVING ISSUE Mark a PR as resolving issues, for auto-generated CHANGELOG label Sep 16, 2020
@KtorZ KtorZ requested a review from Anviking September 16, 2020 11:23
@KtorZ KtorZ self-assigned this Sep 16, 2020
Collaborator

@Anviking Anviking left a comment

Lgtm

@KtorZ KtorZ force-pushed the KtorZ/2035/long-checkpoint-times branch from 1403d52 to 9eacf44 Compare September 16, 2020 14:18
@KtorZ KtorZ force-pushed the KtorZ/2035/long-checkpoint-times branch from 9eacf44 to e477982 Compare September 16, 2020 15:00
@Anviking Anviking self-requested a review September 16, 2020 15:14
@KtorZ
Member Author

KtorZ commented Sep 16, 2020

bors r+

iohk-bors bot added a commit that referenced this pull request Sep 16, 2020
2143: remove storage of intermediate checkpoints in restore blocks r=KtorZ a=KtorZ

Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@iohk-bors
Contributor

iohk-bors bot commented Sep 16, 2020

Build failed:

-- Rollback may still occur during this short period, but
-- rolling back from a few hundred blocks is relatively fast
-- anyway.
cfg = defaultSparseCheckpointsConfig { edgeSize = 0 }
Collaborator

👌

lib/core/test/unit/Cardano/Wallet/DB/Properties.hs (outdated, resolved)
Collaborator

@Anviking Anviking left a comment

Now that you have SparseCheckpointsConfig and properties, I think it would be nice to have the preconditions for it to work properly clear and explicit, either through types or through doc comments in the implementation, but not just by restricting generators.

(Mainly cf https://github.com/input-output-hk/cardano-wallet/pull/2143/files#r490008971)

  Also moved the epoch stability to the 'SparseCheckpointsConfiguration' since it's mostly static
@KtorZ KtorZ force-pushed the KtorZ/2035/long-checkpoint-times branch from 6575e53 to aeae316 Compare September 17, 2020 12:17
@KtorZ
Member Author

KtorZ commented Sep 17, 2020

@Anviking ☝️ let me know

  The previous version could lead an unaware developer to tweak parameters in a way that generates an invalid configuration. This new version makes that more difficult / safer.
@Anviking Anviking force-pushed the KtorZ/2035/long-checkpoint-times branch from d6a15b2 to 9b53180 Compare September 18, 2020 10:17
Collaborator

@Anviking Anviking left a comment

This is pretty intricate 😅, but 👌

--
-- So, `k / 3` = 720, which should remain around a second of time needed to catch
-- up in case of large rollbacks.
gapSize :: SparseCheckpointsConfig -> Word32
Collaborator

👍
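
For context, here is one plausible reading of the gapSize helper quoted in
this thread. It is an assumption based on the "k / 3 = 720" comment and on
the SparseCheckpointsConfig fields visible elsewhere in this PR (edgeSize,
epochStability), not the actual implementation:

import Data.Word (Word32)

-- Assumed shape of the configuration (fields taken from the QuickCheck
-- counterexample shown later in this conversation).
data SparseCheckpointsConfig = SparseCheckpointsConfig
    { edgeSize       :: Word32
    , epochStability :: Word32
    }

-- Hypothetical definition: the gap between sparse checkpoints is derived
-- from k, so that k = 2160 yields a gap of 720 blocks.
gapSize :: SparseCheckpointsConfig -> Word32
gapSize cfg = epochStability cfg `div` 3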

prop_checkpointsEventuallyEqual
    :: GenSparseCheckpointsArgs
    -> Property
prop_checkpointsEventuallyEqual args@(GenSparseCheckpointsArgs cfg h) = prop
Collaborator

@Anviking Anviking Sep 18, 2020

So as per DM I tried digging into this property a bit. My original question was whether the scenario you came up with actually is covered by tests:

Imagine the following scenario:

  • We have a checkpoint at [0, 1000]
  • The network layer starts fetching the next blocks, fetches the first 500,
    then rolls back internally 142 blocks, and fetches the next 500. So in
    the end it yields 858 blocks.
  • We store the last one, so we now have [0, 1000, 1858]; so far so good.
  • We continue, and it fetches the next batch, so we are now at
    [0,1000,1858,2858]; pruning occurs and we end up with [0,1000,2858].
  • The next batch is fetched, [0,1000,2858,3858]; pruning occurs and we end
    up with [0,3858].
    And here is the problem: now, if we roll back, we can only roll back to
    genesis. Because of this interruption, there is an offset that makes the
    pruning function discard intermediate checkpoints that would be worth
    keeping.

Or, in some sense: even if we roll back k blocks, there should be a checkpoint close by, such that we don't have to roll back to genesis.

If I've understood things correctly, this does hold after the wallet has synced 5 (edgeSize) blocks one-by-one at the tip, but not before. Ok — this seems fine.

(anviking/2035/experiments: holds at the end of prop_checkpointsEventuallyEqual, and otherwise fails with

quickCheck prop_checkpointsEventuallyEqual
*** Failed! Falsifiable (after 1 test):
GenSparseCheckpointsArgs (SparseCheckpointsConfig {edgeSize = 1, epochStability = 10}) 113
h=113
Batches [[0,1,2,3,4,5,6,7],[8],[9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24],[25,26,27,28,29],[30],[31,32,33,34,35,36,37],[38,39,40,41,42,43,44,45,46],[47,48],[49,50,51,52,53,54,55,56,57],[58,59,60,61,62,63],[64,65,66,67,68,69,70,71,72],[73,74,75,76,77,78,79],[80,81,82,83,84,85,86,87],[88,89,90],[91,92,93,94,95],[96,97,98,99,100,101,102],[103],[104],[105,106,107,108],[109,110,111],[112,113],[114]]
Tip: 46
Prev prev db: SparseCheckpointsDB [0,18,24,29,30]
Prev db: SparseCheckpointsDB [0,30,37]
New db: [0,46]
...which is db 9 out of 25
First unstable: 45
Oldest expected: 33
Cps handling max rollback: []
Future cps: [0,36,39,42,45,46]

)

…nown acceptable limit

  What we show is that, assuming the longest possible chain switch is of length k, the wallet will never need to reapply more than k + gapSize blocks, where gapSize is strictly smaller than k. So this gives the guarantee that we'll never have to re-apply more than 2k blocks in the worst scenarios, which seems very acceptable.
  This timeout is only there to check that the function doesn't blow up and stays within an 'acceptable' bound. It fails on some CI machines, so it is increased a bit, while staying within the same order of magnitude.
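
For concreteness, a worked instance of the k + gapSize bound from the commit
message above (the mainnet-like figures are assumptions, not values taken
from this PR):

import Data.Word (Word32)

worstCaseReapply :: Word32
worstCaseReapply = k + gap      -- 2880 blocks, below the 2*k = 4320 bound
  where
    k   = 2160                  -- assumed security parameter
    gap = k `div` 3             -- 720, strictly smaller than k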
@KtorZ
Member Author

KtorZ commented Sep 18, 2020

bors r+

@iohk-bors
Contributor

iohk-bors bot commented Sep 18, 2020

Build succeeded:

@iohk-bors iohk-bors bot merged commit 55d0811 into master Sep 18, 2020
@iohk-bors iohk-bors bot deleted the KtorZ/2035/long-checkpoint-times branch September 18, 2020 18:00