Skip to content

Commit

Permalink
Merge #2143
Browse files Browse the repository at this point in the history
2143: remove storage of intermediate checkpoints in restore blocks r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->

#2035

# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->

  This was a perhaps too-early optimization trying to reduce the time of
  rolling back. The `sparseCheckpoint` function return an empty list for
  pretty much the entire restoration, except when reaching the last k
  blocks where it'll return a list of checkpoints to save, sparse for
  older blocks and dense near the tip. With the current parameters,
  blocks are kept if:

  - Their blockheight is not older than (tip - k) or it is 0
  - And their blockheight is a multiple of 100 or, they are near within
    10 blocks from the last known block.

  We currently do this calculation and filtering in two places:

  1. In `restoreBlocks`, to pre-filter checkpoints to store in the database
  2. In `prune` from the wallet DBLayer, to garbage collect old checkpoints

  Yet, what (1) buys us is a very little gain on standard wallet, and a huge
  performance cost on large wallets. So let's analyze the two
  cases:

  A/ Small Wallets

  - The time to create a checkpoint is very small in front of the slot
    length.

  - Restoring blocks is fast, (up to 10K blocks per seconds on empty
    wallets).

  Therefore, rolling back of 2, 3 blocks or, 100 makes pretty much no
  difference. Being able to store more checkpoints near the tip adds
  very little benefits in terms of performances especially, for the
  first restoration.

  B/ Large Wallets

  - The time to create a checkpoint is important in front of the slot
    length (we've seen up to 4s).

  - Restoring blocks is still quite fast (the time needed for processing
    blocks remains quite small in front of the time needed to read and create
    new checkpoints).

  The main problem with large wallets occur when the wallet is almost
  synced and reaches the 10 last blocks of the chain. By trying to store
  intermediate checkpoints, not only does the wallet spent 10* more time
  in `restoreBlocks` than normally, but it also keep the database lock
  for all that duration. Consider the case where the wallet takes 4s to
  read, and 4s to create a checkpoint, plus some additional 2s to prune
  them (these are actual data from large exchanges), by default, 10s is
  spent for creating one checkpoint. And at least 40 more to create the
  intermediate ones. During this time, between 1 and 3 checkpoints have
  been created. So it already needs to prune out the last one it spends
  12s to create and needs already to create new checkpoints right away.

  As a consequence, a lot of other functionalities are made needlessly
  slower than they could be, because for the whole duration of the
  `restoreBlocks` function, the wallet is holding the database lock.

  Now, what happen if we avoid storing the "intermediate" checkpoints in
  restore blocks: blocks near the tip will eventually get stored, but
  one by one. So, when we _just_ reach the top for the first time, we'll
  only store the last checkpoint. But then, the next 9 checkpoints
  created won't be pruned out. So, the worse that can happen is that the
  wallet is asked to rollback right after we've reached the tip and
  haven't created many checkpoints yet. Still, We would have at least
  two checkpoints in the past that are at most 2K blocks from the tip
  (because we fetch blocks by batches of 1000). So it's important that
  the batch size remains smaller than `k` so that we can be sure that
  there's always one checkpoint in the database.


# Comments

<!-- Additional comments or screenshots to attach if any -->

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Assign the PR to a corresponding milestone
 ✓ Acknowledge any changes required to the Wiki
-->


Co-authored-by: KtorZ <matthias.benkort@gmail.com>
  • Loading branch information
iohk-bors[bot] and KtorZ committed Sep 18, 2020
2 parents d393d82 + 4cb977e commit 55d0811
Show file tree
Hide file tree
Showing 5 changed files with 272 additions and 68 deletions.
19 changes: 18 additions & 1 deletion lib/core/src/Cardano/Wallet.hs
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,8 @@ import Cardano.Wallet.DB
, ErrRemovePendingTx (..)
, ErrWalletAlreadyExists (..)
, PrimaryKey (..)
, SparseCheckpointsConfig (..)
, defaultSparseCheckpointsConfig
, sparseCheckpoints
)
import Cardano.Wallet.Network
Expand Down Expand Up @@ -843,7 +845,22 @@ restoreBlocks ctx wid blocks nodeTip = db & \DBLayer{..} -> mapExceptT atomicall
liftIO $ logDelegation delegation
putDelegationCertificate (PrimaryKey wid) cert slotNo

let unstable = sparseCheckpoints k (nodeTip ^. #blockHeight)
let unstable = sparseCheckpoints cfg (nodeTip ^. #blockHeight)
where
-- NOTE
-- The edge really is an optimization to avoid rolling back too
-- "far" in the past. Yet, we let the edge construct itself
-- organically once we reach the tip of the chain and start
-- processing blocks one by one.
--
-- This prevents the wallet from trying to create too many
-- checkpoints at once during restoration which causes massive
-- performance degradation on large wallets.
--
-- Rollback may still occur during this short period, but
-- rolling back from a few hundred blocks is relatively fast
-- anyway.
cfg = (defaultSparseCheckpointsConfig k) { edgeSize = 0 }

forM_ (NE.init cps) $ \cp' -> do
let (Quantity h) = currentTip cp' ^. #blockHeight
Expand Down
72 changes: 58 additions & 14 deletions lib/core/src/Cardano/Wallet/DB.hs
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE RecordWildCards #-}

Expand All @@ -20,6 +22,9 @@ module Cardano.Wallet.DB

-- * Checkpoints
, sparseCheckpoints
, SparseCheckpointsConfig (..)
, defaultSparseCheckpointsConfig
, gapSize

-- * Errors
, ErrRemovePendingTx (..)
Expand Down Expand Up @@ -59,7 +64,7 @@ import Control.Monad.Trans.Except
import Data.Quantity
( Quantity (..) )
import Data.Word
( Word32, Word64 )
( Word32, Word64, Word8 )
import Numeric.Natural
( Natural )

Expand Down Expand Up @@ -381,27 +386,66 @@ cleanDB DBLayer{..} = atomically $
-- Therefore, we need to keep the very first checkpoint in the database, no
-- matter what.
sparseCheckpoints
:: Quantity "block" Word32
-- ^ Epoch Stability, i.e. how far we can rollback
:: SparseCheckpointsConfig
-- ^ Parameters for the function.
-> Quantity "block" Word32
-- ^ A given block height
-> [Word32]
-- ^ The list of checkpoint heights that should be kept in DB.
sparseCheckpoints epochStability blkH =
sparseCheckpoints cfg blkH =
let
gapsSize = 100
edgeSize = 10

k = getQuantity epochStability
SparseCheckpointsConfig{edgeSize,epochStability} = cfg
g = gapSize cfg
h = getQuantity blkH
e = fromIntegral edgeSize

minH =
let x = if h < k then 0 else h - k
in gapsSize * (x `div` gapsSize)
let x = if h < epochStability + g then 0 else h - epochStability - g
in g * (x `div` g)

initial = 0
longTerm = [minH,minH+gapsSize..h]
shortTerm = if h < edgeSize
longTerm = [minH,minH+g..h]
shortTerm = if h < e
then [0..h]
else [h-edgeSize,h-edgeSize+1..h]
else [h-e,h-e+1..h]
in
L.sort $ L.nub $ initial : (longTerm ++ shortTerm)
L.sort (L.nub $ initial : (longTerm ++ shortTerm))

-- | Captures the configuration for the `sparseCheckpoints` function.
--
-- NOTE: large values of 'edgeSize' aren't recommended as they would mean
-- storing many unnecessary checkpoints. In Ouroboros Praos, there's a
-- reasonable probability for small forks of a few blocks so it makes sense to
-- maintain a small part that is denser near the edge.
data SparseCheckpointsConfig = SparseCheckpointsConfig
{ edgeSize :: Word8
, epochStability :: Word32
} deriving Show

-- | A sensible default to use in production. See also 'SparseCheckpointsConfig'
defaultSparseCheckpointsConfig :: Quantity "block" Word32 -> SparseCheckpointsConfig
defaultSparseCheckpointsConfig (Quantity epochStability) =
SparseCheckpointsConfig
{ edgeSize = 5
, epochStability
}

-- | A reasonable gap size used internally in 'sparseCheckpoints'.
--
-- 'Reasonable' means that it's not _too frequent_ and it's not too large. A
-- value that is too small in front of k would require generating much more
-- checkpoints than necessary.
--
-- A value that is larger than `k` may have dramatic consequences in case of
-- deep rollbacks.
--
-- As a middle ground, we current choose `k / 3`, which is justified by:
--
-- - The current speed of the network layer (several thousands blocks per seconds)
-- - The current value of k = 2160
--
-- So, `k / 3` = 720, which should remain around a second of time needed to catch
-- up in case of large rollbacks.
gapSize :: SparseCheckpointsConfig -> Word32
gapSize SparseCheckpointsConfig{epochStability} =
epochStability `div` 3
4 changes: 3 additions & 1 deletion lib/core/src/Cardano/Wallet/DB/Sqlite.hs
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ import Cardano.Wallet.DB
, ErrRemovePendingTx (..)
, ErrWalletAlreadyExists (..)
, PrimaryKey (..)
, defaultSparseCheckpointsConfig
, sparseCheckpoints
)
import Cardano.Wallet.DB.Sqlite.TH
Expand Down Expand Up @@ -1230,7 +1231,8 @@ pruneCheckpoints
pruneCheckpoints wid cp = do
let height = Quantity $ fromIntegral $ checkpointBlockHeight cp
let epochStability = Quantity $ checkpointEpochStability cp
let cps = sparseCheckpoints epochStability height
let cfg = defaultSparseCheckpointsConfig epochStability
let cps = sparseCheckpoints cfg height
deleteCheckpoints wid [ CheckpointBlockHeight /<-. cps ]

-- | Delete TxMeta values for a wallet.
Expand Down
Loading

0 comments on commit 55d0811

Please sign in to comment.