Skip to content

Commit

Permalink
Merge #3327
Browse files Browse the repository at this point in the history
3327: Implement `CheckpointPolicy` data type r=HeinrichApfelmus a=HeinrichApfelmus

### Issue number

ADP-1497

### Overview

This pull request implements an abstract data type `CheckpointPolicy` which describes a policy for keeping and discarding checkpoints based on their BlockHeight.

Specifically, a boolean `keepWhereTip policy tip height` indicates whether the `policy` wants to store a checkpoint at the block height `height` , given that the blockchain is at `tip`.

### Details

* The implementation of the data type is based on a function
    ```
    nextCheckpoint policy tip height :: Maybe BlockHeight
    ```
    which returns the next block height (`>= height`) at which a checkpoint should be made. This function is an efficient balance between the purely boolean predicate (`keepWhereTip`) and a full listing of all checkpoints (`toListAtTip`).
* New policies can be created from old policies by using the semigroup operation `(<>)`, which represents the union of the checkpoint listings. This simplifies the implementation of the `sparseArithmetic` policy significantly.
* The `sparseArithmetic` policy is modeled on the checkpointing policy currently implemented in `cardano-wallet`, but it is not equal to it.

### Comments

* This pull request implements the data type, but does *not* use it yet, in order to keep this PR small and easy to review. The type will be used in a future pull request.
* This data type is a reimplementation of the data type introduced in the stale pull request #3159.
    * The main reason for not merging the old PR is that a purely boolean predicate `keepWhereTip` is not suitable for use with `lightSync`. The problem is that `lightSync` does not always read individual blocks and can (potentially) skip over many BlockHeights — which would result in *no* checkpoints being created, as the predicate always returns `False`. Put differently, the checkpoints would "fall between the cracks".
    In contrast, an implementation based on `nextCheckpoint` can be used with `lightSync`, as it can now *specifically target* the next block height, at which a checkpoint is created.

Co-authored-by: Heinrich Apfelmus <heinrich.apfelmus@iohk.io>
  • Loading branch information
iohk-bors[bot] and HeinrichApfelmus committed Jun 19, 2022
2 parents df24232 + 9903b2e commit e06071d
Show file tree
Hide file tree
Showing 3 changed files with 360 additions and 0 deletions.
2 changes: 2 additions & 0 deletions lib/core/cardano-wallet-core.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -190,6 +190,7 @@ library
Cardano.Wallet.Api.Server.Tls
Cardano.Wallet.Api.Types
Cardano.Wallet.Api.Types.SchemaMetadata
Cardano.Wallet.Checkpoints.Policy
Cardano.Wallet.CoinSelection
Cardano.Wallet.CoinSelection.Internal
Cardano.Wallet.CoinSelection.Internal.Balance
Expand Down Expand Up @@ -450,6 +451,7 @@ test-suite unit
Cardano.Wallet.Api.ServerSpec
Cardano.Wallet.Api.TypesSpec
Cardano.Wallet.ApiSpec
Cardano.Wallet.Checkpoints.PolicySpec
Cardano.Wallet.CoinSelectionSpec
Cardano.Wallet.CoinSelection.InternalSpec
Cardano.Wallet.CoinSelection.Internal.BalanceSpec
Expand Down
202 changes: 202 additions & 0 deletions lib/core/src/Cardano/Wallet/Checkpoints/Policy.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,202 @@
-- |
-- Copyright: © 2022 IOHK
-- License: Apache-2.0
--
-- Abstract data type that describes a policy for keeping and discarding
-- checkpoints. To be used with the 'Checkpoints' type.
module Cardano.Wallet.Checkpoints.Policy
( BlockHeight
, CheckpointPolicy
, nextCheckpoint
, keepWhereTip
, toListAtTip

-- * Construction
, atGenesis
, atTip
, trailingArithmetic
, sparseArithmetic
, gapSize

-- * Internal invariants
-- $invariants
) where

import Prelude

import Data.List
( unfoldr )

{-------------------------------------------------------------------------------
CheckpointPolicy, abstract data type
-------------------------------------------------------------------------------}
type BlockHeight = Integer

{-| [CheckpointPolicy]
To save memory and time, we do not store every checkpoint.
Instead, a 'CheckpointPolicy' determines which checkpoints
to store and which ones to discard.
The 'extendAndPrune' functions consults such a policy and
drops checkpoints as it deems necessary.
A 'CheckpointPolicy' determines whether a checkpoint is worth storing
only based on its block height. The boolean
keepWhereTip policy tip blockheight
indicates whether the checkpoint should be stored ('True') or
not ('False').
It is important that this function does not oscillate:
If @blockheight <= tip@, the function result may change from 'True'
to 'False' as the @tip@ increases, but not the other way round.
This is because we can only create checkpoints the first time we
read the corresponding block.
TODO:
The 'Checkpoints' collection currently relies on 'Slot' instead
of 'BlockHeight' to store checkpoints. We need to better integrate
this with 'BlockHeight'.
I (Heinrich) actually prefer 'Slot'. However, not every slot contains a block,
and we would lose too many checkpoints if we based the decision of
whether to keep a checkpoint or not based on the slot number alone.
In contrast, block height is "dense".
-}
newtype CheckpointPolicy = CheckpointPolicy
{ nextCheckpoint :: BlockHeight -> BlockHeight -> Maybe BlockHeight
-- ^ Assuming that the tip of the chain is at block height @tip@,
-- @nextCheckpoint policy tip height@ returns the smallest
-- @height'@ satisfying @height' >= height#
-- at which the next checkpoint is to be made.
}

{-$invariants
Internal invariants of the 'CheckpointPolicy' type:
* 'prop_monotonicHeight' — 'nextCheckpoint' returns the same height
for all heights between a given height and the height returned.
* prop_monotonicTip' — when increasing the @tip@ height, 'nextCheckpoint'
will never return a blockheight that is smaller.
-}

-- | Assuming that the tip of the chain is at block height @tip@,
-- the value @keepWhereTip policy tip height@
-- indicates whether a checkpoint should ('True') or should not ('False')
-- be stored at @height@.
keepWhereTip
:: CheckpointPolicy -> BlockHeight -> BlockHeight
-> Bool
keepWhereTip policy tip height =
nextCheckpoint policy tip height == Just height

-- | List all checkpoints for a given tip.
toListAtTip :: CheckpointPolicy -> BlockHeight -> [BlockHeight]
toListAtTip policy tip = unfoldr (fmap next . nextCheckpoint policy tip) 0
where next x = (x,x+1)

{-------------------------------------------------------------------------------
CheckpointPolicy, construction
-------------------------------------------------------------------------------}
-- | The combination of two 'CheckpointPolicy' makes a checkpoint
-- where at least one of the policies wants to make a checkpoint.
instance Semigroup CheckpointPolicy where
p1 <> p2 = CheckpointPolicy $ \t h ->
union min (nextCheckpoint p1 t h) (nextCheckpoint p2 t h)
where
union _ Nothing mb = mb
union _ ma Nothing = ma
union f (Just a) (Just b) = Just (f a b)

instance Monoid CheckpointPolicy where
mempty = CheckpointPolicy $ \_ _ -> Nothing

-- | The 'CheckpointPolicy' that keeps only the genesis block.
atGenesis :: CheckpointPolicy
atGenesis = CheckpointPolicy $ \_tip height ->
if height <= 0 then Just height else Nothing

-- | The 'CheckpointPolicy' that only keeps the tip of the chain.
atTip :: CheckpointPolicy
atTip = CheckpointPolicy $ \tip height ->
if height <= tip then Just tip else Nothing

-- | @trailingArithmetic n height@ keeps @n@ checkpoints
-- at block heights that are multiples of @height@
-- and which are closest to the tip of the chain.
-- (Fewer than @n@ checkpoints are kept while the chain is too short
-- to accommodate all checkpoints.)
trailingArithmetic :: Integer -> BlockHeight -> CheckpointPolicy
trailingArithmetic n grid = CheckpointPolicy $ \tip height ->
case [h | h <- window tip, h >= height] of
[] -> Nothing
(x:_) -> Just x
where
window tip = [a, a + grid .. tip]
where
m = n - 1
a = if tip > m * grid then toGrid (tip - m * grid) else 0
toGrid x = (x `div` grid) * grid

{- | Note [sparseArithmeticPolicy]
The 'sparseArithmetic' checkpoint policy contains essentially two
sets of checkpoints: One fairly dense set near the tip of the chain
in order to handle frequent potential rollbacks, and one sparse
set that spans the entire epoch stability window. These two sets
are arranged as arithmetic sequences.
This policy is motivated by the following observations:
- We can't rollback for more than `k = epochStability` blocks in the past
- It is pretty fast to re-sync a few hundred blocks
- Small rollbacks near the tip may occur more often than long ones
Hence, we should strive to
- Prune any checkpoint that are more than `k` blocks in the past
- Keep only one checkpoint every `largeGap` ~100 blocks
- But still keep ~10 most recent checkpoints to cope with small rollbacks.
Roughly, the 'sparseArithmetic'
0 ..... N*largeGap .... (N+1)*largeGap .. .. M*smallGap (M+1)*smallGap tip
|_______________________________________________________________|
epochStability
Note: In the event where chain following "fails completely" (because, for
example, the node has switch to a different chain, different by more than `k`),
we have no choice but rolling back from genesis.
Therefore, we need to keep the very first checkpoint in the database, no
matter what.
-}
sparseArithmetic :: BlockHeight -> CheckpointPolicy
sparseArithmetic epochStability =
atGenesis
<> atTip
<> trailingArithmetic 10 1
<> trailingArithmetic n largeGap
where
largeGap = gapSize epochStability
n = epochStability `div` largeGap

{- | A reasonable gap size used internally in 'sparseArithmeticPolicy'.
'Reasonable' means that it's not _too frequent_ and it's not too large. A
value that is too small in front of k would require generating much more
checkpoints than necessary.
A value that is larger than `k` may have dramatic consequences in case of
deep rollbacks.
As a middle ground, we current choose `k / 3`, which is justified by:
- The current bandwidth of the network layer (several thousands blocks per seconds)
- The current value of k = 2160
So, `k / 3` = 720, which corresponds to around a second of time needed to catch
up in case of large rollbacks (if our local node has caught up already).
-}
gapSize :: BlockHeight -> Integer
gapSize epochStability = max 1 (epochStability `div` 3)
156 changes: 156 additions & 0 deletions lib/core/test/unit/Cardano/Wallet/Checkpoints/PolicySpec.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,156 @@
{-# LANGUAGE DataKinds #-}
module Cardano.Wallet.Checkpoints.PolicySpec
( spec
) where

import Prelude

import Cardano.Wallet.Checkpoints.Policy
( BlockHeight, CheckpointPolicy, nextCheckpoint, toListAtTip )
import Test.Hspec
( Spec, describe, it )
import Test.QuickCheck
( Arbitrary (..)
, Gen
, NonNegative (..)
, Property
, choose
, elements
, forAll
, frequency
, oneof
, property
, (===)
)

import qualified Cardano.Wallet.Checkpoints.Policy as CP
import qualified Data.Set as Set

spec :: Spec
spec = do
describe "general laws, tested on trailingArithmetic" $
it "(<>) is union of checkpoints" $
property prop_mappendOnTrailing

describe "general laws, tested on sparseArithmetic" $ do
it "prop_monotonicHeight — next heights do not fluctuate" $
property $ \ctx@(GenHeightContext epochStability tip) ->
forAll (genBlockHeight ctx) $ \height ->
prop_monotonicHeight
(CP.sparseArithmetic epochStability)
tip
height

it "prop_monotonicTip — heights become sparser as tip increases" $
property $ \ctx@(GenHeightContext epochStability tip) ->
forAll (genBlockHeight ctx) $ \height ->
prop_monotonicTip
(CP.sparseArithmetic epochStability)
tip
height

describe "specific policies" $ do
it "atGenesis <> atTip has exactly two checkpoints" $
property $ \(GenHeightContext _ tip) ->
CP.toListAtTip (CP.atGenesis <> CP.atTip) tip == [0,tip]

it "trailingArithmetic n _ has at most n checkpoints" $
property prop_trailingLength

it "trailingArithmetic checkpoints are located at grid points" $
property prop_trailingGrid

it "sparseArithmetic checkpoints after genesis are close to tip" $
property $ \(GenHeightContext epochStability tip) ->
maybe False (>= tip - 2*epochStability - 20) $
nextCheckpoint (CP.sparseArithmetic epochStability) tip 1

{-------------------------------------------------------------------------------
Properties, general
-------------------------------------------------------------------------------}
-- | Internal invariant.
prop_monotonicHeight
:: CheckpointPolicy
-> BlockHeight -> BlockHeight
-> Property
prop_monotonicHeight policy tip h1 = case nextCheckpoint policy tip h1 of
Nothing -> forAll ((h1 +) <$> genNonNegative) $ \h ->
nextCheckpoint policy tip h === Nothing
Just h2 -> forAll (genInterval (h1,h2)) $ \h ->
nextCheckpoint policy tip h === Just h2

-- | Internal invariant.
prop_monotonicTip
:: CheckpointPolicy
-> BlockHeight -> BlockHeight
-> Property
prop_monotonicTip policy tip height = case nextCheckpoint policy tip height of
Nothing -> property True
Just height1 -> forAll ((tip +) <$> genNonNegative) $ \tip2 ->
case nextCheckpoint policy tip2 height of
Nothing -> property False
Just height2 -> property $ height1 <= height2

prop_mappendOnTrailing :: Property
prop_mappendOnTrailing = let g17 = choose (1,7) in
forAll g17 $ \n1 -> forAll g17 $ \gap1 ->
forAll g17 $ \n2 -> forAll g17 $ \gap2 ->
let policy1 = CP.trailingArithmetic n1 gap1
policy2 = CP.trailingArithmetic n2 gap2
tip = 100
in Set.fromList (toListAtTip policy1 tip)
<> Set.fromList (toListAtTip policy2 tip)
===
Set.fromList (toListAtTip (policy1 <> policy2) tip)

{-------------------------------------------------------------------------------
Properties, specific
-------------------------------------------------------------------------------}
prop_trailingLength :: GenHeightContext -> Property
prop_trailingLength (GenHeightContext gap tip) =
forAll (choose (0,5)) $ \n ->
fromIntegral n >= length (toListAtTip (CP.trailingArithmetic n gap) tip)

prop_trailingGrid :: GenHeightContext -> Property
prop_trailingGrid (GenHeightContext gap tip) =
forAll (choose (1,7)) $ \n ->
all (`divisibleBy` gap) $ toListAtTip (CP.trailingArithmetic n gap) tip
where
a `divisibleBy` b = a `mod` b == 0

{-------------------------------------------------------------------------------
Generators
-------------------------------------------------------------------------------}
-- | Data type for generating sensible blockchain heights
data GenHeightContext = GenHeightContext
{ _epochStability :: BlockHeight
, _tip :: BlockHeight
} deriving Show

instance Arbitrary GenHeightContext where
arbitrary = do
es <- max 1 <$> oneof
[choose (0,1000000), elements [1,3,10,30,100,300,1000] ]
tip <- max 1 <$> oneof
[choose (0,1000000), choose (0,10), choose (es-200,es+200)]
pure $ GenHeightContext es tip

-- | Generate 'Integers' from an interval, skewed towards the ends.
genInterval :: (Integer, Integer) -> Gen Integer
genInterval (a,b) = clip <$> frequency
[ (1, choose (a,a+3))
, (1, choose (b,b+3))
, (5, choose (a,b))
]
where
clip = max a . min b

genNonNegative :: Gen Integer
genNonNegative = getNonNegative <$> arbitrary

genBlockHeight :: GenHeightContext -> Gen BlockHeight
genBlockHeight (GenHeightContext epochStability tip) = oneof
[ genNonNegative
, choose (0,10)
, choose (tip-2*epochStability,tip+2*epochStability)
]

0 comments on commit e06071d

Please sign in to comment.