State-only checkpoint state startup #4251

arnetheduck · 2022-10-21T00:08:06Z

Currently, we require genesis and a checkpoint block and state to start from an arbitrary slot - this PR relaxes this requirement so that we can start with a state alone.

The current trusted-node-sync algorithm works by first downloading blocks until we find an epoch aligned non-empty slot, then downloads the state via slot.

However, current proposals for checkpointing prefer finalized state as the main reference - this allows more simple access control and caching on the server side - in particular, this should help checkpoint-syncing from sources that have a fast finalized state download (like teku) but are slow when accessing state via slot.

Earlier versions of Nimbus will not be able to read databases created without a checkpoint block and genesis. In most cases, backfilling makes the database compatible except where genesis is also missing (custom networks).

backfill checkpoint block from libp2p instead of checkpoint source, when doing trusted node sync
allow starting the client without genesis / checkpoint block
perform epoch start slot lookahead when loading tail state, so as to deal with the case where the epoch start slot does not have a block
replace --blockId with --state-id in TNS command line
when replaying, also look at the parent of the last-known-block (even if we don't have the parent block data, we can still replay from a "parent" state) - in particular, this clears the way for implementing state pruning
deprecate --finalized-checkpoint-block option (no longer needed)

github-actions · 2022-10-21T03:04:24Z

Unit Test Results

      9 files ±0   681 suites +3 20m 58s ⏱️ -10s
1 998 tests +1 1 849 ✔️ +1 149 💤 ±0 0 ❌ ±0
8 120 runs +3 7 947 ✔️ +3 173 💤 ±0 0 ❌ ±0

Results for commit 86469f9. ± Comparison against base commit 655e2fa.

beacon_chain/consensus_object_pools/blockchain_dag.nim

tersec · 2022-11-01T16:36:36Z

beacon_chain/trusted_node_sync.nim

+  let
+    missingSlots = block:
+      var total = 0
+      for i in 0..<checkpointSlot.int:


In theory, one can avoid this Slot <-> int conversion by writing

for i in 0.Slot ..< checkpointSlot: discard dbCache.isKnown(i)

where the rest of the loop aside, i remains a slot the entire time. At some point, there were Nim issues around this construct, so maybe not reliable enough to use.

Also, the range of int isn't necessarily the range of Slot -- seems better to at least use uint64 here.

68e5446 - that said, in constructs like for a in 0..<slot.int, there's a real advantage of doing the int conversion up-front in that the loop won't start at all if the slot is off (say because the code is missing a check or has an underflow) meaning less risk of corruption due to side effects in the loop..

tersec · 2022-11-01T16:48:27Z

beacon_chain/trusted_node_sync.nim

@@ -23,13 +23,16 @@ type
    summaries: Table[Eth2Digest, BeaconBlockSummary]
    slots: seq[Option[Eth2Digest]]

+proc updateSlots(cache: var DbCache, slot: Slot) =
+  if cache.slots.len() < slot.int + 1:


Another int/uint64 thing -- ultimately, it uses setLen in the line below, which means that it's dealing with an int regardless, but still seems better to avoid potentially lossy conversions when feasible, e.g.,

if cache.slots.lenu64() < slot + 1

The dependent on int is ultimately a dependence on seq and that built-in set of types. My preference is, even if that's going to be the case indefinitely, centralize as much of these type conversions to/from int from not-at-all-int types in one place, even when they're currently not feasible to avoid entirely.

tersec · 2022-11-01T16:59:09Z

beacon_chain/trusted_node_sync.nim

+          let tmp = StateIdent.decodeString(stateId).valueOr:
+            error "Cannot decode checkpoint state id, must be a slot, hash, 'finalized' or 'head'"
+            quit 1
+          if tmp.kind == StateQueryKind.Slot:


Equivalent to checking for tmp.kind == StateQueryKind.Slot and not tmp.slot.is_epoch() and else, tmp, no?

Seems clearer to avoid unnecessarily nested if statements.

Currently, we require genesis and a checkpoint block and state to start from an arbitrary slot - this PR relaxes this requirement so that we can start with a state alone. The current trusted-node-sync algorithm works by first downloading blocks until we find an epoch aligned non-empty slot, then downloads the state via slot. However, current [proposals](ethereum/beacon-APIs#226) for checkpointing prefer finalized state as the main reference - this allows more simple access control and caching on the server side - in particular, this should help checkpoint-syncing from sources that have a fast `finalized` state download (like infura and teku) but are slow when accessing state via slot. Earlier versions of Nimbus will not be able to read databases created without a checkpoint block and genesis. In most cases, backfilling makes the database compatible except where genesis is also missing (custom networks). * backfill checkpoint block from libp2p instead of checkpoint source, when doing trusted node sync * allow starting the client without genesis / checkpoint block * perform epoch start slot lookahead when loading tail state, so as to deal with the case where the epoch start slot does not have a block * replace `--blockId` with `--state-id` in TNS command line * when replaying, also look at the parent of the last-known-block (even if we don't have the parent block data, we can still replay from a "parent" state) - in particular, this clears the way for implementing state pruning * deprecate `--finalized-checkpoint-block` option (no longer needed)

arnetheduck force-pushed the state-tns branch from 78b4ebf to 86469f9 Compare October 21, 2022 00:17

arnetheduck mentioned this pull request Oct 21, 2022

Checkpoint Sync API ethereum/beacon-APIs#226

Closed

tersec reviewed Nov 1, 2022

View reviewed changes

beacon_chain/consensus_object_pools/blockchain_dag.nim Show resolved Hide resolved

tersec reviewed Nov 1, 2022

View reviewed changes

tersec approved these changes Nov 1, 2022

View reviewed changes

arnetheduck added 2 commits November 2, 2022 07:27

style

ae0d1ca

arnetheduck force-pushed the state-tns branch from 86469f9 to ae0d1ca Compare November 2, 2022 06:33

one more

e56659b

arnetheduck enabled auto-merge (squash) November 2, 2022 06:46

arnetheduck merged commit d839b9d into unstable Nov 2, 2022

arnetheduck deleted the state-tns branch November 2, 2022 10:02

arnetheduck mentioned this pull request Nov 8, 2022

[FEAT] Support genesis from a merged state #4193

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

State-only checkpoint state startup #4251

State-only checkpoint state startup #4251

arnetheduck commented Oct 21, 2022 •

edited

github-actions bot commented Oct 21, 2022

tersec Nov 1, 2022

tersec Nov 1, 2022

arnetheduck Nov 2, 2022

tersec Nov 1, 2022

tersec Nov 1, 2022

arnetheduck Nov 2, 2022

State-only checkpoint state startup #4251

State-only checkpoint state startup #4251

Conversation

arnetheduck commented Oct 21, 2022 • edited

github-actions bot commented Oct 21, 2022

Unit Test Results

tersec Nov 1, 2022

Choose a reason for hiding this comment

tersec Nov 1, 2022

Choose a reason for hiding this comment

arnetheduck Nov 2, 2022

Choose a reason for hiding this comment

tersec Nov 1, 2022

Choose a reason for hiding this comment

tersec Nov 1, 2022

Choose a reason for hiding this comment

arnetheduck Nov 2, 2022

Choose a reason for hiding this comment

arnetheduck commented Oct 21, 2022 •

edited