Hot/cold block storage #754

arnetheduck · 2020-02-19T09:27:11Z

We're currently using a key-value store for storing states and blocks. Due to the nature of eth2, when finalization happens a single history of blocks is chosen to be canonical, so finalized blocks could be stored efficiently in cold storage: a flat, append-only file.

There are a few ways to design this - an example is keeping two files: one for blocks (which are variable-size) and an index file which contains fixed-size offsets - this would allow random-access to blocks by their block number.
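A minimal sketch of the two-file layout described above, not an actual implementation from the code base. The names `ColdStore`, `appendBlock` and `getBlock` and the file names are hypothetical. It assumes 8-byte offsets written in host byte order and uses Nim strings as byte buffers for brevity; a real format would pin the endianness and handle a crash between the two appends.

```nim
import std/os

const OffsetSize = sizeof(int64)  # every index entry is one fixed-size offset

type ColdStore = object
  blocksPath: string  # variable-size blocks, appended back to back
  indexPath: string   # entry n holds the start offset of block n

proc appendBlock(store: ColdStore, data: string) =
  ## Appends `data` to the block file and records where it starts.
  var offset: int64 = 0
  if fileExists(store.blocksPath):
    offset = getFileSize(store.blocksPath)
  let blocks = open(store.blocksPath, fmAppend)
  defer: blocks.close()
  blocks.write(data)
  let index = open(store.indexPath, fmAppend)
  defer: index.close()
  # Host byte order for brevity (assumption); pin the endianness in practice.
  discard index.writeBuffer(addr offset, OffsetSize)

proc readOffset(index: File, n: int64): int64 =
  ## Reads index entry `n`.
  index.setFilePos(n * int64(OffsetSize))
  if index.readBuffer(addr result, OffsetSize) != OffsetSize:
    raise newException(IOError, "truncated index file")

proc getBlock(store: ColdStore, n: int64): string =
  ## Random access by block number: index lookup, then one block read.
  let index = open(store.indexPath, fmRead)
  defer: index.close()
  let count = index.getFileSize() div int64(OffsetSize)
  if n < 0 or n >= count:
    raise newException(KeyError, "no such block: " & $n)
  let start = readOffset(index, n)
  # A block ends where the next one starts, or at the end of the block file.
  let stop =
    if n + 1 < count: readOffset(index, n + 1)
    else: int64(getFileSize(store.blocksPath))
  let blocks = open(store.blocksPath, fmRead)
  defer: blocks.close()
  blocks.setFilePos(start)
  result = newString(int(stop - start))
  if result.len > 0 and
      blocks.readBuffer(addr result[0], result.len) != result.len:
    raise newException(IOError, "truncated block file")

# Usage: start from empty files, append two blocks, read them back by number.
let store = ColdStore(blocksPath: getTempDir() / "blocks.dat",
                      indexPath: getTempDir() / "blocks.idx")
removeFile(store.blocksPath)
removeFile(store.indexPath)
store.appendBlock("first")
store.appendBlock("second block")
doAssert store.getBlock(1) == "second block"
```

Because the index entries are fixed-size, finding block `n` costs one seek to `n * 8` in the index, no matter how large the block file grows.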

It also probably makes sense to store block hashes - these can either go in the block file or a third file containing only hashes.

Another design is to keep offsets in the ordinary key-value store (for example with slot number as key, offset and hash as value) so that in total we have the kvstore and one cold-store file to deal with.

Finally, the block graph is currently stored in-memory - possibly, this could sit in the database as well, saving memory but increasing database traffic - the tradeoff is not clear here as the block graph is "fairly" light-weight.

zah · 2020-02-25T12:40:59Z

The API of the cold database should allow us to efficiently memory-map the SSZ representation of a particular block without loading it in memory. We can use SszNavigator objects to extract any data of interest:

https://github.com/status-im/nim-beacon-chain/blob/8ab0248209aba82cfdf6e64dacf2e21753a5a55a/tests/test_ssz.nim#L85-L89

Since Nim cannot safely return openArrays yet, the best way to design the API is to rely on callback closures that receive the memory-mapped data as an argument:

https://github.com/status-im/nim-beacon-chain/blob/2a67ac3c05859af682994facc36e646a3febc24a/tests/test_kvstore.nim#L32-L34
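As a rough illustration of the callback style, here is a sketch that uses the standard library's `std/memfiles` to map a file and hand a byte range to a closure. `withBlock`, `BlockProc` and the file name are hypothetical names, not the kvstore API linked above, and the SszNavigator step is left out; the whole file is mapped for simplicity.

```nim
import std/memfiles

type BlockProc = proc(data: openArray[byte])

proc withBlock(path: string, start, size: int, cb: BlockProc) =
  ## Maps `path` read-only and passes bytes [start, start + size) to `cb`.
  ## The view is only valid during the callback: the mapping is closed
  ## on return, which is why the data is not returned to the caller.
  var mf = memfiles.open(path, mode = fmRead)
  defer: mf.close()
  doAssert start >= 0 and size > 0 and start + size <= mf.size
  let bytes = cast[ptr UncheckedArray[byte]](mf.mem)
  cb(toOpenArray(bytes, start, start + size - 1))

# Usage: look at a 7-byte "block" that follows a 6-byte header.
writeFile("blocks.dat", "headerPAYLOAD")
var payloadLen = 0
var firstByte: byte
withBlock("blocks.dat", 6, 7, proc(data: openArray[byte]) =
  payloadLen = data.len
  firstByte = data[0])
doAssert payloadLen == 7
doAssert char(firstByte) == 'P'
```

The callback reads directly from the mapped pages, so nothing is copied unless the closure chooses to copy.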

The description above by @arnetheduck focuses on our needs for storing the history of BeaconBlocks. Please note that we'll also need to store the latest finalized state and potentially periodic snapshots of earlier states. It may be premature to propose designs for this as we're planning to introduce some level of data sharing between different beacon states that may be also used in the on-disk representation.

disruptek · 2020-02-29T20:20:04Z

Here's a simple nimterop wrapper for lmdb. It's pretty great. Golden isn't a super example of its use, honestly. Maybe I'll finish it someday.

Anyway, if you like this API, you can use it to close this issue.

```nim
import os

import nimterop/[build, cimport]

const
  baseDir = getProjectCacheDir("nimlmdb")

static:
  #cDebug()

  gitPull(
    "https://github.com/LMDB/lmdb",
    outdir = baseDir,
    checkout = "mdb.master"
  )

getHeader(
  "lmdb.h",
  outdir = baseDir / "libraries" / "liblmdb"
)

type
  mode_t = uint32

when defined(lmdbStatic):
  cImport(lmdbPath)
else:
  cImport(lmdbPath, dynlib = "lmdbLPath")
```

arnetheduck · 2020-03-01T09:54:34Z

see also https://github.com/status-im/nim-beacon-chain/blob/devel/beacon_chain/kvstore_lmdb.nim - we've tried lmdb but it has issues on 32-bit platforms and needs local patching on windows - it's not great for our use case.

we use sqlite for now which also uses mmap if available but something else otherwise.

the point here is though that we don't want a database at all - the nature of the data is such that it's append-only - it allows for a very robust and trivially simple implementation with a flat file and an accompanying flat index - the lmdb btree would be overkill.

arnetheduck · 2020-03-01T12:02:11Z

re nimterop, we have a preference not to have it as a dependency for whoever is building the code - see https://github.com/arnetheduck/nim-sqlite3-abi (we've produced wrappers manually as well as with c2nim, for this reason)

disruptek · 2020-03-05T00:49:01Z

It sounds like the best course of action is to let @protolambda tell us when the design is fairly stable and then use it to inform the hot/cold storage approach. It sounds like there may be two layers required; one which is append-only and never requires compaction, and another that is append-only and rarely requires compaction.

But I'm really trying to read between the lines here on something I know nothing at all about. 😉

arnetheduck · 2020-03-05T09:40:33Z

This part of the design is stable: the way ethereum 2 works is that once finalization happens, there is never any rollback - the blocks that are older than the finalization point form a simple linear history and are thus append-only.

The blocks that are newer than finalization will be accessed randomly by hash - this is why they should be stored in an "ordinary" key/value store to begin with - even if it's likely that they are "almost-linear", we shouldn't make that assumption right now as it may open up the potential for DoS attacks if accessing random non-finalized blocks is not constant time.

for some intuition as to what kind of requests will be made from the database, the networking spec is a good source:
https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#beaconblocksbyrange
https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#beaconblocksbyroot

the finalized blocks are accessed pretty much by their slot number while non-finalized blocks are accessed randomly - the databases in use should reflect this, storing the former in an append-only store and the latter in.. well, they can stay in the KV store for now - there's an upper bound of about two weeks worth of blocks for how many there can be in the system.
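The split described above could be sketched in memory like this. Everything here is illustrative: `BlockStore`, `putHot`, `finalize`, `blockBySlot` and `blockByRoot` are hypothetical names, roots are plain strings instead of 32-byte hashes, and the "cold" side is a `seq` standing in for the append-only file.

```nim
import std/[options, tables]

type
  Root = string  # stand-in for a 32-byte block root
  HotBlock = tuple[slot: int, data: string]
  BlockStore = object
    cold: seq[string]           # finalized: index is the slot, "" = no block
    hot: Table[Root, HotBlock]  # non-finalized: random access by root

proc putHot(s: var BlockStore, root: Root, slot: int, data: string) =
  ## New blocks always enter the hot key/value store first.
  s.hot[root] = (slot: slot, data: data)

proc finalize(s: var BlockStore, canonical: openArray[Root]) =
  ## Moves the canonical chain (oldest first) to cold storage, then drops
  ## hot blocks on abandoned forks at or below the finalized slot.
  for root in canonical:
    let blk = s.hot[root]  # raises KeyError for an unknown root
    if s.cold.len <= blk.slot:
      s.cold.setLen(blk.slot + 1)
    s.cold[blk.slot] = blk.data
    s.hot.del(root)
  var stale: seq[Root]
  for root, blk in s.hot:
    if blk.slot < s.cold.len:
      stale.add(root)
  for root in stale:
    s.hot.del(root)

proc blockBySlot(s: BlockStore, slot: int): Option[string] =
  ## Finalized history: direct lookup by slot number.
  if slot >= 0 and slot < s.cold.len and s.cold[slot].len > 0:
    result = some(s.cold[slot])

proc blockByRoot(s: BlockStore, root: Root): Option[string] =
  ## Non-finalized blocks: hash-table lookup by root.
  if root in s.hot:
    result = some(s.hot[root].data)

# Usage: two canonical blocks, one fork at slot 1, one block past finality.
var s = BlockStore(hot: initTable[Root, HotBlock]())
s.putHot("a", 0, "block a")
s.putHot("b", 1, "block b")
s.putHot("b'", 1, "fork of b")
s.putHot("c", 2, "block c")
s.finalize(["a", "b"])
doAssert s.blockBySlot(1) == some("block b")
doAssert s.blockByRoot("b'").isNone
doAssert s.blockByRoot("c") == some("block c")
```

In this sketch a finalized block can no longer be found by root; serving by-root requests for finalized blocks would need an extra root-to-slot index of the kind discussed earlier in the thread.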

JGcarv · 2020-03-13T15:10:40Z

Hello! I'm interested in this one! Is there still time for taking it? Thanks!

zah · 2020-03-13T18:06:49Z

It's all yours, @JGcarv. We'll be happy to fund 2 days of work for creating a very basic initial implementation with an accompanying test suite. After reviewing the initial results, we will reassess the goals and suggest further directions.

JGcarv · 2020-03-13T19:35:06Z

Awesome. Thank you!

arnetheduck · 2021-03-11T08:27:33Z

This topic has evolved a little since we last looked at it:

- e2store: add era format #2382 provides a flat storage format that combines a state with the blocks that lead up to it. The interesting part here is that the file is self-contained, trivially verifiable and has all the roots and keys needed to fully validate the data. Starting with an era file for the genesis state, we can produce a new era file every 8192 blocks (once per day more or less).
- Because the flat file format is verifiable, it's also suitable for wider distribution, such as when dealing with weak subjectivity sync.
- Between head and the latest era, we can use https://github.com/status-im/nimbus-eth2/blob/stable/beacon_chain/statediff.nim and immutable validator database factoring #2297 to efficiently store states and diffs. These two features taken together mean we'll have a good balance between small footprint and simplicity of use, especially if the era files are indexed.
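To make the era cadence concrete, here is a small arithmetic sketch. It assumes the interval of 8192 is counted in slots and that a slot lasts 12 seconds (the mainnet value, which this thread does not state); `eraOf` and `firstSlot` are hypothetical helpers, and the boundary convention of the actual era format may differ.

```nim
const
  SlotsPerEra = 8192   # the interval mentioned above, taken as slots
  SecondsPerSlot = 12  # assumed mainnet slot time, not stated in this thread

func eraOf(slot: int): int =
  ## Index of the era that a slot falls into.
  slot div SlotsPerEra

func firstSlot(era: int): int =
  ## First slot covered by an era.
  era * SlotsPerEra

doAssert eraOf(8191) == 0
doAssert eraOf(8192) == 1
doAssert firstSlot(2) == 16384
# 8192 slots * 12 s = 98304 s, a little over 27 hours per era.
echo "hours per era: ", SlotsPerEra * SecondsPerSlot / 3600
```

Under these assumptions an era spans a little over 27 hours, which matches the "once per day more or less" estimate.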

A downside of this approach is that we lose the "here's an sqlite database with everything" world - but that's already the case somewhat with the slashing protection, validator keys and secrets being separate.

TennisBowling · 2022-02-15T16:55:51Z

this should be closed

tersec · 2022-02-15T17:14:06Z

Why? Nimbus still doesn't really have the hot/cold storage distinction this issue proposes. Era files have been gradually developing (#3394 develops them a bit further, for example), but they're not yet functionally exposed to end-users except via ncli_db.

TennisBowling · 2022-02-16T04:55:02Z

It seems that hot/cold storage wasn't being pursued anymore:

> since this PR was created, we've pivoted towards using era files and state diffs as a future direction for hot/cold - closing as obsolete

#835

tersec · 2022-02-16T06:24:36Z

Yes, "using era files and state diffs as a future direction for hot/cold". That particular PR was closed as obsolete, but hot/cold block storage remains a goal, and as the sentence you quote suggests, era files and state diffs, neither of which is really end-user-visible yet modulo ncli/ncli_db, are the current approach to achieving that. This issue tracks hot/cold block storage overall, not just as a proxy for that one PR.

TennisBowling · 2022-02-16T17:13:51Z

ah I see. thank you

arnetheduck · 2022-10-27T08:41:35Z

The era store provides hot/cold storage functionality - further work in this area will be tracked separately: https://nimbus.guide/era-store.html

arnetheduck added the good first issue label Feb 19, 2020

arnetheduck changed the title ~~Hot/cold storage~~ Hot/cold block storage Feb 19, 2020

zah added the bounty label Feb 25, 2020

tersec mentioned this issue Mar 13, 2020

remove lmdb #809

Merged

JGcarv mentioned this issue Mar 20, 2020

[WIP] Hot/Cold Storage #820

Closed

arnetheduck mentioned this issue Jun 28, 2021

WIP - Hot/Cold Storage #835

Closed

arnetheduck closed this as completed Oct 27, 2022
