Re-organize the DB #3189

Closed
LesnyRumcajs opened this issue Jul 14, 2023 · 0 comments · Fixed by #3193
LesnyRumcajs commented Jul 14, 2023

Issue summary

At the moment, the backend DB consists of a single column where key retrieval is not possible (keys are hashed). This introduces challenges when:

  • listing the Cids in the blockstore: while over 99% of entries are DAG-CBOR encoded with Blake2b256 as the Cid hasher, a few entries fall outside these constraints,
  • storing settings outside of the blockstore: currently, we do it semi-manually with files.

We could enable key retrieval by making ParityDb use BTree storage, but that would introduce a significant performance and disk space penalty.

Another approach is to encode the values on our own, e.g., reserve the first few bytes of a value for some metadata. This is not ideal, as it would increase disk space usage for no reason (over 99% of entries are DAG-CBOR with the Blake2b256 hasher) and would be bug-prone.

The chosen approach (a byproduct of a failed experiment #3093) is to use multiple columns.

enum DbColumn {
    /// Column for storing IPLD data with `Blake2b256` hash and `DAG_CBOR` codec.
    /// Most entries in the `blockstore` will be stored in this column.
    GraphDagCborBlake2b256,
    /// Column for storing other IPLD data (different codec or hash function).
    /// It allows key retrieval at the cost of degraded performance. Given that
    /// there will be a small number of entries in this column, the performance
    /// degradation is negligible.
    GraphFull,
    /// Column for storing any non-IPLD data. This column is not exportable.
    Other,
}
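
To illustrate how a block could be routed to one of these columns, here is a minimal sketch. `CidInfo` and `column_for_cid` are hypothetical names that are not part of the proposal; `CidInfo` is a stripped-down stand-in for a real Cid that carries only the codec and hash function codes. The numeric constants are the multicodec codes for DAG-CBOR and Blake2b-256.

```rust
/// Multicodec code for DAG-CBOR.
const DAG_CBOR: u64 = 0x71;
/// Multihash code for Blake2b-256.
const BLAKE2B_256: u64 = 0xb220;

/// Hypothetical stand-in for a real Cid: only the two fields that matter
/// when choosing a column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CidInfo {
    codec: u64,
    hash_code: u64,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DbColumn {
    GraphDagCborBlake2b256,
    GraphFull,
    Other,
}

/// Picks the column for an IPLD block (assumed routing rule): the common
/// DAG-CBOR + Blake2b256 combination goes to the dedicated column, every
/// other combination goes to the column that supports key retrieval.
fn column_for_cid(cid: CidInfo) -> DbColumn {
    match (cid.codec, cid.hash_code) {
        (DAG_CBOR, BLAKE2B_256) => DbColumn::GraphDagCborBlake2b256,
        _ => DbColumn::GraphFull,
    }
}
```

`DbColumn::Other` is never returned here, because non-IPLD data has no Cid and would be written through a separate code path.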

This change should not significantly affect the public API. Clients of the DB must remain completely oblivious to the underlying DB structure.
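
One way to keep clients unaware of the columns is a facade whose methods never mention them. The sketch below uses an in-memory map instead of ParityDb. `Db`, `SimpleCid`, the method names, and the key encoding are all hypothetical; storing settings in the `Other` column is an assumption based on the column's description.

```rust
use std::collections::HashMap;

const DAG_CBOR: u64 = 0x71; // multicodec code for DAG-CBOR
const BLAKE2B_256: u64 = 0xb220; // multihash code for Blake2b-256

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum DbColumn {
    GraphDagCborBlake2b256,
    GraphFull,
    Other,
}

/// Hypothetical simplified Cid: codec, hash function code, and digest.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SimpleCid {
    codec: u64,
    hash_code: u64,
    digest: Vec<u8>,
}

impl SimpleCid {
    /// Column selection stays private to the DB layer.
    fn column(&self) -> DbColumn {
        if self.codec == DAG_CBOR && self.hash_code == BLAKE2B_256 {
            DbColumn::GraphDagCborBlake2b256
        } else {
            DbColumn::GraphFull
        }
    }

    /// Illustrative key encoding only; this is not the real Cid byte layout.
    fn key(&self) -> Vec<u8> {
        let mut key = Vec::with_capacity(16 + self.digest.len());
        key.extend_from_slice(&self.codec.to_be_bytes());
        key.extend_from_slice(&self.hash_code.to_be_bytes());
        key.extend_from_slice(&self.digest);
        key
    }
}

/// In-memory stand-in for the backend; entries are keyed by (column, key).
#[derive(Default)]
struct Db {
    entries: HashMap<(DbColumn, Vec<u8>), Vec<u8>>,
}

impl Db {
    /// Stores an IPLD block; the caller never names a column.
    fn put_block(&mut self, cid: &SimpleCid, data: &[u8]) {
        self.entries.insert((cid.column(), cid.key()), data.to_vec());
    }

    /// Looks a block up in the same column that `put_block` would use.
    fn get_block(&self, cid: &SimpleCid) -> Option<&[u8]> {
        self.entries.get(&(cid.column(), cid.key())).map(Vec::as_slice)
    }

    /// Stores non-IPLD data (for example a setting) in the `Other` column.
    fn write_setting(&mut self, name: &str, value: &[u8]) {
        self.entries
            .insert((DbColumn::Other, name.as_bytes().to_vec()), value.to_vec());
    }

    /// Reads non-IPLD data back from the `Other` column.
    fn read_setting(&self, name: &str) -> Option<&[u8]> {
        self.entries
            .get(&(DbColumn::Other, name.as_bytes().to_vec()))
            .map(Vec::as_slice)
    }
}
```

Callers only see `put_block`, `get_block`, `write_setting`, and `read_setting`, so the column layout can change without touching them.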

Other information and links
