Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition from RocksDB to Rust based database implementation #13340

Closed
wants to merge 15 commits into from

Conversation

sebastiencs
Copy link
Contributor

Explain your changes:

This PR removes the RocksDB dependency, replacing it with a Rust implementation.
The new database follows the same API than the previous rocksdb package, making it a drop-in replacement.
Compression is used on both keys and values, using zstd algorithm.
Encoding and decoding of data remains the same with binprot.
crc32 is used to validate data integrity.

Explain how you tested your changes:

This new database implementation has been tested by running the Mina node, including this implementation, on berkeleynet for a few days now.

Implementation details

The database is a single append-only file, where each entry (key-value) are preceded by a header.
Each entry in the Database has the following structure:

+------------+-----------+-----------+
|    HDR     |    KEY    |   VALUE   |
| (17 bytes) | (X bytes) | (X bytes) |
+------------+-----------+-----------+
      ^
      |
      |
+------------+--------------+-----------+------------+
| KEY_LENGTH | VALUE_LENGTH | BITFLAGS  |   CRC32    |
| (4 bytes)  | (8 bytes)    | (1 byte)  | (4 bytes)  |
+------------+--------------+-----------+------------+

Where:

  • HDR: A 17 bytes header
    • KEY_LENGTH: Length of the key, stored in 4 bytes.
    • VALUE_LENGTH: Length of the value, stored in 8 bytes.
    • BITFLAGS: A 1 byte bitflags including:
      • key_is_compressed: A flag indicating if the key is compressed.
      • value_is_compressed: A flag indicating if the value is compressed.
      • is_removed: A flag indicating if the entry has been removed.
    • CRC32: The CRC32 checksum of the entry (including its header), stored in 4 bytes.
  • KEY: The key data
  • VALUE: The value data

Incomplete Work

Currently the Rust code is compiled as a dynamic library:
Compiling and running Mina requires the user to set manually LD_LIBRARY_PATH=_build/default/src/lib/keyvaluedb_rust/
We can't compile it statically because proof-systems is a static library, and using 2 Rust static libraries doesn't seem to be supported.
An alternative would be to make use of OCaml virtual library, in the same way than kimchi

Checklist:

  • Dependency versions are unchanged
    • Notify Velocity team if dependencies must change in CI
  • Modified the current draft of release notes with details on what is completed or incomplete within this project
  • Document code purpose, how to use it
    • Mention expected invariants, implicit constraints
  • Tests were added for the new behavior
    • Document test purpose, significance of failures
    • Test names should reflect their purpose
  • All tests pass (CI will check this if you didn't)
  • Serialized types are in stable-versioned modules
  • Does this close issues? List them

@psteckler
Copy link
Member

Why would this be a worthwhile change?

Is there a quantifiable performance improvement?

@bkase
Copy link
Member

bkase commented Jun 9, 2023

Same question as @psteckler — What is the motivation to move away from a stable battle-tested kvstore here? I’m not a databases expert, but I know that it is very tricky to thoroughly test these sorts of systems and the Lindy effect of a well-known open source project seems hard to give up to me.

@jurajselep
Copy link

We are using compression on both keys and values with the library zstd. This provides the following advantages:

  • Space Efficiency: Compression reduces the amount of storage space required for our data.
  • Improved I/O Efficiency: Data compression improves I/O efficiency by reducing the amount of data that needs to be read from or written to disk. Since disk I/O tends to be one of the slowest operations in a database system, anything that reduces I/O can have a significant impact on performance.
  • Cache Utilization: By compressing data, more of it can fit into the database’s cache. This increases cache hit rates, meaning that the database can often serve data from fast memory, leading to better performance.
  • Reduced Latency: With smaller data sizes due to compression, the time taken for disk reads and writes is lowered.

The compression makes the biggest difference on the transition frontier:

  • With RocksDB, the frontier takes 245 MB on disk.
  • With our Rust implementation, takes 140 MB.

We can expand the documentation and clarify how we tested this implementation. We believe the quality is comparable, possibly slightly better, because it is less complex than RocksDB. However, from the Lindy effect perspective, we can't compete with RocksDB.

The key benefit of this PR is that it allows us to refine the ledger's implementation as we move forward with replacing the ledger's upper layers with the Rust version.

At the moment, we need to ensure full compatibility with the ledger's existing OCaml implementation, which limits our ability to make substantial improvements.

We do have the option to bypass this PR and draft a new one for a complete ledger replacement (including the transaction application logic). However, that would be a much more complex change to review, and it would also significantly increase the time required for completion.

@tizoc
Copy link
Member

tizoc commented Jun 12, 2023

The key benefit of this PR is that it allows us to refine the ledger's implementation as we move forward with replacing the ledger's upper layers with the Rust version.

One important detail that I think is not clear is that while this PR (in its current form) replaces RocksDB, that is not required, and the Mina node can keep working as usual with a RocksDB storage.

What is important is to have an easy way to switch between different implementations (probably through dune's virtual libraries feature) that provide the same interface as the current RocksDB backend. So the default will still be RocksDB, with the option to switch the backend to something else over which we have more control so that the (Rust) Ledger implementation can take advantage of it, as Juraj mentioned above.

@rbonichon
Copy link
Member

@jurajselep @sebastiencs What's the status of this PR? Nothing has happened for 9 months. Can we close it?

@rbonichon rbonichon self-assigned this Mar 5, 2024
@tizoc
Copy link
Member

tizoc commented Mar 5, 2024

Hi @rbonichon, at the moment we are not planning on supporting this database, so yes, lets close this issue.

@rbonichon rbonichon closed this Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants