Re-implement merkletree with persistent storage (key-value db) #487

Merged
arnaucube merged 19 commits into main from merkletree-db
Mar 11, 2026
Conversation

@arnaucube arnaucube (Collaborator) commented Mar 6, 2026

Joint work with @ax0 .

This PR re-implements the merkletree so that it can use an on-disk key-value database. This required a reimplementation of the core tree logic: the previous implementation was designed to first build the tree structure in memory and compute the hashes at the end of the operations, which would not be ideal on top of a key-value db.

The database interface is defined by the DB trait, which works with (atomic) transactions. We added a rocksdb implementation of it as a ready-to-use example, alongside a naive in-memory db.
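As a rough illustration of the transactional key-value interface described above, here is a minimal sketch. The names (`Db`, `Tx`, `MemDb`) and signatures are illustrative assumptions, not the actual pod2 API; only the idea of batching writes and applying them atomically comes from the PR description.

```rust
use std::collections::HashMap;

// A batch of writes applied atomically on commit (illustrative).
#[derive(Default)]
struct Tx {
    writes: HashMap<Vec<u8>, Vec<u8>>,
}

impl Tx {
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.writes.insert(key, value);
    }
}

// Hypothetical stand-in for the PR's DB trait.
trait Db {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    // Apply all writes in `tx` atomically.
    fn commit(&mut self, tx: Tx);
}

// Naive in-memory implementation, analogous to the PR's in-memory db.
#[derive(Default)]
struct MemDb {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl Db for MemDb {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
    fn commit(&mut self, tx: Tx) {
        self.map.extend(tx.writes);
    }
}
```

A rocksdb-backed implementation would implement the same trait, mapping `commit` onto an atomic write batch.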

We also extend the tests of the merkletree to cover more edge cases (for future iterations to catch potential issues).

Resolves #435, resolves #439 (the latter is fixed by setting the divergence_level to MAX_DEPTH when new_siblings.len() is zero).
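The #439 fix can be sketched as follows. Only the empty-case rule (pin the level to MAX_DEPTH when `new_siblings` is empty) comes from the PR text; the `else` arm and all types here are invented for illustration.

```rust
// Hypothetical depth bound; the real value depends on the tree config.
const MAX_DEPTH: usize = 32;

// Illustrative sketch of the underflow guard: with no siblings there is
// no divergence point, so the level is pinned to MAX_DEPTH instead of
// being derived from the (empty) path, which could underflow.
fn divergence_level(new_siblings: &[[u8; 32]]) -> usize {
    if new_siblings.is_empty() {
        MAX_DEPTH
    } else {
        // Invented placeholder computation for the non-empty case.
        new_siblings.len() - 1
    }
}
```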

@arnaucube arnaucube requested a review from ed255 March 6, 2026 14:34
@ed255 ed255 (Collaborator) left a comment

Overall looks good to me, but I would like to discuss the DB API.

@dhvanipa (Contributor)

dhvanipa commented Mar 9, 2026

What was the approach picked for garbage collection?

@ed255 (Collaborator)

ed255 commented Mar 10, 2026

What was the approach picked for garbage collection?

In this implementation there's no garbage collection. We defer finding a solution for that (which includes figuring out an API for it) to the future.

@ed255 ed255 (Collaborator) left a comment

LGTM!

@arnaucube arnaucube (Collaborator, Author) commented

What was the approach picked for garbage collection?

To add to @ed255's answer: a straightforward approach (which might not fit all use cases) would be exporting the leaves of the tree under the current root and creating a new tree with them in a new db, getting rid of all the unused data in the process.
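The export-and-rebuild idea above can be sketched like this. The `MerkleTree` here is a stand-in (just a map of leaves) rather than the real pod2 type; only the strategy (copy live leaves into a fresh tree, leave stale nodes behind) comes from the comment.

```rust
use std::collections::BTreeMap;

// Stand-in tree: key path -> leaf value.
#[derive(Default, Debug, PartialEq)]
struct MerkleTree {
    leaves: BTreeMap<u64, Vec<u8>>,
}

impl MerkleTree {
    fn insert(&mut self, key: u64, value: Vec<u8>) {
        self.leaves.insert(key, value);
    }
    // "Export the leaves of the tree under the current root".
    fn export_leaves(&self) -> Vec<(u64, Vec<u8>)> {
        self.leaves.iter().map(|(k, v)| (*k, v.clone())).collect()
    }
}

// Garbage collection by rebuild: only live leaves survive the copy; all
// unreferenced intermediate nodes stay in the old db, which is dropped.
fn compact(old: &MerkleTree) -> MerkleTree {
    let mut fresh = MerkleTree::default();
    for (k, v) in old.export_leaves() {
        fresh.insert(k, v);
    }
    fresh
}
```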

@arnaucube arnaucube merged commit 32f4587 into main Mar 11, 2026
6 checks passed
ed255 added a commit that referenced this pull request Mar 23, 2026
Extend the work of #487 to the Containers (Dictionary, Set, Array).

The merkle tree only stores `RawValue` for both the key and the value, so it is the responsibility of the Container to store the rich value.

In order to handle containers with persistent storage efficiently (meaning that cloning or updating them should not cause an O(n) data copy), I figured we need a database of `Value`s indexed by their raw value, as this gives us deduplication and free cloning of containers.
The issue with this approach is that in the current design we have collisions between `Value`s of different types (#426), while the current API relies on each value having a single type.

To resolve this issue I decided to change the API: instead of assuming that a `Value` has a fixed type, a value may be compatible with multiple types, and the user of the library tries casting the `Value` to a particular type.
For this I deprecated public access to everything related to `TypedValue`, and I propose treating it as an implementation detail and a black box from the external developer's point of view. The `Value` type is now used like this:
- To create a new Value use `Value::from(...)` where you can pass any compatible type (the same types as before)
- To access the Value in typed form you cast it like `value.as_foo()` which returns `Option<Foo>`.

Previously we had a collision between `true` and `1` (and `false` and `0`).  Now it doesn't matter whether a value holds a `true` or a `1`, both should be seen as the same and both return `Some` when doing `as_int` and `as_bool`.
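The int/bool collision behavior described above can be sketched minimally. The internal representation and the exact method names beyond `as_int`/`as_bool` are illustrative assumptions; only the construction-via-`From` and `Option`-returning casts come from the commit message.

```rust
// Stand-in Value covering only the int/bool case.
#[derive(Clone, Debug, PartialEq)]
struct Value(i64);

impl From<i64> for Value {
    fn from(v: i64) -> Self {
        Value(v)
    }
}
impl From<bool> for Value {
    fn from(v: bool) -> Self {
        // `true` and `1` collapse to the same stored value.
        Value(v as i64)
    }
}

impl Value {
    fn as_int(&self) -> Option<i64> {
        Some(self.0)
    }
    fn as_bool(&self) -> Option<bool> {
        // Only 0 and 1 are bool-compatible; other ints are not.
        match self.0 {
            0 => Some(false),
            1 => Some(true),
            _ => None,
        }
    }
}
```

With this shape, `Value::from(true)` and `Value::from(1)` are indistinguishable, and both answer `Some` to `as_int` and `as_bool`.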

Similarly we had collisions with containers. For example `set(0, 1, 2) == array[0, 1, 2]` and `set("a", "b") == dict("a": "a", "b": "b")`. Now any container can be cast to any of `set`, `array`, `dict`. There's a caveat here: each of these types expects a particular encoding of keys, so casting to the wrong type will return errors on some operations.

With this design it no longer matters what is being stored and recovered: the API requires the user to express the expected type, and any type with collisions for particular values can be cast to the right type.

There's only one case where it's not desirable to swap one `TypedValue` for another: the `TypedValue::Raw`. If a non-`RawValue` in the DB is replaced by the corresponding `RawValue`, we erase the information required to recover the rich value. For this reason the implementations of the database treat the `RawValue` as a special case: if a value is stored as a non-`RawValue`, the corresponding `RawValue` can never overwrite it. If a value is stored as a `RawValue`, a matching non-`RawValue` will overwrite it (promoting it to a rich value). This way we never lose data.
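The overwrite rule just described can be sketched as follows. `StoredValue` is an illustrative stand-in (`Raw` holds only the raw commitment, `Rich` the full typed value), not the real pod2 types; the asymmetric overwrite policy is the part taken from the commit message.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum StoredValue {
    Raw([u8; 32]),  // stand-in for TypedValue::Raw: commitment only
    Rich(String),   // stand-in for any non-Raw TypedValue
}

fn store(db: &mut HashMap<u64, StoredValue>, key: u64, new: StoredValue) {
    match (db.get(&key), &new) {
        // Never let a Raw overwrite an existing rich value: that would
        // erase the information needed to recover it.
        (Some(StoredValue::Rich(_)), StoredValue::Raw(_)) => {}
        // Everything else, including promoting Raw -> Rich, is written.
        _ => {
            db.insert(key, new);
        }
    }
}
```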

A consequence of this is that the serialization, `Display` and `Debug` output of a container is not stable: at any point any of the entries can be swapped for a "compatible" one if the container shares storage with other containers that introduce collisions.

I rewrote all containers as wrappers around a generic `Container` which holds a `Map` from `Value` to `Value`. The serialization of each container now uses the single implementation of the generic `Container`.
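The wrapper shape might look roughly like this. The types are stand-ins (`Value` is just a `String` here) and the methods are invented; the grounded parts are the single generic `Container` holding a `Value`-to-`Value` map, and the encoding from the collision example above where a set stores each element as both key and value.

```rust
use std::collections::BTreeMap;

type Value = String; // stand-in for the real Value type

// One generic container shared by Set / Array / Dict wrappers.
#[derive(Default, Debug, PartialEq)]
struct Container {
    entries: BTreeMap<Value, Value>,
}

#[derive(Default, Debug, PartialEq)]
struct Set(Container);

impl Set {
    fn insert(&mut self, v: Value) {
        // An element doubles as its own key, so set("a", "b") and
        // dict("a": "a", "b": "b") share one representation.
        self.0.entries.insert(v.clone(), v);
    }
    fn contains(&self, v: &Value) -> bool {
        self.0.entries.contains_key(v)
    }
}

#[derive(Default, Debug, PartialEq)]
struct Dict(Container);

impl Dict {
    fn set(&mut self, k: Value, v: Value) {
        self.0.entries.insert(k, v);
    }
    fn get(&self, k: &Value) -> Option<&Value> {
        self.0.entries.get(k)
    }
}
```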


Development

Successfully merging this pull request may close these issues.

Underflow in MerkleTree
Persistent MerkleTree storage

4 participants