feat: account tree with history #1292

Merged: drahnr merged 58 commits into next from bernhard-1227-account-tree-with-history on Nov 4, 2025

@drahnr drahnr commented Oct 14, 2025

An alternative to SmtWithHistory, but uses a similar pattern.

Depends on #2002 in miden-base (merged) and #596 in miden-crypto (closed)

See #1227


The idea here in contrast to SmtWithHistory is not to change the underpinnings of AccountTree but wrap it.

The advantage is a minimal API surface to implement: we only ever need AccountTree'::open_at(block, account), which is significantly smaller to implement than representing a full Smt at a past BlockNumber.

Impl approach:

Use the latest opening, and override elements in it using the reversion mutation sets (i.e. the inverse of the mutation applied when a block is applied and the block number increments). The reversion mutation sets can then be traversed for inner-node and leaf updates relevant to the requested leaf, which updates the Merkle path and hence the opening.
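
The approach above can be illustrated with a toy model. This is a minimal sketch, not the PR's implementation: it assumes a dense depth-3 binary tree over `u64` values, uses `DefaultHasher` as a stand-in for the real hash, and all names (`ToyTreeWithHistory`, `apply_block`, `node_at`, `verify`) are hypothetical. Each applied block records the node values it overwrote, and a historical opening reads every node from the oldest applicable overlay before falling back to the latest tree.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};

type Node = (u8, u64); // (depth, index); depth 0 is the root
const DEPTH: u8 = 3; // toy tree with 8 leaves

/// Stand-in for the real two-to-one hash (assumption: any deterministic hash works here).
fn merge(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

struct ToyTreeWithHistory {
    block: u32,
    latest: HashMap<Node, u64>,
    /// Reversion set per block: the node values as they were at that block.
    overlays: BTreeMap<u32, HashMap<Node, u64>>,
}

impl ToyTreeWithHistory {
    fn new() -> Self {
        let mut latest = HashMap::new();
        for i in 0..(1u64 << DEPTH) {
            latest.insert((DEPTH, i), 0u64);
        }
        for d in (0..DEPTH).rev() {
            for i in 0..(1u64 << d) {
                let v = merge(latest[&(d + 1, 2 * i)], latest[&(d + 1, 2 * i + 1)]);
                latest.insert((d, i), v);
            }
        }
        Self { block: 0, latest, overlays: BTreeMap::new() }
    }

    /// Applies a single leaf update as the next block and records the overwritten values.
    fn apply_block(&mut self, leaf: u64, value: u64) {
        let mut reversion = HashMap::new();
        let (mut d, mut i, mut v) = (DEPTH, leaf, value);
        loop {
            let old = self.latest.insert((d, i), v).expect("dense tree: node exists");
            reversion.insert((d, i), old);
            if d == 0 {
                break;
            }
            let sibling = self.latest[&(d, i ^ 1)];
            v = if i & 1 == 0 { merge(v, sibling) } else { merge(sibling, v) };
            d -= 1;
            i >>= 1;
        }
        self.overlays.insert(self.block, reversion);
        self.block += 1;
    }

    /// Value of `node` at `block`: the oldest overlay at or after `block` wins, else latest.
    fn node_at(&self, block: u32, node: Node) -> u64 {
        self.overlays
            .range(block..)
            .find_map(|(_, rev)| rev.get(&node).copied())
            .unwrap_or(self.latest[&node])
    }

    fn root_at(&self, block: u32) -> u64 {
        self.node_at(block, (0, 0))
    }

    /// Leaf value and sibling path (leaf level first) as of `block`.
    fn open_at(&self, block: u32, leaf: u64) -> (u64, Vec<u64>) {
        let path: Vec<u64> = (1..=DEPTH)
            .rev()
            .map(|d| self.node_at(block, (d, (leaf >> (DEPTH - d)) ^ 1)))
            .collect();
        (self.node_at(block, (DEPTH, leaf)), path)
    }
}

/// Recomputes the root from an opening and compares it to `root`.
fn verify(root: u64, leaf: u64, value: u64, path: &[u64]) -> bool {
    let (mut i, mut v) = (leaf, value);
    for &sibling in path {
        v = if i & 1 == 0 { merge(v, sibling) } else { merge(sibling, v) };
        i >>= 1;
    }
    v == root
}
```

In this sketch the cost of `open_at` grows with the number of overlays scanned per node, which matches the roughly linear per-block cost reported in the benchmarks later in the thread.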


Caveats:


Review approach:

The bulk of the changes is in accounts/mod.rs, with some additional database changes in store/*. Ignore the fact that we still return the account data of the latest block rather than the historically correct data; the remaining changes would alter the schema somewhat and bloat the PR further, and it is already large enough.

@drahnr drahnr force-pushed the bernhard-1227-account-tree-with-history branch 3 times, most recently from 3cb3f29 to 4fe22bf Compare October 14, 2025 22:45
@drahnr drahnr requested a review from bobbinth October 15, 2025 10:51
@drahnr drahnr marked this pull request as ready for review October 16, 2025 12:06
@drahnr drahnr force-pushed the bernhard-1227-account-tree-with-history branch from d8a1870 to 19b7365 Compare October 16, 2025 14:45
@drahnr drahnr marked this pull request as draft October 17, 2025 18:03

drahnr commented Oct 17, 2025

Access Ops ( against Smt @ 10 blocks depth):
Vanilla: 804 ns
Historical (d=0): 800 ns
Historical (d=5): 26.6 µs
Historical (d=10): 53.8 µs

500 Accounts:

| Depth | Time | Per-block history |
|---|---|---|
| 0 | 1.3267 µs | - |
| 5 | 51.674 µs | 10.07 µs/block |
| 10 | 109.32 µs | 10.80 µs/block |
| 20 | 213.89 µs | 10.63 µs/block |
| 32 | 349.70 µs | 10.89 µs/block |

@bobbinth bobbinth left a comment

Looks good! Thank you! Not a full review, but I left some comments inline.

#[derive(Debug, Clone)]
struct HistoricalOverlay {
    block_number: BlockNumber,
    rev_set: AccountMutationSet,

Contributor

We could probably split AccountMutationSet into individual fields. It could look something like:

struct HistoricalOverlay {
    block_number: BlockNumber,
    root: Word,
    node_mutations: HashMap<NodeIndex, Word>,
    account_updates: HashMap<Word, Word>,
}

Using hash maps should be more efficient for lookups, but may take more time to construct.

Contributor Author

If we'd use the hashmaps feature of miden-crypto that'd be the case. Do we expect that to be enabled by default?

Contributor

I think we can enable the hashmaps feature by default here, but I think this is separate from that. That is, we could use a hashmap (maybe from hashbrown) here without enabling the hashmaps feature. This would still make node lookups faster.

But again, we can probably do both.
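
As a rough illustration of the hash-map variant discussed in this thread, the sketch below builds the suggested overlay shape once per block and serves point lookups from it. The type aliases and the `NodeIndex` layout are stand-ins (assumptions), not the real miden types, and `std::collections::HashMap` is used in place of hashbrown to keep the example dependency-free.

```rust
use std::collections::HashMap;

// Stand-ins for the real types; their shapes here are assumptions.
type BlockNumber = u32;
type Word = [u64; 4];

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct NodeIndex {
    depth: u8,
    value: u64,
}

#[derive(Debug, Clone)]
struct HistoricalOverlay {
    block_number: BlockNumber,
    root: Word,
    node_mutations: HashMap<NodeIndex, Word>,
    account_updates: HashMap<Word, Word>,
}

impl HistoricalOverlay {
    /// Built once per applied block; the cost of hashing every key is paid here.
    fn new(
        block_number: BlockNumber,
        root: Word,
        nodes: impl IntoIterator<Item = (NodeIndex, Word)>,
        accounts: impl IntoIterator<Item = (Word, Word)>,
    ) -> Self {
        Self {
            block_number,
            root,
            node_mutations: nodes.into_iter().collect(),
            account_updates: accounts.into_iter().collect(),
        }
    }

    /// Point lookup of a reverted inner node, if this overlay touched it.
    fn node(&self, index: NodeIndex) -> Option<Word> {
        self.node_mutations.get(&index).copied()
    }

    /// Point lookup of a reverted account leaf, if this overlay touched it.
    fn account(&self, key: &Word) -> Option<Word> {
        self.account_updates.get(key).copied()
    }
}
```

The trade-off mentioned above shows up directly: construction collects into hash maps up front, while each lookup during an opening is a single hashed probe.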

@bobbinth

> Access Ops ( against Smt @ 10 blocks depth):
> Vanilla: 804 ns
> Historical (d=0): 800 ns
> Historical (d=5): 26.6 µs
> Historical (d=10): 53.8 µs

This is for calling the open() method, right? How do these compare to LargeSmt::open() times?

> 500 Accounts:
>
> | Depth | Time | Per-block history |
> |---|---|---|
> | 0 | 1.3267 µs | - |
> | 5 | 51.674 µs | 10.07 µs/block |
> | 10 | 109.32 µs | 10.80 µs/block |
> | 20 | 213.89 µs | 10.63 µs/block |
> | 32 | 349.70 µs | 10.89 µs/block |

What are these benchmarks measuring?

drahnr commented Oct 20, 2025

> > Access Ops ( against Smt @ 10 blocks depth):
> > Vanilla: 804 ns
> > Historical (d=0): 800 ns
> > Historical (d=5): 26.6 µs
> > Historical (d=10): 53.8 µs
>
> This is for calling the open() method, right? How do these compare to LargeSmt::open() times?
>
> > 500 Accounts:
> >
> > | Depth | Time | Per-block history |
> > |---|---|---|
> > | 0 | 1.3267 µs | - |
> > | 5 | 51.674 µs | 10.07 µs/block |
> > | 10 | 109.32 µs | 10.80 µs/block |
> > | 20 | 213.89 µs | 10.63 µs/block |
> > | 32 | 349.70 µs | 10.89 µs/block |
>
> What are these benchmarks measuring?

These are the measurements per additional overlay (= block) going back in history, with 500 accounts in the store, creating an opening for one of them.
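
The per-block column appears consistent with subtracting the depth-0 time and dividing by the depth. The following is a quick check under that assumption; the formula is inferred from the numbers rather than stated in the thread, and the function names are hypothetical.

```rust
/// (history depth, measured time in µs) for 500 accounts, copied from the table above.
const MEASURED_US: [(u32, f64); 5] = [
    (0, 1.3267),
    (5, 51.674),
    (10, 109.32),
    (20, 213.89),
    (32, 349.70),
];

/// Cost per additional overlay, assuming (total - depth-0 baseline) / depth.
fn per_block_us(baseline_us: f64, depth: u32, total_us: f64) -> Option<f64> {
    if depth == 0 {
        return None; // no history traversed, so no per-block figure
    }
    Some((total_us - baseline_us) / f64::from(depth))
}

/// Recomputes the "Per-block history" column for every measured depth.
fn per_block_column() -> Vec<(u32, Option<f64>)> {
    let baseline = MEASURED_US[0].1;
    MEASURED_US
        .iter()
        .map(|&(depth, total)| (depth, per_block_us(baseline, depth, total)))
        .collect()
}
```

Under this reading, each additional block of history costs a little over 10 µs regardless of depth, i.e. the access time is linear in the number of overlays traversed.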

After 0xMiden/protocol#2006 lands I'll migrate to LargeSmt and run the benches against that (cargo bench --bench account_tree_historical --package miden-node-store)

@drahnr drahnr force-pushed the bernhard-1227-account-tree-with-history branch from eeb3c7a to 2d0081e Compare October 21, 2025 19:10

drahnr commented Oct 21, 2025

Benchmarks (excerpt, @ 2d0081ebd0f70eb05a0b50dba866d6a4272b6f74) with MemoryBackend:

For 500 accounts present in the account tree, we get the following performance across hist_offset (= offset into the past from the latest AccountTree):

account_tree_vanilla_access/vanilla/500
                        time:   [1.5114 µs 1.5127 µs 1.5153 µs]
                        change: [-0.5130% +0.1342% +0.7152%] (p = 0.69 > 0.05)
account_tree_historical_access/hist_offset_5/500
                        time:   [51.044 µs 51.095 µs 51.235 µs]
account_tree_historical_access/hist_offset_10/500
                        time:   [105.24 µs 105.35 µs 105.59 µs]
account_tree_historical_access/hist_offset_20/500
                        time:   [213.25 µs 213.54 µs 214.16 µs]
account_tree_historical_access/hist_offset_32/500
                        time:   [357.76 µs 358.91 µs 360.01 µs]

(hold the line for more ..)

Benchmarks (excerpt, @ 5d12ed7061a74931cf2b164fe35f6abbd419a27b) with RocksDbBackend:

account_tree_vanilla_access/vanilla/500
                        time:   [5.1918 µs 5.1956 µs 5.2000 µs]

account_tree_historical_access/depth_0/500
                        time:   [5.3584 µs 5.3609 µs 5.3636 µs]
account_tree_historical_access/depth_5/500
                        time:   [56.502 µs 56.565 µs 56.611 µs]
account_tree_historical_access/depth_10/500
                        time:   [109.78 µs 109.87 µs 110.04 µs]
account_tree_historical_access/depth_20/500
                        time:   [218.17 µs 218.47 µs 218.87 µs]
account_tree_historical_access/depth_32/500

@drahnr drahnr requested a review from sergerad October 21, 2025 21:09
Reduces noise significantly on testnet, and we don't have many practical
error cases _yet_. In the future we should increment it again.
@drahnr drahnr self-assigned this Oct 24, 2025
@drahnr drahnr marked this pull request as ready for review October 24, 2025 11:53
@bobbinth bobbinth left a comment

Looks good! Thank you. Not a full review, but I left a few comments inline.

Also, I tried to make this branch work with the latest next from miden-base - but it seems like there are some inconsistencies and code duplication (e.g., AccountTreeBackend).

Regarding performance (based on the numbers in #1292) - it seems like we are taking quite a significant performance hit. Even comparing it with bigger trees (based on 0xMiden/crypto#438 (comment)) it seems like performance degradation for lookups is close to 10x. Maybe using hash maps instead of BTreeMap in HistoricalOverlay will fix this - but if not, we'll need to try to figure out how to fix this in a follow up.

The mental target I have is that lookup for a tree with 100M accounts at depth 20 should be under 50 µs (currently, with a non-historical tree, I believe this is about 20 µs). But again, beyond using hashmaps instead of BTreeMaps - this is not something to solve in this PR.

Another aspect of performance is how the update times change. Given that the extra work is basically just cloning a mutation set, I think it should be negligible - but would be good to confirm.

Comment on lines 21 to 22
/// Trait abstracting operations over different account tree backends.
pub trait AccountTreeBackend {

Contributor

How is this different from the AccountTreeBackend that we have in miden-base? Do we need both?

Contributor Author

We do; they're different. For better disambiguation I renamed the one here to trait AccountTreeStorage. We need the miden-base trait AccountTreeBackend to parameterize the AccountTree over storage backends, aka Smt implementations.
Here we parameterize the AccountTreeWithHistory over implementations of a minimal set of primitives required for it to work, i.e. AccountTree<S: AccountTreeBackend>.

The names are far from optimal; my current approach is to rename miden-base::AccountTreeBackend ~~~> AccountTreeStorage and miden-node::AccountTreeStorage ~~~> LatestAccountTreeImpl. I don't really like these names either; open to suggestions.

Comment on lines 457 to 458
/// Returns the oldest block still in history.
pub fn oldest_block_num(&self) -> BlockNumber {

Contributor

AFAICT, this method does not mutate the state - so, it should probably not be in the "public mutators" section.

Similar comment for contains_account_id_prefix() method below.

Contributor Author

We do use it, it wasn't shown as used before since I wanted to hold off on using AccountTreeWithHistory with the remaining integration points.

crates/store/src/state.rs
870-        let new_account_id_prefix_is_unique = if account_commitment.is_empty() {
871:            Some(!inner.account_tree.contains_account_id_prefix(account_id.prefix()))
872-        } else {
873-            None
874-        };

drahnr commented Oct 31, 2025

> Looks good! Thank you! I left a few small comments inline, but I think AccountTreeWithHistory is in a really good shape.
>
> I haven't really gotten to reviewing the select_historical_account_at part yet, but upon a brief look, I think we are not really using it in this PR.

That is correct and intentional, since we talked about postponing the required schema change to a follow up PR (the second piece of disabling is https://github.com/0xMiden/miden-node/blob/8c1b34e42de3ec29a71d8b7891b36fcdd2c8a178/crates/store/src/state.rs#L944 )

> So, I'm wondering if it may make sense to extract these changes into a follow-up PR. Basically, this PR would have all the structure necessary for returning historical account witnesses, but we won't be using it yet. Once we introduce the ability to retrieve historical account state, we'll turn on the full functionality - but that would be in a different PR (and could even be a non-breaking change if we keep the gRPC interfaces as they are in this PR).

So the intent here is to keep public API as is and de-risk planned demos? We indeed could do this and combine it with the schema change.

tl;dr I'll carve out that part from the PR and do follow-up for those pieces combined with the schema change

@bobbinth

> So the intent here is to keep public API as is and de-risk planned demos? We indeed could do this and combine it with the schema change.

Yeah, basically keep the schema the same, but error out if the endpoint is invoked for a historical block.
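
A minimal sketch of that interim behavior, assuming the endpoint resolves an optional block number against the chain tip. The error type and function name are hypothetical, and accepting a request that explicitly names the chain tip is an assumption; a stricter variant would reject any provided value.

```rust
/// Hypothetical error type; a real endpoint would map this to a gRPC status.
#[derive(Debug, PartialEq, Eq)]
enum AccountProofError {
    HistoricalBlockUnsupported { requested: u32, chain_tip: u32 },
}

/// Resolves the optional `block_num` of a request against the current chain tip.
/// Historical blocks are rejected until historical account state can be served.
fn resolve_block_num(block_num: Option<u32>, chain_tip: u32) -> Result<u32, AccountProofError> {
    match block_num {
        // No block given: serve the latest state.
        None => Ok(chain_tip),
        // Assumption: explicitly asking for the tip is equivalent to omitting the field.
        Some(n) if n == chain_tip => Ok(n),
        // Anything else is treated as a historical request and refused for now.
        Some(requested) => Err(AccountProofError::HistoricalBlockUnsupported { requested, chain_tip }),
    }
}
```

Keeping the field in the interface while rejecting it at runtime is what would let the later change be non-breaking at the gRPC level.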

@bobbinth bobbinth left a comment

Looks good! Thank you! I left a few more comments inline. The main one is about making block_num an optional parameter for the endpoint and, for now, returning an error if it is provided.

Also, would be great to see the latest performance numbers, and I'd love to get one more review either from @sergerad or from @igamigo.

Comment on lines +186 to +196
pub struct AccountTreeWithHistory<S>
where
    S: AccountTreeStorage,
{
    /// The current block number (latest state).
    block_number: BlockNumber,
    /// The latest account tree state.
    latest: S,
    /// Historical overlays indexed by block number, storing reversion data.
    overlays: BTreeMap<BlockNumber, HistoricalOverlay>,
}

Contributor

Maybe not for this PR, but I think it should be possible to define AccountTreeWithHistory as:

pub struct AccountTreeWithHistory<S: AccountTreeBackend> {
    block_number: BlockNumber,
    latest: AccountTree<S>,
    overlays: BTreeMap<BlockNumber, HistoricalOverlay>,
}

Because we probably won't have any type besides AccountTree to use here for latest.

And this should allow us to get rid of AccountTreeStorage trait and associated code.

@bobbinth bobbinth requested a review from igamigo November 2, 2025 22:52
@drahnr drahnr force-pushed the bernhard-1227-account-tree-with-history branch from 4dcd48e to 8ac3183 Compare November 3, 2025 09:43
@drahnr drahnr force-pushed the bernhard-1227-account-tree-with-history branch from 8ac3183 to 503ac11 Compare November 3, 2025 09:44

drahnr commented Nov 3, 2025

InMemory Benchmark Overview / LargeSmt<Memory>

[Access] Account Tree Vanilla as baseline

| Size | Time |
|---|---|
| 1 | 1.31 µs |
| 10 | 1.35 µs |
| 50 | 1.42 µs |
| 100 | 1.50 µs |
| 500 | 1.51 µs |
| 1000 | 1.48 µs |

[Access] Account Tree Historical vs Vanilla (n Accounts = 500)

| Depth | Historical | Vanilla |
|---|---|---|
| 0 | 1.61 µs | 1.51 µs |
| 5 | 8.20 µs | 1.51 µs |
| 10 | 13.95 µs | 1.51 µs |
| 20 | 25.18 µs | 1.51 µs |
| 32 | 39.75 µs | 1.51 µs |

Summary: Historical access adds ballpark 1µs per depth level


Account Tree Historical Access by Depth and Size

| Depth | Size 10 | Size 100 | Size 500 | Size 2500 |
|---|---|---|---|---|
| 0 | 1.40 µs | 1.47 µs | 1.61 µs | 1.55 µs |
| 5 | 7.69 µs | 7.91 µs | 8.20 µs | 8.30 µs |
| 10 | 13.73 µs | 13.49 µs | 13.95 µs | 14.56 µs |
| 20 | 25.51 µs | 24.36 µs | 25.18 µs | 26.88 µs |
| 32 | 40.10 µs | 37.80 µs | 39.75 µs | 40.92 µs |

Summary: Historical access time dominated by depth, size has minimal impact

bobbinth commented Nov 3, 2025

Very nice! Thank you for running these! Comparing #1292 (comment) with #1292 (comment) it seems like whatever we did recently resulted in almost a 10x improvement - e.g., accessing a path at depth 20 now takes ~25 µs vs. ~210 µs before.

Is this how you interpret this as well?

drahnr commented Nov 3, 2025

> Very nice! Thank you for running these! Comparing #1292 (comment) with #1292 (comment) it seems like whatever we did recently resulted in almost a 10x improvement - e.g., accessing a path at depth 20 now takes ~25 µs vs. ~210 µs before.
>
> Is this how you interpret this as well?

My interpretation is the same.
I think removing the find was significant, combined with BTreeMap -> HashMap conversion.
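
To put a number on the improvement, the ratio can be computed per depth. This sketch assumes the two comments being compared are the Oct 21 MemoryBackend run and the Nov 3 LargeSmt<Memory> run at 500 accounts (the comment links did not survive, so that pairing is an assumption); the constant and function names are hypothetical.

```rust
/// (history depth, before in µs, after in µs) at 500 accounts.
/// "Before" is taken from the Oct 21 MemoryBackend run, "after" from the Nov 3 run.
const RUNS_US: [(u32, f64, f64); 4] = [
    (5, 51.095, 8.20),
    (10, 105.35, 13.95),
    (20, 213.54, 25.18),
    (32, 358.91, 39.75),
];

/// Speedup factor (before / after) for each measured depth.
fn speedups() -> Vec<(u32, f64)> {
    RUNS_US
        .iter()
        .map(|&(depth, before, after)| (depth, before / after))
        .collect()
}
```

Under that pairing the ratio grows with depth, from roughly 6x at depth 5 to roughly 9x at depth 32, with depth 20 at about 8.5x.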

@drahnr drahnr requested a review from sergerad November 3, 2025 21:08
pub block_num: u32,
/// Block at which we'd like to get this data. If present, must be close to the chain tip.
#[prost(message, optional, tag = "2")]
pub block_num: ::core::option::Option<super::blockchain::BlockNumber>,

Collaborator

Do we have a corresponding PR in client live or coming up?

Contributor Author

Not yet. CC @igamigo: do you want to do it? Otherwise I'll do it today/tomorrow.

@bobbinth bobbinth left a comment

All looks good! Thank you!

@igamigo igamigo left a comment

Looks great! Not the most in-depth review but the critical functionality looks good and so do tests

@drahnr drahnr merged commit 782db94 into next Nov 4, 2025
6 checks passed
@drahnr drahnr deleted the bernhard-1227-account-tree-with-history branch November 4, 2025 09:44