We currently maintain a "leaf_set" for both the output and rangeproof MMR data structures on disk.
This is stored in the `pmmr_leaf.bin` file.
This file is a serialized roaring bitmap and the entire file is rewritten for every new block added to the chain.
This file fits poorly in the overall MMR data structure on disk. It does not materially affect the MMR itself.
It is simply an index of the leaf positions that make up the UTXO set (the leaves that represent unspent outputs).
For efficiency we also maintain the "leaf_set" in memory.
This "cache" is initialized on node startup based on the leaf_set on disk.
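As a rough illustration of the layout described above, the sketch below models the leaf_set as a plain Python set of leaf positions, rewrites the whole file on every flush, and rebuilds the in-memory "cache" from that file on startup. The `LeafSet` class, the JSON encoding and the positions used are assumptions made for illustration only; the real file is a serialized roaring bitmap.

```python
import json
import os
import tempfile


class LeafSet:
    """Hypothetical stand-in for the leaf_set: positions of unspent leaves."""

    def __init__(self, path):
        self.path = path
        self.positions = set()  # the in-memory "cache"
        if os.path.exists(path):
            # Node startup: initialize the cache from the copy on disk.
            with open(path) as f:
                self.positions = set(json.load(f))

    def add(self, pos):
        self.positions.add(pos)

    def remove(self, pos):
        self.positions.discard(pos)

    def flush(self):
        # The entire file is rewritten, however small the change was.
        with open(self.path, "w") as f:
            json.dump(sorted(self.positions), f)


path = os.path.join(tempfile.mkdtemp(), "pmmr_leaf.bin")
leaf_set = LeafSet(path)
for pos in (1, 2, 4):  # new outputs appended as MMR leaves
    leaf_set.add(pos)
leaf_set.remove(2)  # the output at position 2 was spent
leaf_set.flush()  # once per block

restarted = LeafSet(path)  # simulates a node restart
```

If `flush()` is skipped or fails, the restarted node sees a stale set, which is the failure mode this issue is concerned with.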
Note we also maintain an `output_pos` index in the database.
This index allows us to look up MMR positions by output commitment.
This allows us to quickly look up the MMR position of an output when a transaction attempts to spend it.
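The spend path this describes might look roughly like the following. The dictionary stands in for the output_pos index and the set for the leaf_set; the commitment strings and the `spend` helper are made up for illustration, and dropping the index entry on spend is an assumption of this sketch.

```python
# Hypothetical stand-ins: output commitment -> MMR position, and unspent positions.
output_pos = {"commit_a": 1, "commit_b": 2, "commit_c": 4}
leaf_set = {1, 2, 4}


def spend(commitment):
    """Mark the output as spent and return the MMR position it occupied."""
    pos = output_pos.get(commitment)
    if pos is None or pos not in leaf_set:
        raise KeyError(f"{commitment} is not an unspent output")
    leaf_set.remove(pos)
    # Assumption: the index entry is dropped once the output is spent.
    del output_pos[commitment]
    return pos


spent_at = spend("commit_b")
```

Note that a single spend touches two structures here, and nothing forces the two updates to succeed or fail together.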
So the "leaf_set" is effectively maintained in 3 separate places -
on disk: `pmmr_leaf.bin`
in memory: leaf_set "cache"
in the db: `output_pos` index
If these get out of sync for any reason we can find ourselves in a "corrupted data" situation very easily.
For example, we may be in the process of writing the MMR files to disk and fail to write `pmmr_leaf.bin`.
As the leaf_set is simply an index into the actual MMR structure it would make more sense for this to be maintained in the db. We could still cache it in memory for performance, but the "source of truth" would be the db index.
We would not actually need to store this data locally on disk.
Note: We do still need this file during IBD, for both receiving and providing `txhashet.zip`. But this can be done "on demand".
If we store this data in the db we can take full advantage of the transactional semantics. If we accept a new block and update the head of the chain we can also update the "leaf_set" in the same db transaction. We would no longer risk being in a state where the leaf_set was not updated correctly and was not aligned with the current chain head.
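To make the transactional argument concrete, here is a minimal sketch that uses Python's built-in `sqlite3` purely as a stand-in for the node's db. The table layout and the `apply_block` and `snapshot` helpers are hypothetical and are not taken from the PR; the point is only that the head and the leaf_set either both change or neither does.

```python
import sqlite3

# sqlite3 is only a stand-in for the node's db in this sketch.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE head (id INTEGER PRIMARY KEY CHECK (id = 1), height INTEGER NOT NULL)"
)
db.execute("CREATE TABLE leaf_set (pos INTEGER PRIMARY KEY)")
db.execute("INSERT INTO head VALUES (1, 0)")
db.commit()


def apply_block(db, height, new_leaves, spent_leaves):
    """Update the chain head and the leaf_set in a single db transaction."""
    with db:  # commits on success, rolls back if an exception escapes
        db.execute("UPDATE head SET height = ? WHERE id = 1", (height,))
        db.executemany("INSERT INTO leaf_set VALUES (?)", [(p,) for p in new_leaves])
        for pos in spent_leaves:
            cur = db.execute("DELETE FROM leaf_set WHERE pos = ?", (pos,))
            if cur.rowcount != 1:
                raise ValueError(f"position {pos} is not in the leaf_set")


def snapshot(db):
    """Return (head height, sorted leaf positions) as currently committed."""
    height = db.execute("SELECT height FROM head").fetchone()[0]
    leaves = [row[0] for row in db.execute("SELECT pos FROM leaf_set ORDER BY pos")]
    return height, leaves


apply_block(db, 1, new_leaves=[1, 2], spent_leaves=[])
try:
    # Spending position 9 fails, so the head update is rolled back as well.
    apply_block(db, 2, new_leaves=[4], spent_leaves=[9])
except ValueError:
    pass
state_after_failure = snapshot(db)

apply_block(db, 2, new_leaves=[4], spent_leaves=[2])
state_after_success = snapshot(db)
```

After the failed block the head is still at the previous height and the leaf_set is unchanged, so there is no window in which the two disagree.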
PR experimenting with this approach is here - #3428