Add Python deserialisation for Merkle tree mini-trees #7676

Merged
eddyashton merged 16 commits into main from copilot/parse-public-ccf-internal-tree
Feb 19, 2026

Conversation

Copilot AI commented Feb 17, 2026

Contributor

Implements Python parsing of serialized Merkle trees from public:ccf.internal.tree to enable isolated chunk validation without full ledger context.

Changes

  • Added MerkleTree.deserialise(buffer, position=0): Parses compact mini-tree format from signature transactions

    • Big-endian: [uint64 num_leaves][uint64 num_flushed][32-byte hashes...][32-byte extra_hashes...]
    • Handles flushed nodes via bitmask iteration (matching C++ merklecpp::deserialise)
    • Lazy reconstruction: stores leaves only, builds upper levels on-demand via get_merkle_root()
    • Uses read_bytes() helper function to safely read buffer segments and advance position
  • Fixed edge case: get_merkle_root() now handles empty trees correctly

  • Added validation test: Integrated into run_read_ledger_on_testdata in tests/e2e_operations.py

    • Maintains an accumulated MerkleTree via add_leaf() for each transaction
    • Compares accumulated tree root with deserialized tree root at signature transactions
    • Validates against real ledger data from tests/testdata/
    • Will be removed once deserialization is integrated into Ledger constructor (future PR)
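The buffer layout described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: `split_mini_tree` is a hypothetical name, and the sketch assumes that exactly one extra hash is stored for each set bit of `num_flushed`, which is how the "bitmask iteration" bullet reads.

```python
import struct

HASH_SIZE = 32  # bytes per hash, per the layout in the description
HEADER = struct.Struct(">QQ")  # big-endian uint64 num_leaves, uint64 num_flushed


def split_mini_tree(buffer: bytes, position: int = 0):
    """Split a serialised mini-tree into (num_flushed, leaf_hashes, extra_hashes).

    Hypothetical helper for illustration only.
    """
    if position < 0 or len(buffer) - position < HEADER.size:
        raise ValueError("buffer too short for mini-tree header")
    num_leaves, num_flushed = HEADER.unpack_from(buffer, position)
    position += HEADER.size

    # Assumption: one extra hash per set bit of num_flushed.
    num_extra = bin(num_flushed).count("1")
    total = num_leaves + num_extra
    if len(buffer) - position < total * HASH_SIZE:
        raise ValueError("buffer too short for the declared number of hashes")

    # Slice the remaining bytes into fixed-size hashes, leaves first.
    hashes = [
        buffer[position + i * HASH_SIZE : position + (i + 1) * HASH_SIZE]
        for i in range(total)
    ]
    return num_flushed, hashes[:num_leaves], hashes[num_leaves:]
```

Checking the declared counts against the remaining buffer length before slicing means a truncated table value fails loudly rather than yielding short hashes.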

Implementation Details

The deserialization correctly reconstructs the Merkle tree structure by:

  1. Reading leaf hashes from the serialized buffer
  2. Processing the num_flushed bitmask to insert extra hashes on the left edge
  3. Building tree levels bottom-up by pairing nodes
  4. Storing only the leaf level (consistent with add_leaf() behavior)
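The steps above can be sketched as a standalone root computation. This is an illustration under stated assumptions rather than the merged implementation: `node_hash` and `rebuild_root` are hypothetical names, a parent is assumed to be SHA-256 of the concatenated children, an unpaired node is assumed to be carried up unchanged, and rejecting an empty tree is a choice made only for this sketch.

```python
from hashlib import sha256


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumption: parent hash = SHA-256(left || right).
    return sha256(left + right).digest()


def rebuild_root(leaves, num_flushed, extra_hashes):
    """Recompute the root from retained leaves plus left-edge extra hashes."""
    level = list(leaves)
    extras = iter(extra_hashes)
    mask = num_flushed
    if not level and mask == 0:
        raise ValueError("empty tree has no root in this sketch")

    while mask != 0 or len(level) > 1:
        # A set low bit means a flushed subtree's hash belongs on the
        # left edge of the current level.
        if mask & 1:
            level.insert(0, next(extras))

        # Pair nodes left to right; an odd node out is carried up as-is.
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(node_hash(level[i], level[i + 1]))
            else:
                next_level.append(level[i])
        level = next_level
        mask >>= 1

    return level[0]
```

For a four-leaf tree `a, b, c, d` with the first two leaves flushed, the call would be `rebuild_root([c, d], 2, [node_hash(a, b)])`, which should match the root computed from all four leaves.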

Files Changed

  • python/src/ccf/merkletree.py: Core implementation of deserialise() method
  • tests/e2e_operations.py: Validation test integrated into existing e2e test infrastructure
Original prompt

This section details the original issue you should resolve

<issue_title>Parse public:ccf.internal.tree in Python</issue_title>
<issue_description>To support the isolated Merkle-root-only validation level in #7502, we need to be able to initialise a ccf.MerkleTree from an intermediate state, not just a blank tree appending leaves. This state is what we call a "minitree", written to the public:ccf.internal.tree table in every signature transaction. It is not necessarily a full tree, as nodes from the left may have been flushed.

The source implementation is in C++ under 3rdparty/internal/merklecpp, from the full implementation (including tests) at https://github.com/microsoft/merklecpp. The only bit we need in Python is an equivalent of deserialise. We should test this against sample data from tests/testdata/.</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 4 commits February 17, 2026 15:08
Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
…e comments

Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add parsing for public:ccf.internal.tree in Python" to "Add Python deserialisation for Merkle tree mini-trees" Feb 17, 2026
Copilot AI requested a review from eddyashton February 17, 2026 15:14
Comment thread python/src/ccf/merkletree.py Outdated
Extract repeated buffer[position:position+N] pattern into read_bytes()
helper function to reduce error-prone code duplication.

Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
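
A helper of the kind this commit describes might look like the sketch below. Only the name `read_bytes()` comes from the PR; the signature, the returned `(chunk, new_position)` tuple, and the bounds check are assumptions made for illustration.

```python
def read_bytes(buffer: bytes, position: int, length: int):
    """Return (chunk, new_position) for buffer[position:position + length].

    Assumed signature; raises instead of silently returning a short slice.
    """
    end = position + length
    if position < 0 or length < 0 or end > len(buffer):
        raise ValueError(
            f"cannot read {length} bytes at offset {position} "
            f"from a buffer of {len(buffer)} bytes"
        )
    return buffer[position:end], end


# Usage: read the two big-endian uint64 header fields, then one 32-byte hash.
example = (5).to_bytes(8, "big") + (2).to_bytes(8, "big") + b"\x01" * 32
raw, pos = read_bytes(example, 0, 8)
num_leaves = int.from_bytes(raw, "big")
raw, pos = read_bytes(example, pos, 8)
num_flushed = int.from_bytes(raw, "big")
first_hash, pos = read_bytes(example, pos, 32)
```

Returning the advanced position alongside the data keeps the offset arithmetic in one place, which is the duplication the commit message calls error-prone.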
Comment thread python/tests/test_merkle_tree.py Outdated
Comment thread python/merkle_tree_example.py Outdated
- Remove pytest test file (project doesn't use pytest)
- Remove example script (not needed)
- Add proper validation test in run_read_ledger_on_testdata
- Test compares accumulated merkle tree (via add_leaf) with deserialized trees
- Uses real ledger data from testdata directories

Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>
Copilot AI requested a review from eddyashton February 17, 2026 16:07
@eddyashton eddyashton marked this pull request as ready for review February 17, 2026 16:14
@eddyashton eddyashton requested a review from a team as a code owner February 17, 2026 16:14
Copilot AI review requested due to automatic review settings February 17, 2026 16:14

Copilot AI left a comment

Pull request overview

This PR implements Python deserialization of Merkle tree mini-trees from signature transactions, enabling validation of isolated ledger chunks without requiring full ledger context.

Changes:

  • Added MerkleTree.deserialise() method to parse compact Merkle tree format from public:ccf.internal.tree table
  • Fixed edge case in get_merkle_root() to handle empty trees correctly
  • Added temporary validation test in run_read_ledger_on_testdata to verify deserialization against real ledger data

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
python/src/ccf/merkletree.py | Core implementation of deserialise() method and empty tree fix
tests/e2e_operations.py | Temporary validation test comparing accumulated vs. deserialized tree roots

Comment thread python/src/ccf/merkletree.py Outdated
Comment thread tests/e2e_operations.py Outdated
@eddyashton eddyashton merged commit d532ab4 into main Feb 19, 2026
17 checks passed
@eddyashton eddyashton deleted the copilot/parse-public-ccf-internal-tree branch February 19, 2026 13:01

Development

Successfully merging this pull request may close these issues.

Parse public:ccf.internal.tree in Python

4 participants