Skip to content

Latest commit

 

History

History
92 lines (71 loc) · 8.23 KB

README.md

File metadata and controls

92 lines (71 loc) · 8.23 KB

Namespaced Merkle Tree Wrapper

Abstract

In Celestia, block transactions are grouped into identical-size shares and arranged in a k by k matrix called original data square. According to the Celestia consensus rules, k must be a power of 2, and the share size is determined by the SHARE_SIZE parameter. The rows and columns of the original data square are then extended using a 2D Reed-Solomon coding scheme. The extended version is called extended data square, that is a 2k by 2k square consisting of 4 quadrants namely, Q0, Q1, Q2, and Q3. Figure 1 provides an illustration of a sample data square and its extended version. Q0 corresponds to the original data square. Q1 and Q2 represent the horizontal and vertical extensions of Q0, respectively. Q3 is the horizontal extension of Q2 or alternatively, it can be considered as the vertical extension of Q1. Additional information about the extension logic can be found in the specifications of the 2D Reed-Solomon encoding scheme. Figure 1. Extended Data Square.

Figure 1. r and c stand for row and column, respectively.

Each row and column of the extended data square is modeled by a Namespace Merkle Tree. NMTs require the data items they represent to be namespaced, which means that the shares within each row or column of the extended data square must be namespaced before being added to the NMT. This is where the Namespaced Merkle Tree Wrapper, more specifically, ErasuredNamespacedMerkleTree comes into play. It is a data structure that wraps around the Namespaced Merkle Tree and ensures the proper namespaces are prepended to the shares before they are added to their respective row or column NMT. In this specification, we elaborate on the design and structure of the Namespace Merkle Tree wrapper.

Namespaced Merkle Tree Wrapper Structure

Namespaced Merkle Tree Wrapper is used to represent a row or column of Celestia extended data square. At its core, it is an NMT and is defined by a similar set of parameters. Namely, the namespace ID size NamespaceIDSize, the underlying hash function for the digest calculation of the namespaced hashes, and the IgnoreMaxNamespace flag which dictates how namespace range of non-leaf nodes are calculated. In addition, the NMT wrapper is configured with the original data square size SqaureSize (k in the above example), and the index of the row or column it represents AxisIndex which is a value in [0, 2*SquareSize). These additional configurations are used to determine the namespace ID of the shares that the NMT wrapper represents based on the quadrants to which they belong.

NMT wrapper supports Merkle inclusion proof for the given share index and Merkle range proof for a range of share indices. It extends the NMT data insertion behaviour (i.e., the Push method) to prepend shares with proper namespace before inclusion in the tree.

Namespace ID assignment to the shares of an extended data square

To understand the NMT wrapper data structure, it's important to first understand how shares are namespaced in Celestia. The namespace ID of a share depends on the quadrant it belongs to.

Shares in the original data square Q0 already carry their namespace IDs, which are located in the first NamespaceIDSize bytes of the share. Thus, the namespace ID of a share in Q0 can be extracted from the first NamespaceIDSize bytes of the share.

However, shares in Q1, Q2, and Q3 (that are erasure-coded versions of Q0 and known as parity shares) do not have any namespace IDs by default. These shares must be assigned a reserved namespace ID, which is called ParitySharesNamespaceID. The ParitySharesNamespaceID corresponds to the lexicographically last namespace ID that can be represented by NamespaceIDSize bytes. If NamespaceIDSize is 8, then the value of ParitySharesNamespaceID is 2^64-1, which is equivalent to 0xFFFFFFFFFFFFFFFF. In Celestia, the values for NamespaceIDSize and ParitySharesNamespaceID can be found in NAMESPACE_SIZE constant and the PARITY_SHARE_NAMESPACE, respectively.

NMT Wrapper Data Insertion

The NMT wrapper insertion logic is governed by the same rules as the NMT insertion logic. However, it takes care of namespace ID assignment to the shares before inserting them into the tree. During the insertion, the wrapper first checks whether the share is within the original data square or not. A share with the index ShareIndex where ShareIndex is a value between [0, SquareSize), belongs to the original data square if and only if both of the following conditions are satisfied:

ShareIndex < SquareSize && AxisIndex < SquareSize

Otherwise, the share is in Q1, Q2, or Q3.

If the added item falls in the original data square, it's first NamespaceIDSize bytes are treated as its namespace ID (as explained in Namespace ID assignment to the shares of an extended data square, and the share is further prepended with the same namespace ID before getting pushed into the tree. If the added share is not within Q0 i.e, it is a parity share, then it is prepended with the ParitySharesNamespaceID before getting pushed into the tree.

Some insightful observations can be made from the NMT wrapper description:

  • In the NMT wrapper, the shares are extended in size by NamespaceIDSize bytes before insertion to the tree.
  • For every row and column that overlaps with Q0, it is the case that the shares in the first half of the tree leaves belong to Q0, whereas the second half of the leaves are the erasure coded version of the first half. This means, the second half of the tree leaves all have identical namespace IDs, i.e., ParitySharesNamespaceID.
  • Each leaf in the NMT wrapper that corresponds to shares in Q0 has a doubly namespaced structure. Specifically, the underlying data of the leaf contains the namespace ID of the share twice. One namespace ID is located in the first NamespaceIDSize bytes, while the other is located in the second NamespaceIDSize bytes.

References