diff --git a/.vale/config/vocabularies/Custom/accept.txt b/.vale/config/vocabularies/Custom/accept.txt index fb55bfe1..52ce6fb6 100644 --- a/.vale/config/vocabularies/Custom/accept.txt +++ b/.vale/config/vocabularies/Custom/accept.txt @@ -332,6 +332,7 @@ Burnosov callables buildx bundler +C_i cNFT cNFTs callables @@ -916,4 +917,4 @@ Yolo Yung Zellic ZK-proof -Figma +Figma \ No newline at end of file diff --git a/ton/catchain.mdx b/ton/catchain.mdx index 7a76169e..dac260cf 100644 --- a/ton/catchain.mdx +++ b/ton/catchain.mdx @@ -4,27 +4,39 @@ sidebarTitle: "Catchain consensus" description: "Whitepaper by Dr. Nikolai Durov" --- +import { Aside } from '/snippets/aside.jsx'; + **Authors**: Nikolai Durov
**Date**: February 19, 2020
: [Original whitepaper, PDF](/resources/pdfs/catchain.pdf) ## Abstract -The aim of this text is to provide an outline of the Catchain Consensus Protocol, a Byzantine Fault Tolerant (BFT) protocol specifically crafted for block generation and validation in the TON Blockchain [\[3\]](#ref-3). This protocol can potentially be used for purposes other than block generation in a proof-of-stake (PoS) blockchain; however, the current implementation uses some optimizations valid only for this specific problem. +The aim of this text is to provide an outline of the Catchain Consensus Protocol, a Byzantine Fault Tolerant (BFT) protocol specifically crafted for block generation and validation in the TON Blockchain \[[3, 2.1.12](/ton/ton#2-1-12-block-generation-intervals)]. This protocol can potentially be used for purposes other than block generation in a proof-of-stake (PoS) blockchain; however, the current implementation uses some optimizations valid only for this specific problem. + + + +## 1 Overview -The Catchain Consensus protocol builds upon the overlay network construction protocol and the overlay network broadcast protocol of TON Network ([\[3\]](#ref-3)). The Catchain Consensus protocol itself can be decomposed into two separate protocols, one more low-level and general-purpose (the Catchain protocol[¹](#footnote-1)), and the other the high-level Block Consensus Protocol (BCP), which makes use of the Catchain protocol. Higher levels in the TON protocol stack are occupied by the block generation and validation levels; however, all of them are executed essentially locally on one (logical) machine, with the problem of achieving consensus on the newly-generated block delegated to the Catchain protocol level. +The Catchain Consensus protocol builds upon the overlay network construction protocol and the overlay network broadcast protocol of TON Network ([3, 3.3](/ton/ton#3-3-overlay-networks-and-multicasting-messages)). 
The Catchain Consensus protocol itself can be decomposed into two separate protocols, one more low-level and general-purpose (the Catchain protocol[¹](#footnote-1)), and the other the high-level Block Consensus Protocol (BCP), which makes use of the Catchain protocol. Higher levels in the TON protocol stack are occupied by the block generation and validation levels; however, all of them are executed essentially locally on one (logical) machine, with the problem of achieving consensus on the newly-generated block delegated to the Catchain protocol level. Here is an approximate diagram of the protocol stack employed by TON for block generation and distribution, showing the correct place of the Catchain Consensus protocol (or rather its two component protocols): -- **Top-level**: Block generation and block validation software, logically running on a stand-alone logical machine, with all the inputs provided and outputs handled by the lower-level protocols. The job of this software is to either generate a new valid block for a blockchain (a shardchain or the masterchain of the TON Blockchain; cf. [\[3\]](#ref-3) for a discussion of shardchains and the masterchain), or to check the validity of a block generated by somebody else. +- **Top-level**: Block generation and block validation software, logically running on a stand-alone logical machine, with all the inputs provided and outputs handled by the lower-level protocols. The job of this software is to either generate a new valid block for a blockchain (a shardchain or the masterchain of the TON Blockchain; cf. \[[3, 2.6](/ton/ton#2-6-creating-and-validating-new-blocks)] for a discussion of shardchains and the masterchain), or to check the validity of a block generated by somebody else. - **(TON) Block consensus protocol**: Achieves (byzantine fault tolerant) consensus on the block to be accepted as the next one in the current validator group for the masterchain or a shardchain. 
This level makes use of (the abstract interface of) the block generation and validation software, and builds upon the lower-level Catchain protocol. This protocol is explained in more detail in Section [3](#3-block-consensus-protocol). - **Catchain protocol**: Provides secure persistent broadcasts in an overlay network (e.g., the task group of validators for a specific shardchain or the masterchain dedicated to generation, validation, and propagation of new blocks in this shardchain or masterchain), and detects attempts of "cheating" (protocol violation) on the part of some participants. This protocol is explained in more detail in Section [2](#2-catchain-protocol). -- **(TON Network) overlay broadcast protocol**: A simple best-effort broadcast protocol for overlay networks in the TON Network as described in [\[3\]](#ref-3). Simply broadcasts received broadcast messages to all neighbors in the same overlay network that did not receive a copy of these messages before, with minimal effort dedicated to keeping copies of undelivered broadcast messages for a short period of time. +- **(TON Network) overlay broadcast protocol**: A simple best-effort broadcast protocol for overlay networks in the TON Network as described in \[[3, 3.3](/ton/ton#3-3-overlay-networks-and-multicasting-messages)]. Simply broadcasts received broadcast messages to all neighbors in the same overlay network that did not receive a copy of these messages before, with minimal effort dedicated to keeping copies of undelivered broadcast messages for a short period of time. -- **(TON Network) overlay protocol**: Creates overlay networks (cf. [\[3\]](#ref-3)) inside the ADNL protocol network, manages neighbor lists for these overlay networks. Each participant of an overlay network tracks several neighbors in the same overlay network and keeps dedicated ADNL connections (called "channels") to them, so that incoming messages can be efficiently broadcast to all neighbors with minimal overhead. 
+- **(TON Network) overlay protocol**: Creates overlay networks (cf. \[[3, 3.3](/ton/ton#3-3-overlay-networks-and-multicasting-messages)]) inside the ADNL protocol network, manages neighbor lists for these overlay networks. Each participant of an overlay network tracks several neighbors in the same overlay network and keeps dedicated ADNL connections (called "channels") to them, so that incoming messages can be efficiently broadcast to all neighbors with minimal overhead. - **Abstract Datagram Network Layer (ADNL) protocol**: The basic protocol of the TON Network, that delivers packets (datagrams) between network nodes identified only by 256-bit abstract (ADNL) addresses, which effectively are cryptographic keys (or their hashes). @@ -32,7 +44,7 @@ This text aims to describe only the second and the third protocol in this suite, We would like to point out here that the author of this text, while providing the general guidelines of how this protocol should be designed (on the lines of "let's create a BFT-hardened group broadcast message system, and run a suitably adapted simple two-phase or three-phase commit protocol on top of this system") and participating in several discussions during the development and implementation of the protocol, is definitely not the only designer of this protocol and especially of its current implementation. This is the work of several people. -A few words on the efficiency of the combined Catchain Consensus protocol. Firstly, it is a true Byzantine Fault Tolerant (BFT) protocol, in the sense that it eventually achieves consensus on a valid next block of the blockchain even if some participants (validators) exhibit arbitrarily malicious behavior, provided these malicious participants are less than one third of the total number of the validators. It is well-known that achieving BFT consensus is impossible if at least one third of participants are malicious (cf. 
[\[5\]](#ref-5)), so the Catchain Consensus protocol is as good as theoretically possible in this respect. Secondly, when the Catchain Consensus was first implemented (in December 2018) and tested on up to 300 nodes distributed all over the world, it achieved consensus on a new block in 6 seconds for 300 nodes and in 4–5 seconds for 100 nodes (and in 3 seconds for 10 nodes), even if some of these nodes fail to participate or exhibit incorrect behavior.[²](#footnote-2) Since the TON Blockchain task groups are not expected to consist of more than a hundred validators (even if a total of a thousand or ten thousand validators are running, only a hundred of them with the largest stakes will generate new masterchain blocks, and the others will participate only in the creation of new shardchain blocks, each shardchain block generated and validated by 10–30 validators; of course, all numbers given here are configuration parameters (cf. [\[3\]](#ref-3) and [\[4\]](#ref-4)) and can be adjusted later by a consensus vote of validators if necessary), this means that the TON Blockchain is able to generate new blocks once every 4–5 seconds, as originally planned. This promise has been further tested and found out to be fulfilled with the launch of the Test Network of the TON Blockchain a couple of months later (in March 2019). Therefore, we see that the Catchain Consensus protocol is a new member of the ever-growing family of practical BFT protocols (cf. [\[2\]](#ref-2)), even though it is based on slightly different principles. +A few words on the efficiency of the combined Catchain Consensus protocol. Firstly, it is a true Byzantine Fault Tolerant (BFT) protocol, in the sense that it eventually achieves consensus on a valid next block of the blockchain even if some participants (validators) exhibit arbitrarily malicious behavior, provided these malicious participants are less than one third of the total number of the validators. 
It is well-known that achieving BFT consensus is impossible if at least one third of participants are malicious (cf. [5](#ref-5)), so the Catchain Consensus protocol is as good as theoretically possible in this respect. Secondly, when the Catchain Consensus was first implemented (in December 2018) and tested on up to 300 nodes distributed all over the world, it achieved consensus on a new block in 6 seconds for 300 nodes and in 4–5 seconds for 100 nodes (and in 3 seconds for 10 nodes), even if some of these nodes fail to participate or exhibit incorrect behavior.[²](#footnote-2) Since the TON Blockchain task groups are not expected to consist of more than a hundred validators (even if a total of a thousand or ten thousand validators are running, only a hundred of them with the largest stakes will generate new masterchain blocks, and the others will participate only in the creation of new shardchain blocks, each shardchain block generated and validated by 10–30 validators; of course, all numbers given here are configuration parameters (cf. \[[3, 2.1.21](/ton/ton#2-1-21-configurable-parameters)] and \[[4, 1.6](/ton/tblkch#1-6-configurable-parameters-and-smart-contracts)]) and can be adjusted later by a consensus vote of validators if necessary), this means that the TON Blockchain is able to generate new blocks once every 4–5 seconds, as originally planned. This promise has been further tested and found out to be fulfilled with the launch of the Test Network of the TON Blockchain a couple of months later (in March 2019). Therefore, we see that the Catchain Consensus protocol is a new member of the ever-growing family of practical BFT protocols (cf. [2](#ref-2)), even though it is based on slightly different principles. 
## 2 Catchain Protocol @@ -46,7 +58,7 @@ The main prerequisite for running (an instance of) the Catchain protocol is the For the specific task of creating new blocks for one of the blockchains (i.e., the masterchain or one of the active shardchains) of the TON Blockchain, a special task group consisting of several validators is created. The list of members of this task group is used both to create a private overlay network inside ADNL (this means that the only nodes that can join this overlay network are those explicitly listed during its creation) and to run the corresponding instance of the Catchain protocol. -The construction of this list of members is the responsibility of the higher levels of the overall protocol stack (the block creation and validation software) and therefore is not the topic of this text ([\[4\]](#ref-4) would be a more appropriate reference). It is sufficient to know at this point that this list is a deterministic function of the current (most recent) masterchain state (and especially of the current value of the configuration parameters, such as the active list of all validators elected for creating new blocks along with their respective weights). Since the list is computed deterministically, all validators compute the same lists and in particular each validator knows in which task groups (i.e., instances of the Catchain protocol) it participates without any further need for network communication or negotiation.[³](#footnote-3) +The construction of this list of members is the responsibility of the higher levels of the overall protocol stack (the block creation and validation software) and therefore is not the topic of this text (\[[4, 1.2](/ton/tblkch#1-2-principal-components-of-a-block-and-the-blockchain-state)] would be a more appropriate reference). 
It is sufficient to know at this point that this list is a deterministic function of the current (most recent) masterchain state (and especially of the current value of the configuration parameters, such as the active list of all validators elected for creating new blocks along with their respective weights). Since the list is computed deterministically, all validators compute the same lists and in particular each validator knows in which task groups (i.e., instances of the Catchain protocol) it participates without any further need for network communication or negotiation.[³](#footnote-3) #### 2.2.1. Catchains are created in advance @@ -66,7 +78,7 @@ Note that the (ordered) list of nodes participating in a catchain is fixed in th ### 2.4. Messages in a catchain. Catchain as a process group -One perspective is that a catchain is a (distributed) process group consisting of $N$ known and fixed (communicating) processes (or nodes in the preceding terminology), and these processes generate broadcast messages, that are eventually broadcast to all members of the process group. The set of all processes is denoted by $I$; we usually assume that $I = \{1 \ldots N\}$. The broadcasts generated by each process are numbered starting from one, so the $n$-th broadcast of process $i$ will receive sequence number or height $n$; each broadcast should be uniquely determined by the identity or the index $i$ of the originating process and its height $n$, so we can think of the pair $(i, n)$ as the natural identifier of a broadcast message inside a process group.[⁴](#footnote-4) The broadcasts generated by the same process $i$ are expected to be delivered to every other process in exactly the same order they have been created, i.e., in increasing order of their height. In this respect a catchain is very similar to a process group in the sense of [\[1\]](#ref-1) or [\[7\]](#ref-7). 
The principal difference is that a catchain is a "hardened" version of a process group tolerant to possible Byzantine (arbitrarily malicious) behavior of some participants. +One perspective is that a catchain is a (distributed) process group consisting of $N$ known and fixed (communicating) processes (or nodes in the preceding terminology), and these processes generate broadcast messages, that are eventually broadcast to all members of the process group. The set of all processes is denoted by $I$; we usually assume that $I = \{1 \ldots N\}$. The broadcasts generated by each process are numbered starting from one, so the $n$-th broadcast of process $i$ will receive sequence number or height $n$; each broadcast should be uniquely determined by the identity or the index $i$ of the originating process and its height $n$, so we can think of the pair $(i, n)$ as the natural identifier of a broadcast message inside a process group.[⁴](#footnote-4) The broadcasts generated by the same process $i$ are expected to be delivered to every other process in exactly the same order they have been created, i.e., in increasing order of their height. In this respect a catchain is very similar to a process group in the sense of [1](#ref-1) or [7](#ref-7). The principal difference is that a catchain is a "hardened" version of a process group tolerant to possible Byzantine (arbitrarily malicious) behavior of some participants. #### 2.4.1. Dependence relation on messages @@ -100,7 +112,7 @@ Recall that we have assumed that any message depends on all preceding messages o m_{i,s} \in D \Leftrightarrow s \leq \text{Vt}(D)_i ``` -We say that the vector $\text{Vt}(D) = (\text{Vt}(D)_i)$ with $i \in I$ and $\text{Vt}(D) \in \mathbb{N}^I_0$ with non-negative components $\text{Vt}(D)_i$ is the vector time or vector timestamp corresponding to cone $D$ (cf. [\[1\]](#ref-1) or [\[7\]](#ref-7) for a more detailed discussion of vector time). 
+We say that the vector $\text{Vt}(D) = (\text{Vt}(D)_i)$ with $i \in I$ and $\text{Vt}(D) \in \mathbb{N}^I_0$ with non-negative components $\text{Vt}(D)_i$ is the vector time or vector timestamp corresponding to cone $D$ (cf. [1](#ref-1) or [7](#ref-7) for a more detailed discussion of vector time). #### 2.4.6. Partial order on vector timestamps @@ -166,7 +178,7 @@ Note that all messages created by the same sender $i$ in a catchain turn out to In this way each process $i$ generates a simple blockchain consisting of its messages, with each "block" of this blockchain corresponding to one message and referring to the previous block by its hash, and sometimes includes references to blocks (i.e., messages) of other processes by mentioning the hashes of these blocks in its blocks. -Each block is signed by its creator. The resulting structure is very similar to that of an "asynchronous payment channel" considered in [\[3\]](#ref-3), but with $N$ participants instead of 2. +Each block is signed by its creator. The resulting structure is very similar to that of an "asynchronous payment channel" considered in \[[3, 5.1.5](/ton/ton#5-1-5-asynchronous-payment-channel-as-a-virtual-blockchain-with-two-workchains)], but with $N$ participants instead of 2. ### 2.6. Message propagation in a catchain @@ -442,7 +454,7 @@ The first block producer may suggest a block candidate immediately after the rou #### 3.4.8. Suggesting a block candidate -A block candidate for the TON Blockchain consists of two large "files" — the block and the collated data, along with a small header containing the description of the block being generated (most importantly, the complete block identifier for the block candidate, containing the workchain and the shard identifier, the block sequence number, its file hash and its root hash) and the sha256 hashes of the two large files. 
Only a part of this small header (including the hashes of the two files and other important data) is used as `candidate` in BCP events such as `Submit` or `CommitSign` to refer to a specific block candidate. The bulk of the data (most importantly, the two large files) is propagated in the overlay network associated with the catchain by the streaming broadcast protocol implemented over ADNL for this purpose (cf. [\[3\]](#ref-3)). This bulk data propagation mechanism is unimportant for the validity of the consensus protocol (the only important point is that the hashes of the large files are part of BCP events and hence of the catchain messages, where they are signed by the sender, and these hashes are checked after the large files are received by any participating nodes; therefore, nobody can replace or corrupt these files). A `Submit(round, candidate)` BCP event is created in the catchain by the block producer in parallel with the propagation of the block candidate, indicating the submission of this specific block candidate by this block producer. +A block candidate for the TON Blockchain consists of two large "files" — the block and the collated data, along with a small header containing the description of the block being generated (most importantly, the complete block identifier for the block candidate, containing the workchain and the shard identifier, the block sequence number, its file hash and its root hash) and the sha256 hashes of the two large files. Only a part of this small header (including the hashes of the two files and other important data) is used as `candidate` in BCP events such as `Submit` or `CommitSign` to refer to a specific block candidate. The bulk of the data (most importantly, the two large files) is propagated in the overlay network associated with the catchain by the streaming broadcast protocol implemented over ADNL for this purpose (cf. \[[3, 3.3.15](/ton/ton#3-3-15-streaming-broadcast-protocol)]). 
This bulk data propagation mechanism is unimportant for the validity of the consensus protocol (the only important point is that the hashes of the large files are part of BCP events and hence of the catchain messages, where they are signed by the sender, and these hashes are checked after the large files are received by any participating nodes; therefore, nobody can replace or corrupt these files). A `Submit(round, candidate)` BCP event is created in the catchain by the block producer in parallel with the propagation of the block candidate, indicating the submission of this specific block candidate by this block producer. #### 3.4.9. Processing block candidates @@ -598,33 +610,19 @@ Now we claim that (each round of) the BCP protocol as described above terminates ## References - - -\[1] K. Birman, Reliable Distributed Systems: Technologies, Web Services and Applications, Springer, 2005. - - - -\[2] M. Castro, B. Liskov, et al., Practical byzantine fault tolerance, Proceedings of the Third Symposium on Operating Systems Design and Implementation (1999), p. 173–186, available at [http://pmg.csail.mit.edu/papers/osdi99.pdf](http://pmg.csail.mit.edu/papers/osdi99.pdf). - - - -\[3] N. Durov, Telegram Open Network, 2017. - - - -\[4] N. Durov, Telegram Open Network Blockchain, 2018. + \[1] K. Birman, _Reliable Distributed Systems: Technologies, Web Services and Applications_, Springer, 2005. - + \[2] M. Castro, B. Liskov, et al., _Practical byzantine fault tolerance_, Proceedings of the Third Symposium on Operating Systems Design and Implementation (1999), p. 173–186, available at [http://pmg.csail.mit.edu/papers/osdi99.pdf](http://pmg.csail.mit.edu/papers/osdi99.pdf). -\[5] L. Lamport, R. Shostak, M. Pease, The byzantine generals problem, ACM Transactions on Programming Languages and Systems, 4/3 (1982), p. 382–401. + \[3] N. Durov, [_Telegram Open Network_](https://www.editionmultimedia.fr/wp-content/uploads/2019/10/Telegram-Open-Network-2017.pdf), 2017. - + \[4] N. 
Durov, [_Telegram Open Network Blockchain_](https://www.docdroid.net/qY4sQEv/telegram-open-network-blockchain-september-5-2018-pdf), 2018. -\[6] A. Miller, Yu Xia, et al., The honey badger of BFT protocols, Cryptology e-print archive 2016/99, [https://eprint.iacr.org/2016/199.pdf](https://eprint.iacr.org/2016/199.pdf), 2016. + \[5] L. Lamport, R. Shostak, M. Pease, _The byzantine generals problem_, ACM Transactions on Programming Languages and Systems, 4/3 (1982), p. 382–401. - + \[6] A. Miller, Yu Xia, et al., _The honey badger of BFT protocols_, Cryptology e-print archive 2016/199, [https://eprint.iacr.org/2016/199.pdf](https://eprint.iacr.org/2016/199.pdf), 2016. -\[7] M. van Steen, A. Tanenbaum, Distributed Systems, 3rd ed., 2017. + \[7] M. van Steen, A. Tanenbaum, _Distributed Systems_, 3rd ed., 2017. ## Footnotes @@ -636,7 +634,7 @@ Now we claim that (each round of) the BCP protocol as described above terminates **4** In the Byzantine environment of a catchain this is not necessarily true in all situations. [Back ↑](#2-4-messages-in-a-catchain-catchain-as-a-process-group) - **5** We assume that all broadcast messages in the process group are "causal broadcasts" or "cbcast" in the terminology of [\[1\]](#ref-1), because we only need cbcasts for the implementation of Catchain protocol and Catchain consensus. [Back ↑](#2-4-8-using-vector-timestamps-to-correctly-deliver-broadcast-messages) + **5** We assume that all broadcast messages in the process group are "causal broadcasts" or "cbcast" in the terminology of [1](#ref-1), because we only need cbcasts for the implementation of Catchain protocol and Catchain consensus. [Back ↑](#2-4-8-using-vector-timestamps-to-correctly-deliver-broadcast-messages) **6** This also means that each process implicitly determines the Unixtime of the start of the next round, and computes all delays, e.g., the block candidate submission delays, starting from this time. [Back ↑](#3-4-5-protocol-overview)