TIP128: Lite Fullnode implementation #128
Comments
Why do we want to develop a lite node? Is the lite node used for SPV?
No, you can think of it as a lightweight FullNode. A FullNode started from a snapshot will synchronize the block data produced after that snapshot, but it has no historical block data. It is mainly meant to solve the problem of slow FullNode startup.
What is the relationship between "history data" and "snapshot"? Since both come from a FullNode, for easy understanding, is the snapshot something like metadata and the history data like all the block data? If that is true, then after a lite node starts from a snapshot it still needs a large amount of time to sync something like 300 GB of block data, so startup still takes a long time. In that case there would be no real difference, since it still syncs block data after starting. Is the only difference that a lite node can start very fast and maintain the minimum functions of a FullNode?
Oh, I get your point. One more question: how can I obtain the snapshot data in the first place if my node starts up with it? If I already have the whole data, why not start up with the whole data? So I think this TIP only solves the problem of a new node without the whole data, and that new node must trust the data source provided by the Tron Foundation.
Yeah, you are right. Lite fullnode is mainly meant for people who don't have a fullnode but want to run one immediately. In this situation, they must trust the Tron Foundation completely.
Basically correct. The snapshot contains all the data needed to start a fullnode, not just metadata. If historical data queries are not needed, there is no need to synchronize blocks from the network. Meanwhile, if you have the historical data set, you can also merge it into the lite fullnode; this operation won't take very long.
The naming of "snapshot" & "history" is misleading. Actually, "history" is the real blockchain and "snapshot" is the chain's global state. Simpler naming might be better.
Is RocksDB's column family feature suitable for storing the 2 categories of data?
Sorry, this is all I can figure out; as far as I know, EOS has a similar function, and it's also called a snapshot.
I think that doesn't work. Using column families means the fullnode still has to hold all the data, which can't achieve fast startup, and it would still need to copy all the data when starting a new fullnode.
Will the lite node be suspended during the copy?
Good question. We have to stop the lite fullnode while copying, because LevelDB and RocksDB only allow one process to access a database at a time.
What does hot synchronization mean? Can it perform the copy at the same time?
Sorry, I described it too simply. In the next version, I hope the lite fullnode can synchronize the historical data directly from the mainnet without depending on a manual operation. What do you think about this idea?
Oh, that's a nice solution for handling old data. I am looking forward to the lite fullnode.
Thanks to everyone for contributing to this issue.
What we truly need is a "pruning feature".
Why can't we simply run a full node in pruned mode, like Bitcoin's pruned full node?
Simple Summary
This TIP describes a quick-startup scheme for FullNode.
Abstract
At present, each time a brand-new FullNode starts, it has to synchronize all the blocks from the genesis block to the latest block before it can work properly. As the TRON public chain runs stably and the block height increases steadily, this synchronization process is highly time-consuming. In addition, the database of a FullNode keeps growing, imposing ever-higher hardware requirements for running one. It is therefore necessary to develop a brand-new type of FullNode, namely the Lite FullNode, to achieve fast startup and data reduction.
Motivation
Currently, the database of the TRON public chain exceeds 300 GB. It takes at least a month or so for a FullNode to start and synchronize all the blocks up to the latest one, and the requirements on hard disk capacity and speed keep rising. In the foreseeable future, many machines will be incapable of running a FullNode.
Specification
Rationale
Currently, all databases of FullNode are mixed together without a clearly defined boundary. Lite FullNode, however, distinguishes the Snapshot Dataset, which stores all the data a FullNode needs to synchronize blocks and handle transactions, from the History Dataset, which holds the historical data. The Snapshot Dataset is much smaller than the History Dataset. This separation supports quick startup and reduces disk usage for FullNode. A Lite FullNode, i.e. a node that does not offer historical data queries but still synchronizes blocks and handles and broadcasts transactions, is therefore the better option.
After it starts, a Lite FullNode stores all of the archived data produced from then on, namely the data of the five databases `block`, `block-index`, `trans`, `transactionRetStore` and `transactionHistoryStore`, even though it holds no historical database.

As Lite FullNode only carries the Snapshot Dataset and does not support historical data queries, it cannot provide the full functionality of FullNode. To gain full functionality, one can copy the History Dataset to the node and then merge the historical data into the Lite FullNode databases.
Split
When FullNode is running, a complete world state is needed to validate new transactions and synchronize blocks. In the TRON network, the complete world state consists of all databases other than `block`, `block-index`, `trans`, `transactionRetStore` and `transactionHistoryStore`. As a result, the Snapshot Dataset records the data of all databases other than these five, while the History Dataset stores the data of the five.
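To make the split concrete, here is a minimal, non-authoritative sketch (hypothetical class and method names) that classifies each database directory under the FullNode database path into one of the two datasets and copies it as a whole. It ignores Checkpoint handling, which is discussed in the next subsection, and assumes the FullNode is stopped while the copy runs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import java.util.stream.Stream;

// Illustrative sketch only; not the actual split tool.
public class DatasetSplitSketch {

  // The five databases that form the History Dataset, as listed in this TIP.
  private static final Set<String> HISTORY_DBS = Set.of(
      "block", "block-index", "trans",
      "transactionRetStore", "transactionHistoryStore");

  /** Copies either the history databases or all other databases into datasetPath. */
  public static void split(Path fnDataPath, Path datasetPath, boolean history)
      throws IOException {
    try (DirectoryStream<Path> dbs = Files.newDirectoryStream(fnDataPath)) {
      for (Path db : dbs) {
        boolean belongsToHistory = HISTORY_DBS.contains(db.getFileName().toString());
        if (belongsToHistory == history) {
          copyDir(db, datasetPath.resolve(db.getFileName()));
        }
      }
    }
  }

  // Whole-directory copy; the node must be stopped so the LevelDB/RocksDB files are stable.
  private static void copyDir(Path src, Path dst) throws IOException {
    Files.createDirectories(dst.getParent());
    try (Stream<Path> files = Files.walk(src)) {
      for (Path p : (Iterable<Path>) files::iterator) {
        Files.copy(p, dst.resolve(src.relativize(p).toString()),
            StandardCopyOption.REPLACE_EXISTING);
      }
    }
  }
}
```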
Data Consistency
The state data of FullNode is scattered across all of its databases. To guarantee that all databases are updated atomically when each block and each transaction is processed, i.e. that the updates to the related databases either all occur or none occur, FullNode introduced a Checkpoint mechanism: the in-memory data is first written to disk in a single atomic operation, and only then are the databases updated. This prevents inconsistent states among the databases when the FullNode process exits due to an exception.
Therefore, if there is data in the Checkpoint when splitting the FullNode, that data must also be split and merged into the corresponding dataset. For instance, `block` data in the Checkpoint should be merged into the History Dataset.
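As a sketch of that rule, the snippet below routes pending Checkpoint entries to the proper dataset during a split. It assumes, purely for illustration, that each Checkpoint entry carries the name of the database it belongs to; `CheckpointEntry` and `Dataset` are hypothetical stand-ins, not real java-tron types.

```java
import java.util.Set;

// Illustrative only: route Checkpoint data into the matching dataset during a split.
public class CheckpointSplitSketch {

  record CheckpointEntry(String dbName, byte[] key, byte[] value) {}

  interface Dataset {
    void put(String dbName, byte[] key, byte[] value);
  }

  private static final Set<String> HISTORY_DBS = Set.of(
      "block", "block-index", "trans",
      "transactionRetStore", "transactionHistoryStore");

  /** Merge every pending Checkpoint entry into the dataset it belongs to. */
  static void mergeCheckpoint(Iterable<CheckpointEntry> checkpoint,
                              Dataset snapshot, Dataset history) {
    for (CheckpointEntry e : checkpoint) {
      Dataset target = HISTORY_DBS.contains(e.dbName()) ? history : snapshot;
      target.put(e.dbName(), e.key(), e.value());
    }
  }
}
```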
Transaction Validation
As an indispensable feature of blockchain, transaction validation is implemented in two aspects: duplicate-transaction detection and Tapos validation. FullNode provides the data support for duplication detection and Tapos with a `transactionCache` object and a `recentBlockStore` database respectively.

As the data required for initializing `transactionCache` lies in the History Dataset, the initialization logic of `transactionCache` has to be reconstructed so that all the data needed for this operation is loaded from the Snapshot Dataset. A persistent storage is added to hold all the transaction data required by `transactionCache`; instead of reading transactions from `block`, `transactionCache` completes its initialization from its own persistent storage.

`recentBlockStore` is included in the Snapshot Dataset by default, so it requires no extra operation.
Merge
The History Dataset is merged into a Lite FullNode by appending it directly. Since the data in the History Dataset is never updated, it is impossible for old data to overwrite new data.
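For illustration, a minimal sketch of such a direct append is shown below, using the `org.iq80.leveldb` Java interfaces; the method name and directory parameters are hypothetical, and the Lite FullNode must be stopped during the merge because LevelDB allows only one process to open a database at a time.

```java
import java.io.File;
import java.io.IOException;
import java.util.Map;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.DBIterator;
import org.iq80.leveldb.Options;
import static org.iq80.leveldb.impl.Iq80DBFactory.factory;

// Illustrative sketch: append every entry of one History Dataset database
// into the corresponding Lite FullNode database. Because history data never
// changes, re-writing an already-present key cannot overwrite newer state.
public class DatasetMergeSketch {

  static void mergeDb(File historyDbDir, File liteDbDir) throws IOException {
    Options options = new Options().createIfMissing(true);
    try (DB history = factory.open(historyDbDir, options);
         DB lite = factory.open(liteDbDir, options);
         DBIterator it = history.iterator()) {
      for (it.seekToFirst(); it.hasNext(); ) {
        Map.Entry<byte[], byte[]> entry = it.next();
        lite.put(entry.getKey(), entry.getValue());   // direct append
      }
    }
  }
}
```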
Implementation
Program Stage1
A tool is provided in Stage One to enable splitting, backup and merging. With the split option, a FullNode database can be split into a Snapshot Dataset and a History Dataset. With the merge option, a History Dataset can be merged into a Lite FullNode.
Splitting requires the directory of the original FullNode database and the target directory of the dataset. Given that splitting the History Dataset may take a rather long time, the tool supports splitting by dataset type:
Tool parameters explained:
- `--operation | -o`: [ split | merge ] specifies the operation, either split or merge
- `--type | -t`: [ snapshot | history ] is used only with `split` to specify the type of dataset to produce; snapshot refers to the Snapshot Dataset and history refers to the History Dataset
- `--fn-data-path`: FullNode database directory
- `--dataset-path`: dataset directory
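A hypothetical invocation might look like the following; the jar name and the paths are placeholders, only the flags come from the list above.

```
# Split the Snapshot Dataset out of an existing FullNode database
# (the FullNode should be stopped first):
java -jar LiteFullNodeTool.jar --operation split --type snapshot \
    --fn-data-path /data/fullnode/output-directory/database \
    --dataset-path /data/lite-dataset

# Later, merge a History Dataset into the Lite FullNode to regain full functionality:
java -jar LiteFullNodeTool.jar --operation merge \
    --fn-data-path /data/lite-fullnode/output-directory/database \
    --dataset-path /data/history-dataset
```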
Program Stage2
Stage Two focuses on sending instructions to FullNode to split, back up, download and merge datasets without stopping the FullNode process or affecting block synchronization and transaction processing.
TransactionCache
`transactionCache` stores the transaction records of the latest 65536 blocks, mainly for detecting duplicate transactions. The current initialization logic of `transactionCache` is to read the transaction information of the latest 65536 blocks from `blockStore` when FullNode starts. This logic needs to be reconstructed to stop relying on `blockStore`, so that a FullNode based on the Snapshot Dataset can function normally.

First, add a persistent storage to `transactionCache` so that the transaction information in the cache is written to disk as solidified blocks are updated. To ensure that only the transactions of the latest 65536 blocks are kept in the persistent storage, outdated transaction data must be deleted whenever the cache is updated.

Meanwhile, modify the initialization logic of `transactionCache` so that the transaction information is read from `localStore` instead of `blockStore`.
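The following is a minimal sketch of the reworked cache described above. The sorted in-memory map stands in for the new persistent storage (called `localStore` above), which in practice would be a LevelDB/RocksDB database that survives restarts; all class and method names here are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch of a transactionCache backed by its own bounded store
// instead of blockStore.
public class TransactionCacheSketch {

  private static final int MAX_BLOCKS = 65_536;                 // window size from this TIP
  private final TreeMap<Long, List<String>> localStore = new TreeMap<>();
  private final Set<String> txIds = new HashSet<>();            // fast duplicate check

  /** Called whenever a solidified block is updated. */
  public void onSolidifiedBlock(long blockNum, List<String> blockTxIds) {
    localStore.put(blockNum, blockTxIds);
    txIds.addAll(blockTxIds);
    // Keep only the latest 65536 blocks: delete outdated data from the store
    // and from the in-memory set at the same time the cache is updated.
    while (!localStore.isEmpty() && localStore.firstKey() <= blockNum - MAX_BLOCKS) {
      txIds.removeAll(localStore.pollFirstEntry().getValue());
    }
  }

  /** Startup: rebuild the duplicate-check set from localStore instead of blockStore. */
  public void init() {
    txIds.clear();
    localStore.values().forEach(txIds::addAll);
  }

  public boolean isDuplicate(String txId) {
    return txIds.contains(txId);
  }
}
```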
Future
A function enabling Lite FullNode to automatically complete its History Dataset from the network will be built in the future.