TIP128: Lite Fullnode implementation #128
Comments
Why do we want to develop a lite node? Is the lite node used for SPV?
No, you can think of it as a lightweight FullNode. A FullNode started from a snapshot will synchronize the block data produced after that snapshot, but it has no historical block data. It is mainly meant to solve the problem of slow FullNode startup.
What is the relationship between "history data" and "snapshot"? Since both come from a FullNode, for easy understanding, is the snapshot something like metadata and the history data like all the block data? If that is true, then after a lite node starts from a snapshot it still needs a large amount of time to sync something like 300 GB of block data, so startup still takes a long time. In that case there would be no real difference, since it still syncs block data after starting. Is the only difference that a lite node can start very fast and maintain the minimum functions of a FullNode?
Oh, I get your point. One more question: how can I obtain the snapshot data in the first place if my node starts up with it? If I already have the whole data, why not start up with the whole data? So I think this TIP only solves the problem of a new node without the whole data, and that new node must trust the data source provided by the Tron Foundation.
Yeah, you are right. Lite fullnode is mainly meant for people who don't have a fullnode but want to run one immediately. In this situation, they must trust the Tron Foundation completely.
Basically correct. The snapshot contains all the data needed to start a fullnode, not just metadata. If historical data queries are not needed, there is no need to synchronize blocks from the network. Meanwhile, if you have the historical data set, you can also merge it into the lite fullnode; this operation won't take very long.
The naming of "snapshot" & "history" is misleading. Actually, "history" is the real blockchain and "snapshot" is the chain's global state. Simpler naming might be better.
Is RocksDB's column family feature suitable for storing the 2 categories of data?
Sorry, this is all I can figure out; as far as I know, EOS has a similar function, and it's also called a snapshot.
I think that doesn't work. Using column families means the fullnode still has to hold all the data, which can't achieve fast startup, and it would still need to copy all the data when starting a new fullnode.
Will the lite node be suspended during the copy?
Good question. We have to stop the lite fullnode while copying, because LevelDB and RocksDB only allow one process to access a database at a time.
What does hot synchronization mean? Can it perform the copy at the same time?
Sorry, I described it too simply. In the next version, I hope the lite fullnode can synchronize the historical data directly from the mainnet without depending on a manual operation. What do you think about this idea?
Oh, that's a nice solution for handling old data. I am looking forward to the lite fullnode.
Thanks to everyone for contributing to this issue.
What we truly need is a "pruning feature".
Why can't we simply run a full node in pruned mode, like Bitcoin's pruned full node?
Simple Summary
This TIP describes a quick-startup scheme for FullNode.
Abstract
At present, each time a brand-new FullNode starts, it has to synchronize all the blocks from the genesis block to the latest block before it can work properly. As the TRON public chain runs stably and the block height increases steadily, this synchronization process is highly time-consuming. In addition, the database of a FullNode keeps growing, imposing ever-higher hardware requirements for running one. It is therefore necessary to develop a brand-new type of FullNode, namely the Lite FullNode, to achieve fast startup and data reduction.
Motivation
Currently, the database of the TRON public chain exceeds 300 GB. It takes at least a month or so for a FullNode to start and synchronize all the blocks up to the latest one, and the requirements on hard disk capacity and speed keep rising. In the foreseeable future, many machines will be incapable of running a FullNode.
Specification
Rationale
Currently, all databases of FullNode are mixed together without a clearly defined boundary. Lite FullNode, however, distinguishes the Snapshot Dataset, which stores all the data a FullNode needs to synchronize blocks and handle transactions, from the History Dataset, which holds the historical data. The Snapshot Dataset is much smaller than the History Dataset. This separation supports quick startup and reduces disk usage for FullNode. A Lite FullNode, i.e. a node that does not offer historical data queries but still synchronizes blocks and handles and broadcasts transactions, is therefore the better option.
After it starts, a Lite FullNode stores all of the archived data produced from then on, namely the data of the five databases `block`, `block-index`, `trans`, `transactionRetStore` and `transactionHistoryStore`, even though it holds no historical database.

As Lite FullNode only carries the Snapshot Dataset and does not support historical data queries, it cannot provide the full functionality of FullNode. To gain full functionality, one can copy the History Dataset to the node and then merge the historical data into the Lite FullNode databases.
Split
When FullNode is running, a complete world state is needed to validate new transactions and synchronize blocks. In the TRON network, the complete world state consists of all databases other than `block`, `block-index`, `trans`, `transactionRetStore` and `transactionHistoryStore`. As a result, the Snapshot Dataset records the data of all databases other than these five, while the History Dataset stores the data of the five.
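To make the split concrete, here is a minimal, non-authoritative sketch (hypothetical class and method names) that classifies each database directory under the FullNode database path into one of the two datasets and copies it as a whole. It ignores Checkpoint handling, which is discussed in the next subsection, and assumes the FullNode is stopped while the copy runs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import java.util.stream.Stream;

// Illustrative sketch only; not the actual split tool.
public class DatasetSplitSketch {

  // The five databases that form the History Dataset, as listed in this TIP.
  private static final Set<String> HISTORY_DBS = Set.of(
      "block", "block-index", "trans",
      "transactionRetStore", "transactionHistoryStore");

  /** Copies either the history databases or all other databases into datasetPath. */
  public static void split(Path fnDataPath, Path datasetPath, boolean history)
      throws IOException {
    try (DirectoryStream<Path> dbs = Files.newDirectoryStream(fnDataPath)) {
      for (Path db : dbs) {
        boolean belongsToHistory = HISTORY_DBS.contains(db.getFileName().toString());
        if (belongsToHistory == history) {
          copyDir(db, datasetPath.resolve(db.getFileName()));
        }
      }
    }
  }

  // Whole-directory copy; the node must be stopped so the LevelDB/RocksDB files are stable.
  private static void copyDir(Path src, Path dst) throws IOException {
    Files.createDirectories(dst.getParent());
    try (Stream<Path> files = Files.walk(src)) {
      for (Path p : (Iterable<Path>) files::iterator) {
        Files.copy(p, dst.resolve(src.relativize(p).toString()),
            StandardCopyOption.REPLACE_EXISTING);
      }
    }
  }
}
```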
Data Consistency
The state data of FullNode is scattered across all of its databases. To guarantee that all databases are updated atomically when each block and each transaction is processed, i.e. that the updates to the related databases either all occur or none occur, FullNode introduced a Checkpoint mechanism: the in-memory data is first written to disk in a single atomic operation, and only then are the databases updated. This prevents inconsistent states among the databases when the FullNode process exits due to an exception.
Therefore, if there is data in the Checkpoint when splitting the FullNode, that data must also be split and merged into the corresponding dataset. For instance, `block` data in the Checkpoint should be merged into the History Dataset.
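As a sketch of that rule, the snippet below routes pending Checkpoint entries to the proper dataset during a split. It assumes, purely for illustration, that each Checkpoint entry carries the name of the database it belongs to; `CheckpointEntry` and `Dataset` are hypothetical stand-ins, not real java-tron types.

```java
import java.util.Set;

// Illustrative only: route Checkpoint data into the matching dataset during a split.
public class CheckpointSplitSketch {

  record CheckpointEntry(String dbName, byte[] key, byte[] value) {}

  interface Dataset {
    void put(String dbName, byte[] key, byte[] value);
  }

  private static final Set<String> HISTORY_DBS = Set.of(
      "block", "block-index", "trans",
      "transactionRetStore", "transactionHistoryStore");

  /** Merge every pending Checkpoint entry into the dataset it belongs to. */
  static void mergeCheckpoint(Iterable<CheckpointEntry> checkpoint,
                              Dataset snapshot, Dataset history) {
    for (CheckpointEntry e : checkpoint) {
      Dataset target = HISTORY_DBS.contains(e.dbName()) ? history : snapshot;
      target.put(e.dbName(), e.key(), e.value());
    }
  }
}
```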
Transaction Validation
As an indispensable feature of blockchain, transaction validation is implemented in two aspects: duplicate-transaction detection and Tapos validation. FullNode provides the data support for duplication detection and Tapos with a `transactionCache` object and a `recentBlockStore` database respectively.

As the data required for initializing `transactionCache` lies in the History Dataset, the initialization logic of `transactionCache` has to be reconstructed so that all the data needed for this operation is loaded from the Snapshot Dataset. A persistent storage is added to hold all the transaction data required by `transactionCache`; instead of reading transactions from `block`, `transactionCache` completes its initialization from its own persistent storage.

`recentBlockStore` is included in the Snapshot Dataset by default, so it requires no extra operation.
Merge
The History Dataset is merged into a Lite FullNode by appending it directly. Since the data in the History Dataset is never updated, it is impossible for old data to overwrite new data.
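For illustration, a minimal sketch of such a direct append is shown below, using the `org.iq80.leveldb` Java interfaces; the method name and directory parameters are hypothetical, and the Lite FullNode must be stopped during the merge because LevelDB allows only one process to open a database at a time.

```java
import java.io.File;
import java.io.IOException;
import java.util.Map;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.DBIterator;
import org.iq80.leveldb.Options;
import static org.iq80.leveldb.impl.Iq80DBFactory.factory;

// Illustrative sketch: append every entry of one History Dataset database
// into the corresponding Lite FullNode database. Because history data never
// changes, re-writing an already-present key cannot overwrite newer state.
public class DatasetMergeSketch {

  static void mergeDb(File historyDbDir, File liteDbDir) throws IOException {
    Options options = new Options().createIfMissing(true);
    try (DB history = factory.open(historyDbDir, options);
         DB lite = factory.open(liteDbDir, options);
         DBIterator it = history.iterator()) {
      for (it.seekToFirst(); it.hasNext(); ) {
        Map.Entry<byte[], byte[]> entry = it.next();
        lite.put(entry.getKey(), entry.getValue());   // direct append
      }
    }
  }
}
```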
Implementation
Program Stage1
A tool is provided in Stage One to enable splitting, backup and merging. With the split option, a FullNode database can be split into a Snapshot Dataset and a History Dataset. With the merge option, a History Dataset can be merged into a Lite FullNode.
Splitting requires the directory of the original FullNode database and the target directory of the dataset. Given that splitting the History Dataset may take a rather long time, the tool supports splitting by dataset type:
Tool parameters explained:
- `--operation | -o`: [ split | merge ] specifies the operation, either split or merge
- `--type | -t`: [ snapshot | history ] is used only with `split` to specify the type of dataset to produce; snapshot refers to the Snapshot Dataset and history refers to the History Dataset
- `--fn-data-path`: FullNode database directory
- `--dataset-path`: dataset directory
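A hypothetical invocation might look like the following; the jar name and the paths are placeholders, only the flags come from the list above.

```
# Split the Snapshot Dataset out of an existing FullNode database
# (the FullNode should be stopped first):
java -jar LiteFullNodeTool.jar --operation split --type snapshot \
    --fn-data-path /data/fullnode/output-directory/database \
    --dataset-path /data/lite-dataset

# Later, merge a History Dataset into the Lite FullNode to regain full functionality:
java -jar LiteFullNodeTool.jar --operation merge \
    --fn-data-path /data/lite-fullnode/output-directory/database \
    --dataset-path /data/history-dataset
```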
Program Stage2
Stage Two focuses on sending instructions to FullNode to split, back up, download and merge datasets without stopping the FullNode process or affecting block synchronization and transaction processing.
TransactionCache
`transactionCache` stores the transaction records of the latest 65536 blocks, mainly for detecting duplicate transactions. The current initialization logic of `transactionCache` is to read the transaction information of the latest 65536 blocks from `blockStore` when FullNode starts. This logic needs to be reconstructed to stop relying on `blockStore`, so that a FullNode based on the Snapshot Dataset can function normally.

First, add a persistent storage to `transactionCache` so that the transaction information in the cache is written to disk as solidified blocks are updated. To ensure that only the transactions of the latest 65536 blocks are kept in the persistent storage, outdated transaction data must be deleted whenever the cache is updated.

Meanwhile, modify the initialization logic of `transactionCache` so that the transaction information is read from `localStore` instead of `blockStore`.
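The following is a minimal sketch of the reworked cache described above. The sorted in-memory map stands in for the new persistent storage (called `localStore` above), which in practice would be a LevelDB/RocksDB database that survives restarts; all class and method names here are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch of a transactionCache backed by its own bounded store
// instead of blockStore.
public class TransactionCacheSketch {

  private static final int MAX_BLOCKS = 65_536;                 // window size from this TIP
  private final TreeMap<Long, List<String>> localStore = new TreeMap<>();
  private final Set<String> txIds = new HashSet<>();            // fast duplicate check

  /** Called whenever a solidified block is updated. */
  public void onSolidifiedBlock(long blockNum, List<String> blockTxIds) {
    localStore.put(blockNum, blockTxIds);
    txIds.addAll(blockTxIds);
    // Keep only the latest 65536 blocks: delete outdated data from the store
    // and from the in-memory set at the same time the cache is updated.
    while (!localStore.isEmpty() && localStore.firstKey() <= blockNum - MAX_BLOCKS) {
      txIds.removeAll(localStore.pollFirstEntry().getValue());
    }
  }

  /** Startup: rebuild the duplicate-check set from localStore instead of blockStore. */
  public void init() {
    txIds.clear();
    localStore.values().forEach(txIds::addAll);
  }

  public boolean isDuplicate(String txId) {
    return txIds.contains(txId);
  }
}
```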
Future
A function enabling Lite FullNode to automatically complete its History Dataset from the network will be built in the future.