[Indexer] Full Data & Real-time Indexer #1028

baichuan3 · 2023-10-24T13:47:04Z

Typical scenario

Which coins are registered in the system, the total supply of each coin, and the number of addresses holding it
The transaction list of a given address, and the list of all its coins with their balances
The event list of a given transaction

Goal

Automatically index state and event data on Rooch
Provide an API to query the indexed data
Provide SQL access to the indexed data
Provide the ability to customize the Indexer?

Indexer solution

Take checkpoints at regular intervals, then write in batches
Listen to the database file, generate a transaction stream, and trigger writes
Automatically parse SMT nodes, resolve their types, and automatically generate the table schema

Option	Data Mode	Create Table Schema
Option 1: Based on checkpoints	Offchain	Manual
Option 2: Listen to the database and generate a transaction stream	Offchain	Manual
Option 3: Auto-generate the table schema	Onchain	Auto
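
Option 1 can be sketched as a buffer that accumulates rows and releases them as one batch whenever a checkpoint boundary is reached. This is only a minimal sketch under stated assumptions, not Rooch's implementation: `CheckpointBatcher`, `IndexRow`, and the `interval` parameter are hypothetical names, `tx_order` is assumed to start at 0, and the returned `Vec` stands in for a real batched database write.

```rust
/// Hypothetical row produced by the indexer (not a Rooch type).
#[derive(Debug, Clone, PartialEq)]
pub struct IndexRow {
    pub tx_order: u64,
    pub tx_hash: String,
}

/// Buffers rows and releases them in batches at checkpoint boundaries.
pub struct CheckpointBatcher {
    /// Assumed number of transactions between two checkpoints.
    interval: u64,
    pending: Vec<IndexRow>,
}

impl CheckpointBatcher {
    pub fn new(interval: u64) -> Self {
        assert!(interval > 0, "interval must be positive");
        Self { interval, pending: Vec::new() }
    }

    /// Buffer one row. Returns `Some(batch)` when `tx_order` closes a
    /// checkpoint, so the caller can write the whole batch at once.
    pub fn push(&mut self, row: IndexRow) -> Option<Vec<IndexRow>> {
        let at_checkpoint = (row.tx_order + 1) % self.interval == 0;
        self.pending.push(row);
        if at_checkpoint {
            // Hand the buffered rows to the caller and start a new batch.
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }

    /// Number of rows still waiting for the next checkpoint.
    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

The trade-off of this option is latency: rows become queryable only after the checkpoint that contains them, which is why Option 2 streams transactions instead.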

Automatic table creation solution

Architecture

Core Process

Parse the SMT and NodeStore to get the AnnotatedMoveValue corresponding to each State and Event.
Parse the AnnotatedMoveValue and generate Table Schema Metadata and Data Schema Metadata.
Use the Table Schema Template to convert the Table Schema Metadata and Data Schema Metadata into a Table Schema and a Data Schema.
Check whether the table has already been created. If not, create the table first, then create its indexes; then write the data.
Repeat the above process.
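
The "create the table if it does not exist yet" step of this process might look like the sketch below. All names (`TableSchemaMeta`, `SchemaRegistry`, `render_create_table`) are hypothetical, the SQL is a generic dialect, and index creation is left out; the only detail taken from this issue is that the template adds `created_at` and `updated_at` to every table.

```rust
use std::collections::HashSet;

/// Hypothetical table schema metadata derived from a parsed Move struct.
pub struct TableSchemaMeta {
    pub table: String,
    /// (column name, SQL type) pairs in declaration order.
    pub columns: Vec<(String, String)>,
    pub primary_key: Vec<String>,
}

/// Render a CREATE TABLE statement from the metadata. The template step
/// appends the `created_at` and `updated_at` columns to every table.
pub fn render_create_table(meta: &TableSchemaMeta) -> String {
    let mut cols: Vec<String> = meta
        .columns
        .iter()
        .map(|(name, ty)| format!("{} {}", name, ty))
        .collect();
    cols.push("created_at timestamp".to_string());
    cols.push("updated_at timestamp".to_string());
    cols.push(format!("PRIMARY KEY ({})", meta.primary_key.join(", ")));
    format!("CREATE TABLE {} ({})", meta.table, cols.join(", "))
}

/// Remembers which tables have already been created.
#[derive(Default)]
pub struct SchemaRegistry {
    created: HashSet<String>,
}

impl SchemaRegistry {
    /// Returns the DDL to run before the first write to a table,
    /// or `None` if the table is already known to exist.
    pub fn ensure_table(&mut self, meta: &TableSchemaMeta) -> Option<String> {
        if self.created.insert(meta.table.clone()) {
            Some(render_create_table(meta))
        } else {
            None
        }
    }
}
```

A caller would run the returned statement (if any) and then write the data rows, repeating this for every parsed State and Event.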

How to parse

Automatically parse SMT leaf nodes and obtain their types. Problem: this introduces a crate dependency on statedb.

Entry: statedb module
Call the method AnnotatedStateReader::view_value

fn view_value(&self, ty_tag: &TypeTag, blob: &[u8]) -> Result<AnnotatedMoveValue> {
    let annotator = MoveValueAnnotator::new(self);
    annotator.view_value(ty_tag, blob)
}

Parse the TableChange in StateChangeSet at the storage layer. Its data structure is <Vec, Op>; extract the State type from it.

pub struct State {
    /// the bytes of state
    pub value: Vec<u8>,
    /// the type of state
    pub value_type: TypeTag,
}
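
To illustrate how such a table change could be folded into an index, here is a minimal sketch. It assumes each change maps a raw key to a New / Modify / Delete operation carrying a `State`. The `Op` enum and the simplified `State` (with a plain string instead of `TypeTag`) are local stand-ins, not the real Rooch or Move types, and `apply_table_change` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Stand-in for the `State` struct above; `value_type` is a plain
/// string here instead of a real `TypeTag`.
#[derive(Debug, Clone, PartialEq)]
pub struct State {
    pub value: Vec<u8>,
    pub value_type: String,
}

/// Stand-in for a change-set operation (assumed New / Modify / Delete).
pub enum Op<T> {
    New(T),
    Modify(T),
    Delete,
}

/// Apply one table's changes to an in-memory index keyed by raw key bytes.
pub fn apply_table_change(
    index: &mut BTreeMap<Vec<u8>, State>,
    changes: Vec<(Vec<u8>, Op<State>)>,
) {
    for (key, op) in changes {
        match op {
            // New and Modify both end with the latest state stored.
            Op::New(state) | Op::Modify(state) => {
                index.insert(key, state);
            }
            // Delete removes the entry if it is present.
            Op::Delete => {
                index.remove(&key);
            }
        }
    }
}
```

In a real indexer the `value` bytes would additionally be decoded through `view_value` using `value_type` before being written to a table.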

State entry: the statedb module
Distinguish between:
resource
module

Event entry: the event_store/mod module
To parse AnnotatedMoveValue, see this reference:
https://github.com/rooch-network/rooch/blob/encrypt_keystore/crates/rooch-types/src/framework/coin.rs

Core Table Schema

transaction

Field	Type	Description
tx_hash	varchar	the hash of the transaction
transaction_type	varchar	the type of the transaction
chain_id	int	the chain id
auth_validator_id	int	the auth validator id of the authenticator info
payload	blob	the authenticator info payload
encode_data	blob	the transaction encode data
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated
sender_address	varchar	address of the user who sent the transaction

Primary key: tx_hash pk
Index:

transaction_sequence_info

Field	Type	Description
tx_order	bigint	the sequencer order of the transaction
auth_validator_id	int	the auth validator id of the tx order signature
payload	blob	tx order signature payload
tx_accumulator_root	varchar	the tx accumulator root after the tx is appended to the accumulator
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated
sender_address	varchar	address of the user who sent the transaction

Primary key: tx_order pk
Index:

transaction_execute_info

Field	Type	Description
tx_hash	varchar	the hash of the transaction
state_root	varchar	the root hash of Sparse Merkle Tree describing the world state at the end of this transaction.
event_root	varchar	the root hash of Merkle Accumulator storing all events emitted during this transaction
gas_used	int	the amount of gas used
status	varchar	the vm status
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated
sender_address	varchar	address of the user who sent the transaction

Primary key: tx_hash pk
Index:

coin_info

Field	Type	Description
coin_type	varchar	coin type of the coin
name	varchar	name of the coin
symbol	varchar	symbol of the coin
decimals	smallint	decimals of the coin
supply	bigint	supply of the coin
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated

Primary key: symbol unique pk
Index:

coin_store

Field	Type	Description
address	varchar	user address
coin_type	varchar	coin type of the coin
balance	numeric	balance of a specific coin type
frozen	bool	freeze status
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated

Primary key: (address, coin_type) composite pk
Index:
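
As an illustration of how the coin_store table serves the "number of holding addresses" scenario, the sketch below counts holders per coin type over in-memory rows. `CoinStoreRow` and `holders_per_coin` are hypothetical names, `u128` stands in for the numeric balance column, and treating only non-zero balances as holders is an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-in for one row of the `coin_store` table.
pub struct CoinStoreRow {
    pub address: String,
    pub coin_type: String,
    /// `u128` is an assumption; the on-chain balance is a u256.
    pub balance: u128,
    pub frozen: bool,
}

/// Count distinct holding addresses per coin type, skipping zero balances.
pub fn holders_per_coin(rows: &[CoinStoreRow]) -> BTreeMap<String, usize> {
    let mut holders: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for row in rows.iter().filter(|r| r.balance > 0) {
        holders
            .entry(row.coin_type.as_str())
            .or_default()
            .insert(row.address.as_str());
    }
    holders
        .into_iter()
        .map(|(coin, addrs)| (coin.to_string(), addrs.len()))
        .collect()
}
```

Against a SQL store the same question would be a grouped count over coin_store, which is one reason an index on coin_type may be worth adding.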

event

Field	Type	Description
event_handle_id	varchar	event handle id
event_seq	int	the number of messages that have been emitted to the path previously
type_tag	varchar	the type of the event data
event_data	blob	the data payload of the event
event_index	int	event index in the transaction events
created_at	timestamp	when the row was created
updated_at	timestamp	when the row was updated
tx_order	bigint	the sequencer order of the transaction
tx_hash	varchar	the transaction hash of the transaction
sender_address	varchar	address of the user who emitted the event

Primary key: (event_handle_id, event_seq) composite pk
Index:

Challenges in the automatic table creation solution

Nested structs: how should automatic table creation convert them? Create multiple tables, or flatten them into a single table through a template?
For example:

    /// The Balance resource that stores the balance of a specific coin type.
    struct Balance has store {
        value: u256,
    }

    /// A holder of a specific coin types.
    /// These are kept in a single resource to ensure locality of data.
    struct CoinStore has key {
        coin_type: string::String,
        balance: Balance,
        frozen: bool,
    }

Primary keys: should tables get automatically created auto-increment primary keys, or business primary keys such as address or transaction sequence order?
How are indexes created automatically after a table is created? How are composite indexes created to serve the query scenarios?
How are Table Schema changes handled when a Struct is updated (for example, when fields are added to the Struct)?
Batch writing and other performance optimizations
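
For the nested-struct question, one possible answer is to flatten nested fields into columns of a single table. The sketch below shows that idea only; `FieldValue` is a hypothetical, heavily simplified stand-in for AnnotatedMoveValue, and joining path segments with underscores is an assumed naming rule.

```rust
/// Hypothetical, simplified view of an annotated Move value.
pub enum FieldValue {
    /// A leaf field with its SQL column type, e.g. "numeric" or "bool".
    Leaf(&'static str),
    /// A nested struct with named fields.
    Struct(Vec<(&'static str, FieldValue)>),
}

/// Flatten nested fields into `(column_name, sql_type)` pairs,
/// joining the path segments with underscores.
pub fn flatten_columns(prefix: &str, value: &FieldValue, out: &mut Vec<(String, String)>) {
    match value {
        FieldValue::Leaf(sql_type) => {
            out.push((prefix.to_string(), sql_type.to_string()));
        }
        FieldValue::Struct(fields) => {
            for (name, inner) in fields {
                let path = if prefix.is_empty() {
                    name.to_string()
                } else {
                    format!("{}_{}", prefix, name)
                };
                flatten_columns(&path, inner, out);
            }
        }
    }
}
```

Applied to CoinStore, this yields the columns coin_type, balance_value, and frozen. A template could then rename balance_value to balance to match the coin_store table, whereas the multi-table alternative would need a foreign key per nested struct.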

Convention

All Tables have created_at and updated_at fields by default, which are automatically filled in by Table Schema Template

TODO

Auto parse and Table Schema Template
Should we split the SMT's leaf value from the node?
SQLite ORM
GraphQL Server

Appendix

SMT principle

Related issues


baichuan3 · 2023-10-25T03:44:57Z

Discussion conclusion:

The transaction sender and other types are normalized. A transaction sent from Ethereum has both an Ethereum address and a Rooch address as its sender; at least the Rooch address must be stored.
Transactions can be merged into a single table, and writing transaction data is triggered in RpcService::execute_tx.
Indexer trigger entry? Distinguish between transaction data, state data, and event data.
Expanding tables by Struct is difficult. Instead, map each State table to an Indexer table and try to store the table value V as JSON.
Expand by table type:
Option 1: Create a global Object table with objectid, owner, and JSON (V); create one table for each of the other tables and store V as JSON. To be implemented in the first phase.
Option 2: Create one table per Object type with objectid and hash(K), expanded by Struct.
The Indexer's created_at and updated_at cannot use the indexer's own time. They must use on-chain time so that replay is supported.
Create a Changeset table for data recovery and state sync.
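
A minimal sketch of Option 1 combined with the on-chain timestamp rule might look like this. `ObjectRow` and `upsert_object` are hypothetical names, the value is assumed to be serialized to JSON by the caller, and `u64` timestamps are an assumption about how on-chain time is represented.

```rust
use std::collections::BTreeMap;

/// Hypothetical row of the global Object table from Option 1.
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectRow {
    pub object_id: String,
    pub owner: String,
    /// The value V, already serialized to JSON by the caller.
    pub value_json: String,
    pub created_at: u64,
    pub updated_at: u64,
}

/// Insert or update an object. Both timestamps come from the on-chain
/// transaction time, never from the indexer's clock, so replaying the
/// same transactions rebuilds identical rows.
pub fn upsert_object(
    table: &mut BTreeMap<String, ObjectRow>,
    object_id: &str,
    owner: &str,
    value_json: &str,
    tx_timestamp: u64,
) {
    table
        .entry(object_id.to_string())
        .and_modify(|row| {
            row.owner = owner.to_string();
            row.value_json = value_json.to_string();
            row.updated_at = tx_timestamp;
        })
        .or_insert_with(|| ObjectRow {
            object_id: object_id.to_string(),
            owner: owner.to_string(),
            value_json: value_json.to_string(),
            created_at: tx_timestamp,
            updated_at: tx_timestamp,
        });
}
```

Because no wall-clock time is read, rebuilding the table from the same sequence of changes (for example from the Changeset table) gives the same result on every run.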

baichuan3 added the status::design, feature, and skill::rust labels Oct 24, 2023

baichuan3 added this to the Rooch v0.3 milestone Oct 24, 2023

baichuan3 self-assigned this Oct 24, 2023

baichuan3 changed the title ~~Full Data & Real-time Indexer~~ [Indexer] Full Data & Real-time Indexer Oct 25, 2023

This was referenced Oct 25, 2023

Rooch v0.3 Milestone #1000

Closed

[IDEA] Rooch as a programmable platform for Github Events #1005

Open

Fully on-chain gaming #420

Open

Provide indexer store base on SQLite #1088

Merged

This was referenced Nov 8, 2023

Implement transaction indexer #1121

Merged

Implements event indexer write #1123

Merged

Refactor event filter and Implements indexer event RPC #1136

Merged

This was referenced Nov 29, 2023

[Indexer] Implements state indexer #1186

Merged

[Indexer] Recode state change set for state sync And provide state sync RPC #1190

Merged

jolestar closed this as completed Jan 30, 2024
