This repository has been archived by the owner on Dec 26, 2023. It is now read-only.

SMIP-0003: Global state data, STF, APIs #13

Closed
avive opened this issue May 19, 2020 · 4 comments

@avive
Contributor

avive commented May 19, 2020

Global state design - data + apis

UPDATE (5/26): renamed "meta-mesh" to "global state" for clarity
UPDATE (5/31): added "Scope" section; updated list of open questions

Scope

What's in scope for this proposal:

  • STF definition and specification (i.e., how is the global state calculated from the mesh?)
  • what data are included in global state? (e.g., account data, transaction state, rewards, etc. - see below for a more complete list)
  • how is global state data surfaced to extrinsic systems that rely on it? (e.g., API calls)

What's out of scope for this proposal:

  • consensus (consensus is about what goes into the mesh; global state is calculated based on the mesh, after consensus is achieved)
  • reorgs/checkpointing (this also happens at the mesh/consensus layer and is orthogonal)
  • accounts model, SVM storage, or how a go-spacemesh node stores additional data such as receipts, events, rewards, fees, etc. (these things are part of global state but this is a low-level implementation detail; the focus of this proposal is a higher layer of abstraction)

Motivation

Spacemesh aims to build and release a minimally viable cryptocurrency mainnet powered by the Spacemesh consensus protocol. The primary use case of any cryptocurrency is the ability of anyone to transact without limitations. Transacting means being able to submit transactions and to learn the results of any executed transaction that a user is involved in (as a sender, receiver, signer, etc.). Another important use case is smeshers receiving rewards and transaction fees for honest participation in the protocol. Smeshers need to know exactly which fees and rewards they were credited with, in which layer they were rewarded, and why they received a specific amount in that layer.

The data required to implement these use cases on the most basic level does not live explicitly in the core Spacemesh distributed data structure we call the mesh due to the design of the Spacemesh consensus protocol.

This document aims to define what data needs to be collected, stored and made available by a Spacemesh node in a way that will enable Spacemesh to be a minimally viable cryptocurrency.

Specifically, this document lays out what data will be stored by a Spacemesh full node in addition to the core mesh data required by the Spacemesh consensus protocol, and how and when these data should be computed by a Spacemesh full node. While there's team consensus about the contents of the mesh, further attention and thought are needed around what data are needed that are not explicitly stored in the mesh.

The Spacemesh data API provides access to these data as mesh data alone is insufficient for many API consumers. A consensus on the above is required to design a functional API and to design API clients such as backup agents, explorers, wallets, and dashboards. All of these clients consume Spacemesh data and most of them require data beyond the basic, raw mesh data.

This formulation is important due to the special and unique design of the Spacemesh consensus protocol and mesh and the implication of the consensus protocol design on creating a functional cryptocurrency based on it.

The Mesh

Core data that the Spacemesh protocol provides decentralized consensus on is called the "mesh." The mesh is made from the contents of numbered layers starting from layer 0 at genesis. Each layer is made of ordered lists of zero or more immutable data structures: blocks that contain transactions. The mesh consists of all layers from genesis to the present, as determined by the Spacemesh consensus protocol.

By design, the mesh doesn’t include any data regarding the execution of transactions (including transaction processing results or side effects) and may include duplicate transactions in the same layer or even across different layers. The goal of the Spacemesh consensus protocol is to get all honest full nodes who participate in the protocol to agree (eventually) on the contents of all layers.

The contents of a layer (and therefore the mesh) can change over time while the Spacemesh protocol is executing. For example, the Tortoise protocol may change the contents of a layer as was determined by the execution of the Hare protocol. A self-healing process may change the content of past layers.

A transaction included in the mesh may fail to execute due to several conditions only known at runtime (i.e., the time when the full node actually attempts to execute the transaction in situ). Transaction processing doesn’t change the immutable transaction stored in the mesh. (See STF, below.)

The mesh doesn't store any additional data besides the contents of layers described above. (TBD - does it include state root hashes of previous layers?)

Global state data

The global state is an abstract data structure that includes all data, essential to a minimally viable cryptocurrency, that are not explicitly stored in the mesh. These data can be computed implicitly from mesh data. In addition to the global state, we also need to consider data structures emitted from the STF as side effects of the process of updating the global state.

Global state of layer n is defined to include the following data:

  1. The balance and nonce (counter) of each account after all txs were processed for this layer.
  2. The state (and metadata) of all app templates and instances deployed on the mesh, including code, balance, storage, etc., after all txs were processed for this layer. Together, the accounts and app state parts of the global state can be called the accounts db.
  3. The results of the execution of each transaction, e.g., insufficient funds, invalid nonce, runtime error, out of gas, executed, etc. (The data structure storing these results is called a transaction receipt.) - a side effect of changes to global state.
  4. Block rewards and transaction fees awarded to smeshers who submitted blocks to the layer - a side effect of changes to global state.
  5. App events: events emitted by execution of apps (see caveat below) - a side effect of smart contract execution.

Full node core data includes both the mesh and the global state and the Spacemesh data API should provide access to both mesh and global state data.
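
As a rough illustration of the five items listed above, the global state of a layer could be modeled as follows. This is only a sketch: the class and field names (`Account`, `GlobalState`, `apps`, and so on) are hypothetical and are not taken from go-spacemesh.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Account:
    balance: int = 0  # coin balance, in smidge
    nonce: int = 0    # transaction counter


@dataclass
class GlobalState:
    """Hypothetical container for the global state of one layer."""
    layer: int
    # Items 1 and 2: the "accounts db" (accounts plus app templates/instances).
    accounts: Dict[str, Account] = field(default_factory=dict)
    apps: Dict[str, dict] = field(default_factory=dict)
    # Items 3 to 5: side effects produced while computing this layer's state.
    receipts: List[dict] = field(default_factory=list)
    rewards: List[dict] = field(default_factory=list)
    events: List[dict] = field(default_factory=list)


state = GlobalState(layer=5)
state.accounts["alice"] = Account(balance=10)
```

Keeping the side effects in separate lists from the accounts db mirrors the distinction drawn above between state proper and data emitted as a side effect of updating it.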

The STF (State Transition Function)

The global state can only be modified by the state transition function (STF). It cannot be modified in any other way. The STF is executed per layer. The global state of layer n is defined as the global state after the STF was executed for layer n.

STF INPUT

  1. 'n' - the layer number the STF should be using as input.
  2. An ordered list of unique transactions created from the contents of the layer. This is the output of the consensus layer. The list must be deterministically sorted and must not contain txs that were already processed in a previous layer.
  3. Layer n's blocks - (used as input to rewards and tx fees processing).
  4. The global state at layer n-1 - this is essentially access to the accounts and app state db as of layer n-1.
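
A minimal sketch of how the transaction list in input 2 might be derived from a layer's blocks. The function name `build_stf_input` is hypothetical, and sorting by transaction id is only a placeholder for whatever deterministic order the protocol actually specifies, which this proposal does not define.

```python
def build_stf_input(layer_blocks, already_processed):
    """Return a deduplicated, deterministically ordered list of tx ids.

    layer_blocks: list of blocks, each block a list of tx ids (strings).
    already_processed: set of tx ids executed by the STF in earlier layers.
    """
    seen = set(already_processed)
    unique = []
    for block in layer_blocks:
        for tx_id in block:
            if tx_id not in seen:  # drops duplicates within and across layers
                seen.add(tx_id)
                unique.append(tx_id)
    # Assumption: lexicographic order by tx id stands in for the real ordering rule.
    return sorted(unique)


txs = build_stf_input([["c", "a"], ["a", "b"]], already_processed={"b"})
```

Here `a` appears in two blocks of the layer and `b` was executed in an earlier layer, so only `a` and `c` remain in the input list.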

STF OUTPUT

The output of the STF is the global state of layer n. This includes all changes to global state data members and new data items created by the STF, e.g., new reward events and new transaction receipts.

STF EXECUTION

  • The STF executes the transactions from the list in sequence and is responsible for generating and storing a transaction receipt for each execution in the global state.

  • The transaction execution initiated by the STF modifies the global state on a per-transaction basis: a transaction b executed after a transaction a sees a global state that includes the changes made by the execution of a.

  • Each transaction is executed based on its type: app transactions are executed by SVM and simple coin transactions by a coin transaction execution function. The STF is responsible for properly executing these transactions by using the SVM runtime and other transaction execution-specific modules.

  • When a transaction fails to execute, any changes it made to the global state should be rolled back, including any account or app state changes and any generated events. Note that a failed transaction may still change the global state because gas will be paid as a tx fee to the smeshers of the blocks of this layer.

  • In addition to processing all the transactions from a list, the STF should compute the rewards and the transaction fees for layer n and update the state of the accounts that should receive rewards and transaction fees as part of the global state computation at layer n. These rewards and fees modify the global state (account balances). Rewards and tx fee computation should happen only after all the txs have been executed so the correct tx fee amounts to be distributed between the layer's smeshers are known.
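
The execution rules above can be sketched for simple coin transfers as follows. This is a sketch under several assumptions, not the go-spacemesh implementation: accounts are plain dicts, each tx carries a flat `fee` standing in for gas price times gas used, a tx with a bad counter or an unpayable fee is not charged, and the layer reward plus fees is split equally among the layer's smeshers with any remainder ignored. All function names are hypothetical.

```python
import copy


def new_account():
    return {"balance": 0, "nonce": 0}


def apply_transfer(accounts, tx):
    """Move tx['amount'] from sender to receiver; raise ValueError on failure."""
    sender = accounts[tx["sender"]]
    if sender["balance"] < tx["amount"]:
        raise ValueError("INSUFFICIENT_FUNDS")
    sender["balance"] -= tx["amount"]
    accounts.setdefault(tx["receiver"], new_account())["balance"] += tx["amount"]


def run_stf(prev_accounts, txs, smeshers, layer_reward):
    """Compute layer n's accounts and receipts from layer n-1's accounts."""
    accounts = copy.deepcopy(prev_accounts)  # never mutate the previous layer's state
    receipts, total_fees = [], 0
    for index, tx in enumerate(txs):
        sender = accounts.setdefault(tx["sender"], new_account())
        if tx["counter"] != sender["nonce"]:
            receipts.append({"id": tx["id"], "result": "BAD_COUNTER", "fee": 0, "index": index})
            continue
        if sender["balance"] < tx["fee"]:
            receipts.append({"id": tx["id"], "result": "INSUFFICIENT_FUNDS", "fee": 0, "index": index})
            continue
        # The fee and nonce bump survive a later failure of the tx itself.
        sender["balance"] -= tx["fee"]
        sender["nonce"] += 1
        total_fees += tx["fee"]
        snapshot = copy.deepcopy(accounts)
        try:
            apply_transfer(accounts, tx)
            result = "EXECUTED"
        except ValueError as err:
            accounts = snapshot  # roll back everything the failed tx changed
            result = str(err)
        receipts.append({"id": tx["id"], "result": result, "fee": tx["fee"], "index": index})
    # Rewards and fees are distributed only after all txs have been executed.
    if smeshers:
        share = (layer_reward + total_fees) // len(smeshers)
        for smesher in smeshers:
            accounts.setdefault(smesher, new_account())["balance"] += share
    return accounts, receipts


prev = {"alice": {"balance": 100, "nonce": 0}}
layer_txs = [
    {"id": "t1", "sender": "alice", "receiver": "bob", "amount": 30, "fee": 2, "counter": 0},
    {"id": "t2", "sender": "alice", "receiver": "bob", "amount": 500, "fee": 2, "counter": 1},
]
accounts, receipts = run_stf(prev, layer_txs, smeshers=["s1", "s2"], layer_reward=10)
```

In this run the second transfer fails and is rolled back, yet its fee is still charged and ends up in the pot shared by the two smeshers.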

Design Considerations

  1. An STF for a specific layer number may be executed more than once because a layer’s contents may change (see The Mesh, above). For example, the STF is executed when the Hare protocol agrees on the contents of a layer but will be executed again if the Tortoise protocol or self healing changes the contents of that layer.

  2. The state root hash is a hash of the global state of a layer computed by the STF. It is a compact binary representation of the global state of a layer. It is not yet clear if the mesh will include state root hashes for previous layers in ATXs or blocks (see discussion here). If it is included then the STF can compare its own computed state root hash with the state root hash from the mesh and alert the user if there’s a mismatch. This is not as strong as the state validation features provided by most other major cryptocurrencies, but it can help to more quickly identify full node bugs and runtime errors.

  3. Including the results of transaction execution (receipts) in the global state is an important product requirement for a cryptocurrency. Without these data, users have no way to know if a transaction they (or a transaction counterparty) sent to the network was successfully executed or not, and if there was an error, what the error was (e.g., insufficient funds at execution time, a runtime error, etc.).

    • Empowering users to use a full node to know the results of their transactions instead of relying on a trusted third party and "bank teller" is arguably one of the most basic features of a cryptocurrency - a distributed transaction processing system where users submit transactions and can tell the effect of their transactions on coin balances and app state. There is no major cryptocurrency design which does not have this basic feature. It is hard to imagine from a user perspective a cryptocurrency without this feature. It will certainly not be very useful or usable to end users.

    • One of the most important features of crypto wallets is the ability to display the results of any transaction that touched the user's account. This has implications on the global state data and the Spacemesh data API design. It is quite hard to imagine a cryptocurrency that can’t tell its users the results of the most basic actions these users are interested in: executing basic financial transactions.

  4. App events are only emitted via a full node low-level API as they are created, and are not stored in global state. Maybe we can exclude them from the definition of the global state and think about them as emitted metadata by the STF that can be stored by full node clients if they subscribe to it (Tal's suggestion).
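
The state root comparison described in item 2 could look roughly like this. The sketch assumes a flat SHA-256 over a canonical JSON encoding of the accounts; the proposal does not specify the hashing scheme, and a real design would more likely use a Merkle-style structure. `state_root` and `check_state_root` are hypothetical names.

```python
import hashlib
import json


def state_root(accounts):
    """Hash a canonical (key-sorted, whitespace-free) encoding of the accounts."""
    canonical = json.dumps(accounts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_state_root(accounts, expected_root):
    """Return True if no root is available from the mesh or if the roots match."""
    if expected_root is None:  # the mesh may not carry state roots at all
        return True
    return state_root(accounts) == expected_root


a = {"alice": {"balance": 1, "nonce": 0}, "bob": {"balance": 2, "nonce": 3}}
b = {"bob": {"nonce": 3, "balance": 2}, "alice": {"nonce": 0, "balance": 1}}
same = state_root(a) == state_root(b)
```

The canonical encoding matters: two nodes holding the same state must produce the same hash regardless of the order in which they inserted entries.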

Open Questions

  1. What global state data do full nodes store? What data do archive nodes store?
    • Do we want archive nodes? What is an archive node? (in short: it stores all intermediate state)
    • Is global state monolithic or are there "shades" of global state, e.g., "light node", "full node", "archive node", "super archive node", etc.?
    • When and how do nodes prune old data? Consideration: pruning transaction receipts means wallets lose the ability to display old transaction execution results.

Answer: there are both archive nodes and full nodes. Full nodes only store side effects (rewards, receipts, events) for up to a fixed number of layers back from the current layer, while archive nodes store all global state side effects from genesis.

  2. What happens to transaction receipts created by the STF in a specific layer when it is executed again on the same layer? There needs to be a way to mark old receipts as invalid/superseded or discard them as each executed transaction has a new receipt in the new STF execution.

Answer: Properly handling reorgs requires additional design review.

  3. Should transactions already executed by the STF in a previous layer be pruned from the input to an STF for a later layer in the case that the same transaction is also in the layer contents for that layer? If not, is the STF responsible for not executing a transaction again in this case, or should it create an error receipt for this transaction?

Answer: yes, each layer tx input list must not include a tx that was in an input list in a previous layer.

  4. Wallets need to be smart enough to inspect more than one transaction receipt for a transaction and display the relevant information to the user. How do they do this?

Answer: with the design proposed above, in which full nodes hold the side effects of recent layers.

  5. Do we want a single, unified account type (“account abstraction”)? See comments below.

Answer: Lane to further discuss with Iddo and make a recommendation.

  6. Does the mesh store the state hash for the global state of a previous layer? If not, how does a node know it has the correct, canonical global state as of a given layer? (See State root checkpointing research#45)

Answer: currently not. The mesh doesn't store any global state data.
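
The full node versus archive node answer to question 1 amounts to a retention window over side effects. A minimal sketch, assuming each stored side effect is a dict with a `layer_number` key; the function name and the `retention_layers` parameter are hypothetical, and the actual window size is not specified in this proposal.

```python
def prune_side_effects(side_effects, current_layer, retention_layers=None):
    """Keep only side effects from the most recent `retention_layers` layers.

    retention_layers=None models an archive node, which keeps everything.
    """
    if retention_layers is None:
        return list(side_effects)
    cutoff = current_layer - retention_layers  # layers at or below this are dropped
    return [item for item in side_effects if item["layer_number"] > cutoff]


stored = [{"layer_number": n} for n in (1, 5, 9, 10)]
full_node_view = prune_side_effects(stored, current_layer=10, retention_layers=3)
archive_view = prune_side_effects(stored, current_layer=10)
```

With a window of 3 layers at layer 10, the full node keeps only the side effects of layers 8 to 10, so a wallet asking it about a receipt from layer 5 would need an archive node instead.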

Transaction Status

  • Note that the term "transaction status" at present may be used to refer to three different kinds of concerns:

    1. The status of a transaction on its journey to the mesh via a full node, a mempool, and inclusion in a block (rejected by mempool, pending in mempool, included in a block, confirmed, etc.)
    2. The status of the layer which includes a block with the transaction (confirmed, approved, etc.)
    3. The actual transaction processing status that resulted from execution of the transaction (succeeded, failed due to lack of funds, runtime error, etc.)
  • We propose to refactor transaction result out of transaction state as follows:

  1. Transaction State

The TransactionState is computed by a full node pre-STF and doesn't reflect any transaction processing by the STF, only transaction processing as far as the mempool and the mesh are concerned.

enum TransactionStateType {
        UNKNOWN = 0; // default state
        REJECTED = 1; // rejected pre-STF processing due to, e.g., invalid syntax
        INSUFFICIENT_FUNDS = 2; // rejected pre-STF processing by funds check
        CONFLICTING = 3; // rejected pre-STF due to conflicting counter
        MEMPOOL = 4; // in mem-pool but not on the mesh yet
        MESH = 5; // included in a layer on the mesh. Pending STF processing
}

  2. Transaction Result

TransactionResult encapsulates the results of transaction execution by the STF. The result can only be known once an STF attempted to execute the transaction.

enum TransactionResult { // the results of STF transaction processing
        UNDEFINED = 0;
        EXECUTED = 1; // executed w/o error by the STF
        BAD_COUNTER = 2; // unexpected transaction counter
        RUNTIME_EXCEPTION = 3; // app code exception
        INSUFFICIENT_GAS = 4; // out of gas
        INSUFFICIENT_FUNDS = 5; // failed due to sender's insufficient funds
}

The TransactionResult is one field of a TransactionReceipt. A TransactionReceipt also includes the number of the layer in which the transaction was executed. The receipt is the only data structure that includes execution side effects such as gas used and the transaction fee charged. Note that one transaction may result in multiple receipts if the layer changes and the STF is rerun later (as described above), or if the transaction is included in multiple layers.

message TransactionReceipt {
    TransactionId id = 1; // the source transaction
    TransactionResult result = 2; // tx processing result
    uint64 gas_used = 3; // gas units used by the transaction
    Amount fee = 4; // transaction fee charged for the transaction (in smidge, gas_price * gas_used)
    uint64 layer_number = 5; // the layer in which the STF processed this transaction
    uint32 index = 6; // the index of the tx in the ordered list of txs to be executed by stf in the layer.
    AccountId app_address = 7; // deployed app address or code template address
}

// An immutable Spacemesh transaction.
// Does not include mutable data such as tx state or result.
message Transaction {
    TransactionId id = 1;
    oneof data {
        CoinTransferTransaction coin_transfer = 2;
        SmartContractTransaction smart_contract = 3;
    }
    AccountId sender = 4; // tx originator, should match signer inside Signature
    GasOffered gas_offered = 5; // gas price and max gas offered
    Amount amount = 6; // amount of coin transferred in this tx by sender
    uint64 counter = 7; // tx counter aka nonce
    Signature signature = 8; // sender signature on transaction
}
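
To illustrate how the receipt fields relate to the transaction's gas offer, here is a hedged sketch of receipt construction. The `fee = gas_price * gas_used` rule comes from the comment on the `fee` field above; everything else is an assumption, including the dict layout, the `gas_price` and `max_gas` keys inside `gas_offered`, and the choice to cap the charged gas at the offered maximum when a tx runs out of gas.

```python
def make_receipt(tx, gas_used, layer_number, index, failed_with=None):
    """Build a TransactionReceipt-like dict for one STF execution of `tx`."""
    offered = tx["gas_offered"]
    if gas_used > offered["max_gas"]:
        result = "INSUFFICIENT_GAS"
        gas_used = offered["max_gas"]  # assumption: never charge more than was offered
    elif failed_with is not None:
        result = failed_with  # e.g. "RUNTIME_EXCEPTION"
    else:
        result = "EXECUTED"
    return {
        "id": tx["id"],
        "result": result,
        "gas_used": gas_used,
        "fee": offered["gas_price"] * gas_used,  # in smidge
        "layer_number": layer_number,
        "index": index,
    }


tx = {"id": "t1", "gas_offered": {"gas_price": 3, "max_gas": 10}}
ok = make_receipt(tx, gas_used=4, layer_number=7, index=0)
out_of_gas = make_receipt(tx, gas_used=25, layer_number=7, index=1)
```

Because the transaction itself is immutable, all of these values live only in the receipt, and a rerun of the STF for the same layer would produce a fresh receipt rather than modify the transaction.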

@avive
Contributor Author

avive commented May 19, 2020

moved from branch

@lrettig lrettig changed the title Meta-mesh design review Global state design review May 26, 2020
@avive
Contributor Author

avive commented May 29, 2020

Discussion: Account Abstraction.

  • All transactions, including simple coin transactions, are smart contract txs, i.e., executed by the VM. Examples: eth 2.0, Libra Move VM.

  • Only one type of account in global state rather than two (e.g., users and smart contracts). A smart contract account can have no code. In this case, funds can only be moved in and out of that account via a simple transfer tx that runs in the VM.

  • Pros: devs and users don't need to worry about type of on-mesh accounts. Simplified transaction format - there are only smart contract transactions and atxs instead of 3 types: atxs, simple coin transfer and smart contract txs. Other???

  • Cons: might be expensive to use SVM runtime to execute simple coin transactions. Other ???

  • Simple coin tx using ED25519++ is packed and small by design (100 bytes). It might be larger as a standard SVM transaction and therefore may cause higher mesh growth rate. It is critical that we keep coin transactions packed as small as possible, since they are the fundamental transaction type in the system, and even with heavy smart contract use at least 50% of txs are expected to be of the simple kind.

@lrettig
Member

lrettig commented May 29, 2020

A smart contract account can have no code

I think you meant the opposite here, a "user" account can have no code.

Actually, the way I think about it, all accounts have code: at minimum, a "basic user account" uses a very simple template with send, receive, and withdraw functionality, using ED25519. Think about it like a multisig contract with only one signer. (Since the code is in fact contained in a template, there's only a pointer, so the actual on-mesh account size is tiny, only a few bytes.) CC @YaronWittenstein

Agree on the pros. I'd add that a user can implement whichever signature scheme they want - they're not bound to ED25519.

Cons: might be expensive to use SVM runtime to execute simple coin transactions.

The actual code can be implemented as a precompile, i.e., still run natively in go-spacemesh. It doesn't need to run in the VM. There would be a minimal amount of overhead associated with context switching into and out of the VM, but I think it's manageable, and even this, we could probably work around cleverly if necessary (e.g., we could probably avoid calling into the VM at all for the most common templates - the entire template could become a precompile).

Simple coin tx using ED25519++ is packed and small by design (100 bytes). It might be larger as a standard SVM transaction and therefore may cause higher mesh growth rate.

I think it's worth digging a little deeper into this - how compact we could make transactions of this sort, and how much overhead there might be with account abstraction.

Another downside here is that it makes metering a little more complicated. Typically, if the sig check fails, gas is not charged for a tx - but if you let the user write the sig check code, you have a catch-22. There are lots of ways to work around this (provide a hardcoded set of trusted sig check templates, refuse to perform more than a certain basic amount of work before charging gas, etc.).

@lrettig
Member

lrettig commented Jun 6, 2020

For the definition of the STF, I had something more like this in mind: https://community.spacemesh.io/t/formal-specification-of-stf/88.

@lrettig lrettig changed the title SMIP-0003: Global state data + apis SMIP-0003: Global state data, STF, APIs Jun 7, 2020
@lrettig lrettig moved this from In progress to Review in API Jun 12, 2020
@lrettig lrettig moved this from Review to Done in API Jul 24, 2020
This was referenced Dec 23, 2020