
Conservative State Tracking #3094

@noamnelke

A.K.A. Pending Transactions.

Motivation

The node must maintain an index of pending transactions (known, applicable transactions that have not been executed yet) in order to calculate the conservative nonce and balance of accounts that might be principal accounts. This information is used when processing gossiped transactions: if the conservative balance of the principal account cannot cover the transaction's gas cost, or the conservative account nonce indicates that the transaction's nonce is invalid, the node should refrain from relaying the transaction or storing it in its mempool. Likewise, layer committee members should only include in their proposals transactions that pass nonce and balance checks against their principal account's conservative state.

Principles of Conservative State

  1. Transactions are treated asymmetrically. We only consider the money going out of each account and not money coming in. This makes the state "conservative". If we counted incoming transactions and they eventually failed, we would have admitted transactions that we shouldn't have. On the other hand, if we count a spending transaction and it fails, the worst case is that a subsequent transaction has to wait a few more layers unnecessarily.
  2. Transactions are maxed out for conservative balance tracking. When gas metering is enabled in the future, the max_gas will be deducted from the conservative balance. In addition, the principal interface must support a max_balance_transfer(transaction) method, whose output is also deducted. If this amount is exceeded at execution time, the transaction fails and pays the max_gas.

Implementation Details

The Spacemesh protocol weakly assumes that nodes keep transactions when they are first published. Thus, Spacemesh proposals and blocks don't contain actual transactions, only references to them. If a node is missing a referenced transaction it can request it from its peers, but if many nodes had to sync most of the transactions referenced in proposals during a Hare run, it would put considerable strain on the gossip network and jeopardize the success of the Hare protocol, which is sensitive to timing.

For this reason, mempool transactions should be persisted on disk (we use the name "mempool" loosely).

Storing Transactions and Maintaining Their State

Transactions should be stored in a transactions SQLite table. The table should contain the following fields:

  • transaction_id: 20-byte prefix of the SHA-256 sum.
  • principle_address: The source of funds for the transaction.
  • nonce: 128-bit nonce value of the transaction.
  • raw: The raw serialized transaction bytes.
  • applied: Integer indicating the layer in which this transaction has been applied to global state (or NULL).

The following indices should be defined:

  • [transaction_id]
  • [applied, transaction_id] (partial index where applied IS NULL)
  • [principle_address, nonce]
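
The table and indices above could be sketched with Python's built-in sqlite3 module as follows. This is only an illustration: the column types, the index names, the helper functions and the in-memory database are assumptions, not part of the proposal. The nonce is stored as a 16-byte big-endian BLOB (an assumed encoding) because SQLite integers are limited to 64 bits, and fixed-width big-endian bytes sort in numeric order.

```python
import sqlite3

SCHEMA = """
CREATE TABLE transactions (
    transaction_id    BLOB PRIMARY KEY,  -- 20-byte prefix of the SHA-256 sum (PRIMARY KEY creates the [transaction_id] index)
    principle_address BLOB NOT NULL,     -- source of funds for the transaction
    nonce             BLOB NOT NULL,     -- 128-bit nonce, 16 bytes big-endian (assumed encoding)
    raw               BLOB NOT NULL,     -- raw serialized transaction bytes
    applied           INTEGER            -- layer in which the tx was applied, or NULL
);
-- Partial index: only covers transactions that have not been applied yet.
CREATE INDEX transactions_unapplied
    ON transactions (applied, transaction_id) WHERE applied IS NULL;
CREATE INDEX transactions_by_principal
    ON transactions (principle_address, nonce);
"""

def open_db(path=":memory:"):
    """Open a database and create the (hypothetical) schema."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def encode_nonce(nonce: int) -> bytes:
    """Encode a 128-bit nonce so that byte order matches numeric order."""
    return nonce.to_bytes(16, "big")
```

Because the partial index only contains rows where applied IS NULL, its size tracks the number of unapplied transactions rather than the full history.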

Relationships

Given a transaction we need to be able to find:

  • Blocks it appears in.
  • "Live" proposals it appears in.

In a relational database system the standard way to represent these many-to-many relationships is in dedicated tables.

Table Name            | Fields                            | Indices
block_transactions    | block_id, transaction_id, applied | [applied, transaction_id], [block_id]
proposal_transactions | proposal_id, transaction_id       | [proposal_id], [transaction_id]

The proposal_transactions table should only refer to "live" proposals, i.e. those that belong to a currently running Hare round. When a Hare process completes, its proposals should be pruned, so joins on this table are not expected to be resource intensive.

The block_transactions table, on the other hand, is expected to grow indefinitely, so joins on it are impractical without pre-filtering. To keep joins performant, we record whether a block has been applied in the relationship table as well as in the transactions table. The applied field in both tables enables a quick search for unapplied transactions by restricting the join to rows where it is unset: transactions.applied IS NULL (the field holds the applied layer or NULL) AND block_transactions.applied = false. This subset of both tables is expected to remain small and relatively constant, since it reflects the mempool size rather than the size of the transaction history.
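
A minimal sketch of such a pre-filtered join, again using Python's sqlite3 module. The trimmed-down tables, the index names and the unapplied_in_blocks helper are hypothetical. It assumes that transactions.applied is a nullable layer number and that block_transactions.applied is stored as a 0/1 integer flag.

```python
import sqlite3

def open_db():
    """Create trimmed-down versions of the two tables (illustrative only)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
    CREATE TABLE transactions (
        transaction_id BLOB PRIMARY KEY,
        applied        INTEGER              -- applied layer, or NULL
    );
    CREATE TABLE block_transactions (
        block_id       BLOB NOT NULL,
        transaction_id BLOB NOT NULL,
        applied        INTEGER NOT NULL DEFAULT 0  -- assumed 0/1 flag
    );
    CREATE INDEX block_transactions_applied
        ON block_transactions (applied, transaction_id);
    CREATE INDEX block_transactions_block
        ON block_transactions (block_id);
    """)
    return db

def unapplied_in_blocks(db):
    """Return (transaction_id, block_id) pairs where neither side is applied."""
    return db.execute("""
        SELECT t.transaction_id, bt.block_id
        FROM transactions AS t
        JOIN block_transactions AS bt
          ON bt.transaction_id = t.transaction_id
        WHERE t.applied IS NULL AND bt.applied = 0
        ORDER BY t.transaction_id, bt.block_id
    """).fetchall()
```

Both predicates can be answered from the [applied, transaction_id] indices, so the join should only touch rows that belong to unapplied transactions and blocks.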

Transaction States and Transitions

🟪 Pending

A transaction that has been received over gossip and is syntactically valid, but its nonce and the principal's ability to cover gas have not been verified yet.

🟫 Mempool
Validation passed. This transaction is a candidate for the next block.

🟥 Discarded
Validation failed. Possible reasons: (1) Bad nonce (2) Insufficient balance to cover gas (3) Higher fee transaction is known.

🟫 Mempool

A candidate for adding to a proposal. We don't store competing transactions here, only the most likely version.

🟦 Proposal
A new proposal becomes known (created locally or received over gossip) that includes this transaction.

🪝HOOK Process new proposal.

🟥 Discarded
A new transaction is received that invalidates this transaction. Possible reasons: (1) New transaction has a lower nonce and doesn't leave enough balance in the account to cover this transaction's fee (2) New transaction has the same nonce and pays a higher fee.

🪝HOOK Process new transaction.

A block is applied that includes a transaction that invalidates this transaction. Possible reasons: (1) The block includes a transaction that invalidates this transaction's nonce (higher TTL value) (2) The block includes a transaction with a lower/equal nonce value that is not in the mempool and it doesn't leave enough balance in the account to cover this transaction's fee.

🪝HOOK Apply block.

Hare has finished accepting proposals for the layer in which the transaction's TTL value expires or we've processed a block for this layer.

🪝HOOK Hare pre-round completed.
🪝HOOK Apply block.

🟦 Proposal

A transaction that's included in a proposal while the Hare is in progress.

🟨 Block
Hare certifies a block that includes this transaction or a ballot votes for such a block.

🪝HOOK Process new block.

🟪 Pending
Hare converges on a set of proposals that excludes all proposals that reference this transaction. The transaction's validity is re-evaluated.

🪝HOOK Hare agrees on a set of proposals.

🟨 Block

A transaction that's included in a block that has not been applied yet (e.g. because Hare has not yet converged for an earlier layer).

🟩 Applied
Transactions should normally skip the "Block" state, as in the optimistic case (when all previous layers have been applied) transactions are executed during block construction. If there's no consensus about a previous layer, a block may be constructed but not applied.

🪝HOOK Apply block.

🟪 Pending
While rare, a block may end up being rejected, mostly in an adversarial setting. This can happen when Hare certifies one block while a minority of ballots vote for another, or when the Hare output is unknown and a majority of ballots vote for one block and not another. In such cases, transactions that were included in the rejected block but not in the accepted block should be reconsidered for the mempool (unless they are already in a new proposal for a later layer).

🪝HOOK Apply block.

🟩 Applied

The final state of a transaction. A transaction can only leave this state if self-healing occurs.

🟥 Discarded

A transaction that's been rejected and discarded. This can be permanent (invalid nonce) or temporary (insufficient balance to cover gas).

A discarded transaction may be rebroadcast if it becomes relevant again in the future.

General Notes

A transaction always gets the highest possible state. E.g. if a transaction is in a block and for some reason it's also included in a proposal - the state is "block".
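
The "highest possible state" rule can be illustrated with an ordered enum. This is a sketch under assumptions: the numeric ordering of the states and the effective_state helper are hypothetical; the text above only states that "block" outranks "proposal".

```python
from enum import IntEnum

class TxState(IntEnum):
    # The numeric order is an assumption made for illustration.
    DISCARDED = 0
    PENDING = 1
    MEMPOOL = 2
    PROPOSAL = 3
    BLOCK = 4
    APPLIED = 5

def effective_state(base, in_live_proposal=False, in_unapplied_block=False, applied=False):
    """Return the highest state that the known facts about a transaction allow."""
    candidates = [base]
    if in_live_proposal:
        candidates.append(TxState.PROPOSAL)
    if in_unapplied_block:
        candidates.append(TxState.BLOCK)
    if applied:
        candidates.append(TxState.APPLIED)
    return max(candidates)
```

For example, effective_state(TxState.MEMPOOL, in_live_proposal=True, in_unapplied_block=True) yields TxState.BLOCK.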

Conflicting transactions must be handled properly. If an inferior competing transaction is observed in a proposal or block, we don't want to discard the superior transaction until the inferior one is applied. This is to prevent censorship or manipulation by creating bad proposals or blocks. E.g. any smesher can create a block with arbitrary content and vote for it, disregarding the Hare. Such a block will never be applied, as other smeshers will vote against it, but if we discarded all conflicting transactions from the mempool because of this block, it would have real power.

Data in the Conservative State

Conservative state fields for each account are Nonce and Balance.

Readers should be able to read conservative state for an account even if there are no pending transactions. In that case they will simply see the actual nonce and balance from the global state (the result of actually applying transactions).

When to Aggregate

It should be clear by now that every change to the set of known transactions, proposals, blocks or the global state likely results in a change to the conservative state. This means that if we kept an aggregated conservative diff over the global state, we would need to update it constantly. These updates are not trivial, since in many cases we'd need to recalculate the entire conservative state, re-assessing each transaction. To avoid doing calculations preemptively when we might not need the result, while still making the result quick to obtain, we do a partial aggregation: we cache the relevant information in memory in a way that makes common operations quick and efficient.

Structure of Cached Data

The conservative state cache should be stored according to the following hierarchy:

principal
 ↳ layer
   ↳ nonce
     ↳ (transaction_id, c_amount, c_fee)

By putting the principal at the root, when querying conservative state, we can quickly know if there is any diff from the global state for that principal. We then have access to all relevant transactions for that principal.

By putting the layer next, we can process transactions layer-by-layer, as we assume they will be applied. This assumption has its downsides, but it considerably simplifies things. It can't easily be abused, since an adversary can only use transactions signed by the principal for this purpose. Each transaction will only be stored in the lowest layer that it's referenced in. Transactions that are still in the mempool will be stored under a preset layer value (e.g. nil) and processed last. When a transaction gets evicted from a layer without being applied, we must check if it also appears in a proposal or block of a later layer and move it into that layer (transactions that have been applied should not appear in this structure at all).

By putting the nonce next, we can easily process transactions in the correct order, considering all candidates for a given nonce.
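
The hierarchy could be held in nested dictionaries, roughly as sketched below. The ConservativeCache class, its method names and the use of None as the preset mempool layer value are assumptions made for illustration. The sketch only covers insertion and the "lowest layer wins" rule, not eviction or re-homing into a later layer.

```python
from collections import defaultdict

MEMPOOL = None  # assumed preset layer value for transactions still in the mempool

def _rank(layer):
    """Order layers numerically, with the mempool after every real layer."""
    return float("inf") if layer is MEMPOOL else layer

class ConservativeCache:
    def __init__(self):
        # principal -> layer -> nonce -> [(transaction_id, c_amount, c_fee)]
        self.by_principal = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
        self.location = {}  # transaction_id -> (principal, layer, nonce)

    def add(self, principal, layer, nonce, tx_id, c_amount, c_fee):
        """Store a transaction under the lowest layer it is referenced in."""
        known = self.location.get(tx_id)
        if known is not None:
            if _rank(known[1]) <= _rank(layer):
                return  # already stored under an equal or lower layer
            self._remove(tx_id)
        self.by_principal[principal][layer][nonce].append((tx_id, c_amount, c_fee))
        self.location[tx_id] = (principal, layer, nonce)

    def _remove(self, tx_id):
        principal, layer, nonce = self.location.pop(tx_id)
        layers = self.by_principal[principal]
        entries = [e for e in layers[layer][nonce] if e[0] != tx_id]
        if entries:
            layers[layer][nonce] = entries
        else:
            del layers[layer][nonce]
        if not layers[layer]:
            del layers[layer]
        if not layers:
            # keep "principal in cache" meaningful as a has-diff check
            del self.by_principal[principal]
```

With empty containers pruned, a membership test on by_principal tells a reader whether any diff from the global state exists for that principal.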

For transactions that are associated with a layer there can be multiple competing versions for a given nonce, but in the mempool we only store one chosen candidate (the one with the highest fee). Any candidate transaction we hear about that doesn't pay a higher fee than another known transaction with the same nonce gets dropped entirely and is not even relayed. If a better transaction comes along, we drop the previous one (both from the conservative state data structure and the transactions table).
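
The mempool replacement rule might look like the following sketch. The Tx type, its field names and the offer function are hypothetical; the mempool is modelled as a plain dictionary keyed by (principal, nonce), and deleting the evicted transaction from the transactions table is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    tx_id: str
    principal: str
    nonce: int
    fee: int

def offer(mempool, tx):
    """Keep only the highest-fee candidate per (principal, nonce).

    Returns True if tx was accepted (and may be relayed), False if it was dropped.
    """
    key = (tx.principal, tx.nonce)
    known = mempool.get(key)
    if known is not None and tx.fee <= known.fee:
        return False  # does not pay a strictly higher fee: drop, do not relay
    mempool[key] = tx  # replaces (evicts) the previous candidate, if any
    return True
```

Note that a transaction with an equal fee is dropped, since it does not pay a higher fee than the known one.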

c_amount is the conservative amount. It's the maximal amount the transaction can cause the account to spend. To find this value we call a method on the SVM module with the transaction. In the future, this method will execute a special method that the template exposes to know this value. For now, we should implement a method that just returns the amount that's specified in the transaction.

c_fee is the conservative fee. For now it's just gas_price * max_gas but we should also create an SVM method to do this calculation, because in the future these values will not necessarily be explicit in the transaction and could also be returned by the template.
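
As a sketch of the interim behaviour described above, the two values could be computed as follows. The Tx type and the function names are hypothetical stand-ins for the SVM methods, which are not specified here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    amount: int     # amount specified in the transaction
    gas_price: int
    max_gas: int

def conservative_amount(tx):
    """Interim c_amount: just the amount specified in the transaction."""
    return tx.amount

def conservative_fee(tx):
    """Interim c_fee: gas_price * max_gas."""
    return tx.gas_price * tx.max_gas

def max_spend(tx):
    """Total deducted from the conservative balance for this transaction."""
    return conservative_amount(tx) + conservative_fee(tx)
```

For a transaction with amount 100, gas_price 2 and max_gas 50, both c_amount and c_fee are 100, so 200 is deducted from the conservative balance.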

Aggregating Conservative State

When aggregating the conservative state, we always start by querying the actual global nonce and balance for the required principal. We must also know which layer was applied last and what the current layer is.

def conservative_state(principal):
	# Start from the actual (applied) global state.
	nonce, balance = get_from_global_state(principal)
	if principal not in conservative_cache:
		return nonce, balance  # no pending diff for this principal
	# Process unapplied layers in order, up to and including the current layer.
	for layer in range(last_applied_layer + 1, current_layer + 1):
		nonce, balance = apply_layer(conservative_cache[principal].get(layer, {}), nonce, balance)
	# Mempool transactions are processed last.
	return apply_layer(conservative_cache[principal].get("mempool", {}), nonce, balance)

def apply_layer(txs_by_nonce, nonce, balance):
	while (nonce + 1) in txs_by_nonce:  # stop when there's no consecutive nonce in this layer
		# assuming txs are ordered by c_fee, descending - otherwise, need to sort txs first
		for tx in txs_by_nonce[nonce + 1]:
			if balance >= tx.c_amount + tx.c_fee:
				nonce += 1
				balance -= tx.c_amount + tx.c_fee
				break  # out of the for loop; move on to the next nonce
		else:
			break  # no candidate is affordable; stop here to avoid looping forever
	return nonce, balance

This means we greedily apply transactions to the conservative state for as long as the balance covers each transaction's conservative amount and fee, without allowing nonce gaps.
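
A self-contained worked example of this greedy aggregation for a single principal is sketched below. It makes several assumptions: the cache is passed in as a plain dictionary, the global state is passed as arguments instead of being looked up, unapplied layers up to and including the current layer are processed, candidates are sorted by c_fee on the fly, and the loop stops when no candidate for the next nonce is affordable. CachedTx and the sample values are hypothetical.

```python
from collections import namedtuple

CachedTx = namedtuple("CachedTx", "tx_id c_amount c_fee")
MEMPOOL = "mempool"  # assumed preset layer key

def apply_layer(txs_by_nonce, nonce, balance):
    """Greedily apply consecutive nonces while the balance covers them."""
    while nonce + 1 in txs_by_nonce:
        # Try the candidate with the highest conservative fee first.
        candidates = sorted(txs_by_nonce[nonce + 1], key=lambda t: t.c_fee, reverse=True)
        for tx in candidates:
            cost = tx.c_amount + tx.c_fee
            if balance >= cost:
                nonce += 1
                balance -= cost
                break
        else:
            break  # no affordable candidate for the next nonce
    return nonce, balance

def conservative_state(layers, nonce, balance, last_applied_layer, current_layer):
    """Aggregate one principal's cached layers on top of its global state."""
    for layer in range(last_applied_layer + 1, current_layer + 1):
        nonce, balance = apply_layer(layers.get(layer, {}), nonce, balance)
    return apply_layer(layers.get(MEMPOOL, {}), nonce, balance)

layers = {
    11: {6: [CachedTx("a", 30, 5), CachedTx("b", 30, 8)]},  # two competing txs
    12: {7: [CachedTx("c", 500, 1)]},                       # unaffordable
    MEMPOOL: {7: [CachedTx("d", 10, 2)]},
}
state = conservative_state(layers, nonce=5, balance=100,
                           last_applied_layer=10, current_layer=12)
print(state)  # (7, 50)
```

In layer 11 the higher-fee candidate "b" is taken (cost 38), "c" in layer 12 is skipped because it cannot be afforded, and the mempool candidate "d" (cost 12) then fills nonce 7, leaving a conservative nonce of 7 and a balance of 50.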
