# <center>Ethereum Yellow Paper</center>
<center>Summary/Code for Shanghai version</center>

### References:

* [Ethereum Yellow Paper - Shanghai Version](https://github.com/ethereum/yellowpaper/tree/f3553dd559f574f061037718512444abc53bbd40)
* [Ethereum Execution Spec](https://github.com/ethereum/execution-specs/tree/1adcc1bfe774798bcacc685aebc17bd9935078c3)

## 1. Introduction

### Bitcoin:
* introduced consensus mechanism with voluntary respect of the social contract
* decentralized value-transfer system, shared across the world
* a very specialised version of a secure, transaction-based state machine

### Ethereum:
* Platform to build **ANY** transaction-based state machine
* Provides developers with a tightly integrated, end-to-end system for building software on a new compute paradigm
* That paradigm: a trustful object messaging compute framework

## 2. Blockchain
* Ethereum: Transaction-based state machine
* State may include: account balances, reputations, trust arrangements, external data, etc.
    * Anything that can be represented by a computer is admissible
* Transaction: Any **valid** arc/transition/change between two states
    *  invalid changes don't count

* Transactions are collated into blocks
* Blocks are chained together using a cryptographic hash
* Blocks function as a journal, recording:
    * Previous block
    * Transactions
    * Identifier for the final state

### 2.1 Value

* In order to incentivise computation within the network, there needs to be an agreed method for transmitting value
* Ethereum: **Ether - ETH - [Ð](https://en.wikipedia.org/wiki/Eth)**
* The smallest subdenomination of Ether, and thus the one in which all integer values of the currency are counted, is the *Wei*
```
10^0 Wei  = Wei
10^9 Wei  = GWei
10^18 Wei = Ether
```
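Because every balance is an integer number of Wei, unit conversion is plain integer arithmetic. A minimal sketch, assuming hypothetical helper names that are not part of the execution spec:

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def gwei_to_wei(gwei: int) -> int:
    """Convert a whole number of Gwei to Wei."""
    return gwei * WEI_PER_GWEI


def ether_to_wei(ether: int) -> int:
    """Convert a whole number of Ether to Wei."""
    return ether * WEI_PER_ETHER


def format_wei_as_ether(wei: int) -> str:
    """Render a Wei amount as a decimal Ether string without using floats."""
    whole, fraction = divmod(wei, WEI_PER_ETHER)
    return f"{whole}.{fraction:018d}"
```

Staying in integers avoids the rounding errors that floating-point Ether amounts would introduce.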

### 2.2. History

* Decentralized -> Tree of blocks
* Consensus: There must be an agreed-upon scheme for choosing the path through the tree
    * Disagreement -> causes a **fork**
* Since the *Paris* fork, consensus is reached through the *Beacon Chain*
    * Separation of *consensus* layer & *execution* layer

* There are many versions of Ethereum because the protocol is updated over time
* Each update activates at an agreed condition: block number (pre-Paris), total difficulty (Paris), timestamp (post-Paris)
* Link to [Ethereum Protocol Release List](https://github.com/ethereum/execution-specs/blob/master/README.md#ethereum-protocol-releases)
* Sometimes people don't agree on a protocol change
    *  [EIP-155](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-155.md) - introduces the concept of a chain ID
    *  Mainnet chain ID is 1

## 3. Conventions

* The Ethereum Yellow Paper specifies a list of typographical conventions for its formal notation
* This summary will instead link to the class name and implementations in the [Ethereum Execution Spec](https://github.com/ethereum/execution-specs/tree/1adcc1bfe774798bcacc685aebc17bd9935078c3)

## 4. Blocks, State, and Transactions Details 

### 4.1 World State

* Mapping between Address (160-bit/20-byte id) -> Account State (RLP-encoded data structure)
    * RLP: Recursive Length Prefix serialization ([details](https://ethereum.org/en/developers/docs/data-structures-and-encoding/rlp/), [code](https://github.com/ethereum/pyrlp))
    * NOTE: This mapping is **NOT** stored on the blockchain
* Implementations maintain it in a Modified Merkle Patricia Trie (MPT)
    * MPT: Modified Merkle Patricia Trie ([details](https://ethereum.org/en/developers/docs/data-structures-and-encoding/patricia-merkle-trie/), [full code](https://github.com/ethereum/py-trie), [simple func](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/trie.py#L397))
    * the root node hash depends on all underlying children
    * can be used as a secure identity for the whole system state - any change will modify the hash
    * old states are easy to retrieve/revert to
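To make the RLP encoding mentioned above concrete, here is a minimal encoder sketch. It follows the RLP rules as commonly documented (single bytes below `0x80` encode as themselves, short payloads get a one-byte prefix, long payloads get a length-of-length prefix). The names `rlp_encode` and `_length_prefix` are illustrative and are not the `pyrlp` API.

```python
from typing import Sequence, Union

RLPItem = Union[bytes, int, Sequence["RLPItem"]]


def _length_prefix(length: int, offset: int) -> bytes:
    """Build the prefix for a payload; offset is 0x80 for strings, 0xC0 for lists."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: RLPItem) -> bytes:
    """Encode bytes, non-negative ints, or nested sequences of them."""
    if isinstance(item, int):
        if item < 0:
            raise ValueError("RLP only encodes non-negative integers")
        # Integers become minimal big-endian byte strings; 0 becomes b"".
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, (bytes, bytearray)):
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)  # a single small byte is its own encoding
        return _length_prefix(len(item), 0x80) + bytes(item)
    # Anything else is treated as a list: encode children, then prefix.
    payload = b"".join(rlp_encode(child) for child in item)
    return _length_prefix(len(payload), 0xC0) + payload
```

For example, `rlp_encode([b"cat", b"dog"])` yields `b"\xc8\x83cat\x83dog"`: a list prefix followed by two prefixed strings.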

#### 4.1.1 Account state
([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/byzantium/fork_types.py#L48-L62))

```python
class Account:
    """
    State associated with an address.
    """

    nonce: Uint     # EOA: number of txn sent|Code: number of contract-creations made
    balance: U256   # number of wei owned
    code: bytes     # EVM code itself (only its Keccak-256 hash is encoded below)

def encode_account(raw_account_data: Account, storage_root: Bytes) -> Bytes:
    return rlp.encode(
        (
            raw_account_data.nonce,            
            raw_account_data.balance,          
            storage_root,                      # 256-bit/32-byte hash of the storage MPT root node
            keccak256(raw_account_data.code),
        )
    )
```

### 4.2 Transaction

* A single cryptographically-signed instruction from an external actor
    * Transaction types: 3 types (so far): Legacy(0), AccessList(1) & FeeMarket(2) - ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/transactions.py#L25-L86))
    * Transaction subtypes: those that result in message calls, and those that result in contract creation

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/transactions.py#L25-L86))
```python
class EthTransaction: # pseudo code only - combination of 3 types
    txn_type                           # transaction type (not in code, known by class name)
    chain_id: U64                      # only in types 1 & 2 - prevents replay attacks on other chains
    nonce: U256                        # number of txn sent by sender (WARNING: a bit tricky)

    # gas related
    gas: Uint                          # max number of units of gas to execute this txn
    gas_price: Uint                    # only for types 0 & 1 - price per unit of gas (wei)
    max_fee_per_gas: Uint              # only for type 2 - max fee (priority+base) per unit of gas
    max_priority_fee_per_gas: Uint     # only for type 2 - max priority fee per unit of gas

    to: Union[Bytes0, Address]         # 160-bit address of the message recipient; empty (Bytes0) for contract creation
    value: U256                        # number of wei to be transferred/endowment for new contract
    init: Bytes                        # only for contract creation - EVM code for account initialization procedure
    data: Bytes                        # input data for message call

    access_list: Tuple[...]            # access entries to 'warm up' - tuples of account address & list of storage keys
    y_parity: U256                     # signature y parity (`v` in legacy transactions)
    r: U256                            # signature r value
    s: U256                            # signature s value
```

* Nonce trickiness ([link to article](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#the-transaction-nonce))

### 4.3 Withdrawal

* Describes a consensus-layer validator's withdrawal of some amount of its staked Ether
* Created & validated in the consensus layer and then pushed to the execution layer

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/blocks.py#L25C1-L34C1))
```python
class Withdrawal:
    """
    Withdrawals that have been validated on the consensus layer.
    """
    index: U64            # zero based incrementing withdrawal index - unique id
    validator_index: U64  # index of validator in the consensus layer
    address: Address      # recipient of the Ether of this withdrawal
    amount: U256          # positive amount to be transferred, given in Gwei

```

### 4.4 The Block

A block is the collection of relevant pieces of information (the block header), together with its transactions, the deprecated ommers list, and the withdrawals.

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/blocks.py#L64-L72))

```python

class Block:
    """
    A complete block.
    """
    header: Header                                              # Block header
    transactions: Tuple[Union[Bytes, LegacyTransaction], ...]   # list of transactions
    ommers: Tuple[Header, ...]                                  # deprecated
    withdrawals: Tuple[Withdrawal, ...]                         # validator's withdrawal - since Shanghai fork
```

  

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/blocks.py#L38-L59))
```python
class Header:
    """
    Header portion of a block on the chain.
    """

    parent_hash: Hash32      # Keccak256 hash of the parent block's header
    ommers_hash: Hash32      # deprecated - used in proof of work. Now the constant KEC(RLP(()))
    coinbase: Address        # i.e. beneficiary - where priority fees are transferred
    state_root: Root         # Keccak256 hash of the root node of the state trie after all transactions and withdrawals are applied
    transactions_root: Root  # Keccak256 hash of the root node of the transactions trie
    receipt_root: Root       # Keccak256 hash of the root node of the receipts trie
    bloom: Bloom             # Bloom filter composed from the logs' indexable info (logger address & topics)
    difficulty: Uint         # deprecated - used in proof of work. Now set to 0
    number: Uint             # number of ancestor blocks (i.e. block height) - genesis is 0
    gas_limit: Uint          # current limit of total gas expenditure per block
    gas_used: Uint           # total gas used by the transactions in this block
    timestamp: U256          # Unix timestamp at this block's inception
    extra_data: Bytes        # arbitrary data relevant to this block. must be <= 32 bytes
    prev_randao: Bytes32     # latest RANDAO mix of the post beacon state of the previous block
    nonce: Bytes8            # deprecated - used in proof of work. Now set to 0
    base_fee_per_gas: Uint   # amount of wei burned for each unit of gas consumed
    withdrawals_root: Root   # Keccak256 hash of the root node of the withdrawals trie
```
Notes:
* Bloom: ([S/O detail](https://ethereum.stackexchange.com/questions/3418/how-does-ethereum-make-use-of-bloom-filters), [code](https://github.com/ethereum/eth-bloom))
* Prev RANDAO: random output of the beacon chain's randomness oracle for the previous block. ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/instructions/block.py#L159-L188))

#### 4.4.1 Transaction Receipt

* A receipt encodes information about a transaction that is useful for forming a zero-knowledge proof, or for indexing and searching
* Each transaction has one receipt, containing certain information from its execution
* Receipts are placed in an index-keyed trie whose root is recorded in the Header

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/blocks.py#L89-L97))
```python
class Receipt:
    """
    Result of a transaction.
    """
    # Receipt type corresponds to the Transaction type - not serialized
    succeeded: bool             # status of the transaction (a boolean here; an integer status code in the Yellow Paper)
    cumulative_gas_used: Uint   # cumulative gas used in the block up to and including this transaction
    bloom: Bloom                # bloom filter for the logs (256 bytes)
    logs: Tuple[Log, ...]       # logs created through execution of the transaction
```
* The Bloom filter function sets 3 bits out of 2048 for each arbitrary byte sequence it is given (here: the logger address and each topic)
    
```python
class Log:
    """
    Data record produced during the execution of a transaction.
    """

    address: Address            # logger's address
    topics: Tuple[Hash32, ...]  # series of log's topic (can be empty)
    data: bytes                 # data (not included in bloom filter)
```
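The "3 bits out of 2048" rule can be sketched as follows. This is an illustration rather than the spec code: the real filter hashes with Keccak-256, which Python's standard library does not provide, so the sketch uses `hashlib.sha3_256` as a stand-in. That is an assumption, and bit positions will therefore NOT match mainnet blooms. All function names are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Sequence, Tuple

BLOOM_BYTES = 256  # 2048 bits
HashFn = Callable[[bytes], bytes]


def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: placeholder for Keccak-256; swap in a real Keccak-256
    # implementation to reproduce on-chain values.
    return hashlib.sha3_256(data).digest()


def add_to_bloom(bloom: bytearray, entry: bytes, hash_fn: HashFn = stand_in_hash) -> None:
    """Set the 3 bits selected by the first three byte pairs of the hash."""
    digest = hash_fn(entry)
    for offset in (0, 2, 4):
        # The low 11 bits of each 16-bit big-endian pair pick one of 2048 bits.
        bit = int.from_bytes(digest[offset:offset + 2], "big") & 0x07FF
        bit_index = 0x07FF - bit  # bits are counted from the end of the array
        bloom[bit_index // 8] |= 1 << (7 - (bit_index % 8))


def logs_bloom(logs: Iterable[Tuple[bytes, Sequence[bytes]]], hash_fn: HashFn = stand_in_hash) -> bytes:
    """Build a bloom from (address, topics) pairs; log data is not included."""
    bloom = bytearray(BLOOM_BYTES)
    for address, topics in logs:
        add_to_bloom(bloom, address, hash_fn)
        for topic in topics:
            add_to_bloom(bloom, topic, hash_fn)
    return bytes(bloom)


def might_contain(bloom: bytes, entry: bytes, hash_fn: HashFn = stand_in_hash) -> bool:
    """False means definitely absent; True means possibly present."""
    probe = bytearray(BLOOM_BYTES)
    add_to_bloom(probe, entry, hash_fn)
    return all((b & p) == p for b, p in zip(bloom, probe))
```

A light client can call something like `might_contain` on a header's bloom to skip blocks that certainly contain no log from a given address or topic.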

#### 4.4.2 Holistic Validity

* Block is valid if it satisfies several conditions: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L139-L197))

```python
    # initial validation check
    parent_header = chain.blocks[-1].header
    validate_header(block.header, parent_header)
    if block.ommers != ():
        raise InvalidBlock

    # apply the block & save output
    apply_body_output = apply_body(...)
    
    # validation after application
    if apply_body_output.block_gas_used != block.header.gas_used:
        raise InvalidBlock(
            f"{apply_body_output.block_gas_used} != {block.header.gas_used}"
        )
    if apply_body_output.transactions_root != block.header.transactions_root:
        raise InvalidBlock
    if apply_body_output.state_root != block.header.state_root:
        raise InvalidBlock
    if apply_body_output.receipt_root != block.header.receipt_root:
        raise InvalidBlock
    if apply_body_output.block_logs_bloom != block.header.bloom:
        raise InvalidBlock
    if apply_body_output.withdrawals_root != block.header.withdrawals_root:
        raise InvalidBlock
```

#### 4.4.3 Serialization

* RLP Encoding of the blocks - it follows the sequence defined above

#### 4.4.4 Block header validity

* A few rules assert that a block header is valid: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L262-L306))
  
* Note about the base fee calculation rules: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L200-L259))
    * The base fee is the amount of wei burned per unit of gas consumed while executing transactions within the block
    * The base fee of a block is a function of the difference between the parent block's gas used & the parent block's gas target (half of its gas limit)
    * It goes up when the parent's gas usage is above the gas target, and down when it is below
    * The adjustment steers gas usage back towards the gas target
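The base fee rule above can be sketched as below. The constants are the EIP-1559 values (elasticity multiplier 2, maximum change denominator 8); the function name and signature are illustrative and simplified compared with the linked spec code.

```python
ELASTICITY_MULTIPLIER = 2
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8


def next_base_fee_per_gas(parent_gas_limit: int, parent_gas_used: int, parent_base_fee: int) -> int:
    """Derive a block's base fee from its parent's gas usage."""
    gas_target = parent_gas_limit // ELASTICITY_MULTIPLIER
    if parent_gas_used == gas_target:
        return parent_base_fee
    if parent_gas_used > gas_target:
        # Parent was fuller than the target: raise the fee by at least 1 wei.
        delta = parent_gas_used - gas_target
        change = parent_base_fee * delta // gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
        return parent_base_fee + max(change, 1)
    # Parent was emptier than the target: lower the fee.
    delta = gas_target - parent_gas_used
    change = parent_base_fee * delta // gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return parent_base_fee - change
```

With these constants, a completely full parent block raises the base fee by 12.5% and an empty one lowers it by 12.5%.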

## 5. Gas and Payment

* Gas is the fuel of Ethereum. Gas is not ether—it’s a separate virtual currency with its own exchange rate against ether.
* Used to control the amount of resources that a transaction can use, since it will be processed on thousands of computers around the world.

* Every transaction has a specific amount of gas associated with it: gasLimit
* This is charged to the sender's account balance at the beginning of the transaction
    * the amount charged is gasLimit * effective gas price
    * The transaction is invalid if the sender's account balance can't cover that amount
    * Unused gas is refunded at the same rate as the purchase (effective gas price)
    * If at any point the gas supply is reduced to zero, we get an "Out of Gas" (OOG) exception:
        * execution immediately halts and the transaction is abandoned (its state changes are reverted)
        * the sender's nonce is still incremented
        * the sender's ether balance still goes down to pay for the gas (priority fee to the block's beneficiary)
* Gas exists only in the context of the execution of a transaction
    * For accounts with trusted code, a relatively high gas limit can be set & left alone

* The effective gas price consists of the base fee & the priority fee
* base fee: constant for all transactions within a block
    * the ether paid as base fee is burned
    * adjusts dynamically according to the gas target (see section 4.4.4)

* priority fee: set per transaction to incentivize validators to include that transaction
    * delivered to the beneficiary address
    * Type 2 transactions can specify maxPriorityFeePerGas & maxFeePerGas
    * Type 0 & 1 transactions only have gasPrice
    * maxFeePerGas or gasPrice must be at least as high as the base fee for the transaction to be included in the block
    * The transactor can set any priority fee; validators are free to ignore transactions
    * A higher priority fee is worth more to validators, so the transaction has a higher chance of being included
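The fee rules above can be sketched as two small functions. This is a simplified illustration: it ignores the refund counter and the sender balance check, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def effective_gas_price(
    base_fee_per_gas: int,
    gas_price: Optional[int] = None,                 # type 0 & 1 transactions
    max_fee_per_gas: Optional[int] = None,           # type 2 transactions
    max_priority_fee_per_gas: Optional[int] = None,  # type 2 transactions
) -> Tuple[int, int]:
    """Return (effective gas price, priority fee per gas), both in wei."""
    if gas_price is not None:
        if gas_price < base_fee_per_gas:
            raise ValueError("gasPrice is below the base fee")
        return gas_price, gas_price - base_fee_per_gas
    if max_fee_per_gas is None or max_priority_fee_per_gas is None:
        raise ValueError("type 2 transactions need both fee caps")
    if max_fee_per_gas < base_fee_per_gas:
        raise ValueError("maxFeePerGas is below the base fee")
    if max_fee_per_gas < max_priority_fee_per_gas:
        raise ValueError("maxPriorityFeePerGas exceeds maxFeePerGas")
    # The tip is capped so that base fee + tip never exceeds maxFeePerGas.
    priority_fee = min(max_priority_fee_per_gas, max_fee_per_gas - base_fee_per_gas)
    return base_fee_per_gas + priority_fee, priority_fee


def settle_fees(gas_limit: int, gas_used: int, base_fee_per_gas: int, price: int) -> Tuple[int, int, int]:
    """Split the up-front charge into (refund to sender, burned, paid to beneficiary)."""
    refund = (gas_limit - gas_used) * price
    burned = gas_used * base_fee_per_gas
    tip = gas_used * (price - base_fee_per_gas)
    return refund, burned, tip
```

The three amounts returned by `settle_fees` always add up to the up-front charge of `gas_limit * price`.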

## 6. Transaction Execution

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L566-L680))

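* Any transaction must first pass an initial test of intrinsic validity (well-formed RLP, valid signature, valid nonce, gas limit no smaller than the intrinsic gas, sender balance covering the up-front cost)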
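One part of the intrinsic validity test is the intrinsic gas: the gas a transaction pays before any code runs. A minimal sketch, assuming the Shanghai-era constants below (base cost, per-byte calldata costs, creation cost, access-list costs, and the EIP-3860 init code word cost); the function name and argument shapes are illustrative.

```python
from typing import Sequence, Tuple

TX_BASE_COST = 21000
TX_DATA_COST_PER_ZERO = 4
TX_DATA_COST_PER_NON_ZERO = 16
TX_CREATE_COST = 32000
TX_ACCESS_LIST_ADDRESS_COST = 2400
TX_ACCESS_LIST_STORAGE_KEY_COST = 1900
INIT_CODE_WORD_COST = 2  # EIP-3860, active since Shanghai

AccessList = Sequence[Tuple[bytes, Sequence[bytes]]]


def intrinsic_gas(data: bytes, is_contract_creation: bool, access_list: AccessList = ()) -> int:
    """Gas charged up front for a transaction, before execution starts."""
    data_cost = sum(
        TX_DATA_COST_PER_ZERO if byte == 0 else TX_DATA_COST_PER_NON_ZERO
        for byte in data
    )
    create_cost = 0
    if is_contract_creation:
        init_code_words = (len(data) + 31) // 32  # round up to 32-byte words
        create_cost = TX_CREATE_COST + INIT_CODE_WORD_COST * init_code_words
    access_cost = sum(
        TX_ACCESS_LIST_ADDRESS_COST + TX_ACCESS_LIST_STORAGE_KEY_COST * len(keys)
        for _address, keys in access_list
    )
    return TX_BASE_COST + data_cost + create_cost + access_cost
```

A transaction whose gas limit is below this value is invalid and cannot be included in a block.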

### 6.1 Substate

* Throughout transaction execution, we accrue certain information that is acted upon immediately following the transaction.
* Substate values (NOTE: matching code not found yet)
    * Self-destruct set: a set of accounts that will be discarded following the transaction's completion
    * Log series: a series of archived, indexable checkpoints in VM code execution that let outside observers track contract calls easily
    * Touched accounts: empty ones are deleted at the end of the transaction
    * Refund balance: increased when contract storage is reset to zero (or by other negative-gas operations)
    * Accessed account addresses
    * Accessed storage keys

### 6.2 Execution

([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L566-L680))

* NOTE: the sum of the gas limits of all transactions in a block must be no greater than the block's gas limit


## 7. Contract Creation

([link to article](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#special-transaction-contract-creation))
* One special case is a transaction that creates a new contract on the blockchain
    * by sending to a special destination called the zero address (in the spec: an empty `to` field)
    * Can contain only a data payload that holds the compiled bytecode (CREATE?)
    * Or use an init function (using CREATE2?)
    * An ether amount can be included in the value field to give the new contract a starting balance
    * If ether is sent to the contract creation address without a data payload (no contract), it is burned
 

TBD: not very clear yet

* done via transaction -
    * prepare_message call: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/utils/message.py#L27-L116))
        * CREATE function: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/instructions/system.py#L140-L179))
        * CREATE2 function: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/instructions/system.py#L182-L232))
 

### 7.1 Subtleties

* While the initialization code is executing, the newly created address exists but has no body code
* Any message call it receives during this time causes no code to be executed
* If initialization ends with a normal STOP, or the returned code is empty, the state is left with a zombie account, and any remaining balance is locked into the account forever

## 8. Message Call

TBD: not very clear yet - is it 'function call' ?

* In the case of executing a message call, several parameters are required:
    * sender
    * transaction originator
    * the account whose code is to be executed (usually the same as recipient)
    * available gas
    * effective gas price
    * value
    * input data to the call
    * the present depth of the message-call/contract creation stack
    * permission to make modifications to the state
    * (extra) output data - ignored when executing transactions, but used when a message call is initiated by VM code execution
* It can go to 'Precompiled' contracts (in address 1..9)
---
* Message class reference: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/__init__.py#L54-L72))
* Process message call: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L631-L642))

## 9. Execution Model

* Specifies how the system state is altered given a series of bytecode instructions & a tuple of environmental data -> EVM
* EVM (Ethereum Virtual Machine) -> quasi-Turing-complete machine; quasi because computation is bounded by gas

### 9.1 Basics

* Stack-based architecture
* Word size - and the size of each stack item - is 256 bits (32 bytes)
    * chosen to facilitate the Keccak-256 hash scheme & elliptic-curve computations
* Max stack size is 1024
* The memory model is a simple word-addressed byte array (a byte array whose addresses are 256-bit words)
* The storage model is a simple word-addressable word array
    * Memory is volatile, storage is non-volatile
    * storage is maintained as part of the system state
* All locations in both storage & memory are initialized to 0
* Code is not stored in memory/storage, but separately in a virtual ROM that is only accessible through specialised instructions
* The EVM can have exceptions:
    * stack underflows, invalid instructions, out-of-gas exceptions
    * On an exception, the machine halts immediately and reports the issue to the execution agent, which deals with it separately
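The basics above (256-bit words, a 1024-item stack, gas-bounded execution, exceptional halts) can be illustrated with a toy interpreter. It is a sketch, not the EVM: it supports only `STOP`, `ADD` and `PUSH1`, and the class and function names are hypothetical.

```python
from typing import List, Tuple

UINT256_MOD = 2**256
STACK_LIMIT = 1024

STOP, ADD, PUSH1 = 0x00, 0x01, 0x60
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3}


class ExceptionalHalt(Exception):
    """Base class for the conditions reported to the execution agent."""


class StackUnderflow(ExceptionalHalt): ...
class StackOverflow(ExceptionalHalt): ...
class InvalidOpcode(ExceptionalHalt): ...
class OutOfGas(ExceptionalHalt): ...


def run(code: bytes, gas: int) -> Tuple[List[int], int]:
    """Execute code and return (final stack, remaining gas)."""
    stack: List[int] = []
    pc = 0
    while pc < len(code):  # running off the end behaves like STOP
        op = code[pc]
        if op not in GAS_COST:
            raise InvalidOpcode(hex(op))
        if gas < GAS_COST[op]:
            raise OutOfGas()
        gas -= GAS_COST[op]
        if op == STOP:
            break
        if op == PUSH1:
            if len(stack) >= STACK_LIMIT:
                raise StackOverflow()
            # A missing immediate byte past the end of the code reads as 0.
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
        elif op == ADD:
            if len(stack) < 2:
                raise StackUnderflow()
            # Arithmetic wraps around at 2**256.
            stack.append((stack.pop() + stack.pop()) % UINT256_MOD)
            pc += 1
    return stack, gas
```

Running `bytes([PUSH1, 2, PUSH1, 3, ADD, STOP])` leaves a single item, 5, on the stack; running `bytes([ADD])` on an empty stack raises `StackUnderflow`.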

### 9.2 Fees Overview

* Fees (denominated in gas) are charged under 3 circumstances:
    * the fee intrinsic to the computation of the operation
    * the payment for a subordinate message call or contract creation [link](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/gas.py#L185-L224)
    * increase in the usage of memory
* More memory usage -> more gas [link](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/gas.py#L146-L182)
* Storage fees -> charged a lot to incentivize minimization of storage; refunded on release [link](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/instructions/storage.py#L61-L123)
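The "more memory usage -> more gas" rule can be sketched as below, assuming the fee-schedule values of 3 gas per 32-byte word plus a quadratic term divided by 512. The function names are illustrative.

```python
GAS_MEMORY = 3
QUADRATIC_DIVISOR = 512


def memory_cost(size_in_bytes: int) -> int:
    """Total gas cost of having this much memory active."""
    words = (size_in_bytes + 31) // 32  # memory is paid for in 32-byte words
    return GAS_MEMORY * words + words * words // QUADRATIC_DIVISOR


def expansion_cost(current_size: int, new_size: int) -> int:
    """Gas charged when an instruction grows memory; shrinking never happens."""
    if new_size <= current_size:
        return 0
    return memory_cost(new_size) - memory_cost(current_size)
```

The linear term dominates for small sizes, while the quadratic term makes very large memory allocations prohibitively expensive.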

### 9.3 Execution Environment

* There are several pieces of important information used in execution environment that the execution agent must provide:
    * ([link to specs - maybe?](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/__init__.py#L54-L72))
* There is a function which can compute the resultant state, remaining gas, substate, given these definitions
    * ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L631-L642))
    * ([link to specs - maybe??](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/interpreter.py#L247-L313))

### 9.4 Execution overview

* 9.4.1 [link to Machine state](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/interpreter.py#L266-L284)
* 9.4.2 Exceptional Halting: conditions where instruction is stopped [Search result](https://github.com/search?q=repo%3Aethereum%2Fexecution-specs+path%3A%2F%5Esrc%5C%2Fethereum%5C%2Fshanghai%5C%2F%2F+%22raise+%22&type=code)
* 9.4.3 [Jump Destination Validity](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/runtime.py#L21-L67)
* 9.4.4 Normal Halting: RETURN or REVERT yields the returned output data; STOP or SELFDESTRUCT yields the empty sequence `()`; any other instruction yields NULL (execution continues)
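Jump destination validity (9.4.3) comes down to one scan over the code: a `JUMPDEST` byte only counts if it is an instruction, not part of the immediate data of a `PUSH`. A minimal sketch with an illustrative function name:

```python
from typing import Set

JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jump_destinations(code: bytes) -> Set[int]:
    """Return the positions in code that JUMP/JUMPI may legally target."""
    valid: Set[int] = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            valid.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # Skip the 1..32 immediate bytes that follow a PUSHn opcode.
            pc += op - PUSH1 + 1
        pc += 1
    return valid
```

In `bytes([0x60, 0x5B, 0x5B])` only position 2 is valid, because the `0x5B` at position 1 is the operand of `PUSH1`.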

### 9.5 Execution Cycle

* the main loop of code execution: ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/interpreter.py#L293-L301))

 

## 10. Transition to Proof of Stake

* The Paris hard fork changed the consensus mechanism from PoW to PoS
* Paris was triggered by a 'terminal total difficulty' rather than a block height (to avoid a race to reach the block height requirement and claim the first PoS block)
* 10.1 Post-Paris updates: because the Beacon Chain produces a new slot every 12 seconds, post-Paris updates can be scheduled at a specific timestamp.

## 11. Blocktree to Blockchain

* Prior to the transition to PoS, the canonical blockchain was defined as the chain with the greatest total difficulty
* After Paris: LMD Ghost (Latest Message Driven - Greedy Heaviest-observed Sub-Tree)
* TBD: Not very clear on POS_FORKCHOICE_UPDATED

## 12. Block Finalization

The process of finalising a block involves three stages:
* executing withdrawals
* validating transactions
* verifying state

### 12.1 Executing Withdrawals

* After processing the block's transactions, the withdrawals are executed.
    * withdrawal is simply an increase of the recipient account's balance
    * No other balances are decreased. Not a transfer but a creation of funds.
    * A withdrawal can't fail and has no gas cost
    * [link](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/state.py#L485-L496)
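Withdrawal processing can be sketched as a plain balance increase. The sketch assumes that `amount` is given in Gwei and models the state as a simple address-to-balance dictionary, which is a simplification of the real state object; the names are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

WEI_PER_GWEI = 10**9


@dataclass(frozen=True)
class Withdrawal:
    index: int
    validator_index: int
    address: bytes
    amount: int  # assumed to be denominated in Gwei


def process_withdrawals(balances: Dict[bytes, int], withdrawals: Iterable[Withdrawal]) -> None:
    """Credit each recipient in place; no account is debited and no gas is charged."""
    for withdrawal in withdrawals:
        credit = withdrawal.amount * WEI_PER_GWEI
        balances[withdrawal.address] = balances.get(withdrawal.address, 0) + credit
```

Because nothing is debited on the execution layer, the total ether supply tracked by the execution state grows by the sum of the withdrawals.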


### 12.2 Transaction Validation

* gasUsed must correspond faithfully to the transactions listed
* ([link to specs](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L178-L180))

### 12.3 State Validation

* TBD: Not very clear
* ([link to specs - maybe??](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/fork.py#L182-L191))

## 13. Implementing contracts

* 13.1 Data feeds - ([Link to Oracle page](https://ethereum.org/en/developers/docs/oracles/))
* 13.2 Random Numbers - Use [BLOCKHASH](https://github.com/ethereum/execution-specs/blob/1adcc1bfe774798bcacc685aebc17bd9935078c3/src/ethereum/shanghai/vm/instructions/block.py#L22-L58) to use hashes of the previous 256 blocks as pseudo-random numbers

## 14. Future Directions

* Optimize state database:
    * The state database won't be forced to maintain all past state trie structures
    * it should maintain an age for each node, and discard nodes that are neither recent nor checkpoints
    * Use checkpoints or sets of nodes to reduce the amount of computation
* Blockchain consolidation
    * A compressed archive of the trie at given points in time (one every n-th block) could be maintained by the peer network
* Blockchain compression:
    * nodes in the state trie that haven't sent/received a transaction in some number of blocks could be thrown out

    

## Misc

* RLP
* MPT
* Precompiled contracts
* Fee schedule
* VM Specifications & Instruction sets