## SSZ-QL and GMP
This PoC pursues three technical objectives:
- reinforce the rationale for the Fellowship project
- quantify the efficiency improvements unlocked by the proposed approach
- validate Merkle proofs and multiproofs in a realistic setting


Our starting point is `class BeaconState`, the "god object" that gives us a versatile surface for exercising multiple use-case scenarios.

In [None]:
# Download the finalized BeaconState as raw SSZ at run time, so the state file never has to be uploaded to GitHub.
# (≈︎ `curl -o state.ssz -H 'Accept: application/octet-stream' \
#    https://beaconstate.ethstaker.cc/eth/v2/debug/beacon/states/finalized`)

import requests

RPC = "https://beaconstate.ethstaker.cc/eth/v2/debug/beacon/states/finalized"
HEADERS = {"Accept": "application/octet-stream"}
OUTFILE = "state.ssz"

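with requests.get(RPC, headers=HEADERS, stream=True, timeout=30) as resp:
    resp.raise_for_status()                     # fail fast on HTTP errors
    with open(OUTFILE, "wb") as fp:
        for chunk in resp.iter_content(8192):   # stream in 8 KiB chunks to keep memory flat
            if chunk:                           # skip empty keep-alive chunks
                fp.write(chunk)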

In [1]:
# Reuse the same BeaconState from the local SSZ file. Fetching a fresh state on demand for every run is possible, but it would slow the notebook down.
import os
import eth2spec.electra.mainnet as spec

beacon_state_path = "state.ssz"
beacon_state_size = os.stat(beacon_state_path).st_size
print(f"Beacon state file size: {beacon_state_size} bytes !!!")

state: spec.BeaconState
with open(beacon_state_path, "rb") as f:
    state = spec.BeaconState.deserialize(f, beacon_state_size)


print(f"Successfully loaded the beacon state for {state.slot}")

beaconState = spec.BeaconState
print(f"BeaconState type: {beaconState}")
print(f"BeaconState type: {beaconState.__dict__}")


Beacon state file size: 281549813 bytes !!!
Successfully loaded the beacon state for 12073824
BeaconState type: <class 'eth2spec.electra.mainnet.BeaconState'>
BeaconState type: {'__module__': 'eth2spec.electra.mainnet', '__annotations__': {'genesis_time': <class 'remerkleable.basic.uint64'>, 'genesis_validators_root': <class 'eth2spec.electra.mainnet.Root'>, 'slot': <class 'eth2spec.electra.mainnet.Slot'>, 'fork': <class 'eth2spec.electra.mainnet.Fork'>, 'latest_block_header': <class 'eth2spec.electra.mainnet.BeaconBlockHeader'>, 'block_roots': <class 'remerkleable.complex.Vector.__class_getitem__.<locals>.FixedSpecialVectorView'>, 'state_roots': <class 'remerkleable.complex.Vector.__class_getitem__.<locals>.FixedSpecialVectorView'>, 'historical_roots': <class 'remerkleable.complex.List.__class_getitem__.<locals>.SpecialListView'>, 'eth1_data': <class 'eth2spec.electra.mainnet.Eth1Data'>, 'eth1_data_votes': <class 'remerkleable.complex.List.__class_getitem__.<locals>.SpecialListView'>,

#### Ways to represent the path (candidates to consider for our queries)
https://github.com/ethereum/consensus-specs/issues/2179#issuecomment-759305714

- an `(anchor_type, Sequence[Union[int, SSZVariableName]])` tuple (like now, but abstracted away)
- an `(anchor_type, GeneralizedIndex)` tuple (what I think a Go/Rust implementation would naturally shift towards)
- a literal URI string (not fast, but readable)
- some kind of a pointer structure
- something precomputed
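
To make the first two options concrete, here is a minimal sketch of how a field-name path (or a dotted URI string) could be resolved to a generalized index. `ToyContainer`, `path_to_gindex`, `parse_uri`, and the trimmed `TOY_FORK`/`TOY_STATE` schemas are hypothetical names introduced for illustration only. The sketch assumes plain containers (no lists, vectors, or packed basic types) and assumes the Electra `BeaconState` has 37 fields.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyContainer:
    """Hypothetical, trimmed-down stand-in for an SSZ container type."""
    field_count: int  # total number of fields in the real container
    # field name -> (position in the container, child container or None for a leaf)
    fields: dict = field(default_factory=dict)


def next_pow_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def path_to_gindex(anchor: ToyContainer, path: list[str]) -> int:
    """Walk `path` from `anchor` and accumulate the generalized index."""
    gindex = 1  # the anchor's own root
    node: Optional[ToyContainer] = anchor
    for name in path:
        if node is None:
            raise ValueError(f"cannot descend into a leaf at {name!r}")
        position, child = node.fields[name]
        # Each container level adds log2(next_pow_of_two(field_count)) bits.
        gindex = gindex * next_pow_of_two(node.field_count) + position
        node = child
    return gindex


def parse_uri(uri: str) -> list[str]:
    """Split a dotted path such as '.fork.current_version' into field names."""
    return [part for part in uri.split(".") if part]


# Only the fields needed for the demo are listed; positions follow the spec order.
TOY_FORK = ToyContainer(3, {"previous_version": (0, None),
                            "current_version": (1, None),
                            "epoch": (2, None)})
TOY_STATE = ToyContainer(37, {"genesis_validators_root": (1, None),
                              "fork": (3, TOY_FORK)})

toy_gvr_gindex = path_to_gindex(TOY_STATE, parse_uri(".genesis_validators_root"))
toy_fork_version_gindex = path_to_gindex(TOY_STATE, parse_uri(".fork.current_version"))
print(toy_gvr_gindex, toy_fork_version_gindex)  # 65 269
```

The readable URI form is parsed once into the name sequence, and the name sequence is resolved once into a gindex, so a server could accept the friendliest form at the API boundary and work with gindices internally.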

SSZ-QL provides enough flexibility to ask for any arbitrary field or data from any object within the Consensus Layer.

Below are several demonstration queries that highlight its capabilities:

In [None]:
import json

# Case 1: verify domain with the genesis_validators_root and the fork_version
# `POST` https://RPC/eth/v1/beacon/states/12073824/sszql?path=genesis_validators_root&fork_version&includeProofs=false
# {
#   "query": [
#     {
#       "path": ".genesis_validators_root",
#       "include_proof": false
#     },
#     {
#       "path": ".fork_version",
#       "include_proof": false
#     }
#   ]
# }

# sszql_query = json.dumps(
#     {
#         "query": [
#             {
#                 "path": ".genesis_validators_root",
#                 "include_proof": False
#             },
#             {
#                 "path": ".fork_version",
#                 "include_proof": False
#             }
#         ]
#     },
#     indent=2
# )

sszql_response = json.dumps(
    {
        "version": "electra",
        "execution_optimistic": True,
        "finalized": True,
        "data": [
            {
                "value": "0x"+state.genesis_validators_root.hex(), # in a subsequent iteration we can do this dynamically, grabbing the value from the sszql_query variable.
            },
            {
                "value": "0x"+state.fork.current_version.hex(),
            }
        ]
    },
    indent=2
)

print("SSZ-QL response: \n",sszql_response)


SSZ-QL response: 
 {
  "version": "electra",
  "execution_optimistic": true,
  "finalized": true,
  "data": [
    {
      "value": "0x4b363db94e286120d76eb905340fdd4e54bfe9f06bf33ff6cf5ad27f511bfe95"
    },
    {
      "value": "0x05000000"
    }
  ]
}


The previous SSZ-QL query shows only a minimal use case and offers little practical value because the returned fields are not cryptographically proven.

The core objective of SSZ-QL & GMP is to fetch any subset of fields efficiently. Bundling Merkle proofs with each response lets a client verify the data’s integrity without downloading the entire beacon state.
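
As a rough illustration of the client side, the sketch below verifies a single-leaf proof using only `hashlib`, with no access to the full state. `hash_pair` and `verify_single_proof` are hypothetical helper names, and the four-leaf tree is toy data rather than a real beacon state. It assumes SHA-256 over the concatenated children and a witness ordered bottom-up (the leaf's sibling first).

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Parent node = SHA-256(left || right)."""
    return hashlib.sha256(left + right).digest()


def verify_single_proof(leaf: bytes, gindex: int,
                        witness: list[bytes], root: bytes) -> bool:
    """Recompute the root from `leaf` and its bottom-up `witness`."""
    # A leaf at depth d has a gindex with d + 1 bits and needs d siblings.
    if len(witness) != gindex.bit_length() - 1:
        return False
    node = leaf
    for sibling in witness:
        if gindex & 1:           # current node is a right child
            node = hash_pair(sibling, node)
        else:                    # current node is a left child
            node = hash_pair(node, sibling)
        gindex >>= 1             # move up to the parent
    return node == root


# Toy tree with four leaves at gindices 4..7.
toy_leaves = [bytes([i]) * 32 for i in range(4)]
toy_node2 = hash_pair(toy_leaves[0], toy_leaves[1])
toy_node3 = hash_pair(toy_leaves[2], toy_leaves[3])
toy_root = hash_pair(toy_node2, toy_node3)

# Proof for gindex 6 (toy_leaves[2]): sibling at gindex 7, then the node at gindex 2.
toy_witness = [toy_leaves[3], toy_node2]
print(verify_single_proof(toy_leaves[2], 6, toy_witness, toy_root))  # True
```

The client only needs a trusted root, the claimed value, its gindex, and one sibling hash per tree level.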

In [None]:
# Helper functions to compute Merkle proofs https://github.com/ethereum/consensus-specs/issues/2179#issuecomment-1399477411

from remerkleable.tree import merkle_hash
from remerkleable.core import Path

def compute_merkle_proof_for_state(state, gindex) -> list[str]:
    """Return the hex-encoded witness for `gindex`, ordered bottom-up (leaf sibling first)."""
    merkle_tree = state.get_backing()
    node = merkle_tree

    print("Beacon state root from beacon state:", merkle_tree.merkle_root().hex())

    # 1 less because bitlength returns 1 more than we need to shift, and 1 less because we don't care about the root
    check_bit = 1 << (gindex.bit_length()-2)
    print("check bit:", bin(check_bit))
    witness = []
    while check_bit > 0:
        if check_bit & gindex != 0:  # follow bit path of target gindex, and get sibling nodes as witness
            witness.append(node.get_left().merkle_root())
            node = node.get_right()
        else:
            witness.append(node.get_right().merkle_root())
            node = node.get_left()
        check_bit >>= 1

    print("Genesis validators root:", node.merkle_root().hex())
    print("witness:")

    for i, sib in enumerate(reversed(witness)):
        print(f"{i:3}: {sib.hex()}")

    # now let's see if we can verify the proof
    x = node.merkle_root()
    for i, sib in enumerate(reversed(witness)):
        if (1 << i) & gindex != 0:
            x = merkle_hash(sib, x)
        else:
            x = merkle_hash(x, sib)

    print("Reconstructed state root from proof and value, to verify against real state:", x.hex())

    return [sib.hex() for sib in reversed(witness)]



In [None]:
# Case 1: verify domain with the genesis_validators_root and the fork_version
# `POST` https://RPC/eth/v1/beacon/states/12073824/sszql?path=genesis_validators_root&fork_version&includeProofs=true
# {
#   "query": [
#     {
#       "path": ".genesis_validators_root",
#       "include_proof": true
#     },
#     {
#       "path": ".fork_version",
#       "include_proof": true
#     }
#   ]
# }

genesis_validators_root_path = Path(spec.BeaconState) / 'genesis_validators_root'
genesis_validators_root_gindex = genesis_validators_root_path.gindex()

fork_version_path = Path(spec.BeaconState) / 'fork' / 'current_version'
fork_version_gindex = fork_version_path.gindex()

print(f"Merkle path for genesis_validators_root: {genesis_validators_root_path}")
print(f"\nTarget gindex: {genesis_validators_root_gindex}")

print(f"Merkle path for fork.current_version: {fork_version_path}")
print(f"Target gindex: {fork_version_gindex}")

print(f"Fork current version: {state.fork.current_version.hex()}")

state_root = state.hash_tree_root()
print(f"Beacon state hash tree root: {state_root.hex()}")

Merkle path for genesis_validators_root: <remerkleable.core.Path object at 0x105bbea10>

Target gindex: 65
Merkle path for fork.current_version: <remerkleable.core.Path object at 0x105bbcd90>
Target gindex: 269
Fork current version: 05000000
Beacon state hash tree root: 20a9492caf11e9756c9bb4ad296c683d9f532153505d01ad2272c81329b0bb39


The previous snippet resolved each path to its generalized index and computed the state root. Next, we compute the Merkle proof for each target and assemble the results into a proper response:

In [None]:
# `POST` https://RPC/eth/v1/beacon/states/12073824/sszql?path=genesis_validators_root&fork_version&includeProofs=true
# {
#     "path": [".fork_version"],
#     "include_proof": true,
# }

genesis_validators_root_witness=compute_merkle_proof_for_state(state, genesis_validators_root_gindex)

fork_version_witness=compute_merkle_proof_for_state(state, fork_version_gindex)

sszql_response = json.dumps(
    {
        "version": "electra",
        "execution_optimistic": True,
        "finalized": True,
        "data": [
            {
                "value": "0x"+state.genesis_validators_root.hex(), # in a subsequent iteration we can do this dynamically, grabbing the value from the sszql_query variable.
                "proofs": [
                    {
                        # "path": genesis_validators_root_path.hex(),
                        "gindex": genesis_validators_root_gindex,
                        "witness": [sib for sib in genesis_validators_root_witness],
                    }
                ],
                "state_root": "0x"+state_root.hex()
            },
            {
                "value": "0x"+state.fork.current_version.hex(),
                "proofs": [
                    {
                        "gindex": fork_version_gindex,
                        "witness": [sib for sib in fork_version_witness],
                    }
                ],
                "state_root": "0x"+state_root.hex() # it is probably redundant to include a state_root per value. Are we allowing different state_roots per query? In this case we would need to include it in the query.
            }
        ]
    },
    indent=2
)


print("\n\nSSZ-QL response: \n",sszql_response)

Beacon state root from beacon state: 20a9492caf11e9756c9bb4ad296c683d9f532153505d01ad2272c81329b0bb39
check bit: 0b100000
Genesis validators root: 4b363db94e286120d76eb905340fdd4e54bfe9f06bf33ff6cf5ad27f511bfe95
witness:
  0: 5730c65f00000000000000000000000000000000000000000000000000000000
  1: a1c7156c56303f50c74571c98d31dff9063b2be7cb7730807016faa2c32204c5
  2: 0d57648ba4f17796fe73a4ca852cba87afec830ca6be8255ffe5a30b50790963
  3: 3fa04f6128a85cbed6bbcc0b287ab2c1c63b05f6e656c900e5cd179c231406d1
  4: ca763deccfc214e4f693fe664b1cd2fc1726c79498251ee5b482bfac6e8d8981
  5: 1558f2526b4d5f66595d61fcd37a0bf3385e64934d9fadd0fe5ec091ae5ba412
Reconstructed state root from proof and value, to verify against real state: 20a9492caf11e9756c9bb4ad296c683d9f532153505d01ad2272c81329b0bb39
Beacon state root from beacon state: 20a9492caf11e9756c9bb4ad296c683d9f532153505d01ad2272c81329b0bb39
check bit: 0b10000000
Genesis validators root: 0500000000000000000000000000000000000000000000000000000000000000
wit

As expected, most single-leaf proofs share identical hash nodes because the leaves lie on overlapping Merkle paths. A compact Merkle multiproof eliminates this redundancy by proving all target leaves in one shot, shipping only the minimal witness set.
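
To quantify that redundancy, the following sketch computes which sibling hashes a multiproof for a set of gindices actually has to ship. It is modelled on the helper-index approach described in the consensus-specs Merkle proof document, but `branch_indices`, `path_indices`, and `helper_indices` are illustrative names written for this notebook, not an existing library API.

```python
def branch_indices(gindex: int) -> list[int]:
    """Siblings of every node on the path from `gindex` up to the root."""
    out = []
    while gindex > 1:
        out.append(gindex ^ 1)   # flip the last bit to get the sibling
        gindex //= 2             # move up to the parent
    return out


def path_indices(gindex: int) -> list[int]:
    """Nodes on the path from `gindex` up to (but excluding) the root."""
    out = []
    while gindex > 1:
        out.append(gindex)
        gindex //= 2
    return out


def helper_indices(gindices: list[int]) -> list[int]:
    """Minimal witness set: siblings that cannot be recomputed from the leaves."""
    helpers, on_path = set(), set()
    for g in gindices:
        helpers.update(branch_indices(g))
        on_path.update(path_indices(g))
    # A sibling that lies on another leaf's path is derived, not shipped.
    return sorted(helpers - on_path, reverse=True)


targets = [65, 269]  # genesis_validators_root and fork.current_version
separate = sum(len(branch_indices(g)) for g in targets)
combined = helper_indices(targets)
print(separate, len(combined), combined)
# 14 8 [268, 135, 66, 64, 17, 9, 5, 3]
```

For the two fields queried in this notebook, two separate proofs carry 6 + 8 = 14 hashes, while a multiproof needs only 8, because the upper levels of both paths coincide.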