25 changes: 10 additions & 15 deletions .claude/rules/test-framework.md
@@ -10,17 +10,15 @@ paths:

1. **Unit tests** (`tests/lean_spec/`) - Standard pytest tests for implementation
2. **Spec tests** (`tests/consensus/`) - Generate JSON test vectors via fillers
- *Note: `tests/execution/` infrastructure is ready for future execution layer work*

**Test Filling Framework:**

- Layer-agnostic pytest plugin in `packages/testing/src/framework/pytest_plugins/filler.py`
- Layer-specific packages: `consensus_testing` (active) and `execution_testing` (future)
- Pytest plugin in `packages/testing/src/framework/pytest_plugins/filler.py`
- Consensus fixture package: `consensus_testing`
- Write consensus spec tests using `state_transition_test` or `fork_choice_test` fixtures
- These fixtures are type aliases that create test vectors when called
- Run `uv run fill --fork=Lstar --clean -n auto` to generate consensus fixtures
- Use `--layer=execution` flag when execution layer is implemented
- Output goes to `fixtures/{layer}/{format}/{test_path}/...`
- Output goes to `fixtures/consensus/{format}/{test_path}/...`

**Example spec test:**

@@ -40,29 +38,26 @@ def test_block(state_transition_test: StateTransitionTestFiller) -> None:
3. `make_fixture()` executes the spec code (state transitions, fork choice steps)
4. Validates output against expectations (`StateExpectation`, `StoreChecks`)
5. Serializes to JSON via Pydantic's `model_dump(mode="json")`
6. Writes fixtures at session end to `fixtures/{layer}/{format}/{test_path}/...`
6. Writes fixtures at session end to `fixtures/consensus/{format}/{test_path}/...`

**Layer-specific architecture:**
**Package architecture:**

- `framework/` - Shared infrastructure (base classes, pytest plugin, CLI)
- `consensus_testing/` - Consensus layer fixtures, forks, builders
- `execution_testing/` - Execution layer fixtures, forks, builders
- `framework/` - Pytest plugin, CLI entry points, fork registry infrastructure
- `consensus_testing/` - Consensus fixtures, forks, builders
- Regular pytest runs (`uv run pytest`) ignore spec tests - they only run via `fill` command

**Serialization requirements:**

- All spec types (State, Block, Uint64, etc.) must be Pydantic models
- Custom types need `@field_serializer` or `model_serializer` for JSON output
- SSZ types typically serialize to hex strings (e.g., `"0x1234..."`)
- Fixture models inherit from layer-specific base classes:
- Consensus: `BaseConsensusFixture` (in `consensus_testing/test_fixtures/base.py`)
- Execution: `BaseExecutionFixture` (in `execution_testing/test_fixtures/base.py`)
- Both use `CamelModel` for camelCase JSON output
- Fixture models inherit from `BaseConsensusFixture` (in
`consensus_testing/test_fixtures/base.py`), which uses `CamelModel` for
camelCase JSON output
- Test the serialization: `fixture.model_dump(mode="json")` must produce valid JSON

**Key fixture types:**

- `StateTransitionTest` - Tests state transitions with blocks
- `ForkChoiceTest` - Tests fork choice with steps (tick/block/attestation)
- Selective validation via `StateExpectation` and `StoreChecks` (only validates fields you specify)
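
The "only validates fields you specify" behaviour can be illustrated with a minimal standard-library sketch. The `State` and `Expectation` classes and their field names below are hypothetical stand-ins, not the real `StateExpectation` or `StoreChecks` models; the assumption is simply that an unset field (`None`) means "do not check".

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class State:
    """Hypothetical stand-in for a spec state (not the real State type)."""

    slot: int
    latest_justified_slot: int
    validator_count: int


@dataclass
class Expectation:
    """Hypothetical analogue of a selective expectation: None means 'skip'."""

    slot: Optional[int] = None
    latest_justified_slot: Optional[int] = None
    validator_count: Optional[int] = None

    def validate_against(self, state: State) -> None:
        # Compare only the fields the test author explicitly set.
        for f in fields(self):
            expected = getattr(self, f.name)
            if expected is None:
                continue
            actual = getattr(state, f.name)
            if actual != expected:
                raise AssertionError(f"{f.name}: expected {expected}, got {actual}")


state = State(slot=3, latest_justified_slot=1, validator_count=4)

# Only `slot` is specified, so the other fields are ignored.
Expectation(slot=3).validate_against(state)

# A mismatch on any specified field fails the check.
try:
    Expectation(slot=3, validator_count=5).validate_against(state)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

One consequence of this design is that a field whose legitimate expected value is `None` cannot be asserted; a real implementation may need a different sentinel.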

3 changes: 0 additions & 3 deletions .claude/rules/workflow.md
@@ -7,8 +7,6 @@ uv sync # Install dependencies
uv run pytest # Run unit tests
uv run fill --fork=lstar --clean -n auto # Generate test vectors
uv run fill --fork=lstar --clean -n auto --scheme=prod # Generate test vectors with production scheme
# Note: execution layer support is planned for future, infrastructure is ready
# for now, `--layer=consensus` is default and the only value used.
```

## Code Quality
@@ -27,5 +25,4 @@ just # List all available recipes
- **Subspecs**: `src/lean_spec/subspecs/{subspec}/`
- **Unit tests**: `tests/lean_spec/` (mirrors source structure)
- **Consensus spec tests**: `tests/consensus/` (generates test vectors)
- **Execution spec tests**: `tests/execution/` (future - infrastructure ready)

4 changes: 2 additions & 2 deletions .claude/skills/spec-diff/SKILL.md
@@ -11,8 +11,8 @@ Show what changed in the **spec code** (`src/lean_spec/`) and **consensus test v
**Scope**: Protocol-level spec types, functions, containers, forkchoice logic, and the
test fixtures that generate cross-client test vectors.

**Excluded**: Test framework infrastructure (`packages/testing/`, `consensus_testing/`,
`execution_testing/`), unit tests (`tests/lean_spec/`), interop tests (`tests/interop/`),
**Excluded**: Test framework infrastructure (`packages/testing/`, `consensus_testing/`),
unit tests (`tests/lean_spec/`), interop tests (`tests/interop/`),
documentation (`docs/`), CI/tooling configs, and the node implementation layer
(networking, sync, storage, node runner).

115 changes: 95 additions & 20 deletions packages/testing/src/consensus_testing/test_fixtures/base.py
@@ -1,22 +1,43 @@
"""Base fixture definitions for consensus test formats."""

import hashlib
import json
from functools import cached_property
from typing import Any, ClassVar

from framework.test_fixtures import BaseFixture
from pydantic import field_serializer
from framework.forks import BaseFork
from pydantic import Field, field_serializer

from lean_spec.base import CamelModel
from lean_spec.config import LEAN_ENV

class BaseConsensusFixture(BaseFixture):

class BaseConsensusFixture(CamelModel):
"""
Base class for all consensus test fixtures.

Inherits shared functionality from framework.fixtures.BaseFixture
and adds consensus-specific behavior if needed.
Provides:
- JSON serialization with custom encoders
- Hash generation for fixtures
- Common metadata handling
"""

# Class-level registry of all consensus fixture formats
# Override parent's formats to maintain a separate registry
formats: ClassVar[dict[str, type["BaseConsensusFixture"]]] = {}
# Fixture format metadata
format_name: ClassVar[str] = ""
"""The name of this fixture format (e.g., 'state_transition_test')."""

description: ClassVar[str] = "Unknown fixture format"
"""Human-readable description of what this fixture tests."""

# Instance fields
network: str | None = None
"""The fork/network this fixture is valid for (e.g., 'Devnet', 'Shanghai')."""

lean_env: str = Field(default=LEAN_ENV)
"""The target lean environment (e.g. 'test' or 'prod')."""

info: dict[str, Any] = Field(default_factory=dict, alias="_info")
"""Metadata about the test (description, fork, etc.)."""

expect_exception: type[Exception] | None = None
"""
@@ -26,18 +47,6 @@ class BaseConsensusFixture(BaseFixture):
The test passes only if the exception is raised.
"""

@classmethod
def __pydantic_init_subclass__(cls, **kwargs: Any) -> None:
"""
Auto-register consensus fixture formats when subclasses are defined.

Overrides parent to register in BaseConsensusFixture.formats instead
of BaseFixture.formats.
"""
super().__pydantic_init_subclass__(**kwargs)
if cls.format_name:
BaseConsensusFixture.formats[cls.format_name] = cls

@field_serializer("expect_exception", when_used="json")
def serialize_exception(self, exception_type: type[Exception] | None) -> str | None:
"""Serialize exception type to its class name for JSON output."""
@@ -73,3 +82,69 @@ def assert_expected_outcome(self, exception_raised: Exception | None) -> None:
f"Expected {self.expect_exception.__name__} but got "
f"{type(exception_raised).__name__}: {exception_raised}"
)

@cached_property
def json_dict(self) -> dict[str, Any]:
"""
Return the JSON representation of the fixture.

Excludes the `info` field and converts snake_case to camelCase.
"""
return self.to_json(
exclude_none=True,
exclude={"info"},
)

@cached_property
def hash(self) -> str:
"""
Generate a deterministic hash for this fixture.

The hash is computed from the JSON representation to ensure
consistency across runs.
"""
json_str = json.dumps(
self.json_dict,
sort_keys=True,
separators=(",", ":"),
)
h = hashlib.sha256(json_str.encode("utf-8")).hexdigest()
return f"0x{h}"

def json_dict_with_info(self, hash_only: bool = False) -> dict[str, Any]:
"""
Return JSON representation with the info field included.

Args:
hash_only: If True, only include the hash in _info.

Returns:
Dictionary ready for JSON serialization.
"""
dict_with_info = self.json_dict.copy()
dict_with_info["_info"] = {"hash": self.hash}
if not hash_only:
dict_with_info["_info"].update(self.info)
return dict_with_info

def fill_info(
self,
test_id: str,
description: str,
fork: BaseFork,
) -> None:
"""
Fill metadata information for this fixture.

Args:
test_id: Unique identifier for the test case.
description: Human-readable description of the test.
fork: The fork this test is valid for.
"""
if "comment" not in self.info:
self.info["comment"] = "`leanSpec` generated test"
self.info["testId"] = test_id
self.info["description"] = description
self.info["fixtureFormat"] = self.format_name
# Set network field on the fixture itself
self.network = fork.name()
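
The hashing scheme added in this hunk can be reproduced with the standard library alone. The sketch below mirrors the `hash` property and `json_dict_with_info` using a plain dict in place of the Pydantic model; `fixture_hash` and `with_info` are hypothetical helper names used only for illustration.

```python
import hashlib
import json
from typing import Any


def fixture_hash(json_dict: dict[str, Any]) -> str:
    """Hash a JSON-ready dict the way the `hash` property in the diff does."""
    # Sorted keys and compact separators yield one canonical string per dict,
    # so key insertion order does not affect the digest.
    json_str = json.dumps(json_dict, sort_keys=True, separators=(",", ":"))
    return "0x" + hashlib.sha256(json_str.encode("utf-8")).hexdigest()


def with_info(
    json_dict: dict[str, Any],
    info: dict[str, Any],
    hash_only: bool = False,
) -> dict[str, Any]:
    """Attach `_info` to a shallow copy, mirroring `json_dict_with_info`."""
    out = json_dict.copy()
    out["_info"] = {"hash": fixture_hash(json_dict)}
    if not hash_only:
        out["_info"].update(info)
    return out


# Example values are made up; only the key order differs between the two dicts.
a = {"network": "Lstar", "leanEnv": "test"}
b = {"leanEnv": "test", "network": "Lstar"}
same_hash = fixture_hash(a) == fixture_hash(b)

full = with_info(a, {"testId": "example"})
minimal = with_info(a, {"testId": "example"}, hash_only=True)
```

Two details may be worth checking in review. Because `hash` and `json_dict` are `cached_property` values, changes made to the fixture after first access are not reflected in them. Also, `info` is merged with `update`, so an `info` entry named `hash` would overwrite the computed digest.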
7 changes: 1 addition & 6 deletions packages/testing/src/framework/__init__.py
@@ -1,6 +1 @@
"""
Shared testing infrastructure for Ethereum consensus and execution layers.

This module provides base classes and utilities that are common across
both consensus and execution layer testing.
"""
"""Shared testing infrastructure for Lean Ethereum spec tests."""
39 changes: 12 additions & 27 deletions packages/testing/src/framework/cli/fill.py
@@ -1,4 +1,4 @@
"""Unified CLI command for generating Ethereum test fixtures across all layers."""
"""CLI command for generating Lean Ethereum consensus test fixtures."""

import os
import sys
@@ -25,13 +25,7 @@
@click.option(
"--fork",
required=True,
help="Fork to generate fixtures for (e.g., Lstar for consensus)",
)
@click.option(
"--layer",
type=click.Choice(["consensus", "execution"], case_sensitive=False),
default="consensus",
help="Ethereum layer to generate fixtures for (default: consensus)",
help="Fork to generate fixtures for (e.g., Lstar)",
)
@click.option(
"--clean",
@@ -50,21 +44,14 @@ def fill(
pytest_args: Sequence[str],
output: str,
fork: str,
layer: str,
clean: bool,
scheme: str,
) -> None:
"""
Generate Ethereum test fixtures from test specifications.

This unified command works across both consensus and execution layers.
The --layer flag determines which layer's forks and fixtures to use.
Generate consensus test fixtures from test specifications.

Examples:
# Generate consensus layer fixtures
fill tests/consensus/devnet --fork=Lstar --layer=consensus --clean -v

# Default layer is consensus
# Generate consensus fixtures
fill tests/consensus/devnet --fork=Lstar --clean -v

# Use specific XMSS scheme (overrides LEAN_ENV env var)
@@ -75,17 +62,16 @@ def fill(
# environment.
os.environ["LEAN_ENV"] = scheme.lower()

# Check and download keys if needed (only for consensus layer)
if layer.lower() == "consensus":
# Import here to avoid loading leanSpec modules before LEAN_ENV is set
from consensus_testing.keys import download_keys, get_keys_directory
# Check and download keys if needed
# Import here to avoid loading leanSpec modules before LEAN_ENV is set
from consensus_testing.keys import download_keys, get_keys_directory

keys_directory = get_keys_directory(scheme.lower())
keys_directory = get_keys_directory(scheme.lower())

# Check if keys already exist, if not, download them
if not (keys_directory.exists() and any(keys_directory.glob("*.json"))):
click.echo(f"Test keys for '{scheme}' scheme not found. Downloading...")
download_keys(scheme.lower())
# Check if keys already exist, if not, download them
if not (keys_directory.exists() and any(keys_directory.glob("*.json"))):
click.echo(f"Test keys for '{scheme}' scheme not found. Downloading...")
download_keys(scheme.lower())

config_path = Path(__file__).parent / "pytest_ini_files" / "pytest-fill.ini"
# Find project root by looking for pyproject.toml with [tool.uv.workspace]
@@ -105,7 +91,6 @@ def fill(
f"--rootdir={project_root}",
f"--output={output}",
f"--fork={fork}",
f"--layer={layer}",
]

if clean:
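
The key-presence check that now runs unconditionally in `fill` can be exercised in isolation. The sketch below lifts the condition into a hypothetical helper, `keys_present`, and runs it against a temporary directory; the directory and file names are made up for the example.

```python
import tempfile
from pathlib import Path


def keys_present(keys_directory: Path) -> bool:
    """Return True when the directory exists and holds at least one *.json file."""
    # Same condition as in `fill`: the `exists()` check short-circuits, so
    # `glob` is never called on a missing directory.
    return keys_directory.exists() and any(keys_directory.glob("*.json"))


with tempfile.TemporaryDirectory() as tmp:
    keys_dir = Path(tmp) / "test_keys"

    missing = keys_present(keys_dir)  # directory does not exist yet

    keys_dir.mkdir()
    empty = keys_present(keys_dir)  # directory exists but has no key files

    (keys_dir / "validator_0.json").write_text("{}")
    populated = keys_present(keys_dir)  # at least one JSON file is present
```

Under this condition an existing but empty directory still triggers a download, while a single `*.json` file of any content suppresses it.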
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
[pytest]
# Configuration for fill command

# Search for layer-specific tests
# The actual testpath will be determined dynamically by the --layer flag
# in the pytest plugin
# Search for spec tests
# The pytest plugin restricts collection to the consensus spec tests
testpaths = tests

# Load pytest plugins