The Ethspectoor

Deterministic extraction and exploration of Ethereum specification data. Parses seven spec repos (consensus, execution, builder, relay, beacon APIs, execution APIs, remote signing) into structured indexes, then serves them over MCP and a static explorer UI.

1,083 types, 168 endpoints, 355 constants, 47 type aliases across 7 specs.

Live Explorer

Browse ethspectoor.blockspaceforum.com, or open docs/index.html locally. The explorer UI itself has no build step and no dependencies.

Tabs

Specs -- overview of all indexed specification repos with item counts, fork timelines, and one-sentence descriptions.
Types -- browse and search all types, functions, and classes with fuzzy matching, fork-aware code display, syntax highlighting, and clickable cross-references. Three-panel layout: sidebar filters, item list, detail view.
Endpoints -- REST and JSON-RPC endpoints with parameters, response types, SSZ support indicators, and fork variants.
Diff -- compare what changed between any two forks per spec. Inline side-by-side code diff with LCS-based line highlighting (green for added, red for removed, aligned gutters).
PRs -- browse indexed open pull requests against spec repos. Each PR shows which types it adds, modifies, or removes with inline diff previews against mainline.
Visualizer (visualizer.html) -- fork-aware transaction lifecycle diagram. Shows the PBS data flow (consensus, execution, builder, relay, sidecar) for deneb through fulu, and switches to the ePBS path (EIP-7732) for gloas with P2P bid gossip, payload revelation, and PTC voting.
About (about.html) -- overview of the project, tools, and setup for first-time visitors.
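
The LCS-based line highlighting used by the Diff tab can be illustrated in Python. This is a minimal sketch of the general technique, not a port of docs/js/diff.js; the function name lcs_line_diff and the tag characters (" ", "-", "+") are assumptions chosen for illustration.

```python
from typing import List, Tuple


def lcs_line_diff(old: List[str], new: List[str]) -> List[Tuple[str, str]]:
    """Return (tag, line) pairs: ' ' unchanged, '-' removed, '+' added."""
    n, m = len(old), len(new)
    # table[i][j] holds the LCS length of old[i:] and new[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to emit the diff in order.
    out: List[Tuple[str, str]] = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            out.append((" ", old[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append(("-", old[i]))
            i += 1
        else:
            out.append(("+", new[j]))
            j += 1
    # Whatever remains on either side is a pure removal or addition.
    out.extend(("-", line) for line in old[i:])
    out.extend(("+", line) for line in new[j:])
    return out
```

A renderer would color "+" rows green and "-" rows red, and use the " " rows to keep the two gutters aligned.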

PR Viewer

The PR tab tracks open pull requests against spec repos and shows their impact before they merge. For each PR you can see:

Which types/functions are added, modified, or removed
Inline side-by-side code diff with line-level highlighting
Field-level diff summaries (+3 fields, -1 field, ~2 fields)
Direct links to the source PR on GitHub

PR data is generated by pr_index.py and embedded as overlays in catalog.json. The PR viewer and MCP server share the same data.
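
The field-level summaries listed above (+3 fields, -1 field, ~2 fields) could be computed roughly as follows. This is a sketch under assumptions: fields are modeled as a plain name-to-type mapping, and field_diff_summary is a hypothetical helper, not a function from pr_index.py.

```python
from typing import Dict


def field_diff_summary(before: Dict[str, str], after: Dict[str, str]) -> str:
    """Summarize added (+), removed (-), and retyped (~) fields."""
    added = [name for name in after if name not in before]
    removed = [name for name in before if name not in after]
    changed = [
        name for name in before
        if name in after and before[name] != after[name]
    ]

    def label(sign: str, count: int) -> str:
        noun = "field" if count == 1 else "fields"
        return f"{sign}{count} {noun}"

    parts = []
    if added:
        parts.append(label("+", len(added)))
    if removed:
        parts.append(label("-", len(removed)))
    if changed:
        parts.append(label("~", len(changed)))
    return ", ".join(parts) or "no field changes"
```

Comparing field mappings rather than raw code keeps the summary stable when a PR only reformats or reorders lines.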

Quick Start

Requires Python 3.10+ and git.

# Install dependencies (pyyaml for build, mcp for server)
pip install pyyaml mcp
# or with uv:
uv pip install pyyaml mcp

# Build everything: clones all 7 spec repos, extracts, links, builds catalog
python3 build.py --all

# Open the explorer (macOS; on Linux use xdg-open)
open docs/index.html

# Start the MCP server
python3 server.py --catalog docs/catalog.json

That's it. build.py --all handles cloning repos (to ./repos/specs/), building per-spec indexes, cross-reference linking, and assembling the final catalog.json. First run takes a few minutes to clone; subsequent runs pull updates and rebuild.

To include PR overlays (requires GITHUB_TOKEN):

python3 build.py --all --include-prs

MCP Server

The MCP server exposes 10 tools over stdio transport. AI agents (Claude, Hermes, Cursor, etc.) can query any type, endpoint, or PR across all specs with structured responses.

# stdio transport (for agent integration)
python3 server.py

# custom catalog and repos directory (enables reindex)
python3 server.py --catalog docs/catalog.json --repos-dir ./repos

# rebuild everything before starting
python3 server.py --rebuild --repos-dir ./repos

Or run without a persistent venv:

uv run --with mcp --with pyyaml python3 server.py --catalog docs/catalog.json

Tools

Tool	Description
`list_specs`	List all indexed specs with item counts, endpoint counts, and available forks
`lookup_type`	Look up a type, function, or container by name. Returns fields, code, source link, references, and EIP associations. Supports fuzzy matching and PR fork resolution
`lookup_endpoint`	Search API endpoints by path, operation name, or keyword. Returns parameters, response types, SSZ support, and fork variants
`what_changed`	Show what was added or modified in a specific fork. Includes EIP associations
`trace_type`	Trace a type across spec boundaries. Shows where it is defined, who uses it, and cross-spec references
`search`	Fuzzy search across all spec items, constants, type aliases, and endpoints
`diff_type`	Compare a type or function between two forks. Shows field additions, removals, and code changes
`list_prs`	List indexed PR overlays with PR number, title, author, and what changed
`index_pr`	Index a GitHub PR as a virtual fork. Makes it queryable via `pr-NNNN` fork syntax
`reindex`	Rebuild spec indexes from source repos and reload. Requires `--repos-dir`
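
Several tools above mention fuzzy matching. The README does not describe the scoring the server uses, so the following is only a stand-in built on Python's standard difflib module; fuzzy_lookup and its 0.6 cutoff are assumptions for illustration.

```python
import difflib
from typing import List


def fuzzy_lookup(query: str, names: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` names that closely match `query`, best first."""
    # Compare case-insensitively, but return names in their original casing.
    by_lower = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=limit, cutoff=0.6
    )
    return [by_lower[hit] for hit in hits]
```

With this approach a slightly misspelled query such as "beaconstat" still resolves to BeaconState, while an unrelated query returns an empty list.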

Client Configuration

Hermes / Claude Desktop (config.yaml)

mcp:
  ethspectoor:
    command: "uv"
    args:
      - "run"
      - "--with"
      - "mcp"
      - "--with"
      - "pyyaml"
      - "python3"
      - "/path/to/ethspectoor/server.py"
      - "--catalog"
      - "/path/to/ethspectoor/docs/catalog.json"
      - "--indexes-dir"
      - "/path/to/ethspectoor/indexes"
      - "--repos-dir"
      - "/path/to/ethspectoor/repos/specs"

The --indexes-dir and --repos-dir flags are optional but enable the reindex and index_pr tools to rebuild indexes without restarting.

Data Flow

Both the MCP server and the explorer UI read the same artifact: catalog.json. Types that appear in multiple specs are merged with canonical-source attribution (e.g. BeaconState resolves to consensus-specs, not beacon-apis). Because both consumers read one file, what agents see cannot drift from what the UI shows.

repos/ --> build.py --> indexes/ (per-spec, intermediate)
                            |
                        link.py --> _cross_refs.json
                            |
                    build_catalog.py --> catalog.json (canonical)
                            |                  |
                    pr_index.py (overlays)      |
                                        +------+------+
                                        |             |
                                  server.py (MCP)  docs/ (UI)
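
Canonical-source attribution, as described at the top of this section, can be pictured as a priority-ordered merge. This is a sketch only: the priority list, the record layout (source, definition, also_in), and the name merge_indexes are assumptions, not the actual behavior of build_catalog.py.

```python
from typing import Dict, List

# Hypothetical priority order: earlier specs win when a type is duplicated.
CANONICAL_PRIORITY: List[str] = ["consensus-specs", "execution-specs", "beacon-apis"]


def merge_indexes(indexes: Dict[str, Dict[str, dict]]) -> Dict[str, dict]:
    """Merge {spec: {type_name: definition}} into one deduplicated catalog."""

    def rank(spec: str) -> int:
        # Specs missing from the priority list sort after the listed ones.
        if spec in CANONICAL_PRIORITY:
            return CANONICAL_PRIORITY.index(spec)
        return len(CANONICAL_PRIORITY)

    catalog: Dict[str, dict] = {}
    for spec in sorted(indexes, key=rank):
        for name, definition in indexes[spec].items():
            if name not in catalog:
                # First spec to define the type becomes its canonical source.
                catalog[name] = {
                    "source": spec,
                    "definition": definition,
                    "also_in": [],
                }
            else:
                # Later specs are recorded as secondary appearances.
                catalog[name]["also_in"].append(spec)
    return catalog
```

Recording secondary appearances instead of discarding them keeps the information needed to show where else a type is used.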

PR Shadow Indexes

Track open PRs against spec repos as virtual forks. PRs are indexed but invisible to normal queries. Reference them explicitly to see full resulting types, field-level diffs, and cross-type impact.

# Index all open PRs for consensus-specs
python3 pr_index.py --spec consensus-specs --repo-dir ./repos/specs/consensus-specs

# Index a single PR
python3 pr_index.py --spec consensus-specs --repo-dir ./repos/specs/consensus-specs --pr 1234

# Clean up merged/closed PRs
python3 pr_index.py --spec consensus-specs --cleanup

# List indexed PRs
python3 pr_index.py --list

Requires GITHUB_TOKEN env var or --github-token for API access.

Querying PR Data via MCP

PR forks use the naming convention pr-{number}:

list_prs(spec="consensus-specs")
  -> PR #4123: "Add exit queue to BeaconState" (gloas, 3 items changed)

lookup_type("BeaconState", fork="pr-4123")
  -> full BeaconState as it would look after the PR

diff_type("BeaconState", from_fork="gloas", to_fork="pr-4123")
  -> field-level diff: what the PR adds/removes/modifies

what_changed(fork="pr-4123")
  -> all items the PR touches with action (added/modified/removed)

Normal queries (no PR fork specified) never see PR data.
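
The pr-{number} convention and the isolation of PR data from normal queries can be sketched as a small resolver. The overlay layout (base_fork, items, removed) and the function names are hypothetical; the real structures live in catalog.json and are not documented here.

```python
import re
from typing import Dict, Optional


def parse_pr_fork(fork: str) -> Optional[int]:
    """Return the PR number for names like 'pr-4123', else None."""
    match = re.fullmatch(r"pr-(\d+)", fork)
    return int(match.group(1)) if match else None


def resolve_type(
    name: str,
    fork: str,
    mainline: Dict[str, Dict[str, object]],
    overlays: Dict[int, dict],
) -> Optional[object]:
    """Look up `name` at `fork`; PR overlays apply only to pr-NNNN forks."""
    pr_number = parse_pr_fork(fork)
    if pr_number is None:
        # Normal query: overlays are never consulted.
        return mainline.get(fork, {}).get(name)

    overlay = overlays.get(pr_number)
    if overlay is None:
        return None  # PR has not been indexed
    if name in overlay.get("removed", ()):
        return None  # the PR deletes this type
    if name in overlay["items"]:
        return overlay["items"][name]  # added or modified by the PR
    # Untouched by the PR: fall back to the fork the PR is based on.
    return mainline.get(overlay["base_fork"], {}).get(name)
```

Treating a PR as a thin layer over its base fork means only the changed items need to be stored per PR.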

Building with PR Overlays

# Build catalog including PR overlays
python3 build_catalog.py --indexes-dir ./indexes --output docs/catalog.json --include-prs

# Or start MCP server with --rebuild to include PRs
python3 server.py --rebuild --repos-dir ./repos --include-prs

CI/CD

GitHub Actions rebuilds the catalog and deploys to GitHub Pages on every push to main. A scheduled workflow runs daily to pull upstream spec changes and re-index open PRs.

See .github/workflows/deploy.yml. The pipeline runs:

python3 build.py --all --include-prs

This clones/updates all spec repos, builds indexes, links cross-references, indexes open PRs, and assembles the catalog. The docs/ directory is then deployed to GitHub Pages.

Spec Coverage

Spec	Items	Endpoints	Constants	Extractor	Forks
consensus-specs	528	--	218	Python AST	phase0 through heze
execution-specs	298	--	135	Python AST	frontier through amsterdam
execution-apis	93	72	--	OpenRPC	paris through amsterdam
beacon-apis	77	84	--	OpenAPI + Markdown	phase0 through gloas
remote-signing-api	59	2	--	OpenAPI	phase0 through fulu
builder-specs	16	5	2	OpenAPI + Markdown	bellatrix through fulu
relay-specs	12	5	--	OpenAPI + Markdown	bellatrix through fulu

Build

Full build (recommended)

python3 build.py --all

This clones all spec repos (if not already present), builds per-spec indexes, runs cross-reference linking, and assembles docs/catalog.json.

Single spec

# Auto-clones the repo if needed
python3 build.py --profile consensus-specs

# Or point at an existing local clone
python3 build.py --profile builder-specs --repo-dir /path/to/builder-specs

After building individual specs, run linking and catalog assembly manually:

python3 link.py --indexes-dir ./indexes
python3 build_catalog.py --indexes-dir ./indexes --output docs/catalog.json

What each step does

build.py extracts types, endpoints, constants, and fork metadata from a spec repo and writes a {spec}_index.json to ./indexes/.
link.py resolves cross-spec type references (e.g. beacon-apis types referencing consensus-specs containers).
build_catalog.py merges all indexes into catalog.json, deduplicating shared types across specs using canonical-source attribution. This is the single artifact consumed by both the MCP server and the explorer UI.
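
To make the linking step more concrete, here is one way cross-spec references could be resolved. It is a sketch, not link.py: the input shape (each type mapped to a list of field type strings) and the "spec/Type" key format are assumptions.

```python
import re
from typing import Dict, List

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def link_cross_refs(indexes: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Map 'spec/Type' to the types it references that live in other specs."""
    # Record which specs define each type name.
    defined_in: Dict[str, List[str]] = {}
    for spec, types in indexes.items():
        for name in types:
            defined_in.setdefault(name, []).append(spec)

    refs: Dict[str, List[str]] = {}
    for spec, types in indexes.items():
        for name, field_types in types.items():
            targets = set()
            for field_type in field_types:
                # Pull identifiers out of strings such as "List[Slot]".
                for ident in IDENTIFIER.findall(field_type):
                    if ident in types:
                        continue  # resolved within the same spec
                    for other_spec in defined_in.get(ident, []):
                        targets.add(f"{other_spec}/{ident}")
            if targets:
                refs[f"{spec}/{name}"] = sorted(targets)
    return refs
```

Identifiers that no spec defines (for example a generic wrapper like List) are simply ignored.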

Architecture

.
├── build.py                  # orchestrates extraction per spec profile
├── build_catalog.py          # merges indexes into catalog.json (canonical artifact)
├── pr_index.py               # PR shadow indexer (fetch, extract, diff open PRs)
├── link.py                   # cross-spec reference resolution
├── server.py                 # MCP server (10 tools, reads catalog.json)
├── fetch_repos.sh            # clones all spec repos
├── extractors/
│   ├── profiles.py           # spec profiles (paths, fork orders, extractor config)
│   ├── extract_python.py     # Python AST extractor (consensus-specs, execution-specs)
│   ├── extract_openapi.py    # OpenAPI extractor (beacon-apis, builder-specs, relay-specs, remote-signing-api)
│   ├── extract_openrpc.py    # OpenRPC extractor (execution-apis)
│   ├── extract_markdown.py   # Markdown type/endpoint extractor (beacon-apis, builder-specs, relay-specs)
│   ├── enrich.py             # structural annotation (fields, params, references, domains)
│   └── fetch_examples.py     # test fixture fetcher (standalone)
├── indexes/                  # generated per-spec indexes (intermediate build artifacts)
│   └── pr/                   # PR overlay indexes (per-spec, per-PR)
├── docs/
│   ├── index.html            # HTML shell (loads app.js, no inline logic)
│   ├── about.html            # project overview, MCP docs, skill card
│   ├── visualizer.html       # fork-aware transaction lifecycle diagram (PBS + ePBS)
│   ├── catalog.json          # canonical data (from build_catalog.py)
│   ├── SKILL.md              # MCP skill document for AI agents
│   ├── logo.svg              # site logo
│   ├── favicon.svg           # browser tab icon
│   ├── css/
│   │   ├── styles.css        # shared styles (layout, nav, search, detail panels)
│   │   ├── about.css         # about page styles
│   │   └── visualizer.css    # visualizer page styles
│   ├── js/
│   │   ├── app.js            # entry point (init, routing, global bindings)
│   │   ├── state.js          # shared state (catalog data, selections)
│   │   ├── constants.js      # fork orders, spec colors, kind/method badges
│   │   ├── utils.js          # HTML escaping, ID sanitization
│   │   ├── forks.js          # fork sorting, code-for-fork resolution
│   │   ├── search.js         # fuzzy scoring and highlighting
│   │   ├── diff.js           # LCS-based line diff engine
│   │   ├── router.js         # hash-based routing and navigation
│   │   ├── url.js            # URL parameter parsing
│   │   └── views/
│   │       ├── home.js       # specs overview + setup/MCP/skill sections
│   │       ├── types.js      # type browser (three-panel, filters, detail)
│   │       ├── endpoints.js  # endpoint browser
│   │       ├── prs.js        # PR browser with inline diffs
│   │       ├── diff-view.js  # fork-to-fork diff comparison
│   │       └── skill-modal.js # SKILL.md viewer/copy modal
│   └── js/__tests__/
│       ├── constants.test.js
│       ├── diff.test.js
│       ├── forks.test.js
│       ├── router.test.js
│       ├── search.test.js
│       └── utils.test.js
├── .github/workflows/
│   └── deploy.yml            # CI: deno test -> rebuild catalog -> deploy to GitHub Pages
├── SCHEMA.md                 # index JSON schema documentation
├── CLAUDE.md                 # agent context (enzyme CLI, project conventions)
└── PLAN.md                   # development roadmap

Extractors

Each extractor handles one source format:

Python AST (extract_python.py): Walks Python source files, extracts class/function definitions with full code, tracks fork modifications via [New in fork] / [Modified in fork] annotations.
OpenAPI (extract_openapi.py): Parses OpenAPI YAML, resolves $ref chains, extracts endpoints with parameters, response types, SSZ support, and fork variants.
OpenRPC (extract_openrpc.py): Parses OpenRPC JSON, extracts JSON-RPC methods with params, results, error codes, and content descriptors.
Markdown (extract_markdown.py): Extracts type definitions and endpoint descriptions from Markdown spec pages (used alongside OpenAPI for specs that document types in prose).

Enrichment

enrich.py adds structural metadata after extraction: field lists for containers, function signatures, reference graphs between types, domain classification, and fork diff annotations (is_new, is_modified).
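
The fork diff annotations (is_new, is_modified) could be derived by walking the forks in order and comparing each version of an item with the previous one. This is a sketch under assumptions: annotate_forks and its input shape (fork name mapped to code text) are hypothetical and not taken from enrich.py.

```python
from typing import Dict, List, Optional


def annotate_forks(
    fork_order: List[str], code_by_fork: Dict[str, str]
) -> Dict[str, Dict[str, bool]]:
    """Flag each fork where an item first appears or where its code changes."""
    annotations: Dict[str, Dict[str, bool]] = {}
    previous: Optional[str] = None
    for fork in fork_order:
        if fork not in code_by_fork:
            continue  # the item has no entry for this fork
        code = code_by_fork[fork]
        annotations[fork] = {
            "is_new": previous is None,
            "is_modified": previous is not None and code != previous,
        }
        previous = code
    return annotations
```

Iterating in the profile's fork order, rather than dictionary order, is what makes "previous version" well defined.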

Profiles

profiles.py defines the extraction configuration for each spec: which extractors to run, directory paths within the repo, fork ordering, GitHub URL templates, and any spec-specific extraction options.
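
A profile of this kind might be modeled as a small dataclass. Every field name below, the example directory, and the URL template are illustrative assumptions; the actual definitions in profiles.py may look quite different.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SpecProfile:
    """Hypothetical extraction configuration for one spec repo."""
    name: str
    extractors: List[str]        # which extractors to run
    source_dirs: List[str]       # directories inside the repo to scan
    fork_order: List[str]        # oldest fork first
    github_url_template: str     # expects {path} and {line} placeholders
    options: Dict[str, object] = field(default_factory=dict)

    def source_link(self, path: str, line: int) -> str:
        """Build a link to a source location from the URL template."""
        return self.github_url_template.format(path=path, line=line)

    def is_at_or_after(self, fork: str, reference: str) -> bool:
        """True if `fork` comes at or after `reference` in this spec."""
        return self.fork_order.index(fork) >= self.fork_order.index(reference)


# Illustrative values only; not copied from profiles.py.
example_profile = SpecProfile(
    name="builder-specs",
    extractors=["openapi", "markdown"],
    source_dirs=["specs"],
    fork_order=["bellatrix", "capella", "deneb", "electra", "fulu"],
    github_url_template="https://github.com/ethereum/builder-specs/blob/main/{path}#L{line}",
)
```

Keeping fork order in the profile lets each spec define its own timeline, which matters because the specs in the coverage table span different fork ranges.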
