Deterministic extraction and exploration of Ethereum specification data. Parses every spec repo (consensus, execution, builder, relay, beacon APIs, execution APIs, remote signing) into structured indexes, then serves them over MCP and a static explorer UI.
1,083 types, 168 endpoints, 355 constants, 47 type aliases across 7 specs.
Browse it at ethspectoor.blockspaceforum.com, or open docs/index.html locally. No build step, no dependencies.
- Specs -- overview of all indexed specification repos with item counts, fork timelines, and one-sentence descriptions.
- Types -- browse and search all types, functions, and classes with fuzzy matching, fork-aware code display, syntax highlighting, and clickable cross-references. Three-panel layout: sidebar filters, item list, detail view.
- Endpoints -- REST and JSON-RPC endpoints with parameters, response types, SSZ support indicators, and fork variants.
- Diff -- compare what changed between any two forks per spec. Inline side-by-side code diff with LCS-based line highlighting (green for added, red for removed, aligned gutters).
- PRs -- browse indexed open pull requests against spec repos. Each PR shows which types it adds, modifies, or removes with inline diff previews against mainline.
- Visualizer (visualizer.html) -- fork-aware transaction lifecycle diagram. Shows the PBS data flow (consensus, execution, builder, relay, sidecar) for deneb through fulu, and switches to the ePBS path (EIP-7732) for gloas with P2P bid gossip, payload revelation, and PTC voting.
- About (about.html) -- overview of the project, tools, and setup for first-time visitors.
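
The LCS-based line highlighting mentioned for the Diff tab can be illustrated with a minimal sketch. This is not the code in docs/js/diff.js (which is JavaScript); the function name `lcs_line_diff` and the tag characters are assumptions chosen for illustration. It builds a longest-common-subsequence table over the two line lists, then walks it to label each line as unchanged, removed, or added.

```python
def lcs_line_diff(old_lines, new_lines):
    """Return (tag, line) pairs: ' ' unchanged, '-' removed, '+' added."""
    n, m = len(old_lines), len(new_lines)
    # table[i][j] holds the LCS length of old_lines[i:] and new_lines[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old_lines[i] == new_lines[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    result = []
    i = j = 0
    while i < n and j < m:
        if old_lines[i] == new_lines[j]:
            result.append((" ", old_lines[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            # Dropping the old line keeps the longest common tail.
            result.append(("-", old_lines[i]))
            i += 1
        else:
            result.append(("+", new_lines[j]))
            j += 1
    # Whatever is left on either side is a pure removal or addition.
    result.extend(("-", line) for line in old_lines[i:])
    result.extend(("+", line) for line in new_lines[j:])
    return result


if __name__ == "__main__":
    old = ["a", "b", "c"]
    new = ["a", "c", "d"]
    for tag, line in lcs_line_diff(old, new):
        print(tag, line)
```

A side-by-side view would render the `-` entries in the left gutter and the `+` entries in the right one; the quadratic table is acceptable here because individual type definitions are short.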
The PR tab tracks open pull requests against spec repos and shows their impact before they merge. For each PR you can see:
- Which types/functions are added, modified, or removed
- Inline side-by-side code diff with line-level highlighting
- Field-level diff summaries (+3 fields, -1 field, ~2 fields)
- Direct links to the source PR on GitHub
PR data is generated by pr_index.py and embedded as overlays in
catalog.json. The PR viewer and MCP server share the same data.
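
As a rough illustration of the field-level diff summaries listed above, the sketch below compares two field maps and produces a string in the `+N fields, -N field, ~N fields` style. The function name `summarize_field_diff` and the name-to-type dictionary shape are assumptions; the actual overlay format written by pr_index.py is not shown here.

```python
def summarize_field_diff(old_fields, new_fields):
    """Summarize added (+), removed (-), and modified (~) fields.

    Both arguments map field name -> type string (an assumed shape).
    """
    added = [name for name in new_fields if name not in old_fields]
    removed = [name for name in old_fields if name not in new_fields]
    modified = [
        name
        for name in new_fields
        if name in old_fields and old_fields[name] != new_fields[name]
    ]

    def part(sign, names):
        count = len(names)
        return f"{sign}{count} field{'s' if count != 1 else ''}"

    groups = (("+", added), ("-", removed), ("~", modified))
    return ", ".join(part(sign, names) for sign, names in groups if names)


if __name__ == "__main__":
    # Hypothetical field maps, not taken from any real spec type.
    before = {"slot": "Slot", "fee": "uint64", "old_root": "Root"}
    after = {"slot": "Slot", "fee": "Gwei", "a": "X", "b": "Y"}
    print(summarize_field_diff(before, after))  # +2 fields, -1 field, ~1 field
```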
Requires Python 3.10+ and git.
# Install dependencies (pyyaml for build, mcp for server)
pip install pyyaml mcp
# or with uv:
uv pip install pyyaml mcp
# Build everything: clones all 7 spec repos, extracts, links, builds catalog
python3 build.py --all
# Open the explorer
open docs/index.html
# Start the MCP server
python3 server.py --catalog docs/catalog.json

That's it. build.py --all handles cloning repos (to ./repos/specs/),
building per-spec indexes, cross-reference linking, and assembling the
final catalog.json. First run takes a few minutes to clone; subsequent
runs pull updates and rebuild.
To include PR overlays (requires GITHUB_TOKEN):
python3 build.py --all --include-prs

The MCP server exposes 10 tools over stdio transport. AI agents (Claude, Hermes, Cursor, etc.) can query any type, endpoint, or PR across all specs with structured responses.
# stdio transport (for agent integration)
python3 server.py
# custom catalog and repos directory (enables reindex)
python3 server.py --catalog docs/catalog.json --repos-dir ./repos
# rebuild everything before starting
python3 server.py --rebuild --repos-dir ./repos

Or run without a persistent venv:
uv run --with mcp --with pyyaml python3 server.py --catalog docs/catalog.json

| Tool | Description |
|---|---|
| list_specs | List all indexed specs with item counts, endpoint counts, and available forks |
| lookup_type | Look up a type, function, or container by name. Returns fields, code, source link, references, and EIP associations. Supports fuzzy matching and PR fork resolution |
| lookup_endpoint | Search API endpoints by path, operation name, or keyword. Returns parameters, response types, SSZ support, and fork variants |
| what_changed | Show what was added or modified in a specific fork. Includes EIP associations |
| trace_type | Trace a type across spec boundaries. Shows where it is defined, who uses it, and cross-spec references |
| search | Fuzzy search across all spec items, constants, type aliases, and endpoints |
| diff_type | Compare a type or function between two forks. Shows field additions, removals, and code changes |
| list_prs | List indexed PR overlays with PR number, title, author, and what changed |
| index_pr | Index a GitHub PR as a virtual fork. Makes it queryable via pr-NNNN fork syntax |
| reindex | Rebuild spec indexes from source repos and reload. Requires --repos-dir |
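
The table says lookup_type and search support fuzzy matching but does not describe the scoring. The sketch below shows one plausible approach using the standard library's `difflib`; it is an assumption for illustration, not the server's algorithm, and `fuzzy_lookup` is a hypothetical helper name.

```python
import difflib


def fuzzy_lookup(query, names, limit=3, cutoff=0.6):
    """Return up to `limit` names that best match `query`, case-insensitively.

    An exact (case-insensitive) match short-circuits to a single result.
    """
    # Map lowercase -> original spelling; later duplicates overwrite earlier ones.
    by_lower = {name.lower(): name for name in names}
    needle = query.lower()
    if needle in by_lower:
        return [by_lower[needle]]
    close = difflib.get_close_matches(needle, list(by_lower), n=limit, cutoff=cutoff)
    return [by_lower[match] for match in close]


if __name__ == "__main__":
    known = ["BeaconState", "BeaconBlock", "ExecutionPayload"]
    print(fuzzy_lookup("beaconstate", known))
    print(fuzzy_lookup("BeconState", known))
```

`get_close_matches` ranks candidates by `SequenceMatcher` ratio, so a one-letter typo still surfaces the intended type first, while the `cutoff` keeps unrelated names out of the result.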
Hermes / Claude Desktop (config.yaml)
mcp:
  ethspectoor:
    command: "uv"
    args:
      - "run"
      - "--with"
      - "mcp"
      - "--with"
      - "pyyaml"
      - "python3"
      - "/path/to/ethspectoor/server.py"
      - "--catalog"
      - "/path/to/ethspectoor/docs/catalog.json"
      - "--indexes-dir"
      - "/path/to/ethspectoor/indexes"
      - "--repos-dir"
      - "/path/to/ethspectoor/repos/specs"

The --indexes-dir and --repos-dir flags are optional but enable the
reindex and index_pr tools to rebuild indexes without restarting.
Both the MCP server and the explorer UI read the same artifact: catalog.json.
Types that appear in multiple specs are merged with canonical-source attribution
(e.g. BeaconState resolves to consensus-specs, not beacon-apis). No drift
between what agents see and what the UI shows.
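
A minimal sketch of canonical-source attribution, assuming each per-spec index is a mapping from type name to item and that canonical ownership is decided by a fixed priority list. The `CANONICAL_PRIORITY` order, the `merge_indexes` name, and the entry shape are all illustrative assumptions; build_catalog.py may decide ownership differently.

```python
# Assumed priority: earlier specs win when a type appears in several indexes.
CANONICAL_PRIORITY = ["consensus-specs", "execution-specs", "beacon-apis"]


def merge_indexes(indexes):
    """Merge {spec: {type_name: item}} into {type_name: entry}.

    Each entry records the canonical spec, that spec's item, and the other
    specs where the same name was also found.
    """
    rank = {spec: position for position, spec in enumerate(CANONICAL_PRIORITY)}
    unranked = len(rank)
    catalog = {}
    for spec, items in indexes.items():
        for name, item in items.items():
            if name not in catalog:
                catalog[name] = {"canonical": spec, "item": item, "also_in": []}
                continue
            entry = catalog[name]
            if rank.get(spec, unranked) < rank.get(entry["canonical"], unranked):
                # The new spec outranks the current owner: swap them.
                entry["also_in"].append(entry["canonical"])
                entry["canonical"], entry["item"] = spec, item
            else:
                entry["also_in"].append(spec)
    return catalog


if __name__ == "__main__":
    merged = merge_indexes({
        "beacon-apis": {"BeaconState": {"fields": 1}},
        "consensus-specs": {"BeaconState": {"fields": 2}},
    })
    print(merged["BeaconState"]["canonical"])
```

With this rule BeaconState resolves to consensus-specs regardless of the order in which the indexes are read, which is the property the paragraph above relies on.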
repos/ --> build.py --> indexes/  (per-spec, intermediate)
                           |
                        link.py --> _cross_refs.json
                           |
                        build_catalog.py --> catalog.json  (canonical)
                           |                      |
                        pr_index.py (overlays)    |
                                           +------+------+
                                           |             |
                                    server.py (MCP)   docs/ (UI)
Track open PRs against spec repos as virtual forks. PRs are indexed but invisible to normal queries. Reference them explicitly to see full resulting types, field-level diffs, and cross-type impact.
# Index all open PRs for consensus-specs
python3 pr_index.py --spec consensus-specs --repo-dir ./repos/specs/consensus-specs
# Index a single PR
python3 pr_index.py --spec consensus-specs --repo-dir ./repos/specs/consensus-specs --pr 1234
# Clean up merged/closed PRs
python3 pr_index.py --spec consensus-specs --cleanup
# List indexed PRs
python3 pr_index.py --list

Requires GITHUB_TOKEN env var or --github-token for API access.
PR forks use the naming convention pr-{number}:
list_prs(spec="consensus-specs")
  -> PR #4123: "Add exit queue to BeaconState" (gloas, 3 items changed)

lookup_type("BeaconState", fork="pr-4123")
  -> full BeaconState as it would look after the PR

diff_type("BeaconState", from_fork="gloas", to_fork="pr-4123")
  -> field-level diff: what the PR adds/removes/modifies

what_changed(fork="pr-4123")
  -> all items the PR touches with action (added/modified/removed)
Normal queries (no PR fork specified) never see PR data.
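
The virtual-fork behaviour can be sketched as a lookup that consults PR overlays only when the fork name matches pr-{number}. The data shapes below (a per-type, per-fork catalog and an overlay with `base_fork` and `items`) and the `resolve_type` name are assumptions for illustration; the real overlay layout inside catalog.json may differ.

```python
import re

PR_FORK = re.compile(r"pr-(\d+)")


def resolve_type(catalog, overlays, name, fork):
    """Resolve `name` at `fork`, where fork is a real fork or 'pr-NNNN'.

    catalog:  {type_name: {fork_name: definition}}
    overlays: {pr_number: {"base_fork": str, "items": {type_name: definition}}}
    """
    match = PR_FORK.fullmatch(fork)
    if match is None:
        # Ordinary fork: overlays are never consulted.
        return catalog.get(name, {}).get(fork)
    overlay = overlays.get(int(match.group(1)))
    if overlay is None:
        raise KeyError(f"{fork} is not indexed")
    if name in overlay["items"]:
        return overlay["items"][name]
    # Types the PR does not touch fall back to the fork the PR targets.
    return catalog.get(name, {}).get(overlay["base_fork"])


if __name__ == "__main__":
    # Hypothetical field lists standing in for full definitions.
    catalog = {
        "BeaconState": {"gloas": ["slot", "validators"]},
        "Validator": {"gloas": ["pubkey"]},
    }
    overlays = {
        4123: {
            "base_fork": "gloas",
            "items": {"BeaconState": ["slot", "validators", "exit_queue"]},
        }
    }
    print(resolve_type(catalog, overlays, "BeaconState", "gloas"))
    print(resolve_type(catalog, overlays, "BeaconState", "pr-4123"))
```

Keeping overlays in a separate map is what makes them invisible by default: a query has to name the PR fork before any overlay data is read.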
# Build catalog including PR overlays
python3 build_catalog.py --indexes-dir ./indexes --output docs/catalog.json --include-prs
# Or start MCP server with --rebuild to include PRs
python3 server.py --rebuild --repos-dir ./repos --include-prs

GitHub Actions rebuilds the catalog and deploys to GitHub Pages on every push
to main. A scheduled workflow runs daily to pull upstream spec changes and
re-index open PRs.
See .github/workflows/deploy.yml. The pipeline runs:
python3 build.py --all --include-prs

This clones/updates all spec repos, builds indexes, links cross-references,
indexes open PRs, and assembles the catalog. The docs/ directory is then
deployed to GitHub Pages.
| Spec | Items | Endpoints | Constants | Extractor | Forks |
|---|---|---|---|---|---|
| consensus-specs | 528 | -- | 218 | Python AST | phase0 through heze |
| execution-specs | 298 | -- | 135 | Python AST | frontier through amsterdam |
| execution-apis | 93 | 72 | -- | OpenRPC | paris through amsterdam |
| beacon-apis | 77 | 84 | -- | OpenAPI + Markdown | phase0 through gloas |
| remote-signing-api | 59 | 2 | -- | OpenAPI | phase0 through fulu |
| builder-specs | 16 | 5 | 2 | OpenAPI + Markdown | bellatrix through fulu |
| relay-specs | 12 | 5 | -- | OpenAPI + Markdown | bellatrix through fulu |
python3 build.py --all

This clones all spec repos (if not already present), builds per-spec indexes,
runs cross-reference linking, and assembles docs/catalog.json.
# Auto-clones the repo if needed
python3 build.py --profile consensus-specs
# Or point at an existing local clone
python3 build.py --profile builder-specs --repo-dir /path/to/builder-specs

After building individual specs, run linking and catalog assembly manually:
python3 link.py --indexes-dir ./indexes
python3 build_catalog.py --indexes-dir ./indexes --output docs/catalog.json

- build.py extracts types, endpoints, constants, and fork metadata from a spec repo and writes a {spec}_index.json to ./indexes/.
- link.py resolves cross-spec type references (e.g. beacon-apis types referencing consensus-specs containers).
- build_catalog.py merges all indexes into catalog.json, deduplicating shared types across specs using canonical-source attribution. This is the single artifact consumed by both the MCP server and the explorer UI.
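
To make the linking step concrete, here is a small sketch of cross-spec reference resolution. It assumes each index item carries a `references` list of type names; that field name, the `link_cross_refs` function, and the output shape are hypothetical and are not taken from link.py or _cross_refs.json.

```python
def link_cross_refs(indexes):
    """Find references that point at a type defined in a different spec.

    indexes: {spec: {type_name: {"references": [type_name, ...]}}}
    Returns a list of {"from": "spec:type", "to": "spec:type"} edges.
    """
    # First pass: record which specs define each type name.
    defined_in = {}
    for spec, items in indexes.items():
        for name in items:
            defined_in.setdefault(name, []).append(spec)

    # Second pass: emit an edge for every reference not satisfied locally.
    edges = []
    for spec, items in indexes.items():
        for name, item in items.items():
            for ref in item.get("references", []):
                if ref in items:
                    continue  # resolved within the same spec
                for target_spec in defined_in.get(ref, []):
                    edges.append({
                        "from": f"{spec}:{name}",
                        "to": f"{target_spec}:{ref}",
                    })
    return edges


if __name__ == "__main__":
    # GetStateResponse is a made-up type name used only for this example.
    sample = {
        "consensus-specs": {"BeaconState": {"references": []}},
        "beacon-apis": {"GetStateResponse": {"references": ["BeaconState"]}},
    }
    print(link_cross_refs(sample))
```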
.
├── build.py                     # orchestrates extraction per spec profile
├── build_catalog.py             # merges indexes into catalog.json (canonical artifact)
├── pr_index.py                  # PR shadow indexer (fetch, extract, diff open PRs)
├── link.py                      # cross-spec reference resolution
├── server.py                    # MCP server (10 tools, reads catalog.json)
├── fetch_repos.sh               # clones all spec repos
├── extractors/
│   ├── profiles.py              # spec profiles (paths, fork orders, extractor config)
│   ├── extract_python.py        # Python AST extractor (consensus-specs, execution-specs)
│   ├── extract_openapi.py       # OpenAPI extractor (beacon-apis, builder-specs, relay-specs, remote-signing-api)
│   ├── extract_openrpc.py       # OpenRPC extractor (execution-apis)
│   ├── extract_markdown.py      # Markdown type/endpoint extractor (beacon-apis, builder-specs)
│   ├── enrich.py                # structural annotation (fields, params, references, domains)
│   └── fetch_examples.py        # test fixture fetcher (standalone)
├── indexes/                     # generated per-spec indexes (intermediate build artifacts)
│   └── pr/                      # PR overlay indexes (per-spec, per-PR)
├── docs/
│   ├── index.html               # HTML shell (loads app.js, no inline logic)
│   ├── about.html               # project overview, MCP docs, skill card
│   ├── visualizer.html          # fork-aware transaction lifecycle diagram (PBS + ePBS)
│   ├── catalog.json             # canonical data (from build_catalog.py)
│   ├── SKILL.md                 # MCP skill document for AI agents
│   ├── logo.svg                 # site logo
│   ├── favicon.svg              # browser tab icon
│   ├── css/
│   │   ├── styles.css           # shared styles (layout, nav, search, detail panels)
│   │   ├── about.css            # about page styles
│   │   └── visualizer.css       # visualizer page styles
│   ├── js/
│   │   ├── app.js               # entry point (init, routing, global bindings)
│   │   ├── state.js             # shared state (catalog data, selections)
│   │   ├── constants.js         # fork orders, spec colors, kind/method badges
│   │   ├── utils.js             # HTML escaping, ID sanitization
│   │   ├── forks.js             # fork sorting, code-for-fork resolution
│   │   ├── search.js            # fuzzy scoring and highlighting
│   │   ├── diff.js              # LCS-based line diff engine
│   │   ├── router.js            # hash-based routing and navigation
│   │   ├── url.js               # URL parameter parsing
│   │   └── views/
│   │       ├── home.js          # specs overview + setup/MCP/skill sections
│   │       ├── types.js         # type browser (three-panel, filters, detail)
│   │       ├── endpoints.js     # endpoint browser
│   │       ├── prs.js           # PR browser with inline diffs
│   │       ├── diff-view.js     # fork-to-fork diff comparison
│   │       └── skill-modal.js   # SKILL.md viewer/copy modal
│   └── js/__tests__/
│       ├── constants.test.js
│       ├── diff.test.js
│       ├── forks.test.js
│       ├── router.test.js
│       ├── search.test.js
│       └── utils.test.js
├── .github/workflows/
│   └── deploy.yml               # CI: deno test -> rebuild catalog -> deploy to GitHub Pages
├── SCHEMA.md                    # index JSON schema documentation
├── CLAUDE.md                    # agent context (enzyme CLI, project conventions)
└── PLAN.md                      # development roadmap

Each extractor handles one source format:
- Python AST (extract_python.py): Walks Python source files, extracts class/function definitions with full code, tracks fork modifications via [New in fork] / [Modified in fork] annotations.
- OpenAPI (extract_openapi.py): Parses OpenAPI YAML, resolves $ref chains, extracts endpoints with parameters, response types, SSZ support, and fork variants.
- OpenRPC (extract_openrpc.py): Parses OpenRPC JSON, extracts JSON-RPC methods with params, results, error codes, and content descriptors.
- Markdown (extract_markdown.py): Extracts type definitions and endpoint descriptions from Markdown spec pages (used alongside OpenAPI for specs that document types in prose).

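
The Python AST approach can be sketched with the standard `ast` module. This is a simplified illustration, not the contents of extract_python.py: the function names, the returned dictionary keys, and the annotation regex are assumptions. It collects top-level class and function definitions with their full source text and scans that text for [New in ...] / [Modified in ...] markers.

```python
import ast
import re

# Assumed marker shape, e.g. "[New in Deneb:EIP4844]" or "[Modified in Electra]".
ANNOTATION = re.compile(r"\[(New|Modified) in ([A-Za-z0-9_]+)")


def extract_definitions(source):
    """Return top-level class/function definitions with their full code."""
    items = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            items.append({
                "name": node.name,
                "kind": "class" if isinstance(node, ast.ClassDef) else "function",
                "line": node.lineno,
                # Exact source text for the node, taken from its position info.
                "code": ast.get_source_segment(source, node),
            })
    return items


def find_fork_annotations(code):
    """Return (action, fork) pairs found in annotation comments."""
    return ANNOTATION.findall(code)


if __name__ == "__main__":
    sample = (
        "class Checkpoint(Container):\n"
        "    epoch: Epoch\n"
        "    root: Root  # [New in Deneb:EIP4844]\n"
        "\n"
        "def get_current_epoch(state):\n"
        "    return state.slot // 32\n"
    )
    for item in extract_definitions(sample):
        print(item["kind"], item["name"], find_fork_annotations(item["code"]))
```

Parsing never executes the spec code, so undefined names such as `Container` or `Epoch` in the sample are not a problem.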
enrich.py adds structural metadata after extraction: field lists for containers,
function signatures, reference graphs between types, domain classification, and
fork diff annotations (is_new, is_modified).
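
One way to derive a field list for a container is to read the annotated assignments in its class body. The sketch below assumes the extracted code is a single class definition and uses a hypothetical `container_fields` helper; enrich.py may compute fields differently.

```python
import ast


def container_fields(class_code):
    """Return [{"name": ..., "type": ...}] for annotated fields of a class."""
    node = ast.parse(class_code).body[0]
    if not isinstance(node, ast.ClassDef):
        raise ValueError("expected a class definition")
    return [
        # ast.unparse turns the annotation expression back into source text.
        {"name": stmt.target.id, "type": ast.unparse(stmt.annotation)}
        for stmt in node.body
        if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
    ]


if __name__ == "__main__":
    code = (
        "class Checkpoint(Container):\n"
        "    epoch: Epoch\n"
        "    root: Root\n"
    )
    for field in container_fields(code):
        print(field["name"], field["type"])
```

The type strings collected this way are also a natural starting point for a reference graph, since each one names the types a container depends on.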
profiles.py defines the extraction configuration for each spec: which
extractors to run, directory paths within the repo, fork ordering, GitHub URL
templates, and any spec-specific extraction options.