Self-hosted onchain analytics, redesigned for AI agents.
An open-source, MCP-first take on what an AI-first Dune alternative could look like. Pre-alpha; replacement claims withheld until live-mainnet ingest is reproducibly verified — see Status.
No infrastructure? No problem. The quickstart table below includes a browser playground, so you can try chainq without installing anything.
chainq is a self-hosted onchain analytics stack designed from the ground up for AI agents — Claude Code, Codex, OpenClaw, and any custom LLM application that speaks the Model Context Protocol.
Where Dune is a web SQL editor optimized for human analysts, chainq is an MCP server (and CLI) that lets AI agents:
- Discover what tables and metrics exist (`search_tables`, `describe`, `list_metrics`)
- Estimate a query's cost before running it (`estimate_cost`)
- Execute SQL or pre-defined metrics with hard budgets (`query`, `metric`)
- Visualize results as charts (`chart_render`)
- Recall past analyses so the same work isn't redone (`recall`)
- Report findings into Markdown / HTML automatically (`report`)
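For a concrete picture of that surface, here is a minimal client sketch using the official MCP TypeScript SDK (`@modelcontextprotocol/sdk`). The tool names match the table further down; the `chainq_describe` argument shape is an assumption, not a documented schema:

```ts
// Minimal MCP client: connect to chainq over stdio and discover the catalog.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "pnpm",
  args: ["--dir", "/absolute/path/to/chainq", "mcp:serve"],
});
const client = new Client({ name: "chainq-demo", version: "0.0.1" });
await client.connect(transport);

// Enumerate the tools, then pull the schema for one catalog table.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

const described = await client.callTool({
  name: "chainq_describe",
  arguments: { table: "dex.trades" }, // argument shape is assumed
});
console.log(described);
```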
Under the hood: Apache Parquet on local disk or S3, DuckDB as the default query engine (Trino / ClickHouse pluggable), dbt for the transformation layer (Dune's Spellbook is MIT-licensed — we leverage it), and a thin TypeScript MCP surface.
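Because storage is plain Parquet, anything that speaks DuckDB can read the same files directly, with or without the MCP layer in between. A minimal sketch with the `duckdb` npm package (the file path matches the snapshot pull shown later in this README; the `address` column name is an assumption about the logs schema):

```ts
// Read an ingested Parquet file directly with embedded DuckDB.
import duckdb from "duckdb";

const db = new duckdb.Database(":memory:");
db.all(
  `SELECT address, count(*) AS n
     FROM read_parquet('data/base.logs.parquet')
    GROUP BY address
    ORDER BY n DESC
    LIMIT 5`,
  (err, rows) => {
    if (err) throw err;
    console.table(rows); // top log emitters, e.g. WETH and USDC on Base
  }
);
```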
Dune (2019) and Nansen (2020) were built when humans wrote SQL. In 2026, the primary consumer of onchain data is increasingly an AI agent:
- An agent needs machine-readable schemas, not a web autocomplete.
- An agent needs cost estimates upfront, not a "you ran out of credits" message after the fact.
- An agent needs structured errors, not HTML stack traces.
- An agent needs persistent memory of what it has already queried.
- A team that owns sensitive data needs self-hosting, not a vendor API.
Nansen shipped an excellent agent-facing CLI in 2026. We respect it. But it's closed-source, label-driven, and doesn't let you run arbitrary SQL over your own indexed data. Dune is open in spirit (Spellbook is OSS) but the engine and agent surface are proprietary.
chainq fills the gap: fully OSS, self-hosted, MCP-first, SQL-open.
- Researchers and consultancies who need to investigate onchain projects rapidly and store evidence locally.
- Builders who want a data warehouse for their dApp with no vendor lock-in.
- Agentic-finance teams who need an LLM agent to do due diligence without exfiltrating sensitive data to a third party.
- Hobbyists who want a Dune-grade environment on a laptop.
Pre-alpha. Active development; APIs will break without notice until 0.1.0.
What's proven working (in CI / smoke tests / dbt run / live mainnet):
- 45 chains supported end-to-end: 43 EVM via public Subsquid archives + Solana (Helius) + Filecoin (Filfox / Spacescan). Reachability is probed on every push; the live list is at docs/SUPPORTED-CHAINS.md
- 18 MCP tools across 8 capability groups (discovery / execution / semantic / analytics / recall / render / report / budget)
- 11 curated catalog tables: `base.logs`, `dex.trades`, `erc20.transfers`, `prices.usd`, `labels.addresses`, `filecoin.deals`, `solana.transfers`, `solana.dex.trades`, `nft.trades`, `lending.events`, `bridge.transfers`
- 21 semantic-layer metrics, including cross-table joins (DEX × prices, ERC-20 × labels, sanctioned exposure) and a live-data metric (`base_logs_hourly`)
- dbt-duckdb spellbook: 23 working models, 42 dbt tests — schema constraints (`not_null`, `accepted_values`) enforced at build time, plus a topic0 decoder (raw logs → ERC-20 Transfer events) that closes the loop on live data
- Live Base mainnet ingest: `chainq pull --chain base --from 24000000 --to 24000010` against the public Subsquid archive returned 6,534 logs across 11 blocks with the expected 2-second cadence and WETH (`0x4200…0006`) + USDC (`0x833589fc…02913`) as the top emitters — full evidence at docs/LIVE-INGEST-PROOF.md
- Bilingual single-file HTML reports (JA / EN / both with CSS toggle, no JS), brand customisation, interactive vega-embed charts, CSV download chips
- Per-session cost governor (`chainq_budget_set` / `status` / `clear`) with structured `BUDGET_EXCEEDED` errors and BM25-ranked `chainq_recall`
- Benchmark suite (BENCHMARKS.md) — P95 of 0.5 ms to 27 ms across 10 representative queries on a 103.6 MiB dataset
What's still pending:
- dbt against real data: the 18 spellbook models read seeded synthetic Parquet by default. Pointing them at the live-pulled `base.logs.parquet` is the next milestone (the engine and metric YAMLs already accept it; only the spellbook source mapping needs the swap).
- Operational reliability: no 90-day uptime trace yet; reorg-safe head-following is unimplemented (the live archive serves finalised blocks, so the current path is bounded but not stress-tested).
- Real Whuffie / Agentic Finance dogfooding: pending Phase 2 of docs/ROADMAP.md.
See docs/ROADMAP.md for the full list of milestones.
Pick one — they all reach the same place.
| You want | Do this |
|---|---|
| No install, just try | Browser playground — paste a Parquet URL, write SQL |
| One-click cloud VM | Open in Codespaces — chainq running with seed data in a hosted VS Code |
| One-click deploy | Deploy to Render — public URL, free tier |
| One-line laptop install | `curl -fsSL https://raw.githubusercontent.com/Jacksstt/chainq/main/scripts/install.sh \| sh` |
| Docker on laptop | `docker compose -f docker/docker-compose.yml up` |
| Dev from source | See below |
All paths are documented in docs/INSTALL.md.
```sh
git clone https://github.com/Jacksstt/chainq.git
cd chainq
pnpm install
pnpm seed       # generate sample Parquet files in ./data
pnpm test       # typecheck + smoke + MCP end-to-end test
pnpm mcp:serve  # start the MCP server over stdio
```

Wire it into Claude Code (one-time):

```sh
claude mcp add chainq -- pnpm --dir /absolute/path/to/chainq mcp:serve
```

Three integration paths are documented in docs/CLAUDE_CODE_INTEGRATION.md.
Then in Claude Code:
"Use chainq. Tell me how many DEX trades happened on Base, and show me a query against
dex.tradesaggregated by hour."
The agent will call `chainq_list_tables`, `chainq_describe`, `chainq_estimate_cost`, and `chainq_query` in sequence and stream results back.
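The tail of that sequence looks roughly like this from a harness, reusing the connected `client` from the earlier sketch. The `sql` argument key and the `dex.trades` column names are assumptions; ask `chainq_describe` for the real schema:

```ts
// Estimate first, then execute; both calls respect the session budget.
const estimate = await client.callTool({
  name: "chainq_estimate_cost",
  arguments: { sql: "SELECT count(*) FROM dex.trades" },
});
console.log(estimate); // advisory row/cost estimate + budget decision

const hourly = await client.callTool({
  name: "chainq_query",
  arguments: {
    sql: `SELECT date_trunc('hour', block_time) AS hour, count(*) AS trades
            FROM dex.trades
           GROUP BY 1
           ORDER BY 1`, // block_time is an assumed column name
  },
});
console.log(hourly);
```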
If you don't want to pay for an Alchemy / Infura subscription, pull a Parquet snapshot from a public Subsquid archive instead:
```sh
chainq pull --chain base --from 18000000 --to 18001000
# → writes data/base.logs.parquet
```

No node, no RPC key, no monthly bill. See packages/snapshot and docker/ for the full self-hosted stack.
18 MCP tools across 8 capability groups (discovery / execution / semantic / analytics / recall / render / report / budget). Run `chainq tools` for the full auto-generated list, or `chainq mcp serve` and send `tools/list` from any MCP client.
| Group | Tool | Notes |
|---|---|---|
| Discovery | `chainq_list_tables` | Enumerate the catalog |
| Discovery | `chainq_search_tables` | Free-text + chain filter |
| Discovery | `chainq_describe` | Schema, sample rows, lineage, sample queries, partitions, gotchas |
| Execution | `chainq_estimate_cost` | Row / cost estimate + budget decision (advisory) |
| Execution | `chainq_query` | DuckDB SQL with row + timeout caps; budget-aware; cached |
| Semantic | `chainq_list_metrics` | List YAML-defined metrics |
| Semantic | `chainq_metric` | Run a named metric with dimensions / filters / window |
| Recall | `chainq_recall` | BM25-ranked search across past queries (SQL + label) |
| Recall | `chainq_recall_by_id` | Pull a cached result preview |
| Render | `chainq_chart_render` | Vega-Lite → svg / html / vegalite-json |
| Report | `chainq_report` | Single-file HTML reports (inline CSS, dark/light auto, print-ready); Markdown via .md filename |
| Budget | `chainq_budget_set` | Per-session caps on credits / rows / bytes / seconds |
| Budget | `chainq_budget_status` | Active caps, running totals, remaining headroom |
| Budget | `chainq_budget_clear` | Clear caps and consumption counters |
20 semantic metrics are loaded by default; run `chainq metrics` to list them. They cover DEX volume / trade count / trader count / protocol share / average trade size, ERC-20 transfers + active addresses, Filecoin deals + storage, Solana transfers + DEX volume, NFT volume + top collections, lending borrows + liquidations, bridge corridor volume, USD-priced DEX volume (cross-table join), CEX-inflow volume (label-joined), sanctioned-address exposure, and a Whuffie reputation score.
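Invoking one from a harness is a single tool call, again reusing the connected `client` from the quickstart sketch. The metric name and argument keys below are illustrative only; `chainq_list_metrics` returns the real names and signatures:

```ts
// Run a named metric with dimensions / filters / window (shapes assumed).
const volume = await client.callTool({
  name: "chainq_metric",
  arguments: {
    metric: "dex_volume_usd",   // hypothetical metric name
    dimensions: ["project"],    // hypothetical argument keys
    filters: { chain: "base" },
    window: "7d",
  },
});
console.log(volume);
```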
Supporting code:
- `@chainq/ingest-evm` — `assertCryoInstalled` + `backfill()` that shells out to cryo
- `@chainq/ingest-filecoin` — Filfox + Spacescan REST wrappers
- `@chainq/ingest-solana` — Helius RPC client (signatures + enriched txs)
- `@chainq/snapshot` — RPC-free pulls from public Subsquid archives
- `@chainq/engine-clickhouse` — scaffold for the v0.5.0 alternative backend
- `spellbook/` — dbt-duckdb project with starter models (dex, erc20, filecoin, solana)
- Structured errors (`@chainq/core`'s `ChainqError` + `ChainqErrorCode`) so agent harnesses can branch on `BUDGET_EXCEEDED`, `UNKNOWN_TABLE`, etc.
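Putting the budget governor and the structured errors together, a harness can cap a session and branch on the failure code instead of parsing prose. A sketch under assumed argument and payload shapes, again reusing the connected `client`:

```ts
// Cap the session, then branch on a structured BUDGET_EXCEEDED failure.
await client.callTool({
  name: "chainq_budget_set",
  arguments: { maxRows: 1_000_000, maxSeconds: 30 }, // cap names assumed
});

const result = await client.callTool({
  name: "chainq_query",
  arguments: { sql: "SELECT * FROM base.logs" }, // deliberately too broad
});

if (result.isError) {
  const payload = JSON.stringify(result.content);
  if (payload.includes("BUDGET_EXCEEDED")) {
    // Narrow the scan (block range, LIMIT) instead of retrying blindly.
  }
}
```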
- Yellowstone gRPC realtime stream for Solana (Helius polling works today)
- Iceberg storage format
- Trino / Starburst backend (ClickHouse driver is scaffolded only)
- Multi-machine ingest
See docs/ROADMAP.md.
Short version: if you're a human writing occasional SQL and sharing dashboards, Dune is the right tool — pay them. If an AI agent is doing the work, or your data is sensitive, or the chain is exotic, or your workload is heavy, you want chainq. See docs/COMPARISON.md for the full side-by-side (Dune Free vs Dune Analyst vs chainq).
```
┌──────────────────────────────────────────────────────────────────┐
│                       Agent / CLI surface                        │
│    MCP tools: search, describe, query, metric, chart, report     │
├──────────────────────────────────────────────────────────────────┤
│                          Semantic layer                          │
│    YAML metric definitions (LLM-readable) → SQL plans            │
├──────────────────────────────────────────────────────────────────┤
│                     Query engine (pluggable)                     │
│    default: DuckDB    optional: Trino, ClickHouse, DataFusion    │
├──────────────────────────────────────────────────────────────────┤
│                          Transformation                          │
│  dbt-core + chainq-spellbook (fork of duneanalytics/spellbook)   │
├──────────────────────────────────────────────────────────────────┤
│                             Storage                              │
│    Parquet on local fs or S3 (Iceberg in Phase 2)                │
├──────────────────────────────────────────────────────────────────┤
│                              Ingest                              │
│    EVM: cryo (Paradigm) for backfill, Subsquid for realtime      │
│    Filecoin: Filfox + Glif + Spacescan wrappers                  │
│    Solana: Helius / Yellowstone gRPC                             │
└──────────────────────────────────────────────────────────────────┘
```
See docs/ARCHITECTURE.md for rationale and tradeoffs.
| Package | Purpose |
|---|---|
| `@chainq/core` | Shared types, schemas, semantic-layer model |
| `@chainq/mcp-server` | MCP server exposing the agent tools |
| `@chainq/cli` | Standalone CLI (wraps the MCP server) |
| `@chainq/snapshot` | Pull Parquet snapshots from public archives — the RPC-free path |
| `@chainq/storage` | Filecoin / IPFS pinning for community snapshot sharing |
| `@chainq/light-client` | Trust-minimised verification (Helios / Lodestar plan) |
| `@chainq/x402` | Pay-per-call gating for public MCP endpoints (USDC on Base / Solana) |
| `@chainq/playground` | DuckDB-WASM in-browser playground (no install) |
| `@chainq/ingest-evm` | EVM ingestion: cryo backfill + Subsquid realtime stream |
| `@chainq/ingest-filecoin` | Filecoin-native (Filfox + Spacescan) |
| `@chainq/ingest-solana` | Solana via Helius RPC |
| `@chainq/whuffie` | Sybil-resistant reputation data product (research line) |
| `spellbook/` | dbt-duckdb project (Spellbook fork + chainq additions; runs in CI) |
| `docker/` | One-command stack (chainq + cron + optional Metabase) |
See docs/ROADMAP.md. Highlights:
- v0.0.x — Skeleton, MCP server stub, single-chain (Base) ingest demo
- v0.1.0 — Multi-chain ingest (Ethereum + Base + Filecoin), 5 MCP tools, 10 curated tables
- v0.2.0 — Semantic layer GA, chart rendering, report-to-vault
- v0.5.0 — Iceberg support, Trino backend, Solana
- v1.0.0 — Production-ready, public OSS launch
Pre-alpha — please open an issue before sending a PR so we can align. See CONTRIBUTING.md.
- `duneanalytics/spellbook` — the curated table definitions we build on (MIT)
- `paradigmxyz/cryo` — EVM data extraction
- `subsquid/squid-sdk` — realtime indexing
- DuckDB — the engine that makes single-node analytics feasible
MIT — see LICENSE.
Built by Prime Beat.